Actuarial Outpost > SoA/CAS Preliminary Exams > Exam PA: Predictive Analytics

#221 | 05-21-2019, 04:22 PM | DyalDragon

I searched a bunch of other previews of the book, and although probit is mentioned several times in snippets, none of the results were very helpful.

#222 | 05-21-2019, 04:26 PM | DyalDragon

Quote:
Originally Posted by ElDucky
Is there a way to just view the modules and not pay to take the exam? I have no interest in doing the exam, but I'm interested in the modules. I'll just read some books and check out some of the free courses otherwise.
Find a dead lizard, soak it in sambuca overnight, and then just keep slapping yourself in the face with it until this interest fades. I'd wager the experience will be more pleasant than reading through the modules...

#223 | 05-22-2019, 12:16 AM | noone

There seem to be four ways to compare models:
• root mean squared error (RMSE)
• mean squared error (MSE)
• sum of squared errors (SSE)
• log-likelihood

When do you use one over the other?

#224 | 05-22-2019, 10:26 AM | Josh Peck

Quote:
Originally Posted by noone
There seem to be four ways to compare models:
• root mean squared error (RMSE)
• mean squared error (MSE)
• sum of squared errors (SSE)
• log-likelihood

When do you use one over the other?
RMSE, MSE, and SSE are all essentially measuring the same thing.
SSE is the sum of the squared errors.
MSE is essentially the variance of the errors
(note: squaring penalizes an error more when it is large and less when it is small [below 1]; the variance version then divides by n - 1).
RMSE is essentially the standard deviation of the errors
(the square root is what makes standard deviation easier to interpret, because it undoes the earlier squaring and puts the number back in the units of the target).

Out of the three I would use RMSE because it is the most interpretable: people generally have a feel for standard deviation, and you can loosely describe RMSE as the typical size of an error.

SSE only makes sense for comparing models fit to the same sample, since it depends on the number of observations (more observations means more terms in the sum, which means a bigger number).

Log-likelihood is better to use for a skewed distribution like the Poisson, but it is less interpretable than RMSE.
It is less interpretable because it is the log of the probability (or density) of observing your data under the distribution you are assuming. A higher number is better, but the number itself doesn't mean much.
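
For concreteness, here is a quick R sketch of all four (fit and y are placeholders for a fitted model and its observed response; the arithmetic is standard):

pred <- predict(fit)          # predictions on the data
err  <- y - pred              # errors (residuals)

SSE  <- sum(err^2)            # grows with the number of observations
MSE  <- SSE / length(err)     # average squared error (the variance analogy
                              # divides by n - 1, but for comparing models
                              # the difference doesn't matter)
RMSE <- sqrt(MSE)             # back in the units of y; easiest to interpret

logLik(fit)                   # log-likelihood: higher is better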

For classification models you can use accuracy, error rate, AUC, and AIC (and some additional criteria like BIC, but I wouldn't worry about those).
AUC is hard to sum up briefly, but there should be a lot of resources on it since it gets used in the real world.
Here is a video that explains AIC: https://www.youtube.com/watch?v=LkifE44myLc
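
If you want to compute AUC yourself, the pROC package is one option. A minimal sketch, assuming you have predicted probabilities prob and observed 0/1 labels y:

library(pROC)
roc_obj <- roc(y, prob)   # build the ROC curve from labels and probabilities
auc(roc_obj)              # area under it: 0.5 is random guessing, 1 is perfect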
For classification models I like to compute the accuracy on the training set and the accuracy on the test set, then subtract the two as a measure of the model's stability. (Note: I scored a 10 on the classification section of the December exam using this method, but with log-likelihood.)
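
A sketch of that last idea (all names made up: it assumes a fitted binary classifier fit, data frames train and test, and the true 0/1 labels in a target column):

prob_train <- predict(fit, newdata = train, type = "response")
prob_test  <- predict(fit, newdata = test,  type = "response")

acc_train <- mean((prob_train > 0.5) == (train$target == 1))
acc_test  <- mean((prob_test  > 0.5) == (test$target  == 1))

acc_train - acc_test   # a large gap suggests overfitting (an unstable model)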

#225 | 05-22-2019, 10:59 AM | Josh Peck

I also just realized I left out R^2 and deviance.

R^2 is the variation explained by the model divided by the total variation,
so it can be thought of as the % of variation explained by the model.
For a linear model with one predictor it equals the squared correlation between X and Y, which is where it gets its name.
Note that adding any predictor to the model will never decrease R^2 (and almost always increases it at least slightly), so take this into account when comparing models with different numbers of predictors.
Adjusted R^2 attempts to fix this issue by penalizing for the number of predictors.
If you want more details on R^2, they are very easy to find.
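
In R both come straight out of summary(). A quick sketch using the built-in cars dataset:

fit <- lm(dist ~ speed, data = cars)   # stopping distance vs. speed
summary(fit)$r.squared                 # % of variation explained
summary(fit)$adj.r.squared             # penalized for number of predictors
cor(cars$speed, cars$dist)^2           # matches r.squared: one predictor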

Deviance is -2L, where L is the log-likelihood (strictly, up to a constant involving the saturated model, but that constant is the same for every model fit to the same data).
Because of the negative sign, we want to minimize deviance, for the same reason we want to maximize the log-likelihood.
Note the null deviance is the deviance of the null (intercept-only) model, which simply uses the sample mean for all predictions.
More info on deviance: https://bookdown.org/egarpor/SSS2-UC...-deviance.html
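
A quick sketch with a glm on the built-in mtcars data (for a 0/1 response like this, the residual deviance is exactly -2 times the log-likelihood):

fit <- glm(am ~ hp + wt, data = mtcars, family = binomial)
deviance(fit)       # residual deviance: lower is better
fit$null.deviance   # deviance of the intercept-only (null) model
logLik(fit)         # deviance(fit) = -2 * logLik(fit) here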

If anyone can think of other methods I am leaving out, please add them.

#226 | 05-22-2019, 06:14 PM | jdman929

Has anyone else gotten stuck on the Student Success Practice Exam Decision Tree portion? I'm trying to run the code provided and I get errors. The code in question is:

library(rpart)
library(rpart.plot)
set.seed(123)
excluded_variables <- c("G3") # List excluded variables

dt <- rpart(G3.Pass.Flag ~ .,
            data = Train.DS[, !(names(Full.DS) %in% excluded_variables)],
            control = rpart.control(minbucket = 5, cp = .001, maxdepth = 20),
            parms = list(split = "gini"))

rpart.plot(dt)

Error in `[.data.frame`(Train.DS, , !(names(Full.DS) %in% excluded_variables)) : undefined columns selected


Does anyone know what's going on?

#227 | 05-22-2019, 10:16 PM | Yossarian

Quote:
Originally Posted by jdman929
Has anyone else gotten stuck on the Student Success Practice Exam Decision Tree portion? I'm trying to run the code provided and I get errors. The code in question is:

library(rpart)
library(rpart.plot)
set.seed(123)
excluded_variables <- c("G3") # List excluded variables

dt <- rpart(G3.Pass.Flag ~ .,
            data = Train.DS[, !(names(Full.DS) %in% excluded_variables)],
            control = rpart.control(minbucket = 5, cp = .001, maxdepth = 20),
            parms = list(split = "gini"))

rpart.plot(dt)

Error in `[.data.frame`(Train.DS, , !(names(Full.DS) %in% excluded_variables)) : undefined columns selected


Does anyone know what's going on?
What happens when you run "excluded_variables" in the console? Do you get [1] "G3"?

Then check your dataset Train.DS: does it have a variable G3? (Easy to check in the Global Environment pane, or by typing names(Train.DS).)

#228 | 05-23-2019, 08:32 AM | jdman929

Quote:
Originally Posted by Yossarian
What happens when you run "excluded_variables" in the console? Do you get [1] "G3"?

Then check your dataset Train.DS: does it have a variable G3? (Easy to check in the Global Environment pane, or by typing names(Train.DS).)
Yes, when I run "excluded_variables" I get [1] "G3", and yes, Train.DS has the variable G3...

I just figured out that if I delete the entire excluded_variables section and run the code with just

data = Train.DS

it works, but there's only one node! It just splits on G3 (P or F)...which isn't too helpful!
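
Looking at it again, I think the actual culprit is the subsetting: the code builds the logical column filter from names(Full.DS) but uses it to index Train.DS, so if the two data frames don't have exactly the same columns the filter doesn't line up and you get "undefined columns selected". Subsetting by Train.DS's own names should work (untested sketch):

dt <- rpart(G3.Pass.Flag ~ .,
            data = Train.DS[, !(names(Train.DS) %in% excluded_variables)],
            control = rpart.control(minbucket = 5, cp = .001, maxdepth = 20),
            parms = list(split = "gini"))
rpart.plot(dt)

That would also explain the one-split tree: with G3 left in, the tree just splits on G3, since the pass flag is determined by it directly.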

#229 | 05-23-2019, 08:36 AM | Gettin Lucky In Kentucky | Model Comparisons

Someone smarter than me should post a pros and cons list of the various models on here.

#230 | 05-23-2019, 08:45 AM | Whoaminoneofyourbusiness

Quote:
Originally Posted by Gettin Lucky In Kentucky
Someone smarter than me should post a pros and cons list of the various models on here.
I believe in u