Exam PA: Predictive Analytics

#231 - 05-23-2019, 10:34 AM - rstein
glm vs glmnet vs cv.glmnet

Can someone confirm that I have understood glm vs glmnet vs cv.glmnet correctly?

GLM:
-for fitting a generalized linear model (i.e., when the response does not follow a normal distribution) using any of the family distributions and their respective link functions
-All variables you specify are included in the model
-Uses the standard y ~ x formula syntax

GLMNET:
-for when you want to apply regularization to shrink coefficients toward zero (ridge) or remove some of them entirely (lasso)
-You need to specify alpha, and either specify lambda or let glmnet generate a default sequence of lambda values for you
-You need to build the design matrix yourself with model.matrix() rather than passing a formula

CV.GLMNET:
-automatically finds an optimal value of lambda using cross-validation

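To check that I have the mechanics right, here's a minimal sketch of the three calls on simulated data (all object names made up):

```{r}
# glm vs glmnet vs cv.glmnet side by side, on simulated data
library(glmnet)
set.seed(1)
df <- data.frame(x1 = rnorm(50), x2 = rnorm(50))
df$y <- 1 + 2 * df$x1 + rnorm(50)

m.glm <- glm(y ~ x1 + x2, data = df)      # formula interface; keeps every variable
x <- model.matrix(y ~ x1 + x2, df)[, -1]  # glmnet wants a numeric design matrix (drop intercept)
m.net <- glmnet(x, df$y, alpha = 1)       # lasso; fits a whole path of lambda values
m.cv  <- cv.glmnet(x, df$y, alpha = 1)    # cross-validates to pick a lambda
coef(m.net, s = m.cv$lambda.min)          # coefficients at the CV-chosen lambda
```
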
I'm just unsure which to use first when applying regularization: glmnet or cv.glmnet? Or does it not matter? From Rmd 6.7, it seems like cv.glmnet was run first to find lambda.min, which was then used in the glmnet model, but I'm still unsure about this.

Also, can any family be used with glmnet? I know the default is gaussian and that binomial can also be used, but can it be used with poisson, etc.?

Any insight or additions to what I wrote above would be appreciated!

#232 - 05-23-2019, 11:12 AM - Josh Peck

Quote:
Originally Posted by rstein View Post
I'm just unsure which to use first when applying regularization: glmnet or cv.glmnet? Or does it not matter? From Rmd 6.7, it seems like cv.glmnet was run first to find lambda.min, which was then used in the glmnet model, but I'm still unsure about this.

Also, can any family be used with glmnet? I know the default is gaussian and that binomial can also be used, but can it be used with poisson, etc.?

Any insight or additions to what I wrote above would be appreciated!
You are correct. Here is a snippet from my notes:

Use Elastic Net Regularization
```{r}
# Cross-validate to choose lambda, then refit at that lambda
library(glmnet)
set.seed(1234)
f <- as.formula("RESPONSE ~ PREDICTOR + PREDICTOR + ...")
x.train <- model.matrix(f, DATASET.train)

m <- cv.glmnet(x = x.train,
               y = DATASET.train$RESPONSE,
               family = "FAMILY",
               alpha = ALPHA)

# Refit with the lambda that minimized the cross-validation error
m.best <- glmnet(x = x.train,
                 y = DATASET.train$RESPONSE,
                 family = "FAMILY",
                 lambda = m$lambda.min,
                 alpha = ALPHA)
m.best$beta    # nonzero entries are the variables that survive
```

If you want to see which families are supported, simply run the command
?glmnet
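
To answer the Poisson question directly: yes, glmnet supports poisson (along with gaussian, binomial, multinomial, cox, and mgaussian). A quick sketch on simulated count data:

```{r}
# Lasso-regularized Poisson regression on simulated counts
library(glmnet)
set.seed(42)
x <- matrix(rnorm(100 * 5), ncol = 5)
y <- rpois(100, lambda = exp(x[, 1]))
m.pois <- cv.glmnet(x, y, family = "poisson", alpha = 1)
coef(m.pois, s = "lambda.min")
```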
#233 - 05-23-2019, 04:28 PM - Squeenasaurus

Quote:
Originally Posted by Josh Peck View Post
I also just realized I left out R^2 and deviance

R^2 is the variation explained by the model divided by the total variation.
So it can be thought of as the percentage of variation explained by the model.
For a linear model with one predictor, this equals the squared correlation between X and Y, which is where it gets its name.
Note that adding any additional predictor will never decrease R^2 (it almost always increases it at least slightly), so take this into account when you are comparing models with different numbers of predictors.
Also note that adjusted R^2 attempts to fix this issue.
If you want more details on R^2, they're very easy to find.

Deviance is 2(L_sat - L), where L is the model's log-likelihood and L_sat is the log-likelihood of the saturated model; you'll often see it abbreviated as -2L, since L_sat is a constant for a given dataset.
Because of the minus sign on L, we want to minimize deviance (for the same reason we want to maximize the log-likelihood).
Note that the null deviance is the deviance of the null model, which simply uses the sample mean for all predictions.
More info on deviance: https://bookdown.org/egarpor/SSS2-UC...-deviance.html

If anyone can think of other methods I am leaving out, please add them.
https://learning.soa.org/CServer/Cou...1B7EA88A51.pdf

I stole this from the FAP modules. It goes into the pros and cons of various accuracy metrics.
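
To tie the R^2 and deviance definitions above to actual R output, here's a quick check on a built-in dataset (a binary 0/1 response, so the saturated log-likelihood is zero and the deviance reduces to -2 times the log-likelihood):

```{r}
# Deviance and a pseudo-R^2 from a fitted GLM (built-in mtcars data)
m <- glm(am ~ hp + wt, data = mtcars, family = binomial)
m$deviance                         # residual deviance
m$null.deviance                    # deviance of the intercept-only model
-2 * as.numeric(logLik(m))         # matches m$deviance here, since L_sat = 0 for 0/1 data
1 - m$deviance / m$null.deviance   # McFadden's pseudo-R^2
```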
#234 - 05-23-2019, 06:20 PM - Yossarian
Not smarter than yourself

Quote:
Originally Posted by Gettin Lucky In Kentucky View Post
Someone smarter than me should post a pros and cons list of the various models on here.
https://thereputationalgorithm.com/2...s-infographic/

https://www.hackingnote.com/en/machi...-pros-and-cons

https://rmartinshort.jimdo.com/2019/...ml-algorithms/
#235 - 05-24-2019, 12:11 AM - Squeenasaurus

Quote:
Originally Posted by rstein
Can someone confirm that I have understood glm vs glmnet vs cv.glmnet correctly?
[...]
I'm just unsure which to use first when applying regularization: glmnet or cv.glmnet? Or does it not matter?
First, create a GLM with as many predictors as possible and assess it. Then we want to simplify the model and (ideally) make it more accurate at the same time. I approach this two ways:

1. Use stepwise selection based on AIC for feature selection. Achieve this by running stepAIC(glm, direction = "backward") from the MASS package (see the sketch below).

2. Use LASSO regularization for feature selection. Do this using Josh's code above (run cv.glmnet --> obtain the optimal lambda --> run glmnet with this lambda).

Which model performed best? If the first model is still the most accurate, are the others close enough that you would sacrifice some accuracy for interpretability? Use your best judgment, and whatever you decide, make sure you justify it.
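
In case it helps, here is a minimal sketch of the stepAIC route, in the same placeholder style as Josh's snippet (the dataset and family are stand-ins):

```{r}
# Backward stepwise selection by AIC (stepAIC lives in the MASS package)
library(MASS)
m.full <- glm(RESPONSE ~ ., data = DATASET.train, family = "FAMILY")
m.step <- stepAIC(m.full, direction = "backward")  # drops variables while AIC improves
summary(m.step)
```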
#236 - 05-24-2019, 12:30 AM - Squeenasaurus

Quote:
Originally Posted by Gettin Lucky In Kentucky View Post
Someone smarter than me should post a pros and cons list of the various models on here.
I can take a stab.

Decision Trees
Pros
-can handle linear and non-linear relationships
-robust to correlated features (no need for PCA), feature distributions (no need for centering or scaling), and missing values
-simple to understand
-easy to run

Cons
-poor accuracy
-prone to overfitting


Random Forests
Pros
-almost all the pros of decision trees
-much more accurate
-less prone to overfitting (the averaging of many models reduces variance)

Cons
-hard to interpret
-slower to run
-slight increase in bias


Gradient Boosting Machines
Pros
-almost all the pros of decision trees
-MUCH more accurate
-reduces bias

Cons
-hard to interpret
-slower to run
-prone to overfitting
-sensitive to parameter settings ("hunts" for noise if not tuned properly)
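
For reference, the basic fitting calls for all three, in the same placeholder style as the snippets above (the gaussian distribution and tuning values are just stand-ins):

```{r}
# Basic fitting calls for a tree, a random forest, and a GBM
library(rpart)
library(randomForest)
library(gbm)

tree  <- rpart(RESPONSE ~ ., data = DATASET.train)            # single decision tree
rf    <- randomForest(RESPONSE ~ ., data = DATASET.train,
                      ntree = 500)                            # average of many bootstrapped trees
boost <- gbm(RESPONSE ~ ., data = DATASET.train,
             distribution = "gaussian",                       # match to your response type
             n.trees = 1000, shrinkage = 0.01)                # tune these; GBMs are sensitive to them
```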
#237 - 05-24-2019, 03:45 AM - Inactuary

Hi guys,
I was looking at the hospital readmission sample project and came to Task 7, where stepwise selection with AIC had to be performed. Before running AIC, though, several steps were done involving releveling and binarizing the factor variables. I have been struggling to understand the concept behind this. My questions are:

1. What is a base level (or base levels)? It says the base levels are racewhite, DRGmed.C, and RaceGenderWhiteF.
2. Why do we have to separate each factor variable into individual variables (such as Race Hispanic, Race Black, DRG Other.surg, etc.)?
3. Why, when removing the original factor variables, is the Gender variable retained while DRG, race, and racegender are removed?
4. Finally, after doing all that work, why does the solution then remove all variables associated with Male?

I really need help to grasp this concept. Please, someone help me!
Thank you...

#238 - 05-24-2019, 11:29 AM - DjPim

Quote:
Originally Posted by Inactuary
I was looking at the hospital readmission sample project and came to Task 7...
[...]
There's a thread for this project that might prove useful. Most of these are answered there, but I'll try to summarize:

1. The base level is the reference level of a factor (in the sample project it's set to the level with the most observations). To binarize Race, we create 4 indicator variables, one for each race. The problem is that these 4 variables are perfectly collinear: they always sum to 1. To get around this multicollinearity, we drop the base level's indicator; "all three remaining indicators are 0" then means "the observation is the base level" (see the illustration after this list).

2. When doing variable selection with backward stepAIC, it calculates the AIC of the model with all variables, then the AIC of each model with one variable removed, and decides whether to remove a variable and which one. If Race were a single variable with 4 levels, this process could only ask "is it significant to include the Race variable with all the races?", whereas if we split it up, it can instead ask "is it significant to distinguish the White race specifically, or are we fine with just Black, Hispanic, and Other?" (i.e., it can remove just one level of the variable rather than the whole thing).

3. The factor variables were split into binary indicators, so once we have RaceWhite, RaceBlack, RaceHispanic, and RaceOther, we no longer need the original Race. Gender, however, is already binary; it doesn't matter whether we call the levels M/F or 0/1. There's nothing to binarize, add, or remove, so it's fine as-is.

4. I don't quite remember the part you're referring to; I'd have to look at it again. But maybe the answers to 1-3 help with this question?
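
To make the binarization in point 1 concrete, here is a tiny illustration with a made-up factor:

```{r}
# Releveling and binarizing a toy factor
race <- factor(c("White", "Black", "Hispanic", "White", "Other", "White"))
race <- relevel(race, ref = "White")  # make the most common level the base
model.matrix(~ race)                  # one indicator column per non-base level
```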
#239 - 05-24-2019, 03:32 PM - Josh Peck

Here is a great explanation of GLMs and link functions, if anyone wants to take a look:

https://www.youtube.com/watch?v=X-ix97pw0xY
#240 - 05-24-2019, 04:04 PM - Yossarian

Quote:
Originally Posted by Josh Peck View Post
Here is a great explanation of GLMs and link functions, if anyone wants to take a look:

https://www.youtube.com/watch?v=X-ix97pw0xY
Thanks, Josh. I feel like I'm on the MIT campus.

Coincidentally, I also picked up Applied Predictive Modeling by Kuhn and Johnson. It almost seems like that text should be on the syllabus, as it applies directly to what the exam is testing.