Actuarial Outpost
 
Actuarial Outpost > Exams - Please Limit Discussion to Exam-Related Topics > SoA/CAS Preliminary Exams > Exam PA: Predictive Analytics
#191
05-14-2019, 04:40 PM
Snax
SOA
Join Date: Apr 2019
Posts: 3

Quote:
Originally Posted by DjPim View Post
1. and 2. are slightly different if running in .rmd inside chunks rather than R script, I forgot to mention.

Can you share your code or a reproducible example? It may be one of your chunk options like someone else mentioned.
Any RMD file that was provided throughout the modules would be an example. Specifically, right now I am reviewing the "SampleProjectSolution.rmd" file, and chunk 2 has 3 plots in it. When I run it, I get a pop-up window showing only the last plot. The "Plots" section of the original screen is blank.

It's happened throughout all the modules so I don't think it has to do with the code, but the code chunks provided to us look like:

```{r}
<code>
```

I'm thinking it is some type of setting within RStudio.

#192
05-15-2019, 11:28 AM
Coolkid22
SOA
Join Date: Jun 2014
College: Drake '17
Posts: 29

Does anyone have a good explanation for when to use hierarchical clustering as opposed to k-means clustering?

#193
05-15-2019, 03:02 PM
DjPim
Member
SOA
Join Date: Nov 2015
Location: SoCal
Posts: 432

Question about variable selection with binarized categorical variables. Spoilered because it references the solution to the latest sample project, in case you haven't done it yet.

Spoiler:

One of the parameters in caret::dummyVars is fullRank, where TRUE sets a base level and FALSE explicitly lists out all levels. I figured if I was going to binarize Race, which has 4 levels, then do stepAIC to possibly remove some levels but not the whole variable, then I would need to have all 4 levels listed out, right?

In the solution they put fullRank = FALSE at first, so I thought I was on the right track, but then they manually take out the base levels for each of the variables. Is this normal / necessary? If I have RaceWhite, RaceBlack, RaceHispanic, and RaceOthers, and RaceWhite is set to base, then getting 0, 0, 0, would imply the race is white. If stepAIC removes both RaceBlack and RaceOthers, then getting a 0 on RaceHispanic might be indicating White but could also be indicating Black/Others, and I would think I'd need to have somewhere that my model mentions if White is significant or not. Is it already taken into account? Does keeping all 4 indicators without a base mess something up? Halp.
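
A minimal sketch of the two encodings being asked about, using base R's model.matrix as a stand-in for caret::dummyVars (that substitution is an assumption, as is the toy data; only the Race level names come from the post). It also shows why four indicators plus an intercept carry redundant information:

```r
# Hypothetical toy data; the Race levels mirror the ones named in the post.
race <- factor(c("White", "Black", "Hispanic", "Others", "White", "Black"),
               levels = c("White", "Black", "Hispanic", "Others"))
df <- data.frame(Race = race)

# All four indicators, no base level (analogous to fullRank = FALSE).
all_levels <- model.matrix(~ Race - 1, data = df)

# Base level dropped (analogous to fullRank = TRUE):
# an intercept plus three indicators, with White as the base.
with_base <- model.matrix(~ Race, data = df)

# Adding an intercept to the four-indicator matrix makes it rank deficient,
# because the four indicator columns always sum to the intercept column.
with_intercept <- cbind(Intercept = 1, all_levels)
rank_all  <- qr(with_intercept)$rank  # rank of the 5-column matrix
rank_base <- qr(with_base)$rank       # rank of the 4-column matrix

print(colnames(all_levels))
print(colnames(with_base))
print(c(rank_all = rank_all, rank_base = rank_base))
```

Both ranks should come out equal even though one matrix has an extra column, which is the redundancy that dropping a base level removes.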
__________________
Quote:
Originally Posted by Dr T Non-Fan View Post
"Cali" SMH.

#194
05-15-2019, 03:21 PM
Squeenasaurus
Member
SOA
Join Date: Jul 2016
College: Illinois State University
Favorite beer: Lagunitas
Posts: 160

Quote:
Originally Posted by Coolkid22 View Post
Does anyone have a good explanation for when to use hierarchical clustering as opposed to k-means clustering?
I think the modules said hierarchical clustering gives you a sense of which variables are closely related, but if you want to create a feature that can be used as a predictor, you should use k-means clustering and use the resulting cluster in place of those other variables.
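
A rough sketch of that contrast in base R, assuming the built-in iris data and an arbitrary choice of 3 clusters (neither comes from the modules). The hierarchical fit is mainly something to look at as a dendrogram, while the k-means assignment is stored as a new factor column that could go into a model:

```r
# Scale the numeric columns so no single variable dominates the distances.
x <- scale(iris[, 1:4])

# Hierarchical clustering: builds a full tree; no need to pick k up front.
hc <- hclust(dist(x), method = "complete")
# plot(hc)  # uncomment to inspect the dendrogram
hc_groups <- cutree(hc, k = 3)  # cut the tree afterwards if groups are needed

# k-means: k is chosen up front; nstart reruns with several random starts.
set.seed(42)
km <- kmeans(x, centers = 3, nstart = 20)

# Store the cluster assignment as a candidate predictor.
iris_feat <- cbind(iris, cluster = factor(km$cluster))
print(table(iris_feat$cluster))
```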

#195
05-15-2019, 04:30 PM
TranceBrah
Member
SOA
Join Date: Mar 2014
Location: Best Coast
Posts: 162

Quote:
Originally Posted by DjPim View Post
Question about variable selection with binarized categorical variables. Spoilered because it references the solution to the latest sample project, in case you haven't done it yet.

In the solution they put fullRank = FALSE at first, so I thought I was on the right track, but then they manually take out the base levels for each of the variables. Is this normal / necessary? If I have RaceWhite, RaceBlack, RaceHispanic, and RaceOthers, and RaceWhite is set to base, then getting 0, 0, 0, would imply the race is white. If stepAIC removes both RaceBlack and RaceOthers, then getting a 0 on RaceHispanic might be indicating White but could also be indicating Black/Others, and I would think I'd need to have somewhere that my model mentions if White is significant or not. Is it already taken into account? Does keeping all 4 indicators without a base mess something up? Halp.
My thoughts:
The purpose of the glm is to estimate coefficients for the predictors. The reference-level concept exists mainly for interpretability. Prediction-wise, it wouldn't matter if you kept all 4, or picked a different reference level, because the estimated coefficients would account for that.

If you keep White as the base level, then the coefficient for each other race can be used to directly see how the prediction would change if the person were Black instead of White (the reference). Imagine if you kept all 4 of the binarized variables; then how could you interpret the coefficients?

So the initial model would have Black, Hispanic, and Other. Then if stepAIC removed Black and Other, it just means that the jump from White to Black or White to Other doesn't improve the AIC enough to be worth the extra coefficients. Those two levels are then effectively pooled with White in the reference group, so a 0 on RaceHispanic simply means "not Hispanic".
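
Here is a small sketch of those three points on simulated data (the data, the seed, and the binomial family are all assumptions for illustration, not anything from the sample project): changing the reference level leaves the fitted values unchanged, keeping all 4 indicators with an intercept leaves one coefficient inestimable, and a model with only the Hispanic indicator gives White, Black, and Others the same prediction.

```r
set.seed(1)
lv <- c("White", "Black", "Hispanic", "Others")
race <- factor(sample(lv, 200, replace = TRUE), levels = lv)
y <- rbinom(200, 1, ifelse(race == "Hispanic", 0.6, 0.3))
df <- data.frame(y = y, Race = race)

# 1. Same model, two different reference levels.
fit_white <- glm(y ~ Race, data = df, family = binomial)
df_hisp <- transform(df, Race = relevel(Race, ref = "Hispanic"))
fit_hisp <- glm(y ~ Race, data = df_hisp, family = binomial)
max_diff <- max(abs(fitted(fit_white) - fitted(fit_hisp)))

# 2. All four indicators plus an intercept: glm reports NA for the
#    redundant (aliased) coefficient.
X <- model.matrix(~ Race - 1, data = df)
fit_all <- glm(df$y ~ X, family = binomial)
n_aliased <- sum(is.na(coef(fit_all)))

# 3. Only the Hispanic indicator kept, as if stepAIC had dropped the others.
fit_pooled <- glm(y ~ I(Race == "Hispanic"), data = df, family = binomial)
new_df <- data.frame(Race = factor(lv, levels = lv))
p <- predict(fit_pooled, newdata = new_df, type = "response")
names(p) <- lv

print(max_diff)
print(n_aliased)
print(p)
```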

Last edited by TranceBrah; 05-15-2019 at 04:34 PM.

#196
05-16-2019, 09:16 AM
Squeenasaurus
Member
SOA
Join Date: Jul 2016
College: Illinois State University
Favorite beer: Lagunitas
Posts: 160

Just realized there are no cheat sheets this sitting. IMO they didn't help much anyway.

#197
05-16-2019, 09:24 AM
ActuariallyDecentAtBest
Member
SOA
Join Date: Dec 2016
Posts: 208

Thoughts on the slight changes to the exam format?

#198
05-16-2019, 11:39 AM
DjPim
Member
SOA
Join Date: Nov 2015
Location: SoCal
Posts: 432

Quote:
Originally Posted by Squeenasaurus View Post
Just realized there are no cheat sheets this sitting. IMO they didn't help much anyway.
Agreed. All SOA 'solutions' graph the most basic things anyway, not much of a need to get creative. I think adding a +labs() on the plots will be perfect.

Quote:
Originally Posted by ActuariallyDecentAtBest View Post
Thoughts on the slight changes to the exam format?
90% of me likes it for the structure and the clearer point system. I took the sample project and did pretty well with it (IMO), but the other 10% of me is a little salty about needing to go back and refresh on more theory / less common material, because it's a lot more likely to get thrown in.

At least there's less writing now, so I won't spend 2 hours rushing to remember to note every single thing. A lot easier to write everything I'm thinking about each point, especially since they say "do these problems in order".

#199
05-16-2019, 12:20 PM
jts75
Member
SOA
Join Date: May 2018
College: Oklahoma State
Posts: 67

Quote:
Originally Posted by Squeenasaurus View Post
Just realized there are no cheat sheets this sitting. IMO they didn't help much anyway.
I forgot they were even there.

#200
05-16-2019, 03:36 PM
Josh Peck
Member
SOA
Join Date: Dec 2016
College: Towson University
Posts: 40

Quote:
Originally Posted by Coolkid22 View Post
Does anyone have a good explanation for when to use hierarchical clustering as opposed to k-means clustering?
https://www.youtube.com/watch?v=Tuuc9Y06tAc
__________________
P FM MFE C
All times are GMT -4.