Actuarial Outpost
 
Exam PA: Predictive Analytics
  #301  
Old 06-04-2019, 03:27 PM
DyalDragon DyalDragon is offline
Member
SOA
 
Join Date: Apr 2009
Location: Here
Studying for the hell of it...
College: AASU
Favorite beer: This one...
Posts: 33,511

Quote:
Originally Posted by DjPim View Post
cobalt best theme
I used cobalt at first, but my work monitors have this stupid adaptive brightness feature that can't be disabled, and it makes it very hard to see the blue text for comments when the screen dims.
  #302  
Old 06-04-2019, 04:37 PM
jractuary0004 jractuary0004 is offline
SOA
 
Join Date: Jun 2019
Posts: 5

Is anybody else at a point where they feel good about the material but are very worried that we only have five hours to complete the exam? When I took LTAM, the strategy for the written portion was that even if you didn't know a problem 100%, you could just write down what you did know and move on. But for this exam, it seems like each task builds on the previous one, so that strategy isn't really applicable. I feel like I'm going to run into a roadblock and never have enough time to finish the rest of the tasks.

Anyway, if you are feeling like this, how are you planning on prepping over the next week and 2 days?
  #303  
Old 06-04-2019, 05:47 PM
Adapt and Chill Adapt and Chill is offline
Member
SOA AAA
 
Join Date: Sep 2017
College: Davidson College
Posts: 189

Quote:
Originally Posted by jractuary0004 View Post
Is anybody else at a point where they feel good about the material but are very worried that we only have five hours to complete the exam? When I took LTAM, the strategy for the written portion was that even if you didn't know a problem 100%, you could just write down what you did know and move on. But for this exam, it seems like each task builds on the previous one, so that strategy isn't really applicable. I feel like I'm going to run into a roadblock and never have enough time to finish the rest of the tasks.

Anyway, if you are feeling like this, how are you planning on prepping over the next week and 2 days?
I sat in December and failed with a 5, largely because I used my time poorly. I submitted an incomplete Executive Summary and got a 2 on that section (7, 5, 6 for the other sections). Based on that, I plan to dedicate a hard limit of 2-2.5 hours for the R coding (preferably towards the lower end), then spend the last 2.5-3 hours writing. I went back-and-forth during the December sitting, which resulted in a disproportionate amount of time on data exploration and only a 7 to show for it. Based on the released solution, it seems like this happened to other candidates as well.

I didn't make effective adjustments to my data/features during the December exam, so my model was pretty bad, and I kept going back and trying to tweak things when I should've focused on the writing portion. If your model sucks or you get stuck on a task, you need the presence of mind to realize that, after a certain point, you'll salvage more points by writing about its limitations/constraints than by trying to fix it.

Example: Task 4 (Select an interaction) from the Hospital Readmissions project. If you end up exploring all of the feature pairs and can't see anything, just commit to an interaction that appears plausible and move on.
  #304  
Old 06-04-2019, 07:19 PM
Squeenasaurus Squeenasaurus is offline
Member
SOA
 
Join Date: Jul 2016
College: Illinois State University
Favorite beer: Lagunitas
Posts: 185

Quote:
Originally Posted by TranceBrah View Post
You cannot easily quantify the effect of your coefficients on your prediction like you could with a log or logit link (i.e., multiply by exp(beta)).
Can you elaborate on the logit function here? I'm having a hard time finding the interpretation of the coefficients of a GLM with a logit link function.

For a log link, say we have E(Y) = e^(b0 + b1*x1)
Then E(Y|x1=0) = e^(b0) and E(Y|x1=1) = e^(b0 + b1) = E(Y|x1=0)*e^(b1)

This is where the multiplicative effect comes into play for a log link function, but I don't believe you can say the same for a logit link function.
  #305  
Old 06-04-2019, 07:35 PM
avocado avocado is offline
Member
SOA
 
Join Date: Apr 2018
Posts: 94

Quote:
Originally Posted by Adapt and Chill View Post
I sat in December and failed with a 5, largely because I used my time poorly. I submitted an incomplete Executive Summary and got a 2 on that section (7, 5, 6 for the other sections). Based on that, I plan to dedicate a hard limit of 2-2.5 hours for the R coding (preferably towards the lower end), then spend the last 2.5-3 hours writing. I went back-and-forth during the December sitting, which resulted in a disproportionate amount of time on data exploration and only a 7 to show for it. Based on the released solution, it seems like this happened to other candidates as well.

I didn't make effective adjustments to my data/features during the December exam, so my model was pretty bad, and I kept going back and trying to tweak things when I should've focused on the writing portion. If your model sucks or you get stuck on a task, you need the presence of mind to realize that, after a certain point, you'll salvage more points by writing about its limitations/constraints than by trying to fix it.

Example: Task 4 (Select an interaction) from the Hospital Readmissions project. If you end up exploring all of the feature pairs and can't see anything, just commit to an interaction that appears plausible and move on.
This is very good advice. I guess the main goal is to be able to write a complete report and justify your choices (even if the model is bad!), rather than to build a good model.
Given there are 10+ tasks, I'm not sure I'll be able to finish the report in 2-2.5 hours under exam conditions. I tried the Hospital example and was barely able to finish writing in 2.5 hours...
  #306  
Old 06-04-2019, 07:47 PM
Američanka Američanka is offline
SOA
 
Join Date: Aug 2014
Posts: 14

I'm having a hard time understanding the error diagnostics output from the plot(model) function. Are these important to run for a regression model if you've already looked at RMSE and AIC? Are these useful at all for a classification problem?

Referring to the following:

Residuals versus fitted plot
Normal Q-Q plot
Scale-Location plot (square root of standardized residuals versus fitted values)
Residuals versus Leverage
  #307  
Old 06-04-2019, 08:38 PM
PredictTheseAnalytics PredictTheseAnalytics is offline
SOA
 
Join Date: Jun 2019
Posts: 2

Does anyone understand the cutoff value? In the student success problem, I understand it conceptually as: if the student has a predicted probability of passing above x, mark it as P. (Please correct me if I'm wrong.) In the solution, however, there doesn't seem to be a good explanation of why it's 0.5. Is it because the resulting proportion of passes is similar to the actual data?
  #308  
Old 06-04-2019, 08:44 PM
RiskyBusiness7 RiskyBusiness7 is offline
Member
SOA
 
Join Date: Apr 2018
Posts: 53

Quote:
Originally Posted by PredictTheseAnalytics View Post
Does anyone understand the cutoff value? In the student success problem, I understand it conceptually as: if the student has a predicted probability of passing above x, mark it as P. (Please correct me if I'm wrong.) In the solution, however, there doesn't seem to be a good explanation of why it's 0.5. Is it because the resulting proportion of passes is similar to the actual data?
Yeah, I think they set up that check to show you that the cutoff they chose reproduces roughly the same split as the raw data, i.e. that you're getting about 65% passes and 35% fails. So you would've had to change the cutoff if it didn't create representative test sets.
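
To make the cutoff idea concrete, here is a minimal Python sketch (not from the exam solution; the probabilities and outcomes are made-up numbers, and classify, pass_rate and accuracy are hypothetical helper names). It labels each observation a pass when its predicted probability exceeds the cutoff, then compares the predicted pass rate and the accuracy across a few cutoffs.

```python
# Hypothetical predicted pass probabilities and actual outcomes (1 = pass, 0 = fail).
probs = [0.91, 0.78, 0.66, 0.58, 0.52, 0.47, 0.35, 0.81, 0.29, 0.62]
actual = [1, 1, 1, 0, 1, 0, 0, 1, 0, 1]


def classify(probs, cutoff):
    """Label an observation as a pass (1) when its probability exceeds the cutoff."""
    return [1 if p > cutoff else 0 for p in probs]


def pass_rate(labels):
    """Proportion of observations labelled as a pass."""
    return sum(labels) / len(labels)


def accuracy(pred, actual):
    """Proportion of predictions that match the actual outcome."""
    return sum(p == a for p, a in zip(pred, actual)) / len(actual)


print("actual pass rate:", pass_rate(actual))
for cutoff in (0.3, 0.5, 0.7):
    pred = classify(probs, cutoff)
    print(cutoff, pass_rate(pred), accuracy(pred, actual))
```

Raising the cutoff can only lower the predicted pass rate, so one sanity check is whether the chosen cutoff gives a pass rate close to the one observed in the data.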


On a different note, does anyone know why, in all of the exercises and such, we only use the caret method "repeatedcv" when doing ensemble methods, and normal "cv" for base models?
  #309  
Old 06-04-2019, 09:13 PM
TranceBrah TranceBrah is offline
Member
SOA
 
Join Date: Mar 2014
Location: Best Coast
Posts: 238

Quote:
Originally Posted by Squeenasaurus View Post
Can you elaborate on the logit function here? I'm having a hard time finding the interpretation of the coefficients of a GLM with a logit link function.

For a log link, say we have E(Y) = e^(b0 + b1*x1)
Then E(Y|x1=0) = e^(b0) and E(Y|x1=1) = e^(b0 + b1) = E(Y|x1=0)*e^(b1)

This is where the multiplicative effect comes into play for a log link function, but I don't believe you can say the same for a logit link function.
The multiplicative effect applies to the odds p/(1-p) rather than to the probability of success.

log(p/(1-p)) = XB, where p = probability of success. exp(coefficient) is the factor by which the odds are multiplied for a one-unit increase in that predictor, i.e. the odds ratio.

Here is a nice example with factor and continuous predictors https://stats.idre.ucla.edu/other/mu...ic-regression/
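
A quick numeric check of this, as a Python sketch with made-up coefficients (b0 and b1 are hypothetical, not taken from any exam model): going from x = 0 to x = 1 multiplies the odds by exactly exp(b1), while the ratio of the probabilities is a different number that depends on the baseline.

```python
import math


def logistic(eta):
    """Inverse of the logit link: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def odds(p):
    """Odds of success for a given probability of success."""
    return p / (1.0 - p)


# Hypothetical coefficients, chosen only for illustration.
b0, b1 = -1.0, 0.7

p0 = logistic(b0)       # P(success | x = 0)
p1 = logistic(b0 + b1)  # P(success | x = 1)

odds_ratio = odds(p1) / odds(p0)
prob_ratio = p1 / p0

print(f"odds ratio        = {odds_ratio:.4f}")
print(f"exp(b1)           = {math.exp(b1):.4f}")
print(f"probability ratio = {prob_ratio:.4f}")
```

With a log link the same exercise would give a ratio of means equal to exp(b1); with a logit link it is only the odds that scale that way.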
  #310  
Old 06-04-2019, 09:25 PM
TranceBrah TranceBrah is offline
Member
SOA
 
Join Date: Mar 2014
Location: Best Coast
Posts: 238

Quote:
Originally Posted by Američanka View Post
I'm having a hard time understanding the error diagnostics output from the plot(model) function. Are these important to run for a regression model if you've already looked at RMSE and AIC? Are these useful at all for a classification problem?

Referring to the following:

Residuals versus fitted plot
Normal Q-Q plot
Scale-Location plot (square root of standardized residuals versus fitted values)
Residuals versus Leverage
Pearson residuals are not that useful because in a GLM the error term is not assumed to be normal with constant variance, so you cannot use Q-Q plots or check for things like homoscedasticity the way you typically would with ordinary linear regression.

However, there is something called the deviance residual (you can obtain it with residuals(glm1, type = "deviance")), which adjusts the raw residuals for the shape of your distribution. These will be approximately normal with constant variance if the data fit the distribution well. Note that they are no good for discrete distributions (binomial, Poisson), so they wouldn't have been useful in any exam we have seen.
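
For intuition, here is a small Python sketch of what a deviance residual is, using the Gamma distribution as an assumed example of a continuous family (the observed and fitted values are made up, and the function names are hypothetical). It uses the standard Gamma unit deviance, without any dispersion scaling, and prints the Pearson residual next to it.

```python
import math


def gamma_deviance_residual(y, mu):
    """Signed square root of the Gamma unit deviance 2*((y - mu)/mu - log(y/mu))."""
    unit_dev = 2.0 * ((y - mu) / mu - math.log(y / mu))
    # max() guards against tiny negative values caused by rounding.
    return math.copysign(math.sqrt(max(unit_dev, 0.0)), y - mu)


def gamma_pearson_residual(y, mu):
    """Raw residual divided by sqrt of the Gamma variance function V(mu) = mu^2."""
    return (y - mu) / mu


# Hypothetical observed values and fitted means.
observed = [0.5, 1.0, 2.0, 4.0]
fitted = [1.0, 1.0, 1.0, 2.0]

for y, mu in zip(observed, fitted):
    d = gamma_deviance_residual(y, mu)
    p = gamma_pearson_residual(y, mu)
    print(f"y={y}, mu={mu}: deviance={d:.4f}, Pearson={p:.4f}")
```

The deviance residual pulls in large positive errors and stretches negative ones relative to the Pearson residual, which is the "adjusting for the shape of the distribution" described above.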