Actuarial Outpost
 
Actuarial Outpost > Exams - Please Limit Discussion to Exam-Related Topics > SoA/CAS Preliminary Exams > Exam PA: Predictive Analytics
  #261  
Old 05-30-2019, 08:20 PM
TranceBrah TranceBrah is offline
Member
SOA
 
Join Date: Mar 2014
Location: Best Coast
Posts: 238
Default

Quote:
Originally Posted by noone View Post
Does anyone know if we are able to use the ? function that exists within R during the exam?
Better hope so lmao
  #262  
Old 05-31-2019, 03:25 AM
neham_86 neham_86 is offline
SOA
 
Join Date: Jun 2015
Location: Texas
Studying for PA
Posts: 5
Default

This is from Module 7.4

$ TERM_FLAG : int 1 1 1 1 0 1 0 1 0 1 ...
$ GENDER : int 1 1 1 1 1 1 0 1 1 1 ...
$ AGE : int 30 50 39 43 61 34 75 29 70 72 ...
$ MARSTAT : int 1 1 1 1 1 1 0 1 1 1 ...
$ EDUCATION : int 16 9 16 17 15 11 8 16 17 17 ...
$ SAGE : int 27 47 38 35 59 31 0 31 74 70 ...
$ SEDUCATION: int 16 8 16 14 12 14 0 17 16 14 ...
$ NUMHH : int 3 3 5 4 2 4 1 3 2 2 ...
$ logINCOME : num 10.67 9.39 11.7 10.6 10.13 ...
$ logCHARITY: num 0 0 6.21 0 6.21 ...
$ AGEdiff : int 3 3 1 8 2 3 0 2 4 2 ...

Should we convert GENDER and MARSTAT to factor variables before using rpart?
  #263  
Old 05-31-2019, 09:48 AM
Meepo Meepo is offline
Member
SOA
 
Join Date: Mar 2014
Location: Here In My garage
Studying for Undecided
College: University of Phoenix
Posts: 237
Default

Quote:
Originally Posted by noone View Post
Does anyone know if we are able to use the ? function that exists within R during the exam?
I think so; it doesn't look like '?' requires an internet connection.
__________________
P FM MFE C LTAM PA The Rest
Meepo is love, Meepo is life.
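For reference, a minimal sketch of the ways to reach R's built-in help. Help pages ship with R and with each installed package, so these calls read local files. Whether the exam environment permits them is not confirmed anywhere in this thread.

```r
# Open the help page for a function; the two forms are equivalent.
?predict
help("predict")

# Help for a function in a specific installed package.
help("lm", package = "stats")

# Search the installed documentation when the exact name is unknown.
help.search("linear model")   # shorthand: ??"linear model"

# Quick look at a function's arguments without opening the help page.
args(lm)
```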
  #264  
Old 05-31-2019, 09:52 AM
ubhutto ubhutto is offline
SOA
 
Join Date: Oct 2013
College: St. John's university
Posts: 28
Default

Quote:
This is from Module 7.4

$ TERM_FLAG : int 1 1 1 1 0 1 0 1 0 1 ...
$ GENDER : int 1 1 1 1 1 1 0 1 1 1 ...
$ AGE : int 30 50 39 43 61 34 75 29 70 72 ...
$ MARSTAT : int 1 1 1 1 1 1 0 1 1 1 ...
$ EDUCATION : int 16 9 16 17 15 11 8 16 17 17 ...
$ SAGE : int 27 47 38 35 59 31 0 31 74 70 ...
$ SEDUCATION: int 16 8 16 14 12 14 0 17 16 14 ...
$ NUMHH : int 3 3 5 4 2 4 1 3 2 2 ...
$ logINCOME : num 10.67 9.39 11.7 10.6 10.13 ...
$ logCHARITY: num 0 0 6.21 0 6.21 ...
$ AGEdiff : int 3 3 1 8 2 3 0 2 4 2 ...

Should we convert GENDER and MARSTAT to factor variables before using rpart?
When you plot the tree, it's cleaner if the variables are factors, because the split just shows Yes or No rather than >= 1. At least that's what it showed me. I usually convert such variables to factors for easier visualization.
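A rough sketch of the comparison described above, assuming the rpart package is installed. The data frame `toy` and the relationship used to simulate TERM_FLAG are made up for illustration and are not the Module 7.4 data. For a 0/1 variable the tree can partition the rows the same way whether the column is an integer or a factor; what changes is how the split is labelled when the tree is printed or plotted.

```r
library(rpart)

set.seed(1)
n <- 200

# Hypothetical stand-in for the module data: 0/1 flags stored as integers.
toy <- data.frame(
  GENDER  = rbinom(n, 1, 0.6),
  MARSTAT = rbinom(n, 1, 0.7),
  AGE     = sample(25:75, n, replace = TRUE)
)
# Made-up relationship so the tree has something to find.
toy$TERM_FLAG <- rbinom(n, 1, plogis(-1 + 1.5 * toy$MARSTAT + 0.02 * (toy$AGE - 50)))

# Same data with the flags converted to factors.
toy_fac <- toy
toy_fac$GENDER    <- factor(toy_fac$GENDER)
toy_fac$MARSTAT   <- factor(toy_fac$MARSTAT)
toy_fac$TERM_FLAG <- factor(toy_fac$TERM_FLAG)

tree_int <- rpart(TERM_FLAG ~ GENDER + MARSTAT + AGE, data = toy, method = "class")
tree_fac <- rpart(TERM_FLAG ~ GENDER + MARSTAT + AGE, data = toy_fac, method = "class")

# Compare how the splits on the flag variables are labelled in each version.
print(tree_int)
print(tree_fac)
```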
  #265  
Old 05-31-2019, 10:29 AM
Josh Peck Josh Peck is offline
Member
SOA
 
Join Date: Dec 2016
College: Towson University
Posts: 99
Default

Quote:
Originally Posted by neham_86 View Post
Should we convert GENDER and MARSTAT to factor variables before using rpart?
One of the first things I do when I pull in data is run summary() and str().

I run str() to look at the data types and view a few example values.
Then convert all the data types to the data types that make sense.

It might not actually end up mattering, but it's good practice.
__________________
P FM MFE C PA
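The inspect-then-convert routine above might look like the following sketch. The data frame `df` is a hypothetical four-row example that reuses a few column names from the str() output quoted earlier; which columns "make sense" as factors is a judgment call for each dataset.

```r
# Hypothetical data frame with 0/1 flags read in as integers.
df <- data.frame(
  GENDER    = c(1L, 1L, 0L, 1L),
  MARSTAT   = c(1L, 0L, 1L, 1L),
  AGE       = c(30L, 50L, 39L, 43L),
  logINCOME = c(10.67, 9.39, 11.70, 10.60)
)

# Step 1: look at the types and a few example values.
str(df)
summary(df)

# Step 2: convert the columns that are really categories.
flag_cols <- c("GENDER", "MARSTAT")
df[flag_cols] <- lapply(df[flag_cols], factor)

# Step 3: check the result; summary() now shows counts per level for the factors.
str(df)
summary(df)
```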
  #266  
Old 05-31-2019, 03:01 PM
ChrisPap ChrisPap is offline
SOA
 
Join Date: Aug 2014
Posts: 19
Default

If we have to choose between two or more models (e.g., an rf model and an xgbTree) based on training data, I assume we must use the same folds for each model when performing cross-validation, right?
__________________
P,FM,MFE,C,MLC,
VEEs,PA
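Nobody in the thread answers this, so the following is only a sketch of one way to share folds, written in base R with made-up data (`dat`, `cv_rmse` and the two lm formulas are hypothetical). Assigning fold labels once and reusing them means every model is scored on identical held-out rows, so differences in error come from the models rather than from the split. In caret, the `index` argument of trainControl() should serve the same purpose by accepting a list of training-row indices that is reused across train() calls.

```r
set.seed(42)
n <- 100

# Hypothetical data: y depends on x1 only.
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 + rnorm(n)

# Assign each row to one of k folds ONCE, then reuse the labels for every model.
k <- 5
fold_id <- sample(rep(seq_len(k), length.out = n))

# Cross-validated RMSE for a linear model, using the supplied fold labels.
# The response column is assumed to be named y.
cv_rmse <- function(formula, data, fold_id) {
  errs <- sapply(sort(unique(fold_id)), function(f) {
    train <- data[fold_id != f, ]
    test  <- data[fold_id == f, ]
    fit   <- lm(formula, data = train)
    sqrt(mean((test$y - predict(fit, newdata = test))^2))
  })
  mean(errs)
}

# Both candidates are evaluated on exactly the same folds.
rmse_a <- cv_rmse(y ~ x1, dat, fold_id)
rmse_b <- cv_rmse(y ~ x1 + x2, dat, fold_id)
c(model_a = rmse_a, model_b = rmse_b)
```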
  #267  
Old 05-31-2019, 05:42 PM
ActuariallyDecentAtBest ActuariallyDecentAtBest is offline
Member
SOA
 
Join Date: Dec 2016
Posts: 383
Default

These modules are really not that great.
  #268  
Old 06-01-2019, 09:12 AM
RiskyBusiness7 RiskyBusiness7 is offline
Member
SOA
 
Join Date: Apr 2018
Posts: 53
Default

I'm practicing the basics again, and cannot get my predict function to work. Please let me know if you've seen this error...

galton <- read.csv("galton.csv")
lm_galton <- lm(child ~ parent, data = galton)
predictions <- predict(lm_galton,newdata = galton$parent)

output:
Error in eval(predvars, data, env) : numeric 'envir' arg not of length one
  #269  
Old 06-01-2019, 09:46 AM
ChrisPap ChrisPap is offline
SOA
 
Join Date: Aug 2014
Posts: 19
Default

Quote:
Originally Posted by RiskyBusiness7 View Post
I'm practicing the basics again, and cannot get my predict function to work. Please let me know if you've seen this error...

galton <- read.csv("galton.csv")
lm_galton <- lm(child ~ parent, data = galton)
predictions <- predict(lm_galton,newdata = galton$parent)

output:
Error in eval(predvars, data, env) : numeric 'envir' arg not of length one
Your newdata is a vector, not a data.frame containing the variables used in the formula. Also, since you aren't predicting on new data but on the same data the model was built on, you could simply leave the argument out.
__________________
P,FM,MFE,C,MLC,
VEEs,PA
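A minimal sketch of both fixes mentioned above. Since galton.csv isn't available here, the data frame is simulated; the coefficients and sample size are made up, and only the column names `parent` and `child` come from the original post.

```r
set.seed(1)

# Hypothetical stand-in for galton.csv with the same two column names.
galton <- data.frame(parent = rnorm(50, mean = 68, sd = 2))
galton$child <- 24 + 0.65 * galton$parent + rnorm(50, sd = 2)

lm_galton <- lm(child ~ parent, data = galton)

# Option 1: omit newdata to get fitted values for the training rows.
pred_fitted <- predict(lm_galton)

# Option 2: pass a data frame that contains a column named `parent`.
pred_same <- predict(lm_galton, newdata = galton)

# Option 3: predict for genuinely new parent heights, again as a data frame.
pred_new <- predict(lm_galton, newdata = data.frame(parent = c(64, 68, 72)))

# Options 1 and 2 agree, because the rows are the same.
all.equal(pred_fitted, pred_same)
```

Passing `galton$parent` fails because it is a bare numeric vector with no column name, so predict() cannot look up `parent` in it.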
  #270  
Old 06-01-2019, 10:00 AM
ubhutto ubhutto is offline
SOA
 
Join Date: Oct 2013
College: St. John's university
Posts: 28
Default

Quote:
I'm practicing the basics again, and cannot get my predict function to work. Please let me know if you've seen this error...

galton <- read.csv("galton.csv")
lm_galton <- lm(child ~ parent, data = galton)
predictions <- predict(lm_galton,newdata = galton$parent)

output:
Error in eval(predvars, data, env) : numeric 'envir' arg not of length one
There are two issues here.
1. You should partition the dataset into training and testing sets, so the model is evaluated on rows it has not seen.
2. In predict(), pass the whole data frame, e.g. newdata = testdf, rather than a single column like galton$parent as you did.
Quote:
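galton <- read.csv("galton.csv")

library(caret)
set.seed(123)  # arbitrary seed so the random split is reproducible
# Keep 80% of the rows for training and hold out the rest for testing
partition <- createDataPartition(galton$child, p = 0.8, list = FALSE)
traindf <- galton[partition, ]
testdf <- galton[-partition, ]

# Fit on the training rows only
lm_galton <- lm(child ~ ., data = traindf)
summary(lm_galton)

# newdata must be a data frame containing the predictor columns
predictions <- predict(lm_galton, newdata = testdf)
sse.test <- sum((predictions - testdf$child)^2)  # test-set sum of squared errors
sse.test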