
#41




Quote:
Though in the file I downloaded today they added the 3.6.1 folder, it only has one file in it. I replaced the 3.6.1 folder with the copy and it worked.
__________________
P, FM, MFE, MLC, C 
#42




#43




Just saw tonight that Actex delayed the release of its manual (again), and ASM has also delayed theirs. I prepaid for the Actex manual and am starting to get extremely frustrated. I want to know when I can honestly expect the study guide to be available; I want to be studying with it already. I have the SOA modules and the source texts, so I suppose I can keep grinding through those, but this is really disappointing. I basically expect Actex to delay again next week.
__________________
Prelims: FAP Modules: 1  2  3  4  IA  6  7  FA VEE: 
#44




__________________
Ambrose Lo, PhD, FSA, CERA
Associate Professor of Actuarial Science (with tenure)
Department of Statistics and Actuarial Science, The University of Iowa
ACTEX Manual for SOA Exam PA | ACTEX Manual for SOA Exam SRM | ACTEX Manual for CAS Exam MAS-I
Textbook: Derivative Pricing: A Problem-Based Primer (useful for the derivatives portion of Exam IFM)
#45




The ACTEX manual author is above this comment, but is this study note written by the ASM manual author? https://www.soa.org/globalassets/ass...restrates.pdf
I thought most ASM manuals were written by Professor Weishaus. I have to take the exam again, and I wish a formal manual had existed before this June sitting.
#46





#47




Hey, I just found out I passed the June exam with a 9 after failing the December exam with a 4.
Some advice I'll give: when it comes down to it, this exam is more of a speed-writing exam than a predictive analytics exam. I don't think they care if your analysis results are useless and terrible; all they care about is whether you can explain what you did well. Also, the modules are absolutely terrible. Reading CAS Monograph 5 helped me a lot; chapters 1, 2, and 6 are the most relevant for the exam. https://www.casact.org/pubs/monograp...hareTevet.pdf

Last, here are my model notes:

## GLMs

GLMs are comprehensive models that take all significant variables into account, assessing the relative importance of each predictor while also producing an easy-to-implement formula to calculate a prediction for a given observation.

Advantages
- Produce an easy-to-implement formula to calculate a prediction for a given observation
- Good for modeling continuous response variables (as opposed to tree-based methods)

Disadvantages
- Inherently do not capture nonlinear relationships well
- Risk of multicollinearity producing less-than-optimal models
- Have assumptions which may not always be met

# Family

Family refers to the distribution that the target variable is assumed to follow. This impacts how the algorithm fits the model; it has to do with the error terms of the target variable. In the glm function in R:

- binomial
  - Takes on values of 0 and 1 (FALSE and TRUE respectively)
  - Mean equals the probability of a 1, p
  - Variance equals p * (1 - p)
- gaussian
  - Ranges over all real numbers
  - Symmetric
  - Constant variance
- Gamma
  - Ranges over all positive numbers
  - Skewed with a long right tail
  - Variance equals the mean squared
- inverse.gaussian
  - Ranges over all positive numbers
  - More skewed than gamma, with a longer right tail and a higher peak
  - Variance equals the mean cubed
- poisson
  - Count variable ranging over the nonnegative integers (including 0)
  - Variance equals the mean
  - Skewed with a right tail, but as the mean increases the distribution becomes approximately normal
- quasipoisson
  - In real data sets the variance is usually higher than the mean; this adds a dispersion parameter to the poisson distribution to account for that
  - Coefficients will be the same as the poisson family, but p-values will be different, which can impact feature selection
- quasibinomial
  - Similar to quasipoisson: the variance accounts for overdispersion
  - Coefficients will be the same as the binomial family, but p-values will be different, which can impact feature selection
- quasi
  - ????
- Other: tweedie
  - Has a discrete probability mass at 0 and a continuous probability for positive values
  - Variance equals the mean raised to a power p, which can be specified
  - If p = 0 the distribution is gaussian; if p = 1, poisson; if p = 2, gamma; if p = 3, inverse gaussian
  - Commonly used in insurance to model total claim amount; p is selected between 1 and 2 so that it sits between poisson (to model number of claims) and gamma (to model claim severity)

# Link Function

g(u.i) = B0 + B1*X1.i + ... + Bp*Xp.i
u.i = g^-1(B0 + B1*X1.i + ... + Bp*Xp.i)

where
- i is the ith observation in the data set
- u.i is the predicted mean of the target variable for the ith observation
- p is the number of predictors
- B0, B1, ..., Bp are the coefficients of the model
- X1.i, ..., Xp.i are the predictor values for the ith observation
- g() is the link function

The linear component predicts a function of the mean. The linear component ranges over all real numbers (from -infinity to +infinity); the link function can be used to force the predicted mean for a specific observation onto a specific range. In the glm function in R:

- logit
  - g(u) = log(u / (1 - u)); g^-1(x) = exp(x) / (1 + exp(x))
  - Log odds (coefficients can be interpreted as a multiplicative impact on the odds)
  - Predicted mean must be between 0 and 1
  - Canonical link function for the binomial family
- probit
  - Standard normal CDF (coefficients can be interpreted as the impact on the z-value of a standard normal distribution)
  - Predicted mean must be between 0 and 1
  - Curve is almost identical to logit; the main difference is interpretation
- cauchit
  - Cauchy CDF (note that the Cauchy distribution is symmetric)
  - Predicted mean must be between 0 and 1
  - Has heavier tails than logit and probit
  - Can be prone to overfitting
- cloglog
  - Predicted mean must be between 0 and 1
  - Curve is asymmetric: rounded near probability 0, sharp near probability 1
- identity
  - Predicted mean can be any real number
  - Canonical link function for the gaussian family
  - A one-unit change in a predictor changes the prediction by the coefficient
- log
  - Predicted mean must be a positive number
  - Canonical link function for the poisson family
  - Coefficients can be interpreted as a multiplicative impact
- sqrt
  - Predicted mean must be a positive number (inclusive of 0)
- 1/mu^2
  - Canonical link function for the inverse gaussian family
  - ????
- inverse
  - ????

# Regularized Regression

An additional penalty term, lambda * B, is added to the loss function of the GLM. This penalizes coefficients to avoid overfitting the model. The parameter lambda is a positive value that controls the strength of the penalty; as lambda approaches infinity, all coefficients approach (or are set to exactly) 0.

- Ridge regression
  - B = SUM(B1^2 + B2^2 + ... + Bp^2)
  - This makes coefficients close to, but not exactly, 0
- Lasso regression
  - B = SUM(abs(B1) + abs(B2) + ... + abs(Bp))
  - This allows coefficients to be set to exactly 0
  - Thus, it performs feature selection
- Elastic net
  - The penalty is a weighted average of the ridge and lasso penalties
  - The parameter alpha is the weight of the lasso penalty

Important note: predictors must be scaled prior to using this algorithm.
Otherwise, predictors with large orders of magnitude will force all coefficients to be unreasonably small.

Advantages
- Feature selection is automatic and uses cross-validation error to minimize prediction error
- Binarization is always done, so each factor level is treated as a separate feature

Disadvantages
- Because features are scaled, the model coefficients are difficult to interpret
- The R package only works with a limited number of model choices
- We may be trying to optimize a value other than prediction error, such as AUC

## Tree-Based Methods

# Decision Trees

Decision trees are easily interpretable models based on a series of if/then statements that clearly highlight key factors, interactions, and nonlinear relationships.

Advantages
- Easy to interpret
- Highlight interactions
- Automatically perform feature selection (features simply do not show up in the tree)
- Can capture nonlinear relationships

Disadvantages
- Prone to overfitting even with cost-complexity pruning (high variance)
- Splits made on continuous variables may suggest discontinuities where little difference exists

# Random Forests

A random forest is an algorithm that creates many trees using different samples of predictors and data for each tree. The prediction is then the mode or average of all trees.

Advantages
- Reduces variance in predictions by averaging the results of all trees
- Captures interactions
- Captures nonlinear relationships
- Uses cross-validation to set the tuning parameter

Disadvantages
- Not easily interpretable; can be considered a "black box" model
- Prone to overfitting
- Requires high computational power and thus takes a long time to run
- Difficult to implement

# Gradient Boosting Machines (GBMs)

A gradient boosting machine is an algorithm that fits a series of trees, fitting each subsequent tree with more weight placed on the observations that were not predicted well by the previous tree.
The prediction is a weighted average or mode of all trees, with the first tree having the most weight and each subsequent tree weighted less.

Advantages
- Improves accuracy by placing weight on poor predictions in subsequent trees (reduces bias)
- Reduces overfitting (reduces variance)
- Captures interactions
- Captures nonlinear relationships

Disadvantages
- Not easily interpretable; can be considered a "black box" model
- Sensitive to hyperparameters
- Requires very high computational power and thus takes a very long time to run
- Difficult to implement
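One way to keep the family list straight is through the variance functions mentioned above: the response variance is proportional to V(mean), and the tweedie power p interpolates between the named families. A minimal sketch (the thread's code is R, but Python is used here for a self-contained illustration; the function name is mine, not from any package):

```python
# Variance functions V(u) for the families listed above. These are
# standard exponential-family facts; the function is purely illustrative.

def variance_function(family, u, p=None):
    if family == "binomial":
        return u * (1 - u)          # variance = p * (1 - p)
    if family == "gaussian":
        return 1.0                  # constant variance
    if family == "poisson":
        return u                    # variance = mean
    if family == "gamma":
        return u ** 2               # variance = mean squared
    if family == "inverse.gaussian":
        return u ** 3               # variance = mean cubed
    if family == "tweedie":
        return u ** p               # variance = mean to the power p
    raise ValueError(family)

# The tweedie power p reproduces the named families at integer values:
u = 2.0
for p, family in [(0, "gaussian"), (1, "poisson"), (2, "gamma"), (3, "inverse.gaussian")]:
    assert variance_function("tweedie", u, p=p) == variance_function(family, u)
```

This is why an insurance total-claims model picks p between 1 and 2: its variance behavior sits between poisson (claim counts) and gamma (claim severity).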
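The link functions in the notes above all map a constrained mean onto the whole real line, and their inverses map the linear predictor back. A stdlib-only Python sketch of logit, probit, and cloglog (function names are mine; R's glm handles all of this internally):

```python
import math

# g maps a mean u in (0, 1) onto the real line; g^-1 maps any
# linear-predictor value back into (0, 1).

def logit(u):
    return math.log(u / (1 - u))

def inv_logit(x):
    return math.exp(x) / (1 + math.exp(x))

def probit(u):
    # Inverse of the standard normal CDF via bisection
    # (the stdlib has no quantile function).
    lo, hi = -10.0, 10.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if 0.5 * (1 + math.erf(mid / math.sqrt(2))) < u:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def cloglog(u):
    return math.log(-math.log(1 - u))

def inv_cloglog(x):
    return 1 - math.exp(-math.exp(x))

# Round trips: applying g then g^-1 recovers the mean.
u = 0.8
assert abs(inv_logit(logit(u)) - u) < 1e-9
assert abs(inv_cloglog(cloglog(u)) - u) < 1e-9

# Any real-valued linear predictor lands back in (0, 1):
for eta in (-5, 0, 5):
    assert 0 < inv_logit(eta) < 1
```

This is the sense in which the link "forces" the predicted mean onto a valid range: the linear component can be any real number, but the inverse link always returns a legal mean.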
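The ridge-versus-lasso distinction in the notes (shrinks toward 0 versus sets exactly 0) can be seen from the closed-form solutions for a single standardized predictor. This is a textbook special case, not how glmnet actually fits models; the function names are mine:

```python
def ridge_shrink(b, lam):
    # Closed-form ridge solution for an orthonormal design:
    # coefficients are shrunk toward 0 but never reach it exactly.
    return b / (1 + lam)

def lasso_shrink(b, lam):
    # Closed-form lasso solution (soft-thresholding) for an orthonormal
    # design: coefficients smaller than lambda are set to exactly 0.
    if abs(b) <= lam:
        return 0.0
    return b - lam if b > 0 else b + lam

ols = [3.0, 0.5, -0.2]   # hypothetical unpenalized coefficients
lam = 1.0
print([ridge_shrink(b, lam) for b in ols])  # [1.5, 0.25, -0.1] -- none are 0
print([lasso_shrink(b, lam) for b in ols])  # [2.0, 0.0, 0.0] -- feature selection
```

The soft-threshold is why lasso performs feature selection automatically, and it also shows why scaling matters: lambda is compared against every coefficient on the same footing, so unscaled predictors would be penalized unevenly.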
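The GBM idea above (each tree fits what the previous trees got wrong) can be sketched from scratch with depth-1 stumps on a toy 1-D data set. Real implementations (R's gbm or xgboost) are far more elaborate; this is only meant to show the fit-the-residuals loop:

```python
# Toy gradient boosting for 1-D regression: each stage fits a stump
# to the residuals left by the previous stages.

def fit_stump(xs, resid):
    # Find the split point that minimizes squared error of the residuals.
    best = None
    for split in xs:
        left = [r for x, r in zip(xs, resid) if x <= split]
        right = [r for x, r in zip(xs, resid) if x > split]
        if not left or not right:
            continue
        lmean = sum(left) / len(left)
        rmean = sum(right) / len(right)
        sse = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, split, lmean, rmean)
    _, split, lmean, rmean = best
    return lambda x: lmean if x <= split else rmean

def boost(xs, ys, n_trees=20, lr=0.5):
    pred = [0.0] * len(xs)
    stumps = []
    for _ in range(n_trees):
        resid = [y - p for y, p in zip(ys, pred)]   # what is still unexplained
        stump = fit_stump(xs, resid)
        stumps.append(stump)
        pred = [p + lr * stump(x) for p, x in zip(pred, xs)]
    return lambda x: sum(lr * s(x) for s in stumps)

xs = [0, 1, 2, 3, 4, 5]
ys = [0.0, 0.1, 0.3, 2.8, 3.1, 3.0]   # a step-like relationship
model = boost(xs, ys)
sse = sum((y - model(x)) ** 2 for x, y in zip(xs, ys))
assert sse < 0.5   # training error shrinks as trees are added
```

The learning rate `lr` is one of the hyperparameters the notes warn GBMs are sensitive to: smaller values need more trees but overfit less aggressively.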
__________________

#48




To follow up on Josh Peck's advice:
Here are 3 recommendations from someone who just got a 9 on the June '19 sitting after getting a 1 on the Dec '18 sitting:

1. Finish writing the executive summary, because it is worth the most points (write, write, write). Read through all of their posted papers/executive summaries and form an algorithm you know by heart for what you will write about, regardless of what data they throw at you.

2. Since this is a relatively new exam, they are still trying to write questions and scenarios that cover all of the topics on the syllabus. Focus on obscure and difficult topics in the slides that the published sample projects/exams have not discussed or utilized. This strategy paid off 3 times on the June sitting. I think weights and offsets are a good bet.

3. Whatever code they give you to use is very intentional. I recommend figuring out how to use their code rather than building something from scratch. They probably have the wrong values in it at the start, and you just have to sub in the answer or fields that you really want to use. They honestly are trying to save you as much time as they can in their code; they know it is a 5-hour exam and don't want you spending much of it coding. Their intent is to take it easy on you with the coding they expect. Focus on learning the basics of R, which I think is best done through all of the sample projects/exams rather than the slide exercises. But per my 2nd point, if they don't have a project/exam that uses a topic yet, it would be worth your extra study time to be comfortable with code that uses difficult topics.

In short, use whatever code they give you! Unless you are either (a) successful on your first attempt at coding it yourself or (b) an R expert who can just "wing it", a 2nd failed attempt at code that doesn't use theirs is a sign you just wasted a bunch of time. Go back to their code and figure out why it should be of good use.
__________________
ASA GH Track: 
#49





#50




I passed the June sitting and I wanted to share some notes about my experience that might be useful.
1) The points assigned to each task are not proportional to the amount of time you will spend on them. The first four tasks took me about 3 hours to complete, leaving just over 2 hours for everything else. I initially skipped task 10 to write the executive summary instead, but ended up with plenty of time to go back to it afterwards.

2) They're not testing your ability to code. Probably 90% of the code that you need is already provided by your "assistant." Mostly you just tweak it to make it work. This is overwhelmingly an analysis exam, and most of your time will be spent processing results.

3) With prior experience in predictive analytics, I was able to pass by studying the modules for about 6 weeks. I'd say if you've taken a graduate-level course in predictive modeling, you will likely have a similar experience; the modules were more of a refresher than anything. I didn't feel I needed any of the recommended textbooks, but I did supplement with online resources where needed.

4) The sample solution they provided was super helpful, much more so than the modules. It really lays out the SOA's expectations and is fairly easy to replicate. I think my most valuable study time was the last couple of days, when I just went through the sample task by task and compared my answers to the ones given.

Hope this can be of some help to those getting started for December. Rest assured that you still have plenty of time, even if you've just begun!
__________________
IA FA ASA ?????? 

