Thank you! I am really enjoying the job and the people with and for whom I work. It has been a long time coming for me.
I will do my best with the outline, but I do have a tendency to make big study-prep plans like this and then run out of time.
They may ask us to build an RF and warn us not to change a very small ntree. Then they will want to know why the RF is messed up and trivially reduces variance at the expense of interpretability. We say the current model sucks and isn't worth it. They ask for a potential improvement without proof, with another warning (underlined and boldface) not to run the improved RF or change the ntree/other parameters from the stupid model.
Other predictions: Certainly a smaller dataset with fewer usable original predictor variables. Target is pure premium; discuss whether to use 1 or 2 final models. Compound Tweedie Bird. Right-skewed data somewhere requiring a log transform and subsequent interpretation of coefficients/stuff. Module 8 stuff. Features. Perhaps a logistic regression model. Classification tree.
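On the log-transform item: once the target is logged (or the model uses a log link), a coefficient acts multiplicatively on the original scale, so a one-unit increase in a predictor changes the fitted target by roughly (exp(beta) - 1) x 100 percent. A minimal sketch of that conversion, in Python only to keep it self-contained (the exam itself is in R); the coefficient value 0.05 is made up for illustration and the function name is hypothetical.

```python
import math

def pct_change_per_unit(beta):
    """Percent change in the original-scale target for a one-unit increase
    in a predictor, given its coefficient on the log scale."""
    return (math.exp(beta) - 1.0) * 100.0

# Hypothetical coefficient, not taken from any exam or module output.
beta = 0.05
print(f"beta = {beta}: about {pct_change_per_unit(beta):.2f}% change per unit")
```

For small coefficients the percent change is close to 100 x beta, which is why the shortcut reading often works, but it drifts as the coefficient grows.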
I am trying to register for the module but what I get is:
Error: Unexpected error encountered, your current change will be reset. Is it because of SOA or me? Should I do something to get transition credit, or is it an automated process? Lastly, when should we use the ExamPA promotional code?
End of Module 6, exercise question
1. When the answer writer for Rmd 6.8 is trying to figure out which variables to get rid of, and they aren't purely looking at the AIC (for convergence reasons), they take out variables with high p-values. But a couple of times (like leading up to Chunk #18), they take out SAGE instead of AGE, even though AGE has a higher p-value. Running Chunks 18, (19), and 20 yields a final model with AGE having a p-value of 0.398 and RMSE = 764k, but if you change the AGE to SAGE instead, then the p-value of SAGE is 0.121 in the model and RMSE = 742k (which seems to imply a better model).
And the answer writer is not a priori convinced of never taking AGE out of the model, because just before Chunk #13, they try taking out AGE; but as it happens, the process doesn't converge without AGE and SAGE, and so they end up including AGE.
2. I had assumed that one would never settle on a model where any of the betas has a p-value greater than 0.05 (aside from when two variables need each other, as in the MARSTAT with AGEDiff), but the writer does keep high p-value betas. For example, NUMHH in the results from Chunk #18 is at 0.382, which means that the beta for NUMHH could easily be negative instead of positive. And given that the objective of this end of Module 6 example is to "assist marketing in identifying important factors", telling marketing that there's a positive beta when it could be negative is not helpful.
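To make the sign concern in point 2 concrete, here is a rough sketch under a normal (Wald) approximation: a two-sided p-value implies a standard error for the estimate, and from that a confidence interval. Only the p-value 0.382 comes from the post; the estimate of 1.0 is a made-up stand-in because the actual NUMHH coefficient is not quoted, and the function name is hypothetical. It is written in Python for self-containedness, not because the exercise uses it.

```python
from statistics import NormalDist

def wald_interval_from_p(beta_hat, p_value, level=0.95):
    """Back out the standard error implied by a two-sided Wald p-value and
    return (std_error, lower, upper) for the requested confidence level.
    Assumes a normal approximation and 0 < p_value < 1."""
    std_normal = NormalDist()
    z_observed = std_normal.inv_cdf(1.0 - p_value / 2.0)  # equals |beta_hat| / SE
    std_error = abs(beta_hat) / z_observed
    z_critical = std_normal.inv_cdf(1.0 - (1.0 - level) / 2.0)
    return (std_error,
            beta_hat - z_critical * std_error,
            beta_hat + z_critical * std_error)

# Hypothetical positive estimate; only the p-value is taken from the post.
se, lower, upper = wald_interval_from_p(beta_hat=1.0, p_value=0.382)
print(f"implied SE = {se:.3f}, 95% interval = ({lower:.3f}, {upper:.3f})")
print("interval includes zero:", lower < 0.0 < upper)
```

Under this approximation any p-value above 0.05 gives a 95% interval that straddles zero, which matches the worry that the sign of the NUMHH beta is not pinned down. It does not by itself say whether the variable should be dropped.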
Sitting for June PA after taking April LTAM
I'm considering the plausibility of passing PA with only 6 weeks of study. I'm not sure what the suggested study hours are (usually given per hour of exam time for other exams) but suspect it's less than other exams. I work in R every day and have an MS in Statistics, so I also suspect I'll have less new ground to cover than the average test-taker here, but have no idea how MUCH less.
Is anyone attempting it in such a short amount of time? The upside is pretty high for going for it despite low odds, but I may make the purchase and realize there's no way. Appreciate all input.
