How to handle data only available for certain observations?
In the sample project, SAGE (spouse age) and SEDUCATION (spouse education) have zeros listed for individuals who do not have a spouse. How should I treat these types of variables in a model? I feel as if including the variables without doing some sort of transformation is misleading or could misrepresent the importance of spousal information for individuals who have spouses.
Would an interaction term suffice in a GLM? Would a transformation suffice? Or would two different models be needed (one on married individuals and one on single individuals)?
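One way to picture the indicator/interaction idea is the small sketch below. It assumes, as the post describes, that a zero in SAGE means "no spouse". The records, the `has_spouse` name, and the coefficients are made up for illustration. Because SAGE is already zero for singles, the interaction `has_spouse * SAGE` is identical to SAGE itself, so adding the indicator alone is enough to let the spouse-age slope apply only to people who have a spouse.

```python
# Hypothetical records: SAGE == 0 encodes "no spouse", as described in the post.
records = [
    {"age": 34, "SAGE": 36},
    {"age": 51, "SAGE": 0},
    {"age": 45, "SAGE": 41},
    {"age": 29, "SAGE": 0},
]


def add_spouse_features(rec):
    """Return a copy of rec with a spouse indicator and an explicit interaction."""
    out = dict(rec)
    out["has_spouse"] = 1 if rec["SAGE"] > 0 else 0
    out["SAGE_x_has_spouse"] = out["has_spouse"] * rec["SAGE"]
    return out


features = [add_spouse_features(r) for r in records]


def linear_predictor(row, b0, b_ind, b_sage):
    """Illustrative linear predictor: intercept + indicator shift + spouse-age slope."""
    return b0 + b_ind * row["has_spouse"] + b_sage * row["SAGE"]


# Singles get only the intercept; the SAGE slope never touches them.
single_eta = linear_predictor(features[1], b0=1.0, b_ind=2.0, b_sage=0.5)
married_eta = linear_predictor(features[0], b0=1.0, b_ind=2.0, b_sage=0.5)
```

With this setup the indicator's coefficient absorbs the "has a spouse at all" effect, and the SAGE coefficient is read as the effect of spouse age among people with spouses. Whether that beats fitting two separate models is a judgment call that depends on the data.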
Some things I'm focusing on and thinking about: if the target is continuous, graph it and see what kind of distribution it follows. Think about which variables are logically correlated given the problem and see whether you can combine them. Know how to read residual plots. Know what likelihood and deviance are. Be able to interpret coefficients in logistic regression, where each categorical level's coefficient is read relative to the base level (exponentiated, it is a multiplicative change in the odds). AIC/BIC, forward/backward selection, ridge/lasso (all the penalty stuff). Be comfortable with trees, forests, and how gradient boosting works. If you have a Poisson model on grouped data, you probably want the log of the exposures as an offset; on that same note, look for overdispersion. Cross-validation techniques and why you choose certain parameters. This is jumbled and quick, but feel free to add/critique.
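To make the offset and overdispersion point concrete, here is a minimal sketch with made-up grouped counts and exposures. It uses the simplest possible case, an intercept-only Poisson model with a log link and log(exposure) as the offset, where the fitted rate is just total claims over total exposure. The data and variable names are hypothetical.

```python
import math

# Hypothetical grouped data: claim counts and exposure (e.g. policy-years) per group.
claims = [2, 0, 5, 1, 9]
exposure = [10, 5, 20, 8, 12]

# The offset enters the linear predictor with a fixed coefficient of 1.
offset = [math.log(e) for e in exposure]

# Intercept-only Poisson GLM with log link and this offset:
# the maximum-likelihood rate is total claims / total exposure.
rate = sum(claims) / sum(exposure)
intercept = math.log(rate)

# Fitted mean for each group: exp(intercept + offset) = rate * exposure.
fitted = [math.exp(intercept + o) for o in offset]

# Pearson dispersion estimate: sum of squared Pearson residuals over residual df.
n_params = 1  # intercept only
pearson_chi2 = sum((y - mu) ** 2 / mu for y, mu in zip(claims, fitted))
dispersion = pearson_chi2 / (len(claims) - n_params)
```

A dispersion estimate well above 1 hints at overdispersion, in which case a quasi-Poisson or negative binomial model may be worth considering. With more predictors the fit needs an iterative solver, but the offset and the dispersion check work the same way.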
I got nervous after reading something in mod 6 that I misinterpreted. 
Lasso Residual Plot
Does anyone know if there is a quick way, within the packages available for the exam, to produce residual, Q-Q, leverage, and scale-location plots for lasso and ridge regression models, similar to the plot() output for regular GLMs?
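I'm not certain there is a built-in shortcut, but the quantities behind most of those plots can be computed by hand from the model's predictions. The sketch below is written in Python only to keep it self-contained, and the same arithmetic carries over to any language. The observed and fitted values are made up, and the standardization is a simplification with no leverage adjustment. Leverage is left out because it depends on a hat matrix, which the lasso does not provide in closed form.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical observed values and predictions from an already-fitted
# lasso or ridge model (numbers invented for illustration).
observed = [3.1, 4.8, 2.2, 6.0, 5.1, 3.9]
fitted = [3.4, 4.5, 2.9, 5.6, 4.8, 4.1]

# Raw residuals: these go on the y-axis of a residuals-vs-fitted plot.
residuals = [y - f for y, f in zip(observed, fitted)]
resid_vs_fitted = list(zip(fitted, residuals))

# Simplified standardization using the sample mean and standard deviation.
m, s = mean(residuals), stdev(residuals)
std_resid = [(r - m) / s for r in residuals]

# Q-Q plot: theoretical normal quantiles against sorted standardized residuals.
n = len(std_resid)
theoretical = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
qq_pairs = list(zip(theoretical, sorted(std_resid)))

# Scale-location plot: sqrt(|standardized residual|) against fitted values.
scale_location = [(f, abs(r) ** 0.5) for f, r in zip(fitted, std_resid)]
```

Each list of pairs can then be passed to whatever scatter-plot function is available.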

Here's my main question...how many hours are you guys all thinking is a reasonable number to study for this exam? I've only studied 30 hours but feel like I've covered a decent amount of the material so far. So maybe I'll study about 100 or 150 total?

Thanks!!! This gives me some hope. I am just getting started but then again I have no prior knowledge in predictive modelling
Get familiar with modules 6-8, as they will be the bulk of your exam.
