#6   12-03-2017, 08:16 PM
Chuck
Member, SOA AAA
Join Date: Oct 2001 · Location: Illinois · Posts: 4,414
Hi MPC, Stephen - I was on the webinar and enjoyed it. Certainly a lot to bite off in 90 minutes.

So I am going to ask some really basic, expose-my-ignorance questions (feel free to treat any of this as feedback, and use it however you use feedback)...

I am bad with the lingo and also not knowledgeable about what languages like R actually do. My main interest is in understanding the life underwriting predictive modeling/indexing projects. Tell me if this is basically what is going on, or set me straight where I am not making sense...

So say I am LexisNexis (or some other data aggregator) and I have a training database with lots of fields that I think are related to mortality expectations. I assume they are things like credit scores, financial info, medical info, other stuff(?) on individuals. I don't see how you can use that to directly predict mortality rates. So I am assuming what we are really doing is developing a "formula" or "algorithm" which uses the data to predict the underwriting class (preferred, select, standard, rated, etc.) that would be assigned if the individual actually went through traditional medical underwriting.
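
(To make my mental model concrete, here is a minimal R sketch of what I imagine that step looks like. All the field names are made up, and I'm assuming a data frame `training` with the traditionally-assigned class in a column `uw_class`.)

# Fit a classifier that predicts the traditional underwriting class
# from aggregator-style fields (hypothetical names throughout).
library(nnet)                # multinom() handles a multi-level outcome

fit <- multinom(uw_class ~ credit_score + bmi + rx_count + income,
                data = training)

# Predicted class for applicants who never went through full medical underwriting
predicted_class <- predict(fit, newdata = new_applicants)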

Presumably the big reinsurers can then take those predictions, apply them to their database of risks, and back-test how well the predictions match the actual assignments. To the extent that they don't match, presumably they can look at their mortality experience, reallocate it to the new classes, and try to determine whether the predictive index actually does a better or worse job (I've heard predictions that the PI actually does better in the out years than traditional underwriting).
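
(Again just sketching my understanding in R: with `predicted_class` from above and an assumed `actual_class` column on the reinsurer's data, I picture the back-test comparison starting as a simple cross-tabulation.)

# Compare predicted vs. actually-assigned classes to see where they disagree.
confusion <- table(Predicted = predicted_class, Actual = actual_class)
print(confusion)

# Overall agreement rate between the index and traditional underwriting
sum(diag(confusion)) / sum(confusion)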

So, for a simple example, when you are doing some multi-variable linear regression on, say, n variables (X1, ..., XN), you come up with an "index formula" of coefficients (C1, ..., CN) such that INDEX = C1*X1 + ... + CN*XN, and you use the index to categorize the risks into classes.

So what does R do exactly? Is it coming up with the coefficients C1, ..., CN that best fit the training data (under some measure of fit)?
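
(Here is the kind of call I picture, if I have it right; `index` and `x1`...`x3` are placeholder names. My understanding is that `lm()` solves for the coefficients that minimize the sum of squared errors on the supplied data.)

# Ordinary least squares: lm() finds the coefficients C0, C1, ..., CN
# that minimize the sum of squared residuals on the supplied data.
fit <- lm(index ~ x1 + x2 + x3, data = training)
coef(fit)        # the fitted coefficients
summary(fit)     # fit statistics (R-squared, standard errors, etc.)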

Then is the exercise to choose the relevant variables, or other, more complex methods besides linear regression, and keep trying candidate indexes until you decide on the one that appears to work best? And is R the tool that performs some algorithm to achieve the best fit once you have chosen your variables and method?
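
(If so, I imagine the "try alternatives and pick the best" step looks something like this in R, with AIC as one possible measure of fit; again, the names are placeholders.)

# Compare a plain linear model against a slightly richer alternative
# on the same data; lower AIC suggests a better fit/complexity trade-off.
fit_linear <- lm(index ~ x1 + x2 + x3, data = training)
fit_richer <- lm(index ~ x1 + x2 + x3 + x1:x2, data = training)  # adds an interaction
AIC(fit_linear, fit_richer)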

Is that basically what is being done? Set me straight where I go astray if you will.

Now, what if I had additional data, maybe proprietary or maybe other public info, that I think could improve upon the index? Would it make sense to build another training database that includes my data plus the published index result (as a single variable in my new data), and then use that plus my fields to see if I can come up with a better-working index that builds upon the original one?
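
(In R terms, I picture the published score just becoming one more column, something like the sketch below; `published_index`, `outcome`, and the `prop_*` fields are all hypothetical.)

# Treat the vendor's published index as one more predictor alongside
# proprietary fields, and let the fit decide how much weight it keeps.
fit_stacked <- lm(outcome ~ published_index + prop_field1 + prop_field2,
                  data = my_training)
coef(fit_stacked)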

Last edited by Chuck; 12-03-2017 at 08:22 PM..