

#1




Creating a GLM to compare to current Rater
I'm coming from a reserving background and am in pricing now.

I've been tasked with building a GLM to compare against our current rating system, which is maintained in Excel. The Excel rater is multiplicative, with each category's rate set by underwriters using professional judgment. The end comparison will be which model has the better predicted-vs-actual accuracy. Then I need to compare the individual coefficients of my model to the factors in the Excel rater; once I encode the categorical variables, there will be around 150 coefficients to compare.

My initial thought was to build two models:
1.) a Poisson-distributed, log-linked GLM targeting incurred counts, weighted by exposure;
2.) a Gamma-distributed, log-linked GLM targeting incurred severity, weighted by incurred counts.
Then at the end I'd multiply the two predictions to get a pure premium. However, I think this would be tough to tie back to the rater's values. So instead, should I do a single GLM with a Tweedie distribution (p ~ 1.5), targeting incurred losses and weighted by exposure? Does that make sense for this task?

Once I prove GLMs are a better choice, I'll have the green light to build more sophisticated models; I just need to get underwriting on board. Some additional questions I have regarding building the model:
1. Should I leave $0 claims in the data?
2. Should I remove large outlier losses?
3. Should I include Policy Year as a predictor?
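For concreteness, here is roughly how the two candidate structures look in R (a sketch only; `Class` and the column names are placeholders for whatever the real rating variables are, and the data here is simulated):

```r
library(statmod)  # provides the tweedie() family for glm()

set.seed(1)
# Toy data standing in for the real policy-level dataset
n <- 500
dat <- data.frame(
  Class    = factor(sample(c("A", "B", "C"), n, replace = TRUE)),
  Exposure = runif(n, 0.5, 2)
)
dat$IncurredCount <- rpois(n, 0.3 * dat$Exposure)
dat$Incurred <- dat$IncurredCount * rgamma(n, shape = 2, rate = 0.002)

# Option 1: frequency-severity pair, predictions multiplied at the end
freq <- glm(IncurredCount ~ Class + offset(log(Exposure)),
            family = poisson(link = "log"), data = dat)
sev  <- glm(Incurred / IncurredCount ~ Class,
            family = Gamma(link = "log"), weights = IncurredCount,
            data = subset(dat, IncurredCount > 0))

# Option 2: single Tweedie pure-premium model
pp <- glm(Incurred / Exposure ~ Class,
          family = tweedie(var.power = 1.5, link.power = 0),  # link.power = 0 is the log link
          weights = Exposure, data = dat)
```

Note that Option 2 targets pure premium (`Incurred / Exposure`) with exposure as the weight, which is a common parameterization for the Tweedie approach.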

#2




Quote:
1. Yes, leave in the $0 claims. The Tweedie distribution has a point mass at $0.
2. Cap the outliers instead of removing them. The average prediction coming out of your model will then be lower than the average target in the data; use an adjustment factor to bring all predictions up to the average target. That way you can still compare with the Excel rater.
3. Probably not, unless Policy Year is a proxy for something else. When you implement the model, do not include Policy Year in the calculations. (But don't take it out of the model if it is controlling for something, or your coefficients will change and try to pick up some of its noise in their signal.)
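The adjustment in point 2 is a single multiplicative factor chosen so the weighted average prediction equals the weighted average target. A minimal sketch, with `pred`, `actual`, and `w` as hypothetical prediction, observed-target, and exposure-weight vectors:

```r
# Scale all predictions by one factor so the weighted average prediction
# matches the weighted average observed target.
rebalance <- function(pred, actual, w) {
  adj <- weighted.mean(actual, w) / weighted.mean(pred, w)
  pred * adj
}

# Toy example:
pred   <- c(100, 200, 300)
actual <- c(150, 250, 350)
w      <- c(1, 2, 1)
rebal  <- rebalance(pred, actual, w)
# weighted.mean(rebal, w) now equals weighted.mean(actual, w)
```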
#3




Thanks! What's a typical R^2 for a baseline model? After various tweaks, I'm only getting around 0.2.

I couldn't get the model to converge when I used exposure as the `weights` argument. Should I just leave exposure in as a predictor, or divide incurred by the exposure? One last question: to get the coefficients on the same scale as the rater, do I need to take the log of the coefficients, since I used a log link?
#4




Quote:
Yes, it would be reasonable to have a loss ratio or pure premium target instead of a pure loss target. But if you leave the exposure out entirely, you're assuming that exposure and loss aren't correlated, which is a step backwards. Check out the GLM monograph: https://www.casact.org/pubs/monograp...hareTevet.pdf The equation at the top of page 5 might be what you're looking for when comparing coefficients.
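On the coefficient-scale question: with a log link you exponentiate the fitted coefficients, rather than take their log, to get multiplicative relativities comparable to the rater's factors. A self-contained sketch on toy severity data (the names here are made up):

```r
set.seed(1)
# Toy data: Class B runs about twice the mean loss of Class A
dat <- data.frame(
  Class = factor(rep(c("A", "B"), each = 200)),
  Loss  = c(rgamma(200, shape = 2, rate = 0.002),
            rgamma(200, shape = 2, rate = 0.001))
)
m <- glm(Loss ~ Class, family = Gamma(link = "log"), data = dat)

# With a log link, mu = exp(b0) * exp(b1)^[Class == B], so exp() of the
# coefficients gives the base level (intercept) and the Class B relativity.
exp(coef(m))
```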
#5




Thanks! I already read through most of the monograph and started building the model. I think my idea of comparing the rating relativities in the Excel workbook to the coefficients the model spits out is a step in the wrong direction. I should just compare their predictive power and give insight into which variables are statistically important.

Ah yeah, I guess R^2 is a poor measure here, since the model is fit by deviance rather than least squares. AIC doesn't show up in the summary() of the model, so I guess I'll have to compute it separately. Definitely not used to working with Tweedie.

Last edited by Actuarially Me; 12-20-2018 at 01:44 PM.
#6




Quote:
You can get a Tweedie AIC: https://www.rdocumentation.org/packa...ics/AICtweedie

Last edited by itGetsBetter; 12-20-2018 at 02:21 PM. Reason: spelling
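Usage is straightforward, assuming a model fit with the statmod `tweedie()` family (toy data here; `AICtweedie` picks up the `var.power` from the fitted family):

```r
library(statmod)   # tweedie() family for glm()
library(tweedie)   # AICtweedie()

set.seed(1)
# Toy policy-level data
dat <- data.frame(Exposure = runif(300, 0.5, 2),
                  Class    = factor(sample(c("A", "B"), 300, replace = TRUE)))
cnt <- rpois(300, 0.3 * dat$Exposure)
dat$Incurred <- cnt * rgamma(300, shape = 2, rate = 0.002)

pp <- glm(Incurred / Exposure ~ Class,
          family = tweedie(var.power = 1.5, link.power = 0),
          weights = Exposure, data = dat)

# summary(pp) reports no AIC for the tweedie family; AICtweedie computes it
# from the Tweedie density at the var.power used in the fit.
AICtweedie(pp)
```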
#9




I don't think R^2 is a good metric to use here. You may want to double-check this, but I saw a similarly low value for a Poisson model that I built, and my data-scientist coworker explained that there's not much variance to reduce in the first place (people only have a handful of claims per policy), so the R^2 isn't going to be very high.

You want to compare the performance of the GLM against the current rater, so you should be looking at lift charts with the current rater as the baseline. Another model you should try is a GLM using only the current rater's variables: if they were never modeled before, you may get a good performance boost simply by recalibrating the coefficients (and it may be easier to get through for regulatory/filing purposes). Remember, you don't have to build a perfect model, just one that beats the current one (and the other guy's).
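A minimal sketch of the lift comparison (inputs hypothetical): bucket policies into deciles by each scorer's prediction, then look at the average actual loss per decile. The scorer whose deciles show the steeper, more monotone actual-loss pattern is segmenting risk better.

```r
# Exposure-weighted average actual loss by prediction decile.
decile_actuals <- function(pred, actual, w, n_bins = 10) {
  ord <- order(pred)
  grp <- ceiling(n_bins * cumsum(w[ord]) / sum(w))  # equal-exposure bins
  grp <- pmin(pmax(grp, 1), n_bins)
  tapply(actual[ord] * w[ord], grp, sum) / tapply(w[ord], grp, sum)
}

# Compare side by side, e.g. (pp, rater_prediction, PurePrem are placeholders):
# lift_glm   <- decile_actuals(predict(pp, type = "response"), dat$PurePrem, dat$Exposure)
# lift_rater <- decile_actuals(rater_prediction, dat$PurePrem, dat$Exposure)
```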
Tags 
glm tweedie, weights 