Actuarial Outpost
 
Actuarial Outpost > Actuarial Discussion Forum > Property - Casualty / General Insurance


 
  #1  
12-20-2018, 09:38 AM
Actuarially Me
Member
CAS
 
Join Date: Jun 2013
Posts: 115
Creating a GLM to compare to current Rater

I'm coming from a reserving background and am in pricing now.

I'm tasked with building a GLM to compare to our current rating system, which is maintained in Excel. The Excel rater is multiplicative, with each category's rate set by underwriters using professional judgment. The end comparison will be which model has the better predicted-versus-actual accuracy. Then I need to compare the individual coefficients of my model to what is in the Excel rater. Once I encode the categorical variables, there will be around 150 coefficients to compare.


My initial thought was to build two models:

1.) a Poisson-distributed, log-linked GLM targeting incurred claim counts, weighted by exposure.

2.) a Gamma-distributed, log-linked GLM targeting incurred losses, weighted by incurred claim counts.

Then at the end, I'd multiply the two sets of predictions to get a pure premium. However, I think this would be tough to tie back to the rater's values.
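A minimal sketch of that two-part setup in R (policies, claim_count, incurred, exposure, and the rating variables x1 and x2 are placeholder names):

Code:
# Frequency model: Poisson GLM on claim frequency (counts per exposure),
# weighted by exposure. R warns about non-integer responses here; an
# offset(log(exposure)) formulation on raw counts is an equivalent alternative.
freq_fit <- glm(claim_count / exposure ~ x1 + x2,
                family  = poisson(link = "log"),
                weights = exposure,
                data    = policies)

# Severity model: Gamma GLM on average cost per claim, weighted by claim
# counts and fit only on records that actually have claims.
sev_fit <- glm(incurred / claim_count ~ x1 + x2,
               family  = Gamma(link = "log"),
               weights = claim_count,
               data    = subset(policies, claim_count > 0))

# Combined prediction: expected frequency times expected severity gives the
# expected pure premium per unit of exposure.
policies$pp_pred <- predict(freq_fit, newdata = policies, type = "response") *
                    predict(sev_fit,  newdata = policies, type = "response")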


So instead, should I build a single Tweedie-distributed GLM with p ~ 1.5, targeting incurred losses and weighted by exposure? Does that make sense for this task?

Once I show that GLMs are a better choice, I'll have the green light to build more sophisticated models; I just need to get underwriting on board.

Some additional questions I have regarding building the model:
  • For a Tweedie distribution, do I leave in $0 claims?
  • Claims are limited to $500k per occurrence, but there are policies that are outliers. Should I remove those?
  • The losses/counts I'm using are trended, so do I need to include Policy Year?
  #2  
12-20-2018, 12:38 PM
itGetsBetter
Member
CAS AAA
 
Join Date: Feb 2016
Location: Midwest
Studying for Exam 9
Favorite beer: Spruce Springsteen
Posts: 207

Quote:
Originally Posted by Actuarially Me
Some additional questions I have regarding building the model:
  • For a Tweedie distribution, do I leave in $0 claims?
  • Claims are limited to $500k per occurrence, but there are policies that are outliers. Should I remove those?
  • The losses/counts I'm using are trended, so do I need to include Policy Year?
Tweedie is reasonable because if you model frequency and severity separately and then multiply them, you are assuming there is no correlation between them, when in reality there probably is. Tweedie will fit better. Use tweedie.profile in R to determine the best p parameter.
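For reference, a sketch of how tweedie.profile() (from the tweedie package) might be called; the data frame and column names are placeholders, and the grid of p values is just an example:

Code:
library(tweedie)   # tweedie.profile()

# Profile the likelihood over a grid of candidate power parameters
# (1 < p < 2 is the compound Poisson-Gamma region, which allows exact zeros).
prof <- tweedie.profile(incurred / exposure ~ x1 + x2,
                        data    = policies,
                        weights = policies$exposure,
                        p.vec   = seq(1.2, 1.8, by = 0.1),
                        do.plot = TRUE)

prof$p.max   # maximum-likelihood estimate of p to use when fitting the GLM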

1. Yes, leave in the $0 claims. Tweedie has a point mass at $0.
2. Cap the outliers instead of removing them. The average prediction coming out of your model will then be lower than the average (uncapped) target in the data, so use an adjustment factor to bring all predictions up to the average target (see the sketch below). That way you can still compare with the Excel rater.
3. Probably not, unless Policy Year is a proxy for something else. When you implement the model, do not include Policy Year in the calculations. (But don't take it out of the fitted model if it is controlling for something, or your other coefficients will shift to pick up some of its signal.)
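A sketch of the rebalancing step in point 2; pp_pred stands for whatever prediction the fitted model produces, and the other names are placeholders:

Code:
# One overall off-balance factor: scale every prediction so the
# exposure-weighted average prediction matches the exposure-weighted
# average actual pure premium in the data.
off_balance <- weighted.mean(policies$incurred / policies$exposure, w = policies$exposure) /
               weighted.mean(policies$pp_pred, w = policies$exposure)

policies$pp_pred_adj <- policies$pp_pred * off_balance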
  #3  
12-20-2018, 01:07 PM
Actuarially Me

Thanks! What's a typical R^2 for a baseline model? After various tweaks, I'm only getting around 0.2.

I couldn't get the model to converge when I used exposure as the "weight" argument. Should I just leave exposure in as a predictor, or divide incurred by the exposure?

One last question: to get the coefficients on the same scale as the rater, do I need to take the log of the coefficients, since I used a log link?
  #4  
12-20-2018, 01:15 PM
itGetsBetter

Quote:
Originally Posted by Actuarially Me
Thanks! What's a typical R^2 for a baseline model? After various tweaks, I'm only getting around 0.2.

I couldn't get the model to converge when I used exposure as the "weight" argument. Should I just leave exposure in as a predictor, or divide incurred by the exposure?

One last question: to get the coefficients on the same scale as the rater, do I need to take the log of the coefficients, since I used a log link?
I'm not sure how you are getting R^2 from a Tweedie model. I usually use AIC, lift charts (on a holdout dataset), and Gini coefficients to evaluate model performance. It just depends on the data, the company, and the model, so I don't have any baselines for these.
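One way a simple decile lift chart on a holdout set might be built (a sketch; holdout and its columns are placeholders, and pp_fit stands for the fitted pure premium GLM):

Code:
# Score the holdout set and compute actual pure premiums.
holdout$pp_pred   <- predict(pp_fit, newdata = holdout, type = "response")
holdout$actual_pp <- holdout$incurred / holdout$exposure

# Bucket records into deciles of the prediction and compare averages:
# a model with good lift shows actuals rising steadily across the deciles
# and tracking the predicted line.
holdout$decile <- cut(rank(holdout$pp_pred, ties.method = "first"),
                      breaks = 10, labels = FALSE)
lift <- aggregate(cbind(actual_pp, pp_pred) ~ decile, data = holdout, FUN = mean)

matplot(lift$decile, lift[, c("actual_pp", "pp_pred")], type = "b",
        xlab = "Prediction decile", ylab = "Average pure premium")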

Yes, it would be reasonable to have a loss ratio or pure premium target instead of a pure loss target. If you leave the exposure out, then you're assuming that exposure and loss aren't correlated, which is a step backwards.
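A sketch of that formulation, assuming the tweedie() family from the statmod package and the same placeholder names as above:

Code:
library(statmod)   # tweedie() family for glm()

# Pure premium target (incurred divided by exposure) with exposure as the
# case weight, rather than raw incurred weighted by exposure.
pp_fit <- glm(incurred / exposure ~ x1 + x2,
              family  = tweedie(var.power = 1.5, link.power = 0),  # p from tweedie.profile()
              weights = exposure,
              data    = policies)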

Check out the GLM monograph: https://www.casact.org/pubs/monograp...hare-Tevet.pdf
The equation at the top of page 5 might be what you're looking for when comparing coefficients.
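That equation is also why the conversion is an exponential rather than a log: with a log link the fitted model is multiplicative, so exponentiating each coefficient gives a relativity on the rater's scale. A sketch (pp_fit is a placeholder fitted model):

Code:
# log(mu) = b0 + b1*x1 + ...   =>   mu = exp(b0) * exp(b1*x1) * ...
# so exponentiated coefficients are multiplicative relativities.
relativities <- exp(coef(pp_fit))

relativities["(Intercept)"]   # base rate at the reference levels
relativities[-1]              # factors for the encoded levels, relative to their base levels

Note that relativities for categorical levels come out relative to whatever base level R chose, so they may need re-basing before lining them up against the rater's factors.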
  #5  
12-20-2018, 01:36 PM
Actuarially Me

Thanks! I already read through most of the monograph and started building the model. I think my idea of trying to compare the rating relativities in the Excel workbook to the coefficients the model spits out is a step in the wrong direction. I should just compare their predictive power and give insight into which variables are statistically important.

Ah yeah, I guess R^2 is a useless measure since the model is fit on deviance rather than least squares. AIC doesn't show up in summary() for the model, so I guess I'll have to compute those metrics individually.

Definitely not used to working with Tweedie.

Last edited by Actuarially Me; 12-20-2018 at 01:44 PM..
  #6  
12-20-2018, 02:21 PM
itGetsBetter

Quote:
Originally Posted by Actuarially Me
Thanks! I already read through most of the monograph and started building the model. I think my idea of trying to compare the rating relativities in the Excel workbook to the coefficients the model spits out is a step in the wrong direction. I should just compare their predictive power and give insight into which variables are statistically important.

Ah yeah, I guess R^2 is a useless measure since the model is fit on deviance rather than least squares. AIC doesn't show up in summary() for the model, so I guess I'll have to compute those metrics individually.

Definitely not used to working with Tweedie.
Glad to help. Good luck with your comparison and with convincing the underwriters! Even if a variable is really predictive, they usually prefer a more intuitive one.

You can get a tweedie AIC: https://www.rdocumentation.org/packa...ics/AICtweedie
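A usage sketch, assuming the model was fit with the tweedie() family from statmod (pp_fit is a placeholder fitted model):

Code:
library(tweedie)   # AICtweedie()

# AICtweedie() computes an AIC for a glm object fit with statmod's tweedie()
# family; the stock AIC is reported as NA for those fits.
AICtweedie(pp_fit)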

Last edited by itGetsBetter; 12-20-2018 at 02:21 PM.. Reason: spelling
  #7  
12-20-2018, 04:35 PM
Actuarially Me

Do you usually include counts as a predictor when you're doing severity?
  #8  
12-20-2018, 06:19 PM
itGetsBetter

Quote:
Originally Posted by Actuarially Me
Do you usually include counts as a predictor when you're doing severity?
No, the claim counts are the denominator of severity.
  #9  
12-21-2018, 10:24 AM
Colonel Smoothie
Member
CAS
 
Join Date: Sep 2010
College: Jamba Juice University
Favorite beer: AO Amber Ale
Posts: 48,751

I don't think R^2 is a good metric to use here. You may want to double-check this, but I saw a similarly low value for a Poisson model that I built, and my data scientist coworker explained that there's not much variance to reduce in the first place (people only get a handful of claims per policy), so the R^2 isn't going to be very high.

You want to compare the performance of the GLM against the current rater, so you should be looking at lift charts with the current rater as a baseline.
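A sketch of that comparison as a "double lift" chart, sorting on the ratio of the two predictions; holdout, glm_pred, rater_prem, and incurred are placeholders, with both prediction columns assumed to be on the same basis (e.g., expected loss cost per policy):

Code:
# Sort policies by how much the GLM and the current rater disagree, then
# bucket into deciles of that ratio.
holdout$sort_ratio <- holdout$glm_pred / holdout$rater_prem
holdout$decile     <- cut(rank(holdout$sort_ratio, ties.method = "first"),
                          breaks = 10, labels = FALSE)

# Within each bucket, compare both predictions to actual loss; the model
# whose ratios to actual stay closer to 1 across the buckets wins.
dbl <- aggregate(cbind(incurred, glm_pred, rater_prem) ~ decile,
                 data = holdout, FUN = sum)
dbl$glm_to_actual   <- dbl$glm_pred   / dbl$incurred
dbl$rater_to_actual <- dbl$rater_prem / dbl$incurred
dbl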

I think another model you should try is to run a GLM with the current rater's variables. If they were never modeled before, you may get a good performance boost by simply calibrating the coefficients (it may also be easier to do this for regulatory/filing purposes). Remember, you don't have to build a perfect model, just one that beats the current one (and the other guy's).
  #10  
12-21-2018, 10:37 AM
ALivelySedative
Member
CAS
 
Join Date: Dec 2013
Location: Land of the Pine
College: UNC-Chapel Hill Alum
Favorite beer: Red Oak
Posts: 3,105

Quote:
Originally Posted by itGetsBetter
You can get a tweedie AIC: https://www.rdocumentation.org/packa...ics/AICtweedie
I've had issues getting this to work in the past, though my GLM knowledge is limited to tinkering. I ran a pure premium model on a line of business just to see what it would do, and the default tweedie package leaves AIC blank, which is explained in the documentation, but I never could get AICtweedie to work at all, from what I recall.