Actuarial Outpost
 
  #681  
Old 02-26-2020, 09:21 AM
letsplaay letsplaay is offline
SOA
 
Join Date: Jul 2014
Posts: 27
Default

I'm fine. THIS IS FINE.
  #682  
Old 02-27-2020, 04:53 AM
KarimZ KarimZ is offline
Member
SOA
 
Join Date: May 2015
Location: Pakistan
College: Graduate, Bsc Accounting and Finance
Posts: 205
Default

Have a general question regarding GLM.

What would be your choice of distribution/link function for a GLM if you want to predict claim amounts?

There is data of 1 million individuals, and claim amounts are recorded against them. But approx 95% of those individuals have a 0 claim amount recorded against them.

Thoughts?
__________________
P FM MFE C LTAM PA

VEEs

FAP Interim FAP Final

APC

  #683  
Old 02-27-2020, 09:52 AM
Louisville_Toy Louisville_Toy is offline
Member
SOA
 
Join Date: Aug 2019
Posts: 68
Default

Reply With Quote
  #684  
Old 02-27-2020, 10:09 AM
LilActuary LilActuary is offline
SOA
 
Join Date: Dec 2019
Posts: 5
Default

Quote:
Originally Posted by KarimZ View Post
Have a general question regarding GLM.

What would be your choice of distribution/link function for a GLM if you want to predict claim amounts?

There is data of 1 million individuals, and claim amounts are recorded against them. But approx 95% of those individuals have a 0 claim amount recorded against them.

Thoughts?
That would be a Tweedie distribution, sir: discrete at zero and continuous beyond that. I believe you'd also need to use a log link for interpretability of model results. Research the package for doing this in R, where you'd need to start by determining the optimal power parameter for your data.

Happy to hear other opinions around here.
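
To make the "discrete at zero, continuous beyond that" point concrete, here is a minimal sketch in plain Python (not the R package mentioned above). It relies on the standard result that a Tweedie distribution with power parameter 1 < p < 2 is a compound Poisson-gamma sum, so P(Y = 0) = exp(-lambda). The function names `tweedie_to_poisson_gamma` and `simulate_tweedie`, and the values of `mu` and `p`, are made up for illustration; only the 95% share of zeros comes from the question.

```python
import math
import random


def tweedie_to_poisson_gamma(mu, phi, p):
    """Map Tweedie (mean mu, dispersion phi, power 1 < p < 2) to the
    equivalent compound Poisson-gamma parameters (lam, shape, scale)."""
    if not 1.0 < p < 2.0:
        raise ValueError("p must be strictly between 1 and 2")
    lam = mu ** (2.0 - p) / (phi * (2.0 - p))    # Poisson rate of claim counts
    shape = (2.0 - p) / (p - 1.0)                # gamma shape of each claim
    scale = phi * (p - 1.0) * mu ** (p - 1.0)    # gamma scale of each claim
    return lam, shape, scale


def simulate_tweedie(mu, phi, p, n, seed=0):
    """Draw n aggregate claim amounts as Poisson-many gamma claims."""
    rng = random.Random(seed)
    lam, shape, scale = tweedie_to_poisson_gamma(mu, phi, p)
    threshold = math.exp(-lam)
    draws = []
    for _ in range(n):
        # Poisson count by multiplying uniforms (Knuth); fine for small lam.
        count = 0
        prod = rng.random()
        while prod > threshold:
            count += 1
            prod *= rng.random()
        total = 0.0
        for _ in range(count):
            total += rng.gammavariate(shape, scale)
        draws.append(total)
    return draws


# Illustrative values only: mu and p are assumptions, not from the thread.
p = 1.5
mu = 100.0
target_zero_share = 0.95                  # the 95% of zeros in the question
lam_target = -math.log(target_zero_share)
phi = mu ** (2.0 - p) / (lam_target * (2.0 - p))  # dispersion implied by target

lam, shape, scale = tweedie_to_poisson_gamma(mu, phi, p)
sample = simulate_tweedie(mu, phi, p, n=20000)
zero_share = sum(1 for y in sample if y == 0.0) / len(sample)
print("implied P(Y=0):", math.exp(-lam))
print("simulated share of zeros:", zero_share)
```

The point of the sketch is only that a single Tweedie model can produce a large mass of exact zeros alongside continuous positive amounts; it does not estimate p, which is what the profiling step in the R package is for.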
  #685  
Old 02-27-2020, 10:13 AM
ActuariallyDecentAtBest ActuariallyDecentAtBest is offline
Member
SOA
 
Join Date: Dec 2016
Posts: 385
Default

Quote:
Originally Posted by LilActuary View Post
That would be a Tweedie distribution, sir: discrete at zero and continuous beyond that. I believe you'd also need to use a log link for interpretability of model results. Research the package for doing this in R, where you'd need to start by determining the optimal power parameter for your data.

Happy to hear other opinions around here.
I don't think I'd use a log link in this situation since he said that 95% of the claim amounts are zero.
  #686  
Old 02-27-2020, 10:20 AM
Louisville_Toy Louisville_Toy is offline
Member
SOA
 
Join Date: Aug 2019
Posts: 68
Default

Quote:
Originally Posted by LilActuary View Post
That would be a Tweedie distribution, sir: discrete at zero and continuous beyond that. I believe you'd also need to use a log link for interpretability of model results. Research the package for doing this in R, where you'd need to start by determining the optimal power parameter for your data.

Happy to hear other opinions around here.

https://stats.stackexchange.com/ques...-a-tweedie-glm

with p≈1.75
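
For anyone wondering what a fixed power like p = 1.75 does in practice, here is a rough sketch of the Tweedie unit deviance for 1 < p < 2 in plain Python. The helper names `tweedie_unit_deviance` and `total_deviance` and the toy data are assumptions for illustration, not anything from the linked answer. It also shows that the deviance is finite at y = 0 and that, for an intercept-only model with a log link, the fitted mean is just the sample mean, so log(mu) is well defined even when most observations are zero.

```python
import math


def tweedie_unit_deviance(y, mu, p):
    """Tweedie unit deviance for 1 < p < 2, with y >= 0 and mu > 0."""
    if not 1.0 < p < 2.0:
        raise ValueError("p must be strictly between 1 and 2")
    if y < 0.0 or mu <= 0.0:
        raise ValueError("need y >= 0 and mu > 0")
    # The y-only term vanishes at y = 0, which keeps the deviance finite there.
    term_y = y ** (2.0 - p) / ((1.0 - p) * (2.0 - p)) if y > 0.0 else 0.0
    term_cross = y * mu ** (1.0 - p) / (1.0 - p)
    term_mu = mu ** (2.0 - p) / (2.0 - p)
    return 2.0 * (term_y - term_cross + term_mu)


def total_deviance(ys, mu, p):
    """Sum of unit deviances for a common mean mu (intercept-only model)."""
    return sum(tweedie_unit_deviance(y, mu, p) for y in ys)


# Toy data (hypothetical): 95% zeros and one positive claim amount.
claims = [0.0] * 19 + [2000.0]
p = 1.75

# With a log link and only an intercept b0, mu = exp(b0); the deviance is
# minimised at mu = sample mean, so b0 = log(mean) despite all the zeros.
mu_hat = sum(claims) / len(claims)
b0_hat = math.log(mu_hat)

print("deviance at y = 0:", tweedie_unit_deviance(0.0, mu_hat, p))
print("fitted intercept on the log scale:", b0_hat)
print("total deviance at mu_hat:", total_deviance(claims, mu_hat, p))
```

The log link is applied to the mean, not to the individual observations, which is why zero claim amounts do not break it.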
  #687  
Old 02-27-2020, 11:06 AM
Nactuary Nactuary is offline
SOA
 
Join Date: Oct 2019
College: Middle Tennessee State University
Posts: 6
Default

Quote:
Originally Posted by KarimZ View Post
Have a general question regarding GLM.

What would be your choice of distribution/link function for a GLM if you want to predict claim amounts?

There is data of 1 million individuals, and claim amounts are recorded against them. But approx 95% of those individuals have a 0 claim amount recorded against them.

Thoughts?
I would use a binomial or Poisson for whether or not they make a claim (binomial if they can only make one, Poisson if they can make more than one claim), and then model severity using only the ~50,000 that have claims. Distribution/link would depend on the shape of the curve, but I would consider inverse Gaussian, lognormal, or a Gamma distribution, and I would consider an identity or log link.
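
A bare-bones sketch of the two-part idea above, in plain Python: estimate the claim probability on everyone, estimate the mean severity on claimants only, and multiply the two. This is the intercept-only case (no rating variables), the function name `two_part_expected_cost` is made up, and the claim amounts are toy numbers; with covariates you would replace the two averages with a binomial (or Poisson) GLM and a gamma, inverse Gaussian, or lognormal severity model.

```python
from statistics import mean


def two_part_expected_cost(claims):
    """Return (frequency, severity, expected cost) from raw claim amounts.

    frequency: share of individuals with a positive claim (binomial MLE)
    severity:  mean claim amount among those with a positive claim
    """
    if not claims:
        raise ValueError("claims must not be empty")
    positive = [c for c in claims if c > 0]
    frequency = len(positive) / len(claims)
    severity = mean(positive) if positive else 0.0
    return frequency, severity, frequency * severity


# Toy portfolio (hypothetical): 95 individuals with no claim, 5 with a claim.
claims = [0] * 95 + [1000, 2000, 3000, 500, 1500]
frequency, severity, expected_cost = two_part_expected_cost(claims)
print("frequency:", frequency)
print("severity:", severity)
print("expected cost per individual:", expected_cost)
```

The severity step only ever sees the positive claims, which is the "model severity using only the ~50,000 that have claims" part of the suggestion.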
  #688  
Old 02-27-2020, 06:05 PM
noone noone is offline
Member
SOA
 
Join Date: Feb 2017
Posts: 138
Default

I heard results are already posted in the e-learn area. Can someone confirm?
  #689  
Old 02-27-2020, 06:17 PM
BabyHorse15 BabyHorse15 is offline
Member
CAS SOA
 
Join Date: Apr 2015
Studying for life
Posts: 81
Default

Quote:
Originally Posted by noone View Post
I heard results are already posted in the e-learn area. Can someone confirm?
You said this last sitting as well......
__________________
  #690  
Old 02-27-2020, 06:44 PM
nole61 nole61 is offline
SOA
 
Join Date: Jan 2019
Studying for FSA PRF Module
College: FSU Grad
Favorite beer: The High Life
Posts: 10
Default

Quote:
Originally Posted by noone View Post
I heard results are already posted in the e-learn area. Can someone confirm?
Yea they're not up....
__________________
ASA
All times are GMT -4.

