Actuarial Outpost > Actuarial Discussion Forum > Property - Casualty / General Insurance
#21, 02-18-2019, 03:02 PM
FactuarialStatement
Member, CAS AAA
Join Date: Oct 2012
Studying for 5
Favorite beer: Beer
Posts: 2,107

Quote:
Originally Posted by JohnLocke
This is the funniest insult I've heard in some time.

[Disclaimer] I am not a P&C Actuary. [/Disclaimer]
why isn't that good enough?
Well, one reason it isn't good enough is that it would never run to begin with, and he isn't actually using a Poisson. Which means he probably doesn't even know what distribution he's really using.

Another related reason it would not be good enough is because the assumptions don't fit the data so what does it even mean to maximize the poisson likelihood?

A third reason it would not be good enough is because you chose the wrong validation metrics and loss functions to optimize. Do you really want to minimize say MSE if large outliers are expected? Think about what Poisson data looks like (0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 2). And is looking at it from the perspective of "sorting accuracy" as with the Gini index correct?
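To make the outlier point concrete, here's a toy base-R sketch (simulated data with made-up parameters, not anyone's real book) showing that on zero-heavy loss data, squared error is driven almost entirely by the handful of nonzero observations:

```r
# Toy data: ~95% zeros, occasional large losses (frequency * severity)
set.seed(1)
n  <- 10000
y  <- rpois(n, 0.05) * rgamma(n, shape = 2, scale = 5000)
mu <- rep(mean(y), n)   # a flat "model" that predicts the overall mean

sq_err <- (y - mu)^2
# Share of total squared error contributed by the few nonzero losses:
sum(sq_err[y > 0]) / sum(sq_err)
```

Under these assumed parameters the nonzero observations account for nearly all of the squared error, so an MSE comparison between two candidate models is effectively a comparison on the outliers alone.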
__________________
P | FM | 3F | 3ST | 3LC | C | 5 | 6 |
OC1 | OC2 | COP
Econ | Stats | Corp Fin
ACAS

7
8
9

Last edited by FactuarialStatement; 02-18-2019 at 03:08 PM..
#22, 02-18-2019, 03:12 PM
FactuarialStatement

Hey, I minimized MSE. That means I have the best model, right?
#23, 02-18-2019, 03:17 PM
Actuarially Me
Member, CAS
Join Date: Jun 2013
Posts: 191

Quote:
Originally Posted by JohnLocke

if the Poisson GLM is very uncommon for this LOB and I was new to modeling this data (as you stated), then am I actually evaluating the quality of the model correctly across all dimensions?
That's kind of what I'm worried about. I have nothing to compare or benchmark it against other than the rating software tool we have, though my model does perform better than that. Most of the material I've read states that Tweedie is the best single model, so I wasn't sure whether I was overlooking something.

I definitely have a long way to go in terms of domain knowledge.
#24, 02-18-2019, 03:19 PM
FactuarialStatement

I guess if you round your losses to the nearest INT then you might get the model estimation to begin to run - I doubt it would ever converge. I actually caught a guy at work once implementing the coefficients from an H2O model that hadn't converged. So I can't say I'm surprised by this.

Suppose it converged. Most of the observations are 0, but then several are in the thousands+, and we are minimizing poisson deviance? This model sounds so bad
#25, 02-18-2019, 03:30 PM
Actuarially Me

Quote:
Originally Posted by FactuarialStatement
Well, one reason it isn't good enough is that it would never run to begin with, and he isn't actually using a Poisson. Which means he probably doesn't even know what distribution he's really using.

Another related reason it would not be good enough is because the assumptions don't fit the data so what does it even mean to maximize the poisson likelihood?

A third reason it would not be good enough is because you chose the wrong validation metrics and loss functions to optimize. Do you really want to minimize say MSE if large outliers are expected? Think about what Poisson data looks like (0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 2). And is looking at it from the perspective of "sorting accuracy" as with the Gini index correct?
Poisson can be a rate based on space, time, or any other metric. In this case, it's Loss Dollars divided by Exposure.

Here's the syntax I'm using for Poisson:

set.seed(1111)
model.poisson <- glm(formula = model.formula,
                     family  = poisson(link = "log"),
                     data    = data.train,
                     weights = log(Exposure))

Here's the syntax I'm using for Tweedie:

set.seed(1111)
model.tweedie <- glm(formula = model.formula,
                     family  = tweedie(var.power  = p,
                                       link.power = 0),
                     data    = data.train,
                     weights = log(Exposure))

Found a relevant SX: https://stats.stackexchange.com/ques...nteger-numbers
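For what it's worth, exposure usually enters a log-link Poisson GLM as an offset rather than through weights = log(Exposure). A self-contained toy sketch of the offset form (simulated data; the names d, x, and claims are made up for illustration, not the OP's):

```r
# Exposure as an offset: log(E[claims]) = log(Exposure) + X*b,
# so the coefficients describe a claim rate per unit of exposure.
set.seed(1)
d <- data.frame(Exposure = runif(2000, 0.1, 1),
                x        = rnorm(2000))
d$claims <- rpois(2000, d$Exposure * exp(0.2 + 0.5 * d$x))

m <- glm(claims ~ x,
         family = poisson(link = "log"),
         data   = d,
         offset = log(Exposure))
coef(m)  # should recover roughly (0.2, 0.5)
```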

Last edited by Actuarially Me; 02-18-2019 at 03:40 PM..
#26, 02-18-2019, 03:36 PM
Actuarially Me

Quote:
Originally Posted by FactuarialStatement
I guess if you round your losses to the nearest INT then you might get the model estimation to begin to run - I doubt it would ever converge.

Suppose it converged. Most of the observations are 0, but then several are in the thousands+, and we are minimizing poisson deviance? This model sounds so bad
I guess I could have gotten lucky with it converging, but the predictions aren't out of the ordinary. Pretty close to the Tweedie model.

But as someone else already mentioned, it could be due to my data being heavily frequency based.
#27, 02-18-2019, 04:03 PM
JohnLocke
Member, SOA
Join Date: Mar 2007
Posts: 16,597

Quote:
Originally Posted by FactuarialStatement
Well, one reason it isn't good enough is that it would never run to begin with, and he isn't actually using a Poisson. Which means he probably doesn't even know what distribution he's really using.
I don't think anything I said was contingent on the data belonging to any particular distribution.

Quote:
Originally Posted by FactuarialStatement
Another related reason it would not be good enough is because the assumptions don't fit the data so what does it even mean to maximize the poisson likelihood?
I don't see how this relates to my post either.

Quote:
Originally Posted by FactuarialStatement
A third reason it would not be good enough is because you chose the wrong validation metrics and loss functions to optimize. Do you really want to minimize say MSE if large outliers are expected? Think about what Poisson data looks like (0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 2). And is looking at it from the perspective of "sorting accuracy" as with the Gini index correct?
Those were just examples, not specific suggestions. I read my post again and I think that is clear.

When I'm working on a pricing model I am ultimately trying to do two things: segment the risk as much as possible in terms of a rank-ordering (the best risks get the best prices and vice-versa) and price every segment at the expected loss. Pretty much every other modeling choice, such as maximizing a likelihood, is intermediate to those tasks or a tool to get at them. In other words, if you are confident in how you are evaluating the model, it shouldn't matter that the underlying data can't actually be Poisson while your model is.
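Both of those goals can be checked directly. A toy base-R sketch (all data simulated, all parameters made up): a simple Gini-style statistic for the rank-ordering, and an actual-vs-predicted comparison by prediction decile for the calibration:

```r
set.seed(1)
n    <- 20000
mu   <- exp(rnorm(n, 0, 0.5))        # "true" expected loss per risk
pred <- mu * exp(rnorm(n, 0, 0.3))   # a model's noisy estimate of mu
loss <- rpois(n, 0.1) * rgamma(n, shape = 2, scale = 5 * mu)

# Rank-ordering: Lorenz curve of losses sorted by prediction; Gini > 0
# means higher-predicted risks really do carry more of the loss.
o      <- order(pred)
lorenz <- cumsum(loss[o]) / sum(loss)
gini   <- 1 - 2 * mean(lorenz)

# Calibration: within each prediction decile, does average actual loss
# line up with average predicted loss?
dec <- cut(pred, quantile(pred, 0:10 / 10), include.lowest = TRUE, labels = FALSE)
cbind(avg_pred = tapply(pred, dec, mean),
      avg_loss = tapply(loss, dec, mean))
```

A rank-ordering metric alone can look great while every segment is mispriced by a constant factor, which is why the decile comparison matters too.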
__________________
i always post when i'm in a shitty mood. if i didn't do that, i'd so rarely post. --AO Fan

Lucky for you I was raised by people with a good moral center because if that were not the case, you guys would be in a lot of trouble.
So be very, very glad people like me exist. Your future basically depends on it. --jas66kent

The stock market is going to go up significantly due to Trump Economics --jas66kent
#28, 02-18-2019, 06:36 PM
FactuarialStatement

Quote:
Originally Posted by JohnLocke
I don't think anything I said was contingent on the data belonging to any particular distribution.
And I didn't say that you said anything contingent on the data belonging to a particular distribution.

Quote:
I don't see how this relates to my post either.
It relates because you asked, "If you have a set of validation metrics that you use to determine what a "better" model is...and the best model turns out to be a Poisson GLM, why isn't that good enough?" and I answered why it is not good enough.

Quote:
Those were just examples, not specific suggestions. I read my post again and I think that is clear.
And my response was just an example response. I should rephrase to make it clearer: a third reason it may not be good enough is that *whatever validation metric/objective function you have chosen to optimize* may be an inappropriate choice, in which case you chose the best model against a terrible objective, not the best model for the real objective.
#29, 02-18-2019, 06:47 PM
FactuarialStatement

Quote:
Originally Posted by JohnLocke
When I'm working on a pricing model I am ultimately really trying to do two things: segment the risk as much as possible in terms of a rank-ordering (the best risks get the best prices and vice-versa) and price every segment at the expected loss. Pretty much every other modeling choice, such as maximizing a likelihood, is intermediate to those tasks or a tool to get at them. In other words, if you are confident in how you are evaluating the model, it shouldn't matter whether or not the underlying data can't actually be Poisson and your model is.
And how do you know when you've done either of those when your inference is all based on bigly wrong assumptions?

Here's a tip for the OP: take your Poisson model and look up randomized quantile residuals - they're implemented in the DHARMa package in R and easy to code yourself with the simulate() function. Google Gelman's comments on simulation tests of model assumptions to get familiar with why you need to do this. Test some of your model assumptions to see how appropriate they are for the data - test the zero inflation and the dispersion. Report back here with how BADLY you fail those tests.
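A minimal sketch of those checks, assuming the DHARMa package is installed from CRAN (the data here is a toy, deliberately overdispersed so a Poisson fit should fail the tests; it is not the OP's model):

```r
library(DHARMa)  # assumed installed from CRAN

set.seed(1)
# Deliberately misspecified: Poisson fit to negative-binomial counts
y   <- rnbinom(1000, mu = 2, size = 0.5)
fit <- glm(y ~ 1, family = poisson)

sim <- simulateResiduals(fittedModel = fit, n = 250)  # randomized quantile residuals
plot(sim)               # QQ plot of residuals against uniform, plus residual checks
testDispersion(sim)     # simulation-based over/underdispersion test
testZeroInflation(sim)  # observed zeros vs zeros in simulated replicates
```

If the model were adequate, the quantile residuals would be approximately uniform and both tests would come back unremarkable; here they should flag the misspecification loudly.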
#30, 02-18-2019, 07:09 PM
FactuarialStatement

I'm just gonna leave this code here. Report back if you figure it out but I can't [try to] help you anymore.

library(statmod)   # for the tweedie() family
library(dplyr)     # for mutate() (missing from the original snippet)
library(magrittr)
library(tibble)

set.seed(42)

tbl <- tibble(x1 = rnorm(1000, 100, 10),
              x2 = runif(1000),
              x3 = rgamma(1000, 50),
              y  = rgamma(1000, 10 * x1, .05) * rpois(1000, .05 * x1)) %>%
  mutate(x1 = x1 %>% scale,
         x2 = x2 %>% scale,
         x3 = x3 %>% scale)

# Poisson fit to continuous (non-integer) losses: R warns on every non-integer y
cool_model_bro <- glm(y ~ ., family = "poisson", data = tbl)
summary(cool_model_bro)

# Tweedie with 1 < p < 2 handles the point mass at zero plus continuous losses
lolwut <- glm(y ~ ., family = tweedie(var.power = 1.5, link.power = 0), data = tbl)
summary(lolwut)
Last edited by FactuarialStatement; 02-18-2019 at 07:24 PM..
Tags: glm, poisson, tweedie