Actuarial Outpost Target variable = counts. What log link?

#1
12-09-2019, 05:09 PM
 samdman82 Member Join Date: Jul 2007 Posts: 240
Target variable = counts. What log link?

If the target variable is integer-valued, such as claim counts, I know I should use either a Poisson or a negative binomial distribution. What would be an argument for choosing one over the other?

Also, what log link functions should I use and why?

Thanks for any help
#2
12-09-2019, 05:27 PM
 DrWillKirby Member SOA Join Date: Oct 2011 Studying for PA Posts: 1,065

Log is one of the link functions used for continuous right-skewed targets and for Poisson data. Not every link function is a log link, though. Logit is the most commonly used link for a 0-to-1 "probability of an event" type of target variable (in the material we've been presented, anyway).
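To make the distinction concrete, here is a minimal Python sketch (not from the exam material) of the two inverse links mentioned above. The coefficients `b0` and `b1` are made-up values used only for illustration.

```python
import math

def inverse_log_link(eta):
    """Map a linear predictor to a mean in (0, inf): mu = exp(eta)."""
    return math.exp(eta)

def inverse_logit_link(eta):
    """Map a linear predictor to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical coefficients, chosen only for illustration.
b0, b1 = -1.0, 0.4

linear_predictors = [b0 + b1 * x for x in (-5, 0, 5)]
means = [inverse_log_link(eta) for eta in linear_predictors]
probs = [inverse_logit_link(eta) for eta in linear_predictors]

# Under a log link, a one-unit increase in x multiplies the mean by exp(b1).
rate_ratio = inverse_log_link(b0 + b1 * 3) / inverse_log_link(b0 + b1 * 2)

print("log-link means:", means)
print("logit-link probabilities:", probs)
print("rate ratio:", rate_ratio, "exp(b1):", math.exp(b1))
```

The log link keeps the predicted count positive for any value of the linear predictor and gives the coefficients a multiplicative reading, which is the usual argument for pairing it with count data.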

I'm not sure when to use negative binomial or a non-log link on a Poisson. This is just my first attempt and I'm unsure whether I'll pass. If it's something that complicated and it's not spelled out, I'll probably lose a lot of points. (Hopefully that would be most of us.)
#3
12-09-2019, 07:57 PM
 actuary121110 SOA Join Date: Jun 2017 Location: Kuala Lumpur, Malaysia Studying for PA, QFIQF, QFIPM College: University of Michigan, Ann Arbor (Alumni) Posts: 18

Quote:
 Originally Posted by samdman82: If the target variable is integer-valued, such as claim counts, I know I should use either a Poisson or a negative binomial distribution. What would be an argument for choosing one over the other? Also, what log link functions should I use and why? Thanks for any help
If the mean approximately equals the variance, use Poisson.
If the mean is less than the variance (overdispersion), use quasi-Poisson or negative binomial.
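A quick way to eyeball this rule is the ratio of sample variance to sample mean. The sketch below uses two made-up lists of claim counts (both hypothetical, not real data); a ratio near 1 is consistent with Poisson, while a ratio well above 1 points to overdispersion.

```python
import statistics

def dispersion_ratio(counts):
    """Return the sample variance divided by the sample mean."""
    return statistics.variance(counts) / statistics.fmean(counts)

# Made-up claim counts for illustration only.
roughly_poisson = [2, 3, 1, 4, 2, 0, 2, 1, 3, 5, 2, 0, 3, 2, 6, 1, 2, 4, 1, 4]
overdispersed = [0, 0, 0, 1, 0, 2, 0, 0, 5, 0, 1, 0, 0, 9, 0, 3, 0, 0, 1, 0]

ratio_a = dispersion_ratio(roughly_poisson)
ratio_b = dispersion_ratio(overdispersed)

print("roughly Poisson sample, variance/mean:", round(ratio_a, 2))
print("overdispersed sample, variance/mean:", round(ratio_b, 2))
```

This is only a descriptive check on the raw target. In a GLM the relevant comparison is conditional on the predictors, so the ratio is a starting point rather than a decision rule.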
#4
12-10-2019, 04:00 AM
 samdman82 Member Join Date: Jul 2007 Posts: 240

Quote:
 Originally Posted by actuary121110: If the mean approximately equals the variance, use Poisson. If the mean is less than the variance (overdispersion), use quasi-Poisson or negative binomial.
Thanks!
#5
12-10-2019, 10:11 PM
 Relmiw Member CAS Join Date: Apr 2013 Posts: 201

Quote:
 Originally Posted by actuary121110: If the mean approximately equals the variance, use Poisson. If the mean is less than the variance (overdispersion), use quasi-Poisson or negative binomial.
How much less than variance? What if the mean is 1000 and the variance is 300? 900? 990? Are you looking at a sample mean confidence interval?
#6
12-10-2019, 10:40 PM
 Colymbosathon ecplecticos Member Join Date: Dec 2003 Posts: 6,167

Quote:
 Originally Posted by Relmiw How much less than variance? What if the mean is 1000 and the variance is 300? 900? 990? Are you looking at a sample mean confidence interval?
Depends upon sample size. Suppose that there were 10^100 observations. The variance would be almost 0 regardless of the true model (assuming mean = 1000). This is the Law of Large Numbers.

There are two ways to answer your problem. One is practical: fit both models and then test which fits better.
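As a rough sketch of the practical route, the Python below fits an intercept-only Poisson and an intercept-only negative binomial to a made-up list of counts and compares them by AIC. Several things here are assumptions for illustration: the data are hypothetical, AIC is just one common comparison criterion, and the negative binomial size parameter `r_hat` is a method-of-moments estimate rather than a full maximum-likelihood fit (so the negative binomial log-likelihood shown is, if anything, slightly understated).

```python
import math
import statistics

def poisson_loglik(counts, lam):
    """Log-likelihood of i.i.d. Poisson(lam) counts."""
    return sum(y * math.log(lam) - lam - math.lgamma(y + 1) for y in counts)

def negbin_loglik(counts, mu, r):
    """Log-likelihood of i.i.d. negative binomial counts with mean mu
    and size r, so that the variance is mu + mu**2 / r."""
    p = r / (r + mu)
    return sum(
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(p) + y * math.log(1.0 - p)
        for y in counts
    )

# Made-up claim counts for illustration only.
counts = [0, 0, 0, 1, 0, 2, 0, 0, 5, 0, 1, 0, 0, 9, 0, 3, 0, 0, 1, 0]

mean = statistics.fmean(counts)
var = statistics.variance(counts)

# Method-of-moments size parameter; only defined when var > mean.
r_hat = mean ** 2 / (var - mean)

# AIC = 2 * (number of parameters) - 2 * log-likelihood; lower is better.
aic_poisson = 2 * 1 - 2 * poisson_loglik(counts, mean)
aic_negbin = 2 * 2 - 2 * negbin_loglik(counts, mean, r_hat)

print("Poisson AIC:", round(aic_poisson, 2))
print("Negative binomial AIC:", round(aic_negbin, 2))
```

With predictors in the model, the same comparison would be made on the fitted GLMs in whatever software you are using, rather than on hand-computed likelihoods.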

The second is statistical: test whether you can reject the null hypothesis that the mean equals the variance, mu = sigma^2.
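One way such a test could look is the classical index-of-dispersion test, sketched below in Python. Under the Poisson null, D = (n - 1) * s^2 / mean is approximately chi-square with n - 1 degrees of freedom. To stay within the standard library, the p-value here uses a normal approximation to that chi-square, which is an assumption that gets rough for small n. The data are made up.

```python
import statistics
from statistics import NormalDist

def dispersion_test(counts):
    """Index-of-dispersion test of H0: mean == variance (Poisson).

    Returns (D, z, p_over), where p_over is an approximate one-sided
    p-value against overdispersion, based on a normal approximation
    to the chi-square distribution with n - 1 degrees of freedom.
    """
    n = len(counts)
    mean = statistics.fmean(counts)
    var = statistics.variance(counts)
    d = (n - 1) * var / mean
    # A chi-square(n - 1) variable has mean n - 1 and variance 2 * (n - 1).
    z = (d - (n - 1)) / (2 * (n - 1)) ** 0.5
    p_over = 1.0 - NormalDist().cdf(z)
    return d, z, p_over

# Made-up claim counts for illustration only.
counts = [0, 0, 0, 1, 0, 2, 0, 0, 5, 0, 1, 0, 0, 9, 0, 3, 0, 0, 1, 0]

d_stat, z_stat, p_value = dispersion_test(counts)
print("D:", round(d_stat, 2), "z:", round(z_stat, 2), "approx p:", p_value)
```

A small p-value says the variance exceeds the mean by more than sampling noise would explain, which addresses the "how much less?" question by taking the sample size into account.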
__________________
"What do you mean I don't have the prerequisites for this class? I've failed it twice before!"

"I think that probably clarifies things pretty good by itself."

"I understand health care now especially very well."
#7
12-11-2019, 12:37 AM
 Relmiw Member CAS Join Date: Apr 2013 Posts: 201

Quote:
 Originally Posted by Colymbosathon ecplecticos: Depends upon sample size. Suppose that there were 10^100 observations. The variance would be almost 0 regardless of the true model (assuming mean = 1000). This is the Law of Large Numbers.
Variance of zero? Aren't we talking about the mean and variance of the target variable, which is a list of numbers? And we're considering the possibility where it is Poisson distributed, so that the mean is a number (say, 40), and the variance is another number (say 36)?

I just ran some test simulations in R that bore that out. Seems like we are talking about different things.
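The simulations were run in R and are not reproduced in this thread. A comparable check in Python might look like the sketch below, which is an assumption about what such a test could be, not the original code. It draws Poisson samples with mean 40 using Knuth's multiplication method and reports, for growing sample sizes, the sample variance of the data next to the estimated variance of the sample mean.

```python
import math
import random
import statistics

def poisson_draw(lam, rng):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    limit = math.exp(-lam)
    k = 0
    prod = rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

rng = random.Random(2019)  # fixed seed so the run is repeatable
lam = 40.0                 # illustrative Poisson mean

results = []
for n in (100, 2000, 20000):
    sample = [poisson_draw(lam, rng) for _ in range(n)]
    mean = statistics.fmean(sample)
    var = statistics.variance(sample)
    # The variance of the sample mean is estimated by var / n.
    results.append((n, mean, var, var / n))

for n, mean, var, var_of_mean in results:
    print(n, round(mean, 2), round(var, 2), round(var_of_mean, 5))
```

The sample variance of the counts stays near the Poisson mean however large n gets. What shrinks toward 0 is the variance of the sample mean, which is the quantity the Law of Large Numbers is about, so the two posts may simply be describing different variances.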

*Update: Hmm, maybe when you say "the variance would be almost 0 regardless of the true model", you mean variance in the accounting sense: the difference between what we see for the mean and what we see for the variance, not the variance of the sample data. If that's the case, then I agree with your paragraph #1, but it didn't address my concern. Your second and third paragraphs do, though. I can picture how I'd pull #2 off, especially in the context of this exam. #3 I'm having a hard time picturing. Maybe I need sleep. Thanks for the insight.

Last edited by Relmiw; 12-11-2019 at 12:40 AM..