Actuarial Outpost
 
  #1  
Old 01-10-2020, 07:48 PM
we7dude we7dude is offline
Member
 
Join Date: Jan 2009
College: Temple Alumni
Posts: 327
Default log likelihood

Hi all,

SAS provides both the RSquare and adjusted RSquare when you build a linear regression model with proc glm or proc reg/hpreg. The regular RSquare is non-decreasing, which means that it either stays the same or increases as you throw more variables into the model, regardless of the variables' significance.

I am trying to assess whether the log-likelihood has the same non-decreasing property as the regular RSquare. SAS Genmod provides the log-likelihood value, which is always negative. I want to know whether this value either stays the same or increases as we throw more variables into the model, regardless of the newly added variables' significance.

Thanks in advance.
  #2  
Old 01-10-2020, 10:51 PM
Colonel Smoothie Colonel Smoothie is offline
Member
CAS
 
Join Date: Sep 2010
College: Jamba Juice University
Favorite beer: AO Amber Ale
Posts: 50,291
Default

According to CAS Monograph #5, page 61, adding more variables "always reduces deviance, whether the predictor has any relation to the target variable or not." Deviance is a function of the maximum theoretical log-likelihood (that of the saturated model) and the log-likelihood of the fitted model. Since the saturated model represents a perfect fit, I take this to mean that the log-likelihood of the model increases.
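
A minimal sketch of that claim, not taken from the monograph: for a Poisson model where the maximum likelihood fits have a closed form, adding an unrelated indicator cannot lower the maximized log-likelihood, because the larger model can always reproduce the smaller one by giving both levels the same mean. The data and the names `counts` and `noise_flag` are made up for illustration.

```python
import math

def poisson_loglik(counts, means):
    """Poisson log-likelihood: sum over records of y*log(mu) - mu - log(y!)."""
    return sum(y * math.log(mu) - mu - math.lgamma(y + 1)
               for y, mu in zip(counts, means))

# Made-up data: claim counts and a flag that has nothing to do with them.
counts = [2, 0, 3, 1, 4, 2, 1, 0, 3, 2]
noise_flag = [1, 0, 0, 1, 1, 0, 1, 0, 0, 1]

# Smaller model (intercept only): the MLE of the mean is the grand mean.
grand_mean = sum(counts) / len(counts)
ll_small = poisson_loglik(counts, [grand_mean] * len(counts))

# Larger model (intercept + flag): the MLE is the mean within each flag level.
def level_mean(level):
    group = [y for y, f in zip(counts, noise_flag) if f == level]
    return sum(group) / len(group)

fitted = {level: level_mean(level) for level in (0, 1)}
ll_large = poisson_loglik(counts, [fitted[f] for f in noise_flag])

print(f"intercept only:   {ll_small:.4f}")
print(f"intercept + flag: {ll_large:.4f}")
```

This only holds when both models are fit to the same records by exact maximum likelihood and one is nested in the other.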

Therefore you'd want to use penalized measures like AIC or BIC when comparing models, or an F-test if one model is a nested version of another.
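
As a rough sketch of how the penalized measures work, here are the standard textbook definitions of AIC and BIC next to deviance (taken with dispersion 1). The function names and the log-likelihood values are hypothetical, chosen only to show a case where the larger model has the higher log-likelihood but loses on both penalized measures.

```python
import math

def deviance(loglik_saturated, loglik_model):
    """Twice the gap between the saturated and fitted log-likelihoods."""
    return 2.0 * (loglik_saturated - loglik_model)

def aic(loglik, n_params):
    """Akaike information criterion; smaller is better."""
    return -2.0 * loglik + 2.0 * n_params

def bic(loglik, n_params, n_obs):
    """Bayesian information criterion; smaller is better."""
    return -2.0 * loglik + n_params * math.log(n_obs)

# Hypothetical fits on the same 1,000 records.
n_obs = 1000
ll_small, p_small = -512.4, 3   # smaller model
ll_large, p_large = -512.1, 4   # same model plus one extra variable

print("AIC:", aic(ll_small, p_small), "vs", aic(ll_large, p_large))
print("BIC:", bic(ll_small, p_small, n_obs), "vs", bic(ll_large, p_large, n_obs))
```

With these made-up numbers the extra variable raises the log-likelihood by 0.3, which is not enough to pay for the added parameter under either criterion.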
__________________
Recommended Readings for the EL Actuary || Recommended Readings for the EB Actuary

Quote:
Originally Posted by Wigmeister General View Post
Don't you even think about sending me your resume. I'll turn it into an origami boulder and return it to you.
  #3  
Old 01-12-2020, 01:22 PM
Colymbosathon ecplecticos Colymbosathon ecplecticos is offline
Member
 
Join Date: Dec 2003
Posts: 6,187
Default

Quote:
Originally Posted by Colonel Smoothie View Post
Since the saturated model represents a perfect fit, ...
No. The saturated model is the best fit possible, not a perfect fit. These two ideas can be different when there are replicants.
__________________
"What do you mean I don't have the prerequisites for this class? I've failed it twice before!"


"I think that probably clarifies things pretty good by itself."

"I understand health care now especially very well."
  #4  
Old 01-12-2020, 02:22 PM
Colonel Smoothie Colonel Smoothie is offline
Member
CAS
 
Join Date: Sep 2010
College: Jamba Juice University
Favorite beer: AO Amber Ale
Posts: 50,291
Default

Quote:
Originally Posted by Colymbosathon ecplecticos View Post
No. The saturated model is the best fit possible, not a perfect fit. These two ideas can be different when there are replicants.
Okay, but just a few pages back there's a paragraph that says:

Quote:
At the other extreme lies the saturated model, or a hypothetical model with an equal number of predictors as there are records in the dataset. For such a model, Equation 2 becomes a system of equations with n equations and n unknowns, and therefore a perfect solution is possible. This model would therefore perfectly “predict” every historical outcome.
What are replicants?
  #5  
Old 01-12-2020, 02:56 PM
Colymbosathon ecplecticos Colymbosathon ecplecticos is offline
Member
 
Join Date: Dec 2003
Posts: 6,187
Default

Quote:
Originally Posted by Colonel Smoothie View Post
What are replicants?
Observations that have identical predictors. For example, if you have two drivers who are both MALE-16-DRIVERSED-SPORTSCAR, one with 3 accidents and one with only 2, you'll never get a perfect fit if you are trying to predict the number of accidents.
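
A small sketch of the two-driver example, assuming a Poisson model for the accident counts (the distribution is an assumption; the post does not name one). Any model that is a function of the predictors alone has to give both drivers the same fitted mean, and the best such mean is 2.5. A fit that hits 3 and 2 exactly needs a separate parameter for each record.

```python
import math

def poisson_loglik(counts, means):
    """Poisson log-likelihood: sum over records of y*log(mu) - mu - log(y!)."""
    return sum(y * math.log(mu) - mu - math.lgamma(y + 1)
               for y, mu in zip(counts, means))

accidents = [3, 2]  # two drivers with identical predictors

# Best fit available when the mean depends only on the predictors:
# both drivers share one mean, and the MLE of that mean is the average.
shared_mean = sum(accidents) / len(accidents)
ll_shared = poisson_loglik(accidents, [shared_mean, shared_mean])

# "Perfect" fit: one free parameter per record, so each mean equals its count.
ll_per_record = poisson_loglik(accidents, accidents)

print(f"shared mean {shared_mean}: {ll_shared:.4f}")
print(f"one mean per record: {ll_per_record:.4f}")
```

The per-record log-likelihood is the higher of the two. Which of these should be called the saturated model is exactly what the posts in this thread disagree about.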
  #6  
Old 01-12-2020, 03:00 PM
Colymbosathon ecplecticos Colymbosathon ecplecticos is offline
Member
 
Join Date: Dec 2003
Posts: 6,187
Default

Quote:
Originally Posted by Colonel Smoothie View Post
Okay, but just a few pages back there's a paragraph that says:

Quote:
At the other extreme lies the saturated model, or a hypothetical model with an equal number of predictors as there are records in the dataset. For such a model, Equation 2 becomes a system of equations with n equations and n unknowns, and therefore a perfect solution is possible. This model would therefore perfectly “predict” every historical outcome.
What are replicants?
In context, replicants can potentially create an inconsistent system of equations. So, instead of a true solution, we seek an approximate solution, typically using least squares.
  #7  
Old 01-14-2020, 02:39 PM
Heywood J Heywood J is offline
Member
CAS
 
Join Date: Jun 2006
Posts: 4,105
Default

In theory, it is impossible for the log-likelihood to decrease when more variables are added to the model, but in practice it can and often does happen. There are at least two situations where this happens, most commonly with the Tweedie distribution in SAS.

The first situation where it can happen is when the log-likelihood is estimated from scaled deviance. Technically it's not a real log-likelihood, but your software can present it as such. When you add more variables, both your scale estimate and your deviance can change, and the resulting scaled deviance is not guaranteed to decrease.
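
To make the arithmetic concrete, here is a sketch with hypothetical numbers. It assumes the dispersion is estimated as the Pearson chi-square divided by residual degrees of freedom, which is one common choice and not necessarily what any particular SAS procedure does by default. The deviance and Pearson values below are invented for illustration.

```python
def scaled_deviance(deviance, pearson_chi_square, n_obs, n_params):
    """Deviance divided by a Pearson-based dispersion estimate."""
    dispersion = pearson_chi_square / (n_obs - n_params)
    return deviance / dispersion

# Hypothetical fits on the same 100 records.
small = scaled_deviance(deviance=120.0, pearson_chi_square=150.0,
                        n_obs=100, n_params=3)
large = scaled_deviance(deviance=118.0, pearson_chi_square=140.0,
                        n_obs=100, n_params=4)

print(f"smaller model: {small:.2f}")
print(f"larger model:  {large:.2f}")
```

Here the unscaled deviance falls from 120 to 118 when the variable is added, but the dispersion estimate falls by more, so the scaled deviance goes up.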

The second situation occurs when your model fails to completely converge in a subtle way, without a warning. It can happen when the variable you add is strongly correlated with existing predictors, and the algorithm terminates prematurely because of overly lax default tolerances. It can also happen when there are numerical stability issues with the algorithm.
  #8  
Old 01-30-2020, 07:40 PM
we7dude we7dude is offline
Member
 
Join Date: Jan 2009
College: Temple Alumni
Posts: 327
Default

thanks all.
  #9  
Old 01-31-2020, 08:02 AM
Marcie Marcie is offline
Member
CAS
 
Join Date: Feb 2015
Posts: 10,027
Default

Quote:
Originally Posted by Colymbosathon ecplecticos View Post
Observations that have identical predictors. For example, if you have two drivers who are both MALE-16-DRIVERSED-SPORTSCAR, one with 3 accidents and one with only 2, you'll never get a perfect fit if you are trying to predict the number of accidents.
That seems like you don't have a truly saturated model. What am I missing?
All times are GMT -4. The time now is 08:34 AM.

