Actuarial Outpost
 
#1
04-26-2020, 03:02 PM
SweepingRocks
Member
SOA
Join Date: Jun 2017
College: Bentley University (Class of 2019ish)
Posts: 293
Dumb Question on Binary/Linear Predictor/Target Mean/I don’t even know

https://imgur.com/U4GbuWQ



So I’m reading through this page of the ACTEX manual and I don’t understand the majority of what this page is saying. Is there anyone who can explain this to me like I’m stupid?

My big hang up is the “target mean” formula. What is this? What do you mean “target mean”? I know that Bernoulli is used when the value is binary and the mean is between 0 and 1.

I’m shaky on link functions. So they are a way to link the target variable to the linear model. I kind of get that. We use the linear model (coefficients and variables) to get to a value such as ln(mu). We can then make something like a Poisson using mu, right? And then we can use that as a model? I guess I’m shaky on the purpose and implementation, and reading through the section again isn’t helping.

So back to the page in question. We have the logit formula. That makes sense. But why does pi equal that formula? That’s not the formula for the mean of a Bernoulli. I have no idea how we’re getting to that formula and I don’t know the steps we’re taking to get there. Any help is appreciated!
__________________
FM P MFE STAM LTAM FAP PA
Former Disney World Cast Member, currently no idea what I'm doing

"I think you should refrain from quoting yourself. It sounds pompous." - SweepingRocks
#2
04-27-2020, 08:35 PM
SamCastillo
SOA
Join Date: Nov 2019
Location: Chicago
College: University of Massachusetts Amherst
Posts: 28

This isn't a dumb question at all! In fact, asking it shows that you understand the material well enough to put it into words. Past experience with our online forum has shown that candidates who practice explaining concepts to others do better on Exam PA than those who don't.

These are very common questions on Exam PA and GLMs in general. I will shed some light on a few points below:

Quote:
Originally Posted by SweepingRocks View Post
https://imgur.com/U4GbuWQ

My big hang up is the “target mean” formula. What is this? What do you mean “target mean”? I know that Bernoulli is used when the value is binary and the mean is between 0 and 1.

This is the mean of the target distribution. It's easier to think of in the continuous case than in the binary case. Imagine that the target is continuous and you fit a model using a Gaussian response. This will have parameters mu and sigma. The mean of the target is then mu. The binary case works the same way: a Bernoulli target has a single parameter pi, and its mean is pi, the probability that the target equals 1. Two resources that provide an alternate explanation are the ExamPA.net study guide and this lecture from MIT OpenCourseWare.
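To make the binary case concrete, here is a minimal sketch in Python (standard library only). The value pi = 0.3, the sample size, and the seed are arbitrary assumptions for illustration. The point is only that the average of many 0/1 outcomes settles near pi, which is what "target mean" refers to for a Bernoulli target.

```python
import random

def bernoulli_sample_mean(pi: float, n: int, seed: int = 0) -> float:
    """Average of n simulated 0/1 outcomes, each equal to 1 with probability pi."""
    rng = random.Random(seed)
    draws = [1 if rng.random() < pi else 0 for _ in range(n)]
    return sum(draws) / n

if __name__ == "__main__":
    pi = 0.3  # assumed probability that the target equals 1
    # The sample mean should land close to pi when n is large.
    print(bernoulli_sample_mean(pi, n=100_000))
```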


Quote:
Originally Posted by SweepingRocks View Post
https://imgur.com/U4GbuWQ

I’m shaky on link functions. So they are a way to link the target variable to the linear model. I kind of get that. We use the linear model (coefficients and variables) to get to a value such as ln(mu).
This is another confusing topic. Your question does a good job of explaining this reasoning, which will be helpful for others who may be reading this.

Link functions "link" the random component and the covariates. They do not link to the target variable directly, as is often misunderstood. The classic example is the log link: using a log link is not the same as applying a log transform to the response variable, which is the approach most people learned in a regression course. The log link models ln(E[Y]), whereas a log transform models E[ln(Y)].

See the two resources above for details.
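The difference between a log link and a log transform can be seen numerically. The sketch below uses four made-up positive values (an assumption purely for illustration, not data from the manual) and compares the log of the mean with the mean of the logs. The two numbers differ, which is why the two modeling approaches are not interchangeable.

```python
import math

# Hypothetical positive target values, chosen only for illustration.
y = [1.0, 2.0, 4.0, 8.0]

mean_y = sum(y) / len(y)

# A log link works with the log of the mean: ln(E[Y]).
log_of_mean = math.log(mean_y)

# A log transform works with the mean of the logged values: E[ln(Y)].
mean_of_log = sum(math.log(v) for v in y) / len(y)

print("ln(mean of y):", log_of_mean)
print("mean of ln(y):", mean_of_log)
```

By Jensen's inequality, the first quantity is at least as large as the second for any positive data.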
#3
04-27-2020, 11:02 PM
ambroselo
Member
SOA
Join Date: Sep 2018
Location: Iowa City
College: University of Iowa
Posts: 308

Quote:
Originally Posted by SweepingRocks View Post
My big hang up is the “target mean” formula. What is this? What do you mean “target mean”?
This is simply the mean of the target variable.

Quote:
Originally Posted by SweepingRocks View Post
We use the linear model (coefficients and variables) to get to a value such as ln(mu). We can then make something like a Poisson using mu, right?
Can you phrase your question in more concrete terms? In fact, the ability to write clearly is central to much of the success in the PA exam. As you know, it is an exam on written communication.

Quote:
Originally Posted by SweepingRocks View Post
So back to the page in question. We have the logit formula. That makes sense. But why does pi equal that formula? That’s not the formula for the mean of a Bernoulli. I have no idea how we’re getting to that formula and I don’t know the steps we’re taking to get there. Any help is appreciated!
This follows from a rearrangement of terms. Since ln[pi/(1-pi)] = linear predictor = eta (see the definition at the top of page 221), making pi the subject readily leads to pi = exp(eta)/[1+exp(eta)].
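For readers who want the rearrangement spelled out, the steps can be written as follows, using the same symbols as above (pi for the target mean, eta for the linear predictor):

```latex
\begin{align*}
\ln\frac{\pi}{1-\pi} &= \eta \\
\frac{\pi}{1-\pi} &= e^{\eta} \\
\pi &= e^{\eta}(1-\pi) = e^{\eta} - \pi e^{\eta} \\
\pi\left(1 + e^{\eta}\right) &= e^{\eta} \\
\pi &= \frac{e^{\eta}}{1 + e^{\eta}}
\end{align*}
```

Because e^eta is positive for any real eta, the final expression always lies strictly between 0 and 1, as a Bernoulli mean must.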
#4
05-04-2020, 12:41 AM
SweepingRocks
Member
SOA
Join Date: Jun 2017
College: Bentley University (Class of 2019ish)
Posts: 293

Thank you both for replying! I took a bit to focus on the other sections in the manual before returning here and getting a better grasp on GLMs.

I definitely need to work on being clearer in my questions! Unfortunately, I feel when I created this thread, I didn't have a great grasp on the questions to ask! I didn't take SRM and the VEE was a long time ago. I apologize for any lapse in knowledge that should be present given the prerequisites for this exam.

I guess the main thing I'm stuck on is: what is the point of the link function, and does it serve any purpose other than to dictate the target mean? We use logit with Bernoulli distributions, where the target mean is between 0 and 1. We might use log if we know the target mean should be positive.

Is that the sole purpose of the link function? Or are there other things we must consider when choosing a link function?

I believe I have an "okay" understanding of GLM looking back at the material and here's what I believe is true about GLM:

• GLMs are flexible and allow us to model our target variable using exponential family distributions. We want to use them if our target variable's behavior follows a binary (binomial/Bernoulli), strictly positive (Gamma/Inverse Gaussian), or discrete non-negative (Poisson) pattern.
• We use link functions as a way to connect our predictive variable to the target variable and "set" the mean (i.e., if the target should have a positive mean, use a log target).
• The change in predictive variables has a different impact on target variables than regular linear models, and this interpretation depends on the link function (i.e., if a log link is used, an increase of 1 in X1 results in an increase of e^B1 to the target).

Is there anything I've missed here in terms of base level understanding? Or is there anything I'm not understanding correctly?

Also thank you Dr Lo for creating this study manual. It is extremely helpful and I'd be DOA/FUBAR without it.

Last edited by SweepingRocks; 05-04-2020 at 12:43 AM.. Reason: Clarification
#5
05-04-2020, 07:09 AM
ambroselo
Member
SOA
Join Date: Sep 2018
Location: Iowa City
College: University of Iowa
Posts: 308

Quote:
Originally Posted by SweepingRocks View Post
I guess the main thing I'm stuck on is: what is the point of the link function, and does it serve any purpose other than to dictate the target mean? We use logit with Bernoulli distributions, where the target mean is between 0 and 1. We might use log if we know the target mean should be positive.
The link function serves to relate the target mean (the quantity of interest) to the predictor variables (the available information). For a binary target variable, there are link functions other than the logit link; see page 265 of the manual and Task 5 of the Hospital Readmissions sample project. However, the logit link is usually the most popular.
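As a rough illustration of what "other link functions for a binary target" can look like, the sketch below compares the inverse of the logit link with two commonly cited alternatives, the probit and the complementary log-log. Which alternatives page 265 of the manual actually covers is an assumption here; the function names are also made up for this example. Each function maps a linear predictor eta to a value in (0, 1), so any of them yields a valid Bernoulli mean.

```python
import math

def inv_logit(eta: float) -> float:
    """Inverse logit link: pi = 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))

def inv_probit(eta: float) -> float:
    """Inverse probit link: pi = standard normal CDF evaluated at eta."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))

def inv_cloglog(eta: float) -> float:
    """Inverse complementary log-log link: pi = 1 - exp(-exp(eta))."""
    return 1.0 - math.exp(-math.exp(eta))

# Arbitrary values of the linear predictor, for illustration only.
for eta in (-3.0, 0.0, 3.0):
    print(eta, inv_logit(eta), inv_probit(eta), inv_cloglog(eta))
```

The logit and probit curves are symmetric around eta = 0 (both give 0.5 there), while the complementary log-log is not.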

Quote:
Originally Posted by SweepingRocks View Post
Is that the sole purpose of the link function? Or are there other things we must consider when choosing a link function?
There are two important considerations when deciding on a link function: (1) Appropriateness of predictions; (2) Interpretability. See pages 223 and 225 of the manual, and Task 7 of the Dec 2019 exam and Task 5 of the June 2019 exam.

Quote:
Originally Posted by SweepingRocks View Post
We use link functions as a way to connect our predictive variable to the target variable and "set" the mean (i.e., if the target should have a positive mean, use a log target).
It is better to say the link function connects the predictor variables to the target mean; the target variable itself is not transformed.

Quote:
Originally Posted by SweepingRocks View Post
The change in predictive variables has a different impact on target variables than regular linear models, and this interpretation depends on the link function (i.e., if a log link is used, an increase of 1 in X1 results in an increase of e^B1 to the target).
To be precise, the increase is a multiplicative increase, i.e., the target mean is multiplied by exp(beta_1) for each unit increase in X1, holding everything else constant; see page 251 of the manual. Equivalently, the proportional change in the target mean is exp(beta_1) - 1, i.e., a percentage change of 100[exp(beta_1) - 1]%.
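A quick numerical check of the multiplicative interpretation, as a sketch only: the coefficients beta0 = 0.5 and beta1 = 0.2 and the values of X1 are made-up assumptions, not numbers from the manual. Under a log link the target mean is exp(beta0 + beta1 * x1), so raising x1 by 1 multiplies the mean by exp(beta1) regardless of the starting value.

```python
import math

def log_link_mean(x1: float, beta0: float = 0.5, beta1: float = 0.2) -> float:
    """Target mean under a log link: mu = exp(beta0 + beta1 * x1)."""
    return math.exp(beta0 + beta1 * x1)

beta1 = 0.2  # hypothetical coefficient, matching the default above

# Ratio of target means when x1 goes from 3 to 4 (any starting value works).
ratio = log_link_mean(4.0) / log_link_mean(3.0)

print("ratio of means:     ", ratio)
print("exp(beta1):         ", math.exp(beta1))
print("proportional change:", ratio - 1.0)
```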

Quote:
Originally Posted by SweepingRocks View Post
Also thank you Dr Lo for creating this study manual. It is extremely helpful and I'd be DOA/FUBAR without it.
I am glad you have found the manual useful!