Actuarial Outpost Help me understand R^2, Correlation, and Causality
03-14-2007, 06:33 PM
 zeroEthix

I have an array of data and the R^2 is .2632 and the correlation is .5130. I ran other tests and found that correlation tends to be larger than R^2.

What's the relationship between these two and how is causality involved??

This is pure maths masturbation in Excel.
03-14-2007, 06:38 PM
 asdfasdf

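In a simple regression with one x variable, correlation is the square root of r^2 (taking the sign of the slope). Since r^2 is between 0 and 1, its square root is at least as large in magnitude, which is why correlation looks bigger. And correlation does not imply causality.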
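
To check the relationship numerically, here is a minimal sketch in Python with NumPy (the thread itself used Excel). The helper name `r_and_r_squared` and the data are made up for illustration. It fits a one-variable least-squares line and compares the Pearson correlation with the R^2 of that fit.

```python
import numpy as np

def r_and_r_squared(x, y):
    """Return (Pearson r, R^2 of the least-squares line y = a + b*x)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r = np.corrcoef(x, y)[0, 1]          # Pearson correlation
    b, a = np.polyfit(x, y, 1)           # slope, intercept of the fitted line
    resid = y - (a + b * x)
    ss_res = np.sum(resid ** 2)          # unexplained variation
    ss_tot = np.sum((y - y.mean()) ** 2) # total variation
    return r, 1.0 - ss_res / ss_tot

# Hypothetical data, not the original poster's
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 1.8, 3.9, 3.2, 5.5, 4.1, 6.8, 6.0]

r, r2 = r_and_r_squared(x, y)
print(r, r2, r ** 2)  # the last two values should agree

# Flipping the sign of y flips the sign of r but leaves R^2 unchanged
r_neg, r2_neg = r_and_r_squared(x, [-v for v in y])
```

The numbers in the first post are consistent with this: 0.5130 squared is about 0.2632. The identity holds for a single x variable with an intercept; with several x variables, R^2 is instead the squared correlation between Y and the fitted values.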
03-14-2007, 06:43 PM
 Elisha

While we're at it, can someone refresh my memory on multi-collinearity as well? I think that is where you have several similar x-variables in a multiple regression (e.g. using 1Q06 and 2006).
03-14-2007, 06:47 PM
 zeroEthix

I knew it was easy

So R=Correlation
03-14-2007, 09:25 PM
 asdfasdf

Quote:
 Originally Posted by Elisha While we're at it, can someone refresh my memory on multi-collinearity as well? I think that is where you have several similar x-variables in a multiple regression (e.g. using 1Q06 and 2006).
I'm a little hazy, but I remember something from course 7. In a multiple linear regression model you still have only one Y, but there are several x terms. You have to watch out, because each x term you add will raise the model's r^2 (or at least never lower it), even when some of the x terms are inter-related. So you calculate an adjusted r^2, which penalizes each extra x variable, to see whether adding more of them is really helping your model.

(because if you have 10 data points, a 9-variable equation can fit them exactly, but that isn't as good for forecasting as a one-variable equation)
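
The 10-points, 9-variables remark can be tried out directly. The sketch below is an illustration in Python with NumPy, using simulated data and a made-up helper `r2_and_adjusted`. It assumes the usual textbook formula, adjusted R^2 = 1 - (1 - R^2)(n - 1)/(n - k - 1), where n is the number of data points and k the number of x variables.

```python
import numpy as np

def r2_and_adjusted(X, y):
    """Fit y on the columns of X (plus an intercept); return (R^2, adjusted R^2)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])          # add intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)  # ordinary least squares
    resid = y - A @ beta
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    dof = n - k - 1                               # residual degrees of freedom
    adj = 1.0 - (1.0 - r2) * (n - 1) / dof if dof > 0 else float("nan")
    return r2, adj

# Simulated data: y depends on x1 only; the other 8 columns are pure noise
rng = np.random.default_rng(0)
n = 10
x1 = rng.normal(size=n)
y = 2.0 * x1 + rng.normal(size=n)
X9 = np.column_stack([x1, rng.normal(size=(n, 8))])

# Use the first k columns for k = 1, 3, 6, 9
results = {k: r2_and_adjusted(X9[:, :k], y) for k in (1, 3, 6, 9)}
for k, (r2, adj) in results.items():
    print(k, r2, adj)
```

R^2 can only go up as columns are added, and with 9 variables plus an intercept the 10 points are fitted exactly, so R^2 reaches 1 even though 8 of the variables are noise. Adjusted R^2 is undefined in that last case because no degrees of freedom are left.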
03-14-2007, 11:24 PM
 Renaissance Man

Multicollinearity means essentially highly correlated covariates. Neither hard to detect nor hard to resolve (except perhaps in poorly designed and collected data, but that's really a different problem).
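
One common way to detect it is the variance inflation factor (VIF), which is not mentioned in this thread and is offered here only as an example. The sketch below, in Python with NumPy, uses simulated data and a hypothetical helper `vif`. Each x variable is regressed on the others, and VIF = 1 / (1 - R^2) of that regression. A frequently quoted rule of thumb treats values above roughly 5 or 10 as a warning sign.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    out = []
    for j in range(k):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])        # intercept + other x's
        beta, *_ = np.linalg.lstsq(A, target, rcond=None)
        resid = target - A @ beta
        r2 = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

# Simulated covariates: b is nearly a copy of a, c is unrelated to both
rng = np.random.default_rng(1)
n = 200
a = rng.normal(size=n)
b = a + 0.05 * rng.normal(size=n)
c = rng.normal(size=n)

vifs = vif(np.column_stack([a, b, c]))
print(vifs)  # large for a and b, close to 1 for c
```

A usual remedy is to drop or combine the redundant variables, for instance keeping only one of a quarterly figure and the annual figure that contains it.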
03-15-2007, 06:56 AM
 horace goldfarb

Heteroskedasticity! Sorry, I couldn't help myself.
03-15-2007, 07:20 AM
 horace goldfarb

Quote:
 Originally Posted by asdfasdf (because if you have 10 data points, a 9-variable equation can fit them exactly, but that isn't as good for forecasting as a one-variable equation)
Only if there is some single-parameter parametrization that describes the behavior of the data. If your 9 parameters aren't multi-collinear, that may not be the case.

03-15-2007, 09:24 AM
 celery

Quote:
 Originally Posted by zeroEthix This is pure maths masturbation in Excel.
Forgive me for asking a stupid question, but how does the bolded term relate to pure mathematics? Isn't it more applied than pure?
03-15-2007, 09:48 AM
 Chief Petosky

There are no stupid questions ... only stupid answers ... except in this thread, where all the answers are brilliant. I have to hand it to you all for a job well done.
