Actuarial Outpost Variance Question

 Short-Term Actuarial Math Old Exam C Forum

#1
06-22-2018, 10:15 AM
 ericp Member Join Date: Aug 2007 Posts: 282
Variance Question

I've seen the empirical variance expressed two different ways and I don't understand when to use which.

i. Var = F*S/N
ii. Var = F*S*N

The latter looks like the variance of a binomial, which makes sense to me in situations where we want, say, the number of data points below a value, like F(1500). However, this is exactly the case in SOA #227, yet the solution uses the former (i). I don't understand why.

If we wanted the variance of F(1500) why wouldn't it be:
(# pts below 1500 / N) * (# pts above 1500 / N) * N, because this would be the variance of a binomial with p = F and (1-p) = S, multiplied by N.

When finding variance of grouped data, specifically, when the value we want is within a group, we use ii. Yet, in the simulation section we write the variance as in i.

Can someone explain simply what the difference is or why we use one or the other?
thanks.
#2
06-22-2018, 02:19 PM
 Abraham Weishaus Member SOA AAA Join Date: Oct 2001 Posts: 7,181

F(1500) is not the number of points below 1500.

You must distinguish between a binomial random variable and a binomial proportion random variable.

A binomial random variable ALWAYS assumes only integral values. A binomial random variable can never be 1/2, for example. Your formula ii is the formula for its variance.

A binomial proportion random variable is a binomial random variable divided by its parameter m (or N as you're calling it) and is almost never integer valued (unless it is 0 or 1). It is therefore easy to distinguish this from a binomial random variable. Your formula i is the formula for its variance. Another way to distinguish it from a binomial random variable is that it is always between 0 and 1, whereas a binomial random variable usually can be higher than 1 (unless m=1).
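To spell out why dividing by m turns formula ii into formula i, here is a short derivation. The symbols X (the count), q (the true probability of falling at or below the point) and m are notation introduced here for illustration; in the original question, F estimates q, S estimates 1 - q, and N plays the role of m.

```latex
X \sim \mathrm{Binomial}(m, q)
\quad\Longrightarrow\quad
\operatorname{Var}(X) = m\,q\,(1-q)

\operatorname{Var}\!\left(\frac{X}{m}\right)
  = \frac{\operatorname{Var}(X)}{m^{2}}
  = \frac{m\,q\,(1-q)}{m^{2}}
  = \frac{q\,(1-q)}{m}
```

Substituting the estimates gives F*S*N for the count and F*S/N for the proportion, so the two formulas differ by a factor of N^2.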

F(1500) assumes fractional values and is between 0 and 1, so it is a binomial proportion random variable.

The number of points below 1500 is always an integer, so it is a binomial random variable.
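A minimal sketch of the distinction in Python, assuming a small made-up sample (the data values and the function name `empirical_variances` are hypothetical, not from SOA #227). It computes both the proportion at or below a point and the count at or below it, along with the estimated variance of each.

```python
def empirical_variances(data, x):
    """Return (F, count, var_proportion, var_count) for the point x.

    F is the empirical proportion of observations <= x (a binomial
    proportion); count is the number of observations <= x (a binomial).
    """
    n = len(data)
    count = sum(1 for v in data if v <= x)  # always an integer
    F = count / n                           # always between 0 and 1
    S = 1 - F
    var_proportion = F * S / n              # formula i
    var_count = F * S * n                   # formula ii
    return F, count, var_proportion, var_count


# Hypothetical sample of five losses, for illustration only.
sample = [500, 1000, 1200, 1600, 2000]
F, count, var_prop, var_count = empirical_variances(sample, 1500)
print(F, count, var_prop, var_count)
```

With this sample, three of the five points are at or below 1500, so the proportion is 3/5 and the count is 3; the variance estimate for the count is N^2 = 25 times the one for the proportion.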

I leave it to you to go through your other examples and to determine which type of random variable they are.
#3
06-22-2018, 04:01 PM
 ericp Member Join Date: Aug 2007 Posts: 282

Thank you. Your explanation makes sense. I will need to go back and find examples where the variance is based on the binomial number (the count), not the proportion.