Actuarial Outpost (http://www.actuarialoutpost.com/actuarial_discussion_forum/index.php)
-   Software & Technology (http://www.actuarialoutpost.com/actuarial_discussion_forum/forumdisplay.php?f=17)
-   -   Pseudorandom number generation in Excel (http://www.actuarialoutpost.com/actuarial_discussion_forum/showthread.php?t=170697)

 campbell 07-16-2009 03:09 PM

Pseudorandom number generation in Excel

I was asked as to why RAND() in Excel sucks in a different thread.

I believe this is an important enough topic I'm spinning it off to here.

I'm going to give two practical reasons up front - and this holds for all versions of Excel that have a RAND() function. I can talk about tests of randomness at a later time. The first issue is relatively minor, if annoying, but the second issue is a big bold X over using the function for serious Monte Carlo modeling.

1. RAND() is volatile

That sucker reevaluates every time the spreadsheet recalculates, which could be every time you change a cell if you have it on automatic calculation. If you have too many RAND() calls, your spreadsheet will be slower than molasses even if you just type a short string in a cell.

A bit of info on volatile functions:
http://www.decisionmodels.com/calcsecretsi.htm

You could keep your spreadsheet in manual calculation mode, but that has dangers of its own.

2. RAND() is an unseeded PRNG

This is important for serious Monte Carlo applications. You need to be able to reproduce a particular sequence of numbers. The way this is accomplished in most PRNGs is by having a seed for the PRNG, which is an argument passed to the generator. This will set the initial "random" number for a sequence, and then you will get a specific "random" sequence out for however long it takes until it repeats.

One of the real reasons you need this seed is so you can put the PRNG through a bunch of tests - the Diehard suite of tests is common, but there are other tests you can get from the National Institute of Standards and Tech [NIST].

When you combine 1 & 2, you basically get an untestable algorithm, and I really can't rely on Microsoft statements such as this one:
http://support.microsoft.com/kb/828795

They claim that the current [Excel 2003 & 2007] RAND() has a period of >10^13 [so it will take over 10 trillion calls before it repeats] and that the underlying PRNG satisfies the Diehard tests.

But how can I test that? Sure, I can get around the volatility issue by setting calculation to manual, but then I have the issue of generating over 10 trillion numbers to try to catch the repeat... but because of the structure of Excel, I'm not quite sure what the "order" of these numbers is supposed to be. With seeded PRNGs, you can say "I want the 1000th number generated in the sequences seeded with 2362357" [or whatever]. I can't do that with Excel's RAND().

Of course, I can get around the reproducibility problem by saving all the numbers generated in my pseudorandom sequence... but why should I have to when there are perfectly decent seeded PRNGs out there?

More later.

 tommie frazier 07-16-2009 03:19 PM

thanks campbell. I have always heard that it was lousy for serious simulations, but hadn't cared to dig into the why of it.

Thanks, that makes sense. The volatility thing is truly annoying, but I guess excel will work for what I need. I just wanted to make sure it wasn't wrong.

 1695814 07-16-2009 04:03 PM

Quote:
 Originally Posted by campbell (Post 3774169) This is important for serious Monte Carlo applications. You need to be able to reproduce a particular sequence of numbers.
why is that?

because of this?
Quote:
 Originally Posted by campbell (Post 3774169) One of the real reasons you need this seed is so you can put the PRNG through a bunch of tests.
and why is that? to show that it's random-enough?

 JMO 07-16-2009 04:05 PM

Quote:
 Originally Posted by 1695814 (Post 3774445) why is that?

For your auditors who want you to document your MonteCarlo process. For doing reruns of a stochastic model with corrected data. For coming up with a winning strategy at roulette.

Quote:
 because of this?and why is that? to show that it's random-enough?
yes, exactly.

And, :heynow: when I first started doing actuarial work, PCs had not been invented.

So we had a statistics book that included a table of random numbers. It was used to sample our activities at various times so that we wouldn't have to keep a detailed time log. This data was then used in expense allocations.

It was also useful for selecting samples, such as for selecting data for the auditors. Really.

 Bubba Colbert 07-16-2009 04:06 PM

Pseudolus is one of a kind, I don't think you can randomly generate more of him. Although Mrs. Pseudolus has the best shot at coming close.

 DeepPurple 07-16-2009 04:43 PM

I think that you can seed the Excel PRNG in VB.

 E 07-16-2009 04:47 PM

Quote:
 Originally Posted by Bubba Colbert (Post 3774460) Pseudolus is one of a kind, I don't think you can randomly generate more of him. Although Mrs. Pseudolus has the best shot at coming close.
Now I think you're alluding to the RANDY() function.

Spoiler:

sarcasm <> hijacking

 echo 07-16-2009 05:09 PM

Quote:
 Originally Posted by campbell (Post 3774169) 1. RAND() is volatile That sucker reevaluates every time the spreadsheet recalculates, which could be every time you change a cell if you have it on automatic calculation.
This is Microsoft's solution. From the help manual under RAND():

"If you want to use RAND to generate a random number but don't want the numbers to change every time the cell is calculated, you can enter =RAND() in the formula bar, and then press F9 to change the formula to a random number. "

What a wonderful solution! What could be better? Interesting that they don't even mention copy/ paste special(values) or as you suggested setting the sheet to manual calculation.

 spencerhs5 07-16-2009 05:20 PM

If doing simulation wouldnt you be likely to be using VBAs rnd function? Which if I recall has limitations as well but can be initialized so that its truely random. I could be wrong, but I think this is what I remember.

All times are GMT -4. The time now is 05:33 AM.