01-30-2018, 01:25 PM
 Frustrated & Unmotivated
random sampling

What are the advantages (if any) of using random samples vs. the entire population if the computing power exists to analyze the whole population?

E.g., you want to compare an insured of Class X, Subclass Y, Territory Z with other such XYZ'ers.

Assume also the randomizer is sufficiently random to avoid selection bias.

Background:
We have an in-house tool that does this comparison using random sampling. It was created years ago. I'm wondering why we shouldn't just compare against the whole population. I studied this years ago, so I've forgotten most of it. Yet I still have a needle in my brain telling me that random sampling is better.
01-30-2018, 01:31 PM
 Frustrated & Unmotivated

Moved from P&C for greater exposure. I figure it's a general stats question.
01-30-2018, 02:00 PM
 Guinness

Setting aside computational efficiency, one thing that springs to mind is using such a partition to keep yourself from over-interpreting random variation. You could use one random sample for ideation and another to test whether the idea really makes sense. Another thing you could always do is compute bootstrapped statistics to check the variability of various statistics: mean, standard deviation, etc.
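
A rough sketch of both ideas in Python, using only the standard library. The function names (split_population, bootstrap_se) and the loss-ratio numbers are made up for illustration; they are not from any in-house tool. It splits a population into an ideation half and a holdout half, then bootstraps the standard error of the mean and of the standard deviation.

```python
import random
import statistics

def split_population(values, frac=0.5, seed=0):
    """Randomly partition values into an ideation set and a holdout set."""
    rng = random.Random(seed)
    shuffled = list(values)          # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * frac)
    return shuffled[:cut], shuffled[cut:]

def bootstrap_se(values, stat=statistics.mean, n_resamples=1000, seed=0):
    """Estimate the standard error of `stat` by resampling with replacement."""
    rng = random.Random(seed)
    n = len(values)
    estimates = [stat(rng.choices(values, k=n)) for _ in range(n_resamples)]
    return statistics.stdev(estimates)

# Hypothetical data: 500 simulated loss ratios (assumption, illustration only)
data_rng = random.Random(42)
losses = [data_rng.gauss(0.65, 0.2) for _ in range(500)]

ideation, holdout = split_population(losses)
print("sizes:", len(ideation), len(holdout))
print("bootstrap SE of mean:", bootstrap_se(ideation))
print("bootstrap SE of std dev:", bootstrap_se(ideation, stat=statistics.stdev))
```

Whatever pattern you spot in the ideation half, you would then check against the holdout half before believing it.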
02-13-2018, 09:41 PM
 PAC

Overfitting... it's a general modeling concept. You should have a strategy to split the data into train/test/validate sets, stratified and with random sampling.
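
One way such a stratified split might look, as a minimal standard-library sketch. The stratified_split helper, the 60/20/20 fractions, and the territory field are assumptions for illustration only. Records are shuffled within each stratum and then divided, so each of the three sets keeps roughly the same mix of strata as the full population.

```python
import random
from collections import defaultdict

def stratified_split(records, key, fractions=(0.6, 0.2, 0.2), seed=0):
    """Split records into train/test/validate, sampling randomly within each stratum."""
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for rec in records:
        by_stratum[key(rec)].append(rec)

    train, test, validate = [], [], []
    for group in by_stratum.values():
        rng.shuffle(group)                       # random order within the stratum
        n_train = round(len(group) * fractions[0])
        n_test = round(len(group) * fractions[1])
        train.extend(group[:n_train])
        test.extend(group[n_train:n_train + n_test])
        validate.extend(group[n_train + n_test:])  # remainder goes to validate
    return train, test, validate

# Hypothetical records: 150 in territory Z1 and 50 in territory Z2
records = [{"id": i, "territory": "Z2" if i % 4 == 0 else "Z1"} for i in range(200)]
train, test, validate = stratified_split(records, key=lambda r: r["territory"])
print(len(train), len(test), len(validate))
```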