kernel smoothing in real applications


rl2
11-25-2002, 11:37 AM
does anyone know of a good statistical reference (preferably online) that describes the differences/tradeoffs between the different types of kernels and how to choose bandwidths when doing kernel smoothing of H(t)?

i passed c4 by reading HTP, so naturally i have no clue how to actually use any of that stuff in real calculations ... :roll:

J-Man
11-25-2002, 12:35 PM
Perhaps the new course 4 study note?

rl2
11-25-2002, 01:30 PM
i did. it seems the survival material has been deleted from the syllabus altogether. yuck.

aces219
11-25-2002, 02:15 PM
No it hasn't. See page 18 of the Klugman study note Estimation, Evaluation, and Selection of Actuarial Models. It can be found at this URL: http://www.soa.org/eande/spring03_catalog/c4nov20.pdf

ma'ak
11-25-2002, 02:28 PM
okay, totally gonna probably spell this wrong, but i think the book seems to say that the 'epanichov' kernel does the best job. if i remember correctly, when you go through the examples in the book, it tells you the results of each of the 3 kernels and i believe this produces the best results, which isn't surprising, seeing as it IS the most complicated of the 3.

i think that's the case with the confidence intervals also. in the examples it turns out, in most cases, that the arcsine CI is typically the best, which again, shouldn't be surprising, 'cause it's more technical than linear or log transformed.

aces219
11-25-2002, 03:12 PM
Actually, the biweight kernel is more complicated than the EP kernel.

EP: K(x) = (3/4)(1-x^2), for -1 <= x <= 1

Biweight: K(x) = (15/16)(1-x^2)^2, for -1 <= x <= 1

J-Man
11-25-2002, 06:11 PM
The Epanechnikov (sp) and biweight kernels seem to belong to the same "family of kernels" of the form K(x) = C(1 - x^2)^a. In this case, the constant C is chosen so that the kernel integrates to 1 on [-1,1]. So in this sense, the biweight kernel is not much more complicated than the Epanechnikov kernel. There is probably a triweight kernel, with a=3, a quadweight kernel, with a=4, and so on.
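
A quick way to check the "C is chosen so the kernel integrates to 1" point is to integrate (1 - x^2)^a numerically and take the reciprocal. This is only a sketch: the helper names `simpson` and `normalizing_constant` are made up for illustration, and the integration rule is a plain composite Simpson's rule rather than anything from the book.

```python
def simpson(f, lo, hi, n=1000):
    """Composite Simpson's rule on [lo, hi]; n must be even."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        # Odd interior nodes get weight 4, even interior nodes weight 2.
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3


def normalizing_constant(a):
    """C such that C * (1 - x^2)^a integrates to 1 on [-1, 1]."""
    return 1.0 / simpson(lambda x: (1 - x * x) ** a, -1.0, 1.0)


for a, name in [(1, "Epanechnikov"), (2, "biweight"), (3, "triweight")]:
    print(f"a={a} ({name}): C = {normalizing_constant(a):.6f}")
```

For a = 1 and a = 2 this reproduces the 3/4 and 15/16 quoted earlier in the thread. For a = 3 it gives 35/32, which is the constant usually quoted for the triweight kernel.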

jaegar
11-25-2002, 09:12 PM
J-Man,

You can also include the uniform kernel within your family of kernels.

K(u) = [1 - (1/2)^(2^n)]*[(1 - u^2)^n]

n = 0 - Uniform
n = 1 - Epanechnikov
n = 2 - Biweight
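
For anyone who wants to test how far that pattern goes, here is a small sketch comparing the constant 1 - (1/2)^(2^n) from the post above with the exact normalizing constant of (1 - u^2)^n on [-1, 1]. The function names are made up for illustration. The exact value uses the standard closed form for the integral of (1 - u^2)^n over [-1, 1], which is 2^(2n+1) (n!)^2 / (2n+1)!.

```python
from fractions import Fraction
from math import factorial


def exact_constant(n):
    """Exact C such that C * (1 - u^2)^n integrates to 1 on [-1, 1]."""
    return Fraction(factorial(2 * n + 1), 2 ** (2 * n + 1) * factorial(n) ** 2)


def pattern_constant(n):
    """The constant from the post above: 1 - (1/2)^(2^n)."""
    return 1 - Fraction(1, 2) ** (2 ** n)


for n in range(4):
    print(n, exact_constant(n), pattern_constant(n))
```

The two agree for the three cases listed (n = 0, 1, 2). At n = 3 they part ways: the exact constant is 35/32, while the pattern gives 255/256. So the pattern looks like a neat coincidence for the first three kernels rather than a general formula.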

J-Man
11-26-2002, 07:49 AM
Egads!

(Cool observation, especially about the constants in front.)

Mopus
11-26-2002, 09:10 PM
I'm no expert but my understanding is that bandwidth selection is data dependent as a practical matter and will depend on the kernel you want to use. You can try and download a copy of R and sift through the algorithms and documentation to see what you're dealing with. At the very least it will point you in the right direction. If you have access to SAS it will probably have a canned package that'll do what you want.

As far as the EP kernel goes, it has some asymptotic optimality properties that make it theoretically relevant when coupled with a bandwidth h chosen to be O(n^(-1/5)). See Lehmann (Elements of Large-Sample Theory) or van der Vaart (Asymptotic Statistics) for details.
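
To make the bandwidth discussion concrete, here is a rough sketch of a kernel-smoothed hazard estimate built from the jumps of the Nelson-Aalen estimate of H(t), using the EP kernel. Everything specific in it is an assumption for illustration: the data are made up, the function names are hypothetical, the constant c in b = c * n^(-1/5) is arbitrary (only the n^(-1/5) rate comes from the theory), and no boundary correction is applied near the ends of the data.

```python
def epanechnikov(x):
    """EP kernel: (3/4)(1 - x^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1 - x * x) if abs(x) <= 1 else 0.0


def nelson_aalen_increments(event_times, deaths, at_risk):
    """Jumps of the Nelson-Aalen estimate of H(t): d_i / r_i at each event time."""
    return [(t, d / r) for t, d, r in zip(event_times, deaths, at_risk)]


def smoothed_hazard(t, increments, b, kernel=epanechnikov):
    """h_hat(t) = (1/b) * sum_i K((t - t_i) / b) * dH(t_i); no boundary correction."""
    return sum(kernel((t - ti) / b) * dH for ti, dH in increments) / b


# Hypothetical data: event times, deaths at each time, number at risk.
times = [2, 5, 8, 11, 15, 20]
deaths = [1, 2, 1, 1, 2, 1]
at_risk = [20, 18, 15, 12, 9, 5]
inc = nelson_aalen_increments(times, deaths, at_risk)

n = 20    # hypothetical sample size
c = 3.0   # hypothetical constant; must be tuned to the data in practice
b = c * n ** (-0.2)

for t in (5, 10, 15):
    print(t, round(smoothed_hazard(t, inc, b), 4))
```

Rerunning the loop with a few different values of b shows the tradeoff: a small bandwidth gives a spiky estimate that is zero between event times, while a large one flattens everything out.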

StephenLL
11-26-2002, 09:28 PM
Matlab has it built in as well. I believe the name of the function is ksdensity() in the stats toolbox. I'm not sure if the freeware alternative to Matlab, Octave, has any built-in kernel density functions.
