Understanding Non-Parametric Density Functions and Kernel Density Estimation

Explore non-parametric density functions, influence functions, notations, and the key idea behind kernel density estimation. Learn about Gaussian kernel density functions, kNN-based density functions, and how to plot density functions for analysis.

  • Density Functions
  • Kernel Estimation
  • Influence
  • Notations
  • Gaussian

Presentation Transcript


  1. Non-Parametric Density Functions. Christoph F. Eick, 12/23/12, updated 8/8/23.

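  2. Influence Functions. The influence function of a data object $y \in F^d$ is a function $f_B^y : F^d \to \mathbb{R}_0^+$, which is defined in terms of a basic influence function $f_B$ (the superscript $d$ in $F^d$ is the dimensionality of the data space, not the distance function $d$):
     $$f_B^y(x) = f_B(x, y)$$
     Examples of basic influence functions are:
     $$f_{Square}(x, y) = \begin{cases} 0 & \text{if } d(x, y) > \sigma \\ 1 & \text{otherwise} \end{cases}$$
     $$f_{Gauss}(x, y) = e^{-\frac{d(x, y)^2}{2\sigma^2}}$$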
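
The two basic influence functions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `euclidean`, `f_square`, and `f_gauss` are made up for this example, points are plain tuples, and `sigma` is the width parameter of the influence function.

```python
import math

def euclidean(x, y):
    """Euclidean distance d(x, y) between two points given as tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def f_square(x, y, sigma):
    """Square-wave influence: 0 if d(x, y) > sigma, 1 otherwise."""
    return 0.0 if euclidean(x, y) > sigma else 1.0

def f_gauss(x, y, sigma):
    """Gaussian influence: exp(-d(x, y)^2 / (2 * sigma^2))."""
    d = euclidean(x, y)
    return math.exp(-(d * d) / (2.0 * sigma * sigma))
```

The square-wave function only reports whether `x` lies within distance `sigma` of `y`, while the Gaussian function decays smoothly with distance and equals 1 when `x` and `y` coincide.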

  3. Notations
     • $O = \{o_1, \ldots, o_n\}$ is a dataset whose density has to be measured
     • $r$ is the dimensionality of $O$
     • $n$ is the number of objects in $O$
     • $d : F^r \times F^r \to \mathbb{R}_0^+$ is a distance function for the objects in $O$; $d$ is assumed to be Euclidean distance unless specified otherwise
     • $d_k(x)$ is the distance of $x$ to its $k$-th nearest neighbor in $O$

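  4. Key Idea: Kernel Density Estimation. $D = \{x_1, x_2, x_3, x_4\}$
     $$f_{Gaussian}^D(x) = influence(x_1, x) + influence(x_2, x) + influence(x_3, x) + influence(x_4, x) = 0.04 + 0.06 + 0.08 + 0.6 = 0.78$$
     [Figure: the data points $x_1$, $x_2$, $x_3$, $x_4$ with their influences on the query point $x$ (0.04, 0.06, 0.08, and 0.6, respectively), and a second query point $y$.]
     Remark: the density value of $y$ would be larger than the one for $x$.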
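
The idea of summing influences can be tried on a toy 1D dataset. The coordinates below are hypothetical (the slide's figure does not give any), as are the names `influence` and `density` and the choice `sigma = 1.0`; the layout only mimics the figure, with three points clustered near `y` and a single point near `x`.

```python
import math

def influence(o, x, sigma=1.0):
    """Gaussian influence of data point o on query point x (1D case)."""
    return math.exp(-((x - o) ** 2) / (2.0 * sigma ** 2))

def density(data, x, sigma=1.0):
    """Density at x: the sum of the influences of all data points on x."""
    return sum(influence(o, x, sigma) for o in data)

# Hypothetical coordinates for x1, x2, x3, x4.
D = [1.0, 1.5, 2.0, 6.0]
x = 6.5   # close to a single data point only
y = 1.6   # in the middle of three data points

print(density(D, x), density(D, y))
```

With these made-up coordinates, `y` receives sizable contributions from three nearby points, whereas `x` is dominated by one, so the density at `y` comes out larger.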

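  5. Gaussian Kernel Density Functions. Density functions $f_B^O$ are defined as the sum of the influence functions of all data points. Given $n$ data objects, $O = \{o_1, \ldots, o_n\}$, the density function is defined as follows.
     Normalized version:
     $$f_B^O(x) = \frac{1}{Z} \sum_{i=1}^{n} f_B(x, o_i)$$
     where the normalizing constant $Z$ is a product of $n$ and further factors, one of them raised to the power $r$; the individual factors are not legible in this transcript.
     Non-normalized version:
     $$f_B^O(x) = \sum_{i=1}^{n} f_B(x, o_i)$$
     Example: Gaussian kernel non-parametric density function
     $$f_{Gauss}^O(x) = \sum_{i=1}^{n} e^{-\frac{d(x, o_i)^2}{2\sigma^2}}$$
     Remark: $\sigma$ is called kernel width or bandwidth; it is named $h$ in some textbooks!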
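
A minimal sketch of both versions for r-dimensional points follows. The normalizing constant of the slide is not recoverable from this transcript, so the sketch assumes the standard constant of an isotropic Gaussian, n * (2*pi)^(r/2) * sigma^r; this is an assumption, not necessarily the slide's formula. The function names and the sample data are hypothetical as well.

```python
import math

def sq_dist(x, o):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(x, o))

def gauss_density(data, x, sigma):
    """Non-normalized version: sum_i exp(-d(x, o_i)^2 / (2 * sigma^2))."""
    return sum(math.exp(-sq_dist(x, o) / (2.0 * sigma ** 2)) for o in data)

def gauss_density_normalized(data, x, sigma):
    """Normalized version, ASSUMING the standard Gaussian constant
    n * (2*pi)^(r/2) * sigma^r, so that the density integrates to 1."""
    n = len(data)
    r = len(data[0])
    z = n * (2.0 * math.pi) ** (r / 2.0) * sigma ** r
    return gauss_density(data, x, sigma) / z

# Numerical check on a hypothetical 1D dataset: a Riemann sum of the
# normalized density over [-10, 14] should be close to 1.
data = [(0.0,), (1.0,), (4.0,)]
sigma = 0.5
step = 0.01
grid = [(-10.0 + i * step,) for i in range(2401)]
area = sum(gauss_density_normalized(data, g, sigma) for g in grid) * step
print(area)
```

The non-normalized version is enough when densities are only compared with each other on the same dataset; the normalized version matters when the values are to be read as a probability density.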

  6. Other Material on KDE
     • Kernel density estimation, Wikipedia (KDE for short)
     • 03Gaussiankernel.nb (wisc.edu)
     • Example: GKD function for a 1D dataset; $h$ is $\sigma$ in the formulas before!

  7. Density Plots
     • https://en.wikipedia.org/wiki/Probability_density_function
     • http://ggplot2.tidyverse.org/reference/geom_density_2d.html
     • https://python-graph-gallery.com/2d-density-plot/

  8. kNN-based Density Functions. Example: kNN density function with variable kernel width. The width is proportional to $d_k(x)$ at point $x$, the distance of the $k$-th nearest neighbor in $O$ to $x$. This function is usually difficult to normalize due to the variable kernel width; therefore, the integral over the density function does not add up to 1 and usually is infinite. Parameter selection: instead of the width $\sigma$, $k$ has to be selected!
     $$f_{kNN}^O(x) = \sum_{i=1}^{n} e^{-\left(\frac{d(x, o_i)}{d_k(x)}\right)^2}$$
     Alpaydin mentions the following variation:
     $$f_{kNN}^O(x) = \frac{1}{n \cdot d_k(x)} \sum_{i=1}^{n} e^{-\left(\frac{d(x, o_i)}{d_k(x)}\right)^2}$$
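
The kNN-based density can be sketched as below. It assumes the exponent is the squared ratio d(x, o_i) / d_k(x), which is one reading of the garbled slide formula that matches the statement that the width is proportional to d_k(x). All function names and the sample data are hypothetical.

```python
import math

def dist(x, o):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, o)))

def d_k(data, x, k):
    """Distance from x to its k-th nearest neighbor in data (k starts at 1)."""
    return sorted(dist(x, o) for o in data)[k - 1]

def knn_density(data, x, k):
    """Sum of Gaussian influences whose width is d_k(x) (assumed form)."""
    width = d_k(data, x, k)
    if width == 0.0:
        # Happens when x coincides with at least k data points.
        raise ValueError("d_k(x) is zero; the kernel width is undefined")
    return sum(math.exp(-(dist(x, o) / width) ** 2) for o in data)

def knn_density_scaled(data, x, k):
    """Variation with the additional factor 1 / (n * d_k(x))."""
    return knn_density(data, x, k) / (len(data) * d_k(data, x, k))

data = [(0.0,), (1.0,), (2.0,), (10.0,)]
print(knn_density(data, (1.0,), 2))   # inside the cluster
print(knn_density(data, (1e6,), 2))   # very far from all data points
```

Far away from the data every ratio d(x, o_i) / d_k(x) approaches 1, so the unscaled value approaches n * e^(-1) instead of 0, which illustrates why the integral over the whole space does not stay finite.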
