

The output has different dimensions to the input data

tommyod commented on June 15, 2024

The output has different dimensions to the input data


Comments (2)

tommyod commented on June 15, 2024

I would like to determine the KDE for each point of my data.

FFTKDE is much faster than scipy, but that speed comes from a trade-off in the implementation. From the FFTKDE docs:

This class implements a convolution (FFT) based computation of a KDE. While this implementation is very fast, there are some limitations: (1) the bandwidth must be constant, (2) the KDE must be evaluated on an equidistant grid and (3) the grid must encompass every data point. The finer the grid, the smaller the error.

You can increase the number of grid points, or use a custom grid (but it must be equidistant). See the documentation of the .evaluate() method.

Something like this should do the trick:

from KDEpy import FFTKDE

# customer_2d is your data, an array of shape (n_samples, 2)
grid_points = (64, 128)  # 64 grid points in first dimension, 128 in second
grid, points = FFTKDE(kernel="gaussian").fit(customer_2d).evaluate(grid_points)

grid_points = 64  # 64 grid points in both dimensions
grid, points = FFTKDE(kernel="gaussian").fit(customer_2d).evaluate(grid_points)

For almost every practical purpose (e.g. plotting), 1024 grid points should be more than enough and the fact that the grid is equidistant does not matter. 1024 was chosen as the default because a monitor is often 1080 pixels wide, so even a full-screen plot of a KDE will have negligible error due to grid point interpolation. Again, this is for most use cases :)
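If you need one density value per data point rather than per grid point, one option is to interpolate the grid values back onto the data. Here is a minimal 1D sketch using only NumPy. Note that `gaussian_kde_on_grid` is a hypothetical stand-in for the gridded output FFTKDE would give you, and the bandwidth of 0.3 and the synthetic data are arbitrary choices for illustration.

```python
import numpy as np


def gaussian_kde_on_grid(data, grid, bw):
    """Naive Gaussian KDE evaluated at `grid` (hypothetical stand-in for FFTKDE output)."""
    diffs = (grid[:, None] - data[None, :]) / bw
    kernel_values = np.exp(-0.5 * diffs**2) / (bw * np.sqrt(2 * np.pi))
    return kernel_values.mean(axis=1)


# Synthetic 1D data, assumed here only for illustration.
rng = np.random.default_rng(0)
data = rng.normal(size=200)
bw = 0.3

# Equidistant grid that encompasses every data point.
grid = np.linspace(data.min() - 3 * bw, data.max() + 3 * bw, 1024)
density_on_grid = gaussian_kde_on_grid(data, grid, bw)

# Linear interpolation from the grid back onto the data points,
# so the result has the same length as the input.
density_at_data = np.interp(data, grid, density_on_grid)

# Compare against evaluating the same KDE directly at the data points.
exact = gaussian_kde_on_grid(data, data, bw)
max_abs_error = np.max(np.abs(density_at_data - exact))
print(density_at_data.shape, max_abs_error)
```

With KDEpy on 1D data, the grid and values returned by `FFTKDE(...).fit(data).evaluate(1024)` could be passed to `np.interp` in the same way.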

Let me know if you have any other questions. The TreeKDE allows arbitrary grid points, but is not as fast as FFTKDE.


suzannejin commented on June 15, 2024

This is very useful. Thank you for the quick response : )

