Hi, for some reason SphericalKMeans doesn't find any valid centroids on my data.


SphericalKMeans does not converge about spherecluster HOT 4 CLOSED

jasonlaska commented on May 24, 2024

SphericalKMeans does not converge

from spherecluster.

Comments (4)

jasonlaska commented on May 24, 2024

Hi @marcosterland,

In your example, ang.reshape((-1, 1)).shape is (300, 1), so it is a single vector and there is not much to cluster.

The scipy.stats.vonmises routine only samples from a univariate von Mises distribution. If you are trying to sample from a higher-dimensional mixture of vMF distributions, I recommend starting from this example: https://github.com/clara-labs/spherecluster/blob/develop/examples/small_mix_3d.py#L19-L36 and working with something like https://github.com/clara-labs/spherecluster/blob/develop/spherecluster/util.py .

cheers,
Jason


jasonlaska commented on May 24, 2024

Actually I guess it's 300 vectors of length 1. In this case, normalizing each length-1 vector will result in the value 1 for each vector, making the exercise uneventful (which is why you get

[[ 1.]
 [ 1.]
 [ 1.]]

as output).
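The collapse described above can be reproduced in a few lines. This is a minimal sketch in plain Python, not the package's own code: the data (300 positive scalars scattered around 1, 3 and 5 with a spread of 0.1) and the helper name l2_normalize are assumptions made for illustration.

```python
import math
import random

random.seed(0)
# Hypothetical 1-D data: 300 scalars, each stored as a length-1 vector.
samples = [[random.gauss(c, 0.1)] for c in (1.0, 3.0, 5.0) for _ in range(100)]

def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

normalized = [l2_normalize(v) for v in samples]

# Collect the distinct values that survive normalization.
unique_values = {round(v[0], 12) for v in normalized}
print(unique_values)
```

Once every sample has been scaled to unit length, the information that separated the three groups (their magnitude) is gone, so any centroid computed from the normalized data is the same point.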


marcosterland commented on May 24, 2024

Thanks for your quick reply, @jasonlaska.
It does make sense to cluster 1-D data. In fact, KMeans from scikit-learn finds the correct centers [[1.] [3.] [5.]] on the generated data.
But the scikit-learn KMeans uses the standard Euclidean distance instead of a circular one, so it is not applicable to, e.g., angles.


jasonlaska commented on May 24, 2024

I refer you to https://en.wikipedia.org/wiki/Cosine_similarity. The cosine between any two scalars is always the same up to sign: they all lie on the same axis, so the angle between them is 0 (or π if their signs differ). Similarly, the inner product between them (just multiplication) normalized by their absolute values will always be 1 or -1. Since the cosine distance/similarity does not take scale into account (as the Euclidean distance does), it doesn't make sense to cluster scalars in this way unless they are complex scalars of the form a + b * i (and I'm not sure this package will handle that case).
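To make the point concrete, here is a small sketch in plain Python. The helper names cosine_similarity and angle_to_unit_vector are hypothetical, and reading an angle theta as the 2-D point (cos theta, sin theta), i.e. the complex number cos theta + i * sin theta, is an assumption about what the data is meant to represent rather than something this package does for you.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Scalars treated as length-1 vectors: only the sign survives.
same_sign = cosine_similarity([1.0], [5.0])
opposite_sign = cosine_similarity([1.0], [-3.0])
print(same_sign, opposite_sign)

def angle_to_unit_vector(theta):
    """Assumed embedding: an angle in radians becomes a point on the unit circle."""
    return [math.cos(theta), math.sin(theta)]

# After embedding, the similarity depends on the angular difference.
embedded = cosine_similarity(angle_to_unit_vector(1.0), angle_to_unit_vector(3.0))
print(embedded, math.cos(3.0 - 1.0))
```

With the embedding, the similarity between two angles equals the cosine of their difference, which is the kind of circular measure the scalar representation cannot provide.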

I believe there might be a way to use the cosine distance in scikit-learn's k-means; you might want to give that a try.
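For reference, the idea of clustering angles by cosine similarity can be sketched without relying on any library API. The code below is a hand-rolled toy, not spherecluster's SphericalKMeans and not scikit-learn's KMeans; the function names, the farthest-point initialization, the fixed iteration count, and the synthetic angles around 1, 3 and 5 radians with a spread of 0.1 are all assumptions chosen for illustration.

```python
import math
import random

def embed(theta):
    """Map an angle in radians to a point on the unit circle."""
    return (math.cos(theta), math.sin(theta))

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def normalize(v):
    norm = math.hypot(v[0], v[1])
    return (v[0] / norm, v[1] / norm)

def spherical_kmeans_2d(points, k, n_iter=20):
    """Toy spherical k-means on unit 2-D vectors (illustrative only)."""
    # Farthest-point initialization: start from the first point, then keep
    # adding the point that is least similar to the centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            min(points, key=lambda p: max(dot(p, c) for c in centroids))
        )
    for _ in range(n_iter):
        # Assignment: each point joins the centroid with the largest cosine.
        sums = [[0.0, 0.0] for _ in range(k)]
        for p in points:
            j = max(range(k), key=lambda i: dot(p, centroids[i]))
            sums[j][0] += p[0]
            sums[j][1] += p[1]
        # Update: project each cluster's vector sum back onto the circle;
        # an empty cluster keeps its previous centroid.
        centroids = [
            normalize(s) if (s[0] or s[1]) else c
            for s, c in zip(sums, centroids)
        ]
    return centroids

random.seed(0)
# Hypothetical angles (radians) scattered around 1, 3 and 5.
angles = [random.gauss(c, 0.1) for c in (1.0, 3.0, 5.0) for _ in range(100)]
centroids = spherical_kmeans_2d([embed(a) for a in angles], k=3)

# Convert the centroids back to angles in [0, 2*pi) for inspection.
center_angles = sorted(math.atan2(y, x) % (2 * math.pi) for x, y in centroids)
print([round(a, 2) for a in center_angles])
```

The key design choice is that the data handed to the clustering step is two-dimensional, so directions differ between samples and the cosine criterion has something to separate.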
