Comments (4)
That would mean that the random projection trees are failing to split your data. I hadn't really anticipated such a thing happening, hence the weird error. Some possibilities: you have many points that are identical; the data has a very strange distribution; you have a lot of data. I'm guessing one of the first two.
I'll see if I can manage to at least catch this and give a more informative error to start with. Then I'll have to see if I can provide an alternative approach (random initialisation for NN-descent would do, for example).
Thanks for the report!
from umap.
My data size is not huge, around 100k points. There might be some identical points, but I cannot tell how many right now. This can happen with MNIST as well if you binarize it. Is there any quick way to get around this, e.g. changing the number of neighbors or the metric?
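A quick way to find out how many identical points a dataset contains is to count exact duplicate rows. The following is only a minimal sketch using the Python standard library; the function name `duplicate_summary` and the toy data are illustrative assumptions, not part of umap.

```python
from collections import Counter


def duplicate_summary(rows):
    """Summarise exact duplicate rows in a 2D dataset (hypothetical helper)."""
    # Rows are converted to tuples so they can be hashed and counted.
    counts = Counter(tuple(row) for row in rows)
    n_rows = sum(counts.values())
    n_unique = len(counts)
    return {
        "rows": n_rows,
        "unique": n_unique,
        # Rows that are exact copies of some other row already counted.
        "duplicates": n_rows - n_unique,
        # Size of the largest group of identical rows.
        "largest_group": max(counts.values()) if counts else 0,
    }


# Toy binarized data: the row [0, 1, 1] appears three times.
data = [[0, 1, 1], [0, 1, 1], [1, 0, 0], [0, 1, 1], [1, 1, 1]]
summary = duplicate_summary(data)
print(summary)
```

A large `largest_group` relative to the neighbor count would be consistent with the "many identical points" explanation above, though it does not prove that this is what makes the tree splits fail.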
Increasing n_neighbors may help. Better would be to avoid the rp-tree initialisation, but I don't have code for that yet. If you really want to just get it to work now you can comment out lines 238 to 253 in umap/umap_.py in the current master and that will force it to fall back to random initialisation. It may be slower, but it should at least work (barring other errors later in the code that such a data distribution may trigger).
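If identical points turn out to be the cause, another workaround (not one proposed in this thread, so treat it as an assumption) is to embed only the unique rows and then copy each result back to its duplicates. Below is a minimal sketch with NumPy; `deduplicate` and `expand` are hypothetical helper names, and the embedding step is replaced by a stand-in array.

```python
import numpy as np


def deduplicate(X):
    """Return the unique rows of X and, for each original row, its index into them."""
    X = np.asarray(X)
    unique_rows, inverse = np.unique(X, axis=0, return_inverse=True)
    # ravel() keeps the index array 1D regardless of NumPy version.
    return unique_rows, inverse.ravel()


def expand(embedding_unique, inverse):
    """Copy the embedding of each unique row back to every original row."""
    return np.asarray(embedding_unique)[inverse]


X = np.array([[0, 1, 1], [0, 1, 1], [1, 0, 0], [0, 1, 1]])
unique_rows, inverse = deduplicate(X)

# Stand-in for the real embedding step, e.g.
# umap.UMAP().fit_transform(unique_rows); here just placeholder coordinates.
fake_embedding = np.arange(len(unique_rows) * 2, dtype=float).reshape(-1, 2)

full = expand(fake_embedding, inverse)
print(full.shape)
```

With this approach all copies of a point receive exactly the same coordinates, and the local density that the duplicates represented is lost, which may or may not be acceptable for a given analysis.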
I just committed what I hope is a fix that will at least allow the algorithm to continue. It will, unfortunately, be slower than it otherwise should be (something to be fixed later), but hopefully it will get you past this initial problem. If you have the opportunity, I would appreciate it if you could clone from master and verify that this at least gets you beyond the current error.