Comments (5)
That's a great idea for making the comparison clear. It certainly raises the priority on getting the custom initialisation done!
from umap.
great! it works!
the only hurdle i had to get across: t-SNE tends towards larger embeddings than umap. for example, t-SNE with default parameters embeds MNIST into a +/-35 range while UMAP is closer to +/-15. so initializing with the raw output of t-SNE produces some minor artifacts. once i scale the t-SNE output closer to the UMAP output it works better.
some gifs for your trouble:
https://www.dropbox.com/s/wbv73swh6qlrexg/mnist-all-init-0.8.gif?dl=0
this one is UMAP with min_dist=0.8 and initialized with a t-SNE embedding scaled by 0.4.
https://www.dropbox.com/s/cufrbjsbm79a3kh/mnist-all-init.gif?dl=0
this one is UMAP with default parameters also initialized with a t-SNE embedding scaled by 0.4.
the final embeddings of both are normalized uniformly across both axes to fill the screen with a little padding.
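A rough sketch of the rescaling step mentioned in this comment, for anyone who wants to try it. It shrinks an embedding uniformly so that its largest absolute coordinate matches a target extent. The function name `rescale_embedding`, the default target of 15 (taken from the approximate +/-15 UMAP range quoted above), and the synthetic stand-in for t-SNE output are all assumptions made for illustration, not part of the original experiment.

```python
import numpy as np


def rescale_embedding(embedding, target_extent=15.0):
    """Uniformly scale an embedding so that max |coordinate| == target_extent.

    Hypothetical helper: one factor is applied to both axes, so the shape
    of the layout is preserved and only its overall size changes.
    """
    embedding = np.asarray(embedding, dtype=np.float64)
    current_extent = np.abs(embedding).max()
    if current_extent == 0:
        # Degenerate case: every point sits at the origin, nothing to scale.
        return embedding.copy()
    return embedding * (target_extent / current_extent)


# Synthetic stand-in for a t-SNE embedding spanning roughly +/-35.
rng = np.random.default_rng(0)
fake_tsne = rng.uniform(-35.0, 35.0, size=(1000, 2))

scaled = rescale_embedding(fake_tsne)
print(np.abs(fake_tsne).max(), np.abs(scaled).max())
```

Going from about +/-35 to about +/-15 corresponds to a factor of roughly 0.43, which is in line with the fixed factor of 0.4 used for the gifs.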
from umap.
That's a great idea that I hadn't considered. It is something along these lines that I was hoping to use to make an "update" procedure, but what you are proposing here is the easy concrete way to move toward that.
As to whether the embedding would be preserved at all -- as long as it is "close" in objective function space to the final embedding then it will be preserved. That's really what the spectral embedding is doing: it provides a good starting point and ensures a degree of consistency over multiple runs (up to rotation/reflection).
from umap.
Awesome! For these tests I will use the spectral initialization to encourage consistency between runs. But I was also considering what it would look like to initialize with a t-SNE embedding, and then it would be much easier to see how the embedding changes after running UMAP.
from umap.
Provisional support turned out to be straightforward. Of course it may be a little glitchy if I didn't catch all the ways it can go wrong, but you should now be able to pass a numpy array of initial positions to the init parameter and have it work from there (in current master).
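For reference, a minimal sketch of what passing initial positions might look like, assuming the `umap-learn` package is installed. The helper names `prepare_init` and `umap_from_init` are made up for this example; the only UMAP behaviour relied on is the one described above, that `init` accepts a numpy array of starting positions with one row per sample.

```python
import numpy as np


def prepare_init(init_embedding, n_samples, n_components=2):
    """Check that the initial positions have one row per sample.

    Hypothetical helper: returns the positions as a float32 array of
    shape (n_samples, n_components), or raises ValueError on a mismatch.
    """
    init = np.asarray(init_embedding, dtype=np.float32)
    if init.shape != (n_samples, n_components):
        raise ValueError(
            f"expected init of shape {(n_samples, n_components)}, got {init.shape}"
        )
    return init


def umap_from_init(data, init_embedding, **umap_kwargs):
    """Run UMAP starting from a caller-supplied embedding (hypothetical wrapper)."""
    import umap  # umap-learn; imported here because the import itself is slow

    data = np.asarray(data)
    n_components = umap_kwargs.get("n_components", 2)
    init = prepare_init(init_embedding, data.shape[0], n_components)
    reducer = umap.UMAP(init=init, **umap_kwargs)
    return reducer.fit_transform(data)
```

With a data matrix `X` and a t-SNE result `tsne_embedding` of matching length (both placeholders here), the experiment from this thread would be something like `umap_from_init(X, 0.4 * tsne_embedding, min_dist=0.8)`.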
from umap.