Comments (12)
I don't know for sure, but the format in which the digits are stored can differ, e.g. [0, 1] or 0...255. t-SNE performs gradient descent, which may fail if the scaling and learning rate are wrong.
Try the example test.py from above; do you get a pretty image?
from multicore-tsne.
Did you try py_bh_tsne or any other non-sklearn package? Do they also produce a worse result? There can be implementation differences, different default parameters, and so on. This repo uses py_bh_tsne as the base; I fixed some errors there, but it may still be imperfect. I would give it another try and check the implementation, but I hope the sklearn maintainers improve their t-SNE's efficiency first, making this repo useless (that is how it should be).
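One way to check for implementation differences is to run both libraries on identical data with the same parameters. A minimal sketch (the subset size and parameter choices are arbitrary assumptions; the MulticoreTSNE import is guarded in case the package is not installed):

```python
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

X = load_digits().data[:300]  # small subset so both runs finish quickly
params = dict(n_components=2, perplexity=30, random_state=0)

# sklearn's implementation
Y_sk = TSNE(**params).fit_transform(X)
print("sklearn:", Y_sk.shape)

# this repo's implementation, same parameters
try:
    from MulticoreTSNE import MulticoreTSNE
    Y_mc = MulticoreTSNE(n_jobs=4, **params).fit_transform(X)
    print("multicore:", Y_mc.shape)
except ImportError:
    print("MulticoreTSNE not installed; showing sklearn only")
```

If the two embeddings differ wildly in cluster structure (not just rotation or reflection, which are expected), that points to a genuine implementation or default-parameter difference rather than a data problem.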
from multicore-tsne.
Yes, unfortunately sklearn's t-SNE is currently unusable except on such toy datasets. And that output is strange: it shows the algorithm quickly converging to a low error and then making no further progress.
Learning embedding...
Iteration 50: error is 43.405481 (50 iterations in 0.00 seconds)
Iteration 100: error is 44.709520 (50 iterations in 0.00 seconds)
Iteration 150: error is 43.567784 (50 iterations in 0.00 seconds)
Iteration 200: error is 42.564679 (50 iterations in 0.00 seconds)
Iteration 250: error is 1.118502 (50 iterations in 0.00 seconds)
Iteration 300: error is 0.238091 (50 iterations in 0.00 seconds)
Iteration 350: error is 0.117268 (50 iterations in 0.00 seconds)
Iteration 400: error is 0.120770 (50 iterations in 0.00 seconds)
Iteration 450: error is 0.121062 (50 iterations in 0.00 seconds)
Iteration 500: error is 0.121366 (50 iterations in 0.00 seconds)
Iteration 550: error is 0.121098 (50 iterations in 0.00 seconds)
Iteration 600: error is 0.121540 (50 iterations in 0.00 seconds)
Iteration 650: error is 0.121057 (50 iterations in 0.00 seconds)
Iteration 700: error is 0.120856 (50 iterations in 0.00 seconds)
Iteration 750: error is 0.121666 (50 iterations in 0.00 seconds)
Iteration 800: error is 0.121161 (50 iterations in 0.00 seconds)
Iteration 850: error is 0.121708 (50 iterations in 0.00 seconds)
Iteration 900: error is 0.121865 (50 iterations in 0.00 seconds)
Iteration 950: error is 0.122631 (50 iterations in 0.00 seconds)
Iteration 999: error is 0.121577 (50 iterations in 0.00 seconds)
Fitting performed in 0.00 seconds.
Compared to that, the MNIST test example progressed slowly but steadily until the last iteration. And the Iris dataset is a simple one: linearly separable.
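For what it's worth, the sharp error drop at iteration 250 in the log above likely just marks the end of early exaggeration (bhtsne's default stop_lying_iter is 250), so the plateau afterwards is not necessarily a bug by itself. A quick way to check sklearn's final KL error on Iris for comparison (a sketch; the parameter choices are assumptions):

```python
from sklearn.datasets import load_iris
from sklearn.manifold import TSNE

X = load_iris().data  # 150 samples, 4 features, linearly separable classes
tsne = TSNE(n_components=2, perplexity=30, random_state=0)
Y = tsne.fit_transform(X)

print(Y.shape)               # (150, 2)
print(tsne.kl_divergence_)   # final KL error after optimization
```

If sklearn also plateaus at a similar KL value, the difference is in the embedding layout rather than the optimization itself.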
No, I haven't tried other implementations yet.
from multicore-tsne.
I also got a very different result from the sklearn implementation on the MNIST dataset:
[image: Multicore t-SNE result]
[image: sklearn t-SNE result]
from multicore-tsne.
Hi, the picture in the README file is a t-SNE visualization of the MNIST dataset, made with the code from this repository. Here is the code: https://github.com/DmitryUlyanov/Multicore-TSNE/blob/master/python/tests/test.py
from multicore-tsne.
Hey, I loaded the dataset from sklearn and ran MulticoreTSNE on it; would there be a difference?

from sklearn.datasets import load_digits
import matplotlib.pyplot as plt
from MulticoreTSNE import MulticoreTSNE as MultiTSNE

digits2 = load_digits()
m_tsne = MultiTSNE(n_jobs=4, init='pca', random_state=0)
m_y = m_tsne.fit_transform(digits2.data)
plt.scatter(m_y[:, 0], m_y[:, 1], c=digits2.target)
plt.show()
from multicore-tsne.
Yes, it works with your example. It appears the scalings of the two datasets differ: the sklearn digits are in 0...16, while the data in your example is in [-1, 1]. So does this version work only with normalized datasets?
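If scaling is the culprit, a global rescale of the sklearn digits into [-1, 1] is easy to try before fitting. A minimal sketch (the simple min/max rescale is my assumption for illustration, not a documented requirement of this repo):

```python
import numpy as np
from sklearn.datasets import load_digits

X = load_digits().data
print(X.min(), X.max())              # sklearn's digits are in 0...16

X_scaled = X / X.max() * 2.0 - 1.0   # global rescale into [-1, 1]
print(X_scaled.min(), X_scaled.max())
```

The rescaled array can then be passed to fit_transform in place of the raw one.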
from multicore-tsne.
Thank you for putting this together; it is the only multicore t-SNE package I can get to complete successfully. However, my results are identical to shaidams64's. I have an arcsinh-transformed data set, and an implementation of this method in R (single-core) gives good results; the sklearn implementation (Python) on the same data set returns a very similar result. This multicore implementation runs quickly but produces an indiscernible cloud of points. I have carefully aligned all of the arguments I can, and the result is the same, even when I set MulticoreTSNE to use only one core. Any recommendations on how to fix this?
EDIT: This discussion thread ends with a multicore TSNE implementation that does reproduce my results with Sklearn and Rtsne. lvdmaaten/bhtsne#18
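Given the scaling discussion above, one thing worth ruling out before blaming the solver is the range of the arcsinh-transformed values. A minimal sketch of transforming and globally rescaling synthetic data (the lognormal stand-in data, the cofactor 5.0, and the [-1, 1] target range are all assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.lognormal(size=(500, 20))  # stand-in for heavy-tailed raw data

X = np.arcsinh(X / 5.0)            # variance-stabilizing arcsinh transform
X = X / np.abs(X).max()            # global rescale into [-1, 1]
print(X.min(), X.max())
```

If the single-core and multicore runs of this package produce the same cloud, as you describe, the threading is probably not the issue; preprocessing like this is the cheaper thing to vary first.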
from multicore-tsne.
Is this problem solved with this multi-core tsne?
from multicore-tsne.
Hi, I'm facing the same problem now: the results of sklearn's t-SNE and yours differ with the same params.
Yes, it works with your example. It appears the scalings of the two datasets differ: the sklearn digits are in 0...16, while the data in your example is in [-1, 1]. So does this version work only with normalized datasets?
So, if I'm getting it right, normalizing the data should help (to make the results roughly the same)?
from multicore-tsne.
Related Issues (20)
- Memory Allocation Fail (Big Data)
- TSNE results in dense centred ball
- Ruby Library
- License Clarification
- Failed building wheel for MulticoreTSNE
- Non-verbose crash on input containing NaN
- Error in builds using cmake >= 3.20?
- Cannot generate Makefile.
- Can't install on Apple M1 chip
- Enabling verbose causes kernel crash (likely a divided by zero)
- `pip install cmake==3.18.4` worked for me.
- cmake version requirement.txt
- signature of sklearn.datasets.make_blobs has changed
- Fix compiling
- why can't Multicore-TSNE speed up my experiment with mnist
- linux compile
- Error in generate MING64 Makefile
- Type error with sklearn 1.2.3
- Static library is built but runtime is trying to load dynamic one
- Graphics problem with tsne algorithm