Comments (4)
Hi there -
There are several sources of stochastic behaviour in Ivis (see issue #31).
Here's an example script that should provide reproducible results between Ivis runs.
Note that I tested this in a Jupyter Notebook. If you're running ivis from a shell, you need to set the PYTHONHASHSEED environment variable before running the script. Something like:
PYTHONHASHSEED=0 python3 run_ivis.py
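As a rough illustration of what the variable does: as far as I know, Python reads PYTHONHASHSEED when the interpreter starts, which is why it has to be set in the shell before launching the script. The sketch below starts two fresh interpreters with PYTHONHASHSEED=0 and checks that they agree on the hash of a string. The helper name `child_string_hash` and the test string are made up for this example and are not part of ivis.

```python
import os
import subprocess
import sys


def child_string_hash(seed: str) -> str:
    """Start a fresh interpreter with PYTHONHASHSEED=seed and return hash('ivis') as text."""
    # Copy the current environment and override only the hash seed.
    env = dict(os.environ, PYTHONHASHSEED=seed)
    result = subprocess.run(
        [sys.executable, "-c", "print(hash('ivis'))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


first = child_string_hash("0")
second = child_string_hash("0")
# With a fixed seed, both child interpreters should report the same string hash.
print("hashes match:", first == second)
```

Without the variable, string hashes normally differ from one interpreter start to the next, which can change the iteration order of sets and anything that depends on it.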
import os
os.environ["PYTHONHASHSEED"]="0"
import random
import numpy as np
import tensorflow as tf
# The below is necessary for starting Numpy generated random numbers
# in a well-defined initial state.
np.random.seed(123)
# The below is necessary for starting core Python generated random numbers
# in a well-defined state.
random.seed(123)
# The below set_seed() will make random number generation
# in the TensorFlow backend have a well-defined initial state.
# For further details, see:
# https://www.tensorflow.org/api_docs/python/tf/random/set_seed
tf.random.set_seed(1234)
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.preprocessing import MinMaxScaler
from sklearn.neighbors import NearestNeighbors
from ivis import Ivis
iris = load_iris()
data = iris.data
target = iris.target
X = MinMaxScaler().fit_transform(data)
# Here we're creating a fixed NN matrix. For large out-of-memory datasets, you can achieve the same
# with Ivis' Annoy functionality (https://bering-ivis.readthedocs.io/en/latest/api.html#neighbour-retrieval),
# i.e. build the index separately and then pass it into the Ivis constructor.
nbrs = NearestNeighbors(n_neighbors=5).fit(X)
distances, indices = nbrs.kneighbors(X)
model = Ivis(embedding_dims=2, k=5, batch_size=X.shape[0],
             neighbour_matrix=indices,
             n_epochs_without_progress=5, verbose=0)
model.fit(X)
embeddings = model.transform(X)
plt.scatter(embeddings[:, 0], embeddings[:, 1], c=target)
You should get this result:
Got it, thanks! From what I can see, it gives very similar representations between runs, but not exactly the same ones (and that's OK, I'm just clarifying the behavior).
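One way to put a number on "very similar but not exactly the same" is to compare the embeddings from two runs coordinate by coordinate. This is only a sketch: `max_abs_difference` is a hypothetical helper, and the two small lists below are stand-ins for the arrays returned by `model.transform(X)` in two separate runs, not real ivis output.

```python
def max_abs_difference(run_a, run_b):
    """Return the largest absolute coordinate difference between two embeddings."""
    if len(run_a) != len(run_b):
        raise ValueError("embeddings have different numbers of rows")
    largest = 0.0
    for row_a, row_b in zip(run_a, run_b):
        if len(row_a) != len(row_b):
            raise ValueError("embeddings have different numbers of columns")
        for x, y in zip(row_a, row_b):
            largest = max(largest, abs(float(x) - float(y)))
    return largest


# Stand-in data (assumption): replace with the embeddings from two real runs.
run_1 = [[0.10, 0.20], [0.30, 0.40]]
run_2 = [[0.10, 0.20], [0.30, 0.41]]

diff_same = max_abs_difference(run_1, run_1)
diff_changed = max_abs_difference(run_1, run_2)
print("same run:", diff_same)
print("different runs:", diff_changed)
```

A difference of exactly zero means the runs are identical; any other value shows how far apart they are. The helper only iterates over rows, so it should also accept NumPy arrays.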
Interesting - you should be getting identical results between runs, as long as all seeds are set before the Ivis module is imported.
Are you using Ivis' built-in nearest neighbour search, or are you pre-building the nearest neighbour matrix?
Other contributing factors may be how different versions of python, tensorflow, and numpy handle RNG...
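To rule out version differences, it may help to record the versions in each environment and compare them side by side. A minimal sketch using only the standard library follows; the function name `installed_versions` and the default package list are assumptions made for this example.

```python
import platform
from importlib import metadata


def installed_versions(packages=("ivis", "tensorflow", "numpy")):
    """Map 'python' and each package name to its installed version, or None if absent."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            versions[name] = None
    return versions


versions = installed_versions()
for name, version in versions.items():
    print(f"{name}: {version or 'not installed'}")
```

Running this in both environments and comparing the output should show quickly whether the library versions differ.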
> Are you using Ivis' built-in nearest neighbour search, or are you pre-building the nearest neighbour matrix?
I used the snippet you provided before.
Since you mentioned that the results should be identical, I checked some library versions. I was using ivis 1.8.4; as soon as I updated to 2.0.1, the issue was gone. Thank you for your support.