Comments (4)
Hi @thunderbug1, thanks for your request! Sparse matrices are still on the roadmap, but I haven't gotten to them yet.
Could you add a small example of a matrix or dataset that you would like to use susi on? That would be helpful for development and tests.
from susi.
Oh great, of course, here is a small example dataset. I had to zip the npz file to be able to upload it.
"-1" is the placeholder for the missing labels.
import numpy as np
from scipy.sparse import csr_matrix, load_npz
y = np.array([-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, 0, 0, 0, 0, 0, 1, 2, 1, 1, 1, 3,
3, 4, 4, 5, 1, 1, 6, 5, 7, 7, 4, 3, 4, 2, 4, 3, 1,
4, 4, 5, 3, 2, 8, 9, 10, 3, 2, 8, 8, 9, 7, 5, 3, 11,
11, 11, 11, 11, 11, 11, 6, 6, 6, 6, 6, 6, 5, 5, 5, 1, 1,
2, 1, 9, 12, 13, 14, 14, 14, 3, 3, 3, 3, 3, 3, 3, 3, 3,
15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15,
15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15,
15, 15, 2, 2, 5, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 1, 1, 1, 1, 1, 1, 1, 1, 5, 11, 11, 11, 11, 5, 5, 5,
1, 1, 1, 1, 1, 1, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 2,
2, 2, 2, 2, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 11, 1, 1,
1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 3, 3, 10, 3, 1, 1, 9,
9, 8, 5, 5, 1, 13, 13, 3, 3, 3, 3, 9, 9, 1, 1, 1, 1,
15, 15, 15, 15, 15, 15, 15])
X = load_npz("X.npz")
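For anyone who wants to inspect a dataset of this shape before trying it with susi, here is a minimal sketch that separates labeled from unlabeled rows and measures how sparse the matrix is. The toy `X_toy` and `y_toy` are hypothetical stand-ins for the attached `X.npz` and the full `y` array above; only the `-1` placeholder convention is taken from this thread.

```python
import numpy as np
from scipy.sparse import csr_matrix

MISSING = -1  # placeholder for missing labels, as described above

# Hypothetical toy data standing in for the attached X.npz and the full y array.
y_toy = np.array([-1, -1, 0, 1, 3])
X_toy = csr_matrix(np.array([
    [0, 0, 1, 0],
    [0, 2, 0, 0],
    [0, 0, 0, 0],
    [3, 0, 0, 0],
    [0, 0, 0, 4],
]))

# Boolean mask of the rows that carry a real label.
labeled = y_toy != MISSING
n_labeled = int(labeled.sum())
n_unlabeled = int((~labeled).sum())

# Fraction of explicitly stored (non-zero) entries in the sparse matrix.
density = X_toy.nnz / (X_toy.shape[0] * X_toy.shape[1])

# Select the labeled rows via integer indices; the result stays sparse.
X_labeled = X_toy[np.flatnonzero(labeled)]
y_labeled = y_toy[labeled]

print(n_labeled, n_unlabeled, density, X_labeled.shape)
```

With the real data, `X_toy` would be replaced by the matrix returned from `load_npz("X.npz")` and `y_toy` by the `y` array above.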
from susi.
Thanks! I will have a look as soon as I find time.
If it is urgent, PRs are also always welcome :)
from susi.
After some tests (thanks for the nice example data, @thunderbug1, that helped a lot!) and some thought, I decided not to pursue handling this kind of sparse input (X) matrix for now. The reasons:
- Meaningful adaptations are very difficult with such sparse data. If that is incorrect for a dataset, see the next point.
- I guess that mean imputation of sparse data (e.g., with sklearn.impute.SimpleImputer) can be a promising intermediate solution for the challenge of sparse data. I have not done the full research on this topic, but it feels intuitive to me.
I am open to corrections and further ideas, and (of course) pull requests.
[closing this issue now, can be reopened if there is progress or new ideas]
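The mean-imputation idea mentioned above could be sketched roughly as follows. This is only one possible reading, not a tested recipe for susi: it assumes that the zeros of the sparse matrix stand for missing values, and it uses a hypothetical toy matrix in place of `X.npz`. Because `SimpleImputer` refuses `missing_values=0` on sparse input, the sketch converts to a dense array first, which may not be feasible for very large matrices.

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.impute import SimpleImputer

# Hypothetical toy matrix standing in for X.npz.
# Assumption: zeros mean "value missing", not "value is zero".
X_sparse = csr_matrix(np.array([
    [0.0, 2.0, 0.0],
    [1.0, 0.0, 0.0],
    [3.0, 4.0, 6.0],
]))

# SimpleImputer raises an error for missing_values=0 on sparse input,
# so the matrix is converted to a dense array before imputing.
X_dense = X_sparse.toarray()

# Replace each zero with the mean of the non-zero entries in its column.
# Note: columns consisting only of zeros are dropped by default.
imputer = SimpleImputer(missing_values=0, strategy="mean")
X_imputed = imputer.fit_transform(X_dense)

print(X_imputed)
```

The resulting dense `X_imputed` could then be passed to an estimator in the usual way; whether mean-filled features are meaningful for a given dataset remains an open question, as noted above.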
from susi.