danilkolikov / fsfc
Feature Selection for Clustering
License: MIT License
Hi,
Great package, I hope that you continue to develop it. I don't see this paper included. Have I overlooked it? Any plans to implement it?
http://www.public.asu.edu/~huanliu/papers/pakdd00clu.pdf
Best,
Andrew
Hey,
Since your project has the right setup, it can be installed directly with pip using the following command:
pip install git+https://github.com/danilkolikov/fsfc.git
This also installs all the requirements, so there is no need for the complicated solution!
Kind regards,
Alexandra
Hello,
Thanks for this library!
There is an error when importing calinski_harabaz_score in fsfc/fsfc/__test__/AlgorithmTest.py, line 4.
The name calinski_harabaz_score is misspelled: there is a missing 's'. It should be calinski_harabasz_score.
This also needs to be changed on lines 112 and 114.
I am using scikit-learn version 0.24.1.
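Until the test file is patched, a small fallback can pick whichever spelling the installed scikit-learn exposes. This is only a sketch: the helper names `first_available` and `load_calinski_harabasz_score` are made up for illustration and are not part of fsfc or scikit-learn.

```python
import importlib


def first_available(module_name, names):
    """Return the first attribute in `names` that exists on the module."""
    module = importlib.import_module(module_name)
    for name in names:
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(f"none of {names} found in {module_name}")


def load_calinski_harabasz_score():
    """Hypothetical helper: prefer the corrected spelling, fall back to the old one."""
    return first_available(
        "sklearn.metrics",
        ["calinski_harabasz_score", "calinski_harabaz_score"],
    )
```

In the test module, the direct import on line 4 could then be replaced by `calinski_harabasz_score = load_calinski_harabasz_score()`, with the calls on lines 112 and 114 renamed to match.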
Hi,
I am using this library as part of my thesis project to extract relevant features for my multitask learning model and to prevent negative gradient flow.
I have followed the steps given on the GitHub page. The code is attached below.
from fsfc.generic import NormalizedCut
from sklearn.pipeline import Pipeline
from sklearn.cluster import KMeans

# dt is the poster's DataFrame; its definition is not shown in the issue
X = dt.to_numpy()
pipeline = Pipeline([
    ('select', NormalizedCut(3)),
    ('cluster', KMeans())
])
pipeline.fit_predict(X)
Attaching the error below
MemoryError Traceback (most recent call last)
in
3 ('cluster', KMeans())
4 ])
----> 5 pipeline.fit_predict(X)
c:\users\s.bangaloreramalinga\appdata\local\programs\python\python37\lib\site-packages\sklearn\utils\metaestimators.py in (*args, **kwargs)
118
119 # lambda, but not partial, allows help() to work with update_wrapper
--> 120 out = lambda *args, **kwargs: self.fn(obj, *args, **kwargs)
121 # update the docstring of the returned function
122 update_wrapper(out, self.fn)
c:\users\s.bangaloreramalinga\appdata\local\programs\python\python37\lib\site-packages\sklearn\pipeline.py in fit_predict(self, X, y, **fit_params)
447 """
448 fit_params_steps = self._check_fit_params(**fit_params)
--> 449 Xt = self._fit(X, y, **fit_params_steps)
450
451 fit_params_last_step = fit_params_steps[self.steps[-1][0]]
c:\users\s.bangaloreramalinga\appdata\local\programs\python\python37\lib\site-packages\sklearn\pipeline.py in _fit(self, X, y, **fit_params_steps)
305 message_clsname='Pipeline',
306 message=self._log_message(step_idx),
--> 307 **fit_params_steps[name])
308 # Replace the transformer of the step with the fitted
309 # transformer. This is necessary when loading the transformer
c:\users\s.bangaloreramalinga\appdata\local\programs\python\python37\lib\site-packages\joblib\memory.py in call(self, *args, **kwargs)
350
351 def call(self, *args, **kwargs):
--> 352 return self.func(*args, **kwargs)
353
354 def call_and_shelve(self, *args, **kwargs):
c:\users\s.bangaloreramalinga\appdata\local\programs\python\python37\lib\site-packages\sklearn\pipeline.py in _fit_transform_one(transformer, X, y, weight, message_clsname, message, **fit_params)
752 with _print_elapsed_time(message_clsname, message):
753 if hasattr(transformer, 'fit_transform'):
--> 754 res = transformer.fit_transform(X, y, **fit_params)
755 else:
756 res = transformer.fit(X, y, **fit_params).transform(X)
c:\users\s.bangaloreramalinga\appdata\local\programs\python\python37\lib\site-packages\sklearn\base.py in fit_transform(self, X, y, **fit_params)
697 if y is None:
698 # fit method of arity 1 (unsupervised transformation)
--> 699 return self.fit(X, **fit_params).transform(X)
700 else:
701 # fit method of arity 2 (supervised transformation)
~\Desktop\Master Thesis\DataSet\fsfc\base.py in fit(self, x, *rest)
70
71 def fit(self, x, *rest):
---> 72 self.scores = self._calc_scores(x)
73 return self
74
~\Desktop\Master Thesis\DataSet\fsfc\generic\SPEC.py in _calc_scores(self, x)
42
43 def _calc_scores(self, x):
---> 44 similarity = rbf_kernel(x)
45 adjacency = similarity
46 degree_vector = np.sum(adjacency, 1)
c:\users\s.bangaloreramalinga\appdata\local\programs\python\python37\lib\site-packages\sklearn\metrics\pairwise.py in rbf_kernel(X, Y, gamma)
1103 gamma = 1.0 / X.shape[1]
1104
-> 1105 K = euclidean_distances(X, Y, squared=True)
1106 K *= -gamma
1107 np.exp(K, K) # exponentiate K in-place
c:\users\s.bangaloreramalinga\appdata\local\programs\python\python37\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
61 extra_args = len(args) - len(all_args)
62 if extra_args <= 0:
---> 63 return f(*args, **kwargs)
64
65 # extra_args > 0
c:\users\s.bangaloreramalinga\appdata\local\programs\python\python37\lib\site-packages\sklearn\metrics\pairwise.py in euclidean_distances(X, Y, Y_norm_squared, squared, X_norm_squared)
311 else:
312 # if dtype is already float64, no need to chunk and upcast
--> 313 distances = - 2 * safe_sparse_dot(X, Y.T, dense_output=True)
314 distances += XX
315 distances += YY
c:\users\s.bangaloreramalinga\appdata\local\programs\python\python37\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
61 extra_args = len(args) - len(all_args)
62 if extra_args <= 0:
---> 63 return f(*args, **kwargs)
64
65 # extra_args > 0
c:\users\s.bangaloreramalinga\appdata\local\programs\python\python37\lib\site-packages\sklearn\utils\extmath.py in safe_sparse_dot(a, b, dense_output)
150 ret = np.dot(a, b)
151 else:
--> 152 ret = a @ b
153
154 if (sparse.issparse(a) and sparse.issparse(b)
MemoryError: Unable to allocate 1.44 TiB for an array with shape (444234, 444234) and data type float64
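The size in the error follows directly from the traceback: `rbf_kernel(x)` builds a dense samples-by-samples matrix, so 444234 rows of float64 need 444234 × 444234 × 8 bytes. A rough sketch of that arithmetic, plus one possible workaround (scoring on a random subset of rows), is below. The function names and the 8 GiB budget are assumptions for illustration; fsfc itself does not provide a subsampling option as far as the issue shows.

```python
import math
import random


def dense_square_bytes(n_rows, itemsize=8):
    """Bytes needed for a dense n_rows x n_rows array (float64 by default)."""
    return n_rows * n_rows * itemsize


def max_rows_for_budget(budget_bytes, itemsize=8):
    """Largest n such that one n x n array fits in budget_bytes."""
    return math.isqrt(budget_bytes // itemsize)


def sample_row_indices(n_rows, n_keep, seed=0):
    """Sorted random subset of row indices, drawn without replacement."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_rows), min(n_keep, n_rows)))


n = 444234  # number of rows reported in the MemoryError
print(f"{dense_square_bytes(n) / 2**40:.2f} TiB")  # prints 1.44 TiB

budget = 8 * 2**30  # hypothetical 8 GiB budget for the kernel matrix
keep = max_rows_for_budget(budget)
idx = sample_row_indices(n, keep)
```

With a NumPy array, `X[idx]` would then be the subsample passed to the pipeline. The kernel computation may hold more than one array of this size at a time, so in practice the budget should leave some headroom.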
I was trying to use your code, but unfortunately I ran into some problems:
ModuleNotFoundError Traceback (most recent call last)
in
----> 1 from sklearn.feature_selection.base import SelectorMixin
ModuleNotFoundError: No module named 'sklearn.feature_selection.base'
To resolve that, I installed scikit-learn==0.22.1, but the import still fails:
ImportError Traceback (most recent call last)
in
----> 1 from sklearn.feature_selection.base import SelectorMixin
~/anaconda3/envs/python3/lib/python3.6/site-packages/sklearn/feature_selection/base.py in
4 from . import _base
5 from ..externals._pep562 import Pep562
----> 6 from ..utils.deprecation import _raise_dep_warning_if_not_pytest
7
8 deprecated_path = 'sklearn.feature_selection.base'
ImportError: cannot import name '_raise_dep_warning_if_not_pytest
Can you update the code so that it runs on the latest scikit-learn without errors?
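One way to make the import tolerant of the module move is to try several locations in order. The sketch below is an assumption-laden illustration, not the maintainer's fix: `import_first` and `load_selector_mixin` are hypothetical names, and the candidate list reflects my understanding that recent scikit-learn releases export `SelectorMixin` from `sklearn.feature_selection`, while older ones kept it in `sklearn.feature_selection.base`.

```python
import importlib


def import_first(candidates):
    """Return the first importable attribute from (module, attribute) pairs."""
    errors = []
    for module_name, attr in candidates:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr)
        except (ImportError, AttributeError) as exc:
            errors.append(f"{module_name}.{attr}: {exc}")
    raise ImportError("no candidate could be imported:\n" + "\n".join(errors))


# Assumed locations, newest first; the last entry is the old path from the traceback.
SELECTOR_MIXIN_CANDIDATES = [
    ("sklearn.feature_selection", "SelectorMixin"),
    ("sklearn.feature_selection._base", "SelectorMixin"),  # private module
    ("sklearn.feature_selection.base", "SelectorMixin"),
]


def load_selector_mixin():
    """Hypothetical helper: resolve SelectorMixin wherever this install has it."""
    return import_first(SELECTOR_MIXIN_CANDIDATES)
```

If every candidate fails, the raised ImportError lists each attempted path with its own error, which helps tell a version mismatch apart from a broken install.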