
cardinal's People

Contributors

alexandreabraham · dsleo · goulagman · mojifarmanbar · simonamaggio


cardinal's Issues

Multi-label support

Thanks for sharing this package and writing the paper.

It would be nice if this package also supported binary multi-label classification problems.

In your opinion, what would be the best way to aggregate a per-label score, such as "Smallest Margin" computed for each label, into a single per-sample score?
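One possible answer, as a minimal sketch: for a binary label with probability p, the smallest-margin uncertainty reduces to 1 - |2p - 1|, and the per-label scores can then be aggregated with a mean or a min. The function name and the `aggregate` parameter below are illustrative, not part of cardinAL's API.

```python
import numpy as np

def multilabel_margin_scores(probas, aggregate="mean"):
    """Aggregate per-label smallest-margin uncertainty into one score per sample.

    probas: array of shape (n_samples, n_labels) holding P(label=1) per label.
    For a binary label, the margin between the two classes is |p - (1 - p)|,
    so the uncertainty is 1 - |2p - 1|, maximal at p = 0.5.
    """
    per_label = 1.0 - np.abs(2.0 * probas - 1.0)
    if aggregate == "mean":
        return per_label.mean(axis=1)
    if aggregate == "min":
        # score driven by the most confident label
        return per_label.min(axis=1)
    raise ValueError(f"unknown aggregate: {aggregate}")
```

The mean favors samples that are uncertain across many labels, while the min favors samples where every label is at least somewhat uncertain; which is preferable likely depends on the labeling cost model.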

Margin sampling fails with a bad error message if there is only one class in the predictions

We should not assume that the classifier returns probabilities for at least two classes. When it returns only one (e.g. because the training batch contained a single class), the failure surfaces as a confusing low-level error instead of a clear message:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-30-2b90468e1fd1> in <module>
      4                                   n_init_points,
      5                                   base_model,
----> 6                                   semi_model)
<ipython-input-25-008649b258dd> in run_experiment(X_train, X_test, y_train, y_test, batch_size, n_iter, n_init_points, base_model, semi_model)
    118                 sampler.fit(X_train[selected], y_train[selected])
    119 
--> 120                 new_selected = sampler.select_samples(X_train[~selected])
    121                 new_selected = new_selected.astype(int)
    122                 selected[index[~selected][new_selected]] = True
~/dss/code-envs/python/alssl/lib/python3.6/site-packages/cardinAL/base.py in select_samples(self, X, strategy)
     29 
     30     def select_samples(self, X, strategy='top'):
---> 31         sample_scores = self.score_samples(X)
     32         self.sample_scores_ = sample_scores
     33         if strategy == 'top':
~/dss/code-envs/python/alssl/lib/python3.6/site-packages/cardinAL/uncertainty.py in score_samples(self, X)
    175             predictions (np.array): Returns an array where selected samples are classified as 1.
    176         """
--> 177         return margin_score(self.classifier_, X)
    178 
    179 
~/dss/code-envs/python/alssl/lib/python3.6/site-packages/cardinAL/uncertainty.py in margin_score(classifier, X)
     45     """
     46     classwise_uncertainty = _get_probability_classes(classifier, X)
---> 47     part = np.partition(classwise_uncertainty, -2, axis=1)
     48     margin = 1 - (part[:, -1] - part[:, -2])
     49     return margin
<__array_function__ internals> in partition(*args, **kwargs)
~/dss/code-envs/python/alssl/lib/python3.6/site-packages/numpy/core/fromnumeric.py in partition(a, kth, axis, kind, order)
    744     else:
    745         a = asanyarray(a).copy(order="K")
--> 746     a.partition(kth, axis=axis, kind=kind, order=order)
    747     return a
    748 
ValueError: kth(=-1) out of bounds (1)
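A sketch of a possible guard: check the shape of the probability array before calling np.partition and raise an explicit error. This is an illustrative rewrite of margin_score, not cardinAL's actual implementation.

```python
import numpy as np

def margin_score(classifier, X):
    # Probabilities per class, shape (n_samples, n_classes).
    proba = classifier.predict_proba(X)
    if proba.shape[1] < 2:
        raise ValueError(
            "margin_score requires probabilities for at least 2 classes, "
            f"got shape {proba.shape}. Did the training data contain a "
            "single class?")
    # Margin between the two highest class probabilities.
    part = np.partition(proba, -2, axis=1)
    return 1 - (part[:, -1] - part[:, -2])
```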

Zhdanov select_samples weights have the wrong dimension

Description: In the select_samples method of the Zhdanov query sampler, the sample weights are not usable as is. First, they are passed to the KMeans second step with a different dimension than the X array. More importantly, since the second step operates on the selection made by the MarginSampler, there is no way to know a priori which weights correspond to the selected samples.

Proposed fix: index the weights with the same selection:

new_selected = self.sampler_list[1].select_samples(
    X[selected], sample_weight=sample_weight[selected])
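The dimension mismatch can be illustrated with a minimal numpy sketch (shapes and the boolean mask below are stand-ins, not cardinAL internals): the second step only sees the subset of X kept by the first step, so the weights must be subset with the same mask.

```python
import numpy as np

# Stand-in data: 100 samples, 5 features, one weight per sample.
X = np.random.RandomState(0).rand(100, 5)
sample_weight = np.ones(100)

# Stand-in for the margin-sampling selection of the first step.
selected = np.zeros(100, dtype=bool)
selected[:30] = True

# The KMeans step receives only the selected rows, so the weights
# must be indexed with the same mask to keep lengths aligned.
X_step2 = X[selected]
w_step2 = sample_weight[selected]
assert len(w_step2) == len(X_step2)
```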

Handle sparse matrices and precomputed distance

So far I have not found a non-hacky way to handle precomputed distances when working with sparse matrices. In particular, to reproduce Zhdanov's results, one wants to chain uncertainty sampling with submodular selection.
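One workaround, admittedly memory-hungry, is to precompute the full dense distance matrix once and slice it for whatever subset the first sampling step keeps. The index array below is a stand-in for the output of an uncertainty-sampling step.

```python
import numpy as np
from scipy import sparse
from sklearn.metrics import pairwise_distances

# Toy sparse data; pairwise_distances accepts CSR input for euclidean.
X = sparse.random(50, 20, density=0.1, random_state=0, format="csr")
D = pairwise_distances(X, metric="euclidean")  # dense (n, n) matrix

# Stand-in for the indices kept by a first uncertainty-sampling step.
kept = np.arange(10)

# Precomputed distances restricted to the kept subset.
D_sub = D[np.ix_(kept, kept)]
```

This avoids recomputing distances inside the second (submodular) step, at the cost of O(n²) memory, which is exactly why a less hacky solution would be welcome.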

Make keras optional

As of today, the package loads keras even when it is not needed. This is a bummer, since it makes installation unnecessarily complicated and slows down import time as well.
