cblearn

Comparison-based Machine Learning in Python

Comparison-based learning algorithms are the machine learning algorithms to use when the training data consists of similarity comparisons ("A and B are more similar than C and D") rather than explicit data points.

Triplet comparisons from human observers help model the perceived similarity of objects. These human triplets are collected in studies that ask questions like "Which of the following bands is most similar to Queen?" or "Which color appears most similar to the reference?".
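To make the data format concrete, here is a minimal sketch of how triplets can be derived from known points. It uses only the Python standard library; the helper `make_triplets_from_points` is hypothetical and is not part of cblearn. Each triplet `(anchor, near, far)` records that the anchor is closer to `near` than to `far`.

```python
import math
import random


def make_triplets_from_points(points, n_triplets, seed=0):
    """Hypothetical helper: sample (anchor, near, far) index triplets."""
    rng = random.Random(seed)
    triplets = []
    while len(triplets) < n_triplets:
        # Draw three distinct object indices; the first acts as the anchor.
        i, j, k = rng.sample(range(len(points)), 3)
        d_ij = math.dist(points[i], points[j])
        d_ik = math.dist(points[i], points[k])
        if d_ij == d_ik:
            continue  # a tie carries no ordering information, so skip it
        # Order the pair so that the closer object comes second.
        triplets.append((i, j, k) if d_ij < d_ik else (i, k, j))
    return triplets


points = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (0.0, 2.0)]
sampled_triplets = make_triplets_from_points(points, n_triplets=5)
print(sampled_triplets)
```

In a human study the distances are unknown; only the answers, and therefore the triplets, are observed.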

This library provides an easy-to-use interface for comparison-based learning algorithms. It works hand in hand with scikit-learn:

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score

from cblearn.datasets import make_random_triplets
from cblearn.embedding import SOE
from cblearn.metrics import QueryScorer

X = load_iris().data
triplets = make_random_triplets(X, result_format="list-order", size=1000)

estimator = SOE(n_components=2)
# Measure the fit with scikit-learn's cross-validation
scores = cross_val_score(estimator, triplets, cv=5)
print(f"The 5-fold CV triplet error is {sum(scores) / len(scores)}.")

# Estimate the scale on all triplets
embedding = estimator.fit_transform(triplets)
print(f"The embedding has shape {embedding.shape}.")
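As an illustration of what a triplet error measures, the following sketch counts how many triplets an embedding violates. It is a standard-library approximation of the idea, not cblearn's scoring code; `triplet_error` and the toy data are hypothetical.

```python
import math


def triplet_error(embedding, triplets):
    """Fraction of (anchor, near, far) triplets that the embedding violates."""
    violated = sum(
        1
        for i, j, k in triplets
        # A triplet is violated if 'near' is not strictly closer than 'far'.
        if math.dist(embedding[i], embedding[j]) >= math.dist(embedding[i], embedding[k])
    )
    return violated / len(triplets)


# Three objects on a line; the last triplet contradicts their positions.
toy_embedding = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
toy_triplets = [(0, 1, 2), (2, 1, 0), (1, 2, 0)]
toy_error = triplet_error(toy_embedding, toy_triplets)
print(f"Triplet error: {toy_error:.3f}")
```

An error of 0 means the embedding agrees with every comparison; an error near 0.5 is what random placement would be expected to give.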

Please try the Examples.

Getting Started

Install cblearn as described here and try the examples.

Find a theoretical introduction to comparison-based learning, the datatypes, algorithms, and datasets in the User Guide.

Features

Datasets

cblearn provides utility methods to simplify the loading and conversion of your comparison datasets. In addition, several functions download and load real-world comparison datasets.

| Dataset | Query | #Object | #Response | #Triplet |
| --- | --- | --- | --- | --- |
| Vogue Cover | Odd-out Triplet | 60 | 1,107 | 2,214 |
| Nature Scene | Odd-out Triplet | 120 | 3,355 | 6,710 |
| Car | Most-Central Triplet | 60 | 7,097 | 14,194 |
| Material | Standard Triplet | 100 | 104,692 | 104,692 |
| Food | Standard Triplet | 100 | 190,376 | 190,376 |
| Musician | Standard Triplet | 413 | 224,792 | 224,792 |
| Things Image Testset | Odd-out Triplet | 1,854 | 146,012 | 292,024 |
| ImageNet Images v0.1 | Rank 2 from 8 | 1,000 | 25,273 | 328,549 |
| ImageNet Images v0.2 | Rank 2 from 8 | 50,000 | 384,277 | 5M |
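The #Response and #Triplet columns differ because one response can imply several standard triplets. The sketch below shows conversion rules that reproduce the ratios in the table (1, 2, and 13 triplets per response). The rules and function names are assumptions for illustration, not cblearn's documented conversion routines.

```python
def odd_one_out_to_triplets(a, b, odd):
    # Assumption: the two remaining objects are closer to each other
    # than either is to the odd one.
    return [(a, b, odd), (b, a, odd)]


def most_central_to_triplets(a, b, central):
    # Assumption: each outer object is closer to the central object
    # than to the other outer object.
    return [(a, central, b), (b, central, a)]


def rank_to_triplets(reference, ranked, unranked):
    # Assumption: every ranked object beats all later-ranked objects
    # and all unranked objects, relative to the reference.
    triplets = []
    for pos, near in enumerate(ranked):
        for far in list(ranked[pos + 1:]) + list(unranked):
            triplets.append((reference, near, far))
    return triplets


odd_triplets = odd_one_out_to_triplets("a", "b", "c")
central_triplets = most_central_to_triplets("a", "b", "c")
# "Rank 2 from 8": two of eight candidates are ranked, six stay unranked.
rank_triplets = rank_to_triplets("ref", ["r1", "r2"], ["u1", "u2", "u3", "u4", "u5", "u6"])
print(len(odd_triplets), len(central_triplets), len(rank_triplets))
```

Under these assumptions, 1,107 odd-out responses give 2 × 1,107 = 2,214 triplets, and 25,273 rank responses give 13 × 25,273 = 328,549 triplets, which matches the table.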

Embedding Algorithms

| Algorithm | Default | Pytorch (GPU) | Reference Wrapper |
| --- | --- | --- | --- |
| Crowd Kernel Learning (CKL) | X | X | |
| FORTE | | X | |
| GNMDS | X | X | |
| Maximum-Likelihood Difference Scaling (MLDS) | X | | MLDS (R) |
| Soft Ordinal Embedding (SOE) | X | X | loe (R) |
| Stochastic Triplet Embedding (STE/t-STE) | X | X | |
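To give a feel for what such embedding algorithms optimize, here is a minimal sketch of a squared hinge loss over triplets, in the spirit of Soft Ordinal Embedding. It is an illustration only: the function `soe_style_loss`, the `margin` parameter, and the toy data are hypothetical, and cblearn's actual objectives and optimizers may differ.

```python
import math


def soe_style_loss(embedding, triplets, margin=1.0):
    """Sum of squared hinge penalties over (anchor, near, far) triplets."""
    total = 0.0
    for i, j, k in triplets:
        d_near = math.dist(embedding[i], embedding[j])
        d_far = math.dist(embedding[i], embedding[k])
        # No penalty once 'far' is at least `margin` further away than 'near'.
        total += max(0.0, d_near + margin - d_far) ** 2
    return total


line_embedding = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
satisfied_loss = soe_style_loss(line_embedding, [(0, 1, 2)])
violated_loss = soe_style_loss(line_embedding, [(0, 2, 1)])
print(satisfied_loss, violated_loss)
```

An embedding algorithm of this kind would move the points to reduce the loss, for example by gradient descent; the sketch only evaluates the objective.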

Contribute

We welcome bug reports, questions, and suggestions as GitHub Issues, and code or documentation contributions as GitHub Pull Requests. Please see our Contributor Guide.

Authors and Acknowledgement

cblearn was initiated by current and former members of the Theory of Machine Learning group of Prof. Dr. Ulrike von Luxburg at the University of Tübingen. The leading developer is David-Elias Künstle.

We want to thank all the contributors here on GitHub. This work has been supported by the Machine Learning Cluster of Excellence, funded by EXC number 2064/1 – Project number 390727645. The authors would like to thank the International Max Planck Research School for Intelligent Systems (IMPRS-IS) for supporting David-Elias Künstle.

License

This library is free to use under the terms of the MIT License. Please cite this library appropriately if it contributes to your scientific publication. We would also appreciate an optional short email telling us how you use the library.
