zenogantner / mymedialite

recommender system library for the CLR (.NET)

Home Page: http://mymedialite.net

Shell 3.01% Makefile 0.62% C# 89.96% Perl 6.41%

collaborative-filtering evaluation item-prediction matrix-factorization rating-prediction recommender-systems

mymedialite's Introduction

MyMediaLite - a free recommender system algorithm library

Features

- Dozens of different recommendation methods:
  - methods can use collaborative, attribute/content, and relational data,
  - support for incremental training for most models.
- Ready to use:
  - includes evaluation routines for rating and item prediction; quality measures MAE, NMAE, RMSE, CBD, AUC, MAP, precision@N, recall@N, NDCG, MRR;
  - command line tools that read a simple text-based input format.
- Compact: the core library is about 275 KB "big".
- Portable: written in C# for the .NET platform; runs on every architecture where Mono works: Linux, Windows, Mac OS X.
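
The rating prediction measures listed above follow standard definitions. As a rough illustration, the sketch below computes MAE, RMSE, and normalized MAE on plain lists. The class `ErrorMeasures` and its method names are hypothetical and are not taken from the MyMediaLite API; normalized MAE is assumed here to be MAE divided by the width of the rating scale.

```csharp
using System;
using System.Collections.Generic;

// Illustrative helper; the class and method names are hypothetical,
// not part of the MyMediaLite API.
public static class ErrorMeasures
{
    // Mean absolute error: average of |prediction - actual|.
    public static double Mae(IList<double> predicted, IList<double> actual)
    {
        CheckInput(predicted, actual);
        double sum = 0;
        for (int i = 0; i < actual.Count; i++)
            sum += Math.Abs(predicted[i] - actual[i]);
        return sum / actual.Count;
    }

    // Root mean squared error: square root of the average squared error.
    public static double Rmse(IList<double> predicted, IList<double> actual)
    {
        CheckInput(predicted, actual);
        double sum = 0;
        for (int i = 0; i < actual.Count; i++)
            sum += Math.Pow(predicted[i] - actual[i], 2);
        return Math.Sqrt(sum / actual.Count);
    }

    // Normalized MAE (assumption): MAE divided by the width of the rating scale.
    public static double Nmae(IList<double> predicted, IList<double> actual,
                              double minRating, double maxRating)
    {
        return Mae(predicted, actual) / (maxRating - minRating);
    }

    private static void CheckInput(IList<double> predicted, IList<double> actual)
    {
        if (predicted.Count != actual.Count || actual.Count == 0)
            throw new ArgumentException("Need two non-empty lists of equal length.");
    }
}
```

For predictions 3, 4, 5 against actual ratings 4, 4, 3 on a 1 to 5 scale, the absolute errors are 1, 0, 2, so MAE is 1 and normalized MAE is 0.25.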

Feedback and Contributions

We are always happy about feedback, and encourage MyMediaLite's users to contribute code to the project.

Just fork it on GitHub and send pull requests!

Bugs and feature requests can be reported on our mailing list or in our issue tracker.

Installation

See doc/Installation for installation instructions.

Documentation

See doc/ and the website for more documentation.

Citing MyMediaLite

If you use MyMediaLite for your research, it would be nice to acknowledge it in your papers by citing the following paper:

Zeno Gantner, Steffen Rendle, Christoph Freudenthaler, Lars Schmidt-Thieme: MyMediaLite: A Free Recommender System Library. RecSys 2011

@inproceedings{Gantner2011MyMediaLite,
  author    = {Zeno Gantner and Steffen Rendle and Christoph Freudenthaler and Lars Schmidt-Thieme},
  title     = {{MyMediaLite}: A Free Recommender System Library},
  booktitle = {5th ACM International Conference on Recommender Systems (RecSys 2011)},
  year      = 2011,
  location  = {Chicago, USA}
}

Academic publications that use or reference MyMediaLite

Contributors

Thanks to the following people, who provided valuable feedback, code, or other kinds of assistance: Roberto Abalde, Nicholas Ampazis, Thorsten Angermann, Suhrid Balakrishnan, Alejandro Bellogín, Christian Brauch, Fu Changhong, Subramanyeshwar Cherukuri, Simon Dooms, Lucas Drumond, Michael Ekstrand, Christoph Freudenthaler, Zeno Gantner, Jagadeesh Gorla, Josif Grabocka, Mark Graus, Guibing Guo, Andreas Hoffmann, Tomas Horvath, Kenneth Hoste, Frantisek Hrdina, Jia Huang, Nicolas Hug, Dominik Imrich, Dietmar Jannach, Peng Jiang, KwangSeob Kim, Artus Krohn-Grimberghe, Tobias Lang, Christina Lichtenthäler, Damir Logar, Marcelo Manzato, Brian McFee, Greg Najda, Chris Newell, Thai-Nghe Nguyen, Dimitris Paraschakis, Simon Renaud, Steffen Rendle, Marco Ribeiro, Roland Richter, Saurabh S., Sebastian Schelter, Lars Schmidt-Thieme, Yue Shi, Jordan Silva, Piotr Sobotka, Jessica Tölke, Tom Tung, Pieter-Jan Verbruggen, Julien Verplanken, Elvio Vicosa, João Vinagre, Oleksandr Vitvitskyi, Yongfeng Wang, Lina Weichbrodt, Cees Wesseling, Yong Zheng, GitHub users jkleint and NEBabylon.

This work was funded by the European Commission FP7 project MyMedia (Dynamic Personalization of Multimedia) under the grant agreement no. 215006.

Copyright & Licensing

MyMediaLite is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

MyMediaLite is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with MyMediaLite. If not, see http://www.gnu.org/licenses/.

This package contains Mono.Options, C5, Math.NET Numerics, and NUnit.

See doc/ComponentLicenses for more information about their licensing terms.


mymedialite's Issues

every recommender should have at least one literature reference in the API documentation

... so that people know where to read about the implemented method.

load recommender from model file w/o specifying the type

Currently, we instantiate a recommender and then load a model via its LoadModel() method.

It would be nice to have a tiny helper tool that looks into the model file, instantiates the recommender by itself, and then does the above.

make bold-driver heuristics configurable

Currently, the bold-driver learning rate adaptation schemes (for BiasedMatrixFactorization and BPRMF) use fixed values to increment/decrement the step size. These values should be configurable (and set to sensible defaults).
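
A configurable bold-driver scheme could look roughly like the sketch below: the step size grows slightly after an iteration that lowered the loss and shrinks sharply after one that raised it. The class `BoldDriver`, its property names, and the default factors 1.05 and 0.5 are assumptions made for illustration; they are not the values or names used inside MyMediaLite.

```csharp
using System;

// Hypothetical sketch; names and default factors are assumptions,
// not taken from the MyMediaLite code base.
public class BoldDriver
{
    // Factor applied when the loss went down (assumed default).
    public double IncreaseFactor { get; set; } = 1.05;
    // Factor applied when the loss went up (assumed default).
    public double DecreaseFactor { get; set; } = 0.5;
    public double LearnRate { get; private set; }

    private double? lastLoss; // null until the first iteration has been seen

    public BoldDriver(double initialLearnRate)
    {
        LearnRate = initialLearnRate;
    }

    // Call once per training iteration with the loss after that iteration.
    public double Update(double loss)
    {
        if (lastLoss.HasValue)
        {
            if (loss > lastLoss.Value)
                LearnRate *= DecreaseFactor;
            else if (loss < lastLoss.Value)
                LearnRate *= IncreaseFactor;
        }
        lastLoss = loss;
        return LearnRate;
    }
}
```

Exposing the two factors as properties means a command-line option or constructor argument can override them without touching the training loop.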

support KDD Cup 2011 track 2 evaluation protocol (item prediction)

For each positive item, sample a negative item according to its overall frequency/popularity.
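
One simple way to sample items in proportion to their popularity is to draw uniformly from the list of all feedback events and take the item of the drawn event. The sketch below does that and rejects items the user already has. `NegativeSampling`, `SampleNegative`, and the `maxTries` limit are hypothetical names and choices; this illustrates the sentence above and is not a verified reproduction of the official KDD Cup protocol.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the MyMediaLite API.
public static class NegativeSampling
{
    // allItemEvents holds the item ID of every feedback event (one entry per event),
    // so a uniform draw from it picks items proportionally to their frequency.
    public static int SampleNegative(IList<int> allItemEvents, ISet<int> userPositives,
                                     Random random, int maxTries = 1000)
    {
        for (int t = 0; t < maxTries; t++)
        {
            int item = allItemEvents[random.Next(allItemEvents.Count)];
            if (!userPositives.Contains(item))
                return item; // not yet seen by the user: accept as negative
        }
        throw new InvalidOperationException("No negative item found for this user.");
    }
}
```

Calling `SampleNegative` once per positive item of a user yields as many negatives as positives for the evaluation.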

get MinRating and MaxRating from data

Currently, the user (of the command-line tool or the library) has to set the minimum and maximum ratings manually (if they are not the default 1 and 5). It would be more convenient to read them from the data, while still allowing the user to set them manually if necessary.
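
A minimal sketch of this behavior, assuming the ratings are available as a sequence of doubles: the bounds are taken from the data unless the caller passes an explicit override. `RatingScale.Detect` and its parameter names are hypothetical and not part of MyMediaLite.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the MyMediaLite API.
public static class RatingScale
{
    // Returns the rating bounds found in the data; an override, if given, wins.
    public static (double Min, double Max) Detect(
        IEnumerable<double> ratings, double? minOverride = null, double? maxOverride = null)
    {
        var list = ratings.ToList();
        if (list.Count == 0)
            throw new ArgumentException("No ratings given.");
        double min = minOverride ?? list.Min();
        double max = maxOverride ?? list.Max();
        return (min, max);
    }
}
```

An override remains useful when the training data happens not to contain the extreme values of the scale.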

save user/item ID mappings together with recommender models

Chris wrote:

I've done some work with Mahout, and one feature I appreciate is that it stores the user and item mappings with the model data when you save it.
It makes it easier to resurrect a recommender and reduces the likelihood I'll get all the IDs mixed up!

create Debian package

automatic determination of a suitable learn rate

For recommenders that are trained with gradient-based algorithms we need suitable learn rates. These usually differ from data set to data set. MyMediaLite should contain a routine that automatically finds a suitable learn rate for a given data set.

cross-validation for relation-aware recommenders

two modes: split relations, do not split relations

use uint instead of int to refer to entities and list entries

user and item IDs could be uints (it is assumed anyway that they are >= 0)

Same for index data types in many places in the library.

These changes would make it harder to port MyMediaLite to Java, so we had better be careful.

Not high priority.

hyperparameter search for all recommenders

Hyperparameter search by line/grid search and Nelder-Mead should be supported for all recommenders.
For recommenders that use a learn rate (=step size), there should also be routines for learning good step sizes.

This will push MyMediaLite more towards being usable as a black-box tool.
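
The grid search part can be illustrated with a small generic routine that tries every combination of two hyperparameters and keeps the one with the lowest score. The names `GridSearch` and `Minimize` are hypothetical, and `evaluate` stands in for whatever trains a recommender and returns a validation error such as RMSE; Nelder-Mead is not covered by this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch, not the MyMediaLite hyperparameter search API.
public static class GridSearch
{
    // Tries all (x, y) combinations and returns the pair with the lowest score.
    // evaluate is assumed to train a model and return an error to minimize.
    public static (double X, double Y, double Score) Minimize(
        IEnumerable<double> xs, IEnumerable<double> ys,
        Func<double, double, double> evaluate)
    {
        var best = (X: double.NaN, Y: double.NaN, Score: double.PositiveInfinity);
        var yList = ys.ToList(); // enumerate the second axis once, reuse per x
        foreach (double x in xs)
        {
            foreach (double y in yList)
            {
                double score = evaluate(x, y);
                if (score < best.Score)
                    best = (x, y, score);
            }
        }
        return best;
    }
}
```

In practice `xs` and `ys` would typically be learn rates and regularization constants on a logarithmic grid, for example 0.001, 0.01, 0.1.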

filters for item prediction

add pre- and post-filter APIs to MyMediaLite

pre-filters generate candidate lists
- categories
- already seen
- ...

post-filter
- diversification
- thresholds

evaluation: new-user/new-item cross-validation

Support CV for cold-start evaluation protocols

directly support MovieLens u.item and u.user files

... for reading in attributes

support non-binary attributes out of the box

Currently, attributes are supposed to be binary: https://github.com/zenogantner/MyMediaLite/blob/master/src/MyMediaLite/IO/AttributeData.cs

It would be nice if the recommender API supported at least binary and real-valued attributes, and the IO methods supported binary, real-valued, nominal, and text attributes, and would map them accordingly to binary and real-valued attributes.

kNN recommenders: use UserItemBaseline via composition, not inheritance

http://www.ismll.de/mymedialite/documentation/doxygen/interface_my_media_lite_1_1_i_iterative_model.html

The current solution is not the most elegant.
KNN recommenders are (usually) not iterative models, so we should rather use the UserItemBaseline via composition, not inheritance.

update from Math.NET Iridium to Math.NET Numerics

update the math package
consider using Math.NET Numerics for matrix and vector computations if it is faster than our home-grown code (likely)

implement user fold-in

Chris suggested this for item prediction.

An interface for this could be:

IList<WeightedItem> Predict(IList<int> watched_items, IList<int> candidate_items)
IList<int> Predict(IList<int> watched_items, IList<int> candidate_items, int n)

This would train features for a user specified by the list watched_items, and then predict scores for the list candidate_items.

One additional thing to consider for the interface would be to extend the interface to allow user attributes (not supported by BPR-MF, but possibly by other recommenders):

IList<double> Predict(IList<int> watched_items, var user_features , IList<int> candidate_items)

Also implement a similar thing for rating prediction, like

 IList<WeightedItem> PredictItems(IList<WeightedItem> rated_items);
 IList<int> PredictItems(IList<WeightedItem> rated_items, int n);

create stand-alone binary package of the GUI demo

command-line programs: load from model file without specifying recommender type

The recommender type can also be derived from the information in the model file.

Example web application

Implement example web application that uses the web service interface.

stand-alone rating prediction executable

A rating prediction program that does not need training data, but just relies on the model file to make predictions.

Will not work for memory-based recommenders; we will also take care to change the model file format to incorporate user ID and item ID mappings.

stand-alone evaluation executables

for rating prediction and item prediction

The idea is that users of other software packages can use those to create the predictions, and then evaluate the predictions using MyMediaLite's evaluation routines.

Suggested by Lucas Drumond.

integrate GraphLab

GraphLab has a nice library of rating prediction algorithms based on matrix/tensor factorization:
http://graphlab.org/pmf.html

It would be nice to have an interface to GraphLab to be able to use this library and to use other recommenders written "in" GraphLab that make use of its particular features wrt. parallelization.

Context-aware recommendation

add namespace ContextAwareRecommendation with the interfaces IContextAwareItemRecommender (also covers tag recommendation, time-aware recommendations, and search queries) and IContextAwareRatingRecommender

recommender: factorized personal markov chains

Paper: http://www.ismll.uni-hildesheim.de/pub/pdfs/RendleFreudenthaler2010-FPMC.pdf

Can be used for next-basket recommendations (=recommendations based on the last purchase)

For this, create ISequentialItemRecommender.

suppress i18n for command line parameters

Currently, parsing floats/doubles in the Mono.Options command line parameters follows the current locale.
This is not desirable: the command line options should behave the same everywhere, so that people can copy and paste commands from the documentation etc.
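
In .NET, locale-independent parsing is usually achieved by passing `CultureInfo.InvariantCulture` to the parse call. The sketch below shows only that call; the wrapper class `InvariantParsing` is a hypothetical name, and how it would be wired into the option parser is not shown.

```csharp
using System.Globalization;

// Hypothetical wrapper; only double.Parse, NumberStyles and CultureInfo
// are actual .NET APIs here.
public static class InvariantParsing
{
    // Parses "0.05" or "1e-3" the same way regardless of the current locale.
    // NumberStyles.Float allows a sign, a decimal point and an exponent,
    // but no thousands separators.
    public static double ParseDouble(string text)
    {
        return double.Parse(text, NumberStyles.Float, CultureInfo.InvariantCulture);
    }
}
```

With this, an argument such as `0.05` is read as five hundredths even on systems whose locale uses a decimal comma.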

modularize: move apps into different repositories

The item and rating command-line programs should remain in the core repository, but the attribute-to-factor mapping code and the GUI demo could go into another repository.

create NuGet package

documentation: DB howto

Give an example of how to get training data from a database.

implement/port CofiRank by Weimer et al.

http://www.cofirank.org/

output eval graphs for item prediction

Output graphs (image files or CSV files) for things like precision@N and recall@N for different N.
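
The CSV variant could be as simple as one row per cutoff N. The sketch below computes precision@N and recall@N for a single user's ranked list and writes them as CSV text. `PrecisionRecallCurve.ToCsv`, the column names, and the three-decimal format are assumptions for illustration; a real implementation would average over all test users.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text;

// Hypothetical helper, not part of the MyMediaLite API.
public static class PrecisionRecallCurve
{
    // ranking: recommended item IDs, best first.
    // relevant: the held-out positive items of the user.
    // ns: the cutoffs N to report.
    public static string ToCsv(IList<int> ranking, ISet<int> relevant, IEnumerable<int> ns)
    {
        var sb = new StringBuilder("N,precision,recall\n");
        foreach (int n in ns)
        {
            // number of relevant items among the top n recommendations
            int hits = ranking.Take(n).Count(item => relevant.Contains(item));
            double precision = (double) hits / n;
            double recall = relevant.Count == 0 ? 0.0 : (double) hits / relevant.Count;
            sb.AppendFormat(CultureInfo.InvariantCulture,
                            "{0},{1:0.###},{2:0.###}\n", n, precision, recall);
        }
        return sb.ToString();
    }
}
```

The resulting text can be written to a file and plotted with any spreadsheet or plotting tool.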

support ensembles in the command-line programs

Support the combination of several recommenders by the command-line programs.

rating_prediction and rating_based_ranking programs should also support --test-users and --candidate-items arguments

For consistency with the item prediction program, and because it would be a useful feature.
That way, we could also generate rating predictions for arbitrary items.

documentation: F#

Create an example that explains how to use MyMediaLite from F#, and how to implement a new recommender in F#.

Parallelize item prediction evaluation

... by parallelizing the candidate score computations.

active learning interface

Create an interface for active learning recommenders, i.e. recommenders that request certain items to be rated by a user in order to improve the predictive model.

implement Collaborative Topic Models by Wang+Blei

http://www.cs.princeton.edu/~chongw/

command line arguments to select underlying data types

--static: slower loading (2 passes), less memory consumption, faster access, no new data can be added
--non-static: faster loading (1 pass), new data can be added

top-n evaluation for rating prediction

support top-n evaluation (and other item prediction measures) in the rating prediction command-line program

group recommendation interface

Create an interface for recommenders that aggregate score predictions for several users.

chronological splits

Support chronological splits, both relative to user history and to absolute times.

--chronological-split=DATETIME

--chronological-split=RATIO
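
Both variants can be sketched on a generic list of timestamped events. In the sketch below, `ByTime` puts everything before the given date into the training set, and `ByRatio` puts the earliest share of the events there. All names are hypothetical, and reading RATIO as the training share (rather than the test share) is an assumption. A split relative to each user's history would apply `ByRatio` to every user's events separately.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch, not the MyMediaLite data splitting API.
public static class ChronologicalSplit
{
    // Events strictly before the split time go to training, the rest to test.
    public static (List<T> Train, List<T> Test) ByTime<T>(
        IEnumerable<T> events, Func<T, DateTime> time, DateTime split)
    {
        var train = new List<T>();
        var test = new List<T>();
        foreach (T e in events)
        {
            if (time(e) < split)
                train.Add(e);
            else
                test.Add(e);
        }
        return (train, test);
    }

    // The earliest `ratio` share of the events (rounded down) goes to training.
    // Assumption: ratio is the training share, between 0 and 1.
    public static (List<T> Train, List<T> Test) ByRatio<T>(
        IEnumerable<T> events, Func<T, DateTime> time, double ratio)
    {
        var sorted = events.OrderBy(time).ToList();
        int trainCount = (int) (sorted.Count * ratio);
        return (sorted.Take(trainCount).ToList(), sorted.Skip(trainCount).ToList());
    }
}
```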

ternary relations (tags)
n-ary relations
weighted relations
multiple relations over the same set (equivalent to labelled relations)
relations with time information
etc.

common class for command-line programs

The current command-line programs for item and rating prediction share many concepts.
It may be worthwhile to implement those shared concepts in one class and derive the programs from it.

implement Bayesian Probabilistic Matrix Factorization

http://www.mit.edu/~rsalakhu/papers/bpmf.pdf
