
Implicit


Fast Python Collaborative Filtering for Implicit Datasets.

This project provides fast Python implementations of several popular recommendation algorithms for implicit feedback datasets, including Alternating Least Squares (ALS), Bayesian Personalized Ranking (BPR), and item-item nearest-neighbour models using cosine, TF-IDF or BM25 similarity.

All models have multi-threaded training routines, using Cython and OpenMP to fit the models in parallel across all available CPU cores. In addition, the ALS and BPR models both have custom CUDA kernels, enabling fitting on compatible GPUs. Approximate nearest-neighbour libraries such as Annoy, NMSLIB and Faiss can also be used by Implicit to speed up making recommendations.

Installation

Implicit can be installed from pypi with:

pip install implicit

Installing with pip will use prebuilt binary wheels on x86_64 Linux, Windows and OSX. These wheels include GPU support on Linux.

Implicit can also be installed with conda:

# CPU only package
conda install -c conda-forge implicit

# CPU+GPU package
conda install -c conda-forge implicit implicit-proc=*=gpu

Basic Usage

import implicit

# initialize a model
model = implicit.als.AlternatingLeastSquares(factors=50)

# train the model on a sparse matrix of user/item/confidence weights
model.fit(user_item_data)

# recommend items for a user
recommendations = model.recommend(userid, user_item_data[userid])

# find related items
related = model.similar_items(itemid)
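If you are building the input matrix yourself, here is a minimal sketch (assuming a pandas DataFrame with hypothetical user, item and weight columns):

import pandas as pd
from scipy.sparse import csr_matrix

# hypothetical interaction log: one row per (user, item) event
df = pd.DataFrame({"user": ["a", "b", "a"],
                   "item": ["x", "x", "y"],
                   "weight": [1.0, 3.0, 2.0]})

users = df["user"].astype("category")
items = df["item"].astype("category")

# rows are users, columns are items, values are confidence weights
user_item_data = csr_matrix(
    (df["weight"], (users.cat.codes, items.cat.codes)),
    shape=(len(users.cat.categories), len(items.cat.categories)),
)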

The examples folder has a program showing how to use this to compute similar artists on the last.fm dataset.

For more information see the documentation.

Articles about Implicit

These blog posts describe the algorithms that power this library:

There are also several other articles about using Implicit to build recommendation systems:

Requirements

This library requires SciPy version 0.16 or later and Python version 3.6 or later.

GPU support requires at least version 11 of the NVIDIA CUDA Toolkit.

This library is tested with Python 3.7, 3.8, 3.9, 3.10 and 3.11 on Ubuntu, OSX and Windows.

Benchmarks

Simple benchmarks comparing the ALS fitting time versus Spark can be found here.

Optimal Configuration

I'd recommend configuring SciPy to use Intel's MKL matrix libraries. One easy way of doing this is by installing the Anaconda Python distribution.

For systems using OpenBLAS, I highly recommend setting 'export OPENBLAS_NUM_THREADS=1'. This disables its internal multithreading, which leads to substantial speedups for this package. Likewise, for Intel MKL, 'export MKL_NUM_THREADS=1' should also be set.
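The same thing can be done from Python, as long as it happens before NumPy/SciPy are first imported (a minimal sketch):

import os

# must be set before numpy/scipy are imported, or the setting is ignored
os.environ["OPENBLAS_NUM_THREADS"] = "1"
os.environ["MKL_NUM_THREADS"] = "1"

import implicit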

Released under the MIT License

implicit's People

Contributors

aburkard, apat1n, atakanfilgoz, bavaria95, benfred, chedatomasz, danieljl, dirtysalt, ds2268, escherba, focus, inkrement, ita9naiwa, iuri-queiroz, ivanweiz, jbochi, jmc-bbk, markdouthwaite, martinthoma, mrticker, nfultz, reinerrubin, seonbeomkim, soonmok, stillmatic, tgsmith61591, timgates42, torsjonas, tych0n, yurijmikhalevich


implicit's Issues

WARNING:root:Annoy isn't installed

Every time I try to use the example script lastfm.py I get the following error:

python implicit_argparser.py --input=usersha1-artmbid-artname-plays.tsv
WARNING:root:Annoy isn't installed
Traceback (most recent call last):
File "implicit_argparser.py", line 155, in
cg=args.cg)
File "implicit_argparser.py", line 97, in calculate_similar_artists
model.fit(plays)
File "/Users/Kakadu/anaconda/lib/python3.6/site-packages/implicit/annoy_als.py", line 78, in fit
self.cosine_index = annoy.AnnoyIndex(self.item_factors.shape[1], 'angular')
NameError: name 'annoy' is not defined

Where can I get Annoy? P.S. I use macOS.
By the way, I used model=als by default and even then I get the Annoy error shown in the traceback. I'd be glad to see a reasonable reply.

Installing on Mac OSX

pip install implicit initially fails to find GCC since clang gcc is located in /usr/bin/gcc

After adding /usr/bin/gcc to setup.py I get:

gcc: error: implicit/_implicit.c: No such file or directory

I also tried installing gcc via Homebrew and installing from source:

python setup.py install

running install
running bdist_egg
running egg_info
writing requirements to implicit.egg-info/requires.txt
writing implicit.egg-info/PKG-INFO
writing top-level names to implicit.egg-info/top_level.txt
writing dependency_links to implicit.egg-info/dependency_links.txt
reading manifest file 'implicit.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
warning: no files found matching 'implicit/*.c'
writing manifest file 'implicit.egg-info/SOURCES.txt'
installing library code to build/bdist.macosx-10.5-x86_64/egg
running install_lib
running build_py
running build_ext
building 'implicit._implicit' extension
gcc-6 -fno-strict-aliasing -I/Users/seanlaw/anaconda/include -arch x86_64 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/seanlaw/anaconda/include/python2.7 -c implicit/_implicit.c -o build/temp.macosx-10.5-x86_64-2.7/implicit/_implicit.o -fopenmp -ffast-math
gcc-6: error: implicit/_implicit.c: No such file or directory
gcc-6: fatal error: no input files
compilation terminated.
An exception has occurred, use %tb to see the full traceback.

SystemExit: error: command 'gcc-6' failed with exit status 1

I don't know if there were recent changes, but it looks like implicit/_implicit.c is missing.

Get user factors per id

I can see that the first entry for user_factors is for a new user.
But if I know which user id I want, how can I retrieve it per user? How do I match each user to its factors?
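A minimal sketch of one way to do this, assuming the matrix rows were built from pandas categorical codes (data, model and some_user_id are hypothetical placeholders):

import numpy as np

# same ordering pandas uses for the categorical codes (sorted unique values)
user_ids = np.sort(data["user"].unique())
row_of_user = {uid: row for row, uid in enumerate(user_ids)}

# latent factors for one particular external user id
factors_for_user = model.user_factors[row_of_user[some_user_id]]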

recalculate_user parameter in the recommend() of the als.py

Hi Ben,

Thanks for the awesome library, i am using the library to create a recommender system.

I am trying to implement my own version of returning the items liked by the user, so may I ask what the recalculate_user parameter in recommend() does in als.py?

Thank you in advance!

Meiyi
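Not a definitive answer, but as I read it, recalculate_user=True recomputes the user's latent factors on the fly from the supplied row of interactions instead of using the factors stored at fit time, which is useful for users that weren't in the training data. A sketch:

# hedged sketch: re-derive this user's factors from their interaction row
# rather than using the factors learned during fit (useful for unseen users)
recommendations = model.recommend(userid, user_item_data[userid], recalculate_user=True)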

ImportError: No module named approximate_als

Hi Ben,

First let me thank you so much for this amazing software! I really appreciate the time and effort that went into it. I'm excited to try the changes that permit quick recommendations but I'm getting the following error when I try to run the lastfm.py example:

Traceback (most recent call last):
  File "lastfm.py", line 27, in <module>
    from implicit.approximate_als import (AnnoyAlternatingLeastSquares, NMSLibAlternatingLeastSquares,
ImportError: No module named approximate_als

PyCharm also says it cannot find a reference to 'approximate_als' in __init__.py.

Note that the line numbers may be off by a few, as I added import os and os.environ["OPENBLAS_NUM_THREADS"] = "1" at the top in light of a related OpenBLAS warning I received.

I'm on Xubuntu that I installed a few days ago (not a VM).

I can comment out the approximate_als import and the subsequent reference to it, and the code then runs fine (giving a quick test with bm25). I tried to figure out a solution and it's probably really obvious but I haven't figured it out.

Probably unhelpful stuff I did to try to fix it: I tried adding "from . import approximate_als" to the top of __init__.py and adding "approximate_als" to __all__, but that didn't work. I also tried blanking out __init__.py, but that didn't work. I did a pip install nmslib and pip install annoy after combing through your blog, but that didn't help.

Thanks for any thoughts you might have!

ALS.Recommend function coo_matrix index access problem

So the ALS recommend function takes an 'int' for userid and item_user_data.T (a coo_matrix) for user_items; however, you cannot index a coo_matrix that way, so it fails for me:

In [46]: ratings
Out[46]: 
<635810x14744082 sparse matrix of type '<class 'numpy.float64'>'
	with 115307196 stored elements in COOrdinate format>

In [47]: model.recommend(5, ratings.T)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-46-2d2962f49703> in <module>()
----> 1 model.recommend(5, ratings.T)

/Users/ml/lib/python3.5/site-packages/implicit/als.py in recommend(self, userid, user_items, N)
     87 
     88         # calcualte the top N items, removing the users own liked items from the results
---> 89         liked = set(user_items[userid].indices)
     90         count = N + len(liked)
     91         if count < len(scores):

TypeError: 'coo_matrix' object does not support indexing

If not a coo_matrix what should I pass to the recommend function?
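A hedged workaround: convert the matrix to CSR (which supports row indexing) before calling recommend:

# CSR supports the row indexing that recommend() needs; COO does not
user_items = ratings.T.tocsr()
recommendations = model.recommend(5, user_items)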

nmslib recommend

I've been running the movielens example, trying to use approximate_als.NMSLibAlternatingLeastSquares instead of AlternatingLeastSquares.
The recommend function returns indices I can't decipher.
I think there might be a bug.
It fails when I try to use movie_lookup[movie].

movielens.py MemoryError

Non-issue: it was 32-bit Python; after installing 64-bit, it works fine.

File "pandas_libs\parsers.pyx", line 894, in pandas._libs.parsers.TextReader.read
File "pandas_libs\parsers.pyx", line 944, in pandas._libs.parsers.TextReader._read_low_memory
File "pandas_libs\parsers.pyx", line 2228, in pandas._libs.parsers._concatenate_chunks
MemoryError

ValueError: negative row index found on input

When I run the attached input, I get the following error:

Traceback (most recent call last):
File "/Users/username/Desktop/Recommendation/Implementation.py", line 206, in
collaborative_filter(formatted, result)
File "/Users/username/Desktop/Recommendation/Implementation.py", line 80, in
collaborative_filter
df, plays = read_data(input_filename)
File "/Users/username/Desktop/Recommendation/Implementation.py", line 25, in read_data
data['user'].cat.codes.copy())))
File "/usr/local/lib/python2.7/site-packages/scipy/sparse/coo.py", line 182, in init
self._check()
File "/usr/local/lib/python2.7/site-packages/scipy/sparse/coo.py", line 240, in _check
raise ValueError('negative row index found')
ValueError: negative row index found

From what I can tell, the input is correctly formatted with 3 columns separated by tabs. Thank you for your time!
faulty_input.txt
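A hedged guess at a common cause: pandas assigns categorical code -1 to NaN values, which then shows up as a negative row index in the COO constructor. A minimal check, assuming tab-separated user/item/plays columns:

import pandas as pd

df = pd.read_csv("faulty_input.txt", sep="\t", names=["user", "item", "plays"])

# NaN entries get categorical code -1, which scipy rejects as a row index
df = df.dropna(subset=["user", "item"])
codes = df["user"].astype("category").cat.codes
assert (codes >= 0).all()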

Optimize for ranking instead of rmse

In many cases, ranking the items is an easier problem than reconstructing the matrix.
That could be implemented by optimizing precision-recall instead of RMSE.
Another cool feature.

pip installing version 0.1.5

pip install implicit
Successfully installed implicit-0.1.7

>>> import implicit
>>> implicit.__version__
'0.1.5'

Python 2.7.11 |Anaconda 4.0.0 (64-bit)| (default, Dec 6 2015, 18:08:32)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux2

model_name(als) trains out unexist movieIds

I copied the content of the example file "movielens.py" into my program file "datatrain.py" and trained the ml-20m dataset with the ALS model.
Some nonexistent movieIds show up in the results. How did that happen?

movieId| name | similar_movieId | score
26462 | Bad Boys (1983)| 113812 | 0.926960830278

Traceback (most recent call last):
File "datatrain.py", line 108, in
min_rating=args.min_rating)
File "datatrain.py", line 86, in calculate_similar_movies
o.write("%s\t%s\t%s\n" % (movie, movie_lookup[other], score))
KeyError: 113812

Expected in: flat namespace


ImportError Traceback (most recent call last)
in ()
----> 1 import implicit

/Users/yanan.chen/anaconda/lib/python2.7/site-packages/implicit/__init__.py in ()
1 from .als import alternating_least_squares
2
----> 3 from . import nearest_neighbours
4 from . import als
5

/Users/yanan.chen/anaconda/lib/python2.7/site-packages/implicit/nearest_neighbours.py in ()
5 from scipy.sparse import coo_matrix, csr_matrix
6
----> 7 from ._nearest_neighbours import all_pairs_knn
8 from .recommender_base import RecommenderBase
9 from .utils import nonzeros

ImportError: dlopen(/Users/yanan.chen/anaconda/lib/python2.7/site-packages/implicit/_nearest_neighbours.so, 2): Symbol not found: __ZdlPvm
Referenced from: /Users/yanan.chen/anaconda/lib/python2.7/site-packages/implicit/_nearest_neighbours.so
Expected in: flat namespace
in /Users/yanan.chen/anaconda/lib/python2.7/site-packages/implicit/_nearest_neighbours.so

[question] loss calculation and cross-validation

Hi Ben. Thanks for the library and especially for the great posts. I'm wondering what the procedure for loss calculation is. I checked the code, but didn't understand the exact algorithm. I suppose it is a kind of approximation of the loss from the paper, isn't it? I think it is almost infeasible (very computationally expensive) to calculate the exact loss from the paper, because it would require computing a prediction for every user and item in order to account for the loss on unobserved items. Or did I miss some trick?
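For reference, a naive (dense, hedged) sketch of the exact loss from the Hu et al. 2008 paper, which is only feasible for small matrices:

import numpy as np

def exact_loss(Cui, X, Y, regularization):
    # naive O(users x items) version of the paper's loss:
    #   sum_ui c_ui * (p_ui - x_u . y_i)^2 + reg * (||X||^2 + ||Y||^2)
    # where p_ui = 1 for observed pairs and 0 otherwise; the confidence is
    # assumed to be c_ui = 1 + (stored value), which is worth verifying.
    R = Cui.toarray()
    P = (R > 0).astype(np.float64)
    C = 1.0 + R
    err = P - X @ Y.T
    return (C * err ** 2).sum() + regularization * ((X ** 2).sum() + (Y ** 2).sum())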

Stopping criterion of ALS

Is it a good idea to stop ALS using a validation dataset based on some criterion (RMSE, etc.)? The paper uses probe datasets as the validation set. Once the RMSE is less than 1e-9, they stop iterating.

Reproduce results of fitted model

Every time I run model.fit() I obtain different values in model.user_factors and model.item_factors. How can I get reproducible results after model fitting?
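A hedged sketch: recent releases accept a random_state argument on the model constructors; on versions without it, seeding NumPy's global RNG before fitting may be enough, assuming the factors are initialized from it (worth verifying against your version):

import numpy as np
import implicit

np.random.seed(42)   # only helps if the factors are drawn from NumPy's global RNG
model = implicit.als.AlternatingLeastSquares(factors=50, random_state=42)
model.fit(user_item_data)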

Allow float32 matrices to be used as input

I was able to compile the package from source;
however, I am getting an error during execution of least_squares (the C++ extension; the Python version is working):

  File "build/bdist.freebsd-11.0-RELEASE-p1-amd64/egg/implicit/als.py", line 48, in alternating_least_squares
  File "implicit/_als.pyx", line 60, in implicit._als.least_squares (implicit/_als.cpp:3561)
ValueError: Buffer dtype mismatch, expected 'double' but got 'float'

Here is the corresponding line from _als.cpp

    cdef double[:] data = Cui.data

Can you please advise how to fix this?
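Until float32 input is supported, a workaround is to cast the matrix to float64 before calling the extension (a minimal sketch):

import numpy as np

# the compiled extension expects double precision, so cast before fitting
Cui = Cui.astype(np.float64)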

Pickling

Are the models pickle-able? If so, close this question; if not, consider it a feature request!
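For what it's worth, the fitted models are plain Python objects holding NumPy arrays, so standard pickling should work (a sketch, not an official guarantee):

import pickle

with open("als_model.pkl", "wb") as f:
    pickle.dump(model, f)

with open("als_model.pkl", "rb") as f:
    model = pickle.load(f)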

module 'implicit.cuda' has no attribute 'CuCSRMatrix'

Ben, thank you for a wonderful blog post on CUDA programming! I was playing with the latest version of implicit. While the package does build, I am running into the "module 'implicit.cuda' has no attribute 'CuCSRMatrix'" error.

It looks like this is due to the implicit.cuda.CuCSRMatrix call (and other related calls) on https://github.com/benfred/implicit/blob/master/implicit/als.py#L171. Switching all of them to _cuda.CuCSRMatrix after changing the import statement to from implicit.cuda import _cuda fixes the issue.

Note that https://github.com/benfred/implicit/blob/master/implicit/cuda/__init__.py#L3 doesn't seem to be doing what it's expected to do.

Failure installing in Windows10

OS: Windows 10
Python Version: 3.5.2
Cython Version: 0.26
scipy Version: 0.19.1

pip install implicit as well as python setup.py install result in errors:

gcc: error: /O2: No such file or directory
gcc: error: /openmp: No such file or directory

I thought openmp was only relevant for OSX. Any suggestions?

Query top N recommended items

Hi Ben,

I'm using implicit to predict a top7list of recommendations using a sparse matrix of aggregated customer purchases composed of 7101 customer purchases from 24 products.

The issue I'm having is that I'm a little confused by the output from .recommend, which produces a list of N tuples:

[(845, 1.0136324354312989), (1150, 1.0028331824506354), (51, 1.0027650376439357), (2411, 1.0024685562873292), (1810, 1.0019960930254448), (1211, 1.0018685279069661), (775, 1.0018545578136604)]

Now I would have expected the first value in each tuple to be an index into the product list, but I suspect that I'm looking at the indices of the latent factor vectors? If you could give me a steer on the process for extracting the product identities it would be very much appreciated.

Kind regards,
Michael.


import pandas as pd
import scipy.sparse as sparse
import numpy as np
import implicit
# import data and add header rows
data = pd.read_csv('D:\santander\\train_sample_small.csv', names=['cust_id', 'product', 'rating'])
# transform dataset to sum by activity
grouped_data = data.groupby(['cust_id', 'product']).sum().reset_index()
grouped_data.head()


# Only get customers where purchase totals were positive
grouped_purchased = grouped_data.query('rating > 0')
print(grouped_purchased.head())

# Get our unique customers
customers = list(np.sort(grouped_purchased.cust_id.unique()))

# Get our unique products that were purchased
products = list(grouped_purchased['product'].unique())

# All of our purchases
rating = list(grouped_purchased.rating)

# Get the associated row/column indices
rows = grouped_purchased['cust_id'].astype('category', categories=customers).cat.codes
cols = grouped_purchased['product'].astype('category', categories=products).cat.codes

# create sparse matrix from data
purchases_sparse = sparse.csr_matrix((rating, (rows, cols)), shape=(len(customers),    len(products)), dtype=np.float64)

# Build, fit model and recommend top 7 products for first user
model = implicit.als.AlternatingLeastSquares(factors=50, regularization=0.1, iterations=50)
model.fit(item_users=purchases_sparse)
recom = model.recommend(userid=0, user_items=purchases_sparse.T, N=7)
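A hedged guess at what is happening: in this version fit() expects an item_users matrix (items x users) and recommend() a user_items matrix (users x items), so with a customers x products matrix the two transposes above look swapped; once the orientation matches, the first element of each tuple is a column index that can be mapped back through the products list:

# purchases_sparse is customers x products, so items x users is its transpose
model.fit(item_users=purchases_sparse.T)
recom = model.recommend(userid=0, user_items=purchases_sparse, N=7)

# map item indices back to the original product identifiers
top7 = [(products[idx], score) for idx, score in recom]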

Clarify or rename `filter_items`

Hi, thanks for this nice package!

AlternatingLeastSquares.recommend has a parameter called filter_items, which, apart from the source code, does not have any documentation.

The same parameter is found in RecommenderBase, AnnoyAlternatingLeastSquares, and ItemItemRecommender.

Before reading the source, I thought it was a whitelist of items that the recommendation should select from (which suited my use case), but as it turns out, it is a blacklist. So, I have two suggestions I'd like to hear your thoughts on:

  1. Deprecate filter_items, and make a more descriptively named parameter such as skip_items, ignore_items, or item_blacklist.
  2. Put an explanation in the docstrings: filter_items: A list of items that should not be recommended.

I would think the second one is a no-brainer - of course it should be documented. The first one, however, is a bit more bold. Thoughts?

As another remark (should I move it to another issue?), the line "if filter_items:" (such as here) doesn't work with numpy arrays. I would suggest moving to "if filter_items is not None:" instead.
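For reference, a usage sketch of the parameter as it behaves today (a blacklist, not a whitelist):

# items 42 and 301 are excluded from the results even if they score highly
recommendations = model.recommend(userid, user_items, N=10, filter_items=[42, 301])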

Implementing regular ALS

Hi Ben,
I am trying to modify your code to work with the regular ALS matrix factorization algorithm (for sparse matrices).
This code seems to work for now.
However, could you please take a look and verify the correctness of the proposed changes?

import numpy as np
from implicit.utils import nonzeros   # same helper the library uses internally

def least_squares(Cui, X, Y, regularization, num_threads=0):
    # plain (unweighted) ALS: solve (Y_u^T Y_u + reg*I) x_u = Y_u^T r_u per user,
    # using only the items each user has actually interacted with
    users, factors = X.shape
    E = np.eye(factors)
    for u in range(users):
        A = np.zeros(shape=(factors, factors))
        b = np.zeros(factors)

        # here the stored value is treated as the rating itself (confidence weight = 1)
        for i, rating in nonzeros(Cui, u):
            factor = Y[i]
            A += np.outer(factor, factor)
            b += factor * rating

        A += regularization * E
        X[u] = np.linalg.solve(A, b)

pip install missing 32-bit factor support

Hey Ben, looks like when I install via pip I'm not getting your recent changes to support 32 bit factorization -- the dtype argument is not part of the alternating_least_squares method signature. (Among other things, this means I can't download and run the tests.)

Error installing implicit on Ubuntu

Hey, thanks for putting together this package. I'm encountering a C compiling-related error when I try to install it on Ubuntu. I've checked that gcc is installed. Any thoughts on what might be going on?

Thanks,

Collecting implicit==0.2.6 (from -r requirements.txt (line 67))
Downloading implicit-0.2.6.tar.gz (260kB)
    100% || 266kB 2.6MB/s
Complete output from command python setup.py egg_info:

Error compiling Cython file:
------------------------------------------------------------
...
from cython.parallel import parallel, prange
from libc.stdlib cimport malloc, free
from libc.string cimport memcpy

# requires scipy v0.16
cimport scipy.linalg.cython_lapack as cython_lapack
^
------------------------------------------------------------

implicit/_als.pyx:9:8: 'scipy/linalg/cython_lapack.pxd' not found

Error compiling Cython file:
------------------------------------------------------------
...
from libc.stdlib cimport malloc, free
from libc.string cimport memcpy

# requires scipy v0.16
cimport scipy.linalg.cython_lapack as cython_lapack
cimport scipy.linalg.cython_blas as cython_blas
^
------------------------------------------------------------

implicit/_als.pyx:10:8: 'scipy/linalg/cython_blas.pxd' not found

Error compiling Cython file:
------------------------------------------------------------
...
cdef inline void axpy(int * n, floating * da, floating * dx, int * incx, floating * dy,
int * incy) nogil:
if floating is double:
cython_blas.daxpy(n, da, dx, incx, dy, incy)
else:
cython_blas.saxpy(n, da, dx, incx, dy, incy)
^
------------------------------------------------------------

implicit/_als.pyx:18:19: cimported module has no attribute 'saxpy'

Error compiling Cython file:
------------------------------------------------------------
...

# lapack/blas wrappers for cython fused types
cdef inline void axpy(int * n, floating * da, floating * dx, int * incx, floating * dy,
int * incy) nogil:
if floating is double:
cython_blas.daxpy(n, da, dx, incx, dy, incy)
^
------------------------------------------------------------

implicit/_als.pyx:16:19: cimported module has no attribute 'daxpy'

Error compiling Cython file:
------------------------------------------------------------
...
cdef inline void symv(char *uplo, int *n, floating *alpha, floating *a, int *lda, floating *x,
int *incx, floating *beta, floating *y, int *incy) nogil:
if floating is double:
cython_blas.dsymv(uplo, n, alpha, a, lda, x, incx, beta, y, incy)
else:
cython_blas.ssymv(uplo, n, alpha, a, lda, x, incx, beta, y, incy)
^
------------------------------------------------------------

implicit/_als.pyx:25:19: cimported module has no attribute 'ssymv'

Error compiling Cython file:
------------------------------------------------------------
...
cython_blas.saxpy(n, da, dx, incx, dy, incy)

cdef inline void symv(char *uplo, int *n, floating *alpha, floating *a, int *lda, floating *x,
int *incx, floating *beta, floating *y, int *incy) nogil:
if floating is double:
cython_blas.dsymv(uplo, n, alpha, a, lda, x, incx, beta, y, incy)
^
------------------------------------------------------------

implicit/_als.pyx:23:19: cimported module has no attribute 'dsymv'

Error compiling Cython file:
------------------------------------------------------------
...

cdef inline floating dot(int *n, floating *sx, int *incx, floating *sy, int *incy) nogil:
if floating is double:
return cython_blas.ddot(n, sx, incx, sy, incy)
else:
return cython_blas.sdot(n, sx, incx, sy, incy)
^
------------------------------------------------------------

implicit/_als.pyx:31:26: cimported module has no attribute 'sdot'

Error compiling Cython file:
------------------------------------------------------------
...
else:
cython_blas.ssymv(uplo, n, alpha, a, lda, x, incx, beta, y, incy)

cdef inline floating dot(int *n, floating *sx, int *incx, floating *sy, int *incy) nogil:
if floating is double:
return cython_blas.ddot(n, sx, incx, sy, incy)
^
------------------------------------------------------------

implicit/_als.pyx:29:26: cimported module has no attribute 'ddot'

Error compiling Cython file:
------------------------------------------------------------
...

cdef inline void scal(int *n, floating *sa, floating *sx, int *incx) nogil:
if floating is double:
cython_blas.dscal(n, sa, sx, incx)
else:
cython_blas.sscal(n, sa, sx, incx)
^
------------------------------------------------------------

implicit/_als.pyx:37:19: cimported module has no attribute 'sscal'

Error compiling Cython file:
------------------------------------------------------------
...
else:
return cython_blas.sdot(n, sx, incx, sy, incy)

cdef inline void scal(int *n, floating *sa, floating *sx, int *incx) nogil:
if floating is double:
cython_blas.dscal(n, sa, sx, incx)
^
------------------------------------------------------------

implicit/_als.pyx:35:19: cimported module has no attribute 'dscal'

Error compiling Cython file:
------------------------------------------------------------
...
cdef inline void posv(char * u, int * n, int * nrhs, floating * a, int * lda, floating * b,
int * ldb, int * info) nogil:
if floating is double:
cython_lapack.dposv(u, n, nrhs, a, lda, b, ldb, info)
else:
cython_lapack.sposv(u, n, nrhs, a, lda, b, ldb, info)
^
------------------------------------------------------------

implicit/_als.pyx:44:21: cimported module has no attribute 'sposv'

Error compiling Cython file:
------------------------------------------------------------
...
cython_blas.sscal(n, sa, sx, incx)

cdef inline void posv(char * u, int * n, int * nrhs, floating * a, int * lda, floating * b,
int * ldb, int * info) nogil:
if floating is double:
cython_lapack.dposv(u, n, nrhs, a, lda, b, ldb, info)
^
------------------------------------------------------------

implicit/_als.pyx:42:21: cimported module has no attribute 'dposv'

Error compiling Cython file:
------------------------------------------------------------
...
cdef inline void gesv(int * n, int * nrhs, floating * a, int * lda, int * piv, floating * b,
int * ldb, int * info) nogil:
if floating is double:
cython_lapack.dgesv(n, nrhs, a, lda, piv, b, ldb, info)
else:
cython_lapack.sgesv(n, nrhs, a, lda, piv, b, ldb, info)
^
------------------------------------------------------------

implicit/_als.pyx:51:21: cimported module has no attribute 'sgesv'

Error compiling Cython file:
------------------------------------------------------------
...
cython_lapack.sposv(u, n, nrhs, a, lda, b, ldb, info)

cdef inline void gesv(int * n, int * nrhs, floating * a, int * lda, int * piv, floating * b,
int * ldb, int * info) nogil:
if floating is double:
cython_lapack.dgesv(n, nrhs, a, lda, piv, b, ldb, info)
^
------------------------------------------------------------

implicit/_als.pyx:49:21: cimported module has no attribute 'dgesv'

Error compiling Cython file:
------------------------------------------------------------
...
cdef inline void axpy(int * n, floating * da, floating * dx, int * incx, floating * dy,
int * incy) nogil:
if floating is double:
cython_blas.daxpy(n, da, dx, incx, dy, incy)
else:
cython_blas.saxpy(n, da, dx, incx, dy, incy)
^
------------------------------------------------------------

implicit/_als.pyx:18:25: Calling gil-requiring function not allowed without gil

Error compiling Cython file:
------------------------------------------------------------
...

# lapack/blas wrappers for cython fused types
cdef inline void axpy(int * n, floating * da, floating * dx, int * incx, floating * dy,
int * incy) nogil:
if floating is double:
cython_blas.daxpy(n, da, dx, incx, dy, incy)
^
------------------------------------------------------------

implicit/_als.pyx:16:25: Calling gil-requiring function not allowed without gil

Error compiling Cython file:
------------------------------------------------------------
...
cdef inline void symv(char *uplo, int *n, floating *alpha, floating *a, int *lda, floating *x,
int *incx, floating *beta, floating *y, int *incy) nogil:
if floating is double:
cython_blas.dsymv(uplo, n, alpha, a, lda, x, incx, beta, y, incy)
else:
cython_blas.ssymv(uplo, n, alpha, a, lda, x, incx, beta, y, incy)
^
------------------------------------------------------------

implicit/_als.pyx:25:25: Calling gil-requiring function not allowed without gil

Error compiling Cython file:
------------------------------------------------------------
...
cython_blas.saxpy(n, da, dx, incx, dy, incy)

cdef inline void symv(char *uplo, int *n, floating *alpha, floating *a, int *lda, floating *x,
int *incx, floating *beta, floating *y, int *incy) nogil:
if floating is double:
cython_blas.dsymv(uplo, n, alpha, a, lda, x, incx, beta, y, incy)
^
------------------------------------------------------------

implicit/_als.pyx:23:25: Calling gil-requiring function not allowed without gil

Error compiling Cython file:
------------------------------------------------------------
...

cdef inline floating dot(int *n, floating *sx, int *incx, floating *sy, int *incy) nogil:
if floating is double:
return cython_blas.ddot(n, sx, incx, sy, incy)
else:
return cython_blas.sdot(n, sx, incx, sy, incy)
^
------------------------------------------------------------

implicit/_als.pyx:31:31: Calling gil-requiring function not allowed without gil

Error compiling Cython file:
------------------------------------------------------------
...
else:
cython_blas.ssymv(uplo, n, alpha, a, lda, x, incx, beta, y, incy)

cdef inline floating dot(int *n, floating *sx, int *incx, floating *sy, int *incy) nogil:
if floating is double:
return cython_blas.ddot(n, sx, incx, sy, incy)
^
------------------------------------------------------------

implicit/_als.pyx:29:31: Calling gil-requiring function not allowed without gil

Error compiling Cython file:
------------------------------------------------------------
...

cdef inline void scal(int *n, floating *sa, floating *sx, int *incx) nogil:
if floating is double:
cython_blas.dscal(n, sa, sx, incx)
else:
cython_blas.sscal(n, sa, sx, incx)
^
------------------------------------------------------------

implicit/_als.pyx:37:25: Calling gil-requiring function not allowed without gil

Error compiling Cython file:
------------------------------------------------------------
...
else:
return cython_blas.sdot(n, sx, incx, sy, incy)

cdef inline void scal(int *n, floating *sa, floating *sx, int *incx) nogil:
if floating is double:
cython_blas.dscal(n, sa, sx, incx)
^
------------------------------------------------------------

implicit/_als.pyx:35:25: Calling gil-requiring function not allowed without gil

Error compiling Cython file:
------------------------------------------------------------
...
cdef inline void posv(char * u, int * n, int * nrhs, floating * a, int * lda, floating * b,
int * ldb, int * info) nogil:
if floating is double:
cython_lapack.dposv(u, n, nrhs, a, lda, b, ldb, info)
else:
cython_lapack.sposv(u, n, nrhs, a, lda, b, ldb, info)
^
------------------------------------------------------------

implicit/_als.pyx:44:27: Calling gil-requiring function not allowed without gil

Error compiling Cython file:
------------------------------------------------------------
...
cython_blas.sscal(n, sa, sx, incx)

cdef inline void posv(char * u, int * n, int * nrhs, floating * a, int * lda, floating * b,
int * ldb, int * info) nogil:
if floating is double:
cython_lapack.dposv(u, n, nrhs, a, lda, b, ldb, info)
^
------------------------------------------------------------

implicit/_als.pyx:42:27: Calling gil-requiring function not allowed without gil

Error compiling Cython file:
------------------------------------------------------------
...
cdef inline void gesv(int * n, int * nrhs, floating * a, int * lda, int * piv, floating * b,
int * ldb, int * info) nogil:
if floating is double:
cython_lapack.dgesv(n, nrhs, a, lda, piv, b, ldb, info)
else:
cython_lapack.sgesv(n, nrhs, a, lda, piv, b, ldb, info)
^
------------------------------------------------------------

implicit/_als.pyx:51:27: Calling gil-requiring function not allowed without gil

Error compiling Cython file:
------------------------------------------------------------
...
cython_lapack.sposv(u, n, nrhs, a, lda, b, ldb, info)

cdef inline void gesv(int * n, int * nrhs, floating * a, int * lda, int * piv, floating * b,
int * ldb, int * info) nogil:
if floating is double:
cython_lapack.dgesv(n, nrhs, a, lda, piv, b, ldb, info)
^
------------------------------------------------------------

implicit/_als.pyx:49:27: Calling gil-requiring function not allowed without gil
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-build-5s7y1gnq/implicit/setup.py", line 111, in <module>
ext_modules=define_extensions(use_cython),
File "/tmp/pip-build-5s7y1gnq/implicit/setup.py", line 47, in define_extensions
return cythonize(modules)
File "/home/rof/.pyenv/versions/3.5.4/lib/python3.5/site-packages/Cython/Build/Dependencies.py", line 1039, in cythonize
cythonize_one(*args)
File "/home/rof/.pyenv/versions/3.5.4/lib/python3.5/site-packages/Cython/Build/Dependencies.py", line 1161, in cythonize_one
raise CompileError(None, pyx_file)
Cython.Compiler.Errors.CompileError: implicit/_als.pyx
implicit/_als.pyx: cannot find cimported module 'scipy.linalg.cython_lapack'
implicit/_als.pyx: cannot find cimported module 'scipy.linalg.cython_blas'
Compiling implicit/_als.pyx because it depends on /home/rof/.pyenv/versions/3.5.4/lib/python3.5/site-packages/Cython/Includes/libc/string.pxd.
Compiling implicit/_nearest_neighbours.pyx because it depends on /home/rof/.pyenv/versions/3.5.4/lib/python3.5/site-packages/Cython/Includes/libcpp/vector.pxd.
[1/2] Cythonizing implicit/_als.pyx

----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-5s7y1gnq/implicit/

Cython is a requirement for this package

Hi,
Great package. Thanks for sharing it.
I just want to note that Cython is a requirement for this package, and I had problems installing it until I discovered that.

thanks.
Imri

Difficulty with bm25_weight in Linux install

On Ubuntu 16.04, when I try to import the bm25_weight function I get the following error:

import implicit
File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/implicit/init.py", line 3, in
from . import nearest_neighbours
File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/implicit/nearest_neighbours.py", line 7, in
from ._nearest_neighbours import all_pairs_knn
ImportError: /home/ubuntu/anaconda3/lib/python3.6/site-packages/implicit/_nearest_neighbours.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZdlPvm

knn_test.py failed

Hi Ben,
I was testing the new version,
but it seems the import of nearest_neighbours has failed.

vagrant@deep-learning:~/implicit/tests$ python als_test.py 
.
----------------------------------------------------------------------
Ran 1 test in 3.561s

OK
vagrant@deep-learning:~/implicit/tests$ python knn_test.py 
E
======================================================================
ERROR: testNearestNeighbours (__main__.NearestNeighboursTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "knn_test.py", line 20, in testNearestNeighbours
    counts = implicit.nearest_neighbours.tfidf_weight(counts).tocsr()
AttributeError: 'module' object has no attribute 'nearest_neighbours'

----------------------------------------------------------------------
Ran 1 test in 0.001s

FAILED (errors=1)
vagrant@deep-learning:~/implicit/tests$ 

Question: how can I use multiple implicit feedback?

Thanks for the wonderful lib.
I have a question: what if I have multiple kinds of implicit feedback per item, like user rating, play time, and play count?
How do I combine these signals and apply them as the input confidence for this algorithm?
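One common (hedged) approach is to collapse the different signals into a single confidence weight per user/item pair before building the sparse matrix, for example a weighted sum (a sketch assuming a pandas DataFrame df with one row per user/item pair and hypothetical column names and weights):

df["confidence"] = (1.0 * df["rating"].fillna(0)
                    + 0.1 * df["play_time"].fillna(0)
                    + 2.0 * df["play_count"].fillna(0))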

recommend N top recommendations

Currently, if you ask for more recommendations than are available, you get an out-of-bounds error.

Solution: return as many as possible instead.

I can fix that and make a PR.
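Until that lands, a workaround sketch is to clamp N to the number of items before calling recommend:

# never ask for more items than the model actually has
n_items = model.item_factors.shape[0]
recommendations = model.recommend(userid, user_items, N=min(N, n_items))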

Does not install on Windows

pip install implicit fails on Windows 7.

After first fixing the issue of not finding the VS C++ compiler (by installing the appropriate VS build tools), the setup now reports "failed building wheel" for this package.

Additionally, this message appears:

 C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\x86_amd64\cl.exe
/c /nologo /Ox /W3 /GL /DNDEBUG /MD -IC:\Users\XXX\AppData\Local\Continuum\
Anaconda3\include -IC:\Users\XXX\AppData\Local\Continuum\Anaconda3\include
"-IC:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\INCLUDE" "-IC:\Program
 Files (x86)\Windows Kits\10\include\10.0.10240.0\ucrt" "-IC:\Program Files (x86
)\Windows Kits\8.1\include\shared" "-IC:\Program Files (x86)\Windows Kits\8.1\in
clude\um" "-IC:\Program Files (x86)\Windows Kits\8.1\include\winrt" /Tcimplicit\
_implicit.c /Fobuild\temp.win-amd64-3.5\Release\implicit\_implicit.obj -Wno-unus
ed-function -O3 -fopenmp -ffast-math
    cl : Command line error D8021 : invalid numeric argument '/Wno-unused-function'

I tried checking for the CFLAGS in a previously mentioned issue (that one was on Mac OSX though):

C:\Users\XXX\implicit>python
Python 3.5.2 |Anaconda 4.1.1 (64-bit)| (default, Jul  5 2016, 11:41:13) [MSC v.1
900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sysconfig
>>> print(sysconfig.get_config_var("CFLAGS"))
None

Any idea what is going on here?

AWS Lambda

Is there any way this can take advantage of AWS Lambda for big input files and output?

No iteration log

I tried the following code to get an iteration log:

from implicit.als import AlternatingLeastSquares

model = AlternatingLeastSquares(factors=20,
                                regularization=0.1,
                                calculate_training_loss=True,
                                iterations=150,
                                num_threads=4)

Strangely, there was no log of the iteration loss. I checked the source of als.py, but don't know why.
It runs OK; there's just no log to check the output.
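A hedged guess: the per-iteration loss may be emitted through Python's standard logging rather than printed, so enabling DEBUG-level logging before fitting could surface it:

import logging

logging.basicConfig(level=logging.DEBUG)   # or target just the "implicit" logger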

[question] implicit recommendations + time decaying confidence

I don't see much about this situation in the literature, but suppose we hypothesize that our confidence in a user's implicit feedback should decrease as time passes (i.e. we are more confident that they're interested in an item they recently interacted with than in one they interacted with days/weeks ago). Any advice/thoughts on how to approach this? I am currently applying a time decay function directly to the R_ui matrix, but if you have more experience with this setting I'd be happy to hear about it.
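Not authoritative, but one way to implement that directly on the sparse matrix is to scale each stored value by an exponential decay of its age before fitting (a sketch, assuming an age_days array aligned with the matrix's stored data):

import numpy as np

half_life = 30.0                              # days; a tunable assumption
decay = np.power(0.5, age_days / half_life)   # age_days aligned with R_ui.data
R_ui.data = R_ui.data * decay                 # R_ui: coo_matrix of raw interactions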

ERROR: Not building on Windows or Bash for Windows

So I am attempting to build this library, but I keep getting this error about no method called posv being found:

Downloading/unpacking implicit
  Downloading implicit-0.1.7.tar.gz (161kB): 161kB downloaded
  Running setup.py (path:/tmp/pip_build_skylion/implicit/setup.py) egg_info for package implicit

    Error compiling Cython file:
    ------------------------------------------------------------
    ...
                        # Since we've already added in YtY, we subtract 1 from confidence
                        for j in range(factors):
                            temp = (confidence - 1) * Y[i, j]
                            axpy(&factors, &temp, &Y[i, 0], &one, A + j * factors, &one)

                    posv("U", &factors, &one, A, &factors, b, &factors, &err);
                        ------------------------------------------------------------

    implicit/_implicit.pyx:97:20: no suitable method found

    Error compiling Cython file:
    ---------------------------------

I am really wondering what could be the issue. I can look into trying to get this to build later, but I have tried on both bash for Windows and Windows with the proper libraries installed. I am wondering if this is an issue with newer versions of scipy or cython. Here is my scipy config info.

blas_info:
    libraries = ['blas']
    library_dirs = ['/usr/lib']
    language = f77
lapack_info:
    libraries = ['lapack']
    library_dirs = ['/usr/lib']
    language = f77
atlas_threads_info:
  NOT AVAILABLE
blas_opt_info:
    libraries = ['blas']
    library_dirs = ['/usr/lib']
    language = f77
    define_macros = [('NO_ATLAS_INFO', 1)]
openblas_info:
  NOT AVAILABLE
atlas_blas_threads_info:
  NOT AVAILABLE
lapack_opt_info:
    libraries = ['lapack', 'blas']
    library_dirs = ['/usr/lib']
    language = f77
    define_macros = [('NO_ATLAS_INFO', 1)]

[question] Getting recommendations for 14M+ users quickly

Hi Ben, thanks for the package. I have a dataset of 14M+ users and 1M+ items. The model fitting takes around 70 minutes, but getting recommendations for all users would take north of 10 hours. Is there a way to expedite this, perhaps parallelize it? Any suggestions?
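Not an official answer, but one way to speed this up is to skip recommend() and score users in batches with a dense matrix product, taking the top N per row (note this does not filter out already-liked items):

import numpy as np

N, batch = 10, 10_000
item_factors = model.item_factors                           # (n_items, k)
for start in range(0, model.user_factors.shape[0], batch):
    users = model.user_factors[start:start + batch]         # (batch, k)
    scores = users @ item_factors.T                         # (batch, n_items)
    top = np.argpartition(-scores, N, axis=1)[:, :N]        # unsorted top-N item indices
    # ...write out `top` (and the matching scores) for each user in the batch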

Confidence Matrix values should be 0 or 1?

It seems that the code expects a sparse (e.g. csr) input containing the confidences on each of the UI pairs. However, looking at the Hu et al. 2008 paper, it seems like there should still be non-zero confidence for items that are unobserved (r_ui = 0 in the paper's notation), such as from c_ui = 1 + alpha * r_ui. But, doing this would make the confidence matrix dense.

I see that in computing the updates for the latent factor matrices, there is a (C - I) term and a multiplication by P, which transform back to the sparse space, but then shouldn't the input matrix be dense to accommodate the subtraction? (Ideally it isn't, but I'm trying to understand the implementation as written).

Thanks a lot for the explanation!
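As I read the implementation (a hedged reading, not an official answer): with c_ui = 1 + alpha * r_ui, the update term Y^T C^u Y splits into Y^T Y + Y^T (C^u - I) Y, and (C^u - I) is nonzero only at the observed items, so the dense confidence matrix never has to be materialized. A small numeric check of that identity:

import numpy as np

k, n_items = 3, 5
Y = np.random.rand(n_items, k)
alpha_r = np.array([0.0, 4.0, 0.0, 2.0, 0.0])    # stored sparse values (alpha * r_ui)
Cu = np.diag(1.0 + alpha_r)                       # dense confidence matrix for one user

full = Y.T @ Cu @ Y
# only the observed items (1 and 3) contribute beyond the precomputed Y^T Y term
observed = [1, 3]
sparse_trick = Y.T @ Y + (Y[observed].T * alpha_r[observed]) @ Y[observed]
assert np.allclose(full, sparse_trick)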

[Request] Additional filter for items

Hi @benfred,

Can you add to the official library an option for the recommend method to accept an additional list of items that you don't want included in the recommendations? That way we would still get, for example, the top N items (excluding the items from the given list) instead of filtering afterwards, where pruning the returned recommendations leaves fewer than the requested number.

Thanks!

Installing from sources

I'm trying to install the package from sources with the following command:

python setup.py install

However, when I import implicit I get the following error:


In [1]: import implicit
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-1-9eae7608d57c> in <module>()
----> 1 import implicit

/home/agrigorev/soft/implicit/implicit/__init__.py in <module>()
----> 1 from .implicit import alternating_least_squares
      2 
      3 __version__ = '0.1.5'
      4 
      5 __all__ = [alternating_least_squares, __version__]

/home/agrigorev/soft/implicit/implicit/implicit.py in <module>()
      4 import os
      5 import logging
----> 6 from . import _implicit
      7 
      8 log = logging.getLogger("implicit")

ImportError: cannot import name '_implicit'

I assume this indicates that there was a problem with building the Cython source.

What is the correct way of installing the library from source? I'm running Anaconda3 with Python 3.5 under Ubuntu.

undefined symbol: GOMP_parallel

I've got some trouble executing the lastfm.py example:

asegrenev@vw:~/Downloads/implicit$ python3 examples/lastfm.py
Traceback (most recent call last):
File "examples/lastfm.py", line 24, in
from implicit import alternating_least_squares
File "/home/asegrenev/anaconda3/lib/python3.5/site-packages/implicit-0.1.7-py3.5-linux-x86_64.egg/implicit/init.py", line 1, in
from .implicit import alternating_least_squares
File "/home/asegrenev/anaconda3/lib/python3.5/site-packages/implicit-0.1.7-py3.5-linux-x86_64.egg/implicit/implicit.py", line 6, in
from . import _implicit
ImportError: /home/asegrenev/anaconda3/lib/python3.5/site-packages/implicit-0.1.7-py3.5-linux-x86_64.egg/implicit/_implicit.cpython-35m-x86_64-linux-gnu.so: undefined symbol: GOMP_parallel
