In this final project, we built and evaluated a recommender system using Spark's alternating least squares (ALS) method on a large dataset (1 million+ records). Each record is a triple (user_id, count, track_id) that measures implicit feedback derived from listening behavior. We also implemented two extensions on top of the baseline collaborative filtering model: modifying the count data to improve model accuracy, and applying t-SNE to the learned representations to visualize the items and users.
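As a rough illustration of the pipeline described above, the following is a minimal sketch, not the project's actual code. It assumes a log(1 + count) transform as one possible count modification (the project's real transform is not stated here), and it assumes hypothetical column names `user_idx` and `track_idx` holding integer-indexed IDs, since Spark's ALS requires numeric user and item columns. The function names and hyperparameter defaults are likewise placeholders.

```python
import math


def compress_count(count: int) -> float:
    """Dampen heavy-tailed play counts with log(1 + count).

    Assumed for illustration only; the project's actual count
    modification may differ.
    """
    if count < 0:
        raise ValueError("play counts must be non-negative")
    return math.log1p(count)


def fit_implicit_als(interactions, rank=10, reg_param=0.1, alpha=1.0):
    """Fit Spark ALS in implicit-feedback mode.

    `interactions` is assumed to be a Spark DataFrame with integer
    columns `user_idx` and `track_idx` (hypothetical names, e.g. the
    output of a StringIndexer) and a numeric `count` column.
    """
    # Imported here so compress_count stays usable without Spark installed.
    from pyspark.ml.recommendation import ALS

    als = ALS(
        userCol="user_idx",
        itemCol="track_idx",
        ratingCol="count",
        implicitPrefs=True,        # treat counts as confidence, not ratings
        rank=rank,                 # dimension of the latent factors
        regParam=reg_param,
        alpha=alpha,               # confidence scaling for implicit feedback
        coldStartStrategy="drop",  # drop NaN predictions for unseen IDs
    )
    return als.fit(interactions)
```

The fitted model exposes `userFactors` and `itemFactors` DataFrames, which are the kind of learned representations that could be collected and passed to a t-SNE implementation for visualization.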
nhuang37 / big-data-project
Project for 'DS-GA 1004 Big Data'