data_mining_ai_exercises

factorize.py builds a utility matrix of user ratings from a movie-ratings dataset in which users have not rated every movie, so the matrix has missing entries. It then factorizes the utility matrix into two matrices, U and V, whose product can be used to fill in the missing ratings for a recommendation system.
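As a rough illustration of the idea, the sketch below stores only the observed ratings and fits U and V by gradient steps on those entries. It is not the code in factorize.py: the function names, the `(user, movie, rating)` input format, the rank, the learning rate, and the regularization term are all assumptions made for the example.

```python
import math
import random


def build_utility_matrix(ratings):
    """Index users and movies; keep only observed cells as {(row, col): rating}."""
    users = sorted({u for u, _, _ in ratings})
    movies = sorted({m for _, m, _ in ratings})
    u_index = {u: i for i, u in enumerate(users)}
    m_index = {m: j for j, m in enumerate(movies)}
    observed = {(u_index[u], m_index[m]): float(r) for u, m, r in ratings}
    return users, movies, observed


def predict(U, V, i, j):
    """Entry (i, j) of the product U * V."""
    return sum(U[i][k] * V[k][j] for k in range(len(V)))


def factorize(observed, n_users, n_movies, rank=2, lr=0.02, reg=0.02,
              epochs=2000, seed=0):
    """Fit U (n_users x rank) and V (rank x n_movies) on observed cells only.

    All hyperparameters here are illustrative assumptions.
    """
    rng = random.Random(seed)
    U = [[rng.uniform(0, 1) for _ in range(rank)] for _ in range(n_users)]
    V = [[rng.uniform(0, 1) for _ in range(n_movies)] for _ in range(rank)]
    cells = list(observed.items())
    for _ in range(epochs):
        rng.shuffle(cells)
        for (i, j), rating in cells:
            err = rating - predict(U, V, i, j)
            for k in range(rank):
                u_ik, v_kj = U[i][k], V[k][j]
                # Gradient step on squared error with L2 regularization.
                U[i][k] += lr * (err * v_kj - reg * u_ik)
                V[k][j] += lr * (err * u_ik - reg * v_kj)
    return U, V


def rmse(observed, U, V):
    """Root-mean-square error over the observed ratings."""
    total = sum((r - predict(U, V, i, j)) ** 2 for (i, j), r in observed.items())
    return math.sqrt(total / len(observed))


if __name__ == "__main__":
    # Made-up ratings; u2 has not rated m2 and u3 has not rated m1.
    ratings = [("u1", "m1", 5), ("u1", "m2", 3), ("u1", "m3", 1),
               ("u2", "m1", 4), ("u2", "m3", 1),
               ("u3", "m2", 2), ("u3", "m3", 5)]
    users, movies, observed = build_utility_matrix(ratings)
    U, V = factorize(observed, len(users), len(movies))
    print("RMSE on observed ratings:", rmse(observed, U, V))
    print("Predicted rating for u2 on m2:", predict(U, V, 1, 1))
```

Missing cells never contribute to the error, which is why the observed ratings are kept in a dictionary instead of a dense matrix with placeholder values.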

pagerank.py takes a dataset of links between nodes (representing webpages) and computes a PageRank score for each node. Dead ends and spider traps in the graph are handled.
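The README does not say how pagerank.py deals with dead ends and spider traps. One common approach, sketched below as an assumption, is power iteration with random teleportation: a damping factor `beta` limits how much rank a spider trap can absorb, and any rank that leaks out through dead ends is spread evenly over all nodes. The `(source, target)` edge format and the parameter values are hypothetical.

```python
def pagerank(edges, beta=0.85, tol=1e-10, max_iter=100):
    """Power iteration over (source, target) edges.

    beta, tol and max_iter are illustrative defaults, not values from the repo.
    """
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(max_iter):
        new_rank = {node: 0.0 for node in nodes}
        for src, targets in out_links.items():
            if not targets:
                continue  # dead end: its rank is re-inserted as leaked mass below
            share = beta * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        # Mass not passed along links (teleport share plus dead-end rank)
        # is redistributed uniformly so the scores keep summing to 1.
        leaked = 1.0 - sum(new_rank.values())
        for node in nodes:
            new_rank[node] += leaked / n
        delta = sum(abs(new_rank[node] - rank[node]) for node in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


if __name__ == "__main__":
    # C links only to itself (spider trap); D has no out-links (dead end).
    edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("C", "C")]
    for node, score in sorted(pagerank(edges).items()):
        print(node, round(score, 4))
```

Without the teleport term, node C in the usage example would eventually collect all of the rank; without the leaked-mass step, the rank sent to D would simply vanish each iteration.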

stochastic_gradient_descent trains a linear regression model using stochastic gradient descent. It takes a TSV file in which each row holds the features of one data point (the test set had 100,000 points and 300 features) and outputs a TSV file containing the coefficient of each feature in the linear regression model.
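A minimal sketch of that pipeline follows. Several details are assumptions, since the README does not specify them: the target value is taken to be the last column of each row, no intercept is fitted (a constant 1.0 column could be added for one), the coefficients are written as a single tab-separated row, and the learning rate and epoch count are arbitrary.

```python
import csv
import io
import random


def read_tsv(handle):
    """Parse rows of floats; ASSUMES the last column is the regression target."""
    rows = [[float(x) for x in row]
            for row in csv.reader(handle, delimiter="\t") if row]
    features = [row[:-1] for row in rows]
    targets = [row[-1] for row in rows]
    return features, targets


def sgd_linear_regression(features, targets, lr=0.01, epochs=50, seed=0):
    """Update the weights after every single data point (one-sample SGD)."""
    rng = random.Random(seed)
    weights = [0.0] * len(features[0])
    order = list(range(len(features)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the points in a new random order each epoch
        for i in order:
            prediction = sum(w * x for w, x in zip(weights, features[i]))
            error = prediction - targets[i]
            # Gradient of 0.5 * error**2 with respect to each weight is error * x.
            weights = [w - lr * error * x for w, x in zip(weights, features[i])]
    return weights


def write_coefficients(weights, handle):
    """Write the coefficients as one tab-separated row (assumed output layout)."""
    csv.writer(handle, delimiter="\t", lineterminator="\n").writerow(weights)


if __name__ == "__main__":
    # Synthetic, noise-free data generated from y = 2*x1 - 3*x2.
    rng = random.Random(1)
    lines = []
    for _ in range(200):
        x1, x2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        lines.append(f"{x1}\t{x2}\t{2 * x1 - 3 * x2}")
    features, targets = read_tsv(io.StringIO("\n".join(lines)))
    weights = sgd_linear_regression(features, targets, lr=0.05)
    out = io.StringIO()
    write_coefficients(weights, out)
    print(out.getvalue())
```

For real files, `open(path, newline="")` can be passed in place of the `io.StringIO` objects. Features on very different scales usually need standardizing first, or a smaller learning rate, for the updates to stay stable.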

naive_bayes.py takes a dataset of positive and negative movie reviews and uses a predetermined set of keywords with high polarity values (e.g. "awful" and "hilarious") as features for Naive Bayes classification. It produces two confusion matrices: one for classification using the entire dataset for both training and testing, and one for classification using k-fold cross-validation. It also generates movie review vectors for a given sentiment value (positive or negative).
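The sketch below shows one way these pieces could fit together; it is not the code in naive_bayes.py. It assumes binary keyword-presence features (a Bernoulli model) with Laplace smoothing, labels named "pos" and "neg", and a keyword list in which only "awful" and "hilarious" come from the description above. Review vectors are generated by sampling each keyword independently from the learned per-class probabilities.

```python
import math
import random

# Only "awful" and "hilarious" appear in the README; the rest are placeholders.
KEYWORDS = ["awful", "hilarious", "boring", "brilliant"]
CLASSES = ("pos", "neg")


def featurize(text):
    """Binary vector: 1 if the keyword occurs in the review (naive whitespace split)."""
    words = set(text.lower().split())
    return [1 if keyword in words else 0 for keyword in KEYWORDS]


def train(vectors, labels):
    """Estimate a prior and per-keyword probabilities for each class present."""
    model = {}
    for label in sorted(set(labels)):
        rows = [v for v, l in zip(vectors, labels) if l == label]
        prior = len(rows) / len(vectors)
        # Laplace smoothing keeps every probability strictly between 0 and 1.
        probs = [(sum(column) + 1) / (len(rows) + 2) for column in zip(*rows)]
        model[label] = (prior, probs)
    return model


def classify(model, vector):
    """Return the label with the highest log-posterior."""
    def log_posterior(label):
        prior, probs = model[label]
        return math.log(prior) + sum(
            math.log(p if x else 1.0 - p) for x, p in zip(vector, probs))
    return max(model, key=log_posterior)


def confusion_matrix(model, vectors, labels):
    """matrix[actual][predicted] -> count."""
    matrix = {a: {p: 0 for p in CLASSES} for a in CLASSES}
    for vector, actual in zip(vectors, labels):
        matrix[actual][classify(model, vector)] += 1
    return matrix


def k_fold_confusion(vectors, labels, k=5):
    """Sum the held-out confusion matrices over k interleaved folds."""
    total = {a: {p: 0 for p in CLASSES} for a in CLASSES}
    for fold in range(k):
        held_out = set(range(fold, len(vectors), k))
        train_idx = [i for i in range(len(vectors)) if i not in held_out]
        model = train([vectors[i] for i in train_idx],
                      [labels[i] for i in train_idx])
        fold_matrix = confusion_matrix(model,
                                       [vectors[i] for i in sorted(held_out)],
                                       [labels[i] for i in sorted(held_out)])
        for actual in CLASSES:
            for predicted in CLASSES:
                total[actual][predicted] += fold_matrix[actual][predicted]
    return total


def generate_vector(model, label, rng):
    """Sample a review vector for a sentiment from the learned probabilities."""
    _, probs = model[label]
    return [1 if rng.random() < p else 0 for p in probs]


if __name__ == "__main__":
    # Made-up reviews for illustration only.
    reviews = [("hilarious and brilliant", "pos"), ("brilliant", "pos"),
               ("hilarious fun", "pos"), ("awful and boring", "neg"),
               ("boring", "neg"), ("awful mess", "neg")]
    vectors = [featurize(text) for text, _ in reviews]
    labels = [label for _, label in reviews]
    model = train(vectors, labels)
    print("Train = test:", confusion_matrix(model, vectors, labels))
    print("3-fold CV:   ", k_fold_confusion(vectors, labels, k=3))
    print("Sampled pos: ", generate_vector(model, "pos", random.Random(0)))
```

Evaluating on the training data tends to give an optimistic confusion matrix, which is why the cross-validated matrix is the more informative of the two.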
