Movies-HiveQL

This repository contains brief HiveQL query examples run against an open movie-ratings dataset.

Dataset

The data for this project is the MovieLens 100K dataset, available at https://grouplens.org/datasets/movielens/. Only the u.data and u.item tables are used.

u.data

This file contains 100,000 ratings by 943 users on 1,682 items. It is a tab-separated list of:

 user id | item id | rating | timestamp 

The timestamp is in Unix seconds (seconds since 1970-01-01 UTC).
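As a minimal sketch of how this layout could be mapped to a Hive table: the table name u_data, the column names, and the local path ml-100k/u.data are assumptions chosen for illustration and are not taken from this repository.

```sql
-- Hypothetical table for u.data; names and path are illustrative assumptions.
CREATE TABLE IF NOT EXISTS u_data (
  user_id   INT,
  item_id   INT,
  rating    INT,
  rating_ts BIGINT   -- Unix seconds
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;

-- Load the tab-separated file from an assumed local path.
LOAD DATA LOCAL INPATH 'ml-100k/u.data' OVERWRITE INTO TABLE u_data;

-- Preview a few rows, converting the Unix timestamp to a readable string.
SELECT user_id,
       item_id,
       rating,
       from_unixtime(rating_ts) AS rated_at
FROM u_data
LIMIT 5;
```

Storing the timestamp as BIGINT keeps the raw value intact; from_unixtime is applied only at query time.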

u.item

This file contains information about each item (movie). It is a pipe-separated (|) list of:

movie id | movie title | release date | video release date | IMDb URL | unknown | Action | Adventure | Animation | Children's | Comedy | Crime | Documentary | Drama | Fantasy | Film-Noir | Horror | Musical | Mystery | Romance | Sci-Fi | Thriller | War | Western

The last 19 fields are genre flags: 1 indicates the movie belongs to the genre and 0 indicates it does not. A movie can belong to several genres at once.
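A possible Hive table for this file is sketched below. The table name u_item, the column names, and the path ml-100k/u.item are assumptions for illustration. The sketch also assumes the fields in the file are delimited by the pipe character, as in the field list above, and keeps the date columns as strings.

```sql
-- Hypothetical table for u.item; names and path are illustrative assumptions.
CREATE TABLE IF NOT EXISTS u_item (
  movie_id           INT,
  movie_title        STRING,
  release_date       STRING,
  video_release_date STRING,
  imdb_url           STRING,
  -- 19 genre flags: 1 = belongs to the genre, 0 = does not
  genre_unknown      INT,
  action             INT,
  adventure          INT,
  animation          INT,
  childrens          INT,
  comedy             INT,
  crime              INT,
  documentary        INT,
  drama              INT,
  fantasy            INT,
  film_noir          INT,
  horror             INT,
  musical            INT,
  mystery            INT,
  romance            INT,
  sci_fi             INT,
  thriller           INT,
  war                INT,
  western            INT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;

-- Load the file from an assumed local path.
LOAD DATA LOCAL INPATH 'ml-100k/u.item' OVERWRITE INTO TABLE u_item;
```

The unknown genre column is named genre_unknown here only to avoid any clash with SQL keywords; any other valid identifier would work as well.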

Skills

This analysis relies on HiveQL statements and uses the SELECT, WHERE, GROUP BY, ORDER BY, LIMIT and JOIN clauses.

Analysis

Queries on this dataset include:

  • How many records are in each table?
  • What are the names of the movies released in 1990?
  • What are the movie IDs of the 10 films that received the most ratings?
  • What are the titles of the 10 movies with the most ratings?
  • Which sci-fi movie has the highest average rating?
  • Are there any movies with no ratings?
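
The questions above could be approached along the following lines. This is a sketch under stated assumptions, not the repository's actual queries: it assumes the ratings sit in a table named u_data (user_id, item_id, rating, rating_ts) and the movies in a table named u_item (movie_id, movie_title, release_date, one 0/1 column per genre such as sci_fi). All of these names are hypothetical. The 1990 filter further assumes that release_date is a string ending in the four-digit year.

```sql
-- 1. How many records are in each table?
SELECT COUNT(*) AS num_ratings FROM u_data;
SELECT COUNT(*) AS num_movies  FROM u_item;

-- 2. Names of the movies released in 1990
--    (assumes release_date is a string ending in the year).
SELECT movie_title
FROM u_item
WHERE release_date LIKE '%1990';

-- 3. Movie IDs of the 10 films that received the most ratings.
SELECT item_id, COUNT(*) AS num_ratings
FROM u_data
GROUP BY item_id
ORDER BY num_ratings DESC
LIMIT 10;

-- 4. Titles of the 10 movies with the most ratings.
SELECT i.movie_title, COUNT(*) AS num_ratings
FROM u_data d
JOIN u_item i ON d.item_id = i.movie_id
GROUP BY i.movie_id, i.movie_title
ORDER BY num_ratings DESC
LIMIT 10;

-- 5. Sci-fi movie with the highest average rating.
SELECT i.movie_title,
       AVG(d.rating) AS avg_rating,
       COUNT(*)      AS num_ratings
FROM u_data d
JOIN u_item i ON d.item_id = i.movie_id
WHERE i.sci_fi = 1
GROUP BY i.movie_id, i.movie_title
ORDER BY avg_rating DESC
LIMIT 1;

-- 6. Movies with no ratings: keep every movie, then filter to the
--    rows where the join found no matching rating.
SELECT i.movie_id, i.movie_title
FROM u_item i
LEFT OUTER JOIN u_data d ON i.movie_id = d.item_id
WHERE d.item_id IS NULL;
```

Query 5 returns a single row even when several movies tie for the top average, and a movie with very few ratings can rank first. Raising the LIMIT or adding a HAVING COUNT(*) threshold is one way to examine that.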

Contributors

colleenbobbie
