Explore TMDb Movies Dataset

Date created

This project was created on 2021-02-09.

Description

This project explores the movie dataset from TMDb, a popular, user-editable database for movies and TV shows. The goal is to identify the features that best predict a movie's return on investment (ROI).

This work was completed as an assignment for the Udacity Data Analyst Nanodegree.

Files used

Explore TMDb Movie Dataset.ipynb: The Jupyter notebook for wrangling and analyzing the data.

tmdb-movies.csv: The movie dataset.

One more file, MovieData.csv, is not included in the repository. It was used to fill in the missing revenue values and was provided by The Numbers, a company that tracks financial movie data. Because it is a proprietary dataset, I could not share it on GitHub. However, you can obtain it for free by filling in their form; they will email you a Dropbox link and a password to download it.

Installation

To run the notebook, you will need Python with numpy, pandas, matplotlib, and seaborn installed. The notebook also uses urllib and json, which ship with Python's standard library and need no separate installation. You can install the libraries individually or with Anaconda, a Python distribution focused on data science. If you're interested in Anaconda, you can follow their installation guide.

You will also need an API key for the OMDb API, a web service that provides detailed information about movies. It is used in this project to fill missing values. You can get a free key by creating an account; the key is sent to the email address you used for registration.
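As a rough illustration of how an OMDb lookup can be made with only urllib and json, here is a minimal sketch. It is not taken from the notebook: the helper names (build_omdb_url, parse_box_office, fetch_movie) are hypothetical, and looking movies up by title with the BoxOffice field as the revenue source is an assumption about how missing values might be filled.

```python
import json
import urllib.parse
import urllib.request

OMDB_URL = "http://www.omdbapi.com/"  # public OMDb API endpoint


def build_omdb_url(api_key, title, year=None):
    """Build a title-lookup URL (hypothetical helper, not from the notebook)."""
    params = {"apikey": api_key, "t": title}
    if year is not None:
        params["y"] = str(year)
    return OMDB_URL + "?" + urllib.parse.urlencode(params)


def parse_box_office(payload):
    """Return the BoxOffice field as whole dollars, or None if it is unavailable."""
    if payload.get("Response") != "True":
        return None  # OMDb reports lookup failures with Response == "False"
    raw = payload.get("BoxOffice", "N/A")
    digits = raw.replace("$", "").replace(",", "").strip()
    return int(digits) if digits.isdigit() else None


def fetch_movie(api_key, title, year=None):
    """Query OMDb over the network and return the decoded JSON as a dict."""
    with urllib.request.urlopen(build_omdb_url(api_key, title, year)) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a valid key, parse_box_office(fetch_movie("YOUR_KEY", "Inception", 2010)) would return an integer when OMDb has a box-office figure for that title, and None otherwise.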

Dataset

This dataset contains information about 10866 movies and has 21 columns. The missing values were filled using both the OMDb API and MovieData.csv. The columns selected for analysis are: id, popularity, budget, revenue, original_title, runtime, genres, release_date, vote_count, vote_average, release_year, and roi.
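The column selection above can be sketched with pandas as follows. This is an illustration under stated assumptions, not the notebook's code: the add_roi helper is hypothetical, and the ROI formula (revenue - budget) / budget is a common definition that is assumed here rather than confirmed by this README. Treating a budget of 0 as missing is also an assumption.

```python
import numpy as np
import pandas as pd

# Columns listed in this README as selected for analysis.
SELECTED_COLUMNS = [
    "id", "popularity", "budget", "revenue", "original_title", "runtime",
    "genres", "release_date", "vote_count", "vote_average", "release_year", "roi",
]


def add_roi(df):
    """Return a copy of df with an `roi` column (assumed formula, see above)."""
    out = df.copy()
    # Assumption: a budget of 0 means "unknown", so it becomes NaN
    # instead of producing a division by zero.
    budget = out["budget"].replace(0, np.nan)
    out["roi"] = (out["revenue"] - budget) / budget
    return out
```

For example, movies = add_roi(pd.read_csv("tmdb-movies.csv"))[SELECTED_COLUMNS] would load the dataset, derive roi, and keep only the listed columns, assuming the remaining eleven columns are present in the CSV under those names.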

Credits

Thanks to the One Million Arab Coders' initiative for offering me a chance to learn data science.

Thanks to Udacity for their great content.

Thanks to TMDb for providing their data to students.

Thanks to OMDb API for their free API keys.

Thanks to The Numbers for sharing their data.
