youtube-analysis

I received a Word document named Tasks containing three questions to answer using a Kaggle dataset of YouTube video statistics. The dataset had 20 files: 10 CSV files with video statistics (one per country) and 10 JSON files with category_id values that can be merged with the video statistics files. I converted the JSON files to CSV. To import some of the files into VS Code or JupyterLab, I had to change their encoding to UTF-8 with BOM. After that I started writing the code file named TemplateForCountries.
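The JSON-to-CSV conversion described above could look roughly like the sketch below. It is not the code from the repository: the function name `category_json_to_csv` is hypothetical, and the JSON layout (an `items` list whose entries have an `id` and a `snippet.title`) is an assumption about the category files. The `utf-8-sig` codec is what Python uses to write UTF-8 with a BOM.

```python
import csv
import json


def category_json_to_csv(json_path, csv_path):
    """Write category id/name pairs from a category JSON file to a CSV file."""
    with open(json_path, encoding="utf-8") as f:
        data = json.load(f)

    # Assumed layout: {"items": [{"id": "1", "snippet": {"title": "..."}}]}
    rows = [(int(item["id"]), item["snippet"]["title"]) for item in data["items"]]

    # "utf-8-sig" writes UTF-8 with a leading byte order mark (BOM)
    with open(csv_path, "w", encoding="utf-8-sig", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["category_id", "category_name"])
        writer.writerows(rows)
    return rows
```

Writing the header as `category_id` keeps the column name identical to the one in the video statistics files, so the later merge needs no renaming.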

First I imported the files and checked for missing values, duplicates, and minimum and maximum values (these checks are not included in the code). After that I merged the two datasets for every country on category_id. Then I selected only the columns that would be useful for my analysis. I grouped the data by category_name and combined the categories with a share below 3% into a single Other category. I plotted two pie charts that showed the popular categories in every country. This was needed to answer the first question.
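A minimal pandas sketch of the merge and grouping steps above follows. The function name `category_shares` is hypothetical, and measuring a category's share by its number of videos is an assumption, since the text does not say which quantity the 3% refers to.

```python
import pandas as pd


def category_shares(videos, categories, threshold=0.03):
    """Return each category's share of videos, lumping small ones into 'Other'."""
    # Attach category_name to every video row via the shared category_id column
    merged = videos.merge(categories, on="category_id", how="left")

    # Assumption: share is measured by number of videos per category
    counts = merged["category_name"].value_counts()
    shares = counts / counts.sum()

    # Combine every category below the threshold into a single 'Other' entry
    small = shares < threshold
    result = shares[~small]
    if small.any():
        result = pd.concat([result, pd.Series({"Other": shares[small].sum()})])
    return result
```

The returned Series can be passed to `Series.plot.pie()` (which needs matplotlib installed) to draw a chart like the ones described.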

Then I calculated the ratios of likes, dislikes and views and exported them as an Excel file. Using conditional formatting, I colored the tables for every country so that it was clear at first sight which categories are liked and which are disliked.
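One way to compute such ratios is sketched below. The exact ratios used in the project are not stated, so likes per view and dislikes per view at category level are assumptions here, as are the function name `category_ratios` and the output column names.

```python
import pandas as pd


def category_ratios(merged):
    """Compute likes-per-view and dislikes-per-view for each category."""
    # Sum the raw counts per category before dividing
    totals = merged.groupby("category_name")[["views", "likes", "dislikes"]].sum()

    ratios = pd.DataFrame({
        "likes_per_view": totals["likes"] / totals["views"],
        "dislikes_per_view": totals["dislikes"] / totals["views"],
    })
    # Most liked categories first
    return ratios.sort_values("likes_per_view", ascending=False)
```

The result can be written out with `DataFrame.to_excel`, for example `category_ratios(merged).to_excel("ratios.xlsx")` (the file name is only an illustration); this requires an Excel engine such as openpyxl to be installed.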

The second code file, named PopularChannels, was used to answer the third question. I had to find the channels that are popular in most countries. I merged the two files for every country as before, then grouped by channel_title and calculated the sum of likes, dislikes and views. I sorted the resulting dataset by channel views and selected the top 50 channels of every country. I appended each country's top 50 channels to a combined dataset. I then grouped the combined list by channel_title and sorted the channels by group size, that is, by the number of countries in which each channel appeared. This put the most popular channels on top. I limited the number of channels to 20 because I checked and the 20th channel was the last one that was popular in more than half of the countries. I exported the file as Excel.
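The channel ranking described above can be sketched as follows. This is an illustration under assumptions, not the PopularChannels code itself: the function name, the `frames` dictionary mapping a country code to its DataFrame, and the added `country` column are all hypothetical.

```python
import pandas as pd


def channels_popular_across_countries(frames, top_n=50, keep=20):
    """Count in how many countries each channel reaches the per-country top N."""
    tops = []
    for country, df in frames.items():
        # Total likes, dislikes and views per channel within one country
        totals = df.groupby("channel_title")[["views", "likes", "dislikes"]].sum()
        top = totals.sort_values("views", ascending=False).head(top_n).reset_index()
        top["country"] = country
        tops.append(top)

    # Stack the per-country top lists into one dataset
    combined = pd.concat(tops, ignore_index=True)

    # Group size = number of countries in which the channel made the top N
    appearances = combined.groupby("channel_title").size()
    return appearances.sort_values(ascending=False).head(keep)
```

Called with the defaults, this mirrors the top 50 per country and the final cut to 20 channels; the smaller values of `top_n` and `keep` are only useful for trying it on toy data.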

I created the file Mistakes in Pandas to avoid while following a YouTube video that shows common mistakes beginners make in Pandas.
