This project generates, visualizes, and analyzes random social media data using Python's built-in random module and the pandas, numpy, matplotlib, and seaborn libraries.
The project consists of the following tasks:
- Import required libraries
- Generate random social media data
- Load the data into a pandas DataFrame and explore it
- Clean the data
- Visualize and analyze the data
- Describe conclusions
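The first few tasks could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the category names, column names, row count, seed, and the `generate_tweets` helper are all assumptions made for the example.

```python
import random

import numpy as np
import pandas as pd

# Hypothetical category names; the notebook may use different ones.
CATEGORIES = ["Food", "Travel", "Fitness", "Music", "Family"]


def generate_tweets(n=500, seed=42):
    """Return a DataFrame of n random tweet-like records (illustrative helper)."""
    random.seed(seed)                  # make random.choice reproducible
    rng = np.random.default_rng(seed)  # reproducible numpy generator
    data = {
        "Date": pd.date_range("2021-01-01", periods=n, freq="D"),
        "Category": [random.choice(CATEGORIES) for _ in range(n)],
        "Likes": rng.integers(0, 10_000, size=n),  # upper bound is exclusive
    }
    return pd.DataFrame(data)


df = generate_tweets()

# Basic exploration: first rows, column types, summary stats, category counts.
print(df.head())
df.info()
print(df.describe())
print(df["Category"].value_counts())
```

Fixing the seed keeps the "random" data identical between runs, which makes the later statistics and plots easier to compare.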
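To run this project, you need Python 3 and the following libraries, which can be installed with pip or conda:
- pandas
- numpy
- matplotlib
- seaborn

The random module is also used, but it is part of the Python standard library and does not need to be installed.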
To execute this project, follow these steps:
- Clone this repository to your local machine.
- Open the `social_media_data_analysis.ipynb` file in a Jupyter notebook or any other Python IDE.
- Run the code cells in order, or use the "Run All" option in the notebook.
- Observe the output of each cell, which includes graphs, statistics, and explanations.
- Modify the code or the data as you wish to experiment with different scenarios.
The project generates random tweet data with fields such as date, category, and number of likes. The data is loaded into a pandas DataFrame and explored using various methods. It is then cleaned by removing null and duplicate values and converting the columns to appropriate data types. Finally, it is visualized using seaborn plots such as histograms and boxplots, and analyzed using pandas methods such as mean and groupby.
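As a rough sketch of the cleaning and analysis steps, assuming columns named `Date`, `Category`, and `Likes` (the notebook's actual names may differ), the code might look like this. The `clean_tweets` and `plot_tweets` helpers and the tiny sample frame are hypothetical and exist only for illustration.

```python
import pandas as pd


def clean_tweets(df):
    """Drop null and duplicate rows, then convert column types."""
    out = df.dropna().drop_duplicates().copy()
    out["Date"] = pd.to_datetime(out["Date"])
    out["Likes"] = out["Likes"].astype(int)
    return out


def plot_tweets(df):
    """Draw a histogram of likes and a boxplot of likes per category."""
    # Imported here so the cleaning code runs without the plotting libraries.
    import matplotlib.pyplot as plt
    import seaborn as sns

    fig, axes = plt.subplots(1, 2, figsize=(12, 4))
    sns.histplot(data=df, x="Likes", ax=axes[0])
    sns.boxplot(data=df, x="Category", y="Likes", ax=axes[1])
    plt.tight_layout()
    plt.show()


# Hand-made sample with one null and one duplicate row (illustrative only).
raw = pd.DataFrame({
    "Date": ["2021-01-01", "2021-01-02", "2021-01-02", "2021-01-03", "2021-01-04"],
    "Category": ["Food", "Travel", "Travel", "Food", None],
    "Likes": [10.0, 30.0, 30.0, 20.0, 5.0],
})

clean = clean_tweets(raw)

# Overall mean and per-category mean of likes.
mean_likes = clean["Likes"].mean()
likes_by_category = clean.groupby("Category")["Likes"].mean()
print(mean_likes)
print(likes_by_category)

# plot_tweets(clean)  # uncomment to draw the plots
```

Dropping nulls before duplicates, or the reverse, gives the same result here; the type conversion comes last so that `astype(int)` never encounters a missing value.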