Faraaz Arsath's Projects
Data Cleaning - The Movies Dataset - These files contain metadata for all 45,000 movies listed in the Full MovieLens Dataset. Data points include cast, crew, plot keywords, budget, revenue, posters, release dates, languages, production companies, countries, vote counts and vote averages.
Customer Credit Card Analysis
This repository includes examples for Decision Tree Classification and Decision Tree Regression.
Faraaz Portfolio
Data Preprocessing
Logistic Regression - This dataset contains information about users of a social network: user ID, gender, age, and estimated salary. A car company has just launched a new luxury SUV, and the goal is to predict which of these users will buy it. The last column records whether each user bought the SUV (yes or no). The model predicts the purchase decision from two variables, age and estimated salary, so the feature matrix consists of only those two columns. The aim is to capture the relationship between a user's age and estimated salary and their decision to purchase the SUV.
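As a rough illustration of the idea described above, the sketch below fits a logistic regression on two features (age and estimated salary) using plain batch gradient descent and only the Python standard library. The rows are made up for the example and are not taken from the actual dataset; the function names (`fit_logistic`, `scale`, `predict_purchase`) are hypothetical, and the repository itself may well use a library implementation instead.

```python
import math
from statistics import mean, pstdev

def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Batch gradient descent on the log-loss; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this sample: predicted probability minus label
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

# Made-up rows of (age, estimated_salary, purchased); NOT values from the real dataset
rows = [(22, 20000, 0), (25, 30000, 0), (28, 25000, 0), (30, 40000, 0),
        (45, 90000, 1), (50, 110000, 1), (48, 85000, 1), (55, 120000, 1)]

# Standardise each feature so age and salary contribute on a comparable scale
columns = list(zip(*[(age, salary) for age, salary, _ in rows]))
means = [mean(col) for col in columns]
stds = [pstdev(col) for col in columns]

def scale(features):
    """Apply the training-set mean and standard deviation to one sample."""
    return [(v - m) / s for v, m, s in zip(features, means, stds)]

X = [scale((age, salary)) for age, salary, _ in rows]
y = [label for _, _, label in rows]
weights, bias = fit_logistic(X, y)

def predict_purchase(age, salary):
    """Return 1 if the modelled purchase probability is at least 0.5, else 0."""
    z = sum(w * x for w, x in zip(weights, scale((age, salary)))) + bias
    return int(sigmoid(z) >= 0.5)
```

Calling `predict_purchase(24, 22000)` or `predict_purchase(52, 115000)` then classifies a new user from the same two variables. Standardising the features matters here because salary values are several orders of magnitude larger than ages.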
KNN Algorithm
This repository analyzes cricket chirps, brain-body weight, and salary discrimination data using linear regression, visualization, R-squared, and correlation assessments.
Data about the retail price of 2005 General Motors cars can be found in car_data.csv.
Dataset from the USA Forensic Science Service describing 6 types of glass, defined in terms of their oxide content (e.g. Na, Fe, K). The task is to classify the glass types using a K-Nearest Neighbor (KNN) classifier.
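The KNN idea mentioned above can be sketched in a few lines: a sample is assigned the majority label among its k closest training points. This is a minimal standard-library sketch, not the repository's code; the feature vectors and the labels `type_A` / `type_B` are invented for illustration and do not come from the glass dataset.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training sample
    distances = [(math.dist(x, query), label) for x, label in zip(train_X, train_y)]
    distances.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Made-up two-feature samples (think of two oxide percentages); NOT real data
train_X = [(13.0, 1.0), (13.2, 1.1), (12.9, 0.9), (14.5, 3.0), (14.7, 3.2), (14.4, 2.9)]
train_y = ["type_A", "type_A", "type_A", "type_B", "type_B", "type_B"]

prediction = knn_predict(train_X, train_y, (13.1, 1.0), k=3)
```

Because KNN relies on raw distances, features measured on very different scales are usually standardised before classification.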
This repository is a collection of Python scripts showcasing various data manipulation and analysis tasks using the Pandas library.
This repository contains Python scripts for assessing and categorizing student performance data from two CSV files. The tasks include categorizing students based on their CodeKata scores.
This Python script provides a simple user registration system with username and password validation using regular expressions. It also includes a password recovery option.
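A registration flow of this kind might look roughly like the sketch below. The validation rules, the in-memory `users` dictionary, and the function names are all assumptions made for illustration; the actual script's rules are not stated in the description.

```python
import re

# Hypothetical rules, chosen only for this example:
# username: 3-20 characters, starts with a letter, then letters, digits or underscores
USERNAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]{2,19}")
# password: at least 8 characters with a lowercase letter, an uppercase letter,
# a digit and a non-alphanumeric symbol
PASSWORD_RE = re.compile(r"(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[^A-Za-z0-9]).{8,}")

def is_valid_username(username: str) -> bool:
    return USERNAME_RE.fullmatch(username) is not None

def is_valid_password(password: str) -> bool:
    return PASSWORD_RE.fullmatch(password) is not None

def register(users: dict, username: str, password: str) -> bool:
    """Store the user if the name is free and both fields pass validation."""
    if username in users:
        return False
    if not (is_valid_username(username) and is_valid_password(password)):
        return False
    users[username] = password  # illustration only; a real system would store a salted hash
    return True

def recover_password(users: dict, username: str):
    """Return the stored password for a known user, or None if the user is unknown."""
    return users.get(username)
```

For example, `register({}, "faraaz_01", "Str0ng!pass")` succeeds under these assumed rules, while a password such as `"weakpass"` is rejected.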
This repository covers various NumPy tasks provided as an academic assignment.
This GitHub repository contains the analysis of drug safety data using hypothesis testing.
K-means Clustering Algorithm: an example of an unsupervised machine learning algorithm, used here to find hidden patterns in the dataset.
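To make the clustering idea concrete, here is a minimal sketch of Lloyd's algorithm (the usual k-means procedure) using only the standard library. The points are made up, and the naive "first k points" initialisation is a simplification for the example; practical implementations typically use random restarts or k-means++ seeding.

```python
import math

def kmeans(points, k, iterations=100):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    centroids = list(points[:k])  # naive start: the first k points (assumption for this sketch)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its assigned points
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # leave an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return centroids, labels

# Made-up 2-D points forming two loose groups; NOT taken from any real dataset
points = [(1.0, 1.0), (9.0, 9.5), (1.5, 2.0), (8.5, 9.0), (1.0, 1.5), (9.5, 8.5)]
centroids, labels = kmeans(points, k=2)
```

No labels are supplied at any point; the grouping emerges only from distances between points, which is what makes the method unsupervised.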
Petrol Consumption Prediction - Multiple Linear Regression
Customer Segmentation of an E-commerce Purchase Database
Ensemble Learning Techniques - Breast Cancer Classification
This repository contains Python scripts and resources for analyzing region-wise rainfall data across India. The dataset provides detailed information on monthly rainfall for various districts, allowing for insights into regional climate patterns. The provided code snippets guide users through tasks such as data import, cleaning, and analysis.