sukhman-singh-1612 / data-science-projects

Explore my diverse collection of projects showcasing machine learning, data analysis, and more. Organized by project, each directory contains code, datasets, documentation, and resources. Dive in to discover insights and techniques in data science. Reach out for collaborations and feedback.

Home Page: https://sukhman-singh-1612.github.io/data_science/

License: MIT License

Jupyter Notebook 94.91% HTML 5.09%
data-science data-science-portfolio data-science-projects

data-science-projects's Introduction

Data Science Projects

Welcome to my Data Science Projects Repository! This repository contains a collection of my data science projects, showcasing my skills and expertise in the field. Each project demonstrates different aspects of data analysis, machine learning, and visualization.

Projects

  1. Breast Cancer Prediction
    • Description: The project predicts the diagnosis (M = malignant, B = benign) of breast cancer
    • Technologies Used: The notebooks use Decision Tree Classification and Logistic Regression
    • Results: Logistic regression gave 97% accuracy and the decision tree gave 93.5% accuracy
  2. Red Wine Quality Prediction
    • Description: The project predicts the quality of the wine as 0 or 1: 1 for good quality and 0 for bad quality
    • Technologies Used: The notebooks use logistic regression, support vector machine, decision tree and KNN
    • Results: The logistic regression model performs best with an accuracy of 86.67%
  3. Heart Stroke Prediction
    • Description: The project predicts the risk of heart stroke from the person's demographics and medical information
    • Technologies Used: The notebooks use logistic regression, support vector machine, decision tree and KNN
    • Results: Logistic regression, SVM and KNN perform best with 93.8% accuracy
  4. House Price Prediction
    • Description: The project predicts the house price from variables such as location, area, bedroom and bathroom count, and many more.
    • Technologies Used: The notebooks use Linear Regression, Ridge Regression and Random Forest Regressor
    • Results: The Random Forest Regressor performed best with an accuracy of 87.89%
  5. Titanic Survival Prediction
    • Description: The project predicts survival in the Titanic disaster based on socio-economic measures
    • Technologies Used: The notebooks use Decision Tree Classifier
    • Results: The Decision Tree Classifier performed well on the test dataset with an accuracy of 89.5%
  6. Diamond Price Prediction
    • Description: The project predicts the price (in US dollars) of diamonds based on their features
    • Technologies Used: The notebooks use Decision Tree Regressor and Random Forest Regressor
    • Results: The Decision Tree Regressor performed well on the test dataset with an accuracy of 96%
  7. Medical Cost Prediction
    • Description: The project predicts the medical treatment cost by analysing the patient's age, gender, BMI, smoking habits, etc.
    • Technologies Used: The notebooks use Linear and Polynomial Regression, Decision Tree and Random Forest Regressor
    • Results: The Decision Tree Regressor and Random Forest Regressor performed well
  8. Room Occupancy Detection
    • Description: The project predicts room occupancy by analyzing sensor data such as temperature, light and CO2 level.
    • Technologies Used: The notebooks use Random Forest Classifier
    • Results: The Random Forest Classifier performed well with an accuracy of 98%
  9. Sleep Disorder Prediction
    • Description: The project aims to predict sleep disorders and their types by analyzing lifestyle and medical variables, such as age, BMI, sleep duration, blood pressure, and more
    • Technologies Used: The notebooks use Random Forest Classifier and Decision Tree Classifier
    • Results: The Random Forest Classifier performed well with an accuracy of 89%
  10. Pima Indians Diabetes Prediction
    • Description: The primary objective of the Pima Indian Diabetes Prediction project is to analyze various medical factors of female patients to predict whether they have diabetes or not.
    • Technologies Used: The notebooks use Logistic Regression, Random Forest Classifier and Support Vector Machine
    • Results: The Logistic Regression performed with an accuracy of 78%.
  11. Bank Customer Churn Prediction
    • Description: The main objective of the Bank Customer Churn Prediction project is to analyze the demographics in order to predict whether a customer will leave the bank or not.
    • Technologies Used: The notebooks use Random Forest Classifier and Decision Tree Classifier
    • Results: The Random Forest Classifier and Decision Tree Classifier performed equally well with an accuracy of 87%
  12. Salary Prediction
    • Description: The main objective of the Salary Prediction project is to analyze the employee's demographics such as age, experience, job title, country and race to predict the salary.
    • Technologies Used: The notebooks use Decision Tree Regressor and Random Forest Regressor
    • Results: The Random Forest Regressor performed best with 94.6% accuracy
  13. Delhi House Price Prediction
    • Description: The primary objective is to develop a predictive model that can accurately estimate the prices of houses based on several key features present in the dataset.
    • Technologies Used: The notebooks use Decision Tree Regressor and Random Forest Regressor
    • Results: The Random Forest Regressor performed best with 84.98% accuracy
  14. Loan Approval Prediction
    • Description: The Loan Approval Prediction project aims to predict whether a loan application will be approved by a bank.
    • Technologies Used: The notebooks use Random Forest Classifier and Decision Tree Classifier
    • Results: The Decision Tree Classifier performed well with an accuracy of 91.4%
  15. Cardiovascular Disease Prediction
    • Description: The Cardiovascular Disease Prediction project aims to predict the occurrence of cardiovascular disease in patients based on their medical records and history.
    • Technologies Used: The notebooks use Random Forest Classifier, Decision Tree Classifier and Logistic Regression
    • Results: The Logistic Regression performed well with an accuracy of 91.4%
  16. Belarus Car Price Prediction
    • Description: The Belarus Car Price Prediction project aims to predict the price of cars in Belarus based on car features.
    • Technologies Used: The notebooks use Decision Tree Regressor
    • Results: The Decision Tree Regressor gave an accuracy of 86.29%
  17. Warranty Claims Fraud Prediction
    • Description: The aim of this data science project is to predict the authenticity of warranty claims by analyzing various factors such as region, product category, claim value, and more.
    • Technologies Used: The notebooks use Decision Tree Classifier, Random Forest Classifier and Logistic Regression
    • Results: All three models gave an accuracy of 91-92%
  18. E-Commerce Product Delivery Prediction
    • Description: The aim of this project is to predict whether products from an international e-commerce company will reach customers on time or not.
    • Technologies Used: The notebooks use Decision Tree Classifier, Random Forest Classifier, Logistic Regression and KNN Classifier
    • Results: The decision tree classifier model performed best with 69% accuracy
  19. Hotel Reservations Cancellation Prediction
    • Description: The aim of this project is to predict which reservations are likely to be cancelled by customers by analyzing various features and variables associated with the reservation.
    • Technologies Used: The notebooks use Decision Tree Classifier, Random Forest Classifier and Logistic Regression.
    • Results: The decision tree classifier model performed best with 85% accuracy
  20. Telecom Customer Churn Prediction
    • Description: The aim of this project is to analyze customer demographics, services, tenure and other variables to predict whether a particular customer will churn or not.
    • Technologies Used: The notebooks use Decision Tree Classifier, Random Forest Classifier and K Nearest Neighbor Classifier.
    • Results: The random forest classifier model performed best with 82% accuracy
  21. SFR Analysis
    • Description: The objective of this project is to analyze the SFR (SpaceFund Realty) of aerospace companies and their missions in order to help investors make better decisions.
    • Technologies Used: The notebooks use Decision Tree Classifier and Random Forest Classifier.
    • Results: The random forest classifier and decision tree classifier gave 87% accuracy.
  22. Indian Used Car Price Prediction
    • Description: The aim of this data science project is to predict the price of used cars in major Indian metro cities.
    • Technologies Used: The notebooks use Decision Tree Regressor and Random Forest Regressor.
    • Results: The random forest regressor gave 87.8% accuracy
  23. Crop Yield Prediction
    • Description: The aim of this data science project is to predict crop yield using the dataset provided from Crop Yield Prediction.
    • Technologies Used: The notebooks use Decision Tree Regressor and Random Forest Regressor.
    • Results: The random forest regressor gave 80.2% accuracy
  24. Osteoporosis Risk Prediction
    • Description: The aim of this project is to predict the risk of osteoporosis in patients using a dataset of patients' medical records.
    • Technologies Used: The notebooks use Logistic Regression, Random Tree, Decision Tree and Support Vector Classifier.
    • Results: The Decision Tree Classifier gave 87% accuracy
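
Most of the classification results above are reported as accuracy, with one model named as the best performer. The following is a minimal sketch of how such a comparison can be tabulated, not the code from the notebooks. The helper names `accuracy` and `rank_models`, the labels, and the predictions are all hypothetical, made up for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def rank_models(y_true, predictions):
    """Return (model name, accuracy) pairs sorted from best to worst."""
    scores = {name: accuracy(y_true, pred) for name, pred in predictions.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical held-out labels and per-model predictions (illustrative only).
y_test = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = {
    "logistic_regression": [1, 0, 1, 1, 0, 0, 1, 1],
    "decision_tree": [1, 0, 0, 1, 0, 1, 1, 1],
}

ranking = rank_models(y_test, predictions)
for name, score in ranking:
    print(f"{name}: {score:.1%}")
```

Accuracy alone can be misleading on imbalanced datasets such as churn or fraud data, so a comparison like this is usually read alongside a confusion matrix.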

License

This project is licensed under the MIT License. You are free to use the code and resources for educational or personal purposes with citation or reference to the original code and resources used.

Contributing

Contributions are welcome! If you would like to contribute to this repository, please follow the guidelines outlined in CONTRIBUTING.md. Any improvements, bug fixes, or additional projects are greatly appreciated.

👋 Join the Discussion!

We believe in the power of community and collaboration. Head over to our Discussion Page to engage with fellow data enthusiasts, share your ideas, ask questions, and contribute to our vibrant community. Whether you're a seasoned data scientist or just starting out, your voice matters! Let's learn, grow, and innovate together. See you there! 🚀

Feedback and Contact

I welcome any feedback, suggestions, or questions you may have about the projects or any kind of sponsorships for the repository. Feel free to reach out to me via email at [email protected]

Enjoy exploring my data science projects!

data-science-projects's Issues

Diabetes Prediction

There is a small mistake in the titles of the confusion matrix and distribution plot in the model evaluation of the recent Pima Indians Diabetes Prediction project. I mistakenly titled the confusion matrix heatmap and distribution plot for the random forest classifier and SVM as logistic regression. However, each plot is created for the intended model with the correct data.
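
One way to avoid this kind of mislabeling is to derive each plot title from the model object instead of typing it by hand. The sketch below is only an illustration of that idea: `plot_title` is a hypothetical helper, and the three classes are empty stand-ins, not the real scikit-learn estimators.

```python
def plot_title(model, plot_kind):
    """Build a plot title from the model's own class name."""
    return f"{plot_kind}: {type(model).__name__}"


# Hypothetical stand-ins so the example runs on its own; real code would
# pass the fitted estimator objects instead.
class LogisticRegression: ...
class RandomForestClassifier: ...
class SVC: ...


models = [LogisticRegression(), RandomForestClassifier(), SVC()]
titles = [plot_title(m, "Confusion matrix") for m in models]
for title in titles:
    print(title)
```

With matplotlib, the returned string could be passed to `ax.set_title(...)` so that a copied plotting cell picks up the right model name automatically.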

Medical cost prediction workbook: cell 177 needs checking

In the medical cost prediction workbook, please check cell 177. The accuracy should be computed from y_test and dtree_pred, but the cell passes something else, which does not make sense:

print('MAE:', mean_absolute_error(y_test, dtree_pred))
print('MSE:', mean_squared_error(y_test, dtree_pred))
print('RMSE:', np.sqrt(mean_squared_error(y_test, dtree_pred)))
print('Accuracy:', dtree.score(x_test,y_test))
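
For context on the last line: in scikit-learn, a regressor's `score(X, y)` method predicts from `X` internally and returns the coefficient of determination (R²), so `dtree.score(x_test, y_test)` should match the R² computed from `y_test` and `dtree_pred`, assuming `dtree_pred` came from `dtree.predict(x_test)`. The value is R² and not a classification accuracy. The sketch below computes the same metrics by hand in plain Python; `regression_report` and the sample numbers are hypothetical and only illustrate the formulas.

```python
import math


def regression_report(y_true, y_pred):
    """Compute MAE, MSE, RMSE and R^2 from true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MSE": mse, "RMSE": math.sqrt(mse), "R2": r2}


# Illustrative values only, not data from the notebook.
y_test = [3.0, 5.0, 7.0, 9.0]
dtree_pred = [2.5, 5.0, 7.5, 9.0]

report = regression_report(y_test, dtree_pred)
for name, value in report.items():
    print(f"{name}: {value:.4f}")
```

Labeling the printed value as "R2" instead of "Accuracy" would make the notebook output less ambiguous.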
