
data_science_general_assembly's Introduction

Data Science Immersive Projects

Hello World! This repository contains multiple projects from my Data Science Immersive course at General Assembly. The first project posted here is a series of three Jupyter notebooks in the Titanic_Series_Commit folder. The series progresses from a basic data analysis workflow to a professional-level workflow that applies machine learning to the Titanic dataset from Kaggle.com. The notebooks are meant to serve as a template and tutorial for moving from basic descriptive analytics to more advanced machine learning techniques. Across the three notebooks, I give a step-by-step breakdown of how to explore, clean, analyze, and visualize the data and how to predict survival for Titanic passengers. I demonstrate everything from importing libraries to uploading predictions to Kaggle in a format it will accept. Finally, I demonstrate how to automate the workflow using pipelines to make the code reusable.

If you would like to learn more about my data science experience, please feel free to check out my blog at https://medium.com/@benweinstein_52172

Titanic Series Commits

Series 1: Data Analysis with the Titanic

This notebook uses pandas to explore, clean, analyze, and visualize basic information about the Titanic dataset and includes findings related to survival based on age, gender, class, and a host of other characteristics.
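As a rough illustration of this kind of analysis (not the notebook's actual code), the sketch below fills missing ages and compares survival rates by sex and class. The rows are made up for the example and only mimic the column names of the Kaggle Titanic dataset; filling ages with the median is one common choice, assumed here.

```python
import pandas as pd

# Made-up rows that mimic the Kaggle Titanic column names (not real data).
df = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1, 0],
    "Pclass": [3, 1, 3, 3, 1, 2],
    "Sex": ["male", "female", "female", "male", "female", "male"],
    "Age": [22.0, 38.0, None, 35.0, 54.0, None],
})

# Clean: fill missing ages with the median of the known ages (an assumed strategy).
df["Age"] = df["Age"].fillna(df["Age"].median())

# Analyze: the mean of a 0/1 column is the survival rate within each group.
survival_by_sex = df.groupby("Sex")["Survived"].mean()
survival_by_class = df.groupby("Pclass")["Survived"].mean()

print(survival_by_sex)
print(survival_by_class)
```

With the real dataset, the same `groupby` pattern can be applied to any categorical column of interest.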

Series 2: Logistic Regression without Pipelines and How to Upload Predictions to Kaggle.com

This is a step-by-step guide to cleaning a dataset, manipulating the features, creating a logistic regression model, obtaining predictions, and placing the predictions in a pandas DataFrame to upload to Kaggle. As you will see, my process is written line by line to make each step easy to follow; however, it is not very reusable. I explain how to make the code reusable with pipelines in the next series.
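The workflow described above might look roughly like the following sketch. It is not the notebook's code: the rows are made up, and the choice of features (`Pclass`, `Age`, `Sex`) and the median fill are assumptions for illustration. The two-column `PassengerId`/`Survived` layout is the format Kaggle's Titanic competition expects for submissions.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Made-up rows that mimic the Kaggle Titanic column names (not real data).
train = pd.DataFrame({
    "PassengerId": [1, 2, 3, 4, 5, 6],
    "Survived": [0, 1, 1, 0, 1, 0],
    "Pclass": [3, 1, 3, 3, 1, 2],
    "Sex": ["male", "female", "female", "male", "female", "male"],
    "Age": [22.0, 38.0, None, 35.0, 54.0, None],
})
test = pd.DataFrame({
    "PassengerId": [7, 8],
    "Pclass": [3, 1],
    "Sex": ["male", "female"],
    "Age": [None, 47.0],
})

# Clean the training set line by line.
median_age = train["Age"].median()
X_train = train[["Pclass", "Age"]].copy()
X_train["Age"] = X_train["Age"].fillna(median_age)
X_train["IsFemale"] = (train["Sex"] == "female").astype(int)
y_train = train["Survived"]

# Repeat every step by hand for the test set (this duplication is what
# makes the line-by-line approach hard to reuse).
X_test = test[["Pclass", "Age"]].copy()
X_test["Age"] = X_test["Age"].fillna(median_age)
X_test["IsFemale"] = (test["Sex"] == "female").astype(int)

# Fit the model and predict.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Build the submission table; pass a file path to to_csv to write a file instead.
submission = pd.DataFrame({
    "PassengerId": test["PassengerId"],
    "Survived": model.predict(X_test),
})
csv_text = submission.to_csv(index=False)
print(csv_text)
```

Note that the median age is computed on the training set and reused for the test set, so no information from the test set leaks into the cleaning step.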

Series 3: Logistic Regression with Pipelines: How to Automate Workflow

In this third and final Jupyter notebook, I demonstrate how to use pipelines to make the code reusable. I also upload the final predictions to Kaggle.com and report back a score.
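A minimal sketch of the pipeline idea, assuming scikit-learn's `Pipeline` and `ColumnTransformer` (the notebook may be organized differently). The rows are made up, and the features and preprocessing steps are illustrative assumptions. The point is that cleaning and modeling live in one object, so the same steps apply to any new data without being rewritten.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Made-up rows that mimic the Kaggle Titanic column names (not real data).
train = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1, 0],
    "Pclass": [3, 1, 3, 3, 1, 2],
    "Sex": ["male", "female", "female", "male", "female", "male"],
    "Age": [22.0, 38.0, None, 35.0, 54.0, None],
})
test = pd.DataFrame({
    "Pclass": [3, 1],
    "Sex": ["male", "female"],
    "Age": [None, 47.0],
})

# Each column gets its own preprocessing step.
preprocess = ColumnTransformer([
    ("age", SimpleImputer(strategy="median"), ["Age"]),
    ("sex", OneHotEncoder(handle_unknown="ignore"), ["Sex"]),
    ("pclass", "passthrough", ["Pclass"]),
])

# Chain preprocessing and the model into a single reusable estimator.
pipeline = Pipeline([
    ("preprocess", preprocess),
    ("model", LogisticRegression(max_iter=1000)),
])

features = ["Pclass", "Sex", "Age"]
pipeline.fit(train[features], train["Survived"])

# The fitted pipeline applies the same cleaning to new data automatically.
predictions = pipeline.predict(test[features])
print(predictions)
```

Because the imputer is fitted inside the pipeline, the median age learned from the training data is reused when predicting on the test set.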

Housing Prices

This project comes from the Kaggle Housing Prices competition. It is currently under construction, so check back later for an update.

Final Project: Predicting Time on Trail For Appalachian Trail Thru Hikers with Machine Learning

data_science_general_assembly's People

Contributors

benw413
