Harvey Benjamin Smith's Projects

analyzing_nyc_highschool_data

One of the most controversial issues in the U.S. educational system is the efficacy of standardized tests, and whether they're unfair to certain groups. Given our prior knowledge of this topic, investigating the correlations between SAT scores and demographics might be an interesting angle to take. We could correlate SAT scores with factors like race, gender, income, and more. The SAT, or Scholastic Aptitude Test, is an exam that U.S. high school students take before applying to college. Colleges take the test scores into account when deciding who to admit, so it's fairly important to perform well on it. The test consists of three sections, each of which has 800 possible points. The combined score is out of 2,400 possible points (while this number has changed a few times, the data set for our project is based on 2,400 total points). Organizations often rank high schools by their average SAT scores. The scores are also considered a measure of overall school district quality. New York City makes its data on high school SAT scores available online, as well as the demographics for each high school.
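As a rough illustration of the kind of correlation this project describes, the sketch below computes a combined SAT score from three 800-point sections and a Pearson correlation against one demographic column. The function names, the column choice, and the toy values are all hypothetical and are not taken from the NYC data set or from the repository's code.

```python
from math import sqrt


def combined_sat(reading, math_score, writing):
    """Sum three section scores (each out of 800) into a score out of 2,400."""
    return reading + math_score + writing


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Raises ZeroDivisionError if either sequence is constant.
    return cov / sqrt(var_x * var_y)


# Toy, made-up rows: (reading, math, writing, hypothetical demographic percentage)
toy_schools = [
    (400, 420, 390, 80.0),
    (450, 470, 440, 65.0),
    (500, 520, 490, 50.0),
    (550, 580, 540, 30.0),
]

sat_totals = [combined_sat(r, m, w) for r, m, w, _ in toy_schools]
demographic_pct = [pct for _, _, _, pct in toy_schools]
r_value = pearson_r(sat_totals, demographic_pct)
print(f"correlation on toy data: {r_value:.3f}")
```

A coefficient near 1 or -1 on real data would only indicate a linear association between the two columns; it would not by itself show that the test is unfair to any group.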

annual_medical_bill_predictor

Machine learning project that uses the TPOT library to find the best model for predicting a client's annual medical expenses, in order to inform decisions on premium pricing

black_friday_analysis_r

An exploratory analysis in R of the different variables involved in the purchasing patterns of consumers on Black Friday in three different kinds of cities

blog

Data, code, and scripts for the analysis in the Mode blog.

cia_statements_word_counter

This repo contains a dataset of CIA statements and a Python file with a function that takes a year as input and finds the most common words longer than 5 characters
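A minimal sketch of how such a function could look, assuming the statements are available as (year, text) pairs. The name most_common_words, its parameters, and the sample statements are hypothetical; the repository's actual data format and implementation may differ.

```python
import re
from collections import Counter


def most_common_words(statements, year, min_length=5, top_n=10):
    """Return the top_n most frequent words longer than min_length characters
    among the statements made in the given year.

    statements is assumed to be an iterable of (year, text) pairs.
    """
    counts = Counter()
    for stmt_year, text in statements:
        if stmt_year != year:
            continue
        # Lowercase and split on anything that is not a letter or apostrophe.
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if len(w) > min_length)
    return counts.most_common(top_n)


# Made-up sample statements, for illustration only.
sample_statements = [
    (2010, "Agency agency statement about security security security"),
    (2011, "Different different wording"),
]

print(most_common_words(sample_statements, 2010, top_n=2))
```

Filtering by year before counting keeps the counter small, and Counter.most_common handles the ranking.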

diabetes_prediction

Streamlit web app featuring a machine learning model that classifies cases as positive or negative for diabetes, based on the Pima Indians dataset

good-credit-or-bad-credit

Classifying customers as either good or bad candidates for loans, and dealing with minority classes in classification

gun_deaths

Exploratory data analysis of gun-related deaths in the United States from 2012 to 2014

movie_recomendations_pyspark

This repo stores the PySpark code I wrote while practicing Spark with the pyspark library on a cluster set up on Hadoop. I then use ALS to set up a simple user-based recommendation engine

pyspark_demographics

This simple exploratory analysis was a chance for me to play around with a large dataset and to display stock data using PySpark DataFrames.

python

Some small data analysis projects in Python

pytricks

Collection of less popular features and tricks for the Python programming language

salary_predictions

Machine learning project that compares different models for predicting potential employee salaries