New York City has a significant immigrant population and is very diverse, so comparing demographic factors such as race, income, and gender with SAT scores is a good way to determine whether the SAT is a fair test. For example, if certain racial groups consistently perform better on the SAT, we would have some evidence that the SAT is unfair.
A significant amount of data on student SAT scores by high school has been published, along with additional demographic datasets. The datasets are the following:
- SAT scores by school - SAT scores for each high school in New York City
- School attendance - Attendance information for each school in New York City
- Class size - Information on class size for each school
- AP test results - Advanced Placement (AP) exam results for each high school (passing an optional AP exam in a particular subject can earn a student college credit in that subject)
- Graduation outcomes - The percentage of students who graduated, and other outcome information
- Demographics - Demographic information for each school
- School survey - Surveys of parents, teachers, and students at each school
This project reads the datasets listed above, combines them into a single, clean pandas DataFrame, and processes it further to investigate potential relationships between demographics and SAT scores.
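The combining step described above could look roughly like the sketch below. It is only an illustration under stated assumptions: the tiny in-memory tables, the `DBN` school identifier, and the column names `sat_score` and `female_per` are hypothetical stand-ins, not the project's actual files or schema.

```python
import pandas as pd

# Hypothetical miniature versions of two of the datasets. The "DBN" key and
# the column names are assumptions made for this sketch.
sat = pd.DataFrame({
    "DBN": ["01M292", "01M448", "02M047"],
    "sat_score": ["1122", "1172", "s"],  # "s" stands in for a suppressed value
})
demographics = pd.DataFrame({
    "DBN": ["01M292", "01M448", "02M047", "03M999"],
    "female_per": [45.0, 52.5, 50.2, 48.1],
})

# Convert scores to numbers; entries that cannot be parsed become NaN.
sat["sat_score"] = pd.to_numeric(sat["sat_score"], errors="coerce")

# Keep only schools that appear in both tables.
combined = sat.merge(demographics, on="DBN", how="inner")

# Drop schools that still have no usable SAT score.
combined = combined.dropna(subset=["sat_score"]).reset_index(drop=True)

print(combined)
```

An inner join is used here so that every remaining row has both a score and demographic data; a left join would keep more schools at the cost of missing values that then need to be filled or dropped.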
- Northern districts are characterized by higher safety scores. Specifically, Upper Manhattan, the Bronx, and parts of Queens have higher safety scores on average than Brooklyn.
- SAT scores appear to be distributed unevenly across racial groups, with white and Asian students tending to have higher scores.
- A higher number of students taking at least one AP exam at a high school does not necessarily mean that the school will have higher SAT scores.
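One way relationships like the ones above might be checked is by correlating each factor with the SAT score. The sketch below assumes a combined table with hypothetical columns `sat_score`, `ap_share`, and `safety_score`; the numbers are made up for illustration and are not results from this project.

```python
import pandas as pd

# Made-up illustrative values, one row per school (not project results).
schools = pd.DataFrame({
    "sat_score": [1100, 1250, 1400, 1180, 1320],
    "ap_share": [0.10, 0.35, 0.30, 0.40, 0.15],   # hypothetical column
    "safety_score": [6.0, 6.8, 7.9, 6.5, 7.2],    # hypothetical column
})

# Pearson correlation of every column with the SAT score,
# excluding the score's correlation with itself.
correlations = (
    schools.corr()["sat_score"]
    .drop("sat_score")
    .sort_values()
)

print(correlations)
```

A correlation close to zero suggests no clear linear relationship, while values near 1 or -1 suggest a strong one. Correlation alone does not establish cause, so a scatter plot of each pair is a sensible follow-up.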
[This project was created while studying data science on the Dataquest platform, in an effort to improve my ability to communicate results using visualizations, reason about data from varied sources, and stay motivated to keep applying newly acquired skills, so as to enrich my portfolio of data science projects.]