This is my attempt at testing some machine learning algorithms on the Kaggle Titanic dataset. I use a Jupyter Notebook and go through the data analysis and prediction step by step. You can use this notebook if you're a beginner in ML and want a first taste of data analysis with Kaggle datasets.
I recommend using conda to install the required packages. You'll need the following:
- numpy
- scikit-learn
- pandas
- matplotlib
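
If you want to confirm that everything is installed before opening the notebook, a quick check along these lines may help. It is only a sketch and not part of the notebook itself; the helper name `missing_packages` is made up for illustration. Note that scikit-learn is imported under the module name `sklearn`.

```python
import importlib.util

# Maps each package name from the list above to the module name used to import it.
REQUIRED = {
    "numpy": "numpy",
    "scikit-learn": "sklearn",
    "pandas": "pandas",
    "matplotlib": "matplotlib",
}


def missing_packages(required=REQUIRED):
    """Return the package names whose import module cannot be found."""
    return [
        package
        for package, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```

Any package reported as missing can then be installed with conda.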
The aim is to predict which Titanic passengers survived, using the available passenger data.
- We first get familiar with the data.
- We use a simple Decision Tree model to make our first prediction.
- We then use a Random Forest model to make a second prediction.
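
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: to keep it self-contained it builds a small made-up table that borrows a few Titanic column names (`Pclass`, `Sex`, `Age`, `Fare`, `Survived`) instead of loading the Kaggle CSV, and the survival rule, the feature selection, and the hyperparameters are all assumptions chosen for the example.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Made-up stand-in for the Kaggle data (assumption: the real notebook loads the CSV).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "Pclass": rng.integers(1, 4, size=n),              # ticket class 1, 2 or 3
    "Sex": rng.choice(["male", "female"], size=n),
    "Age": rng.uniform(1, 80, size=n).round(),
    "Fare": rng.uniform(5, 100, size=n).round(2),
})
# Invented label rule, used only so the models have something to learn.
df["Survived"] = ((df["Sex"] == "female") | (df["Pclass"] == 1)).astype(int)

# Step 1: get familiar with the data.
print(df.head())
print(df.describe())

# Encode the text column as numbers, since scikit-learn trees need numeric input.
X = df[["Pclass", "Sex", "Age", "Fare"]].copy()
X["Sex"] = X["Sex"].map({"male": 0, "female": 1})
y = df["Survived"]

# Hold out a quarter of the rows to estimate accuracy on unseen passengers.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Step 2: a simple Decision Tree.
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X_train, y_train)
tree_acc = tree.score(X_test, y_test)
print("Decision Tree accuracy:", tree_acc)

# Step 3: a Random Forest, which averages many randomized trees.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
forest_acc = forest.score(X_test, y_test)
print("Random Forest accuracy:", forest_acc)
```

With the real dataset, you would replace the made-up table with the Kaggle training file and handle missing values (for example in `Age`) before fitting.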