In this lab, I explore the Airlines dataset and apply feature transformation techniques we learned in the Feature Engineering lesson.
The Airlines dataset contains information on flight prices, routes, and other attributes. We will use this dataset to explore techniques such as one-hot encoding, label encoding, and dimensionality reduction.
This is the link to the dataset: https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBM-ML0232EN-SkillsNetwork/asset/airlines_data.xlsx
To get started, I download the Airlines dataset, which is provided as the airlines_data.xlsx file at the link above.
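As a rough sketch of this step, the workbook can be read straight from the URL with pandas. This assumes the file is an Excel workbook (as the link's extension suggests) and that the `openpyxl` engine is installed, which is not in the package list of this README. The helper names `load_airlines` and `summarize` are illustrative and are not taken from the notebook.

```python
import pandas as pd

# URL copied from this README; the .xlsx extension suggests an Excel workbook.
DATA_URL = (
    "https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/"
    "IBM-ML0232EN-SkillsNetwork/asset/airlines_data.xlsx"
)


def load_airlines(source: str = DATA_URL) -> pd.DataFrame:
    """Read the workbook into a DataFrame (assumes openpyxl is available)."""
    return pd.read_excel(source)


def summarize(df: pd.DataFrame) -> pd.DataFrame:
    """Per-column dtype, missing-value count, and number of distinct values."""
    return pd.DataFrame(
        {
            "dtype": df.dtypes.astype(str),
            "missing": df.isna().sum(),
            "unique": df.nunique(),
        }
    )


# Typical use (requires network access):
#   df = load_airlines()
#   print(df.shape)
#   print(summarize(df))
```

The summary is a quick way to spot which columns are categorical and which contain missing values before choosing an encoding.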
To run the code in the Jupyter Notebook, you will need to have the following packages installed:
* pandas
* numpy
* matplotlib
* seaborn
* scikit-learn

You can install these packages using pip:

    pip install pandas numpy matplotlib seaborn scikit-learn
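To confirm the environment is ready before opening the notebook, a small check like the following can report which of the listed packages are installed. It is only a convenience sketch using the standard library. The function name `installed_versions` is made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as they appear in the pip command of this README.
REQUIRED = ["pandas", "numpy", "matplotlib", "seaborn", "scikit-learn"]


def installed_versions(packages=REQUIRED):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions().items():
    print(f"{name}: {ver or 'NOT INSTALLED'}")
```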
Open the airlines_feature_engineering.ipynb file in Jupyter Notebook and follow the instructions provided in the notebook to perform feature engineering on the Airlines dataset. You will learn techniques such as one-hot encoding, label encoding, and dimensionality reduction.
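For orientation, here is a minimal sketch of the three techniques on a tiny made-up table. The column names (`Airline`, `Total_Stops`, `Duration_min`, `Price`) and the values are assumptions chosen for illustration. They are not taken from the notebook, whose steps may differ.

```python
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Hypothetical toy data standing in for the real dataset.
toy = pd.DataFrame(
    {
        "Airline": ["IndiGo", "Air India", "IndiGo", "SpiceJet"],
        "Total_Stops": ["non-stop", "1 stop", "2 stops", "non-stop"],
        "Duration_min": [170, 445, 1140, 150],
        "Price": [3897, 7662, 13882, 3873],
    }
)

# One-hot encoding: one indicator column per distinct airline.
one_hot = pd.get_dummies(toy["Airline"], prefix="Airline")

# Label encoding: each category becomes an integer code.
# Codes follow the sorted order of the labels, not the number of stops.
encoder = LabelEncoder()
stops_code = encoder.fit_transform(toy["Total_Stops"])

# Dimensionality reduction: standardize the numeric features, then
# project them onto the first two principal components.
features = pd.concat(
    [one_hot.astype(float), toy[["Duration_min", "Price"]].astype(float)],
    axis=1,
)
features["Total_Stops_code"] = stops_code
scaled = StandardScaler().fit_transform(features)
pca = PCA(n_components=2)
components = pca.fit_transform(scaled)

print(one_hot.columns.tolist())
print(dict(zip(encoder.classes_, encoder.transform(encoder.classes_))))
print(components.shape, pca.explained_variance_ratio_)
```

One-hot encoding suits categories with no natural order. Integer codes are more compact, but a model may read an ordering into them, so when the order matters (as with the number of stops) an explicit mapping may be preferable.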
Conclusion

By the end of this lab, I have gained valuable experience in applying feature engineering techniques to real-world datasets. These skills are essential for any data scientist who wants to be able to work with messy data and turn it into actionable insights.