notrichbish / airline-ticket-price-prediction

This project covers both Simple Linear Regression and Multiple Linear Regression, which are used to predict airline flight ticket prices. Correlation analysis and time series analysis are performed as well. This project was built in fulfillment of my master's degree.

R 100.00%
applied-statistics correlation-coefficient data-preprocessing exploratory-data-analysis linear-regression machine-learning r timeseries-analysis timeseries-forecasting

airline-ticket-price-prediction's Introduction

Airline Ticket Price Prediction using Correlation and Regression analysis

This project was one of the requirements of my postgraduate module, Applied Statistics. The main aim of this project is to build an accurate model for predicting airline ticket prices from the available features. The machine learning models used in this project are Simple Linear Regression and Multiple Linear Regression. Additionally, a time series analysis using AUTO ARIMA is performed to forecast the price for a particular airline in 2023. The main process flow of this project is: Exploratory Data Analysis, Data Pre-processing, Correlation Analysis, Model Training, Time Series Analysis, and Hypothesis Testing using ADF.

The project is coded in R Language using the R Studio IDE.

There are 2 datasets used in this project, both located in the "Dataset" folder.

The full code can be viewed in the "Code.R" file.

If anyone wants to use part of the code, please reference it. Thanks.

Executive Summary

Current research within this domain implies that airline ticket prices can be predicted from a certain set of features, which can help companies and tourists estimate the price and the best time to buy a flight ticket. Because the price of a flight ticket fluctuates as seasonal pricing is applied from time to time, it is difficult to get an accurate prediction. Thus, the main question is whether it is possible to predict the ticket price based on features related to the flight itself, such as flight duration, number of stops, etc.

During correlation analysis, a strong positive correlation of 0.92 was identified between average price and the number of days left to buy the ticket. This was the strongest of all the correlations found. The linear regression analysis found that the model relating “average price” and “days_left” could explain 62.53% of the variation in “average price”. Moreover, the time series analysis forecasted a ticket price of ₹12431.34 for Jet Airways in April 2023.
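
As a minimal sketch of how a correlation coefficient and the share of explained variation (R-squared) can be obtained in base R, the snippet below uses a small synthetic data frame. The data, the trend, and the names `flights` and `avg_price` are assumptions made for illustration; they are not the project's dataset or the code in "Code.R", and the numbers it prints will not match the figures reported above.

```r
# Synthetic data for illustration only; NOT the project's dataset.
set.seed(42)
days_left <- 1:49
# Hypothetical linear trend plus noise; slope and noise level are arbitrary.
avg_price <- 5000 + 300 * days_left + rnorm(length(days_left), sd = 1500)
flights <- data.frame(days_left = days_left, avg_price = avg_price)

# Pearson correlation coefficient between the two variables
r <- cor(flights$avg_price, flights$days_left)

# Simple linear regression: average price as a function of days left
fit <- lm(avg_price ~ days_left, data = flights)

# Proportion of the variation in avg_price explained by the model
r_squared <- summary(fit)$r.squared

cat("Correlation:", round(r, 2), "\n")
cat("R-squared:  ", round(r_squared, 4), "\n")

# Predicted average price for a ticket bought 10 days before departure
print(predict(fit, newdata = data.frame(days_left = 10)))
```

`cor()` and `lm()` are both part of base R, so no extra packages are needed for this step.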

Overall, the findings of this project indicate that the features can be used to predict the airline ticket price. Nevertheless, more features could be considered, such as weather conditions, or the dataset could be expanded with more numerical variables to produce more accurate predictions.

Statistical Questions

  1. Does choosing different airlines affect the price of the ticket?
  2. Is there a difference in the price of an economy and business flight ticket?
  3. Does buying a ticket ahead of time affect the ticket price?
  4. Do the departure and arrival times affect the ticket price?
  5. Does the source and destination of the flight impact the ticket price?
  6. Does the number of stops (transit) affect the ticket price?
  7. What is the forecasted ticket price of Jet Airways in the year of 2023?
  8. Is the time series used for the forecast stationary?
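
Questions 1 to 6 ask whether individual features affect the price. One way to explore them together is a multiple linear regression with categorical predictors, sketched below in base R. Everything here is assumed for illustration: the data frame `tickets`, its column names, the airline labels, and the invented effect sizes are not taken from the project's dataset or from "Code.R".

```r
# Synthetic data for illustration only; NOT the project's dataset.
set.seed(1)
n <- 200
tickets <- data.frame(
  airline   = factor(sample(c("A", "B", "C"), n, replace = TRUE)),
  class     = factor(sample(c("Economy", "Business"), n, replace = TRUE),
                     levels = c("Economy", "Business")),
  stops     = sample(0:2, n, replace = TRUE),
  days_left = sample(1:49, n, replace = TRUE)
)

# Hypothetical price-generating process; effect sizes are invented
# purely so that the fitted model has some signal to detect.
tickets$price <- 4000 +
  30000 * (tickets$class == "Business") +
  1500 * tickets$stops +
  50 * tickets$days_left +
  rnorm(n, sd = 2000)

# Multiple linear regression; factors are expanded into dummy variables
model <- lm(price ~ airline + class + stops + days_left, data = tickets)

print(summary(model))  # coefficient estimates with t-tests
print(anova(model))    # sequential F-tests, one row per term
```

In the summary table, a small p-value for a term such as `classBusiness` suggests that the feature is associated with a price difference after accounting for the other predictors.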

Conclusion and Future Recommendation

The project covers the full process from data pre-processing to developing a linear regression model. The results of this project address all the statistical questions listed previously. Data pre-processing was performed to transform the data into a form suitable for the linear regression model. A correlation analysis was carried out to identify which variables are strongly related to each other, which is useful for the linear regression analysis. Based on the correlation analysis, a linear regression model was created to predict the airline ticket price. Additionally, a time series analysis was performed to forecast the ticket price of Jet Airways in 2023, which is ₹12431.34. Finally, hypothesis testing using the Augmented Dickey-Fuller (ADF) test, a unit root test, was carried out to determine whether the time series is stationary.
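
The forecasting and stationarity steps can be sketched as follows. This assumes the `forecast` package (for `auto.arima()` and `forecast()`) and the `tseries` package (for `adf.test()`); the README names AUTO ARIMA and ADF but does not say which packages "Code.R" uses. The monthly series `prices` is synthetic, and its date range and frequency are assumptions, so the output will not reproduce the ₹12431.34 forecast.

```r
library(forecast)  # auto.arima(), forecast()
library(tseries)   # adf.test()

# Synthetic monthly average prices, Jan 2020 to Dec 2022 (36 values).
# Illustration only; NOT the project's Jet Airways data.
set.seed(7)
prices <- ts(12000 + cumsum(rnorm(36, sd = 300)),
             start = c(2020, 1), frequency = 12)

# ADF test: the null hypothesis is that the series has a unit root
# (is non-stationary); a small p-value is evidence of stationarity.
adf <- adf.test(prices)
cat("ADF p-value:", adf$p.value, "\n")

# Let auto.arima() select the ARIMA orders, then forecast 4 months
# ahead, which covers Jan to Apr 2023 for this series.
fit <- auto.arima(prices)
fc <- forecast(fit, h = 4)
print(fc$mean)
```

If the ADF test does not reject the unit root, differencing the series is the usual next step; `auto.arima()` chooses the differencing order on its own when fitting.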

Overall, the features provided can be used to estimate the airline ticket price. However, not all features were used, and not all are strong enough predictors for this task. Thus, a future improvement would be to consider more factors, such as weather conditions, or to expand the dataset with more numerical variables, which could yield more accurate predictions.
