orie4741-project's People

Contributors

apcalam, jschluger, wanxinglu

orie4741-project's Issues

Final Peer Review

This project aims to find the ideal price of US domestic flight tickets in order to maximize profit. Its dataset is the US DOT's Domestic Airline Consumer Airfare Report.

I really like how you spent some time describing why you're pursuing this project, and you have data to back it up! Additionally, I appreciated how you were very explicit about which features you were using and why you were dropping some. I was left with no questions about that. I think overall, you outlined things very clearly. You made sure to address most, if not all questions the reader may have, and laid all of the information out very clearly. I said that twice for emphasis-- this was an easy to read and understand project report, at least from my perspective. Also, you had a very in-depth analysis about the model's viability as a weapon of math destruction, going through each part of the WMD definition.

It would've been nice if Figure 2 was on the same page as when you described it. I had to scroll back and forth to compare what you were saying with the visualization. If I have to nitpick, I think you could've defined extrapolation vs interpolation one more time for less technical readers, and maybe in different words. Additionally, it may have been nice to see more visuals- not just tables of information, but more charts and graphs.

Overall, great job!

Midterm Peer Review sbz24

This project analyzes the dataset from the US Department of Transportation’s Domestic Airline Consumer Airfare Report from 2019. The group is trying to predict the domestic flight prices assuming a normal year of travel. The data contains an origin airport and city, destination airport and city, year, time of year (quarter), average fare, average fare for the carrier with the largest market share, average fare for the lowest carrier, number of miles, passengers per day, and geocoded information.

Things I like

  1. Good job showing whether the model overfits or underfits using least squares.
  2. Good description of next steps, I think this project is progressing well.
  3. I also liked the description of the dataset and choice of graphs.

Areas for improvement

  1. The EDA section can be enhanced to include some more descriptions of what the trends mean and why these trends occurred
  2. For k-fold cross validation, you can try to compare the performance of your model by splitting data in 10%-90%, 20%-80%, 30%-70%, etc.
  3. I wish you were clearer about your objective and the problem you're trying to solve.
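
One way to read suggestion 2 above is as a comparison of hold-out splits at several ratios. Below is a minimal sketch in Python, assuming synthetic fare-versus-distance data and a one-feature least-squares fit. The data, the coefficients, and the helpers `fit_line` and `holdout_mse` are all hypothetical stand-ins, not the group's model or dataset.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def holdout_mse(xs, ys, test_fraction, seed=0):
    """Shuffle the rows, hold out test_fraction of them, return the test MSE."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    n_test = max(1, int(round(test_fraction * len(idx))))
    test, train = idx[:n_test], idx[n_test:]
    a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
    errors = [(ys[i] - (a + b * xs[i])) ** 2 for i in test]
    return sum(errors) / len(errors)

# Synthetic stand-in data (an assumption): fare grows linearly with miles, plus noise.
rng = random.Random(42)
miles = [rng.uniform(100, 2500) for _ in range(500)]
fares = [80 + 0.1 * m + rng.gauss(0, 20) for m in miles]

# Compare 10%-90%, 20%-80%, and 30%-70% test-train splits.
results = {frac: holdout_mse(miles, fares, frac) for frac in (0.1, 0.2, 0.3)}
for frac, mse in results.items():
    print(f"test fraction {frac:.0%}: MSE = {mse:.1f}")
```

If the test error stays roughly stable across ratios, that suggests the fit is not overly sensitive to how much data is held out.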

Mid-Term Peer Review

The main goal of the “Domestic_Flight_Predictions” project is to analyze the data from the US Department of Transportation’s Domestic Airline Consumer Airfare Report from 2019. This dataset contains information on flight’s origin airport and city, destination airport and city, year, time of year, average fare, average fare for the carrier with the largest market share, average fare for the lowest carrier, number of miles, passengers per day, and geocoded information.

One thing I like about this midterm report is the histograms of the data. These graphics help the reader get a sense of the information in the dataset. Another thing I like about this report is the use of k-fold cross validation to avoid overfitting. I also liked this group's plan going forward.

One improvement I would suggest for this project is to include graphs for the preliminary analysis of the data. This would help the reader better understand the models described in the report. Another improvement I would suggest is to experiment with models besides linear regression, because the MSE appears to be pretty high. I would also suggest stating the project objective clearly at the beginning of the report.

Peer Review

Summary
The project looks at the price of domestic flights, aiming to come up with prices for airline companies that are both profitable and competitive in the market. The data used comes mainly from the United States Bureau of Transportation and online airline trackers.
Things I like
1. The result of the project can be applied easily, since air travel is now a very common mode of transportation and the result matters to users choosing the most affordable flight.
2. The objective of the project is very clear and feasible to me, since most of the variables needed for predicting a domestic flight price can be obtained.
3. Both data sets chosen are very extensive, and they will serve the project well.
Things I think could be improved
1. Using data from online airline trackers seems interesting, but I am not sure how you are going to process the data. Depending on your approach, extracting and then cleaning the data can be time-consuming.
2. It seems to me that the data from the United States Bureau of Transportation may be excessive: if you are extracting data from online airline trackers, I do not think you still need the per-airport data for different quarters in different years listed in “Average Domestic Airline Itinerary Fares By Origin City”, since you are looking at specific flights instead of airport averages. This, of course, will depend on how much data you are able to extract from the online tracker.
3. The objective is a little contradictory, since you are trying to find both the most profitable price for airline companies and, at the same time, a maximized trip for consumers within their budgets. I would suggest focusing on just one side.

Midterm Peer review

This project takes the prices of domestic flights from 1993 to 2019 and predicts prices based on that. The dataset includes information on origin airport and city, destination airport and city, year, time of year (quarter), average fare, average fare for the carrier with the largest market share, average fare for the lowest carrier, number of miles, passengers per day, and geocoded information.

Things I like:

  1. They described the dataset very well, using different plots to illustrate different relationships between variables.
  2. The team's plan for the rest of the semester seems robust and feasible; they also take many things into account when looking at what they can do in the future.
  3. Their model does not overfit the dataset and is actually doing quite well.

Things to improve:

  1. I do not really understand the second graph that they used to describe the dataset. It might be better if they could phrase it in another way.
  2. I would suggest including some graphs as a result of the model they now have to make it easier for the readers.
  3. This might be nothing but it would be great if they set the legends of the three graphs to the same corner.

Midterm Peer Review (caa234)

This group aims to use data from the US Department of Transportation’s Domestic Airline Consumer Airfare Report to analyze the cost of domestic flights given a host of financial and geographical data on more than 200,000 flights.

Some comments:

(+) The explanation of the visualizations was great, a lot of detail and effort was put in so that the reader understood not only what the visualization meant, but why it was important
(+) The "Avoiding Overfitting" section was thoroughly detailed, it is important to state modeling assumptions and say that your approach won't be robust to outliers such as 2020.
(+) Your analysis of your least-squares model is detailed and thorough

(-) Might be better in the "large_ms" density to display actual probabilities for the y-axis so the interpretation is more straightforward for the reader (I can see the spike but what does the value of 2.5 mean?)
(-) For your modeling section, you don't really need to specify that you used the Julia backslash operator, seems superfluous
(-) There seems to be some redundant information in your report (i.e. your definition of the "when" column and your use of k-fold cross-validation) perhaps consider removing some of these for a more streamlined final outcome
(-) What did cleaning the data look like? Did you simply remove all entries with missing data? Will this affect any future modeling decisions you make? More information on this would be good.

Overall great work! Excited for the final project

Final report peer review ckb65

Summary

This group worked to build models interpolating and extrapolating airline ticket prices given a dataset from the US Department of Transportation. They built many different linear models, with hyper-parameter tuning over different feature sets, in order to accomplish this. Many of their final models performed well, with errors under one standard deviation of the prices in the dataset.

What I liked

I thought it was thorough and interesting that you separated your models into the two categories of interpolation and extrapolation.

The tables comparing your models, the feature set used on each model, and the results were really well done and organized. It made interpreting the results of your project really easy.

I thought you did a good job analyzing the applications of your model for both consumers and airlines including the note of where your model would still predict well given the COVID pandemic.

Areas of improvement

  • I noticed in your "data cleaning" you say you added an additional row when I think you meant to write column.
  • Your correlation matrix plot had overlapping numbers on its axis making it a bit harder to understand.
  • There wasn't much discussion of individual feature importance or coefficients in each of your models. Was this a factor that varied a lot model to model or was it fairly constant?

Peer Review

Summary of Project:
The objective of the project is to identify the optimal price point of domestic flights from the perspective of an airline company. The ideal price point will be defined by the profit maximizing price that allows the airline to remain competitive. The dataset that will be used is the Average Domestic Airline Itinerary Fares By Origin City dataset from the US Bureau of Transportation as well as data from online fare trackers.

What I like:

  • I liked how well you established ideal price point as the main metric and what the ideal price point would be
  • I believe you did a good job establishing why this is an important problem, citing external sources like the FAA.
  • The dataset you’re using is well suited for a project in this class - it’s big and messy

Areas for improvement:

  • I believe tackling this problem from both the consumer’s and the airline’s point of view might be too ambitious for the time we have. Maybe focusing on just one side would be best given the time.
  • It is stated that data from online airfare trackers such as Fare Detective will be used in the project. I believe the project proposal could benefit from an explanation of how this data will be extracted (web scraping, API, manual entry).
  • I think the proposal could benefit from an explanation and breakdown of what features are going to be used to determine the optimal price.

Overall, it’s a very interesting project and I look forward to seeing how it evolves over the course of the semester!

Final Peer Review (jpb375)

Peer Review

Summary

The group sought to predict airline fares from historical flight data in order to advise airline companies on the best pricing to be competitive and make a profit. First they fit a baseline model using basic linear regression to understand the strengths and weaknesses of more developed models on the same data. They fit over a thousand models to their data set and were able to create accurate models which produced errors largely within a standard deviation of the actual price.

What I liked

  • Your writing is extremely technical and professional. I felt that you have covered all the bases of the project and more.
  • I appreciate the number of tables you used. It helped to put numerical data side by side.
  • Your hyper parameter analysis and testing was extremely thorough and likely was key to the success of your model.

What needs work

  • I found that many of your graphs were mentioned on different pages than they were shown. While it was helpful to have figure numbers I wish it would have been easier to find graphs and tables referenced in the text as I was reading.
  • I wished your tables with results would have referenced the loss function or regularization used along with the error rates. I found that I had to move around in the page a lot.
  • I would have loved to see your model applied to today’s pandemic data. While the results may have been less accurate I wonder if you could have pulled out more robust insights by analyzing this challenging time.

Great work!

Final Report Peer Review!!

Summary
The report detailed the group's process of cleaning, feature selection, model selection + tuning, and analysis of flight pricing data. Understanding pricing and demand for flights is valuable for both consumers and airline companies, because knowing trends from seasonality or other features can help users make informed decisions. The final model was optimized over 4 different feature sets and 1280 different linear models.

What I Liked

  1. Separation between interpolation and extrapolation - It made sense that you separated your models into two categories, since there is usually a tradeoff between interpolation and extrapolation, and users might want to choose between a model that is stronger in one area or another.
  2. Thorough model selection and tuning - The train/test/validate methodology was clear, and the thorough selection across so many models, feature sets, and hyper parameters gives me confidence that the final models are the best possible.
  3. Error analysis - Looking at the distribution of errors provided valuable insight to understanding model performance. You were able to show that there weren't many cases where the model predicted extremely poorly; most errors were near the true value.

Areas of Improvement

  1. Would have liked some explanation on how the extrapolation model achieved better interpolation RMSE than the interpolation model.
  2. Would like to see more interpretation of the effect of the predictors on the model using the model coefficients. You talked about seasonality, but I'm sure you could find other interesting information from features like airport id or miles.
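
For readers comparing the two models in point 1, RMSE is the square root of the mean squared prediction error. A small sketch in Python follows; the fares in the usage line are made up for illustration and are not values from the report.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical fares in dollars: two predictions are off by 10, one is exact.
example = rmse([200, 250, 300], [210, 240, 300])
print(round(example, 2))
```

Because the errors are squared before averaging, a few large misses raise RMSE more than many small ones, which is worth keeping in mind when one model wins on both interpolation and extrapolation.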

There was a clear methodology and the supporting tables and graphs enhanced my understanding. Great work!

Proposal Peer Review

This group seeks to analyze historical data so as to find the optimal price of flights for an airline company from a specific set of destinations. Their ultimate objective is to use this data analysis to help airlines with the pricing of future flights. The data sets they're using are largely historical data from the US Bureau of Transportation, and from an online airfare tracker called FareDetective.

Things I like:

  1. Your data set looks comprehensive, with years from 1993-2020 -- that's great!
  2. The proposal was very well-written. It articulated the importance of the project and the objectives very clearly, in general. I especially liked how they cited the share of the Airline industry in the US economy, thus providing a new level of meaning for the project.
  3. It sounds like this project is very do-able, the scale of it was picked well!

Areas of Improvement:

  1. Under the assumption that airlines aren't currently pricing flights optimally, I think historical data isn't necessarily the best indicator of the most profitable pricing points. If I understand your project correctly, without a set of data that has the profit of each flight, it would be difficult to find the best price. This feels more like an optimization problem to me rather than a simple data analysis. Though, I could be misinterpreting it.

  2. For someone who is not interested in the airline industry, this may be a little bland. It might be good to emphasize how this project could affect the average person.

  3. "...in addition to helping consumers decide when and where to fly to maximize a trip within their budget." This line in the proposal is a little vague - how do you maximize a trip?

Final Report Peer Review(sc2267)

Summary
The group analyzed flight pricing data in order to derive insights that could potentially benefit both customers and airline companies. They aim to provide this insight to airline companies so they can price their flights better. The group uses a dataset from the US Department of Transportation's consumer airfare report.

3 Things I Liked

  1. Very well-written report. Everything is explained technically and you cover a lot of information, going in-depth on all your models.
  2. Your models showed good results! You were thorough in your analysis and it seems like these results can have some real-world implications.
  3. You guys described very well how you handled and processed the data before you actually fit your models.

Potential improvements

  1. Some of the graphs were not on the same page as when you described them. This made the paper hard to follow at times.
  2. Some more visuals would be helpful to get a clearer picture of the whole data.
  3. Would have been good to elaborate on the concept of extrapolation vs interpolation and how the models for each differed.

Overall, great report!

Proposal Peer Review

This project is about discovering patterns to determine optimal price points for airlines to price their flights. The group would also like to determine the optimal buying price from the customer's standpoint. They will be using datasets from the United States Bureau of Transportation covering 1993 to 2020, because these provide historical data on airline pricing. They will also use a website for specifics on certain flights.

One aspect of this proposal that I think is especially good is that it really outlines the value of conducting this study. Including the statistic on aviation's impact on GDP definitely helps assert the value and importance of this project. Furthermore, I think that the United States Bureau of Transportation’s data is an excellent dataset for this project. It has tons of historical data and it definitely brings up more interesting questions for the group to explore (e.g., will they evaluate how the optimal price changes over time? Will they try to do predictive analytics on the optimal price?). The third thing that I liked about this proposal is that they narrowed the scope of the problem to only evaluating domestic flights. I think this was a smart move, as including international flights would perhaps make the problem a little too messy and the group may have been unable to get meaningful results by the time the project was over. Narrowing it down to domestic flights definitely makes the project more feasible, and the time can be spent exploring different methods rather than preprocessing the data.

While this project is definitely really interesting, I think that it would be stronger if it improved in three areas. The first is establishing how the online websites would be used. Would the group have to web scrape the data, write a script that automatically enters a bunch of destinations and starting points, or do it manually? The second aspect that they should consider is what has already been done in the field. Websites like Kayak and Expedia already have their own algorithms to determine optimal flight paths by budget. Seeing what work has already been done in the field may serve as an excellent baseline to start from. Finally, I think the most concerning part of the project is the lack of definition of a specific problem and specific factors. I think looking at it from both the airline's perspective and the customer's perspective may be rather challenging to accomplish in this short period of time. Also, there are so many factors to choose from that even one of those two problems is already extremely complex. As they have not clearly defined the aspect they wish to evaluate or what the input and output will be, several questions come to mind. For example, will they factor in destination variability or just consider one destination and one starting point? If this is a predictive model, how will they handle the abnormal flight trends from COVID? Will they consider factors such as weather that may not be in the data set but are readily available? I definitely think this is what makes this project interesting, but it might be great to start thinking about these questions now.

Overall, I’m super excited to see the work to come from this project and the direction they decide to take it! Great work :)

Midterm Peer Review(am2236)

This project looks at a dataset from the U.S. Department of Transportation's Domestic Airline Consumer report. It contains information about where flights are going from and to, and the time of year. The goal is to predict the fare of a flight.

3 things I like:

I like how in the midterm report you used headings to make it clear which part of the midterm report requirements you were addressing. I also like how large and robust the dataset is. I also liked the detailed description of how you guys hope to proceed in the remainder of the project, mentioning what features you want to add to your predictive analysis and how you want to use one hot encoding.
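
One-hot encoding, mentioned above, turns a categorical column into one 0/1 indicator per category. Here is a minimal sketch in Python, assuming the quarter is the column being encoded; the review does not say which features the group planned to encode, and `one_hot` is a hypothetical helper.

```python
def one_hot(values):
    """Return (categories, rows); each row has a single 1 marking its category."""
    categories = sorted(set(values))
    position = {c: i for i, c in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        row[position[v]] = 1
        rows.append(row)
    return categories, rows

# Hypothetical quarter column for five flights.
quarters = [1, 3, 4, 1, 2]
categories, encoded = one_hot(quarters)
for q, row in zip(quarters, encoded):
    print(q, row)
```

With a linear model that also has an intercept, one indicator column is usually dropped so the columns are not perfectly collinear.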

3 things for improvement:

I felt like the writeup was a little clunky and hastily put together, this could be easily fixed by some more effort spent in formatting. I also felt like more of the midterm report should have been spent discussing model/models you've already fitted on your data--I think you guys only had a couple of lines on it. Additionally I felt like you could have included a couple more visualizations than the ones you did. I for one would have liked to see a correlation matrix.

Peer Review

Domestic Flight Predictions:

Summary: This project is about predicting optimal airline prices. The objective is to predict an optimal domestic flight price that maximizes an airline company’s profit while also staying competitive within the airline market. They plan on using data from the United States Bureau of Transportation, which lists “Average Domestic Airline Itinerary Fares By Origin City” for the years 1993 to 2020, as well as online airfare trackers such as Fare Detective, which provides historical pricing data on flights between any two airports within the United States.

Positive Feedback:

  • Their models can find potential patterns within the airline industry to find optimal (cheap) travel times for consumers.
  • Their objective helps balance the airline market as a whole by creating a potential ‘average’ price point with their predictions.
  • Their objective can not only benefit airline markets/passengers, but the GDP of a country as a whole.

Constructive Feedback:

  • Potentially restructure objective to find predictions for optimal prices for consumers, rather than potentially driving prices up by giving airline companies suggestions to charge more in order to raise profits.
  • Potentially use datasets that have more precise flight prices rather than averages in order to account for outliers.
  • Go into detail on what models you plan on using in order to solve the objective.

Midterm report Peer review

Summary:
The project analyzes the dataset from the US Department of Transportation’s Domestic Airline Consumer Airfare Report from 2019. The dataset contains 213175 rows and 23 columns before cleaning, and 201392 rows and 24 columns after cleaning, describing information for airport pair markets. They have added a “when” column to the dataset that is calculated by the formula: when = year + (quarter - 1)/4. The rows contain an origin airport and city, destination airport and city, year, time of year (quarter), average fare, average fare for the carrier with the largest market share, average fare for the lowest carrier, number of miles, passengers per day, and geocoded information.
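
The “when” formula quoted above can be checked with a short Python sketch. The function name and the validation of `quarter` are assumptions added for illustration; only the formula itself comes from the report as summarized here.

```python
def when(year, quarter):
    """Map a (year, quarter) pair to a fractional year: year + (quarter - 1)/4."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1, 2, 3, or 4")
    return year + (quarter - 1) / 4

print(when(2019, 1))  # 2019.0
print(when(2019, 3))  # 2019.5
```

The first quarter maps to the start of the year, so consecutive quarters are spaced 0.25 apart on a single time axis.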

Things I liked:

  1. I really like how they are using k-fold cross validation to avoid overfitting and using more complex models to avoid underfitting. I also really like how they explained why these methods would work.
  2. I also really liked the graphs that they have used. They have used appropriate graphs to represent appropriate columns and values. The choice of the graph plotted is good.
  3. I also really liked how they have explained what needs to be done. It shows that they have a well-defined plan for the rest of the semester.
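
The k-fold cross validation praised in point 1 can be sketched as follows. This is a generic illustration in Python, not the group's implementation; `k_fold_indices` is a hypothetical helper, and it assumes the rows have already been shuffled.

```python
def k_fold_indices(n_rows, k):
    """Yield (train, validation) index lists; each row is validated exactly once."""
    if not 2 <= k <= n_rows:
        raise ValueError("k must be between 2 and n_rows")
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_rows))
        yield train, validation
        start += size

folds = list(k_fold_indices(10, 3))
for train, validation in folds:
    print(len(train), validation)
```

A model is fit on each `train` set and scored on the matching `validation` set; averaging the k scores gives an error estimate that does not depend on a single lucky or unlucky split.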

Areas for improvement:

  1. I would suggest explaining your project in a little more detail at the start of the report.
  2. I would also suggest using other models than the simple linear model.
  3. I also suggest analyzing the data more. It would have been great to include explanation of the graphs plotted or what these graphs are representing and what insights can be drawn from them.
