Automotive Analytics: Advanced Regression Models of Toyota Corolla Sales Performance

This project applies linear regression to a dataset from a European Toyota car dealer to model the sales prices of used Toyota Corolla cars. The analysis examines how factors such as age, mileage, horsepower, and other features influence the sales price.

Data Description

The dataset, UsedCars.csv, consists of various attributes of used Toyota Corolla cars, including:

  • Id: Identification number of the car.
  • Model: Model name of the car.
  • Price: Sales price in Euros.
  • Age: Age of the car in months (as of August 2004).
  • KM: Accumulated kilometers on the odometer.
  • HP: Horsepower of the car.
  • Metallic: Whether the car has a metallic color (1 for Yes, 0 for No).
  • Automatic: Whether the car has an automatic transmission (1 for Yes, 0 for No).
  • CC: Cylinder volume in cubic centimeters.
  • Doors: Number of doors.
  • Gears: Number of gears.
  • Weight: Weight of the car in kilograms.
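
As a quick sanity check before modeling, the file can be loaded and compared against the columns listed above. This is a minimal sketch in R, not the project's own script: the helper name `load_used_cars` is hypothetical, and it assumes UsedCars.csv uses exactly the column names shown in this section.

```r
# Columns as listed in the Data Description section of this README.
expected_cols <- c("Id", "Model", "Price", "Age", "KM", "HP", "Metallic",
                   "Automatic", "CC", "Doors", "Gears", "Weight")

# Hypothetical helper: read the CSV and fail early if a column is missing.
load_used_cars <- function(path = "UsedCars.csv") {
  cars <- read.csv(path, stringsAsFactors = FALSE)
  missing_cols <- setdiff(expected_cols, names(cars))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  # Metallic and Automatic are 0/1 indicators per the description above.
  stopifnot(all(cars$Metallic %in% c(0, 1)), all(cars$Automatic %in% c(0, 1)))
  cars
}
```

Calling `cars <- load_used_cars("UsedCars.csv")` followed by `str(cars)` then shows the type of each column before any model is fitted.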

Installation

To replicate this analysis:

  1. Clone the repository:
    git clone https://github.com/your-username/used-toyota-car-sales-analysis.git
    cd used-toyota-car-sales-analysis
  2. Install R and required packages.
  3. Execute the R scripts for analysis.

Exploratory Data Analysis (EDA)

Initial EDA revealed a negative correlation between Price and KM: cars with more kilometers on the odometer tend to sell for less.

Visualizations

Scatter Plot (Price vs KM)

Shows the relationship between Price and KM, highlighting the negative trend.
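
A plot like the one described here could be produced along the following lines. This is a sketch under assumptions rather than the project's actual plotting code: the helper name `plot_price_vs_km` is hypothetical, and `cars` is assumed to be a data frame with numeric Price and KM columns.

```r
# Hypothetical helper: scatter plot of Price against KM with a fitted line.
plot_price_vs_km <- function(cars) {
  r <- cor(cars$KM, cars$Price)  # Pearson correlation between KM and Price
  plot(cars$KM, cars$Price,
       xlab = "KM (odometer reading)", ylab = "Price (EUR)",
       main = sprintf("Price vs KM (r = %.2f)", r))
  # Overlay the least-squares line to make the trend visible.
  abline(lm(Price ~ KM, data = cars), col = "red", lwd = 2)
  invisible(r)
}
```

The function returns the correlation invisibly, so `r <- plot_price_vs_km(cars)` gives both the figure and the number behind the negative trend.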

Residual Plot (Model 1)

The residuals show a systematic pattern rather than random scatter around zero, which indicates potential violations of the linear regression assumptions.

Q-Q Plot (Model 1)

Assesses the normality of the residuals. The points deviate from the reference line, particularly in the tails.

Histogram of Residuals (Model 1)

Displays the distribution of residuals, highlighting skewness or deviations from normality.
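
The three Model 1 diagnostics described in this section (residual plot, Q-Q plot, histogram) can be drawn with base R graphics. The sketch below is one possible way to do it, not necessarily how the original figures were made: `diagnose_residuals` is a hypothetical helper, and it accepts any fitted `lm` object.

```r
# Hypothetical helper: the three residual diagnostics for a fitted lm object.
diagnose_residuals <- function(fit) {
  old_par <- par(mfrow = c(1, 3))  # three panels side by side
  on.exit(par(old_par))            # restore the previous layout on exit
  res <- resid(fit)

  # 1. Residuals vs fitted values: a funnel or curve suggests
  #    heteroscedasticity or non-linearity.
  plot(fitted(fit), res, xlab = "Fitted values", ylab = "Residuals",
       main = "Residual Plot")
  abline(h = 0, lty = 2)

  # 2. Normal Q-Q plot: points leaving the line in the tails indicate
  #    non-normal residuals.
  qqnorm(res, main = "Q-Q Plot")
  qqline(res)

  # 3. Histogram: shows the shape (for example skewness) of the residuals.
  hist(res, breaks = 30, xlab = "Residuals", main = "Histogram of Residuals")
  invisible(res)
}
```

For Model 1 this would be called as `diagnose_residuals(lm(Price ~ KM, data = cars))`, assuming `cars` holds the contents of UsedCars.csv.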

Linear Regression Model Development

Model 1: Basic Linear Regression

  • R-squared: 0.3824
  • F-statistic: 781.4 on 1 and 1262 DF
  • p-value: < 2.2e-16

Model 1 regresses Price on KM alone. It shows a moderate negative linear relationship: KM explains about 38% of the variance in Price.
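
The reported figures for Model 1 are internally consistent, which can be verified with a few lines of R. The sketch below uses the standard identity between R-squared and the overall F statistic. The helper names `fit_model1` and `f_from_r2` are hypothetical, and `cars` is assumed to be the data frame read from UsedCars.csv.

```r
# Hypothetical helper: Model 1 as described above, Price regressed on KM alone.
fit_model1 <- function(cars) lm(Price ~ KM, data = cars)

# Overall F statistic implied by R-squared and its degrees of freedom.
f_from_r2 <- function(r2, df1, df2) (r2 / df1) / ((1 - r2) / df2)

# Consistency check against the figures reported above.
f_from_r2(0.3824, df1 = 1, df2 = 1262)  # about 781.4, matching the reported F

# With one predictor, R-squared is the squared correlation, so the implied
# Price-KM correlation is about -0.62 (negative, per the EDA).
-sqrt(0.3824)
```

The 1262 residual degrees of freedom also imply a sample of 1264 cars, since a model with an intercept and one slope uses two parameters.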

Model 2: Improved Model with Transformation

  • R-squared: 0.4089
  • F-statistic: 873.2 on 1 and 1262 DF
  • p-value: < 2.2e-16

Model 2 addresses the heteroscedasticity and non-linearity observed in Model 1. A Box-Cox analysis suggested using the inverse of Price (1/Price) as the response, which improved both the model fit and its adherence to the regression assumptions.
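
A Box-Cox analysis of this kind is commonly run with `boxcox()` from the MASS package. The sketch below shows one way to do that. It makes several assumptions: the README does not name the packages used, the helper names `best_boxcox_lambda` and `fit_model2` are hypothetical, and Model 2 is taken to keep KM as its only predictor, which the 1 and 1262 degrees of freedom suggest.

```r
library(MASS)  # provides boxcox(); assumed here, not named in this README

# Hypothetical helper: profile the Box-Cox log-likelihood for Price ~ KM and
# return the lambda on the grid with the highest value.
best_boxcox_lambda <- function(cars, grid = seq(-2, 2, by = 0.05)) {
  # y = TRUE stores the response in the fit, so boxcox() need not refit it.
  fit <- lm(Price ~ KM, data = cars, y = TRUE)
  bc <- boxcox(fit, lambda = grid, plotit = FALSE)
  bc$x[which.max(bc$y)]
}

# A lambda near -1 corresponds to the inverse transformation 1 / Price.
fit_model2 <- function(cars) lm(I(1 / Price) ~ KM, data = cars)
```

Note that with 1/Price as the response the sign of the KM coefficient flips: a positive slope in this model corresponds to Price falling as KM rises.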

Model Comparison and Validation

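Model 2 achieved a higher R-squared than Model 1 (0.4089 versus 0.3824) and adhered better to the regression assumptions. Because the two models use different response scales (Price versus 1/Price), the R-squared values are not directly comparable, so the improvement in the residual diagnostics is the stronger evidence.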

ANOVA for Model Selection

  • Full Model:

    • R-squared: 0.8649, Adjusted R-squared: 0.8639
    • F-statistic: 891.8 on 9 and 1254 DF
    • p-value: < 2.2e-16
  • Reduced Model:

    • R-squared: 0.8648, Adjusted R-squared: 0.8642
    • F-statistic: 1341 on 6 and 1257 DF
    • p-value: < 2.2e-16

A full model with all nine predictors was compared against a reduced model that keeps only the six significant predictors. The reduced model is simpler and has a marginally higher adjusted R-squared (0.8642 versus 0.8639).
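
The adjusted R-squared values above can be reproduced from the reported R-squared values and degrees of freedom, and the nested comparison itself can be run with `anova()`. The following is a sketch, not the project's own script: `adj_r2` and `compare_nested` are hypothetical helper names, and the formulas are left as arguments because this README does not state which predictors the reduced model retains.

```r
# Adjusted R-squared from R-squared, sample size n, and number of predictors k.
adj_r2 <- function(r2, n, k) 1 - (1 - r2) * (n - 1) / (n - k - 1)

# n = 1264 follows from the reported degrees of freedom (1254 + 9 + 1).
round(adj_r2(0.8649, n = 1264, k = 9), 4)  # 0.8639 (full model)
round(adj_r2(0.8648, n = 1264, k = 6), 4)  # 0.8642 (reduced model)

# Hypothetical helper: partial F-test between two nested linear models.
compare_nested <- function(full_formula, reduced_formula, data) {
  full <- lm(full_formula, data = data)
  reduced <- lm(reduced_formula, data = data)
  anova(reduced, full)  # a large p-value favors the simpler model
}
```

A full formula consistent with the nine degrees of freedom would be `Price ~ Age + KM + HP + Metallic + Automatic + CC + Doors + Gears + Weight`, although whether the response was Price or a transformed version is an assumption. The reduced formula should list the six predictors that were kept.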

Conclusion

The analysis underscores the importance of data transformation in regression and provides insights into factors affecting used car prices, useful for dealers and buyers alike.
