reproduction_of_classical_methods-corporaci-n-favorita-grocery-sales-forecasting's Introduction

Project Repository

Project Theme: To familiarize ourselves with the technical details at each stage of a time series task, this project tackles the unit sales prediction challenge from a classic Kaggle competition, Corporación Favorita Grocery Sales Forecasting. (From HKUST(GZ)-DSAA5021, instructed by Prof. TANG Jing)

Note: We selected and studied in depth the solutions of the top competitors and found that they use broadly similar input-output modeling approaches. We based our own implementation on this insight, with the following references: reference_project_1-by_oberoiheman, reference_project_2-by_sjvasquez, reference_project_3-by_weiwei.

By exploring and restructuring the historical sales records, sales planning data, and auxiliary information, we constructed a feature vector for each individual product and organized these vectors into a Feature-Batch covering all items. An XGBoost model takes the complete Feature-Batch as input and predicts the unit sales of every item over the upcoming 16 days. We trained the model on the provided data to forecast the unit sales of thousands of products across Favorita stores in Ecuador.
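The input-output layout described above can be sketched in a few lines. This is a minimal illustration, not the project's actual feature code: the function names `make_sample` and `make_batch`, the rolling-mean windows, and the choice of features are all assumptions made for the example. It only shows how one feature vector per item is stacked into a batch and paired with a 16-day target vector.

```python
from statistics import mean

HORIZON = 16  # number of days to forecast, as stated in the text


def make_sample(sales, cutoff, windows=(3, 7, 14)):
    """Build one (features, targets) pair for a single item series.

    sales   -- list of daily unit sales, oldest first
    cutoff  -- index of the first day to be predicted
    windows -- hypothetical look-back lengths for rolling means
    """
    history = sales[:cutoff]
    # Rolling means over the most recent days before the cutoff.
    features = [mean(history[-w:]) for w in windows]
    # The most recent observed day as an extra feature.
    features.append(history[-1])
    # The next HORIZON days are the values the model must predict.
    targets = sales[cutoff:cutoff + HORIZON]
    return features, targets


def make_batch(series_by_item, cutoff):
    """Stack one sample per item into a feature batch and a target batch."""
    X, Y = [], []
    for item_id in sorted(series_by_item):
        features, targets = make_sample(series_by_item[item_id], cutoff)
        X.append(features)
        Y.append(targets)
    return X, Y
```

Each row of `X` is one item's feature vector and each row of `Y` holds its 16 target days. A gradient-boosted model such as XGBoost could then be fit on `X`, for example with one regressor per forecast day; that per-day setup is an assumption here, not something this README states.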

Among the topics offered by our instructor, we chose this task mainly because we had encountered time series prediction in earlier stock forecasting assignments and saw both the learning value and the application prospects of this area of machine learning. Our motivation is therefore to learn, hands-on, how machine learning is applied to time series prediction tasks.

To achieve this, we identified several learning-oriented subtasks:

  • First, we should define the pipeline for time series prediction tasks.
  • Second, we need to understand how to formulate a time series prediction task: how the objective is modeled as output and how the data is modeled as input.
  • Third, we need to familiarize ourselves with the technical details at each stage of a time series task, such as early-stage data exploration, mid-stage model selection, and late-stage model evaluation.
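For the evaluation stage, the competition scored submissions with the Normalized Weighted Root Mean Squared Logarithmic Error (NWRMSLE), in which perishable items carry a weight of 1.25 and all other items a weight of 1.0. The sketch below is one way to compute it; the function name `nwrmsle` and the clipping of negative predictions to zero are our assumptions for the example, not part of the metric's definition.

```python
import math


def nwrmsle(y_true, y_pred, weights):
    """Normalized Weighted Root Mean Squared Logarithmic Error.

    y_true  -- actual unit sales (non-negative)
    y_pred  -- predicted unit sales
    weights -- per-row weights (e.g. 1.25 for perishable items, else 1.0)
    """
    weighted_sq_error = 0.0
    for y, p, w in zip(y_true, y_pred, weights):
        # Assumption: clip negative predictions so the logarithm is defined.
        p = max(p, 0.0)
        weighted_sq_error += w * (math.log1p(p) - math.log1p(y)) ** 2
    return math.sqrt(weighted_sq_error / sum(weights))
```

Because the error is measured on log-transformed values, a common design choice is to train on `log1p(unit_sales)` and invert the transform at prediction time; whether a given pipeline does so should be checked against its own code.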

Based on these motivations and subtasks, our approach is as follows: within a focused, application-oriented competition, we take an existing excellent solution as a "role model" and use it to work through the learning-oriented subtasks listed above. Because the top competitors' solutions share broadly similar input-output modeling approaches, we based our own practical implementation on that common design.

Contributors

jonarck
