Giter Club home page Giter Club logo

challenges's Introduction

ICDSS Advanced Data Science Team Challenges 2020-2021

Datasets

Choose and work on one out of the six distinct datasets provided. You can assume that they are correctly labelled and cleaned. Detailed descriptions are provided in the READEME.md files of each challenge.

  1. Bike Deployment
  2. Fashion MNIST Classification
  3. Cryptocurrency Trading
  4. Volatility Pairs Trading
  5. Real Estate Housing
  6. Toxic Comment Classification

Assessment

The submissions are assessed for their:

Target & Effectiveness

Briefly summarise the goal of the challenge of your choice and provide an abstract of your analysis.

Succinctness is always welcome! If you can show something with 3 lines of math or use well-accepted terminology to explain something concisely, please do so! Long texts are, in general, not preferred.

Visualization & Statistics

Descriptions of the datasets are intentionally not provided to enable you to research for them and gain insight into the raw data.

Any type of meaningful visualization and basic statistical analysis of the data is highly encouraged. Do not spend too much time on that, you should be able to get this with just a few lines of code.

Please avoid spending lots of time trying to produce the perfect looking graph. Simple ones will suffice.

Assumptions

Clearly state your assumptions before fitting any model. This will, hopefully, lead you to better support your model theoretically and also help us assess your model selection and approach.

Model & Model Selection & Validation

Provide insight into the suggested model to address the problem. Using sklearn's API fit, predict, score methods is expected but not the purpose of these challenges. So just coding a model is not enough! We are interested in how you made the decisions for coming up with your model.

As far as model selection and validation go, we do not expect you to perform grid cross-validation for everything, since this is computationally demanding. We have relaxed the requirements for the optimality of the final model, as long as there is a systematic approach. On the other hand, we do not trust oracles, so if you plan to hardcode model hyperparameters you had better come up with a good reasoning!

Note: Model accuracy is not the main objective of these challenges. State-of-the-art models can be found online, but if they are not properly presented and reasoned then the solutions will be consider of inferior quality compared to accurate but well developed and explained solutions.

Submission

Format

Compress your solution and send us the .zip or .tar file via email. Format the email subject according to:

ADST Challenge 20-21: <challenge_name>_<full_name>

where:

  • <challenge_name>: is the name of the project given above
  • <full_name>: your full legal name, same as the one stated in your google form submission

Note that you should be a member of ICDSS society in order to accept your submission.

Environment

You are allowed to use any programming language, as long as you provide us with an automated way to setup our environment for reproducing your results. You should expect us to use Ubuntu >18.04 LTS for this purpose. If you have some particular preference between the two please specify it in your submission.

Deriverables

Challenge

You can choose to submit:

  • a Jupyter Notebook

  • a pdf report accompanied with documented scripts for reproducing the results. In this case we expect a well method-level documentation convention used so that we can follow the code along the report. There is no preference in the structure of the report, it can also look like a presentation.

Note: Quality matters only. The challenge can be successfully addressed and reported with just a handful of jupyter-cells/slides/report-pages.

Fill up our Application form

Please fill in your details and upload your resume through this jotform

Logistics

Issues

We will not answer to personal emails, please use issues for asking questions!

Deadline

Requests for deadline extensions will be ignored.

challenges's People

Contributors

sytan98 avatar filangelos avatar adam-bhaiji avatar udvardipeter avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.