
hashcode-template's Introduction

Template for Google Hash Code

to make it easier to deploy an incremental approach by automating the boring parts.

To see a (possibly slightly out-of-date) example usage of this template (commit 2fe1063), look at Cache Flow's solution to the 2018 qualifier. Most forks are also usages of the template.

Model:

  • solvers/solve.py should implement the function solve, which takes the input as a string and returns an answer as a string.
  • score.py should implement a function score that takes the same input that solve received, together with what solve returned, and scores the submission. It should return the score as an integer.
  • The input file is parsed into an argparse.Namespace in dataparser.py.
  • main.py reads the config files default.cfg and main.cfg, and then the config file given as a command-line argument, in that order. Each file that exists overwrites config elements set by the earlier ones. It then runs the scorer and solver that are specified. Each time you invoke main.py, a folder with relevant data about your run is created inside runs/.
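As a minimal sketch of the two functions described in the list above, consider a made-up toy problem: the first input line is a count N, the second line holds N integers, and the answer lists the indices of the chosen numbers. The toy format and the exact signatures solve(inp) and score(inp, out) are assumptions for illustration; the template may pass further arguments.

```python
# Hypothetical toy problem: line 1 is N, line 2 holds N integers.
# The answer lists the indices of the chosen numbers; the score is their sum.


def solve(inp):
    """Baseline solver: pick every positive number."""
    lines = inp.split("\n")
    values = [int(x) for x in lines[1].split()]
    chosen = [i for i, v in enumerate(values) if v > 0]
    return "{}\n{}".format(len(chosen), " ".join(map(str, chosen)))


def score(inp, out):
    """Score an answer: the sum of the chosen values, as an integer."""
    values = [int(x) for x in inp.split("\n")[1].split()]
    out_lines = out.split("\n")
    count = int(out_lines[0])
    chosen = [int(x) for x in out_lines[1].split()] if count else []
    assert len(chosen) == count, "declared count does not match"
    assert len(set(chosen)) == count, "an index was chosen twice"
    return sum(values[i] for i in chosen)


if __name__ == "__main__":
    sample = "4\n3 -1 5 0\n"
    print(score(sample, solve(sample)))
```

A deliberately simple solver like this one gives a baseline score to compare later solvers against.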

If you get a higher score than before on a test case, the submission is saved in the submission folder, and a link to the run folder is created in the folder best_runs/.
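The bookkeeping behind this could be sketched as follows. The layout of the JSON file (a mapping from test case name to best score and run folder) and the helper name update_best are assumptions made for the example; the template's actual max.json format may differ.

```python
import json
import os
import tempfile


def update_best(max_path, testcase, new_score, run_folder):
    """Record new_score for testcase if it beats the stored best.

    Returns True when the stored best was improved.
    """
    best = {}
    if os.path.exists(max_path):
        with open(max_path) as f:
            best = json.load(f)
    previous = best.get(testcase, {}).get("score")
    if previous is not None and new_score <= previous:
        return False
    # Assumed layout: {"<testcase>": {"score": int, "folder": str}}
    best[testcase] = {"score": new_score, "folder": run_folder}
    with open(max_path, "w") as f:
        json.dump(best, f, indent=2)
    return True


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "max.json")
    print(update_best(path, "example", 10, "runs/a"))  # first score is kept
    print(update_best(path, "example", 7, "runs/b"))   # lower score is ignored
    print(update_best(path, "example", 12, "runs/c"))  # higher score replaces it
```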

Run a test instance of the form in/$testcase.in with your own solver by:

python main.py $testcase

This assumes that you have implemented the functions solve in solvers/solve.py and score in score.py. If you want to name your files or your solve and score functions differently, the module and function names can be modified in the config file default.cfg, or set directly by arguments. For example, to run the pizza solution, which has solve and score functions implemented in the module pizza.py, do

python main.py -c tests/pizza.cfg in/example_pizza.in

or manually

python main.py --score module=tests/pizza --solve module=tests/pizza in/example_pizza.in
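One way such layered configuration and configurable module/function names can work is sketched below with the standard library's configparser and importlib. The section name [solve], the keys module and function, and the helper names are assumptions for illustration, not the template's actual config schema. The demo uses the math module as a stand-in for a solver module, and a path such as tests/pizza would first need converting to a dotted module name.

```python
import configparser
import importlib
import os
import tempfile


def load_config(paths):
    """Read config files in order; later files override earlier ones.

    Files that do not exist are skipped silently by ConfigParser.read.
    """
    cfg = configparser.ConfigParser()
    cfg.read(paths)
    return cfg


def load_function(module_name, function_name):
    """Import module_name and return the attribute named function_name."""
    module = importlib.import_module(module_name)
    return getattr(module, function_name)


if __name__ == "__main__":
    tmp = tempfile.mkdtemp()
    default = os.path.join(tmp, "default.cfg")
    override = os.path.join(tmp, "main.cfg")
    missing = os.path.join(tmp, "not_there.cfg")
    with open(default, "w") as f:
        f.write("[solve]\nmodule = math\nfunction = floor\n")
    with open(override, "w") as f:
        f.write("[solve]\nfunction = ceil\n")
    cfg = load_config([default, override, missing])
    fn = load_function(cfg["solve"]["module"], cfg["solve"]["function"])
    print(fn(1.5))  # the override selected math.ceil
```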

main.py will handle file-io, save the solution that gets maximal score to the submission-folder, set up logging, set up randomization, etc.

Other functionality:

  • sum_score.py - looks in max.json and prints a table with a row for each input file: {testcase} {max_score} {solve_module used} {run_folder_w_high_score}
  • show.py - gives easy access to the input files so you can visualize them.
  • analyze.py - gives easy access to the run folders and the best run folder so you can analyze the output file.
  • Bug in your scorer? Just remove max.json and rerun; main.py will then happily overwrite the ans files in the submission folder. If you accidentally remove max.json, you can recover your ans files from the best_runs folder or the ans folder.
  • package.sh creates a zip archive with your solution.
  • setup.sh removes in/example_pizza.in and creates a main.cfg.
  • Pass extra args to your solver with pypy3 main.py --solve_args N=10,M=Hello,X=-0.5, which fills the args dict with {"N" : "10", "M" : "Hello", "X" : "-0.5"}.
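The --solve_args string maps to a dict of strings, so a solver has to convert the values itself. A minimal sketch of that mapping, where parse_solve_args is a hypothetical helper name and not necessarily how main.py does it:

```python
def parse_solve_args(arg_string):
    """Turn 'N=10,M=Hello,X=-0.5' into a dict of string keys and values."""
    args = {}
    if not arg_string:
        return args
    for pair in arg_string.split(","):
        key, _, value = pair.partition("=")
        args[key] = value
    return args


if __name__ == "__main__":
    args = parse_solve_args("N=10,M=Hello,X=-0.5")
    n = int(args["N"])    # values arrive as strings, so convert explicitly
    x = float(args["X"])
    print(args, n, x)
```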

How my team uses this:

We start by implementing a solver and a scorer in parallel. Usually we try to make a very dumb solver to have a baseline for improvements. Once we have a working system where our scorer and the judge system report the same score, we continue with two or three solvers that do things differently. Usually we opt for greedy approaches that sort on different reasonable weight functions.

Another good idea is to create an improver that you can apply after any solver. Whether a reasonable improver exists depends on the problem, however. For the pizza problem there is an improver that expands all the small pizza pieces, which usually gives quite a few extra points, depending on your solution.

It also pays to analyze your inputs and outputs. show.py and analyze.py help with the file I/O. If you wish to use matplotlib, remember that it is easier to use under python3 than under pypy3.

Nice to have for the competition:

  • pypy3 for faster execution, thanks to JIT compilation to machine code
    • MacOS: brew install pypy3
    • Ubuntu: sudo apt-get install pypy3
    • Arch: sudo pacman -S pypy3
  • sortedcontainers for sorted data structures in greedy approaches:
    • pypy3 -m pip install sortedcontainers
  • matplotlib for drawing charts/grids visualizing inputs/outputs.
    • python3 -m pip install matplotlib
  • Your own git repo to share your progress with your teammates. Just clone this repo and push it to your own private repo.

Have any questions? Post an issue here or tweet to @exoji2e.

hashcode-template's People

Contributors

ahnlabb, exoji2e


hashcode-template's Issues

Results are not reproducible with seed option

Using the -s option to specify a seed does not reproduce results (when applied to pizza.py).

To reproduce:
python3 main.py --nsspec pizza:score:solve example_pizza
python3 main.py --nsspec pizza:score:solve -s example_pizza

Handling malformed output in scoring functions

What seems like a good approach?

  • raise errors
  • log errors and return score = 0?

I think we want to be able to do both. If our solution sometimes fails to produce a valid output, we might not want everything to crash.
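One possible way to support both behaviours is a thin wrapper around the scorer with a switch. This is only a sketch of the idea in this issue: safe_score and its strict flag are hypothetical names, not part of the template.

```python
import logging

log = logging.getLogger("score")


def safe_score(score_fn, inp, out, strict=False):
    """Run score_fn; on failure either re-raise or log and return 0."""
    try:
        return score_fn(inp, out)
    except Exception:
        if strict:
            raise  # debugging mode: surface the scorer's error
        log.exception("scorer rejected the output; scoring it as 0")
        return 0


if __name__ == "__main__":
    def toy_score(inp, out):
        # Toy scorer for the demo: the answer must be a single integer.
        return int(out)

    print(safe_score(toy_score, "", "5"))
    print(safe_score(toy_score, "", "not a number"))
```

With strict=True a broken scorer or solver fails loudly during development, while the default keeps a long batch of runs alive.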

Write a report after many runs

I especially want to know the best score a run got, even if it does not beat the current high score.

Right now, if we run 1k rounds, it is very hard to get a feeling for what happened, since I get either a flood of every score with -l warning or just the best ones with -l critical.

It would be nice to have an overview printed at the end, maybe as a histogram.
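A rough sketch of what such an end-of-run overview could look like, as a plain-text histogram over integer scores. The function name summarize and the report layout are assumptions made for illustration.

```python
from collections import Counter


def summarize(scores, buckets=5, width=40):
    """Return a text report: run count, best, mean, and a histogram."""
    lo, hi = min(scores), max(scores)
    span = max(hi - lo, 1)
    # Map each score to a bucket index; the maximum lands in the last bucket.
    counts = Counter(min((s - lo) * buckets // span, buckets - 1) for s in scores)
    largest = max(counts.values())
    lines = ["runs: {}  best: {}  mean: {:.1f}".format(
        len(scores), hi, sum(scores) / len(scores))]
    for b in range(buckets):
        start = lo + b * span // buckets  # lower edge of this bucket
        bar = "#" * (counts[b] * width // largest)
        lines.append("{:>10} | {} {}".format(start, bar, counts[b]))
    return "\n".join(lines)


if __name__ == "__main__":
    print(summarize([1, 2, 3, 10]))
```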

Remove setup.sh

Having a setup bash file seems reasonable. However, the current function of setup.sh can be handled in a safer and (in my opinion) cleaner manner (e.g. with a try statement).

Adding testing

To simplify development we should probably have some basic tests. Could the code for the practice problem be refactored and then used for this?

Deploy script

Zip the source files (the judge wants all of them, and accepts only one file).

We could also copy our best files for each test case to a release folder and open that folder if we want.

Cache data from precomputation

The precomputation step of the problems may take considerable time. I suggest adding a precompute function that caches the result.

Proposal:

  • use pickle library
  • add command line option to clear cache
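The proposal could be sketched roughly as follows. The helper name cached_precompute, the clear flag, and the choice to key the cache on a hash of the input are assumptions for illustration, not an agreed design.

```python
import hashlib
import os
import pickle
import tempfile


def cached_precompute(inp, compute, cache_dir, clear=False):
    """Return compute(inp), reusing a pickled result when one exists."""
    os.makedirs(cache_dir, exist_ok=True)
    # Key the cache file on the input text so each test case gets its own entry.
    key = hashlib.sha256(inp.encode("utf-8")).hexdigest()
    path = os.path.join(cache_dir, key + ".pickle")
    if clear and os.path.exists(path):
        os.remove(path)
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    result = compute(inp)
    with open(path, "wb") as f:
        pickle.dump(result, f)
    return result


if __name__ == "__main__":
    calls = []

    def slow_step(text):
        calls.append(text)  # track how often the real computation runs
        return sorted(text.split())

    cache = tempfile.mkdtemp()
    print(cached_precompute("b a c", slow_step, cache))
    print(cached_precompute("b a c", slow_step, cache))  # served from the cache
    print(len(calls))
```

Only unpickle files your own code wrote: loading a pickle from an untrusted source can execute arbitrary code.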
