gin-train's People

Contributors

avsecz, hoeze, stefanches7


gin-train's Issues

Decorate external libraries

Add a module external (not imported by default) from which all the relevant classes from libraries such as fastai would be imported and registered

  • maybe one could have a separate python package for each

External libraries

  • pytorch
  • fastai
  • keras
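
As a sketch of the idea (not existing gin-train code): gin-config's mechanism for this is gin.external_configurable, which registers classes from third-party libraries under configurable names. The snippet below illustrates the same pattern with a plain dict registry so it needs neither gin nor the external library installed; all names in it are hypothetical.

```python
# Hypothetical sketch of an "external" module that registers classes from
# third-party libraries so configs can refer to them by name. gin-config's
# real mechanism is gin.external_configurable; a plain dict registry stands
# in for it here so the example has no dependencies.
REGISTRY = {}

def external_configurable(cls, name=None):
    """Register an external class under a configurable name."""
    REGISTRY[name or cls.__name__] = cls
    return cls

# Stand-in for e.g. torch.optim.Adam (torch itself is not imported here).
class Adam:
    def __init__(self, lr=1e-3):
        self.lr = lr

external_configurable(Adam, "optim.Adam")

# A config can now instantiate the class by name:
opt = REGISTRY["optim.Adam"](lr=0.01)
print(opt.lr)  # 0.01
```

One registry module per external library (gin_train/external/pytorch.py, .../fastai.py, ...) would keep the optional dependencies isolated, matching the "separate python package for each" idea above.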

Hyper-parameter optimization

The main goal would be to define an Objective analogous to kopt's CompileFN, but now using gin-config. The arguments of Objective would be the same as the arguments to gin_train, except that the gin-config files would be normal gin files which would be overridden using gin bindings (https://github.com/Avsecz/gin-train/blob/master/gin_train/cli/gin_train.py#L187). E.g. either pass them to parse_config_files_and_bindings as bindings or specify them via:

gin.bind_parameter('supernet.num_layers', 5)
gin.bind_parameter('supernet.weight_decay', 1e-3)

where the values can be any valid Python object (lists, tuples, dicts, strings). Note that if we use bind_parameter, then finalize() should only be called after all parameters have been bound.

import json

def config2bindings(config):
    return [f"{k} = {json.dumps(v)}" for k, v in config.items()]

config = {"asd": [1, 2, 3],
          "dsa": 10,
          "dsads": "11",
          "dsasdsadas": {"a": 1}}

bindings = config2bindings(config)

for p in bindings:
    print(p)
# asd = [1, 2, 3]
# dsa = 10
# dsads = "11"
# dsasdsadas = {"a": 1}

Both approaches assume that the dictionary is solely a flat key-value mapping; its values may themselves be dictionaries or lists, but these will not be interpreted as nested variables.
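
If one did want nested dictionaries interpreted as nested parameter names (e.g. supernet.num_layers), a small helper could flatten them into dotted keys before building the bindings. A sketch, not part of gin-train:

```python
import json

def flatten_config(config, prefix=""):
    """Flatten nested dicts into dotted keys, e.g.
    {"supernet": {"num_layers": 5}} -> {"supernet.num_layers": 5}.
    Lists and scalars are left untouched."""
    out = {}
    for k, v in config.items():
        key = f"{prefix}.{k}" if prefix else k
        if isinstance(v, dict):
            out.update(flatten_config(v, key))
        else:
            out[key] = v
    return out

config = {"supernet": {"num_layers": 5, "weight_decay": 1e-3}}
bindings = [f"{k} = {json.dumps(v)}" for k, v in flatten_config(config).items()]
print(bindings)
# ['supernet.num_layers = 5', 'supernet.weight_decay = 0.001']
```

The trade-off is that a dict can then no longer be passed as a literal value, which is why the flat interpretation above is the default assumption.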

Note: note_params should be used to keep track of the hyper-parameter optimization study and the run id.

Additional arguments to Objective

  • objective_metric="acc" # which metric to optimize for; can be nested if there are multiple metrics
  • objective_metric_mode="max"
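
A sketch of how a possibly-nested objective_metric could be resolved against an evaluation-result dict (the helper is hypothetical, not existing gin-train code):

```python
def get_objective_metric(metrics, objective_metric):
    """Resolve objective_metric against a (possibly nested) metrics dict.
    objective_metric may be a plain key ("acc") or a sequence of keys
    ("val", "acc") when the metrics dict is nested."""
    if isinstance(objective_metric, str):
        objective_metric = (objective_metric,)
    value = metrics
    for key in objective_metric:
        value = value[key]
    return value

metrics = {"train": {"acc": 0.91}, "val": {"acc": 0.87}}
print(get_objective_metric(metrics, ("val", "acc")))  # 0.87
```

objective_metric_mode="max" would then decide whether the optimizer maximizes or minimizes the resolved value.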

Multiple different Objective versions would need to be implemented, one for each hyper-parameter optimization system.

Supported backends:

  • ray tune - RayObjective(...)
    • which also supports HyperOpt
  • (maybe) hyperopt - HyperoptObjective

For more advanced scenarios we would probably need to implement the Trainable class ourselves.

Use log_parameters

Instead of Experiment.log_multiple_params, use Experiment.log_parameters.

Upload trained models to S3 etc

Design decisions

Saving to different backends

  • output_dir allows the full S3 or GCS path

  • output_dir can be a comma-separated list of output directories, including S3

  • Implement a wrapper that writes the output to multiple locations

  • Use pyfilesystem2 for easier writes (pass the directories)

https://www.pyfilesystem.org/page/s3fs/

from fs import open_fs

# s3
# pip install fs-s3fs
s3fs = open_fs('s3://mybucket')
s3fs.listdir('/')

# gcs
# pip install fs-gcsfs
gcsfs = open_fs("gs://mybucket/root_path?strict=False")

# ssh
# pip install fs.sshfs
my_fs = open_fs("ssh://[user[:password]@]host[:port]/[directory]")

Caveats
  • If S3 or other filesystems don't allow appending, then only write files once they are complete
    • is there a way to buffer the writes?
  • alternatively, write to a local disk and then upload all the results to S3 at the end
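
The second caveat suggests a simple design: write everything locally during training and mirror it to the remote locations at the end. A dependency-free sketch of such a wrapper, with plain directories standing in for S3 (in gin-train, remote schemes like s3:// or gs:// would be handled via pyfilesystem2 instead of shutil; all names here are hypothetical):

```python
import shutil
from pathlib import Path

def upload_dir(local_dir, output_dirs):
    """Copy every file under local_dir into each output directory,
    preserving the relative layout. output_dirs could come from a
    comma-separated output_dir argument."""
    local_dir = Path(local_dir)
    for out in output_dirs:
        out = Path(out)
        for f in local_dir.rglob("*"):
            if f.is_file():
                dest = out / f.relative_to(local_dir)
                dest.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(f, dest)
```

Since each file is copied only once it already exists in full, this also sidesteps the append limitation of object stores.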

Adding a random prefix

Allow the user to specify only the local folder where the results should be saved, while auto-generating the final sub-folder name

Flag name: --auto-subdir
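
A sketch of what --auto-subdir could generate; the exact format (timestamp plus a short random suffix) is an assumption:

```python
import uuid
from datetime import datetime
from pathlib import Path

def auto_subdir(base_dir):
    """Create and return a unique sub-directory of base_dir,
    e.g. results/2019-10-04_12-14-41_a1b2c3d4."""
    name = (datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
            + "_" + uuid.uuid4().hex[:8])
    path = Path(base_dir) / name
    path.mkdir(parents=True, exist_ok=False)  # fail loudly on collision
    return path
```

The timestamp keeps runs sortable by start time, while the random suffix avoids collisions when several runs start within the same second.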

Notes

  • test if you can write to the remote file-system before training the model
    • write out the hyper-parameters

Authentication

  • use environment variables
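
For S3 this would mean picking up the standard AWS environment variables rather than storing credentials in configs; combined with the note above, a sketch of an early check before training starts (the helper itself is hypothetical):

```python
import os

def check_s3_credentials():
    """Fail early if the standard AWS environment variables are missing,
    so the problem surfaces before the model is trained."""
    required = ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"]
    missing = [v for v in required if not os.environ.get(v)]
    if missing:
        raise RuntimeError(f"Missing credentials in environment: {missing}")
```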

Update

Add gt-gather command

Add a command which gathers all the experiments in a folder into a single table (CSV file), similar to the table in kopt. This can then be easily imported into Google Sheets.
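
A sketch of what gt-gather could do, assuming each experiment sub-folder contains a metrics.json file (the file name and layout are assumptions, not the actual gin-train output format):

```python
import csv
import json
from pathlib import Path

def gather(experiments_dir, out_csv):
    """Collect metrics.json from each experiment sub-folder into one CSV,
    one row per experiment."""
    rows = []
    for metrics_file in sorted(Path(experiments_dir).glob("*/metrics.json")):
        row = {"experiment": metrics_file.parent.name}
        row.update(json.loads(metrics_file.read_text()))
        rows.append(row)
    fieldnames = sorted({k for r in rows for k in r})
    with open(out_csv, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

The resulting CSV is flat, so it can be dropped straight into Google Sheets.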

Sacred support

https://github.com/IDSIA/sacred
Sacred is an alternative to wandb.io for managing training runs.
It can store the model together with a copy of the source code in MongoDB.

[Omniboard](https://github.com/vivekratnavel/omniboard) is a very good web frontend for it.

Advantages:

  • open source / free
  • stores source code if needed
  • can be hosted locally

NAs conversion issue

I ran a simple model with a one-hot encoded sequence as the feature. I've tried both a single float and a NumPy array as the target, but get the following error in both cases:
(screenshot: Screenshot_2019-10-04_12-14-41)
