avsecz / gin-train
Tracking ML experiments using gin-config, wandb, comet.ml and S3.
License: MIT License
It seems that, due to the way argh returns the serialized dictionary, the exit code is 120 instead of 0, which doesn't work well with Snakemake.
Since there are multiple platforms available for tracking ML experiments, it would be nice to have an abstract ExperimentLogger() class with platform-specific implementations wrapping each tracking service.
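A minimal sketch of what such an abstraction could look like (all class and method names here are hypothetical, not part of the current codebase); concrete wandb / comet.ml implementations would wrap their respective client objects:

```python
from abc import ABC, abstractmethod


class ExperimentLogger(ABC):
    """Platform-agnostic interface for experiment tracking."""

    @abstractmethod
    def log_params(self, params: dict):
        """Record the run's hyper-parameters."""

    @abstractmethod
    def log_metric(self, name: str, value: float):
        """Record one metric value (possibly per-epoch)."""


class DictLogger(ExperimentLogger):
    """In-memory implementation, useful for tests; wandb or comet.ml
    implementations would forward these calls to the client library."""

    def __init__(self):
        self.params, self.metrics = {}, {}

    def log_params(self, params):
        self.params.update(params)

    def log_metric(self, name, value):
        self.metrics.setdefault(name, []).append(value)
```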
Add a module external (not imported by default), where all the classes from external libraries such as fastai would be imported.
External libraries
The main goal would be to define an Objective class, analogous to kopt's CompileFN, but using gin-config. The arguments of Objective would be the same as those of gin_train, except that the gin-config files would be normal gin files which are overridden using gin bindings (https://github.com/Avsecz/gin-train/blob/master/gin_train/cli/gin_train.py#L187). E.g. either pass them to parse_config_files_and_bindings as bindings, or specify them with:
gin.bind_parameter('supernet.num_layers', 5)
gin.bind_parameter('supernet.weight_decay', 1e-3)
where the values can be any valid Python object (lists, tuples, dicts, strings). Note that if we use bind_parameter, then gin.finalize() should only be called after all the parameters have been bound.
import json

def config2bindings(config):
    """Serialize a flat config dict into gin binding strings."""
    return [f"{k} = {json.dumps(v)}" for k, v in config.items()]

config = {"asd": [1, 2, 3],
          "dsa": 10,
          "dsads": "11",
          "dsasdsadas": {"a": 1}}

bindings = config2bindings(config)
for p in bindings:
    print(p)
# asd = [1, 2, 3]
# dsa = 10
# dsads = "11"
# dsasdsadas = {"a": 1}
Both approaches assume that the dictionary is solely a key-value mapping; values may be dictionaries or lists, but these will not be interpreted as nested variables.
Note: note_params should be used to keep track of the hyper-parameter optimization study and the run id.
Additional arguments to Objective
Multiple different Objective versions would need to be implemented, one for each hyper-parameter optimization system.
Supported backends:
RayObjective(...)
HyperoptObjective
For more advanced scenarios we would probably need to implement the Trainable class ourselves.
Instead of Experiment.log_multiple_params, use Experiment.log_parameters.
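One way to stay compatible with both comet_ml versions is a small shim that prefers the newer method name and falls back to the old one (the helper name is hypothetical; `experiment` is a comet_ml Experiment-like object):

```python
def log_params_compat(experiment, params):
    """Call Experiment.log_parameters if it exists, otherwise fall back
    to the older Experiment.log_multiple_params."""
    fn = getattr(experiment, "log_parameters", None) or experiment.log_multiple_params
    fn(params)
```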
output_dir should allow a full S3 or GCS path.
output_dir can be a comma-separated list of output directories, including S3 paths.
Implement a wrapper to write the output to multiple locations.
Use pyfilesystem2 for easier writes (pass the directories):
https://www.pyfilesystem.org/page/s3fs/
from fs import open_fs

# S3 (pip install fs-s3fs)
s3fs = open_fs('s3://mybucket')
s3fs.listdir('/')
# GCS (pip install fs-gcsfs)
gcsfs = open_fs("gs://mybucket/root_path?strict=False")
# SSH (pip install fs.sshfs)
my_fs = open_fs("ssh://[user[:password]@]host[:port]/[directory]")
Allow the user to specify only the local folder where to save the results, while auto-generating the final folder name.
Flag name: --auto-subdir
copy_fs(): copy the results to the final destination(s).
output_file: create a temporary directory in /tmp/ first.
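The steps above (auto-generated sub-directory, write to a temporary directory, then copy everywhere) can be sketched with the standard library alone; with pyfilesystem2 the copy step would instead use fs.copy.copy_fs so that s3://, gs:// and ssh:// destinations work the same way. The function names and the sub-directory naming scheme are assumptions:

```python
import os
import shutil
import tempfile
import datetime
import uuid


def auto_subdir(base_dir):
    """Generate a unique run sub-directory name under base_dir
    (timestamp + random suffix; the exact format is an assumption)."""
    stamp = datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
    return os.path.join(base_dir, f"{stamp}_{uuid.uuid4().hex[:8]}")


def write_outputs(write_fn, output_dirs):
    """Run write_fn against a temporary directory, then copy the result
    to every destination in output_dirs (local paths here; pyfilesystem2's
    copy_fs would generalize this to remote filesystems)."""
    with tempfile.TemporaryDirectory() as tmp:
        write_fn(tmp)
        for dest in output_dirs:
            shutil.copytree(tmp, dest, dirs_exist_ok=True)
```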
Add a command which gathers all the experiments in a folder into a single table (CSV file), similar to the table in kopt. This can then be easily imported into Google Sheets.
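A sketch of such a gathering command, assuming each run directory stores its results in a metrics.json file (the file name, layout, and function name are assumptions):

```python
import csv
import json
from pathlib import Path


def gather_experiments(root_dir, out_csv):
    """Collect per-run metrics.json files under root_dir into one CSV."""
    rows = []
    for metrics_file in sorted(Path(root_dir).glob("*/metrics.json")):
        row = {"run": metrics_file.parent.name}
        row.update(json.loads(metrics_file.read_text()))
        rows.append(row)
    # union of all columns, so runs with differing metrics still align
    fieldnames = sorted({k for r in rows for k in r})
    with open(out_csv, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return rows
```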
https://github.com/IDSIA/sacred
sacred is an alternative to wandb.io for managing training runs.
It can store the model together with a source code copy in mongodb.
[Omniboard](https://github.com/vivekratnavel/omniboard) is a very good web frontend for it.
Advantages:
Add https://github.com/anderskm/gputil to auto-schedule model training:
import GPUtil

if args.gpu == -1:
    # pick the first available GPU
    gpu = GPUtil.getFirstAvailable(attempts=3, includeNan=True)[0]
else:
    gpu = args.gpu
Also add the GPU memory fraction to use; the value is currently hard-coded to 0.5.
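A sketch of how the fraction could be exposed as a CLI flag instead of the hard-coded 0.5 (the flag names are assumptions); the parsed value would then be forwarded to the framework's session configuration, e.g. TensorFlow's per_process_gpu_memory_fraction:

```python
import argparse


def build_parser():
    """CLI flags for GPU selection and memory allocation (sketch)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--gpu", type=int, default=-1,
                        help="GPU id; -1 picks the first available one")
    parser.add_argument("--memory-fraction", type=float, default=0.5,
                        help="fraction of GPU memory to allocate")
    return parser
```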