The source is available here, and the original paper is titled "Reproducible and Efficient Benchmarks for Hyperparameter Optimization of Neural Machine Translation Systems".
As the raw datasets are not easy to handle directly, this repository provides:
- Preprocessing of the raw datasets
- An easy-to-use API
The processed datasets are available in datasets/.
To regenerate the processed datasets from the raw datasets, run the following:
$ python -m data_gen.raw_to_json
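To get a quick look at the processed files, a small helper like the one below may be enough. It is only a sketch under assumptions: it assumes the files under datasets/ are plain JSON with a .json extension (inferred from the module name raw_to_json; the actual file names and layout are not documented here), and load_processed_datasets is a hypothetical name, not part of this repository's API.

```python
import json
from pathlib import Path
from typing import Any, Dict


def load_processed_datasets(root: str = "datasets") -> Dict[str, Any]:
    """Load every *.json file directly under `root`, keyed by file stem.

    Hypothetical helper: assumes the processed datasets are JSON files.
    """
    loaded: Dict[str, Any] = {}
    for path in sorted(Path(root).glob("*.json")):
        with path.open(encoding="utf-8") as f:
            loaded[path.stem] = json.load(f)
    return loaded


if __name__ == "__main__":
    # Print the name and top-level Python type of each file that was found.
    for name, content in load_processed_datasets().items():
        print(name, type(content).__name__)
```

If the directory does not exist or holds no .json files, the helper returns an empty dictionary rather than raising an error.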
A simple example is available in examples/.
To try the example script, run the following:
$ python -m examples.example_query