MLJ
A pure Julia machine learning framework.
MLJ aims to be a flexible framework for combining and tuning machine learning models, written in Julia, a high-performance, rapid-development scientific programming language. MLJ is a work in progress, and new collaborators are being sought.
Click here if you are interested in contributing.
The MLJ project is partly inspired by MLR (recent slides, 7/18). For an earlier proof-of-concept, see this branch and this poster summary.
Packages wishing to implement the MLJ interface for their algorithms should import MLJBase.
Installation
In the Julia REPL:
]add "https://github.com/wildart/TOML.jl"
add "https://github.com/alan-turing-institute/MLJBase.jl"
add "https://github.com/alan-turing-institute/MLJModels.jl"
add "https://github.com/alan-turing-institute/MLJ.jl"
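For scripted or non-interactive setups, the same installation can probably be done through the functional Pkg API instead of the Pkg REPL. The following is a minimal sketch, assuming a Julia 1.x session with network access; the `urls` variable is only an illustrative name, and the URLs and their order are copied from the REPL commands above.

```julia
# Sketch only: a non-interactive equivalent of the Pkg REPL commands above.
using Pkg

# Same repository URLs, in the same order, as in the REPL instructions.
urls = [
    "https://github.com/wildart/TOML.jl",
    "https://github.com/alan-turing-institute/MLJBase.jl",
    "https://github.com/alan-turing-institute/MLJModels.jl",
    "https://github.com/alan-turing-institute/MLJ.jl",
]

# Add the packages one at a time, mirroring the order of the REPL commands.
for url in urls
    Pkg.add(Pkg.PackageSpec(url=url))
end
```

Once this completes, running `using MLJ` in the same session is a quick check that the package loads.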
Alternatively, try it out now using our Docker image.
A Docker image is provided, with instructions on how to set it up.
Running Docker
To run the docker image you can simply call
docker run -p 8888:8888 ysimillides/mlj-docker
This opens port 8888 on your localhost (localhost:8888), from where you can access the container/notebook. ames.ipynb has been provided as an example.
Features to include:
- Automated tuning of hyperparameters, including composite models with nested parameters. Tuning implemented as a wrapper, allowing composition with other meta-algorithms. ✔
- Option to tune hyperparameters using gradient descent and automatic differentiation (for learning algorithms written in Julia).
- Data agnostic: Train models on any data supported by the Tables.jl interface. ✔
- Intuitive syntax for building arbitrarily complicated learning networks. ✔
- Learning networks can be exported as self-contained composite models ✔, but common networks (e.g., linear pipelines, stacks) come ready to plug-and-play.
- Performant parallel implementation of large homogeneous ensembles of arbitrary models (e.g., random forests). ✔
- Task interface matches machine learning problem to available models.
- Benchmarking a battery of assorted models for a given task.
- Automated estimates of CPU and memory requirements for given task/model.
Frequently Asked Questions
See here.
Known issues
- The ScikitLearn SVM models will not work under Julia 1.0.3 but do work under Julia 1.1, due to Issue #29208.
Getting started
Get started with MLJ, or take a tour of some of the features implemented so far.
History
Predecessors of the current package are AnalyticalEngine.jl, Orchestra.jl, and Koala.jl. Work continued as a research study group at the University of Warwick, beginning with a review of existing ML modules that were available in Julia at the time (in-depth, overview).
Further work culminated in the first MLJ proof-of-concept.