The current optimizer branch will be merged into master in a few days. This will lead to a new minor beta release with some changes in the ANNs API. The most important change concerns the customization of optimization algorithms: a new Lua class has been defined in the table/namespace ann.optimizer. Instances of trainable.supervised_trainer can receive the optimizer you want to use as their fourth argument. By default the optimizer is ann.optimizer.sgd(), which is exactly what we had before: the implementation of Stochastic Gradient Descent with momentum, weight decay, and max normalization penalty.
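As a minimal sketch of this constructor call (reusing thenet, the loss function, and bunch_size from the example below), the default optimizer can be passed explicitly as the fourth argument:

trainer = trainable.supervised_trainer(thenet,
                                       ann.loss.multi_class_cross_entropy(10),
                                       bunch_size,
                                       ann.optimizer.sgd()) -- explicit optimizer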
This change leads to an important API modification: the optimization algorithm is no longer implemented inside the ANN components, but in a decoupled class. Therefore, the learning parameters (such as learning_rate, momentum, etc.) cannot be defined in the components anymore; they need to be set/get via the optimizer. In order to simplify the API, trainable.supervised_trainer has methods to set/get the optimizer options. The following code is an example of this latter way:
-- the trainer uses ann.optimizer.sgd() when no fourth argument is given
trainer = trainable.supervised_trainer(thenet,
                                       ann.loss.multi_class_cross_entropy(10),
                                       bunch_size)
trainer:build()
-- global options, forwarded to the underlying optimizer
trainer:set_option("learning_rate", learning_rate)
trainer:set_option("momentum", momentum)
trainer:set_option("weight_decay", weight_decay)
-- bias has weight_decay of ZERO
trainer:set_layerwise_option("b.", "weight_decay", 0)
Note that the weight_decay parameter is set layer-wise: every layer takes the global value set by trainer:set_option("weight_decay", weight_decay), but the biases get a weight decay of zero, set with the method trainer:set_layerwise_option("b.", "weight_decay", 0).
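Options can also be read back and changed at run time, for instance to decay the learning rate between epochs. A minimal sketch, assuming a get_option method on the trainer that mirrors set_option:

-- assumed getter, mirroring set_option
local lr = trainer:get_option("learning_rate")
trainer:set_option("learning_rate", lr * 0.99) -- decay before the next epoch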