skorch-dev / skorch
A scikit-learn compatible neural network library that wraps PyTorch
License: BSD 3-Clause "New" or "Revised" License
Other candidates for this are:
300MiB is a lot to download every time.
The example should include
Investigate which options are stored here and what we should do with this parameter.
Currently, the Scoring callback calculates the score on each batch and averages over all batches for the epoch score. For some scores, however, this leads to inaccurate results (e.g. AUC). It would be better to score on the whole validation set at once.
To achieve this, the callback could store all predictions from the batches and score on_epoch_finished. It might be better, though, if the NeuralNet did it, so that if we have more than one score that uses the predictions, the predictions don't need to be made twice. A sketch of the callback variant follows.
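A minimal sketch of the callback option, assuming validation predictions are cached per batch and scored once per epoch; the class name EpochScoring, the keyword arguments of on_batch_end, and net.history.record are assumptions for illustration, not existing API:

import numpy as np
from sklearn.metrics import roc_auc_score
from inferno.callbacks import Callback

class EpochScoring(Callback):
    def on_epoch_begin(self, net, **kwargs):
        self.y_trues_, self.y_preds_ = [], []

    def on_batch_end(self, net, y=None, y_pred=None, training=False, **kwargs):
        if not training:  # only cache validation batches
            self.y_trues_.append(np.asarray(y))
            self.y_preds_.append(np.asarray(y_pred))

    def on_epoch_end(self, net, **kwargs):
        y_true = np.concatenate(self.y_trues_)
        y_pred = np.concatenate(self.y_preds_)
        # score once on the whole validation set, e.g. AUC
        net.history.record('valid_auc', roc_auc_score(y_true, y_pred))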
Possibly in a Jupyter notebook.
r0.1.0
It would be helpful to have the ability to set parameters beyond module level (for sub-components of the module, for example):
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, encoder, decoder, **kwargs):
        super().__init__()
        self.encoder = encoder
        self.decoder = decoder

class Encoder(nn.Module):
    def __init__(self, num_hidden=100):
        super().__init__()
        self.num_hidden = num_hidden
        self.lin = nn.Linear(1, num_hidden)

net = NeuralNet(
    module=Seq2Seq(encoder=AttentionEncoderRNN, decoder=DecoderRNN),
    module__encoder__num_hidden=23,
)
I would expect module.encoder.num_hidden to be set to 23. This should be robust with respect to the initialization of the sub-module; for example, if the encoder has elements that depend on the initialized value, those elements should be updated as well. In the given example, I would expect not only module.encoder.num_hidden to be updated to 23 but also that module.encoder.lin.out_features is updated (e.g. by re-initializing the whole module). A usage sketch follows.
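A hypothetical usage sketch of the desired behavior; set_params is the standard scikit-learn mechanism, but the nested re-initialization shown here is the requested feature, not existing behavior:

net.set_params(module__encoder__num_hidden=42)
# expected afterwards:
assert net.module_.encoder.num_hidden == 42
assert net.module_.encoder.lin.out_features == 42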
NeuralNet currently only initializes Dataset with X, y, use_cuda, but we may have more parameters. The user should be able to pass them the same way as for criterion etc. (i.e. via the prefixes_), as sketched below.
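A sketch of the desired API, analogous to the criterion__ prefixes; MyModule, MyDataset, and the length parameter are hypothetical names for illustration:

net = NeuralNet(
    module=MyModule,
    dataset=MyDataset,
    dataset__length=100,  # forwarded to MyDataset.__init__, like criterion__* params
)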
We have some ugly pieces of code that are necessary to make our code work with default_collate. They relate to the problem that default_collate picks out values one at a time, which makes it hard to work with 1-dim arrays (e.g. to cast them to cuda).
The corresponding pieces of code are:
_prepare_target_for_loss
NeuralNetRegressor
Current state of different things (both = class and object supported, class = only class supported). We should find a consistent scheme for this (either only initialized, always both, only class, ...).
This is a similar issue as with regression and 1-dimensional target data, namely that default_collate unpacks the contents of the array (int64) and then the .cuda() call fails on int64. It does not happen with 2-dimensional arrays, but for binary classification, we can't use an n x 1 array, since that conflicts with StratifiedKFold.
When the wrapper is initialized with unknown keys, the following AssertionError is raised:
Code/skorch/skorch/net.py in __init__(self, module, criterion, optimizer, lr, gradient_clip_value, gradient_clip_norm_type, max_epochs, batch_size, iterator_train, iterator_valid, dataset, train_split, callbacks, cold_start, verbose, use_cuda, **kwargs)
234 assert not hasattr(self, key)
235 key_has_prefix = any(key.startswith(p) for p in self.prefixes_)
--> 236 assert key.endswith('_') or key_has_prefix
237 vars(self).update(kwargs)
238
AssertionError:
To reproduce this, initialize a wrapper with iterator_test__batch_size=32 as a parameter. Since the correct key would be iterator_valid, this code fails with the aforementioned error.
There should at least be a detailed, helpful error message, for example along the lines of the sketch below.
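A minimal sketch of a friendlier check that could replace the bare assertions; the exact wording and the choice of TypeError are assumptions:

for key in kwargs:
    is_known = key.endswith('_') or any(
        key.startswith(p) for p in self.prefixes_)
    if not is_known:
        raise TypeError(
            "__init__() got an unexpected argument {!r}; valid prefixes "
            "are: {}".format(key, ', '.join(self.prefixes_)))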
There's currently no way to use a custom Sampler.
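A hedged sketch of what passing a custom sampler could look like, reusing the iterator_train prefix; whether the train iterator accepts a sampler argument this way is an assumption, not current behavior:

from torch.utils.data.sampler import WeightedRandomSampler

weights = [1.0] * 100  # one weight per training sample
net = NeuralNet(
    module=MyModule,  # hypothetical module
    iterator_train__sampler=WeightedRandomSampler(weights, num_samples=100),
)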
For example:
inferno.dataset.get_len([[(1,2),(2,3)],[(4,5)], [(7,8,9)]])
expected: 3
actual: ValueError: Dataset does not have consistent lengths.
Another example:
inferno.dataset.get_len([[(1,2),(2,3)],[(4,5)], [(7,8)]])
expected: 3
actual: 2 (length of tuples)
A workaround is to convert the list into a numpy array.
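For instance (assuming get_len measures an object array by its first dimension, which is the point of the workaround):

import numpy as np

data = [[(1, 2), (2, 3)], [(4, 5)], [(7, 8, 9)]]
arr = np.array(data, dtype=object)  # ragged, hence dtype=object
inferno.dataset.get_len(arr)  # 3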
There should be an easy way to add Lx regularization (e.g. L1 or L2); a possible workaround is sketched below.
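A hedged sketch of a workaround that overrides get_loss; the get_loss signature is an assumption, and an explicit API would still be preferable:

class RegularizedNet(NeuralNet):
    def __init__(self, *args, lambda1=0.01, **kwargs):
        super().__init__(*args, **kwargs)
        self.lambda1 = lambda1

    def get_loss(self, y_pred, y_true, X=None, training=False):
        loss = super().get_loss(y_pred, y_true, X=X, training=training)
        # add an L1 penalty over all module parameters
        loss = loss + self.lambda1 * sum(
            p.abs().sum() for p in self.module_.parameters())
        return loss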
No longer needed.
predict will currently take the argmax of dimension 1. This is very specific, despite the NeuralNet class being intended for generic use cases. I see 2 solutions; one is to have predict return the plain output of forward (thus making no assumption of what that is).
For example:
class Foo(inferno.callbacks.Callback):
def on_epoch_end(self, net, **kwargs):
pass
net = NeuralNet(..., callbacks=[Foo])  # note: the class, not an instance
Error:
Traceback (most recent call last):
File "train.py", line 189, in <module>
pl.fit(corpus.train[:1000], corpus.train[:1000])
File "/home/ottonemo/anaconda3/lib/python3.6/site-packages/sklearn/model_selection/_search.py", line 945, in fit
return self._fit(X, y, groups, ParameterGrid(self.param_grid))
File "/home/ottonemo/anaconda3/lib/python3.6/site-packages/sklearn/model_selection/_search.py", line 550, in _fit
base_estimator = clone(self.estimator)
File "/home/ottonemo/anaconda3/lib/python3.6/site-packages/sklearn/base.py", line 69, in clone
new_object_params[name] = clone(param, safe=False)
File "/home/ottonemo/anaconda3/lib/python3.6/site-packages/sklearn/base.py", line 57, in clone
return estimator_type([clone(e, safe=safe) for e in estimator])
File "/home/ottonemo/anaconda3/lib/python3.6/site-packages/sklearn/base.py", line 57, in <listcomp>
return estimator_type([clone(e, safe=safe) for e in estimator])
File "/home/ottonemo/anaconda3/lib/python3.6/site-packages/sklearn/base.py", line 67, in clone
new_object_params = estimator.get_params(deep=False)
TypeError: get_params() missing 1 required positional argument: 'self'
Probable cause: get_params recursively inspects all attributes of the wrapper instance, including self.callbacks, which still contains the uninitialized callbacks. It then calls get_params, which does not work as it is not a static method.
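On the user side, the immediate workaround is to pass an instance instead of the class, which gives clone something it can call get_params on:

net = NeuralNet(..., callbacks=[Foo()])  # instance instead of class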
Currently, we warn that CUDA is not supported, but the model still has use_cuda=True. This could possibly be fixed by making use_cuda a positional parameter instead of a keyword parameter.
get_iterator blows up at https://github.com/dnouri/inferno/blob/169e1a0/inferno/net.py#L310 in case an sklearn CV split is used and a 1-dimensional torch tensor is fed for X and y.
For example:
pl = GridSearchCV(trainer, params)
pl.fit(corpus.train, corpus.train)
It would be nice to have n_jobs=2 on a system with 2 GPUs, with each job dispatched to its own GPU.
Requirements:
- use torch.save to save the model data
Open questions:
- when should a checkpoint be written, and should that be configurable (e.g. key='valid_loss_best')?
- should the target file name support placeholders (e.g. {epoch} or {unique_run_id})? This might be useful when doing grid search, where runs would otherwise override each other's checkpoints.
I already computed scores in my loss function and now I want to record them so that I can print them per epoch. For example:
class MyNet(NeuralNet):
    def get_loss(self, y_pred, y_true, X=None, training=False):
        self.history.record_batch('foo', 42)  # record a precomputed score
        return super().get_loss(y_pred, y_true, X=X, training=training)

net = MyNet(callbacks=[
    inferno.callbacks.Scoring('foo'),
])
However, the Scoring callback calls its score method on each batch end and overwrites the value "foo", and there is no way to properly disable this behavior.
There is a workaround, though an ugly one:
def ignore_scorer(*_): raise KeyError()
net = MyNet(callbacks=[
inferno.callbacks.Scoring('foo', scoring=ignore_scorer),
])
We should cover this case.
Currently, color highlighting is homegrown. We could use a package instead, e.g. https://github.com/tartley/colorama.
Advantages:
Disadvantages:
We need a method (possibly on the wrapper class) to initialize the random state for all components that are concerned with sampling. These include, presumably, Python's random module, numpy, and torch (including CUDA).
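A hedged sketch of what such a method could look like; the name set_random_state and the exact set of seeded components are assumptions:

import random
import numpy as np
import torch

def set_random_state(self, seed):
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    if self.use_cuda:
        torch.cuda.manual_seed_all(seed)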
Currently, net._yield_callbacks discerns between PrintLog and other callbacks, with the effect that PrintLog is added last to the list of callbacks so it has access to all processed values added by other callbacks. Maybe we should generalize this by classifying callbacks into two groups: processing callbacks and output callbacks.
Output callbacks (identified by inheriting from an abstract subclass of Callback) are by default appended to the end of the callback list. This would pave the way for other output callbacks besides PrintLog, such as TensorBoard logging callbacks.
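A minimal sketch of this classification; the name OutputCallback and the assumption that self.callbacks holds plain callback objects are illustrative only:

class OutputCallback(Callback):
    """Marker base class for callbacks that only consume processed values."""

def _yield_callbacks(self):
    # yield processing callbacks first, output callbacks last
    deferred = []
    for cb in self.callbacks:
        if isinstance(cb, OutputCallback):
            deferred.append(cb)
        else:
            yield cb
    yield from deferred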
It should be possible to explicitly dispatch a model on multiple GPUs. This probably affects data operations and .cuda() calls.
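One possible mechanism is PyTorch's built-in DataParallel; whether and where the wrapper should apply it is an open question:

import torch.nn as nn

# wrap the initialized module so batches are split across both GPUs
net.module_ = nn.DataParallel(net.module_, device_ids=[0, 1])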
Currently, the module_ is not automatically moved to CUDA even if use_cuda=True. This is unexpected and should change.
This is what @ottonemo has to say about this:
I suppose so, yes. I was worried that it might interfere with settings that add parameters to the module after the point where we automatically apply .cuda() to the model, which would result in these parameters being excluded from the type conversion. One solution would be to do this conversion every time training starts (as is the case here) and mention in the documentation that in certain cases the user might have to call .cuda() on the model themselves.
In short, my suggestion is: implement self.module_.cuda() in on_train_begin of the base class and leave a comment somewhere (where?) in a docstring.
My suggestion: when a parameter is set on module_, the module needs to be re-initialized using the initialize_module method. We could move the .cuda() call to the end of this method, as sketched below.
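A sketch of that suggestion; _get_params_for is an assumed name for whatever collects the module__* parameters:

def initialize_module(self):
    module_params = self._get_params_for('module')  # assumed helper
    self.module_ = self.module(**module_params)
    if self.use_cuda:
        self.module_.cuda()  # move right after (re-)initialization
    return self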
I have another fear, though. What if the user wants part of the module and data to be on CUDA and part on the CPU? I guess we need to make sure that, as long as use_cuda=False, we don't call .cuda() anywhere ourselves.
There should be a CI service that checks new pull requests for errors.
NeuralNet.fit should use y=None by default to support arbitrary data loaders. NeuralNetClassifier.fit should require y.
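A sketch of the proposed signatures (assumed, not current code):

class NeuralNet:
    def fit(self, X, y=None, **fit_params):
        ...

class NeuralNetClassifier(NeuralNet):
    def fit(self, X, y, **fit_params):  # y is mandatory here
        return super().fit(X, y=y, **fit_params)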