Dumping a skorch model with dill then reloading it (does not matter if with dill or pi

dill load and sklearn clone result in error

DCoupry commented on May 28, 2024

Comments (7)

BenjaminBossan commented on May 28, 2024 1

Thanks for investigating further. This is super strange IMO, because the _optimizers attribute is empty but _modules and _criteria are not, even though these three attributes are treated exactly the same. Do you know if dill uses __getstate__ and __setstate__ or if it has equivalent methods? Maybe we can salvage something there.

Edit: Just checked it, dill does call __getstate__ and __setstate__, which makes this even more confusing.
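A quick way to check this kind of thing is a probe class whose hooks leave a mark on the restored object. This is only a sketch: `Probe` and `hooks_used` are made-up names, and it is run with the standard-library `pickle` here. Passing `dill.dumps` and `dill.loads` instead should show whether dill goes through the same hooks.

```python
import pickle


class Probe:
    """Toy object whose pickling hooks leave a trace on the restored copy."""

    def __init__(self):
        self.value = 1

    def __getstate__(self):
        # Called at dump time: tag the state that gets serialized.
        state = self.__dict__.copy()
        state["getstate_called"] = True
        return state

    def __setstate__(self, state):
        # Called at load time: restore the state and tag the new instance.
        self.__dict__.update(state)
        self.setstate_called = True


def hooks_used(dumps, loads):
    """Round-trip a Probe and report (getstate_called, setstate_called)."""
    restored = loads(dumps(Probe()))
    return (
        getattr(restored, "getstate_called", False),
        getattr(restored, "setstate_called", False),
    )


print(hooks_used(pickle.dumps, pickle.loads))
```

The marks are stored on the instance rather than on the class, so the check does not depend on whether the serializer restores the original class object or a copy of it.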

DCoupry commented on May 28, 2024 1

I was thinking pdb might be of some help here; I will report back if I find anything. In the meantime, I have found that dumping with byref=True in dill avoids the failure:

# works
dill.loads(dill.dumps(base_model, byref=True)).fit(X, y)
clone(dill.loads(dill.dumps(base_model.fit(X, y), byref=True))).fit(X, y)

BenjaminBossan commented on May 28, 2024

I could not 100% reproduce the issue, thus I had to make some small changes:

from sklearn.datasets import make_regression
from sklearn.base import clone
import numpy as np
import torch
import skorch
import dill

dill.__version__  # 0.3.6

X, y = make_regression()
X, y = X.astype(np.float32), y.astype(np.float32).reshape(-1, 1)  # added
base_model = skorch.NeuralNetRegressor(torch.nn.Linear(100, 1))
cloned_model = clone(base_model)
dumped_model = dill.loads(dill.dumps(base_model))
cloned_dumped_model = clone(dumped_model)

base_model.fit(X, y) # works
cloned_model.fit(X, y) # works
dumped_model.fit(X, y) # THIS ALREADY FAILS FOR ME
cloned_dumped_model.fit(X, y) # fails with same error

First, could you please confirm that my snippet produces the same error for you?

Second, is the error you get also:

...

1225 self.notify("on_batch_begin", batch=batch, training=training)
1226 step = step_fn(batch, **fit_params)
-> 1227 self.history.record_batch(prefix + "_loss", step["loss"].item())
1228 batch_size = (get_len(batch[0]) if isinstance(batch, (tuple, list))
1229 else get_len(batch))
1230 self.history.record_batch(prefix + "_batch_size", batch_size)

TypeError: 'NoneType' object is not subscriptable

DCoupry commented on May 28, 2024

I can reproduce it, yes. And indeed the dumped version fails as well.
I am confused, because the dumped model did work for me at one point while the cloned one did not. I am trying to narrow this down.

What does work is dumping with pickle and loading with dill:

import pickle

dumped_model = dill.loads(pickle.dumps(base_model))
dumped_model.fit(X, y)

The error is the same as yours: the loss is None. The process looks okay to me and goes through all the initializations. I have tracked it to the train_step function, where printing the optimizers gives an empty list. But when you take the models themselves and print the pre-fit attributes, everything looks good! Quite frustrating.
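To make the failure mode concrete, here is a toy model of why an empty optimizer list would leave the loss as None. It is a simplified sketch and not skorch's actual code: it assumes the training step is only executed when an optimizer calls it as a closure, and `ToyOptimizer` and `toy_train_step` are hypothetical names.

```python
class ToyOptimizer:
    """Stand-in for an optimizer whose step() runs the closure it is given."""

    def step(self, closure):
        closure()


def toy_train_step(optimizers):
    """Return the last recorded step, or None if no optimizer ran the closure."""
    recorded = {"step": None}

    def step_fn():
        # In a real net this would do the forward and backward pass.
        recorded["step"] = {"loss": 0.5}

    for optimizer in optimizers:
        optimizer.step(step_fn)
    return recorded["step"]


print(toy_train_step([ToyOptimizer()]))  # a step dict with a "loss" entry
step = toy_train_step([])                # no optimizers: the closure never runs
try:
    step["loss"]
except TypeError as exc:
    print(exc)  # 'NoneType' object is not subscriptable
```

Under that assumption, nothing raises while the net is being set up. The first visible symptom is the subscript on a None step, which matches the traceback above.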

DCoupry commented on May 28, 2024

Okay, after some checks:

from sklearn.datasets import make_regression
from sklearn.base import clone
import numpy as np
import torch
import skorch
import dill
import pickle

dill.__version__  # 0.3.6

X, y = make_regression()
X, y = X.astype(np.float32), y.astype(np.float32).reshape(-1, 1)  # added
base_model = skorch.NeuralNetRegressor(torch.nn.Linear(100, 1))
cloned_model = clone(base_model)
dumped_model = dill.loads(dill.dumps(base_model))
dumped_fitted_model = dill.loads(dill.dumps(base_model.fit(X, y)))
cloned_dumped_model = clone(dumped_model)
cloned_dumped_fitted_model = clone(cloned_dumped_model)

base_model.fit(X, y) # works
cloned_model.fit(X, y) # works
dumped_model.fit(X, y) # fails
dumped_fitted_model.fit(X, y) # works
cloned_dumped_fitted_model.fit(X, y) # fails
cloned_dumped_model.fit(X, y) # fails with same error

DCoupry commented on May 28, 2024

Maybe we could record an execution trace of the final fit call for a working model and for a failing one, then diff the two?

BenjaminBossan commented on May 28, 2024

Sorry, I don't understand. How can this be done?
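One possible way to do this with only the standard library is `sys.settrace` plus `difflib`. The sketch below is an illustration on toy functions, not something tested against skorch: `record_calls`, `toy_fit`, `working_step` and `broken_step` are hypothetical names.

```python
import difflib
import sys


def record_calls(func, *args, **kwargs):
    """Run func and return the names of the Python functions it entered, in order."""
    calls = []

    def tracer(frame, event, arg):
        if event == "call":
            calls.append(frame.f_code.co_name)
        return None  # skip line-level tracing inside the frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args, **kwargs)
    except Exception as exc:
        # Keep the trace comparable even when the call fails.
        calls.append(f"raised {type(exc).__name__}")
    finally:
        sys.settrace(previous)
    return calls


def working_step():
    return {"loss": 0.5}


def broken_step():
    return None


def toy_fit(step_fn):
    return step_fn()["loss"]


good = record_calls(toy_fit, working_step)
bad = record_calls(toy_fit, broken_step)
for line in difflib.unified_diff(good, bad, "works", "fails", lineterm=""):
    print(line)
```

For the models in this thread the calls would presumably be `record_calls(base_model.fit, X, y)` and `record_calls(dumped_model.fit, X, y)`. Tracing a full fit is slow and produces a long list, so filtering on `frame.f_code.co_filename` (for example, keeping only files under the skorch package) would likely be needed to get a readable diff.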
