Comments (10)
I've done some research. In Python 3, int is unlimited [sic] in magnitude on all platforms. numpy has a different set of numeric types which correspond to C sizes. Most are fixed in size but a few differ in size by implementation (e.g. np.intc). The size of a Tensor type is the same on all platforms.
As you've stated, the issue is that the dataset returns a numpy data type (np.intc?) whose size is implementation-dependent. The size can differ according to machine architecture, OS, C compiler, and other factors, so you can't make any assumptions about the size of an np.intc. It could even differ on the same system with the same C compiler. Using np.int32 would make for consistent processing across all platforms -- all platforms would raise an error because a LongTensor is expected. A LongTensor is always 64 bits.
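To see what sizes a given machine actually uses, something like the following should work. It is only a minimal sketch: `itemsize_bits` is a helper name made up for this comment, and the widths printed for np.intc and np.int_ will depend on the platform and NumPy version.

```python
import numpy as np


def itemsize_bits(dtype):
    """Return the width of a NumPy dtype in bits."""
    return np.dtype(dtype).itemsize * 8


# np.intc and np.int_ follow C types, so their width can vary by platform;
# np.int32 and np.int64 are fixed-width everywhere.
for name in ("intc", "int_", "int32", "int64"):
    print(name, itemsize_bits(getattr(np, name)))
```

Comparing the output on Linux and on Windows should show whether the dataset's label dtype is the one that differs.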
There are at least four possible solutions:
- Have the dataset return np.int64.
- For places where LongTensor is expected, such as in model.py, force the type to 64-bit. Note that variable.int64() isn't an implemented attribute. variable.long() is implemented (works for both cpu and gpu) but I'm unsure if it guarantees 64-bit. I'll post this question on stackoverflow.
- Change the type earlier in the call sequence. This would help a static type checker. It would be more efficient if the variable undergoes multiple type changes.
- Change the type later in the call sequence, at the point where C is called. Change calls to torch._C.* (e.g. torch._C._nn.nll_loss) to coerce to the required C data type.
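The first option could look roughly like this. This is a sketch under assumptions, not the fastai code: `as_int64_labels` is a hypothetical helper, and where the dataset would call it is left open.

```python
import numpy as np


def as_int64_labels(labels):
    """Return labels as a contiguous np.int64 array, copying only if needed."""
    return np.ascontiguousarray(labels, dtype=np.int64)


# Simulate labels that arrive with a platform-dependent C int dtype.
y = np.array([0, 2, 1], dtype=np.intc)
y64 = as_int64_labels(y)
print(y.dtype, "->", y64.dtype)
```

Casting once at the dataset boundary means every later step sees the same fixed-width dtype regardless of platform.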
I'll continue working on this issue over the next few days.
http://pytorch-zh.readthedocs.io/en/latest/tensors.html
https://docs.scipy.org/doc/numpy-1.12.0/user/basics.types.html
from fastai.
I think the .long() solution is a good one - I can't see any reason that this should cause problems on Linux or CPU. I'll try it out.
BTW, I'm well aware of the status of pytorch on Windows - the issue is whether I'm ready to support fastai on Windows :) . I suspect I'll endeavor to support it officially after 0.4 is out and Windows CI is done for pytorch, but where we have simple clear solutions to problems in the meantime I'll include them.
from fastai.
The solution you proposed assumes we're on CUDA, which may not be the case. I'll see if I can think of something...
from fastai.
I think the right fix is to have the dataset return the correct type (np.int32) in the first place. Closing this issue since Windows isn't something I'm ready to officially support just yet. But if you create a fix that works on Linux and Windows with and without CUDA then I'll certainly consider merging it.
from fastai.
I'm wondering why the issue is showing up at all? Seems like it should show up everywhere or nowhere. I'm guessing the difference is in some recent change to pytorch which has not caught up to the Windows version.
from fastai.
It's because on Windows the int sizes are different.
from fastai.
The maintainer of pytorch for Windows, peterjc123, says "The Windows PRs are actively merged. The official Windows CI is near to be setup, and the official package is planned for 0.4.0." at pytorch/pytorch#494 (comment)
pytorch 1.3 + CUDA 9.0 for Windows is available and works for me. tensorflow-gpu 1.4 for Windows also works but requires CUDA 8.0; 9.0 is not yet supported. I'm running CUDA 8.0 and CUDA 9.0 side-by-side without issue.
from fastai.
Just to let you know, was playing with this on Win7 and ended up with the same problem! Going with y.long() fixes it for the moment.
I'm really hoping I'll be able to use Windows for Part2-2018!
from fastai.
@davideboschetto Where did you change to y.long()?
cat, cont, y = next(iter(md.trn_dl))
cat, cont, y = Variable(cat), Variable(cont), Variable(y).long()
pred = model(cat, cont)
for p, true in zip(pred.data.numpy(), torch.max(y, 1)[0].data):
    print('pred log probs: {} -- True: {}'.format(p, true))
Then running lr_find() yields the error. y is already set to a long, so I don't know where to change it.
Running on Bash Ubuntu on Win 10.
from fastai.
I had this issue in Linux, however, my install was a little unusual (I have an old GPU with my own build from source).
@nikos-h
I fixed it with
if dim == 2:
    return torch._C._nn.nll_loss(
        input, Variable.long(target),
        weight, size_average,
        ignore_index, reduce
    )
in place of,
if dim == 2:
    return torch._C._nn.nll_loss(input, target, weight, size_average, ignore_index, reduce)
But the Python error should be clear enough to tell you the exact place where this fails; you may not have dim == 2, for example.
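For anyone who wants to check what the cast does before patching anything, here is a small sketch. It assumes a PyTorch release where tensors expose .dtype (0.4 or later, which is newer than the version discussed in this thread), and the variable names are only illustrative.

```python
import numpy as np
import torch

# Labels as they might arrive from a dataset using a 32-bit integer dtype.
target = torch.from_numpy(np.array([1, 0, 2], dtype=np.int32))

# .long() converts to the 64-bit integer type that nll_loss expects.
target64 = target.long()
print(target.dtype, "->", target64.dtype)
```

If the second dtype printed is torch.int64, the target is a LongTensor and the cast inside nll_loss is the only change needed.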
from fastai.