Comments (12)
Was just about to report this!
I see checks in place for filter_acts, but not for img_acts or weight_acts.
Original comment by [email protected]
on 28 Jul 2014 at 7:54
from cuda-convnet2.
Yeah. There's also some texture usage in NVMatrix, actually. So it'll take a
bit more effort to bypass entirely. Hopefully I'll get a chance to fix this
soon but in the meantime it's usually possible to just split a layer into two
layers if it's too big.
Original comment by [email protected]
on 28 Jul 2014 at 11:27
> but in the meantime it's usually possible to just split a layer into two
layers if it's too big.
By making it block-sparse using multiple groups, or by splitting it into two
separate layers?
Original comment by [email protected]
on 29 Jul 2014 at 6:32
Two separate layers.
Original comment by [email protected]
on 29 Jul 2014 at 6:49
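For anyone wondering what "two separate layers" looks like in a layer-definition file, here is a hedged sketch in cuda-convnet2's config syntax. The layer names and hyperparameters are invented for illustration, and it relies on the documented cuda-convnet behavior that a conv layer given multiple inputs convolves each input separately and sums the results, which is mathematically equivalent to convolving their channel-wise concatenation:

```
# Hypothetical split of one 512-filter conv layer into two parallel
# 256-filter layers. Names (conv2a, conv2b, conv3) and values are
# illustrative, not from any shipped config.
[conv2a]
type=conv
inputs=conv1
channels=64
filters=256
filterSize=3
padding=1
stride=1
initW=0.01

[conv2b]
type=conv
inputs=conv1
channels=64
filters=256
filterSize=3
padding=1
stride=1
initW=0.01

# The next layer consumes both halves; per-input parameters are
# comma-separated. Convolving each 256-channel input and summing is
# equivalent to convolving a single 512-channel input.
[conv3]
type=conv
inputs=conv2a,conv2b
channels=256,256
filters=256
filterSize=3,3
padding=1,1
stride=1,1
initW=0.01,0.01
```

Each half's buffers are half the size of the original layer's, which is what keeps them under the texture limit.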
I've replicated the texture kernels, changed tex1Dfetch to regular pointer
addressing, and added logic to use these kernels when the inputs are bigger
than the texture memory limit. I've only done this for the convolution kernels.
I can send you a patch if that's the approach you want to take.
Original comment by [email protected]
on 4 Aug 2014 at 3:34
Yeah, that is the approach. The only thing to watch out for is that this
change can shift register usage enough to alter the kernel's occupancy, which
can have a significant effect on performance. So sometimes you have to do some
work to bring register usage back down again.
Original comment by [email protected]
on 4 Aug 2014 at 6:17
Okay, cool. I'll look at the register count/spillage with --ptxas-options=-v,
and if there's going to be no change w.r.t. occupancy, I'll prepare patches and
send them your way.
Original comment by [email protected]
on 4 Aug 2014 at 6:20
Thanks, I appreciate it. But don't do too much work because I do have the
originals somewhere in my source control history. I did start out without using
texture memory.
Original comment by [email protected]
on 4 Aug 2014 at 6:32
Hi, I defined a net and got this error:
"CUDA error at src/nvmatrix.cu:1471 code=11(cudaErrorInvalidValue)
"cudaCreateTextureObject(&_texObj, &resDesc, &texDesc, NULL)" "
But when I run your provided net config, it's OK, so I think this may be
related to my net config. The first conv layer has a 3x3 filter size,
stride = 1, pad = 1, and 64 output channels. Maybe it's exceeding the texture
memory size? And is it related to Issue 2?
Original comment by [email protected]
on 10 Sep 2014 at 2:24
clarkon, this is very likely because of texture memory limits. In my fork of
this I rewrote the kernels to not use texture memory when the incoming tensor's
footprint is greater than 512MB, but I haven't had time to port this over.
You can split your layer into two parallel layers to avoid this if you want to
use cuda-convnet2 in its current state.
Original comment by [email protected]
on 12 Sep 2014 at 6:21
Hi, on running one of my convnet architectures, I get the error:
CUDA error at src/nvmatrix.cu:1548 code=11(cudaErrorInvalidValue)
"cudaCreateTextureObject(&_texObj, &resDesc, &texDesc, NULL)"
As suggested above, it looks like a texture bigger than 512MB is causing my
program to crash.
My current config has multiple 128-, 256-, and 512-channel filters, but they
are all 3x3 in size.
1. Is this error caused by the presence of multiple layers (suggesting that
convnet2 cannot be used for deep configurations), or by the presence of even
one big filter bank, say of size 3x3 with 512 channels?
2. Also, is it possible to find out which conv layer is causing this error?
3. A suggestion mentioned above is to separate a layer into two parallel
layers. Can you please show what the config file would look like if I have to
split, say, a conv layer with 512 channels and size 3x3?
Any help is appreciated... Thanks in advance! :)
Original comment by [email protected]
on 18 Feb 2015 at 8:41
In response to durvasul:
The issue is the size of the buffers for a particular layer, not necessarily
the total memory footprint of your model. Try reducing the number of channels
until it doesn't crash, use the debugger, or insert some print statements in
the Python code if you want to see which layer is causing the problems.
As soumith pointed out, the fprop conv code has checks on the buffer sizes, so
it's most likely img_acts or weight_acts that is trying to use the texture
kernels. You can insert similar checks (around line 2120 of weight_acts.cu,
for example) and just back off to the non-texture versions of the kernels, as
at line 1251 of filter_acts.cu.
Original comment by [email protected]
on 25 Feb 2015 at 4:51