Comments (9)
As far as I know, with TF, manually running finalize() on a graph is not a good idea unless there is a specific need for it.
I'm not sure whether the problem is Keras or TF not deallocating memory, your dataset/loader, or simply that your GPU runs out of memory. We do not encounter this issue when running the training process on our dataset.
What is your sequence length limit? Maybe that, or the minibatch size, should be reduced.
from ncc.
On my dataset, the longest sequence has length 8618 and the mean sequence length is 425. I've tried reducing the minibatch size to 1, but it didn't help.
The reason I manually run finalize() on the graph is to test whether new nodes are being added to it, and that does happen. I'm now trying other ways to solve the problem.
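For reference, the check described above can be sketched as follows, assuming a TF 1.x-style graph mode (the variable names here are illustrative, not from the repo):

```python
import tensorflow as tf

tf.compat.v1.disable_eager_execution()        # TF 1.x-style graph mode
graph = tf.compat.v1.get_default_graph()

_ = tf.constant(1)    # during model construction, ops may be added freely
graph.finalize()      # lock the graph so accidental additions are caught

try:
    _ = tf.constant(2)            # any new op now raises RuntimeError
    graph_grew_silently = True
except RuntimeError:
    graph_grew_silently = False   # the leak was detected, not hidden
```

If code executed per batch trips this RuntimeError, something is adding ops to the graph at runtime, which is exactly the kind of unbounded growth that can end in an OOM.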
Another question: could you briefly explain which parts of the model those parameters affect? I'm frustrated to find that the accuracy on my dataset is only about 0.3 and stays almost unchanged until the OOM error occurs. I have tried many parameter combinations, but none of them performs well. Could something be wrong?
In fact, the POJ-104 dataset used for the classifyapp task does not contain very large files.
The histogram shows the number of statements per file for a subset of the dataset. As you can see, 8000 lines is an order of magnitude larger than any file that is included in the subset considered.
In order to train on significantly longer sequences than that, you probably need a few tricks that go beyond the code provided here, but you can try training on the shorter sequences in your dataset.
The network probably generalizes to longer sequences at inference time fairly well and inference through an LSTM is quite memory efficient (activations don't need to be stored for backpropagation).
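The suggestion above (train on the shorter sequences, run inference on everything) can be sketched like this; the `max_len` threshold and the toy data are assumptions for illustration, not values from the repo:

```python
# Keep only sequences short enough to fit in GPU memory during training.
max_len = 1000                                    # assumed cutoff, tune to your GPU

sequences = [[1] * 425, [2] * 8618, [3] * 12]     # toy stand-ins for encoded files
labels = [0, 1, 0]

train_pairs = [(seq, lab) for seq, lab in zip(sequences, labels)
               if len(seq) <= max_len]            # the 8618-long outlier is dropped
```

At inference time no such filter is needed, since activations do not have to be stored for backpropagation.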
Hope this helps,
Zacharias
Thanks for your patient reply. I finally found that it is probably the function tf.nn.embedding_lookup (lines 148 and 169 in train_task_classifyapp.py) that leads to the OOM error. It is used in the batch generator and adds new nodes to the current TensorFlow graph on every batch, so the graph grows at runtime. I rewrote it manually and the program now works well. I know that tf.nn.embedding_lookup can be computed in parallel and is therefore faster, but I had to replace it to avoid the runtime error. Thanks again!
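The fix isn't shown in the thread, but one common replacement is to do the lookup in the batch generator with plain NumPy indexing, which mutates no graph. A minimal sketch (the names and toy shapes are assumptions):

```python
import numpy as np

# Hypothetical: `embedding_matrix` is a trained embedding table of shape
# (vocab_size, embedding_dim); `batch_ids` are token indices per batch.
embedding_matrix = np.arange(12, dtype=np.float32).reshape(4, 3)  # toy 4-word vocab
batch_ids = np.array([[0, 2], [3, 1]])                            # (batch, seq_len)

# Calling tf.nn.embedding_lookup(embedding_matrix, batch_ids) inside a
# Python generator adds a fresh op to the graph on every batch. NumPy
# fancy indexing computes the same gather with no graph mutation.
batch_embeddings = embedding_matrix[batch_ids]    # shape (2, 2, 3)
```

Since the embedding table is fixed at batch-generation time, the gather can happen entirely outside the TF graph and the result is fed in as an ordinary dense input.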
If this is an issue with the current code base, would you mind creating a pull request with your fix? Thanks!
Strangely, I didn't encounter this OOM issue with the current code base. How can I reproduce the problem, and how did you eventually fix it?
I ran some other tests and now think it is caused by some unknown error in my GPU server or in Keras/TensorFlow. I have given up on finding the true reason this bug appears, so let's just close this issue.
Thanks for reporting anyway. Good luck