Comments (5)
Hi @moyix, did you find the solution? I'm having the same issue here.
from char-rnn.
I hit this problem when testing how well an LSTM could learn the granularity of numbers of seconds for a stream of event timestamps (so x secs, xx secs, xxx secs, xxxx secs etc.). I don't see this as the code relying on ASCII text inputs; it's to do with the vocab size you're instructing it to expect (256 in the output above). If you increase the vocab size config parameter so that OneHot never asks for an index out of range, it works; at least it did for me. I think I ended up using 40,000 for my use case.
The downside to this (so I think it's a bit of a hack) is that training time and memory usage increase, and the initial perplexity explodes if you're doing a Penn Treebank task. Equally, the perplexity collapses really quickly as the network learns that the vocabulary is not really that large; it just has some large values inside it. There's probably a more efficient / sparse data structure that could be used for this use case rather than a "mostly empty" OneHot.
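A minimal sketch of the failure and the two workarounds described above, written in plain Python rather than Torch/Lua. `one_hot` and `build_vocab` are hypothetical helpers, not char-rnn functions, and the timestamp values are made up. The first workaround widens the vocab size until every raw value fits; the second remaps raw values to small contiguous indices, which is one possible reading of the "more efficient" structure suggested above.

```python
def one_hot(index, vocab_size):
    """Return a dense one-hot list; raise if index is outside [0, vocab_size).

    Indices are 0-based here; Torch/Lua code is typically 1-based.
    """
    if not 0 <= index < vocab_size:
        raise IndexError(f"index {index} out of range for vocab_size {vocab_size}")
    vec = [0] * vocab_size
    vec[index] = 1
    return vec


def build_vocab(values):
    """Map each distinct raw value to a small contiguous index (hypothetical helper)."""
    return {v: i for i, v in enumerate(sorted(set(values)))}


# Hypothetical timestamp gaps in seconds; the largest value is far above 256.
raw = [3, 45, 600, 7200, 45, 3, 39000]

# Workaround 1: widen vocab_size until every raw value fits (mostly empty vectors).
wide = one_hot(39000, 40000)

# Workaround 2: remap first, so the one-hot width is the number of distinct values.
vocab = build_vocab(raw)
compact = [one_hot(vocab[v], len(vocab)) for v in raw]
```

With the remapping, the one-hot width here is 5 instead of 40,000, at the cost of having to keep the mapping around to decode sampled output.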
@hsheil Could you elaborate on exactly what you changed? I've tried some obvious things, like changing the `vocab_size` passed to `LSTM.lstm` to `vocab_size + 4000`, but still hit the same error.
I think one thing you'll need to do is change the data type, currently 8-bit char, to some larger data type. For an example of where this shows up, see https://github.com/karpathy/char-rnn/blob/master/util/CharSplitLMMinibatchLoader.lua#L159 in the minibatch loader. Basically, anywhere that says `ByteTensor` might need to change to `ShortTensor` or `IntTensor` or `FloatTensor`. I looked through quickly, and this was the only location I could find, but I reckon there might be one or two others. You'll also need to change anything that says `:byte()` to e.g. `:int()` or `:float()`. Again, I couldn't see any such calls, so it might be enough to fix line 159 of CharSplitLMMinibatchLoader.lua.
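The width limit behind this suggestion can be illustrated with Python's standard `array` module, used here only as a stand-in for Torch tensor types. The index values are hypothetical. Note that Python raises an error when a value does not fit in 8 bits; how Torch handles an out-of-range value stored into a `ByteTensor` is not checked here and may differ.

```python
from array import array

# Hypothetical symbol indices; two of them do not fit in an unsigned byte.
indices = [65, 200, 255, 300, 39000]

# Typecode "B" is an unsigned 8-bit integer, so only 0..255 can be stored.
try:
    array("B", indices)
    fits_in_byte = True
except OverflowError:
    fits_in_byte = False

# Typecode "l" is a signed long (at least 32 bits), wide enough for all values.
wide = array("l", indices)
```

Any storage type with at least 16 bits would hold a vocabulary of a few thousand symbols; the wider type is used here only to cover the 39000 example.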
I've tried everything listed in the previous comments but haven't gotten this to work yet; I'm having the same issue as @moyix. When I increase the vocab size (e.g. "self.vocab_size = 5000"), it gives me this error:
out of memory at /tmp/luarocks_cutorch-scm-1-4252/cutorch/lib/THC/THCStorage.cu:44
If someone has gotten this to work on binary files I would be grateful for a couple of pointers.
Update: Got this working by changing the vocab_size in the LSTM constructor but the quality of the results makes me think something more is needed.
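As a rough back-of-the-envelope sketch (an estimate, not a measurement of char-rnn), the size of dense one-hot inputs grows linearly with the vocab size, which is one plausible contributor to memory pressure when the vocab size is raised. The batch size, sequence length, RNN size, and 4 bytes per value below are assumed values, and both functions are hypothetical helpers.

```python
def one_hot_bytes(batch_size, seq_length, vocab_size, bytes_per_value=4):
    """Bytes needed to hold dense one-hot vectors for one unrolled batch."""
    return batch_size * seq_length * vocab_size * bytes_per_value


def output_weight_bytes(rnn_size, vocab_size, bytes_per_value=4):
    """Bytes for a dense hidden-to-vocab weight matrix (biases ignored)."""
    return rnn_size * vocab_size * bytes_per_value


# Assumed settings, chosen only for illustration.
batch_size, seq_length, rnn_size = 50, 50, 128

for vocab_size in (256, 5000, 40000):
    inputs_mb = one_hot_bytes(batch_size, seq_length, vocab_size) / 1e6
    weights_mb = output_weight_bytes(rnn_size, vocab_size) / 1e6
    print(f"vocab {vocab_size:>6}: one-hot inputs {inputs_mb:8.2f} MB, "
          f"output weights {weights_mb:6.2f} MB")
```

Gradients, per-timestep clones, and other buffers add to these figures, so actual usage would be higher than this estimate.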