
Comments (5)

ZijingMao avatar ZijingMao commented on September 25, 2024

Hi @moyix, did you find the solution? I'm having the same issue here.

from char-rnn.

hsheil avatar hsheil commented on September 25, 2024

I hit this problem while testing how well an LSTM could learn the granularity of second counts in a stream of event timestamps (x secs, xx secs, xxx secs, xxxx secs, etc.). I didn't see this as the code relying on ASCII text inputs; it has to do with the vocab size you're instructing it to expect (256 in the output above). If you increase the vocab size config parameter so that OneHot never asks for an index out of range, it works, or at least it did for me. I think I ended up using 40,000 for my use case.
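To make the out-of-range point concrete, here is a minimal pure-Python sketch, not char-rnn's actual OneHot code. The names `one_hot` and `required_vocab_size` and the sample values are hypothetical, and it uses 0-based indices, whereas Torch is 1-based.

```python
def one_hot(index, vocab_size):
    """Return a one-hot list of length vocab_size with a 1 at `index` (0-based)."""
    if not 0 <= index < vocab_size:
        raise IndexError(f"index {index} out of range for vocab_size {vocab_size}")
    vec = [0] * vocab_size
    vec[index] = 1
    return vec


def required_vocab_size(values):
    """Smallest vocab size that keeps every value in range (0-based indices)."""
    return max(values) + 1


# Hypothetical inputs: second counts that do not fit in a 256-entry vocabulary.
timestamps = [7, 42, 315, 4096]

try:
    one_hot(315, 256)        # 315 does not fit in a vocabulary of 256
    failed = False
except IndexError:
    failed = True

size = required_vocab_size(timestamps)   # large enough for the biggest value
vec = one_hot(4096, size)                # now in range
```

With a vocabulary of 256 the lookup for 315 fails; sizing the vocabulary from the largest value makes every lookup valid, at the cost of much wider vectors.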

The downside (which is why I think it's a bit of a hack) is that training time and memory usage increase, and the initial perplexity explodes if you're doing a Penn Treebank task. Equally, though, the perplexity collapses really quickly as the network learns that the vocabulary is not really that large; it just has some large values in it. There's probably a more efficient, sparse data structure for this use case than a "mostly empty" OneHot.
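One possible way around the "mostly empty" OneHot, sketched here as an assumption rather than anything tested against char-rnn, is to remap the distinct values to dense indices so the vocabulary is only as large as the number of values actually seen. This is a plain remapping, not a sparse tensor; `build_vocab`, `encode`, and the sample data are hypothetical.

```python
def build_vocab(values):
    """Map each distinct value to a dense index, in order of first appearance."""
    vocab = {}
    for v in values:
        if v not in vocab:
            vocab[v] = len(vocab)
    return vocab


def encode(values, vocab):
    """Replace each raw value with its dense index."""
    return [vocab[v] for v in values]


# Hypothetical event values: few distinct symbols, but one of them is large.
events = [5, 39999, 5, 120, 39999]

vocab = build_vocab(events)
encoded = encode(events, vocab)

dense_size = len(vocab)          # one slot per distinct value
naive_size = max(events) + 1     # one slot per possible value up to the max
```

Here the naive vocabulary needs 40,000 slots while the remapped one needs 3, so the one-hot vectors stay small no matter how large the raw values are.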


moyix avatar moyix commented on September 25, 2024

@hsheil Could you elaborate on exactly what you changed? I've tried some obvious things, like changing the vocab_size passed to LSTM.lstm to vocab_size + 4000, but still hit the same error.


hughperkins avatar hughperkins commented on September 25, 2024

I think one thing you'll need to do is change the data type, currently 8-bit char, to some larger data type. This shows up, for example, at https://github.com/karpathy/char-rnn/blob/master/util/CharSplitLMMinibatchLoader.lua#L159. Basically, anywhere that says ByteTensor might need to change to ShortTensor, IntTensor, or FloatTensor. I looked through quickly, and this was the only location I could find, but I reckon there might be one or two others. You'll also need to change anything that says :byte() to e.g. :int() or :float(). Again, I couldn't see any, so it might be enough to fix line 159 of CharSplitLMMinibatchLoader.lua.
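As an illustration of why the 8-bit type matters, here is a small Python sketch using `ctypes` fixed-width integers. It only shows what happens to a value above 255 in an unsigned 8-bit slot; that Torch's ByteTensor behaves the same way is an assumption, and the helper names are hypothetical.

```python
import ctypes


def store_as_uint8(value):
    """Store value in an unsigned 8-bit slot; ctypes truncates without error."""
    return ctypes.c_uint8(value).value


def store_as_int32(value):
    """Store value in a signed 32-bit slot, which has room for large indices."""
    return ctypes.c_int32(value).value


wrapped = store_as_uint8(300)   # only the low 8 bits survive: 300 mod 256
kept = store_as_int32(300)      # value is preserved
```

A symbol index of 300 silently becomes 44 in the 8-bit slot, so the model would train on the wrong symbol without any error being raised; the wider type keeps the index intact.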


HelgiHelgason avatar HelgiHelgason commented on September 25, 2024

I've tried everything listed in the previous comments but haven't gotten this to work yet; I'm having the same issue as @moyix. When I increase the vocab size (e.g. "self.vocab_size = 5000"), it gives me this error:

out of memory at /tmp/luarocks_cutorch-scm-1-4252/cutorch/lib/THC/THCStorage.cu:44

If someone has gotten this to work on binary files I would be grateful for a couple of pointers.

Update: Got this working by changing the vocab_size in the LSTM constructor, but the quality of the results makes me think something more is needed.

