Comments (9)
> Batches don't have to be in a "sequence" to be fed into the model, but a batch should have sequences.
Quite the opposite for Flux, as Brian pointed out. Let me add to this in case there is uncertainty about how recurrence is handled in Flux.
If you have a recurrent model `m` (i.e. a cell wrapped in `Flux.Recur`) that accepts a vector of features, `x`, then `m(x)` will evaluate a single time step and update the internal state of `m`. Suppose a single sample is a sequence of features, `xs`; then we evaluate the full sequence as `[m(x) for x in xs]`.
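A minimal sketch of this single-sample loop, assuming the `Recur`-based RNN API (Flux ≤ 0.14) and made-up dimensions:

```julia
using Flux  # assumes the Recur-based RNN API (Flux ≤ 0.14)

m = Flux.RNN(3, 5)               # an RNNCell wrapped in Flux.Recur: 3 features in, 5 out
x = rand(Float32, 3)             # features for a single time step
y = m(x)                         # one step; updates m's hidden state; y has size (5,)

xs = [rand(Float32, 3) for _ in 1:7]  # one sample: a sequence of 7 time steps
ys = [m(x) for x in xs]               # evaluate the full sequence
Flux.reset!(m)                        # reset the hidden state before the next sample
```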
Batching serves many purposes in ML, but one of them is achieving higher utilization on hardware that supports parallelism. So, in the framework described above, we want `m(xbatch)` to evaluate `m` at a given time step for multiple samples concurrently. This means that `xbatch` should have dimensions `(features × batch)` to hit BLAS, etc. Since `xbatch` is only a single time step, to represent a sequence we need a vector where each element is a single time step like `xbatch`. This vector, `xbatches`, is evaluated as `[m(xbatch) for xbatch in xbatches]`, making `xbatches` have dimensions `(features × batch) × sequence_length`.
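Concretely (dimensions are made up; the same `Recur`-based API is assumed), the batched version looks like:

```julia
using Flux  # assumes the Recur-based RNN API (Flux ≤ 0.14)

m = Flux.RNN(3, 5)
xbatch = rand(Float32, 3, 4)     # (features × batch): one time step for 4 samples
ybatch = m(xbatch)               # size (5, 4): one step for the whole batch at once

xbatches = [rand(Float32, 3, 4) for _ in 1:7]  # sequence of 7 batched time steps
ys = [m(xb) for xb in xbatches]                # (features × batch) × sequence_length
```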
The relevant detail for the issue here is that once you have the data in this format, accessing a single sample becomes cumbersome. You have to iterate over `xbatches` to access each time step, slice the batch of features to access the correct column, then merge the results back into a single sequence. That's why this operation can only happen at the end. If it is done too early, then all the encodings that require random access to samples will be cumbersome and slow. This also means that the transformation should happen to a batch, because applying `MLUtils.batchseq` to the entire dataset is necessarily "too early."
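For illustration, recovering sample `i` from a sequence of batches takes a loop over time steps plus a column slice per step (shapes here are arbitrary):

```julia
features, batchsize, seq_len = 3, 4, 5
xbatches = [rand(Float32, features, batchsize) for _ in 1:seq_len]  # sequence of batches

i = 2
sample_i = [xb[:, i] for xb in xbatches]   # back to a single sequence of feature vectors
```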
TL;DR:
- "batch of sequences": the outer index is by sample, which is convenient for data processing
- "sequence of batches": the outer index is by time step, which is required by Flux but inconvenient for the rest of the data pipeline
from fastai.jl.
Hm, I see the issue and how this doesn't solve it. Of course, putting the `batchseq` into the model is not desirable either.
Instead of introducing a lot of new APIs to make this possible, it may be doable to stick with the simple `encode` and instead introduce a `Batch <: WrapperBlock` that has the default implementation above.
The encoding that does the padding could then have a custom method for `encode` that takes in a `Batch` block and performs the `batchseq` operation, returning data for a `SequenceBatch <: WrapperBlock` block.
This way, we wouldn't have to introduce any new APIs while unifying observation- and batch-level transformations and not breaking existing `encode` implementations. What do you think?
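A self-contained sketch of what that could look like. None of these names are the real FastAI.jl API: `Block`, `WrapperBlock`, and `encode` are stand-ins defined locally, and `PadSequences` is a hypothetical padding encoding.

```julia
# Stand-ins for FastAI.jl's block/encoding machinery (all names are assumptions).
abstract type Block end
abstract type WrapperBlock <: Block end

struct NumberVector <: Block end

struct Batch{B<:Block} <: WrapperBlock           # a batch of samples of an inner block
    block::B
end

struct SequenceBatch{B<:Block} <: WrapperBlock   # already in "sequence of batches" form
    block::B
end

struct PadSequences   # hypothetical encoding that pads ragged sequences
    pad::Int
end

# Default: encode a Batch by encoding each sample independently.
encode(enc, block::Batch, samples) = [encode(enc, block.block, s) for s in samples]

# Custom method: the padding encoding runs a batchseq-style transform over the
# whole batch at once, producing one padded "time step" per position.
function encode(enc::PadSequences, block::Batch, samples)
    maxlen = maximum(length, samples)
    [[t <= length(s) ? s[t] : enc.pad for s in samples] for t in 1:maxlen]
end
```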
from fastai.jl.
Flux RNNs expect an input format of `(features × batch) × sequence_length`, but the data loader will generate `(features × sequence_length) × batch` by default. Ideally that transposition happens as late as possible, but it does need to happen at some point.
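The transposition can be sketched in plain Julia (shapes are assumed): each sample arrives as a `(features × sequence_length)` matrix, and we rebuild one `(features × batch)` matrix per time step:

```julia
features, seq_len, batchsize = 3, 4, 2
samples = [rand(Float32, features, seq_len) for _ in 1:batchsize]  # loader output

# "sequence of batches": for each time step t, stack column t of every sample
xbatches = [hcat((x[:, t] for x in samples)...) for t in 1:seq_len]
```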
from fastai.jl.
Yeah, I like this approach better because of the unification. It addresses the concerns about tying `batchseq` into the data block visualization. Now it should be clear to the user that the encoded data is stored as a "sequence of batches."
from fastai.jl.
Adding this kind of first-class support for batches will entail a lot of changes to FastAI.jl internals, e.g. applying `encode` to batches and not individual samples, but it should ultimately reduce the amount of code.
We could then make it an encoding that transforms a `Batch{NumberVector}` into something like a `SequenceBatch{NumberVector}`.
Until we find time to implement those changes, though, I would continue with the current method of doing the sequencing.
from fastai.jl.
No issues here with the proposed API.
Typically, in FastAI, we have a "batch of images" or a "batch of tabular entries." Similarly, here we have a "batch of sequences." Ultimately, the model will want a "sequence of batches" though, so this transformation needs to happen somewhere. After this transformation, it becomes very hard to access each sample individually, so it must only happen at the end. Even if we do this as a final encoding step, there's the question of how FastAI understands the encoded block. With other data, you can view the individual encoded samples or encoded batch. What will the view look like here?
from fastai.jl.
Can you explain a bit more what you mean by "sequence of batches" so I can wrap my head around it?
from fastai.jl.
> Can you explain a bit more what you mean by "sequence of batches" so I can wrap my head around it?

Yeah, even I didn't get it. Batches don't have to be in a "sequence" to be fed into the model, but a batch should have sequences.
from fastai.jl.
Yeah, I think the approach Lorenz suggested should be "the way" to achieve this batch-wise encoding.
But where do we encode this? Will it be part of the initial transformations, or happen just before passing the data to the model?
from fastai.jl.