Comments (10)
Sure! You can find a block-sparse attention example here: https://github.com/ptillet/triton/blob/master/python/test/test_blocksparse.py#L150-L159. You can create a block-sparse softmax operation as follows:
sparse_softmax = triton.ops.blocksparse.softmax(layout, block)
and then call it on the output of an SDD matmul.
from triton.
This is a general block-sparse spMM. There are three modes:
- SDD: sparse = dense x dense, a.k.a. sampled dense-dense matrix multiplication
- DSD: dense = sparse x dense, the lhs is sparse
- DDS: dense = dense x sparse, the rhs is sparse
The output of SDD is in a block-sparse format that can be re-used by triton.ops.blocksparse.softmax and also by DSD for attention mechanisms.
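To make the SDD mode concrete, here is a minimal pure-Python reference sketch. It is not the Triton kernel and does not use the Triton API: the function name `sdd_matmul` and the dictionary-of-tiles output format are assumptions chosen for illustration. It only shows the idea that blocks whose layout entry is 0 are never computed or stored.

```python
def sdd_matmul(a, b, layout, block):
    """Reference SDD: compute only the blocks of a @ b that the layout keeps.

    a is M x K and b is K x N (lists of lists); layout is an
    (M / block) x (N / block) grid of 0s and 1s.
    Returns {(block_row, block_col): block x block tile}.
    This output format is hypothetical, chosen for readability.
    """
    inner = len(b)
    out = {}
    for bi, layout_row in enumerate(layout):
        for bj, keep in enumerate(layout_row):
            if not keep:
                continue  # skipped blocks are never computed or stored
            r0, c0 = bi * block, bj * block
            out[(bi, bj)] = [
                [sum(a[r0 + i][k] * b[k][c0 + j] for k in range(inner))
                 for j in range(block)]
                for i in range(block)
            ]
    return out


a = [[1, 2], [3, 4], [5, 6], [7, 8]]   # 4 x 2
b = [[1, 0, 1, 0], [0, 1, 0, 1]]       # 2 x 4
layout = [[1, 0], [0, 1]]              # keep only the diagonal blocks
tiles = sdd_matmul(a, b, layout, block=2)
```

DSD and DDS would go the other way: one operand is stored as tiles like these and the result is a regular dense matrix.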
thx
Hi, what's the definition of "block" here? Does it mean splitting the matrix into blocks and then doing the MM?
I'm looking to implement the block-sparse attention proposed by BigBird using Triton ops, and I want to know how to obtain the blocks in Triton. Thanks.
The sparsity layout is specified as a tensor of 0s and 1s. In your example, this would only work if each colored square corresponds to a 16x16 (or 32x32, 64x64, 128x128) block of data. Also note that triton.ops.blocksparse doesn't support overlapping blocks.
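As a rough illustration of what such a layout looks like, the sketch below collapses an element-level 0/1 mask into a block-level 0/1 layout. The helper `mask_to_layout` is hypothetical and not part of Triton; it uses plain Python lists rather than tensors, and the allowed block sizes are simply the ones listed in the reply above.

```python
SUPPORTED_BLOCKS = (16, 32, 64, 128)  # sizes mentioned in the reply above


def mask_to_layout(mask, block):
    """Collapse an element-level 0/1 mask into a block-level 0/1 layout.

    A block is kept (1) if any element inside it is nonzero.
    Hypothetical helper for illustration, not a Triton function.
    """
    if block not in SUPPORTED_BLOCKS:
        raise ValueError(f"block must be one of {SUPPORTED_BLOCKS}")
    rows, cols = len(mask), len(mask[0])
    if rows % block or cols % block:
        raise ValueError("mask dimensions must be multiples of the block size")
    return [
        [
            int(any(mask[r][c]
                    for r in range(bi * block, (bi + 1) * block)
                    for c in range(bj * block, (bj + 1) * block)))
            for bj in range(cols // block)
        ]
        for bi in range(rows // block)
    ]


# A 32 x 32 causal (lower-triangular) mask with 16 x 16 blocks.
causal = [[int(c <= r) for c in range(32)] for r in range(32)]
layout = mask_to_layout(causal, block=16)
```

Because every element belongs to exactly one block of the grid, a layout built this way cannot contain overlapping blocks.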
Thanks. There won't be any overlapping blocks during attention, and the default block size is 64x64.
Does Triton provide any optimized softmax or layernorm kernel for the output of the block-sparse MM?
No layernorm, but you can use triton.ops.blocksparse.softmax to reduce the row of the output of triton.ops.blocksparse.matmul('SDD').
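For intuition about what reducing a row means on a block-sparse output, here is a pure-Python reference sketch. It is not how the Triton kernel is implemented: `blocksparse_softmax` and the dictionary-of-tiles input are assumptions made for illustration. The point is that each row is normalized only over the elements in kept blocks, and elements in dropped blocks do not take part at all.

```python
import math


def blocksparse_softmax(tiles, block):
    """Row-wise softmax over the kept blocks only.

    tiles maps (block_row, block_col) -> block x block list of lists,
    a hypothetical stand-in for the block-sparse output of an SDD matmul.
    """
    out = {key: [[0.0] * block for _ in range(block)] for key in tiles}
    for bi in {key[0] for key in tiles}:
        cols = sorted(bj for (r, bj) in tiles if r == bi)
        for i in range(block):
            # Gather this row's values from every kept block in the block-row.
            values = [tiles[(bi, bj)][i][j] for bj in cols for j in range(block)]
            peak = max(values)  # subtract the max for numerical stability
            exps = [math.exp(v - peak) for v in values]
            total = sum(exps)
            # Scatter the normalized values back into their blocks.
            for n, bj in enumerate(cols):
                for j in range(block):
                    out[(bi, bj)][i][j] = exps[n * block + j] / total
    return out


tiles = {
    (0, 0): [[0.0, 0.0], [1.0, 1.0]],
    (1, 0): [[0.0, 0.0], [0.0, 0.0]],
    (1, 1): [[0.0, 0.0], [0.0, 0.0]],
}
probs = blocksparse_softmax(tiles, block=2)
```

In this toy input, block-row 0 keeps one block, so each of its rows is normalized over 2 elements, while block-row 1 keeps two blocks, so each of its rows is normalized over 4.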
Could you please explain how the row reduction is done in the Triton implementation?
Thanks. Last question: is this softmax any different from the softmax used for full attention, i.e. full spMM?
Hmm, it should be the same. You can also pass dense masks, since a block-triangular matrix is not triangular (the blocks on the diagonal are dense). There are example usages here: https://github.com/ptillet/triton/blob/master/python/test/test_blocksparse.py#L47
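The remark about block-triangular layouts can be checked with a small pure-Python sketch. The helper `causal_violations` is hypothetical and makes no Triton calls; it counts how many elements a lower block-triangular layout keeps above the main diagonal, which are the elements an extra dense mask would have to remove for causal attention.

```python
def causal_violations(layout, block):
    """Count elements kept by the layout that lie above the main diagonal."""
    count = 0
    for bi, layout_row in enumerate(layout):
        for bj, keep in enumerate(layout_row):
            if not keep:
                continue
            for i in range(block):
                for j in range(block):
                    # Compare global column index with global row index.
                    if bj * block + j > bi * block + i:
                        count += 1
    return count


block = 16
n_blocks = 4
# Lower block-triangular layout: keep block (bi, bj) whenever bj <= bi.
layout = [[int(bj <= bi) for bj in range(n_blocks)] for bi in range(n_blocks)]
leaked = causal_violations(layout, block)
```

All of the leaked elements sit inside the diagonal blocks: each contributes `block * (block - 1) / 2` of them, and the blocks strictly below the diagonal contribute none.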