
triton-puzzles's Introduction

Triton Puzzles

w/ Tejas Ramesh and Keren Zhou based on Triton-Viz

Open In Colab

Programming for accelerators such as GPUs is critical for modern AI systems. This often means programming directly in proprietary low-level languages such as CUDA. Triton is an open-source alternative that lets you code at a higher level and compile to accelerators such as GPUs.

Coding in Triton is very similar to coding in NumPy and PyTorch in both syntax and semantics. However, as a lower-level language, it has many details you need to keep track of. In particular, one area that learners struggle with is memory loading and storing, which is critical for speed on these devices.

This set of puzzles is meant to teach you how to use Triton from first principles in an interactive fashion. You will start with trivial examples and build your way up to real algorithms like Flash Attention and quantized neural networks. These puzzles do not need to run on a GPU, since they use a Triton interpreter.
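To give a flavor of the pattern the puzzles drill, here is a NumPy sketch (illustrative only, not code from the repo; block and vector sizes are assumed) of the masked load/store idea behind Triton's `tl.load`/`tl.store`: a program works on a fixed-size block of offsets, and a mask guards the lanes that fall outside the tensor.

```python
import numpy as np

B0 = 8                                   # block size (assumed for illustration)
N0 = 5                                   # actual vector length
x = np.arange(N0, dtype=np.float32)

offs = np.arange(B0)                     # like tl.arange(0, B0)
mask = offs < N0                         # guard out-of-bounds lanes
safe = np.clip(offs, 0, N0 - 1)          # keep indices legal for the "load"
vals = np.where(mask, x[safe], 0.0)      # like tl.load(x_ptr + offs, mask)

z = np.zeros(N0, dtype=np.float32)
z[offs[mask]] = vals[mask] + 10.0        # like tl.store(z_ptr + offs, vals + 10, mask)
```

In the actual puzzles the same idea is written inside a `@triton.jit` kernel; forgetting the mask on either the load or the store is the classic mistake.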

Discord: https://discord.gg/cudamode #triton-puzzles


If you are into this kind of thing, this is the 7th in a series of these puzzles.

triton-puzzles's People

Contributors

jokeren, srush, xffxff, zhaoyuecheng


triton-puzzles's Issues

Index Mixup

I think you mixed up the indices in questions 3/4. Your code actually does

$$z_{j,i} = x_i + y_j\text{ for } i = 1\ldots B_0,\ j = 1\ldots B_1$$

instead of

$$z_{i, j} = x_i + y_j\text{ for } i = 1\ldots B_0,\ j = 1\ldots B_1$$

j and i are switched on z, I think.

You can see it when computing e.g.

add_vec_spec(torch.tensor([1,2,3]), torch.tensor([10,20,30]))

which returns:

tensor([[11, 12, 13],
        [21, 22, 23],
        [31, 32, 33]])
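That output has i varying across the columns. For contrast, a spec matching $z_{i,j} = x_i + y_j$ would put i on the rows. A minimal NumPy sketch (my own stand-in mirroring the PyTorch spec, not the repo's code):

```python
import numpy as np

def add_vec_spec_fixed(x, y):
    # z[i, j] = x[i] + y[j]: i indexes rows (from x), j indexes columns (from y)
    return x[:, None] + y[None, :]

z = add_vec_spec_fixed(np.array([1, 2, 3]), np.array([10, 20, 30]))
# row 0 is [11, 21, 31]: x[0] added to every y[j]
```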

Anyways: Thanks for these puzzles :)

Edit: the same mixup seems to be in Q 5

Running in Google Colab Free Tier may trigger warnings and disconnects

I have received the following warning twice when running in the Google Colab free tier, and I'm wary of continuing to run the Triton Puzzles notebook because of it (I don't want to get banned):

(screenshot: Colab warning, 2024-04-20)

I have not modified the notebook in any way (other than attempting to solve the puzzles, of course).

From what I can gather, this may be a false positive due to the use of Triton-Viz and/or HTML output by the notebook.
Specifically, the Colab FAQ states that these activities are disallowed on the free tier:

bypassing the notebook UI to interact primarily via a web UI

I suspect that Triton-Viz and/or the HTML output by the notebook triggers this false positive (and thus the warning), although for me this has happened only sporadically - twice. (I have finished 8 of the 12 puzzles, in case that matters.)

Doing a bit more digging, it appears that Google cracked down on this when Stable Diffusion Web UI usage on the free tier skyrocketed, causing usage concerns. See the following for more details:

  1. https://www.reddit.com/r/StableDiffusion/comments/12t8tc7/is_colab_going_to_start_banning_people_who_use_it/
  2. https://www.reddit.com/r/StableDiffusion/comments/17cylsx/automatic1111s_colab_runtime_disconnected_message/

There's a comment in there from a Colab PM who states:

just the webui bit is getting warned - use the models in notebooks to your heart's content. stability is awesome and I really like them.

we prioritize interactive notebook compute on the free tier, and the webUI piece grew really big fast.

This suggests that web UIs may be the cause, or at least a factor, here. I don't know enough about Triton, but I doubt its usage here would be considered "disallowed" code.

I'm wondering if anyone else has experienced this? (The obvious mitigation would be just to purchase a plan or some credits)

Thanks,

Peter

[QST] Triton MLIR

@srush

Always appreciate your wonderful OSS educational contributions!

I'm relatively familiar with CUDA and Triton but less so with machine learning compilers, and I'm interested in getting into the weeds of Triton's compilation pipeline.

I've come across a few resources for learning MLIR as well as related projects such as TVM (which has a comprehensive set of tutorials / learning materials spearheaded by Tianqi Chen of CMU), but have yet to bridge the gap from basic MLIR to something on the scale of triton.

The overarching motivation -- other than the fact that ML compilers are super-interesting :) -- is that in a world of increased demand for ML training / inference but limited GPU (NVIDIA) supply, the ability to write code that is backend-agnostic is ever more important.

A few questions:

  • Are you aware of any resources for learning MLIR incrementally, ideally building from the basics to something like a toy Triton and, more ambitiously, to understanding enough of the Triton backend to contribute new optimization passes?
  • Is this something you're interested in and would possibly collaborate on?

I'd be willing to do as much of the heavy lifting as needed:

  • I'd envision a step-by-step walkthrough of each of the Triton tutorials, starting with vec-add.
  • The goal would be to understand how each pass of the compilation pipeline translates high-level python to performant device code.
  • Something that pulls apart each component of the C++ MLIR pipeline and provides greater visibility -- and hackability -- than simply observing the output of MLIR_ENABLE_DUMP.

cc @Jokeren

viz=False does not work

I don't really understand the visualizations, to be honest, so I wanted to turn them off, but viz=False leads to the following error:

---------------------------------------------------------------------------
UnboundLocalError                         Traceback (most recent call last)
<ipython-input-44-e13ba08403d2> in <cell line: 28>()
     26     tl.store(z_ptr + n0_range + n1_range * N0, z, mask)
     27 
---> 28 test(add_vec_block_kernel, add_vec_block_spec, nelem={"N0": 100, "N1": 90}, viz=False)

<ipython-input-3-91403af2aaed> in test(puzzle, puzzle_spec, nelem, B, viz)
     40     if viz:
     41         failures = triton_viz.launch()
---> 42     if not match or failures:
     43         print("Invalid Access:", failures)
     44         print("Yours:", z)

UnboundLocalError: local variable 'failures' referenced before assignment

Changing the relevant lines of the test code to something like this worked for me:

failures = None
if viz:
    failures = triton_viz.launch()
if not match or failures:
    if failures:
        print("Invalid Access:", failures)
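As a self-contained illustration of why that fix works (a sketch; `check` and `launch` are hypothetical stand-ins for the notebook's `test` and `triton_viz.launch`, and only the control flow matters): initializing `failures` before the conditional means the later check is safe whether or not the visualizer ran.

```python
def check(match, viz, launch):
    failures = None              # bound before the conditional
    if viz:
        failures = launch()      # only assigned when the visualizer runs
    if not match or failures:    # safe: `failures` always exists here
        if failures:
            print("Invalid Access:", failures)
        return False
    return True
```

With viz=False, `failures` simply stays None and the truthiness check passes harmlessly, instead of raising UnboundLocalError.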

Puzzle 6 - shape annotation for x

For "Puzzle 6: Fused Outer Multiplication - Backwards", x appears to have the same meaning as in the previous problem.
However, it is annotated as having shape [90, 100] instead of [100]. Is this a typo?

question about long softmax

I solved the long softmax puzzle, but I had to store the intermediate results to z_ptr, which may cause unnecessary memory I/O.

Essentially, I would like to know if there is a way in Triton to create a temporary array in shared memory and store intermediate results there.
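For what it's worth, the standard way to avoid materializing intermediates is the "online" softmax used by Flash Attention: keep only a running max and a running sum while streaming over blocks, so nothing but the final result is ever written to global memory. A NumPy sketch of the accumulation (illustrative only, and not necessarily the puzzle's intended solution; in Triton these accumulators would live in registers across the loop):

```python
import numpy as np

def online_softmax_denominator(x, block=4):
    """Stream over blocks, keeping only a running max m and running sum s
    of exp(x - m); no per-block intermediate results are written out."""
    m, s = -np.inf, 0.0
    for i in range(0, len(x), block):
        blk = x[i:i + block]
        m_new = max(m, blk.max())
        # rescale the old sum to the new max before adding the new block
        s = s * np.exp(m - m_new) + np.exp(blk - m_new).sum()
        m = m_new
    return m, s

x = np.array([1.0, 3.0, 2.0, 0.5, 4.0, -1.0])
m, s = online_softmax_denominator(x)
softmax = np.exp(x - m) / s
```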
