Comments (2)
Good catch! Thanks a lot :)
It looks like the memory leak happens in the predictive model that decides which GEMM kernel to generate for a given set of input shapes. This routine is called once per shape, so the leak should not occur on every GEMM call, only on every non-cached GEMM call, i.e., every time a new shape is encountered.
Nonetheless, this could be a big problem for applications that make repeated GEMM calls with many different shapes, such as tiled linear algebra algorithms (e.g., SVD).
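To make the "once per new shape" point concrete, here is a minimal sketch of a per-shape kernel cache. It is purely illustrative and not ISAAC's actual code: `KernelCache`, `fake_predict`, and the `(M, N, K)` shape key are all hypothetical names. The counter shows how often the (possibly leaking) prediction routine would run.

```python
from typing import Callable, Dict, Tuple

Shape = Tuple[int, int, int]  # hypothetical (M, N, K) key for a GEMM call


class KernelCache:
    """Hypothetical per-shape kernel cache; not ISAAC's real implementation."""

    def __init__(self, predict: Callable[[Shape], str]) -> None:
        self._predict = predict
        self._kernels: Dict[Shape, str] = {}
        self.predict_calls = 0  # how many times the predictive model ran

    def get(self, shape: Shape) -> str:
        kernel = self._kernels.get(shape)
        if kernel is None:
            # Cache miss: only here does the prediction routine run,
            # so only here would a leak inside it be triggered.
            self.predict_calls += 1
            kernel = self._predict(shape)
            self._kernels[shape] = kernel
        return kernel


def fake_predict(shape: Shape) -> str:
    # Stand-in for the predictive model; returns a made-up kernel name.
    m, n, k = shape
    return f"gemm_{m}x{n}x{k}"


cache = KernelCache(fake_predict)

# Same shape repeated: the predictor runs only on the first call.
for _ in range(1000):
    cache.get((128, 128, 64))
calls_same_shape = cache.predict_calls

# Many distinct shapes (as in a tiled algorithm): one predictor run each.
for i in range(1, 101):
    cache.get((i, i, i))
calls_many_shapes = cache.predict_calls
```

Under these assumptions, a workload that reuses one shape pays the cost once, while a workload that sweeps through many shapes pays it on every new shape, which is where a leak in the predictor would accumulate.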
from triton.
Closing this because ISAAC was updated to Triton. There may still be memory leaks though; may reopen later.