Comments (5)
Hi :)
The .json file contains the predictor (a random forest) that picks the right kernel template for the given input shapes. The predictor is trained in $PROJECT_ROOT_DIR/tune/ and serialized into a .json file (so that ISAAC doesn't depend on Python).
What kind of kernel are you trying to add?
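To make the idea of a JSON-serialized predictor concrete, here is a minimal sketch of how a small forest of decision trees could be stored as JSON and evaluated on the input shapes. The JSON schema, the field names (`feature`, `threshold`, `left`, `right`, `label`), and the majority-vote rule are all assumptions for illustration; the actual file layout used by ISAAC is not shown in this thread and may differ.

```python
import json

# Hypothetical schema (NOT ISAAC's real format): a list of trees.
# Internal node: {"feature": i, "threshold": t, "left": ..., "right": ...}
# Leaf node:     {"label": k}, where k indexes a kernel template.
FOREST_JSON = """
[
  {"feature": 2, "threshold": 1024,
   "left": {"label": 0},
   "right": {"feature": 0, "threshold": 64,
             "left": {"label": 1}, "right": {"label": 0}}},
  {"feature": 0, "threshold": 64,
   "left": {"label": 1}, "right": {"label": 0}},
  {"feature": 2, "threshold": 4096,
   "left": {"label": 0}, "right": {"label": 1}}
]
"""


def predict_tree(node, features):
    """Walk one tree from the root until a leaf is reached."""
    while "label" not in node:
        go_left = features[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["label"]


def predict_forest(forest, features):
    """Majority vote over all trees; ties go to the smallest label."""
    votes = {}
    for tree in forest:
        label = predict_tree(tree, features)
        votes[label] = votes.get(label, 0) + 1
    return max(sorted(votes), key=lambda k: votes[k])


forest = json.loads(FOREST_JSON)

# Features are the input shapes (M, N, K).
template_for_square = predict_forest(forest, (2048, 2048, 512))
template_for_deep_k = predict_forest(forest, (16, 16, 8192))
```

Because evaluation only needs a JSON parser and a loop over nodes, the same walk is easy to reimplement in C++, which fits the stated goal of keeping the library free of a Python dependency.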
from triton.
Thanks for your reply. Currently we have some fine-tuned GEMV/GEMM kernels for Intel graphics, but the kernels have some issues:
- They may not be suitable for all GEMV/GEMM parameters.
- They may not perform well on non-Intel platforms.
What do you think would be a good way to integrate these kernels into ISAAC?
Adding @gongzg to the discussion.
from triton.
Oh, yes. I've talked briefly to @gongzg about that. For now I just know that the kernels use the Intel subgroups extension, which is fine. I have a few questions, so that I can think about integrating them:
Does it work for all sizes of M, N, K (i.e., handles bounds-checking properly) without calling additional "cleanup" kernels?
Does it work for the 4 layouts (NN, NT, TN, TT)?
What are the tunable parameters (how many?). In the current GEMM generator, one parameter ("Depth") is pretty useful for handling "small M, N ; large K" situations. And I'll add another one. We can talk about adding them into your kernel template, if they're not already there.
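The layout and "Depth" questions above can be illustrated with a small reference GEMM in plain Python. This is only a sketch: the four layouts are taken to mean whether each operand is stored transposed (as in BLAS), and `depth` is assumed here to mean splitting the K reduction into independent chunks whose partial results are summed afterwards. That reading matches the "small M, N ; large K" remark, but it is an assumption about the generator and is not confirmed in this thread.

```python
def transpose(x):
    """Return the transpose of a list-of-lists matrix."""
    return [list(row) for row in zip(*x)]


def gemm(a, b, layout="NN", depth=1):
    """Compute C = op(A) * op(B).

    layout: two letters, 'N' (as stored) or 'T' (stored transposed),
            one for A and one for B.
    depth:  ASSUMED meaning: number of chunks the K loop is split into.
    """
    op_a = transpose(a) if layout[0] == "T" else a
    op_b = transpose(b) if layout[1] == "T" else b
    m, k, n = len(op_a), len(op_b), len(op_b[0])
    assert len(op_a[0]) == k, "inner dimensions must match"

    chunk = -(-k // depth)  # ceiling division
    partials = []
    for d in range(depth):
        lo, hi = d * chunk, min((d + 1) * chunk, k)
        # Each chunk could run on its own work-group; an empty range
        # (lo >= hi) simply contributes zeros.
        partials.append([
            [sum(op_a[i][p] * op_b[p][j] for p in range(lo, hi))
             for j in range(n)]
            for i in range(m)
        ])

    # Final reduction over the partial results.
    return [[sum(part[i][j] for part in partials) for j in range(n)]
            for i in range(m)]


a = [[1, 2, 3], [4, 5, 6]]      # 2 x 3
b = [[1, 0], [0, 1], [1, 1]]    # 3 x 2
c_nn = gemm(a, b, "NN")
c_tt = gemm(transpose(a), transpose(b), "TT", depth=4)
```

With small M and N there are few output tiles to distribute, so splitting K is one way to expose more parallelism at the cost of an extra reduction step.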
from triton.
Hi,
Currently our kernels support all sizes of M, N, K, but they rely on cleanup kernels.
They currently work only for the RowMajor layout.
They don't have tunable parameters, but we have several different kernels for different usage situations.
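For readers unfamiliar with the distinction being discussed, the sketch below contrasts a single bounds-checked tiled GEMM with a "main kernel plus cleanup" split. It is plain Python with a hypothetical tile size and function names, meant only to show the two strategies; it is not the Intel kernels or the ISAAC generator.

```python
TILE = 4  # hypothetical tile size, chosen only for illustration


def dot(a, b, i, j):
    """Inner product of row i of A and column j of B."""
    return sum(a[i][p] * b[p][j] for p in range(len(b)))


def gemm_bounds_checked(a, b, tile=TILE):
    """One pass over all tiles; edge tiles are clamped to the matrix size."""
    m, n = len(a), len(b[0])
    c = [[0] * n for _ in range(m)]
    for i0 in range(0, m, tile):
        for j0 in range(0, n, tile):
            for i in range(i0, min(i0 + tile, m)):      # bounds check on M
                for j in range(j0, min(j0 + tile, n)):  # bounds check on N
                    c[i][j] = dot(a, b, i, j)
    return c


def gemm_with_cleanup(a, b, tile=TILE):
    """Main pass handles full tiles only; cleanup passes cover the rest."""
    m, n = len(a), len(b[0])
    m_full, n_full = m - m % tile, n - n % tile
    c = [[0] * n for _ in range(m)]
    # Main kernel: full tiles, no bounds checks needed.
    for i0 in range(0, m_full, tile):
        for j0 in range(0, n_full, tile):
            for i in range(i0, i0 + tile):
                for j in range(j0, j0 + tile):
                    c[i][j] = dot(a, b, i, j)
    # Cleanup 1: leftover columns, all rows.
    for i in range(m):
        for j in range(n_full, n):
            c[i][j] = dot(a, b, i, j)
    # Cleanup 2: leftover rows, columns already covered by full tiles.
    for i in range(m_full, m):
        for j in range(n_full):
            c[i][j] = dot(a, b, i, j)
    return c


a = [[i + j for j in range(3)] for i in range(5)]   # 5 x 3
b = [[i * j for j in range(6)] for i in range(3)]   # 3 x 6
c_checked = gemm_bounds_checked(a, b)
c_cleanup = gemm_with_cleanup(a, b)
```

Both functions produce the same result; the difference is whether the edge handling lives inside one kernel or in separate launches.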
from triton.
That could be interesting :) Is there any place where I can see and/or benchmark the code?
from triton.