I'd like to do some performance evaluation of the ISAAC BLAS library.

Performance Evaluation methods? it's not very clear. (closed)

triton-lang commented on July 22, 2024

Performance Evaluation methods? it's not very clear.

from triton.

Comments (5)

ptillet commented on July 22, 2024

Hello,

You can use the bench-blas executable included in the package. If CMake detects other BLAS implementations on your computer (clBLAS, OpenBLAS, cuBLAS...), it will benchmark against those.
USAGE: ${BUILD_DIR}/bench/bench-blas gemm

Alternatively, you can link whatever executable you want against Isaac instead of clBLAS. It'll work for BLAS1, GEMV and GEMM.
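
For anyone timing their own executable rather than using bench-blas, here is a minimal sketch of turning a measured GEMM time into GFLOPS. It assumes the conventional count of 2·M·N·K floating-point operations for C = A·B; whether bench-blas uses exactly this formula is an assumption, and the function name `gemm_gflops` is made up for illustration.

```python
def gemm_gflops(m, n, k, seconds):
    """GFLOPS for C = A * B, with A of shape (m x k) and B of shape (k x n).

    Assumes the conventional count: each of the m*n outputs needs k
    multiplies and k adds, i.e. 2*m*n*k floating-point operations.
    """
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    flops = 2.0 * m * n * k
    return flops / seconds / 1e9


if __name__ == "__main__":
    # Hypothetical timing: a 1000 x 1000 x 1000 GEMM that took 10 ms.
    print(gemm_gflops(1000, 1000, 1000, 10e-3))
```

Average the timing over several runs after a warm-up call, since the first launch typically includes kernel compilation.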

MaryRand commented on July 22, 2024

Quick questions:
(1) Is there an option to specify single or double precision for gemm, like sgemm or dgemm?
(2) Can I run just one "N" instance, e.g. a 5000 x 5000 square?
(3) In the output below, ISAAC outperforms clBLAS, 395 vs. 123 on the first row. Am I reading that correctly?
(4) What is the high-level difference compared to ViennaCL?

Thanks for the helpful answers!

./bench-blas 0 gemm
#Benchmark : BLAS
#----------------
#gemm (GFLOPS)
"N" "ISAAC" "clBLAS"
"square896" 395 123
"square2560" 390 117
"conv1" 251 59
"conv2" 326 124
"conv3" 225 109
"conv4" 286 103
"conv5" 214 93
"ica32" 83 13
"ica256" 337 64
"32rank1-4096" 275 111
"32rank1-3456" 270 116
"32rank1-896" 179 97
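
A table like the one above can be turned into per-shape speedups with a few lines of Python. This is only a sketch: it assumes the output keeps the layout shown here (comment lines starting with `#`, a quoted header row, then one quoted shape name followed by GFLOPS values), and `parse_bench` and `speedups` are illustrative names, not part of ISAAC.

```python
import shlex

# A few rows copied from the output above.
SAMPLE = '''\
#gemm (GFLOPS)
"N" "ISAAC" "clBLAS"
"square896" 395 123
"square2560" 390 117
"ica32" 83 13
'''


def parse_bench(text):
    """Map each shape name to {library name: GFLOPS}."""
    rows = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and '#' comment lines
        fields = shlex.split(line)  # strips the double quotes
        if header is None:
            header = fields  # first data line names the columns
            continue
        rows[fields[0]] = {
            name: float(value) for name, value in zip(header[1:], fields[1:])
        }
    return rows


def speedups(rows, ours="ISAAC", baseline="clBLAS"):
    """Ratio of GFLOPS (higher is better) for each shape."""
    return {shape: r[ours] / r[baseline] for shape, r in rows.items()}


if __name__ == "__main__":
    for shape, ratio in speedups(parse_bench(SAMPLE)).items():
        print(f"{shape}: {ratio:.2f}x")
```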

ptillet commented on July 22, 2024

(1) Although DGEMM is supported, I have not had time to run the auto-tuner for double precision on all existing architectures. I will make it easier to benchmark once everything is included.
(2) For now, you can edit bench/blas.cpp to add the shapes that you want. Ideally, I should indeed provide a config file that lets one benchmark ISAAC more easily for arbitrary shapes. The shapes included in this benchmark are square matrices, those found in the 5 convolution layers of AlexNet (if you do im2col), shapes found in covariance/ICA computation, and shapes found in SVD (32 rank-1 updates on square matrices of size 896, 3456 and 4096).
(3) Yes, that's correct! 395 GFLOPS vs. 123 GFLOPS.
(4) Oh, many things! I wrote the BLAS3 kernels for ViennaCL, actually. The most notable difference is that ViennaCL is tuned for square matrices, while ISAAC uses a machine learning model so that it is tuned for any input shape. ViennaCL GEMM also has less efficient bounds checking (ISAAC uses tricky pointer arithmetic, ViennaCL uses cleanup kernels).
Ultimately, the device databases of ViennaCL, clBLAS, etc. are data structures that map compute devices to kernel parameters, while ISAAC's maps compute devices to a model that predicts kernel parameters given input shapes.
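
To make the last point concrete, here is a toy sketch of the two kinds of device database. Everything in it is hypothetical: the device name, the `tile_m`/`tile_n` parameters, the numbers, and the hand-written rule standing in for a trained model. It is not ISAAC's or ViennaCL's actual code or parameter set.

```python
# Style 1: device -> fixed kernel parameters (shape is ignored).
# All names and values here are invented for illustration.
STATIC_DB = {
    "device-a": {"tile_m": 64, "tile_n": 64},
}


def static_params(device, m, n, k):
    """Same parameters for every (m, n, k) on a given device."""
    return STATIC_DB[device]


def toy_model(m, n, k):
    """Stand-in for a trained predictor: smaller tiles for skinny shapes.

    A real model would be learned from auto-tuning data; this rule is
    only a placeholder so the example runs.
    """
    return {
        "tile_m": 64 if m >= 256 else 16,
        "tile_n": 64 if n >= 256 else 16,
    }


# Style 2: device -> model that maps input shape to kernel parameters.
MODEL_DB = {
    "device-a": toy_model,
}


def predicted_params(device, m, n, k):
    """Parameters depend on both the device and the input shape."""
    return MODEL_DB[device](m, n, k)


if __name__ == "__main__":
    for shape in [(4096, 4096, 4096), (32, 4096, 4096)]:
        print(shape, static_params("device-a", *shape),
              predicted_params("device-a", *shape))
```

With the static table, a 32-row matrix gets the same tiling as a large square one; with the model-based table, the parameters can change with the shape.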

MaryRand commented on July 22, 2024

Great. Thank you for the information!

ptillet commented on July 22, 2024

You're welcome :)
