It would be helpful to have a special pragma or something where the enclosed code woul

Feature request: fp32 multiplication with zero optimization specifier (libcudacxx, CLOSED)

Saitama10000 commented on June 2, 2024

Feature request: fp32 multiplication with zero optimization specifier

from libcudacxx.

Comments (5)

miscco commented on June 2, 2024

That depends on your targeted toolchain. Maybe this is a good start:
https://forums.developer.nvidia.com/c/accelerated-computing/hpc-compilers/nvc-nvc-and-nvfortran/313

Saitama10000 commented on June 2, 2024

@jrhemstad
https://forums.developer.nvidia.com/t/feature-request-fp32-multiplication-with-zero-optimization-specifier/245926/5?u=saitamatenthousand

miscco commented on June 2, 2024

Hi @Saitama10000, this is not a library feature but a compiler feature.

As a library, we have no say in how the compiler interprets our code, so please file this feature request with the compiler toolchain.

Saitama10000 commented on June 2, 2024

Hi @miscco, I'm having a hard time finding where I should file this request. Can you help me out with a link?

Thank you,
Saitama10000

jrhemstad commented on June 2, 2024

@Saitama10000, what optimization are you hoping to achieve? If the coefficients are known at compile time and your function is inlined, the compiler will exploit the fact that the values are known at compile time.

Example: https://godbolt.org/z/K6Khrb446

Even if the values aren't known at compile time, there is no difference in the type of FMA that is generated.
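
For illustration, here is a minimal sketch of the case described above: coefficients that are compile-time constants, passed into an inlinable function. This is not the code behind the Godbolt link. `weighted_sum`, `apply_fixed`, and the coefficient values are hypothetical names chosen for the example, and the `HD` macro is only there so the same source builds as plain C++ or as CUDA. Whether the zero term actually disappears from the generated code depends on the compiler and its floating-point flags, because under IEEE 754 `0.0f * x` is NaN rather than zero when `x` is NaN or infinite.

```cpp
// Hypothetical example; names and values are assumptions, not from the thread.
#ifdef __CUDACC__
#define HD __host__ __device__   // usable from host and device code under nvcc
#else
#define HD                       // plain C++ build: no CUDA qualifiers
#endif

// Coefficients known at compile time; the middle one is zero.
constexpr float kC0 = 2.0f;
constexpr float kC1 = 0.0f;
constexpr float kC2 = 0.5f;

// Generic weighted sum. constexpr functions are implicitly inline,
// so the compiler can see the coefficient values at each call site.
HD constexpr float weighted_sum(float c0, float c1, float c2,
                                float x0, float x1, float x2) {
    return c0 * x0 + c1 * x1 + c2 * x2;
}

// Wrapper that fixes the coefficients. After inlining, the compiler
// knows kC1 is zero and may simplify the expression if its
// floating-point rules allow it.
HD constexpr float apply_fixed(float x0, float x1, float x2) {
    return weighted_sum(kC0, kC1, kC2, x0, x1, x2);
}
```

Comparing the compiler output for `apply_fixed` with a variant that takes the coefficients as runtime arguments (for example on Compiler Explorer) shows what a given toolchain does with the constant zero.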
