Comments (21)

pborsutzki avatar pborsutzki commented on June 21, 2024 1

It also looks like CPU and SPIR-V now produce the same output regardless of whether a workaround is used:

7060504
1B1A1918
2F2E2D2C
43424140

That's interesting. Would that mean the C++ backend got padded on your end? Otherwise I'd expect 6050403 as the first line.

do any of the workarounds work on your machine?

Yes, they work, at least in the sense that Vulkan and C++ produce the same results. Because workaround A changes the padding, it also yields different values (starting with 7060504).

do the outputs look similar to mine?

Not really.

Output for no workaround using slang 2024.1.8:

  • Vulkan:
    3020100
    17161514
    2B2A2928
    3F3E3D3C
    
  • C++:
    6050403
    1A191817
    2E2D2C2B
    4241403F
    

Using Workaround A, both Vulkan and C++ yield the same output:

7060504
1B1A1918
2F2E2D2C
43424140

Using Workaround B and C, both Vulkan and C++ yield the same output:

6050403
1A191817
2E2D2C2B
4241403F

I checked multiple times that I am using the right version and so on. I could, of course, still be doing something silly ...
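For anyone trying to read these numbers: the three output sets can be reproduced with a little arithmetic. The sketch below is only an illustration. It assumes the input buffer holds consecutive 32-bit words 0, 1, 2, ..., that the shader ORs together the first four words of the matrix row shifted by 0, 8, 16 and 24 bits (as the repro posted later in this thread does), and that consecutive structs are 20 words (80 bytes) apart, which is the ArrayStride shown in the SPIR-V further down. The names `packed_row` and `outputs` are made up for this example.

```python
def packed_row(matrix_word_offset, index, stride_words=20):
    """Combine four consecutive buffer words like `res |= word << (i * 8)`."""
    base = index * stride_words + matrix_word_offset
    res = 0
    for i in range(4):
        word = base + i  # assumption: buffer word k holds the value k
        res |= (word << (i * 8)) & 0xFFFFFFFF
    return res


def outputs(matrix_word_offset):
    """Hex strings for threads 0..3, formatted like the logs in this thread."""
    return [format(packed_row(matrix_word_offset, t), "X") for t in range(4)]


print(outputs(3))  # matrix at byte offset 12: the C++ result
print(outputs(0))  # matrix read from offset 0: the Vulkan result
print(outputs(4))  # matrix at byte offset 16: the Workaround A result
```

Under these assumptions, a matrix read at word offset 0 gives the Vulkan numbers, word offset 3 gives the C++ numbers, and word offset 4 (one extra 32-bit pad) gives the Workaround A numbers.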

from slang.

csyonghe avatar csyonghe commented on June 21, 2024 1

Somebody needs to verify that -xslang -O0 actually takes effect in our testing framework.

from slang.

csyonghe avatar csyonghe commented on June 21, 2024 1

@pborsutzki The SPIR-V generated by Slang seems correct, unless we are missing some detail in the SPIR-V spec that says PhysicalStorageBuffer64 affects struct layout in a different way.

We also tested the shader on an AMD GPU and were able to get the correct result, so this is likely an issue in the NVIDIA driver. We are working with the driver team to investigate this further down the stack.

from slang.

ArielG-NV avatar ArielG-NV commented on June 21, 2024

Minimal repro for future bug reference:

struct FooBar {
    float4x4 c;
    int load(int row, int col)
    {
        return int(c[row][col]);
        //return *(int*)int(c[row][col]); // Works if using ptr to any member, cast or no cast. Problem is missing SpvDecorationRowMajor/SpvDecorationColMajor emit if no cast is used
    }
};
RWStructuredBuffer<int> outputBuffer;
uniform StructuredBuffer<FooBar, ScalarDataLayout> sb;
[numthreads(4, 1, 1)]
[shader("compute")]
void computeMain(
    int3 dispatchThreadID : SV_DispatchThreadID)
{
    outputBuffer[dispatchThreadID.x] = sb[0].load(dispatchThreadID.x/4, dispatchThreadID.x%4);
}

from slang.

ArielG-NV avatar ArielG-NV commented on June 21, 2024

Issues found:

  1. Incorrect slangc argument parsing (it was possible to have both rowMajor and colMajor set).
  2. spirv-opt 2023-6 incorrectly optimizes the code and removes member decorations in some odd cases; this is fixed in spirv-opt 2024-1.

from slang.

ArielG-NV avatar ArielG-NV commented on June 21, 2024

@pborsutzki

Once #3972 is merged, this issue can be worked around by compiling with slangc using the -O0 option. The issue is related to upstream spirv-opt producing incorrect output when optimizing.

If you still want optimized SPIR-V, you can set up spirv-opt 2024-1 to optimize your shader. This is the spirv-opt bundled with Vulkan SDK 1.3.280.0.

from slang.

jkwak-work avatar jkwak-work commented on June 21, 2024

Minimal repro for future bug reference:

struct FooBar {
    float4x4 c;
    int load(int row, int col)
    {
        return int(c[row][col]);
        //return *(int*)int(c[row][col]); // Works if using ptr to any member, cast or no cast. Problem is missing SpvDecorationRowMajor/SpvDecorationColMajor emit if no cast is used
    }
};
RWStructuredBuffer<int> outputBuffer;
uniform StructuredBuffer<FooBar, ScalarDataLayout> sb;
[numthreads(4, 1, 1)]
[shader("compute")]
void computeMain(
    int3 dispatchThreadID : SV_DispatchThreadID)
{
    outputBuffer[dispatchThreadID.x] = sb[0].load(dispatchThreadID.x/4, dispatchThreadID.x%4);
}

Does this mean that alignment is not the cause, as was suggested in the initial description?

struct FooBar {
    uint64_t a;
    uint32_t b;
#ifdef WORKAROUND_A
    uint32_t pad;
#endif
    float4x4 c;

Why would WORKAROUND_A avoid the problem if the issue were with the optimization option?

from slang.

pborsutzki avatar pborsutzki commented on June 21, 2024

@jkwak-work I have the same question. To me, the minimal reproducer also does not address the alignment/padding issue.

@ArielG-NV & @csyonghe: I tried the newly released slang 2024.1.8 using the same file as posted initially (note that it does contain an -xslang -O0 option, so spirv-opt shouldn't break anything) and it still fails.

I am not sure if I am still missing something here, can you advise?

Still, thanks for looking into this, any work on this is highly appreciated!

from slang.

ArielG-NV avatar ArielG-NV commented on June 21, 2024

Minimal repro for future bug reference:

struct FooBar {
    float4x4 c;
    int load(int row, int col)
    {
        return int(c[row][col]);
        //return *(int*)int(c[row][col]); // Works if using ptr to any member, cast or no cast. Problem is missing SpvDecorationRowMajor/SpvDecorationColMajor emit if no cast is used
    }
};
RWStructuredBuffer<int> outputBuffer;
uniform StructuredBuffer<FooBar, ScalarDataLayout> sb;
[numthreads(4, 1, 1)]
[shader("compute")]
void computeMain(
    int3 dispatchThreadID : SV_DispatchThreadID)
{
    outputBuffer[dispatchThreadID.x] = sb[0].load(dispatchThreadID.x/4, dispatchThreadID.x%4);
}

Does this mean that alignment is not the cause, as was suggested in the initial description?

struct FooBar {
    uint64_t a;
    uint32_t b;
#ifdef WORKAROUND_A
    uint32_t pad;
#endif
    float4x4 c;

Why would WORKAROUND_A avoid the problem if the issue were with the optimization option?

Two problems were found while attempting to solve this issue:

  1. Our compiler flag system always had the column-major flag toggled, so we could end up with row major and column major set at the same time.
  2. spirv-opt was not applying decorations to children when the only use of the struct members was fetching a pointer to an element. This broke many things, most notably alignment.
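One way to sanity-check the matrix decorations is to scan the SPIR-V disassembly text for members that carry both RowMajor and ColMajor, or neither. The snippet below is a rough sketch rather than part of Slang or spirv-tools: the helper names are hypothetical, it only looks at `OpMemberDecorate` lines in text shaped like the dumps in this thread, and the thread does not say whether the flag conflict ever showed up as two decorations on one member.

```python
import re

# Matches lines such as: OpMemberDecorate %FooBar 2 ColMajor
DECORATION = re.compile(
    r"OpMemberDecorate\s+(%\w+)\s+(\d+)\s+(RowMajor|ColMajor)\b"
)


def matrix_majorness(disassembly):
    """Map (type id, member index) to the set of majorness decorations seen."""
    found = {}
    for type_id, member, which in DECORATION.findall(disassembly):
        found.setdefault((type_id, int(member)), set()).add(which)
    return found


def conflicting_members(disassembly):
    """Members decorated with both RowMajor and ColMajor."""
    seen = matrix_majorness(disassembly)
    return sorted(key for key, kinds in seen.items() if len(kinds) > 1)


sample = (
    "OpMemberDecorate %FooBar 2 Offset 12\n"
    "OpMemberDecorate %FooBar 2 ColMajor\n"
    "OpMemberDecorate %FooBar 2 MatrixStride 16\n"
)
print(matrix_majorness(sample))
print(conflicting_members(sample))
```

A matrix member that is missing from the `matrix_majorness` result entirely would correspond to the "decorations removed" case described in the second item.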

from slang.

ArielG-NV avatar ArielG-NV commented on June 21, 2024

@jkwak-work I have the same question. To me, the minimal reproducer also does not address the alignment/padding issue.

@ArielG-NV & @csyonghe: I tried the newly released slang 2024.1.8 using the same file as posted initially (note that it does contain an -xslang -O0 option, so spirv-opt shouldn't break anything) and it still fails.

I am not sure if I am still missing something here, can you advise?

Still, thanks for looking into this, any work on this is highly appreciated!

I will look into the issue further.
On an initial check, the annotations are now correct (identical with and without the workaround), so this is likely another problem.
The generated code looks functionally identical with WORKAROUND_A.

from slang.

ArielG-NV avatar ArielG-NV commented on June 21, 2024

Vulkan validation reports the following error without a workaround, with WORKAROUND_A, with WORKAROUND_B, and with C:

SPIR-V Capability VariablePointers was declared, but one of the following requirements is required (VkPhysicalDeviceVulkan11Features::variablePointers).

None of the workarounds produce the expected (checked-for) result.
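To see up front which device features a shader will need, the `OpCapability` lines of the disassembly can be compared against the features the application enables. This is a minimal sketch with hypothetical helper names. The VariablePointers entry comes from the validation message above; the other two entries are filled in from the Vulkan specification rather than from this thread, and the table is deliberately incomplete.

```python
import re

# Partial mapping from SPIR-V capability to a Vulkan feature that enables it.
REQUIRED_FEATURE = {
    "VariablePointers": "VkPhysicalDeviceVulkan11Features::variablePointers",
    "PhysicalStorageBufferAddresses": "VkPhysicalDeviceVulkan12Features::bufferDeviceAddress",
    "Int64": "VkPhysicalDeviceFeatures::shaderInt64",
}


def declared_capabilities(disassembly):
    """Capability names in the order they appear in the disassembly."""
    return re.findall(r"^\s*OpCapability\s+(\w+)", disassembly, flags=re.MULTILINE)


def missing_features(disassembly, enabled):
    """Features required by known capabilities but absent from `enabled`."""
    return [
        REQUIRED_FEATURE[cap]
        for cap in declared_capabilities(disassembly)
        if cap in REQUIRED_FEATURE and REQUIRED_FEATURE[cap] not in enabled
    ]


header = """OpCapability Int64
OpCapability VariablePointers
OpCapability PhysicalStorageBufferAddresses
OpCapability Shader
"""
print(missing_features(header, enabled={"VkPhysicalDeviceFeatures::shaderInt64"}))
```

With only `shaderInt64` enabled, the sketch reports the variable pointers and buffer device address features as missing for the header of the no-workaround SPIR-V posted below.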

from slang.

ArielG-NV avatar ArielG-NV commented on June 21, 2024

It also looks like CPU and SPIR-V now produce the same output regardless of whether a workaround is used:

7060504
1B1A1918
2F2E2D2C
43424140

from slang.

ArielG-NV avatar ArielG-NV commented on June 21, 2024

@pborsutzki I have 2 questions:

  1. do any of the workarounds work on your machine?
  2. do the outputs look similar to mine?

from slang.

pborsutzki avatar pborsutzki commented on June 21, 2024

Somebody needs to verify that -xslang -O0 actually takes effect in our testing framework.

If it helps: it also fails in my application, where -O0 definitely has an effect ;)

from slang.

ArielG-NV avatar ArielG-NV commented on June 21, 2024

Somebody needs to verify that -xslang -O0 actually takes effect in our testing framework.

It does take effect; the matrix layouts are annotated correctly.

from slang.

ArielG-NV avatar ArielG-NV commented on June 21, 2024

Reproduced the results to some degree:

  1. A has wrong results for all tests, plus a Vulkan validation error.
  2. B has a Vulkan validation error.
  3. C works, unlike B, since it has no variable pointer (and therefore no validation error?).

It is highly likely that the Vulkan validation error is the issue here, or that it is my hardware.

from slang.

ArielG-NV avatar ArielG-NV commented on June 21, 2024

Minimal repro with workarounds for future reference:

//IGNORE:TEST(compute, vulkan):COMPARE_COMPUTE_EX(filecheck-buffer=CHECK):-vk -emit-spirv-directly -shaderobj -compute -xslang -matrix-layout-row-major -xslang -O0
//TEST(compute):COMPARE_COMPUTE_EX(filecheck-buffer=CHECK):-slang -compute -cpu -shaderobj -xslang -matrix-layout-row-major -xslang -O0

// Removes all uses of pointers from the code. This fixes the result because `PhysicalStorageBuffer64` is no longer emitted.
// Note that the pointer branch is never executed, whether or not this is toggled.
//#define WORKAROUND

// Ensures the matrix byte offset is 0. This 'fixes' the issue, since the problem is that member memory offsets were not being respected (offset 0 was loaded for each struct).
//#define WORKAROUND2
struct FooBar {
#ifdef WORKAROUND2
    float4x4 c;
    uint64_t a;
    uint32_t b;
#else
    uint64_t a;
    uint32_t b;
    float4x4 c;
#endif

    int load() {
#ifdef WORKAROUND
        return (int)a;
#else
        return *((int *)a);
#endif
    }
};

//TEST_INPUT: ubuffer(data=[0 0 0 0], stride=4):out,name=outputBuffer
RWStructuredBuffer<int> outputBuffer;

//TEST_INPUT:ubuffer(data=[0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400], stride=4):name=sb
uniform StructuredBuffer<FooBar, ScalarDataLayout> sb;

int test(int val)
{
    uint32_t res = 0;
    for (int i = 0; i < 4; ++i) {
        res |= bit_cast<uint32_t>(sb[val].c[0][i]) << i * 8;
    }

    if (val == 100000) { // will never use a pointer
        res = sb[val].load();
    }
    return bit_cast<int>(res);
}

[numthreads(4, 1, 1)]
[shader("compute")]
void computeMain(
    int3 dispatchThreadID : SV_DispatchThreadID)
{
    int tid = dispatchThreadID.x;

    int inVal = tid;
    int outVal = test(inVal);

    // CHECK: 6050403
    // CHECK: 1A191817
    // CHECK: 2E2D2C2B
    // CHECK: 4241403F
    outputBuffer[tid] = outVal;
}
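As a cross-check of the CHECK values, the member offsets can be computed by hand. The sketch below is an approximation of scalar layout written for this thread, not Slang's actual layout code: it assumes each member is aligned to its scalar component size and that the array stride is the struct size rounded up to the largest member alignment. Under those assumptions it reproduces the Offset 0/8/12 and ArrayStride 80 decorations in the SPIR-V posted below. The function name `scalar_layout` is hypothetical.

```python
def scalar_layout(members):
    """members: list of (name, alignment_bytes, size_bytes) in declaration order.

    Returns (offsets by name, array stride in bytes).
    """
    offsets = {}
    cursor = 0
    struct_align = 1
    for name, align, size in members:
        cursor = (cursor + align - 1) // align * align  # round up to alignment
        offsets[name] = cursor
        cursor += size
        struct_align = max(struct_align, align)
    stride = (cursor + struct_align - 1) // struct_align * struct_align
    return offsets, stride


# uint64_t a; uint32_t b; float4x4 c;  (default member order)
DEFAULT = [("a", 8, 8), ("b", 4, 4), ("c", 4, 64)]
# float4x4 c; uint64_t a; uint32_t b;  (WORKAROUND2 member order)
WORKAROUND2 = [("c", 4, 64), ("a", 8, 8), ("b", 4, 4)]

print(scalar_layout(DEFAULT))      # c lands at byte 12, i.e. word 3
print(scalar_layout(WORKAROUND2))  # c lands at byte 0
```

With `c` at byte 12 and a stride of 80 bytes, thread 0 reads words 3 to 6 and thread 1 reads words 23 to 26, which is where the expected 6050403 and 1A191817 come from. With WORKAROUND2 the matrix sits at offset 0, so a load that ignores member offsets happens to read the right data.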

from slang.

ArielG-NV avatar ArielG-NV commented on June 21, 2024

some spir-v for reference:

; with `WORKAROUND` defined
; SPIR-V
; Version: 1.5
; Generator: Khronos; 40
; Bound: 156
; Schema: 0
OpCapability Int64
OpCapability Shader
OpExtension "SPV_KHR_storage_buffer_storage_class"
OpMemoryModel Logical GLSL450
OpEntryPoint GLCompute %computeMain "main" %sb %outputBuffer %gl_GlobalInvocationID
OpExecutionMode %computeMain LocalSize 4 1 1

; Debug Information
OpSource Slang 1
OpName %tid "tid"  ; id %13
OpName %val "val"  ; id %17
OpName %i "i"  ; id %20
OpName %i "i"  ; id %20
OpName %res "res"  ; id %22
OpName %res "res"  ; id %22
OpName %res "res"  ; id %22
OpName %_MatrixStorage_float4x4natural "_MatrixStorage_float4x4natural"  ; id %49
OpMemberName %_MatrixStorage_float4x4natural 0 "data"
OpName %FooBar_natural "FooBar_natural"  ; id %47
OpMemberName %FooBar_natural 0 "a"
OpMemberName %FooBar_natural 1 "b"
OpMemberName %FooBar_natural 2 "c"
OpName %StructuredBuffer "StructuredBuffer"  ; id %54
OpName %sb "sb"  ; id %57
OpName %sb "sb"  ; id %57
OpName %FooBar "FooBar"  ; id %59
OpMemberName %FooBar 0 "a"
OpMemberName %FooBar 1 "b"
OpMemberName %FooBar 2 "c"
OpName %unpackStorage "unpackStorage"  ; id %70
OpName %unpackStorage_0 "unpackStorage"  ; id %62
OpName %res_0 "res"  ; id %116
OpName %i_0 "i"  ; id %118
OpName %this "this"  ; id %133
OpName %FooBar_load "FooBar.load"  ; id %131
OpName %test "test"  ; id %15
OpName %outVal "outVal"  ; id %14
OpName %RWStructuredBuffer "RWStructuredBuffer"  ; id %150
OpName %outputBuffer "outputBuffer"  ; id %153
OpName %outputBuffer "outputBuffer"  ; id %153
OpName %computeMain "computeMain"  ; id %2

; Annotations
OpDecorate %gl_GlobalInvocationID BuiltIn GlobalInvocationId
OpDecorate %_arr_v4float_int_4 ArrayStride 16
OpMemberDecorate %_MatrixStorage_float4x4natural 0 Offset 0
OpMemberDecorate %FooBar_natural 0 Offset 0
OpMemberDecorate %FooBar_natural 1 Offset 8
OpMemberDecorate %FooBar_natural 2 Offset 12
OpDecorate %_runtimearr_FooBar_natural ArrayStride 80
OpDecorate %StructuredBuffer Block
OpDecorate %StructuredBuffer Block
OpMemberDecorate %StructuredBuffer 0 Offset 0
OpDecorate %sb Binding 1
OpDecorate %sb DescriptorSet 0
OpDecorate %sb NonWritable
OpMemberDecorate %FooBar 0 Offset 0
OpMemberDecorate %FooBar 1 Offset 8
OpMemberDecorate %FooBar 2 Offset 12
OpMemberDecorate %FooBar 2 ColMajor
OpMemberDecorate %FooBar 2 MatrixStride 16
OpDecorate %_runtimearr_int ArrayStride 4
OpDecorate %RWStructuredBuffer Block
OpDecorate %RWStructuredBuffer Block
OpMemberDecorate %RWStructuredBuffer 0 Offset 0
OpDecorate %outputBuffer Binding 0
OpDecorate %outputBuffer DescriptorSet 0

; Types, variables and constants
%void = OpTypeVoid
%3 = OpTypeFunction %void
%uint = OpTypeInt 32 0
%v3uint = OpTypeVector %uint 3
%_ptr_Input_v3uint = OpTypePointer Input %v3uint
%int = OpTypeInt 32 1
%v3int = OpTypeVector %int 3
%16 = OpTypeFunction %int %int
%_ptr_Function_int = OpTypePointer Function %int
%_ptr_Function_uint = OpTypePointer Function %uint
%float = OpTypeFloat 32
%v4float = OpTypeVector %float 4
%_ptr_Function_v4float = OpTypePointer Function %v4float
%bool = OpTypeBool
%int_100000 = OpConstant %int 100000
%int_0 = OpConstant %int 0
%uint_0 = OpConstant %uint 0
%ulong = OpTypeInt 64 0
%int_4 = OpConstant %int 4
%_arr_v4float_int_4 = OpTypeArray %v4float %int_4
%_MatrixStorage_float4x4natural = OpTypeStruct %_arr_v4float_int_4
%FooBar_natural = OpTypeStruct %ulong %uint %_MatrixStorage_float4x4natural
%_ptr_StorageBuffer_FooBar_natural = OpTypePointer StorageBuffer %FooBar_natural
%_runtimearr_FooBar_natural = OpTypeRuntimeArray %FooBar_natural
%StructuredBuffer = OpTypeStruct %_runtimearr_FooBar_natural
%_ptr_StorageBuffer_StructuredBuffer = OpTypePointer StorageBuffer %StructuredBuffer
%mat4v4float = OpTypeMatrix %v4float 4
%FooBar = OpTypeStruct %ulong %uint %mat4v4float
%63 = OpTypeFunction %FooBar %FooBar_natural
%71 = OpTypeFunction %mat4v4float %_MatrixStorage_float4x4natural
%_ptr_Function_float = OpTypePointer Function %float
%int_8 = OpConstant %int 8
%int_1 = OpConstant %int 1
%132 = OpTypeFunction %int %FooBar
%_ptr_StorageBuffer_int = OpTypePointer StorageBuffer %int
%_runtimearr_int = OpTypeRuntimeArray %int
%RWStructuredBuffer = OpTypeStruct %_runtimearr_int
%_ptr_StorageBuffer_RWStructuredBuffer = OpTypePointer StorageBuffer %RWStructuredBuffer
%gl_GlobalInvocationID = OpVariable %_ptr_Input_v3uint Input
%sb = OpVariable %_ptr_StorageBuffer_StructuredBuffer StorageBuffer
%outputBuffer = OpVariable %_ptr_StorageBuffer_RWStructuredBuffer StorageBuffer

; Function computeMain
%computeMain = OpFunction %void None %3
%4 = OpLabel
%7 = OpLoad %v3uint %gl_GlobalInvocationID
%12 = OpBitcast %v3int %7
%tid = OpCompositeExtract %int %12 0
%outVal = OpFunctionCall %int %test %tid
%147 = OpBitcast %uint %tid
%149 = OpAccessChain %_ptr_StorageBuffer_int %outputBuffer %int_0 %147
OpStore %149 %outVal
OpReturn
OpFunctionEnd

; Function test
%test = OpFunction %int None %16
%val = OpFunctionParameter %int
%18 = OpLabel
%i = OpVariable %_ptr_Function_int Function
%res = OpVariable %_ptr_Function_uint Function
%26 = OpVariable %_ptr_Function_v4float Function
%38 = OpBitcast %uint %val
%40 = OpIEqual %bool %val %int_100000
OpStore %i %int_0
OpStore %res %uint_0
OpBranch %27
%27 = OpLabel
OpLoopMerge %31 %37 None
OpBranch %28
%28 = OpLabel
OpBranch %29
%29 = OpLabel
%53 = OpAccessChain %_ptr_StorageBuffer_FooBar_natural %sb %int_0 %38
%58 = OpLoad %FooBar_natural %53
%61 = OpFunctionCall %FooBar %unpackStorage_0 %58
%103 = OpCompositeExtract %mat4v4float %61 2
%104 = OpCompositeExtract %v4float %103 0
OpStore %26 %104
%106 = OpLoad %int %i
%108 = OpAccessChain %_ptr_Function_float %26 %106
%109 = OpLoad %float %108
%110 = OpBitcast %uint %109
%111 = OpLoad %int %i
%112 = OpIMul %int %111 %int_8
%114 = OpShiftLeftLogical %uint %110 %112
%115 = OpLoad %uint %res
%res_0 = OpBitwiseOr %uint %115 %114
%117 = OpLoad %int %i
%i_0 = OpIAdd %int %117 %int_1
%120 = OpSLessThan %bool %i_0 %int_4
OpSelectionMerge %36 None
OpBranchConditional %120 %36 %30
%30 = OpLabel
OpBranch %31
%31 = OpLabel
OpBranch %32
%32 = OpLabel
OpSelectionMerge %35 None
OpBranchConditional %40 %34 %33
%33 = OpLabel
OpStore %res %res_0
OpBranch %35
%34 = OpLabel
%127 = OpAccessChain %_ptr_StorageBuffer_FooBar_natural %sb %int_0 %38
%128 = OpLoad %FooBar_natural %127
%129 = OpFunctionCall %FooBar %unpackStorage_0 %128
%130 = OpFunctionCall %int %FooBar_load %129
%138 = OpBitcast %uint %130
OpStore %res %138
OpBranch %35
%35 = OpLabel
%141 = OpLoad %uint %res
%142 = OpBitcast %int %141
OpReturnValue %142
%36 = OpLabel
OpStore %i %i_0
OpStore %res %res_0
OpBranch %37
%37 = OpLabel
OpBranch %27
OpFunctionEnd

; Function unpackStorage_0
%unpackStorage_0 = OpFunction %FooBar None %63
%64 = OpFunctionParameter %FooBar_natural
%65 = OpLabel
%66 = OpCompositeExtract %ulong %64 0
%67 = OpCompositeExtract %uint %64 1
%68 = OpCompositeExtract %_MatrixStorage_float4x4natural %64 2
%69 = OpFunctionCall %mat4v4float %unpackStorage %68
%101 = OpCompositeConstruct %FooBar %66 %67 %69
OpReturnValue %101
OpFunctionEnd

; Function unpackStorage
%unpackStorage = OpFunction %mat4v4float None %71
%72 = OpFunctionParameter %_MatrixStorage_float4x4natural
%73 = OpLabel
%74 = OpCompositeExtract %_arr_v4float_int_4 %72 0
%75 = OpCompositeExtract %v4float %74 0
%76 = OpCompositeExtract %float %75 0
%77 = OpCompositeExtract %float %75 1
%78 = OpCompositeExtract %float %75 2
%79 = OpCompositeExtract %float %75 3
%80 = OpCompositeExtract %v4float %74 1
%81 = OpCompositeExtract %float %80 0
%82 = OpCompositeExtract %float %80 1
%83 = OpCompositeExtract %float %80 2
%84 = OpCompositeExtract %float %80 3
%85 = OpCompositeExtract %v4float %74 2
%86 = OpCompositeExtract %float %85 0
%87 = OpCompositeExtract %float %85 1
%88 = OpCompositeExtract %float %85 2
%89 = OpCompositeExtract %float %85 3
%90 = OpCompositeExtract %v4float %74 3
%91 = OpCompositeExtract %float %90 0
%92 = OpCompositeExtract %float %90 1
%93 = OpCompositeExtract %float %90 2
%94 = OpCompositeExtract %float %90 3
%95 = OpCompositeConstruct %v4float %76 %77 %78 %79
%96 = OpCompositeConstruct %v4float %81 %82 %83 %84
%97 = OpCompositeConstruct %v4float %86 %87 %88 %89
%98 = OpCompositeConstruct %v4float %91 %92 %93 %94
%99 = OpCompositeConstruct %mat4v4float %95 %96 %97 %98
OpReturnValue %99
OpFunctionEnd

; Function FooBar_load
%FooBar_load = OpFunction %int None %132
%this = OpFunctionParameter %FooBar
%134 = OpLabel
%135 = OpCompositeExtract %ulong %this 0
%136 = OpSConvert %int %135
OpReturnValue %136
OpFunctionEnd

; without any `WORKAROUND`
; SPIR-V
; Version: 1.5
; Generator: Khronos; 40
; Bound: 158
; Schema: 0
OpCapability Int64
OpCapability VariablePointers
OpCapability PhysicalStorageBufferAddresses
OpCapability Shader
OpExtension "SPV_KHR_storage_buffer_storage_class"
OpExtension "SPV_KHR_variable_pointers"
OpExtension "SPV_KHR_physical_storage_buffer"
OpMemoryModel PhysicalStorageBuffer64 GLSL450
OpEntryPoint GLCompute %computeMain "main" %sb %outputBuffer %gl_GlobalInvocationID
OpExecutionMode %computeMain LocalSize 4 1 1

; Debug Information
OpSource Slang 1
OpName %tid "tid"  ; id %13
OpName %val "val"  ; id %17
OpName %i "i"  ; id %20
OpName %i "i"  ; id %20
OpName %res "res"  ; id %22
OpName %res "res"  ; id %22
OpName %res "res"  ; id %22
OpName %_MatrixStorage_float4x4natural "_MatrixStorage_float4x4natural"  ; id %49
OpMemberName %_MatrixStorage_float4x4natural 0 "data"
OpName %FooBar_natural "FooBar_natural"  ; id %47
OpMemberName %FooBar_natural 0 "a"
OpMemberName %FooBar_natural 1 "b"
OpMemberName %FooBar_natural 2 "c"
OpName %StructuredBuffer "StructuredBuffer"  ; id %54
OpName %sb "sb"  ; id %57
OpName %sb "sb"  ; id %57
OpName %FooBar "FooBar"  ; id %59
OpMemberName %FooBar 0 "a"
OpMemberName %FooBar 1 "b"
OpMemberName %FooBar 2 "c"
OpName %unpackStorage "unpackStorage"  ; id %70
OpName %unpackStorage_0 "unpackStorage"  ; id %62
OpName %res_0 "res"  ; id %116
OpName %i_0 "i"  ; id %118
OpName %this "this"  ; id %133
OpName %FooBar_load "FooBar.load"  ; id %131
OpName %test "test"  ; id %15
OpName %outVal "outVal"  ; id %14
OpName %RWStructuredBuffer "RWStructuredBuffer"  ; id %152
OpName %outputBuffer "outputBuffer"  ; id %155
OpName %outputBuffer "outputBuffer"  ; id %155
OpName %computeMain "computeMain"  ; id %2

; Annotations
OpDecorate %gl_GlobalInvocationID BuiltIn GlobalInvocationId
OpDecorate %_arr_v4float_int_4 ArrayStride 16
OpMemberDecorate %_MatrixStorage_float4x4natural 0 Offset 0
OpMemberDecorate %FooBar_natural 0 Offset 0
OpMemberDecorate %FooBar_natural 1 Offset 8
OpMemberDecorate %FooBar_natural 2 Offset 12
OpDecorate %_runtimearr_FooBar_natural ArrayStride 80
OpDecorate %StructuredBuffer Block
OpDecorate %StructuredBuffer Block
OpMemberDecorate %StructuredBuffer 0 Offset 0
OpDecorate %sb Binding 1
OpDecorate %sb DescriptorSet 0
OpDecorate %sb NonWritable
OpMemberDecorate %FooBar 0 Offset 0
OpMemberDecorate %FooBar 1 Offset 8
OpMemberDecorate %FooBar 2 Offset 12
OpMemberDecorate %FooBar 2 ColMajor
OpMemberDecorate %FooBar 2 MatrixStride 16
OpDecorate %_ptr_PhysicalStorageBuffer_int ArrayStride 4
OpDecorate %_runtimearr_int ArrayStride 4
OpDecorate %RWStructuredBuffer Block
OpDecorate %RWStructuredBuffer Block
OpMemberDecorate %RWStructuredBuffer 0 Offset 0
OpDecorate %outputBuffer Binding 0
OpDecorate %outputBuffer DescriptorSet 0

; Types, variables and constants
%void = OpTypeVoid
%3 = OpTypeFunction %void
%uint = OpTypeInt 32 0
%v3uint = OpTypeVector %uint 3
%_ptr_Input_v3uint = OpTypePointer Input %v3uint
%int = OpTypeInt 32 1
%v3int = OpTypeVector %int 3
%16 = OpTypeFunction %int %int
%_ptr_Function_int = OpTypePointer Function %int
%_ptr_Function_uint = OpTypePointer Function %uint
%float = OpTypeFloat 32
%v4float = OpTypeVector %float 4
%_ptr_Function_v4float = OpTypePointer Function %v4float
%bool = OpTypeBool
%int_100000 = OpConstant %int 100000
%int_0 = OpConstant %int 0
%uint_0 = OpConstant %uint 0
%ulong = OpTypeInt 64 0
%int_4 = OpConstant %int 4
%_arr_v4float_int_4 = OpTypeArray %v4float %int_4
%_MatrixStorage_float4x4natural = OpTypeStruct %_arr_v4float_int_4
%FooBar_natural = OpTypeStruct %ulong %uint %_MatrixStorage_float4x4natural
%_ptr_StorageBuffer_FooBar_natural = OpTypePointer StorageBuffer %FooBar_natural
%_runtimearr_FooBar_natural = OpTypeRuntimeArray %FooBar_natural
%StructuredBuffer = OpTypeStruct %_runtimearr_FooBar_natural
%_ptr_StorageBuffer_StructuredBuffer = OpTypePointer StorageBuffer %StructuredBuffer
%mat4v4float = OpTypeMatrix %v4float 4
%FooBar = OpTypeStruct %ulong %uint %mat4v4float
%63 = OpTypeFunction %FooBar %FooBar_natural
%71 = OpTypeFunction %mat4v4float %_MatrixStorage_float4x4natural
%_ptr_Function_float = OpTypePointer Function %float
%int_8 = OpConstant %int 8
%int_1 = OpConstant %int 1
%132 = OpTypeFunction %int %FooBar
%_ptr_PhysicalStorageBuffer_int = OpTypePointer PhysicalStorageBuffer %int
%_ptr_StorageBuffer_int = OpTypePointer StorageBuffer %int
%_runtimearr_int = OpTypeRuntimeArray %int
%RWStructuredBuffer = OpTypeStruct %_runtimearr_int
%_ptr_StorageBuffer_RWStructuredBuffer = OpTypePointer StorageBuffer %RWStructuredBuffer
%gl_GlobalInvocationID = OpVariable %_ptr_Input_v3uint Input
%sb = OpVariable %_ptr_StorageBuffer_StructuredBuffer StorageBuffer
%outputBuffer = OpVariable %_ptr_StorageBuffer_RWStructuredBuffer StorageBuffer

; Function computeMain
%computeMain = OpFunction %void None %3
%4 = OpLabel
%7 = OpLoad %v3uint %gl_GlobalInvocationID
%12 = OpBitcast %v3int %7
%tid = OpCompositeExtract %int %12 0
%outVal = OpFunctionCall %int %test %tid
%149 = OpBitcast %uint %tid
%151 = OpAccessChain %_ptr_StorageBuffer_int %outputBuffer %int_0 %149
OpStore %151 %outVal
OpReturn
OpFunctionEnd

; Function test
%test = OpFunction %int None %16
%val = OpFunctionParameter %int
%18 = OpLabel
%i = OpVariable %_ptr_Function_int Function
%res = OpVariable %_ptr_Function_uint Function
%26 = OpVariable %_ptr_Function_v4float Function
%38 = OpBitcast %uint %val
%40 = OpIEqual %bool %val %int_100000
OpStore %i %int_0
OpStore %res %uint_0
OpBranch %27
%27 = OpLabel
OpLoopMerge %31 %37 None
OpBranch %28
%28 = OpLabel
OpBranch %29
%29 = OpLabel
%53 = OpAccessChain %_ptr_StorageBuffer_FooBar_natural %sb %int_0 %38
%58 = OpLoad %FooBar_natural %53
%61 = OpFunctionCall %FooBar %unpackStorage_0 %58
%103 = OpCompositeExtract %mat4v4float %61 2
%104 = OpCompositeExtract %v4float %103 0
OpStore %26 %104
%106 = OpLoad %int %i
%108 = OpAccessChain %_ptr_Function_float %26 %106
%109 = OpLoad %float %108
%110 = OpBitcast %uint %109
%111 = OpLoad %int %i
%112 = OpIMul %int %111 %int_8
%114 = OpShiftLeftLogical %uint %110 %112
%115 = OpLoad %uint %res
%res_0 = OpBitwiseOr %uint %115 %114
%117 = OpLoad %int %i
%i_0 = OpIAdd %int %117 %int_1
%120 = OpSLessThan %bool %i_0 %int_4
OpSelectionMerge %36 None
OpBranchConditional %120 %36 %30
%30 = OpLabel
OpBranch %31
%31 = OpLabel
OpBranch %32
%32 = OpLabel
OpSelectionMerge %35 None
OpBranchConditional %40 %34 %33
%33 = OpLabel
OpStore %res %res_0
OpBranch %35
%34 = OpLabel
%127 = OpAccessChain %_ptr_StorageBuffer_FooBar_natural %sb %int_0 %38
%128 = OpLoad %FooBar_natural %127
%129 = OpFunctionCall %FooBar %unpackStorage_0 %128
%130 = OpFunctionCall %int %FooBar_load %129
%140 = OpBitcast %uint %130
OpStore %res %140
OpBranch %35
%35 = OpLabel
%143 = OpLoad %uint %res
%144 = OpBitcast %int %143
OpReturnValue %144
%36 = OpLabel
OpStore %i %i_0
OpStore %res %res_0
OpBranch %37
%37 = OpLabel
OpBranch %27
OpFunctionEnd

; Function unpackStorage_0
%unpackStorage_0 = OpFunction %FooBar None %63
%64 = OpFunctionParameter %FooBar_natural
%65 = OpLabel
%66 = OpCompositeExtract %ulong %64 0
%67 = OpCompositeExtract %uint %64 1
%68 = OpCompositeExtract %_MatrixStorage_float4x4natural %64 2
%69 = OpFunctionCall %mat4v4float %unpackStorage %68
%101 = OpCompositeConstruct %FooBar %66 %67 %69
OpReturnValue %101
OpFunctionEnd

; Function unpackStorage
%unpackStorage = OpFunction %mat4v4float None %71
%72 = OpFunctionParameter %_MatrixStorage_float4x4natural
%73 = OpLabel
%74 = OpCompositeExtract %_arr_v4float_int_4 %72 0
%75 = OpCompositeExtract %v4float %74 0
%76 = OpCompositeExtract %float %75 0
%77 = OpCompositeExtract %float %75 1
%78 = OpCompositeExtract %float %75 2
%79 = OpCompositeExtract %float %75 3
%80 = OpCompositeExtract %v4float %74 1
%81 = OpCompositeExtract %float %80 0
%82 = OpCompositeExtract %float %80 1
%83 = OpCompositeExtract %float %80 2
%84 = OpCompositeExtract %float %80 3
%85 = OpCompositeExtract %v4float %74 2
%86 = OpCompositeExtract %float %85 0
%87 = OpCompositeExtract %float %85 1
%88 = OpCompositeExtract %float %85 2
%89 = OpCompositeExtract %float %85 3
%90 = OpCompositeExtract %v4float %74 3
%91 = OpCompositeExtract %float %90 0
%92 = OpCompositeExtract %float %90 1
%93 = OpCompositeExtract %float %90 2
%94 = OpCompositeExtract %float %90 3
%95 = OpCompositeConstruct %v4float %76 %77 %78 %79
%96 = OpCompositeConstruct %v4float %81 %82 %83 %84
%97 = OpCompositeConstruct %v4float %86 %87 %88 %89
%98 = OpCompositeConstruct %v4float %91 %92 %93 %94
%99 = OpCompositeConstruct %mat4v4float %95 %96 %97 %98
OpReturnValue %99
OpFunctionEnd

; Function FooBar_load
%FooBar_load = OpFunction %int None %132
%this = OpFunctionParameter %FooBar
%134 = OpLabel
%135 = OpCompositeExtract %ulong %this 0
%137 = OpConvertUToPtr %_ptr_PhysicalStorageBuffer_int %135
%138 = OpLoad %int %137 Aligned 4
OpReturnValue %138
OpFunctionEnd

from slang.

bmillsNV avatar bmillsNV commented on June 21, 2024

@sriramm-nv looks like a GLV issue. Can you help to track?

from slang.

sriramm-nv avatar sriramm-nv commented on June 21, 2024

We have a bug to track this. Do we have any pending work in slang that @ArielG-NV is working on?

from slang.

csyonghe avatar csyonghe commented on June 21, 2024

There is nothing we can do inside Slang regarding this issue.

from slang.
