wsl's Introduction

W3C GPU for the Web Community Group

This is the repository for the W3C GPU for the Web Community Group WebGPU API and WebGPU Shading Language (WGSL) specifications. This specification is formally standardized by the W3C GPU for the Web Working Group.

We use the wiki and issue tracker as the main sources of information related to the work. This repository will hold the actual specification, examples, etc.

Work-in-progress specification: https://gpuweb.github.io/gpuweb/

Work-in-progress WGSL specification: https://gpuweb.github.io/gpuweb/wgsl/

Charter

The charter for this group is maintained in a separate repository.

Membership

Membership in the Community Group is open to anyone. We especially encourage hardware vendors, browser engine developers, 3D software engineers, and any web developers with expertise in graphics to participate. You'll need a W3C account to join, and if you're affiliated with a W3C member, your W3C representative will confirm your participation. If you're not a W3C member, you're still welcome. All participants are required to agree to the Contributor License Agreement.

Contributions

You are not required to be a member of the Community Group or Working Group in order to file issues, report errors, propose fixes, or make suggestions. Anyone with a GitHub account can do so.

To ensure that WebGPU specifications can be implemented on a Royalty-Free (RF) basis, all significant contributions need to be made with RF commitments. Members of the Working Group, and members of the Community Group who have signed the Final Specification Agreement, have already committed to the terms of the W3C Patent Policy. Non-members will be requested to provide an RF commitment under terms similar to the W3C Patent Policy.

All contributions must comply with the group's contribution guidelines.

See CONTRIBUTING.md for technical guidance on contributing.

Code of Conduct

This group operates under W3C's Code of Conduct Policy.

Communication

Our primary public chat channel is on Matrix at #WebGPU:matrix.org.

For asynchronous concerns, we use GitHub for both our issue tracker and our discussions forum.

Both the Community Group and the Working Group have W3C email lists as well, though these are largely administrative.

wsl's People

Contributors

dj2, grorg, kungfooman, litherum, qanat, robinmorisset, saambarati, thomasdenney

wsl's Issues

Investigation: Subgroup Support in WHLSL

See [0]. Subgroup support in the shading languages mostly overlaps, but SPIR-V has functions with no counterpart in the other languages. In this article, only the functions common to all of the languages are inspected. Tables are divided according to the feature categories specified in [0]. The risks of treating MSL's SIMD-group and quad-group uniformly (SIMD-group on macOS, quad-group on iOS) should be investigated further.

Existing Overlap Work

The DirectX Shader Compiler, which supports a SPIR-V backend for HLSL, already documents an HLSL to SPIR-V mapping [1], which we can use as a guide for the HLSL to GLSL mapping.

Tables

WebGPUSubgroupFeatureBits::BASIC

| Proposed Function | HLSL Function [2] | MSL Function (SIMD-group / quad-group) [3] | GLSL Function [4] |
| --- | --- | --- | --- |
| u32 subgroupSize(void) | uint WaveGetLaneCount(void) | [[threads_per_simdgroup]] / 4 | gl_SubgroupSize |
| u32 subgroupThreadIndex(void) | uint WaveGetLaneIndex(void) | [[thread_index_in_simdgroup]] / [[thread_index_in_quadgroup]] | gl_SubgroupInvocationID |
| bool subgroupElect(void) | bool WaveIsFirstLane(void) | bool simd_is_first(void) / bool quad_is_first(void) | bool subgroupElect(void) |

WebGPUSubgroupFeatureBits::VOTE

| Proposed Function | HLSL Function [2] | MSL Function (SIMD-group / quad-group) [3] | GLSL Function [4] |
| --- | --- | --- | --- |
| bool subgroupAll(bool value) | bool WaveActiveAllTrue(bool expr) | bool simd_all(bool expr) / bool quad_all(bool expr) | bool subgroupAll(bool value) |
| bool subgroupAny(bool value) | bool WaveActiveAnyTrue(bool expr) | bool simd_any(bool expr) / bool quad_any(bool expr) | bool subgroupAny(bool value) |

WebGPUSubgroupFeatureBits::BALLOT

| Proposed Function | HLSL Function [2] | MSL Function (SIMD-group / quad-group) [3] | GLSL Function [4] |
| --- | --- | --- | --- |
| u128 subgroupBallot(bool value) | uint4 WaveActiveBallot(bool expr) | simd_vote simd_ballot(bool expr) / quad_vote quad_ballot(bool expr) | uvec4 subgroupBallot(bool value) |
| type subgroupBroadcast(type value, u32 threadIndex) | <type> WaveReadLaneAt(<type> expr, uint laneIndex) | T simd_broadcast(T data, ushort broadcast_lane_id) / T quad_broadcast(T data, ushort broadcast_lane_id) | genType subgroupBroadcast(genType value, uint id) |
| type subgroupBroadcastFirst(type value) | <type> WaveReadLaneFirst(<type> expr) | T simd_broadcast_first(T data) / T quad_broadcast_first(T data) | genType subgroupBroadcastFirst(genType value) |
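
As a rough illustration of the ballot and broadcast semantics listed above, the following Python sketch models a subgroup as a list of per-lane values. The function names mirror the proposed ones, but the list-based model, the packing of up to 128 lanes into four 32-bit words (chosen to mirror the uint4 / uvec4 return types), and the assumption that every lane is active are all simplifications made for illustration, not part of any of the specifications.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")

WORD_BITS = 32
NUM_WORDS = 4  # mirrors the four 32-bit components of uint4 / uvec4


def subgroup_ballot(values: Sequence[bool]) -> List[int]:
    """Pack one bit per lane into four 32-bit words; lane i sets bit i."""
    if len(values) > WORD_BITS * NUM_WORDS:
        raise ValueError("this model supports at most 128 lanes")
    words = [0] * NUM_WORDS
    for lane, value in enumerate(values):
        if value:
            words[lane // WORD_BITS] |= 1 << (lane % WORD_BITS)
    return words


def subgroup_broadcast(values: Sequence[T], thread_index: int) -> List[T]:
    """Every lane receives the value held by lane `thread_index`."""
    return [values[thread_index]] * len(values)


def subgroup_broadcast_first(values: Sequence[T]) -> List[T]:
    """Every lane receives the value of lane 0 (all lanes assumed active)."""
    return subgroup_broadcast(values, 0)
```

In this model, lanes 0, 2 and 3 voting true produce a first word of 0b1101, with the remaining three words zero.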

WebGPUSubgroupFeatureBits::ARITHMETIC

| Proposed Function | HLSL Function [2] | MSL Function (SIMD-group / quad-group) [3] | GLSL Function [4] |
| --- | --- | --- | --- |
| type subgroupAdd(type value) | <type> WaveActiveSum(<type> expr) | T simd_sum(T data) / T quad_sum(T data) | genType subgroupAdd(genType value) |
| type subgroupMul(type value) | <type> WaveActiveProduct(<type> expr) | T simd_product(T data) / T quad_product(T data) | genType subgroupMul(genType value) |
| type subgroupMin(type value) | <type> WaveActiveMin(<type> expr) | T simd_min(T data) / T quad_min(T data) | genType subgroupMin(genType value) |
| type subgroupMax(type value) | <type> WaveActiveMax(<type> expr) | T simd_max(T data) / T quad_max(T data) | genType subgroupMax(genType value) |
| type subgroupAnd(type value) | <type> WaveActiveBitAnd(<type> expr) | T simd_and(T data) / T quad_and(T data) | genType subgroupAnd(genType value) |
| type subgroupOr(type value) | <type> WaveActiveBitOr(<type> expr) | T simd_or(T data) / T quad_or(T data) | genType subgroupOr(genType value) |
| type subgroupXor(type value) | <type> WaveActiveBitXor(<type> expr) | T simd_xor(T data) / T quad_xor(T data) | genType subgroupXor(genType value) |
| type subgroupPrefixExclusiveAdd(type value) | <type> WavePrefixSum(<type> expr) | T simd_prefix_exclusive_sum(T data) / T quad_prefix_exclusive_sum(T data) | genType subgroupExclusiveAdd(genType value) |
| type subgroupPrefixExclusiveMul(type value) | <type> WavePrefixProduct(<type> expr) | T simd_prefix_exclusive_product(T data) / T quad_prefix_exclusive_product(T data) | genType subgroupExclusiveMul(genType value) |
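
To make the difference between the reductions and the exclusive prefix operations concrete, here is a minimal Python sketch that again models a subgroup as a list of per-lane values. The snake_case function names are hypothetical stand-ins for the proposed functions, and the model assumes a non-empty subgroup in which all lanes are active.

```python
import operator
from functools import reduce
from typing import Callable, List, Sequence


def subgroup_reduce(values: Sequence[int], op: Callable[[int, int], int]) -> List[int]:
    """Reduction: every lane receives `op` folded over all lanes."""
    total = reduce(op, values)  # assumes at least one lane
    return [total] * len(values)


def subgroup_prefix_exclusive(
    values: Sequence[int], op: Callable[[int, int], int], identity: int
) -> List[int]:
    """Exclusive scan: lane i receives `op` folded over lanes 0..i-1.

    Lane 0 has no predecessors, so it receives the identity element.
    """
    out: List[int] = []
    running = identity
    for value in values:
        out.append(running)
        running = op(running, value)
    return out


def subgroup_add(values: Sequence[int]) -> List[int]:
    return subgroup_reduce(values, operator.add)


def subgroup_and(values: Sequence[int]) -> List[int]:
    return subgroup_reduce(values, operator.and_)


def subgroup_prefix_exclusive_add(values: Sequence[int]) -> List[int]:
    return subgroup_prefix_exclusive(values, operator.add, 0)


def subgroup_prefix_exclusive_mul(values: Sequence[int]) -> List[int]:
    return subgroup_prefix_exclusive(values, operator.mul, 1)
```

For lane values 3, 1, 4, 1, the reduction gives every lane 9, while the exclusive prefix sum gives the lanes 0, 3, 4 and 8 respectively.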

WebGPUSubgroupFeatureBits::QUAD

| Proposed Function | HLSL Function [2] | MSL Function (SIMD-group / quad-group) [3] | GLSL Function [4] |
| --- | --- | --- | --- |
| type subgroupQuadBroadcast(type value, u32 threadIndex) | <type> QuadReadLaneAt(<type> expr, uint laneIndex) | T quad_broadcast(T data, ushort broadcast_lane_id) / T quad_broadcast(T data, ushort broadcast_lane_id) | genType subgroupQuadBroadcast(genType value, uint id) |

References

Memory model

We should have something to say about the memory model.

[WHLSL] Dynamic function generation affects further compilation

Migrated from https://bugs.webkit.org/show_bug.cgi?id=189988:

At 2018-09-26T05:03:18Z, [email protected] wrote:
The dynamic creation of operator==() in Checker causes later checks to succeed or fail. For example, if you cause Checker to generate native bool operator==(int* thread,int* thread) and native bool operator==(int* thread* thread,int* thread* thread), later calls with (nullptr, nullptr) will fail because of an ambiguous overload.
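
The failure mode can be sketched with a toy overload resolver. This is not WHLSL's actual resolution algorithm: the string-based type model, the rule that nullptr matches any pointer parameter, and all names below are assumptions chosen only to show how adding a generated overload can turn a previously valid call into an ambiguous one.

```python
from typing import List, Sequence, Tuple

NULLPTR = "nullptr"  # stand-in for the type of a null literal

Signature = Tuple[str, ...]


class NoMatchingOverload(Exception):
    pass


class AmbiguousOverload(Exception):
    pass


def is_pointer(type_name: str) -> bool:
    return "*" in type_name


def matches(param: str, arg: str) -> bool:
    """An argument matches on exact type, or when nullptr meets a pointer."""
    return param == arg or (arg == NULLPTR and is_pointer(param))


def resolve(overloads: Sequence[Signature], args: Sequence[str]) -> Signature:
    """Return the single overload accepting `args`, or raise."""
    candidates: List[Signature] = [
        sig
        for sig in overloads
        if len(sig) == len(args) and all(matches(p, a) for p, a in zip(sig, args))
    ]
    if not candidates:
        raise NoMatchingOverload(args)
    if len(candidates) > 1:
        raise AmbiguousOverload(candidates)
    return candidates[0]


PTR = "int* thread"
PTR_PTR = "int* thread* thread"

# Overload sets as they might look after the checker has generated
# operator==() for one, then two, pointer types.
after_first_generation = [(PTR, PTR)]
after_second_generation = [(PTR, PTR), (PTR_PTR, PTR_PTR)]
```

In this model, resolving (nullptr, nullptr) succeeds against the first overload set and raises AmbiguousOverload against the second, so the outcome depends on what the checker happened to generate earlier.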

Bounding undef-ness

During the F2F last week, Philip raised the issue of how to bound the
variability of an undef value. The troublesome scenario is:

    int i = some_undef_value();
    a[i];  // How to do the bounds check on this?

A naive translation including the bounds-check is:

    int i = some_undef_value();
    if (i < 0 || i >= a_bound) { return fail; }
    a[i];

But each use of an undef value can see a different concrete value, so
upon lowering to the machine and executing the code you could get the
following (uses of i mapping down to 12, 2, and 900 respectively):

    int i = undef;
    if (12 < 0 || 2 >= a_bound) { return fail; }
    a[900]; // Out of bounds!  You have been pwned.

This is a good problem to solve.

The same problem occurs in digital circuit design where a circuit block
needs to capture a digital signal from an asynchronous input. You
want all uses within your block to use a consistent value (for a given
clock cycle). The "undef" behaviour is analogous to sampling that
asynchronous signal at each use within your block, which leads to
inconsistent values due to differing signal propagation delays to reach
those uses. The standard solution is to sample the asynchronous input
exactly once (with one register in the receiving clock domain), and
then distribute the output of that register to all intended uses within
your block.

(I'm skipping over the metastability management aspect to achieve a
target MTBF. That's not relevant here.)

How does that translate back to the compiler-for-an-ISA domain?
The "undef" behaviour is supposed to give the compiler stack the
freedom to reuse registers for different program (SSA) values without
having to clear the register first or save/spill the register when no
longer needed. The capture behaviour requires the whole compiler stack
to preserve the captured value either within a register or by spilling,
whatever is appropriate.

So, to safely do the bounds check on the array index, you have to
capture the index and use the same captured value for both the bounds
check and the access. In our example, the compiler has to do something
like:

    int i = some_undef_value();
    int i_as_index = CAPTURE(i);  // A new compiler primitive.
    if (i_as_index < 0 || i_as_index >= a_bound) { return fail; }
    a[i_as_index];

The semantics of CAPTURE(x) is:

  • if x is undef, then result is some value for typeof(x). The result
    value is not an undef value. That is, all uses of the result will
    see the same value. In concrete terms for common architectures, the
    value is preserved in a register or spilled as needed.

  • if x is not an undef, then the result is x.

Each implementation can have its own particular implementation of the CAPTURE
primitive. For example, a crude one could map each instance of CAPTURE
to its own variable: executing CAPTURE(x) does a volatile store to that
variable, and every use of the result is a volatile load from it.
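
The difference between the naive check and the captured check can be simulated in a few lines of Python. This is only a model under stated assumptions: the Undef class, the use() and capture() helpers, and raw_access() are hypothetical names invented here, and an undef value is simulated as a fixed sequence of observations so that the 12, 2, 900 scenario above can be replayed deterministically.

```python
from itertools import cycle
from typing import Iterable, Optional, Sequence, Union


class Undef:
    """Models an undef value: each read may observe a different value."""

    def __init__(self, observations: Iterable[int]) -> None:
        self._observations = cycle(observations)

    def read(self) -> int:
        return next(self._observations)


Value = Union[int, Undef]


def use(x: Value) -> int:
    """One use site of a value; an undef is re-sampled at every use."""
    return x.read() if isinstance(x, Undef) else x


def capture(x: Value) -> int:
    """CAPTURE(x): sample an undef exactly once and return a stable int."""
    return x.read() if isinstance(x, Undef) else x


def raw_access(a: Sequence[int], idx: int) -> int:
    """Stands in for an unchecked load; raises instead of reading wild memory."""
    if not 0 <= idx < len(a):
        raise IndexError(f"out-of-bounds access at index {idx}")
    return a[idx]


def naive_load(a: Sequence[int], i: Value) -> Optional[int]:
    # Three separate uses of i: two in the check, one in the access.
    if use(i) < 0 or use(i) >= len(a):
        return None  # the "return fail" path
    return raw_access(a, use(i))


def captured_load(a: Sequence[int], i: Value) -> Optional[int]:
    idx = capture(i)  # the check and the access share this one value
    if idx < 0 or idx >= len(a):
        return None
    return raw_access(a, idx)
```

With a ten-element array and an undef that is observed as 12, 2 and then 900, naive_load passes both comparisons and then attempts the out-of-bounds access, whereas captured_load sees 12 everywhere and takes the failure path.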

References

[1] "Synchronization in Digital Logic Circuits" (tutorial) Ryan Donohue.
https://web.stanford.edu/class/ee183/handouts/synchronization_pres.pdf

[2] Understanding Metastability in FPGAs (Intel Whitepaper).
https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/wp/wp-01082-quartus-ii-metastability.pdf

[WHLSL] It shouldn’t be possible to use ternary expressions as l-values

Migrated from https://bugs.webkit.org/show_bug.cgi?id=189290:

At 2018-09-05T00:32:16Z, [email protected] wrote:
Ternary expressions can be used as l-values:

    (someCondition ? a : b) = 42;

It should be possible to take the address of any l-value, however the following doesn’t work:

    int a = 42;
    int b = 43;
    thread int* c = &(someCondition ? a : b);

We should either disallow taking the address of a ternary expression, improve the current error message “Bad address space: undefined”, or support this.

Shader Modularity

Metal Shading Language

Metal Shading Language has "Function Constants," (Section 4.10 of the Metal Shading Language 2.1 spec) where you can say something like

    constant int a [[function_constant(0)]];

and then there's a two-phase preparation of the shader:

  1. Compile the source of the shader by calling MTLDevice.makeLibrary(source:options:), where you specify the preprocessor macros
  2. Specialize the shader by calling MTLLibrary.makeFunction(name:constantValues:), where you specify the function constants

GLSL

Similarly, GLSL (and SPIR-V) have "Specialization Constants," (Section 7.2.1 of the OpenGL 4.6 Core spec, and section 4.11 of the GLSL 4.6 spec) where you can say something like

    layout (constant_id = 0) const int a = 42;

and then there's a three-phase preparation of the shader:

  1. Specify preprocessor macros by manually generating strings of the form "#define foo 5" and injecting them into the GLSL source string at the right place
  2. glShaderSource() which presumably does some compilation
  3. glSpecializeShader()

GLSL (but not SPIR-V?) also has a concept of subroutines (Section 7.10 of the OpenGL 4.6 Core spec, and section 6.1.2 of the GLSL 4.6 spec), which are just like specialization constants, except for functions. You can say

    subroutine float MySignature(float a, float b);
    subroutine(MySignature) float foo(float a, float b) { ... }
    subroutine(MySignature) float bar(float a, float b) { ... }
    subroutine uniform MySignature myUniformName;
    ...
    float r = myUniformName(3.3, 4.4);

and the OpenGL API hooks up either foo() or bar() to myUniformName. Notably, this happens after compilation, and the value for myUniformName could change each frame / draw call.

HLSL

AFAICT, HLSL doesn't have anything like function constants, specialization constants, or subroutines, but instead has preprocessor macros. You can specify these as arguments to D3DPreprocess().


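| Language | Generics / Templates | Specialization Constants | Subroutines | Preprocessor Macros | Polymorphism |
| --- | --- | --- | --- | --- | --- |
| HLSL | | | | | |
| GLSL | | | | | |
| SPIR-V | | | | | |
| MSL | | | | | |
| WHLSL | | | | | |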

These constant values lead to better performance than regular constants because they can cause dead code to be removed before the shader ever hits the GPU. This is important for ubershaders, where most of the code will end up being removed.
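
The dead-code-removal effect can be sketched with a toy specializer. The tuple-based expression tree, the node kinds ("spec", "const", "if", "call"), and the function names below are all hypothetical; this is a minimal model of what a shader compiler might do when specialization values become known, not a description of any real driver.

```python
from typing import Any, Dict, Tuple

Node = Tuple[Any, ...]


def specialize(node: Node, constants: Dict[int, int]) -> Node:
    """Substitute known specialization constants and fold decided branches.

    ("spec", id)           reads specialization constant `id`
    ("const", value)       a literal
    ("if", cond, a, b)     evaluates a when cond is non-zero, else b
    ("call", name)         an opaque piece of shader work
    """
    kind = node[0]
    if kind == "spec":
        const_id = node[1]
        # Unknown constants are left in place for a later specialization.
        return ("const", constants[const_id]) if const_id in constants else node
    if kind == "if":
        cond = specialize(node[1], constants)
        if cond[0] == "const":
            # The branch is decided now; the other arm is dropped entirely.
            taken = node[2] if cond[1] else node[3]
            return specialize(taken, constants)
        return (
            "if",
            cond,
            specialize(node[2], constants),
            specialize(node[3], constants),
        )
    return node


# A tiny "ubershader" that picks a lighting model from constant 0.
ubershader: Node = (
    "if",
    ("spec", 0),
    ("call", "fancy_lighting"),
    ("call", "simple_lighting"),
)
```

Specializing ubershader with {0: 0} leaves only the simple_lighting call, so the code for the other path never reaches the GPU. With no constants supplied, the tree is returned unchanged.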

WHLSL does not include a preprocessor because of the additional complexity it brings. Similarly, on the last WebGPU call where we discussed shading languages, we agreed to remove the ability for user-defined structs/functions to accept type arguments.

The big game engines (Unity, Unreal, etc.) often don't have a single shader to run; instead, they often have families of related shaders. For example, an engine's shader might describe a forward-rendering algorithm, but leave the BRDF and lighting model up to the specific app linking with it. Generics, specialization constants, and preprocessor macros are all ways of making this easier.

Given that HLSL has been successful without specialization constants, perhaps they aren't necessary. On the other hand, WHLSL doesn't have the mechanism that people usually use to specialize HLSL shaders (preprocessor macros). GLSL has both specialization constants and subroutines, but GLSL ES has removed both of those features, so perhaps this entire problem isn't very important. We should figure out what to do here.
