pygfx / pyshader
Write modern GPU shaders in Python!
License: BSD 2-Clause "Simplified" License
It would be especially nice to show the offending Python source line when it cannot be converted to SPIR-V.
Shader entrypoints have IO:
References:
How do we want to spell these out? There are always three, and sometimes more, "attributes" to an i/o variable:
In the examples below, I write the triangle vertex shader, and a (hypothetical) compute shader that calculates the average of three values (from a buffer) and writes it back to a buffer.
In the current approach, you call define on input / output, passing name, slot, and type. The actual value is then accessible as an attribute of the input or output object.
Cons: the define calls add noise, and it is easy to mix up the define arguments.

@python2shader
def vertex_shader(input, output):
    input.define("index", "VertexId", i32)
    output.define("pos", "Position", vec4)
    output.define("color", 0, vec3)
    positions = [vec2(+0.0, -0.5), vec2(+0.5, +0.5), vec2(-0.5, +0.7)]
    p = positions[input.index]
    output.pos = vec4(p, 0.0, 1.0)
    output.color = vec3(p, 0.5)
@python2shader
def compute_shader(input, buffers):
    input.define("offset", "SomeBuiltinIndex", i32)
    buffers.define("pos3dArray", 0, Array(vec3))  # read
    buffers.define("avgArray", 1, Array(f32))  # write
    pos3d = buffers.pos3dArray[input.offset]
    buffers.avgArray[input.offset] = (pos3d.x + pos3d.y + pos3d.z) / 3
This can also be written using indexing. Indexing works well here because it can be used as a getter and a setter. You also assign directly to the local name, which is nice.
Cons: one can also write output[0, vec3] = vec3(p, 0.5) directly, which makes the code less readable because it moves binding definitions to the bottom of the shader, and there is no local variable name to add semantics. We can tell users to avoid it but ...

@python2shader
def vertex_shader(input, output):
    index = input["VertexId", i32]
    outpos = output["Position", vec4]
    outcolor = output[0, vec3]
    positions = [vec2(+0.0, -0.5), vec2(+0.5, +0.5), vec2(-0.5, +0.7)]
    p = positions[index]
    outpos.xyzw = vec4(p, 0.0, 1.0)
    outcolor.rgb = vec3(p, 0.5)
@python2shader
def compute_shader(input, buffers):
    offset = input["SomeBuiltinIndex", i32]
    pos3dArray = buffers[0, Array(vec3)]
    avgArray = buffers[1, Array(f32)]
    pos3d = pos3dArray[offset]
    avgArray[offset] = (pos3d.x + pos3d.y + pos3d.z) / 3
This looks quite Pythonic, and opens up the possibility to define io binding outside of the function.
Cons: one could re-assign the variable, e.g. outcolor = ..., but we should not allow that. I think we should raise an error for this syntax.

The next approach uses annotations with Input etc.:

@python2shader
def vertex_shader(
    index: Input("VertexId", i32),
    outpos: Output("Position", vec4),
    outcolor: Output(0, vec3),
):
    positions = [vec2(+0.0, -0.5), vec2(+0.5, +0.5), vec2(-0.5, +0.7)]
    p = positions[index]
    outpos.xyzw = vec4(p, 0.0, 1.0)
    outcolor.rgb = vec3(p, 0.5)
@python2shader
def compute_shader(
    offset: Input("SomeBuiltinIndex", i32),
    pos3dArray: BufferIO(0, Array(vec3)),
    avgArray: BufferIO(1, Array(f32)),
):
    pos3d = pos3dArray[offset]
    avgArray[offset] = (pos3d.x + pos3d.y + pos3d.z) / 3
Compared to the above, we don't need additional classes, so it feels more "lightweight", but it may be harder to remember the order of the elements.
@python2shader
def vertex_shader(
    index: ("input", "VertexId", i32),
    outpos: ("output", "Position", vec4),
    outcolor: ("output", 0, vec3),
):
    positions = [vec2(+0.0, -0.5), vec2(+0.5, +0.5), vec2(-0.5, +0.7)]
    p = positions[index]
    outpos.xyzw = vec4(p, 0.0, 1.0)
    outcolor.rgb = vec3(p, 0.5)
@python2shader
def compute_shader(
    offset: ("input", "SomeBuiltinIndex", i32),
    pos3dArray: ("buffer", 0, Array(vec3)),
    avgArray: ("buffer", 1, Array(f32)),
):
    pos3d = pos3dArray[offset]
    avgArray[offset] = (pos3d.x + pos3d.y + pos3d.z) / 3
Also no classes, but this variant makes things somewhat "easier" by combining the resource type with the resource binding. This feels kind of natural, I think?
@python2shader
def vertex_shader(
    index: ("input:VertexId", i32),
    outpos: ("output:Position", vec4),
    outcolor: ("output:0", vec3),
):
    positions = [vec2(+0.0, -0.5), vec2(+0.5, +0.5), vec2(-0.5, +0.7)]
    p = positions[index]
    outpos.xyzw = vec4(p, 0.0, 1.0)
    outcolor.rgb = vec3(p, 0.5)
@python2shader
def compute_shader(
    offset: ("input:SomeBuiltinIndex", i32),
    pos3dArray: ("buffer:0", Array(vec3)),
    avgArray: ("buffer:1", Array(f32)),
):
    pos3d = pos3dArray[offset]
    avgArray[offset] = (pos3d.x + pos3d.y + pos3d.z) / 3
In #35 we added control flow. One idea was to let ternary operations (.. if .. else ..) be translated to co_select: choose between two objects depending on the truth value of another object. This enables if-like behavior without introducing branching. Anyway, that idea failed because it turned out too complex/iffy to reliably detect the use of ternary operations from the Python bytecode.
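For illustration (not part of the original issue), disassembling a ternary shows why the detection is iffy: it compiles to a conditional jump, just like a regular if statement (exact opcode names vary per CPython version):

```python
import dis

def f(a, b, cond):
    return a if cond else b

ops = [ins.opname for ins in dis.get_instructions(f)]
print(ops)
# A conditional jump opcode appears here, the same kind a normal `if`
# produces, so at the bytecode level the ternary is not distinguishable
# from ordinary branching.
```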
I think for the time being we leave it as is. I recall that some years ago people went to great lengths trying to avoid branching in shaders. I'm not sure how relevant that is with current hardware.
If it turns out to still be relevant, we could add an explicit select(condition, ob_if_true, ob_if_false) function.
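Such a select could behave like this plain-Python reference (the name and signature come from the proposal above; this is not an existing pyshader API):

```python
def select(condition, ob_if_true, ob_if_false):
    """Reference semantics for a branchless select (cf. SPIR-V OpSelect):
    both operands are already evaluated; the condition only picks one."""
    return ob_if_true if condition else ob_if_false

print(select(True, 1.0, 2.0))   # picks the first operand
print(select(False, 1.0, 2.0))  # picks the second operand
```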
The idea for PyShader is based on the following advantages:
Disadvantages:
I think that for pygfx we need to consider WGSL too.
Or ... maybe not. GLSL modules have one entry point, so we'd exclude compiling to GLSL. At the least, wait until we know if WGSL supports this too.
Pros:
It would be very useful if it's possible to define layout as part of the function definition. It could be something like:
@python2shader
def compute_shader_multiply(
    index: ("input", "GlobalInvocationId", ivec3),
    data1: ("buffer", 0, Array(f32)),
    data2: ("buffer", 1, Array(f32)),
    data3: ("buffer", 2, Array(f32)),
    layout=[x, y, z],
):
    i = index.x
    data3[i] = data1[i] * data2[i]
Can we make a first release?
We could then add requirements.txt files to the examples in https://github.com/almarklein/wgpu-py and start working towards tests for all the GUI backends.
We use annotations to specify shader type information using tuples, e.g.:
@python2shader
def vertex_shader(
    index: ("input", "VertexId", i32),
    out_pos: ("output", "Position", vec4),
    out_color: ("output", 0, vec3),
):
    ...
Unfortunately, pyflakes (which we use via flake8), reports stuff like:
F821 undefined name 'VertexId'
F821 undefined name 'output'
...
This is why I added F821 to the ignore list, but I realized much later that this hides all occurrences of an undefined name, which is one of those crucial errors that you want to detect beforehand. Whoops.
Googling for this, I bumped into some pyflakes issues, which basically state that this is intended behavior. Their argument refers to a section in PEP 563:
While annotations are still available for arbitrary use besides type checking, it is worth mentioning that the design of this PEP, as well as its precursors (PEP 484 and PEP 526), is predominantly motivated by the type hinting use case.
[...]
With this in mind, uses for annotations incompatible with the aforementioned PEPs should be considered deprecated.
It is not clear to me whether annotations used this way "conflict" with those PEPs, and whether our usage is now considered deprecated.
This all would be a non-issue if pyflakes used different error codes for undefined names in annotations :(
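A narrower workaround than a global ignore (my suggestion, not from the discussion; the path pattern is hypothetical) is flake8's per-file-ignores option, which confines the F821 suppression to the shader modules:

```ini
# setup.cfg (or .flake8) -- scope F821 to shader code only
[flake8]
per-file-ignores =
    examples/shaders/*.py: F821
```

This way, undefined names in ordinary modules are still reported.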
In Python bytecode, the argument of a jump instruction is encoded in a single byte, meaning it can have a value of at most 255. To work around this, Python bytecode has EXTENDED_ARG to supply extra bytes for an upcoming instruction. We should be able to deal with these.
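For illustration, a function body long enough that a jump offset exceeds one byte forces CPython to emit EXTENDED_ARG before the jump (the generated function here is just a synthetic example):

```python
import dis

# Build a function whose `if` body spans far more than 255 code units,
# so the conditional jump needs EXTENDED_ARG to encode its target.
src = "def f(x):\n    if x:\n" + "        x = x + 1\n" * 300 + "    return x\n"
ns = {}
exec(src, ns)
opnames = [ins.opname for ins in dis.get_instructions(ns["f"])]
print("EXTENDED_ARG" in opnames)
```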
Python implicitly converts ints to floats when one of the operands is a float, or when the op is division. Should we follow this approach or enforce strict types?
I think the former, because it makes the code feel more Pythonic, and I don't think it will cause much confusion as long as we stick to only doing it with int/float.
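For reference, this is the CPython behavior that would be mirrored:

```python
# Mixed int/float arithmetic promotes to float; true division always
# yields a float, even for two int operands.
a = 3 + 1.5      # int + float -> float
b = 7 / 2        # true division -> float
c = 7 // 2       # floor division of ints stays int
print(a, b, c)
```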
Obviously, the bytecode emitted by Python 3.9 has changed somewhat again, so we need to update ...
Edit: Given this has been fixed, and works correctly with the stdlib functions, the issue can be focused mainly on documenting these functions, as they are currently not documented.
I am currently trying to implement the machine-learning logistic regression algorithm below (as explained in this blog post). Most of the shader can be restructured around some of the limitations (such as removing the functions), but it seems that the current blocker is that the log(...) function is not supported. It would be quite useful if built-in shader functions could be used in the pyshader function.
The GLSL shader being implemented is below:
#version 450
layout (constant_id = 0) const uint M = 0;
layout (local_size_x = 1) in;
layout(set = 0, binding = 0) buffer bxi { float xi[]; };
layout(set = 0, binding = 1) buffer bxj { float xj[]; };
layout(set = 0, binding = 2) buffer by { float y[]; };
layout(set = 0, binding = 3) buffer bwin { float win[]; };
layout(set = 0, binding = 4) buffer bwouti { float wouti[]; };
layout(set = 0, binding = 5) buffer bwoutj { float woutj[]; };
layout(set = 0, binding = 6) buffer bbin { float bin[]; };
layout(set = 0, binding = 7) buffer bbout { float bout[]; };
layout(set = 0, binding = 8) buffer blout { float lout[]; };
float m = float(M);
float sigmoid(float z) {
    return 1.0 / (1.0 + exp(-z));
}

float inference(vec2 x, vec2 w, float b) {
    // Compute the linear mapping function
    float z = dot(w, x) + b;
    // Calculate the y-hat with sigmoid
    float yHat = sigmoid(z);
    return yHat;
}

float calculateLoss(float yHat, float y) {
    return -(y * log(yHat) + (1.0 - y) * log(1.0 - yHat));
}

void main() {
    uint idx = gl_GlobalInvocationID.x;
    vec2 wCurr = vec2(win[0], win[1]);
    float bCurr = bin[0];
    vec2 xCurr = vec2(xi[idx], xj[idx]);
    float yCurr = y[idx];
    float yHat = inference(xCurr, wCurr, bCurr);
    float dZ = yHat - yCurr;
    vec2 dW = (1. / m) * xCurr * dZ;
    float dB = (1. / m) * dZ;
    wouti[idx] = dW.x;
    woutj[idx] = dW.y;
    bout[idx] = dB;
    lout[idx] = calculateLoss(yHat, yCurr);
}
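For reference, the per-invocation math in the GLSL main() above corresponds to this plain-Python sketch (one sample at a time; the function name is mine, the variable names mirror the GLSL):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def logistic_step(x, w, b, y, m):
    """Per-sample gradients and loss, mirroring the GLSL main()."""
    z = w[0] * x[0] + w[1] * x[1] + b   # inference(): linear map ...
    y_hat = sigmoid(z)                  # ... plus sigmoid
    dz = y_hat - y
    dw = ((1.0 / m) * x[0] * dz, (1.0 / m) * x[1] * dz)
    db = (1.0 / m) * dz
    # calculateLoss(): binary cross-entropy, the part that needs log()
    loss = -(y * math.log(y_hat) + (1.0 - y) * math.log(1.0 - y_hat))
    return dw, db, loss

dw, db, loss = logistic_step(x=(1.0, 2.0), w=(0.0, 0.0), b=0.0, y=1.0, m=1.0)
print(dw, db, loss)
```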
SPIR-V supports specialization: setting constants at pipeline-creation time. Could be useful?
CI is ready to test this. Now for some tests and fixes.
https://gpuweb.github.io/gpuweb/wgsl.html
Well, there it is! Do we need to do anything in response?
I have doubts whether the current scope makes much sense: bytes2spirv and file2spirv feel silly. Running glslc in a subprocess is then the easy part.
I think I want to refactor this to be:
Then when WebGPU becomes a thing we may add, if it makes sense:
(edit by @Korijn - transferred this issue here from wgpu-py repo)
I am trying to run the compute_noop.py example and get the following error:
$ RUST_BACKTRACE=1 python compute_noop.py
thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: Other', /Users/runner/runners/2.166.2/work/1/s/wgpu/wgpu-core/src/device/mod.rs:1895:17
stack backtrace:
0: <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt
1: core::fmt::write
2: std::io::Write::write_fmt
3: std::panicking::default_hook::{{closure}}
4: std::panicking::default_hook
5: std::panicking::rust_panic_with_hook
6: rust_begin_unwind
7: core::panicking::panic_fmt
8: core::result::unwrap_failed
9: wgpu_core::device::<impl wgpu_core::hub::Global<G>>::device_create_compute_pipeline
10: ffi_call_unix64
11: ffi_call_int
12: cdata_call
13: _PyObject_MakeTpCall
14: call_function
15: _PyEval_EvalFrameDefault
16: _PyEval_EvalCodeWithName
17: _PyFunction_Vectorcall
18: method_vectorcall
19: call_function
20: _PyEval_EvalFrameDefault
21: _PyEval_EvalCodeWithName
22: _PyFunction_Vectorcall
23: call_function
24: _PyEval_EvalFrameDefault
25: _PyEval_EvalCodeWithName
26: PyEval_EvalCode
27: PyRun_FileExFlags
28: PyRun_SimpleFileExFlags
29: Py_RunMain
30: pymain_main
31: Py_BytesMain
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
fatal runtime error: failed to initiate panic, error 5
Abort trap: 6
How could I debug this further? I am on Mac btw. Any ideas?
Is it possible to precompile shaders to files with the current pyshader API? Or would that still need to happen at runtime because it depends on drivers & hardware?
When I started this, I thought that allowing code to call other (Python-defined) functions (#8) would solve all the problems that we normally see when composing shaders from different snippets. It does solve the issue of being able to use common code in different places, in a very natural way.
However, quite often you have a shader that you want to behave slightly differently depending on the type of some inputs, e.g. whether a texture is scalar, RGB or RGBA, whether texture coordinates are 1D, 2D or 3D, or depending on the texture dtype. In pygfx we deal with that last issue (texture dtype) by modifying the function's signature in-place, which is obviously a bit of a hack. The only other current solution is to create a shader for each type of each input, but that can quickly result in many entry points that all have (nearly) the same code.
I'd like a more formal way to write a shader and be able to tweak inputs (e.g. dimensionality and dtype of textures, and changing vec2 for vec3 for texture coords). Along the same line, a way to call a function inside a shader, and be able to swap out that function for another one. Of course, after swapping things out, the bytecode will be regenerated, and in that process all the types and signatures are validated to match up.
Great project, thank you for creating it. I'm currently working on using it to integrate with our Vulkan Kompute project specifically through the Kompute Python SDK.
The initial basic example I have to showcase the workflow is a basic multiplication operation as follows:
from kp import Tensor, Manager, Sequence
from pyshader import python2shader, f32, ivec3, Array
# Define simple multiplication shader
@python2shader
def compute_shader_multiply(index: ("input", "GlobalInvocationId", ivec3),
                            data1: ("buffer", 0, Array(f32)),
                            data2: ("buffer", 1, Array(f32)),
                            data3: ("buffer", 2, Array(f32))):
    i = index.x
    data3[i] = data1[i] * data2[i]
# Create tensors that we'll be using
tensor_in_a = Tensor([2, 2, 2])
tensor_in_b = Tensor([1, 2, 3])
tensor_out = Tensor([0, 0, 0])
# Default manager (chooses device 0 and first compute capable queue)
mgr = Manager()
# Initialise the GPU memory & buffers
mgr.eval_tensor_create_def([tensor_in_a, tensor_in_b, tensor_out])
# Run the compute shader
mgr.eval_algo_data_def([tensor_in_a, tensor_in_b, tensor_out], compute_shader_multiply.to_spirv())
# Map the data back to local
mgr.eval_tensor_sync_local_def([tensor_out])
# Confirm successful operation
assert tensor_out.data() == [2.0, 4.0, 6.0]
Unfortunately, when running this it seems to fail when creating the Vulkan pipeline, which seems to be due to an error in the shader structure; namely, the error is Error reading file: #, which seems to happen if the shader doesn't have the expected structure.
When introspecting the shader, I don't think it's possible to specify things like the version of the shader. If you look at the current shader that works correctly (this is the glsl code), the main difference is that it contains the #version 450 definition.
Have you come across this issue before? I would be quite keen to get this working, as I'm planning to write a blog post similar to the ones outlined in the end to end examples and integration with this would make it fully pythonic, so I'd be keen to do it using this library.
Thanks, let me know if you need further details to get more insights on what may be the issue.
Edit: If you would like to try running the example above, there is a slight fix required that is currently in the branch python_shader_extension, which would require running pip install . from that branch to install the kompute Python package.