computeshaderfun's People

Contributors

pbbastian

computeshaderfun's Issues

Allow scan shader to run on a single warp only

#7 needs to scan the counts, but there are always exactly 16 counts, so there is no need to sum across thread groups. A global sync is still needed, however, so it is probably best to do the scan in a separate dispatch rather than appending it to the end of the count shader.

A different shader program using a variant of the same compute shader is probably the way to go.
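As a rough illustration of why 16 counts fit in a single warp, here is a CPU-side Python sketch that simulates a lockstep (Hillis-Steele style) inclusive scan over 16 lanes. The function name and the lane simulation are assumptions made for illustration; this is not the shader from this repository.

```python
def warp_inclusive_scan(counts):
    """Simulate an inclusive scan where every lane of one warp runs in lockstep."""
    lanes = list(counts)
    offset = 1
    while offset < len(lanes):
        # Snapshot first: on the GPU all lanes read before any lane writes.
        previous = lanes[:]
        for lane in range(offset, len(lanes)):
            lanes[lane] = previous[lane] + previous[lane - offset]
        offset *= 2
    return lanes


# 16 counts need only 4 doubling steps and no cross-group summation.
print(warp_inclusive_scan([1] * 16))
```

Because all 16 values live in one warp, no second level of group sums is needed, which is the point made above.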

Radix sort shader program

Depends on #2, #3, #4, #5, #8, #9.

Steps needed:

  • Generate histogram (#5) and scan (#6)
  • Count bitwise combinations and scan (#9)
  • Re-order input using the results of steps 1 and 2 to sort input by radix (#8)
  • Repeat 8 times to sort the entire input

It should be possible to issue all of these dispatches right away, without waiting on intermediate results.
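The steps above can be sketched on the CPU as a least-significant-digit radix sort. This is a minimal Python reference, assuming 4-bit digits (16 buckets, matching the 16 counts mentioned elsewhere) and 32-bit keys, which is what 8 passes would cover. The names `radix_pass` and `radix_sort` are hypothetical, and since the details of #9 are not given here, step 2 is replaced by a simple running counter per bucket.

```python
from itertools import accumulate

RADIX_BITS = 4              # assumption: 16 buckets per pass
BUCKETS = 1 << RADIX_BITS
PASSES = 8                  # 8 passes x 4 bits = 32-bit keys


def radix_pass(keys, shift):
    """Stable reorder of keys by the 4-bit digit found at bit position `shift`."""
    digits = [(k >> shift) & (BUCKETS - 1) for k in keys]

    # Step 1: histogram of digits, then exclusive scan -> first slot of each bucket.
    counts = [0] * BUCKETS
    for d in digits:
        counts[d] += 1
    bucket_start = list(accumulate(counts, initial=0))[:-1]

    # Steps 2 and 3: rank each key within its bucket and scatter it to its slot.
    seen = [0] * BUCKETS
    out = [0] * len(keys)
    for k, d in zip(keys, digits):
        out[bucket_start[d] + seen[d]] = k
        seen[d] += 1
    return out


def radix_sort(keys):
    for p in range(PASSES):
        keys = radix_pass(keys, p * RADIX_BITS)
    return keys


print(radix_sort([0xFFFFFFFF, 3, 0x10000, 42, 3, 0]))
```

Each pass must be stable for the final result to be sorted, which is why the scatter preserves the input order within a bucket.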

Make scan and histogram shaders compatible

Options:

  1. Store histogram as AOS and make version of scan that operates on this structure.
  2. Store histogram as SOA and run a scan for each array using an offset per dispatch.

Option 1 has some advantages for smaller input sizes, as it can do all the scans at once. On the other hand, it increases register usage, which is bad for performance. Larger input sizes should work fine with option 2, which could then be made to operate on multiple inputs from the same sequence, thus not increasing register usage as much.
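To make the two layouts concrete, here is a small Python sketch of both options. It assumes 16 buckets and an exclusive scan; `scan_aos` and `scan_soa` are hypothetical names, and the lists only stand in for GPU buffers.

```python
from itertools import accumulate

BUCKETS = 16


def exclusive_scan(values):
    return list(accumulate(values, initial=0))[:-1]


def scan_aos(hist):
    """Option 1: hist[i] holds all 16 counts of item i; one sweep scans all 16."""
    running = [0] * BUCKETS          # 16 running sums kept live at once
    out = []
    for row in hist:
        out.append(running[:])
        running = [r + c for r, c in zip(running, row)]
    return out


def scan_soa(flat, n):
    """Option 2: 16 arrays of length n stored back to back; one scan per offset."""
    out = [0] * len(flat)
    for b in range(BUCKETS):
        offset = b * n               # plays the role of the per-dispatch offset
        out[offset:offset + n] = exclusive_scan(flat[offset:offset + n])
    return out


digits = [3, 0, 3, 15]
n = len(digits)
aos = [[1 if d == b else 0 for b in range(BUCKETS)] for d in digits]
soa = [1 if d == b else 0 for b in range(BUCKETS) for d in digits]
print(scan_aos(aos)[2][3], scan_soa(soa, n)[3 * n + 2])
```

The 16 running sums in `scan_aos` correspond to the extra register pressure mentioned above, while `scan_soa` keeps one running sum per scan at the cost of 16 separate scans.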

Radix tree constructor

Create a compute shader for constructing a radix tree from a sorted integer buffer according to Karras2012.
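For reference, here is a CPU-side Python sketch of the per-node construction as I understand it from Karras 2012. It assumes sorted, unique 32-bit keys (the duplicate-key tie-break is left out), and the name `build_radix_tree` and the `('leaf', i)` / `('node', i)` child encoding are assumptions for illustration. Each loop iteration is independent, which is what would map to one GPU thread per internal node.

```python
def build_radix_tree(keys):
    """Return the (left, right) children of each of the n-1 internal nodes."""
    n = len(keys)

    def delta(i, j):
        # Length of the common prefix of keys i and j; -1 when j is out of range.
        if j < 0 or j >= n:
            return -1
        return 32 - (keys[i] ^ keys[j]).bit_length()

    nodes = []
    for i in range(n - 1):
        # Direction of the range: towards the neighbour sharing the longer prefix.
        d = 1 if delta(i, i + 1) > delta(i, i - 1) else -1
        delta_min = delta(i, i - d)

        # Upper bound on the range length, then binary search for the other end.
        l_max = 2
        while delta(i, i + l_max * d) > delta_min:
            l_max *= 2
        length, t = 0, l_max // 2
        while t >= 1:
            if delta(i, i + (length + t) * d) > delta_min:
                length += t
            t //= 2
        j = i + length * d

        # Binary search for the split position inside the range [i, j].
        delta_node = delta(i, j)
        s, t = 0, length
        while True:
            t = (t + 1) // 2
            if delta(i, i + (s + t) * d) > delta_node:
                s += t
            if t <= 1:
                break
        gamma = i + s * d + min(d, 0)

        left = ('leaf', gamma) if min(i, j) == gamma else ('node', gamma)
        right = ('leaf', gamma + 1) if max(i, j) == gamma + 1 else ('node', gamma + 1)
        nodes.append((left, right))
    return nodes


tree = build_radix_tree([1, 2, 4, 5, 19, 24, 25, 30])
print(tree[0])
```

Node 0 is always the root, so a traversal starting there should visit every leaf exactly once and in key order.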

Distance field combination

It might make sense to combine a BVH with a distance field in some way to lessen backtracking. E.g. put cells all around the bounding volume (so maybe 4x4 cells for each face) containing the distance to either the closest direct child bounding volume, or maybe closest distance to actual geometry. This could avoid situations where a box is a really bad approximation of the geometry. Might also be that it's just too impractical to be useful.

Process multiple items per thread in scan

At least 4 items should be processed per thread, since that makes each read 128 bits (four 32-bit values), which is the stride recommended by NVIDIA. The exact number should be found through experimentation.

In the spirit of the NVIDIA paper on GPU scan, this adds another level to the hierarchy: intra-thread (the paper also states that CUDPP does this). This brings each thread group to 4096 items processed, meaning that 4096^2 = 16,777,216 items can be scanned using the usual 3 dispatches (scan, scan, group add). I believe that a good method for scaling to larger inputs will be to simply increase the number of items processed by each thread, rather than applying more scans. Doubling the number of items processed per thread will quadruple the number of items that can be scanned in 3 dispatches.
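The capacity arithmetic and the three dispatches can be checked with a short Python sketch. It assumes 1024 threads per group (inferred from 4096 items at 4 items per thread) and an exclusive scan; `three_dispatch_scan` and `max_items` are hypothetical names, and each "dispatch" is just a loop on the CPU.

```python
from itertools import accumulate


def exclusive_scan(values):
    return list(accumulate(values, initial=0))[:-1]


def max_items(threads_per_group, items_per_thread):
    """Largest input that the 3 dispatches (scan, scan, group add) can cover."""
    per_group = threads_per_group * items_per_thread
    return per_group * per_group


def three_dispatch_scan(data, group_size):
    groups = [data[i:i + group_size] for i in range(0, len(data), group_size)]
    if len(groups) > group_size:
        raise ValueError("group totals no longer fit in a single group")
    # Dispatch 1: scan inside each group and record each group's total.
    scanned = [exclusive_scan(g) for g in groups]
    totals = [sum(g) for g in groups]
    # Dispatch 2: scan the group totals (they fit in one group).
    group_offsets = exclusive_scan(totals)
    # Dispatch 3: add each group's offset to its scanned items.
    return [v + off for part, off in zip(scanned, group_offsets) for v in part]


print(max_items(1024, 4))   # 16777216
print(max_items(1024, 8) // max_items(1024, 4))   # 4
```

The limit comes from dispatch 2: the number of groups must itself fit in one group, so capacity grows with the square of the per-group item count.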

Collect buffers into a single buffer using "buffer views"

A buffer view should contain a buffer reference, an offset and a limit.

Using this, it should be possible to manually manage everything in 1 buffer. It might not make sense to do so, however; this needs to be looked into.
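A buffer view along these lines could look like the following Python sketch. The class and method names are hypothetical, a plain list stands in for the GPU buffer, and `limit` is assumed to mean the number of elements in the view (it could equally be an end index).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BufferView:
    buffer: list   # shared backing storage (stand-in for one large GPU buffer)
    offset: int    # index of the view's first element in the backing buffer
    limit: int     # number of elements in the view (assumed meaning)

    def _index(self, i):
        if not 0 <= i < self.limit:
            raise IndexError(f"index {i} outside view of {self.limit} elements")
        return self.offset + i

    def read(self, i):
        return self.buffer[self._index(i)]

    def write(self, i, value):
        self.buffer[self._index(i)] = value


# Two logical buffers carved out of one allocation.
backing = [0] * 8
keys = BufferView(backing, offset=0, limit=5)
counts = BufferView(backing, offset=5, limit=3)
counts.write(0, 7)
print(backing)
```

In a shader the same idea would presumably reduce to passing the offset and limit as constants alongside the single bound buffer.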

Space optimization for histogram

Currently all histograms are generated at once. If n elements are processed, the histogram has a size of n*16, which is wasteful. Having every thread write into 16 separate parts of an array is not great either.

However, the 16 histograms can be stored in the space of 1. The only parts of a scanned histogram that are read are those where the original histogram had a 1. Since no two histograms can have a 1 in the same place, only n elements are needed to store the final scanned histogram.
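The claim can be checked with a small Python sketch that builds both the full n*16 scanned histograms and a packed version of size n. It assumes an exclusive scan and one 4-bit digit per element; the function names are hypothetical.

```python
from itertools import accumulate

BUCKETS = 16


def exclusive_scan(values):
    return list(accumulate(values, initial=0))[:-1]


def full_scanned_histograms(digits):
    """n * 16 storage: one scanned 0/1 histogram per bucket."""
    return [exclusive_scan([1 if d == b else 0 for d in digits])
            for b in range(BUCKETS)]


def packed_scanned_histogram(digits):
    """n storage: slot i keeps only the value from the histogram of digits[i]."""
    running = [0] * BUCKETS
    packed = []
    for d in digits:
        packed.append(running[d])
        running[d] += 1
    return packed


digits = [3, 0, 3, 15, 0, 3]
full = full_scanned_histograms(digits)
packed = packed_scanned_histogram(digits)
# Only full[digits[i]][i] is ever read, and that is exactly packed[i].
print(packed == [full[d][i] for i, d in enumerate(digits)])
```

The sequential loop here only demonstrates the storage argument; how to produce the packed values in parallel on the GPU is a separate question.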

Turn ray tracer into image effect

The ray tracer is currently implemented as an editor window. To profile more precisely, it needs to be exported as a game. This will also make it easier to support looking around. I imagine there being a hotkey for turning off rasterization and rendering using ray tracing only. It should also show fairly large performance increases for the BVH, which would help show that it is not terrible.
