ConvectionKernels

These are the stand-alone texture compression kernels for Convection Texture Tools (CVTT), which you can embed in other applications. https://github.com/elasota/cvtt

The CVTT codecs are designed to deliver very high quality at good speed by combining effective heuristics with an SPMD-style design that makes heavy use of SIMD ops and 16-bit math.

Compressed texture format support:

  • BC1 (DXT1): Complete
  • BC2 (DXT3): Complete
  • BC3 (DXT5): Complete
  • BC4: Complete
  • BC5: Complete
  • BC6H: Experimental
  • BC7: Complete
  • ETC1: Complete
  • ETC2 RGB: Complete
  • ETC2 RGBA: Complete
  • ETC2 with punchthrough alpha: Complete
  • 11-bit EAC: Experimental
  • PVRTC: Not supported
  • ASTC: Not supported

Basic usage

Include "ConvectionKernels.h"

Depending on the input format, blocks should be pre-packed into one of the PixelBlock structures: PixelBlockU8 for unsigned LDR formats (BC1, BC2, BC3, BC7, BC4U, BC5U), PixelBlockS8 for signed LDR formats (BC4S, BC5S), and PixelBlockF16 for HDR formats (BC6H). The block pixel order is left-to-right, top-to-bottom, and the channel order is red, green, blue, alpha.
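As a rough illustration of the packing step, the sketch below copies one 4x4 block out of a tightly packed RGBA8 image in left-to-right, top-to-bottom order. `PixelBlockU8Sketch` is a stand-in that assumes a 16-pixel by 4-channel layout; in real code you would use the PixelBlockU8 structure from ConvectionKernels.h. The `PackBlockRGBA8` helper and its edge-clamping behavior for partial blocks are hypothetical choices for this example, not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for illustration: 16 pixels, 4 channels (R, G, B, A) each.
// Assumed layout; use the real PixelBlockU8 from ConvectionKernels.h.
struct PixelBlockU8Sketch {
    uint8_t m_pixels[16][4];
};

// Copies the 4x4 block whose top-left corner is (blockX, blockY) from a
// tightly packed RGBA8 image. Coordinates past the right or bottom edge are
// clamped to the last column or row, so partial blocks get valid data.
inline void PackBlockRGBA8(const uint8_t* image, int width, int height,
                           int blockX, int blockY, PixelBlockU8Sketch& out)
{
    assert(image != nullptr && width > 0 && height > 0);
    for (int py = 0; py < 4; ++py) {
        for (int px = 0; px < 4; ++px) {
            const int x = (blockX + px < width) ? blockX + px : width - 1;
            const int y = (blockY + py < height) ? blockY + py : height - 1;
            const uint8_t* src =
                image + (static_cast<size_t>(y) * width + x) * 4;
            // Pixel index runs left-to-right, then top-to-bottom.
            for (int ch = 0; ch < 4; ++ch)
                out.m_pixels[py * 4 + px][ch] = src[ch];
        }
    }
}
```

Clamping is only one way to fill out partial blocks at image edges; any padding that keeps the unused pixels plausible should work.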

BC6H floats are stored as int16_t in the pixel block structure, which should be bit-cast from the 16-bit float input. Converting other float precisions to 16-bit is outside of the scope of the kernels.
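A minimal sketch of the bit-cast, assuming the half-float input is available as raw `uint16_t` bit patterns. The helper names `HalfBitsToInt16` and `StoreHalfPixel` are hypothetical; the point is only that the 16 bits are copied unchanged rather than converted numerically.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterprets the raw bits of a 16-bit float as int16_t without altering
// them. memcpy is used so the reinterpretation is well-defined.
inline int16_t HalfBitsToInt16(uint16_t halfBits)
{
    int16_t result;
    std::memcpy(&result, &halfBits, sizeof(result));
    return result;
}

// Fills one 4-channel pixel entry (for example, one row of a PixelBlockF16
// pixel array) from four raw half-float bit patterns.
inline void StoreHalfPixel(const uint16_t* halfRGBA, int16_t* outPixel)
{
    assert(halfRGBA != nullptr && outPixel != nullptr);
    for (int ch = 0; ch < 4; ++ch)
        outPixel[ch] = HalfBitsToInt16(halfRGBA[ch]);
}
```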

Create an Options structure and fill it out:

  • flags: A bitwise OR of cvtt::Flags values, which enable or disable various features.
  • threshold: The alpha threshold for encoding BC1 with alpha test. Any alpha value lower than the threshold will use transparent alpha.
  • redWeight: Red channel relative importance
  • greenWeight: Green channel relative importance
  • blueWeight: Blue channel relative importance
  • alphaWeight: Alpha channel relative importance

For some modes, you must pass an encoding plan, which controls how the encoder will behave. You should NOT attempt to initialize the encoding plan yourself. Either use a default-initialized encoding plan (which will run at maximum quality), or use ConfigureBC7EncodingPlanFromQuality or ConfigureBC7EncodingPlanFromFineTuningParams to configure a lower-quality encoding plan. Configuring an encoding plan is somewhat slow, so you should only do it once per encode job.

Once you've done both of those things, call the corresponding encode function to digest the input blocks and emit output blocks.

VERY IMPORTANT: The encode functions must be given a list of cvtt::NumParallelBlocks blocks, and will emit cvtt::NumParallelBlocks output blocks. If you want to encode fewer blocks, then you must pad the input structure with unused block data, and the output buffer must still contain enough space.
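The padding requirement can be handled with a small batching loop. The sketch below is an illustration under stated assumptions: `kNumParallelBlocks` stands in for cvtt::NumParallelBlocks (the value 8 is assumed here), `PixelBlockU8Sketch` stands in for PixelBlockU8, and `EncodeBatchFn` is a hypothetical wrapper type around whichever encode function you call. The final partial batch is padded by repeating the last valid block, and only the valid outputs are copied out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Stand-ins for illustration only; use the real library definitions.
constexpr size_t kNumParallelBlocks = 8;  // assumed; use cvtt::NumParallelBlocks
constexpr size_t kOutputBlockBytes = 16;  // BC7-sized block (BC1 would be 8)

struct PixelBlockU8Sketch { uint8_t m_pixels[16][4]; };

// Hypothetical wrapper: reads kNumParallelBlocks input blocks and writes
// kNumParallelBlocks * kOutputBlockBytes bytes of output.
using EncodeBatchFn = void (*)(uint8_t* out, const PixelBlockU8Sketch* in);

inline std::vector<uint8_t> EncodeAllBlocks(
    const std::vector<PixelBlockU8Sketch>& blocks, EncodeBatchFn encodeBatch)
{
    assert(encodeBatch != nullptr);
    std::vector<uint8_t> output(blocks.size() * kOutputBlockBytes);
    PixelBlockU8Sketch batchIn[kNumParallelBlocks];
    uint8_t batchOut[kNumParallelBlocks * kOutputBlockBytes];

    for (size_t first = 0; first < blocks.size(); first += kNumParallelBlocks) {
        const size_t valid = std::min(kNumParallelBlocks, blocks.size() - first);
        // Fill the batch; slots past the valid count repeat the last block.
        for (size_t i = 0; i < kNumParallelBlocks; ++i)
            batchIn[i] = blocks[first + std::min(i, valid - 1)];

        // The scratch output always has room for a full batch.
        encodeBatch(batchOut, batchIn);

        // Keep only the outputs that correspond to real input blocks.
        std::memcpy(&output[first * kOutputBlockBytes], batchOut,
                    valid * kOutputBlockBytes);
    }
    return output;
}
```

Encoding into a full-size scratch buffer and copying the valid part avoids writing past the end of the destination on the last batch.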

ETC compression

The ETC encoders require significantly more temporary data storage than the other encoders, so the storage must be allocated before using the encoders.

To allocate the temporary data:

  • Create an allocation function compatible with cvtt::Kernels::allocFunc_t, which accepts a context pointer and a byte size and returns a buffer of at least that size. The returned buffer must be suitably aligned for SIMD usage (i.e. 16-byte alignment on Intel).
  • Call AllocETC1Data or AllocETC2Data, passing the allocation function and a context pointer. The context pointer will be passed back to the allocation function.

To release the temporary data:

  • Create a free function compatible with cvtt::Kernels::freeFunc_t, which accepts a context pointer, a pointer to the buffer returned by the allocation function, and the original size.
  • Call ReleaseETC1Data or ReleaseETC2Data, passing the compression data structure returned by the corresponding Alloc function and the free function.
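The allocation and free callbacks described above might look like the following sketch. It assumes the callback shapes are (context, size) for allocation and (context, pointer, size) for release, as the text describes; check ConvectionKernels.h for the exact typedefs. `AllocStats`, `AlignedAlloc`, and `AlignedFree` are hypothetical names, and the over-allocate-and-align approach is just one portable way to get 16-byte alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Illustrative context object: counts live allocations to help spot leaks.
struct AllocStats { int liveAllocations = 0; };

constexpr size_t kAlignment = 16;  // SIMD alignment assumed for this sketch

// Allocation callback: over-allocates with malloc, rounds up to a 16-byte
// boundary, and stores the raw malloc pointer just before the aligned block.
inline void* AlignedAlloc(void* context, size_t size)
{
    assert(context != nullptr);
    void* raw = std::malloc(size + kAlignment + sizeof(void*));
    if (raw == nullptr)
        return nullptr;
    const uintptr_t start = reinterpret_cast<uintptr_t>(raw) + sizeof(void*);
    const uintptr_t aligned =
        (start + kAlignment - 1) & ~static_cast<uintptr_t>(kAlignment - 1);
    reinterpret_cast<void**>(aligned)[-1] = raw;
    static_cast<AllocStats*>(context)->liveAllocations++;
    return reinterpret_cast<void*>(aligned);
}

// Free callback: recovers the raw pointer stored by AlignedAlloc. The size
// argument is part of the callback shape but is not needed here.
inline void AlignedFree(void* context, void* ptr, size_t /*size*/)
{
    assert(context != nullptr);
    if (ptr == nullptr)
        return;
    std::free(static_cast<void**>(ptr)[-1]);
    static_cast<AllocStats*>(context)->liveAllocations--;
}
```

With the real library, callbacks like these would be handed to the Alloc and Release functions together with the context pointer (here, a pointer to an `AllocStats`).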

Once allocated, the compression data can be reused over multiple calls to the encode functions, and depending on architecture, can usually be used by a different thread than the one that allocated it, as long as multiple encode functions are not using it at once.

Contributors

daft-freak, elasota

convectionkernels's Issues

Use include <cmath>

thirdparty/cvtt/ConvectionKernels_ETC.cpp:3134:55: error: no member named 'sqrt' in namespace 'std'
const float lengthRatio = static_cast<float>(std::sqrt(ca0LengthSq / ca1UNLengthSq));
~~~~~^

Please use #include <cmath>.

BC6H decoder seems to produce broken results

I'm trying to use cvtt::Internal::BC6HComputer::UnpackOne directly, to decode a BC6H (unsigned) image block by block. With various other decompression libraries (DirectXTex and others) everything seems to work, but cvtt seems to produce "something entirely else". Happens with all BC6H files I tried so far, on both Mac (clang 13) & Windows (vs2022).

Attached are the source file (IndoorHDRI001.dds), the expected decoding result (IndoorHDRI001-dxtex.hdr, produced with DirectXTex in this case), and what I get with cvtt (IndoorHDRI001-convection.hdr). GitHub does not allow attaching dds/hdr files directly, so they are all in a zip file.

IndoorHDRI001.zip

My usage is basically like this:

cvtt::PixelBlockF16 rgba;
for (int i = 0; i < height; i += 4)
{
    for (int j = 0; j < width; j += 4)
    {
        cvtt::Internal::BC6HComputer::UnpackOne(rgba, src, format == DDSFormat::BC6HS);
        src += 16;
        for (int c = 0; c < 16; ++c) {
            float* dst_f = (float*)(dst + (i*width+j)*12 + width*(c/4)*12 + (c&3)*12);
            dst_f[0] = half_to_float_fast5(rgba.m_pixels[c][0]);
            dst_f[1] = half_to_float_fast5(rgba.m_pixels[c][1]);
            dst_f[2] = half_to_float_fast5(rgba.m_pixels[c][2]);
        }
    }
}
