Giter Club home page Giter Club logo

Comments (28)

Ono-Sendai avatar Ono-Sendai commented on September 25, 2024 1

Again, what does wuffs_base__cpu_arch__have_x86_sse42 return, for both alpha.3 and alpha.4?

I think I have done enough remote debugging for now sorry. I think you need a windows build machine :)

from wuffs.

nigeltao avatar nigeltao commented on September 25, 2024

Thanks for the bug report. I don't run Windows or VS myself. Are you able to bisect a few Wuffs versions to see which commit caused the slow down?

If you're using wuffs-v0.4.c then there's only a handful of relevant commits (and f1698226 is the most recent):

f1698226 (tag: v0.4.0-alpha.4) release: wuffs gen -version=0.4.0-alpha.4
5c693c9d (tag: v0.4.0-alpha.3) release: wuffs gen -version=0.4.0-alpha.3
3eb4e3a6 (tag: v0.4.0-alpha.2) release: wuffs gen -version=0.4.0-alpha.2
6381a325 (tag: v0.4.0-alpha.1) release: wuffs gen -version=0.4.0-alpha.1

That's pretty coarse-grained though. If you have the time, I'd find it more helpful if you can bisect using wuffs-unsupported-snapshot.c instead. The two endpoints (below) are about 343 commits apart, so a bisect should take 9 cuts.

f1698226 (tag: v0.4.0-alpha.4) release: wuffs gen -version=0.4.0-alpha.4
a27a7598 base: disable SIMD on 32-bit (not 64-bit) x86
...
9c03b8c7 wuffsfmt: double-indents hanging lines
d910658b (tag: v0.3.3) wuffs gen -version=0.3.3

Also, what compiler flags are you using?

from wuffs.

nigeltao avatar nigeltao commented on September 25, 2024

Tangentially, if this is your code:

https://github.com/glaretechnologies/glare-core/blob/2c7174ca95c68bfc4cea982ec12ac60db0bfd61d/graphics/PNGDecoder.cpp#L276

https://github.com/glaretechnologies/glare-core/blob/2c7174ca95c68bfc4cea982ec12ac60db0bfd61d/graphics/PNGDecoder.cpp#L322

Then an explicit "Convert from BGR to RGB" shouldn't be necessary, Instead, I think that you can change line 276 from this:

use_pixelformat = WUFFS_BASE__PIXEL_FORMAT__BGR

to this:

use_pixelformat = WUFFS_BASE__PIXEL_FORMAT__RGB

from wuffs.

nigeltao avatar nigeltao commented on September 25, 2024

I'd also like to know whether any or these #ifdefs trigger for your VS compiler flags:

#if defined(_M_IX86)
#if defined(_M_X64)
#if defined(__AVX__)

from wuffs.

Ono-Sendai avatar Ono-Sendai commented on September 25, 2024

Tangentially, if this is your code:

https://github.com/glaretechnologies/glare-core/blob/2c7174ca95c68bfc4cea982ec12ac60db0bfd61d/graphics/PNGDecoder.cpp#L276

https://github.com/glaretechnologies/glare-core/blob/2c7174ca95c68bfc4cea982ec12ac60db0bfd61d/graphics/PNGDecoder.cpp#L322

Then an explicit "Convert from BGR to RGB" shouldn't be necessary, Instead, I think that you can change line 276 from this:

use_pixelformat = WUFFS_BASE__PIXEL_FORMAT__BGR

to this:

use_pixelformat = WUFFS_BASE__PIXEL_FORMAT__RGB

Yeah that's my code.
If I use WUFFS_BASE__PIXEL_FORMAT__RGB on line 276 then I get
"Error: base: unsupported pixel swizzler option" on PngSuite-2013jan13/tbbn3p08.png

In addition, despite being relatively unoptimised code, my BGRA->RGBA swizzling seems faster than Wuffs:

wuffs 0.3, WUFFS_BASE__PIXEL_FORMAT__RGB:
-------------------------------------
doDecodeFromBufferWithWuffs took 9.1046 ms
doDecodeFromBufferWithWuffs took 8.9896 ms
doDecodeFromBufferWithWuffs took 9.0422 ms


wuffs 0.3, WUFFS_BASE__PIXEL_FORMAT__BGR and BGRA->RGBA swizzling myself:
-------------------------------------
doDecodeFromBufferWithWuffs took 8.9411 ms
doDecodeFromBufferWithWuffs took 8.8994 ms
doDecodeFromBufferWithWuffs took 8.9056 ms
doDecodeFromBufferWithWuffs took 8.9171 ms

from wuffs.

Ono-Sendai avatar Ono-Sendai commented on September 25, 2024

I'd also like to know whether any or these #ifdefs trigger for your VS compiler flags:

#if defined(_M_IX86)
#if defined(_M_X64)
#if defined(__AVX__)

Just _M_X64. I'm not building with AVX.

from wuffs.

Ono-Sendai avatar Ono-Sendai commented on September 25, 2024
wuffs 0.4, alpha 1
------------------
doDecodeFromBufferWithWuffs took 8.8656 ms
doDecodeFromBufferWithWuffs took 8.7829 ms
doDecodeFromBufferWithWuffs took 8.8434 ms

wuffs 0.4, alpha 2
------------------
doDecodeFromBufferWithWuffs took 8.8657 ms
doDecodeFromBufferWithWuffs took 8.8763 ms
doDecodeFromBufferWithWuffs took 8.7356 ms

wuffs 0.4, alpha 3
------------------
doDecodeFromBufferWithWuffs took 8.6924 ms
doDecodeFromBufferWithWuffs took 8.6936 ms
doDecodeFromBufferWithWuffs took 8.7011 ms

wuffs 0.4, alpha 4
------------------
doDecodeFromBufferWithWuffs took 12.211 ms
doDecodeFromBufferWithWuffs took 12.057 ms
doDecodeFromBufferWithWuffs took 12.018 ms

from wuffs.

nigeltao avatar nigeltao commented on September 25, 2024

I suspect that the slow-down is due to SIMD code no longer being used.

Comparing v0.4.0-alpha.3 and v0.4.0-alpha.4, do any of these #ifdefs trigger?

#if defined (WUFFS_BASE__CPU_ARCH__X86_FAMILY)
#if defined (WUFFS_BASE__CPU_ARCH__X86_64)

Also, there's a wuffs_base__cpu_arch__have_x86_sse42 function. What does it return for you (again, comparing alpha.3 and alpha.4):

static inline bool  //
wuffs_base__cpu_arch__have_x86_sse42(void) {
  etc;
}

I'm not building with AVX.

Out of curiosity (I'm not familiar with MSVC / Visual Studio), both wuffs-v0.3.c and wuffs-v0.4.c have this line:

#pragma message("Wuffs with MSVC+IX86/X64 needs /arch:AVX for best performance")

Did you notice that at all, when building Wuffs?

from wuffs.

nigeltao avatar nigeltao commented on September 25, 2024

I suspect that the slow-down is due to SIMD code no longer being used.

That might not be true, though. Fortunately, there's not that many commits between v0.4.0-alpha.3 and v0.4.0-alpha.4. Are you able to switch over from wuffs-v0.4.c to wuffs-unsupported-snapshot.c and bisect these commits:

f1698226 (tag: v0.4.0-alpha.4) release: wuffs gen -version=0.4.0-alpha.4
a27a7598 base: disable SIMD on 32-bit (not 64-bit) x86
4209264a cgen: disable SIMD on 32-bit (not 64-bit) x86
fdd08200 test/3pdata: add note re XZ backdoor test files
d8c3deb0 aux: remove DynIOBuffer::allocated
00a01abc base: avoid some (ptr + len) when ptr is NULL
d7f6b81d base: have empty_foo() functions use NULL ptr
2142aa65 cgen: fix iterate's "ptr + len" when ptr is NULL
8ed5d132 example/gifplayer: tweak try_allocate workbuf len
a9a6d8ac base: make malloc_slice_u8(etc, 0) return empty
f32bfe95 std/crc64: optimize for x86+SSE4.2
5c693c9d (tag: v0.4.0-alpha.3) release: wuffs gen -version=0.4.0-alpha.3

from wuffs.

Ono-Sendai avatar Ono-Sendai commented on September 25, 2024

I suspect that the slow-down is due to SIMD code no longer being used.

Comparing v0.4.0-alpha.3 and v0.4.0-alpha.4, do any of these #ifdefs trigger?

#if defined (WUFFS_BASE__CPU_ARCH__X86_FAMILY)
#if defined (WUFFS_BASE__CPU_ARCH__X86_64)

WUFFS_BASE__CPU_ARCH__X86_64 is not defined for me.


Also, there's a `wuffs_base__cpu_arch__have_x86_sse42` function. What does it return for you (again, comparing alpha.3 and alpha.4):

static inline bool //
wuffs_base__cpu_arch__have_x86_sse42(void) {
etc;
}


> I'm not building with AVX.

Out of curiosity (I'm not familiar with MSVC / Visual Studio), both `wuffs-v0.3.c` and `wuffs-v0.4.c` have this line:

#pragma message("Wuffs with MSVC+IX86/X64 needs /arch:AVX for best performance")


Did you notice that at all, when building Wuffs?

Yes, and it's kind of annoying :)

Looking at the code, the issue seems to be

#if defined(__AVX__) || defined(__clang__)

preventing SSE from being used.

As mentioned before I'm not building with AVX.

from wuffs.

nigeltao avatar nigeltao commented on September 25, 2024

WUFFS_BASE__CPU_ARCH__X86_64 is not defined for me.

To be clear, are you saying "WUFFS_BASE__CPU_ARCH__X86_64 is not defined for me" for just v0.4.0-alpha.4 or for both v0.4.0-alpha.3 and v0.4.0-alpha.4?

Again, what does wuffs_base__cpu_arch__have_x86_sse42 return, for both alpha.3 and alpha.4?


#pragma message("Wuffs with MSVC+IX86/X64 needs /arch:AVX for best performance")

Did you notice that at all, when building Wuffs?

Yes, and it's kind of annoying :)

Yeah, it's not ideal, but I don't know how to make it better.

Wuffs ships as a "single file C library". And in gcc or clang, code can opt-in to "compile me with SIMD enabled" via an __attribute__ on a function by function basis, by the library writer (me). But if I understand correctly, for Visual Studio, enabling SIMD is on a file by file basis (e.g. the /arch:AVX compile-time switch), by the library user (you). (Yeah, technically, by "file" I mean "translation unit".)

So, for VS, I'd like the single file C library to work out of the box (even if, by default, it's leaving significant performance on the table), and it does, but the wuffs-v0.4.c file itself can't self-configure VS to SIMD-enable the SIMD-using functions, and the #pragma message was the least worst mechanism I could think of to try to raise awareness for VS-using programmers.


Sort of tangential to the original post, but as you're concerned about PNG decode performance: if your CPUs are less than 10 years old, then I'm curious how that "8.4 milliseconds" number changes if you do pass /arch:AVX, on either Wuffs alpha.3 and alpha.4.

from wuffs.

Ono-Sendai avatar Ono-Sendai commented on September 25, 2024

So, for VS, I'd like the single file C library to work out of the box (even if, by default, it's leaving significant performance on the table), and it does, but the wuffs-v0.4.c file itself can't self-configure VS to SIMD-enable the SIMD-using functions, and the #pragma message was the least worst mechanism I could think of to try to raise awareness for VS-using programmers.

Maybe you could add a preprocessor option to suppress the warning? something like
WUFFS_NO_MISSING_AVX_WARNING

from wuffs.

Ono-Sendai avatar Ono-Sendai commented on September 25, 2024

Sort of tangential to the original post, but as you're concerned about PNG decode performance: if your CPUs are less than 10 years old, then I'm curious how that "8.4 milliseconds" number changes if you do pass /arch:AVX, on either Wuffs alpha.3 and alpha.4.

AVX not set
---------------------------------------------
doDecodeFromBufferWithWuffs took 8.8381 ms
doDecodeFromBufferWithWuffs took 8.7475 ms
doDecodeFromBufferWithWuffs took 8.6410 ms
doDecodeFromBufferWithWuffs took 8.6766 ms


With Advanced Vector Extensions 2 (X86/X64) (/arch:AVX2)
--------------------------------------------------------
doDecodeFromBufferWithWuffs took 8.6368 ms
doDecodeFromBufferWithWuffs took 8.6676 ms
doDecodeFromBufferWithWuffs took 8.5720 ms
doDecodeFromBufferWithWuffs took 8.6433 ms

from wuffs.

Ono-Sendai avatar Ono-Sendai commented on September 25, 2024

WUFFS_BASE__CPU_ARCH__X86_64 is not defined for me.

To be clear, are you saying "WUFFS_BASE__CPU_ARCH__X86_64 is not defined for me" for just v0.4.0-alpha.4 or for both v0.4.0-alpha.3 and v0.4.0-alpha.4?

It's not defined for me using v0.4.0-alpha.4. Not sure about alpha 3.

from wuffs.

nigeltao avatar nigeltao commented on September 25, 2024

I think I have done enough remote debugging for now sorry. I think you need a windows build machine :)

OK, but in that case, I don't expect this bug to be fixed any time soon. Sorry.

from wuffs.

Ono-Sendai avatar Ono-Sendai commented on September 25, 2024

Isn't it clear what the issue is? The SSE code is only being used when AVX is defined.

from wuffs.

nigeltao avatar nigeltao commented on September 25, 2024

For Wuffs + Visual Studio, "SSE code is only being used when AVX is defined" was true for Wuffs v0.3, v0.4.0-alpha.3 and v0.4.0-alpha.4. All three versions have the same #if defined(__AVX__) || defined(__clang__) guard.

At least, I think it's the same across all three versions, and that's why I asked you previously if you could confirm that (for older versions not just v0.4.0-alpha.4).

If so, "SSE only when AVX defined" doesn't explain why performance regressed between v0.3 and v0.4, or between v0.4.0-alpha.3 and v0.4.0-alpha.4.

I don't think it's clear yet what the issue is.

from wuffs.

nigeltao avatar nigeltao commented on September 25, 2024

But also, I don't think "SSE code is only being used when AVX is defined" has an obvious fix. SSE isn't a single thing, it's at least six different things: SSE, SSE2, SSE3, SSSE3 and SSE4.1 and SSE4.2. If all you have is _M_X64 then, IIUC (if I understand correctly), you automatically have the first two extensions enabled, SSE and SSE2, but the later four extensions are not mandatory.

On the other hand, Wuffs "SSE code" (in both version 0.3 and 0.4) uses intrinsics like _mm_shuffle_epi8 that are part of SSSE3 (and therefore IIUC not automatically enabled by _M_X64 alone). If you're on Visual Studio and you want Wuffs' SIMD code enabled then you (the library user, not me the library writer) are going to have to configure or define something CPU-arch specific, along the lines of /arch:AVX.

from wuffs.

Ono-Sendai avatar Ono-Sendai commented on September 25, 2024

There's a few things going on here I think.
The first is an issue with wuffs (all versions):

WUFFS_BASE__CPU_ARCH__X86_FAMILY and are only defined when __AVX__ or __clang__ is defined (under MSVC).
This is incorrect.

The second thing that is going on is that I was working around this (I guess you could call it) by defining WUFFS_BASE__CPU_ARCH__X86_FAMILY myself before including the wuffs c file. This workaround stopped working well in wuffs 0. 4 alpha 4 when WUFFS_BASE__CPU_ARCH__X86_FAMILY became used less. (in fact not used at all)

from wuffs.

Ono-Sendai avatar Ono-Sendai commented on September 25, 2024

But also, I don't think "SSE code is only being used when AVX is defined" has an obvious fix. SSE isn't a single thing, it's at least six different things: SSE, SSE2, SSE3, SSSE3 and SSE4.1 and SSE4.2. If all you have is _M_X64 then, IIUC (if I understand correctly), you automatically have the first two extensions enabled, SSE and SSE2, but the later four extensions are not mandatory.

On the other hand, Wuffs "SSE code" (in both version 0.3 and 0.4) uses intrinsics like _mm_shuffle_epi8 that are part of SSSE3 (and therefore IIUC not automatically enabled by _M_X64 alone). If you're on Visual Studio and you want Wuffs' SIMD code enabled then you (the library user, not me the library writer) are going to have to configure or define something CPU-arch specific, along the lines of /arch:AVX.

Well currently just including wuffs by default on windows x64 wouldn't even use SSE2.

Solution I think is to detect building on x64, then allow SSE1 and SSE2.
Then there's the question of supporting SSE3, 4.1, 4.2 etc. If wuffs can choose code at runtime then you can use CPUID based switching. Otherwise you could detect a preprocessor definition like __SSE4_1__.
MSVC doesn't have options for targetting SSE4.1, SSE4.2 in particular, so these are borrowed from the GCC definitions.

Or you could call it something like
WUFFS_USE_SSE_4_1

from wuffs.

nigeltao avatar nigeltao commented on September 25, 2024

WUFFS_BASE__CPU_ARCH__X86_FAMILY and are only defined when __AVX__ or __clang__ is defined (under MSVC). This is incorrect.

I'd say more "poorly named" than "incorrect". WUFFS_BASE__APPLY_X86_SIMD_OPTIMIZATIONS_AND_YOU_CAN_ASSUME_ALL_OF_SSE_NOT_JUST_SSE1_AND_SSE2 is more accurate, but maybe not a better name.

In terms of "SIMD capability granularities" and not "which WUFFS macros trigger which SIMD code paths", I'm only looking at MSVC documentation (I'm not running MSVC myself) but for e.g. SSE4.1 code, not SSE1 or SSE2, https://learn.microsoft.com/en-us/cpp/build/reference/arch-x86?view=msvc-170 says you're going to need /arch:AVX. There is no /arch:SSE41 option.

from wuffs.

nigeltao avatar nigeltao commented on September 25, 2024

I was working around this (I guess you could call it) by defining WUFFS_BASE__CPU_ARCH__X86_FAMILY myself before including the wuffs c file.

Ah, "defining it yourself" is a crucial bit of information that wasn't obvious from the earlier conversation. Having a second look at your graphics/PNGDecoder.cpp, I see where you're doing that.

That macro is a private implementation detail and it's not designed for library users to configure. Wuffs' documentation could certainly be better, but only the macros starting with WUFFS_CONFIG__ are the user-configuration knobs.

If you need an immediate workaround for v0.4.0-alpha.4 then add #define WUFFS_BASE__CPU_ARCH__X86_64 right next to where your graphics/PNGDecoder.cpp file says #define WUFFS_BASE__CPU_ARCH__X86_FAMILY. However, in an upcoming v0.4.0-alpha.5 the macros will probably be renamed to something like WUFFS_PRIVATE_IMPL__CPU_ARCH__BLAH_BLAH_BLAH to emphasize that you (the library user) are not supposed to #define it yourself.

I'll think about whether to introduce a WUFFS_CONFIG__MSVC__ASSUME_MINIMUM_CPU_ARCH__X86_64_V2 or similarly-named knob.

I wouldn't have done so in the past as I didn't know how MSVC would behave when you try to compile SSE4.2 intrinsics without also passing the /arch:AVX option. But from this bug report, it sounds like you're been running with that just fine.

from wuffs.

nigeltao avatar nigeltao commented on September 25, 2024

If you're curious about why Wuffs' "SSE-family enabled" behavior changed, from depending on WUFFS_BASE__CPU_ARCH__X86_FAMILY to WUFFS_BASE__CPU_ARCH__X86_64, this fixed issue #145. Intel's SIMD documentation said that the _mm_cvtsi64_si128 intrinsic function was part of SSE2. And, as far as I could tell, part of SSE2 unconditionally. The gcc and clang compiler implementations would accept that when targeting 64-bit x86 but had compile errors when targeting 32-bit x86, even though 32-bit x86 can support both SSE2 and a uint64_t type.

Rather than dealing with any similar-to-#145 issues in the future, it seemed simpler for Wuffs to enable its SIMD code (both SSE-family and AVX) only on 64-bit x86, not both 32-bit and 64-bit x86. That involved the WUFFS_BASE__CPU_ARCH__X86_FAMILY macro but, as I said, that macro is supposed to be a private implementation detail.

from wuffs.

nigeltao avatar nigeltao commented on September 25, 2024

Solution I think is to detect building on x64, then allow SSE1 and SSE2.

That won't help Wuffs' PNG performance. Here's a code snippet from wuffs_png__decoder__filter_4_distance_4_x86_sse42:

while (v_curr.ptr < i_end1_curr) {
  v_b128 = _mm_cvtsi32_si128((int32_t)(wuffs_base__peek_u32le__no_bounds_check(v_prev.ptr)));
  v_b128 = _mm_unpacklo_epi8(v_b128, v_z128);
  v_pa128 = _mm_sub_epi16(v_b128, v_c128);
  v_pb128 = _mm_sub_epi16(v_a128, v_c128);
  v_pc128 = _mm_add_epi16(v_pa128, v_pb128);
  v_pa128 = _mm_abs_epi16(v_pa128);
  v_pb128 = _mm_abs_epi16(v_pb128);
  v_pc128 = _mm_abs_epi16(v_pc128);
  v_smallest128 = _mm_min_epi16(v_pc128, _mm_min_epi16(v_pb128, v_pa128));
  v_p128 = _mm_blendv_epi8(_mm_blendv_epi8(v_c128, v_b128, _mm_cmpeq_epi16(v_smallest128, v_pb128)), v_a128, _mm_cmpeq_epi16(v_smallest128, v_pa128));
  v_x128 = _mm_cvtsi32_si128((int32_t)(wuffs_base__peek_u32le__no_bounds_check(v_curr.ptr)));
  v_x128 = _mm_unpacklo_epi8(v_x128, v_z128);
  v_x128 = _mm_add_epi8(v_x128, v_p128);
  v_a128 = v_x128;
  v_c128 = v_b128;
  v_x128 = _mm_packus_epi16(v_x128, v_x128);
  wuffs_base__poke_u32le__no_bounds_check(v_curr.ptr, ((uint32_t)(_mm_cvtsi128_si32(v_x128))));
  v_curr.ptr += 4;
  v_prev.ptr += 4;
}

That all works fine if you assume SSE4.2 (and earlier SSEs). But what if you only have SSE1 and SSE2? It turns out that e.g. the _mm_blendv_epi8 function in the middle of all that needs SSE4.1 so we can't use wuffs_png__decoder__filter_4_distance_4_x86_sse42 as is, if you only have SSE1 and SSE2.

Then there's the question of supporting SSE3, 4.1, 4.2 etc. If wuffs can choose code at runtime then you can use CPUID based switching.

Wuffs does choose code at run time. Furthermore, Wuffs' compiler (it takes in *.wuffs code and generates a C file) ensures at compile time that you can't call SIMD intrinsics (at run time) unless you've previously confirmed (at run time, via CPU ID) that your CPU is capable.

Currently, Wuffs' "can I call this __m128i-using SIMD intrinsic function" compile-time check is simple. Either all or none of the SSE extensions are allowed. Changing that from "all or none" to distinguishing a third "SSE1 and SSE2 only: _mm_unpacklo_epi8 (and hundreds of others) are allowed but _mm_blendv_epi8 (and hundreds of others) are not" is technically possible but more complicated. It has an opportunity cost and I'm not sure if it's even going to have much benefit since, as above, things like wuffs_png__decoder__filter_4_distance_4_x86_sse42 won't run as is in a "SSE1 and SSE2 only" mode and, on the other hand, the fraction of CPUs these days that are SSE2 but not SSE4.2 capable is surely very small.

from wuffs.

Ono-Sendai avatar Ono-Sendai commented on September 25, 2024

Yeah, I'm not interested in just targeting SSE2. Targeting SSE 4.2 is much more reasonable these days.

from wuffs.

nigeltao avatar nigeltao commented on September 25, 2024

I have cut a new wuffs-v0.4.c. If you upgrade glare-core/graphics from wuffs-v0.3.c to the latest wuffs-v0.4.c then:

  • defining WUFFS_BASE__CPU_ARCH__X86_FAMILY no longer does anything, but also
  • SSE4.2 should now be enabled by default, on MSVC.
  • You can also #define WUFFS_CONFIG__I_KNOW_THAT_WUFFS_MSVC_PERFORMS_BEST_WITH_ARCH_AVX2 if you really don't like the "Wuffs with MSVC+IX86/X64 needs /arch:AVX for best performance" message.

If you try it, please let me know if you hit any problems (or performance regressions from v0.3). I don't have a Windows or MSVC install myself.

from wuffs.

nigeltao avatar nigeltao commented on September 25, 2024

SSE4.2 should now be enabled by default, on MSVC.

This is possibly causing "Internal compiler error" problems with MSVC: see issue #151. Hmm...

from wuffs.

nigeltao avatar nigeltao commented on September 25, 2024

I've just rolled Wuffs v0.4.0-alpha.6, which rolls back to the v0.4.0-alpha.4 behavior where, on MSVC, SIMD is opt in (instead of v0.4.0-alpha.5's opt out).

You can still get ICE if you opt in, but at least you should no longer see #151's ICE by default.

#define WUFFS_CONFIG__ENABLE_MSVC_CPU_ARCH__X86_64_V2 (or _V3) to opt in.

from wuffs.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.