Comments (8)

Viq111 avatar Viq111 commented on June 25, 2024

Thanks for the report! I'll try to provision an arm64 machine this week and rerun that test.
Just to make sure: you tested without any changes on the 1.x branch (commit 4b8fdba), right?
(I'm asking because upstream facebook/zstd recently released 1.4.7, which had some issues on arm64.)

from zstd.

pterjan avatar pterjan commented on June 25, 2024

Yes this is using https://github.com/DataDog/zstd/releases/tag/v1.4.5%2Bpatch1

I ran some more tests and didn't get any failures using "taskset -a -c 0-15 go test", so the 32 cores may be the problem rather than arm64.

pterjan avatar pterjan commented on June 25, 2024

Yes, I can confirm: I could not reproduce on a 32-vCPU x86_64 VM, but I could on a 64-vCPU one:

 go test
--- FAIL: TestStreamCompressionDecompressionParallel (0.00s)
    --- FAIL: TestStreamCompressionDecompressionParallel/#100 (0.05s)
        zstd_stream_test.go:115: Did not read the same ��C != Hello World!
FAIL
exit status 1
FAIL	_/home/pterjan/co/cauldron/golang-github-datadog-zstd/BUILD/zstd-1.4.5-patch1	2.631s

Viq111 avatar Viq111 commented on June 25, 2024

Thanks for the pointers!

I was able to reproduce on an AWS c5.9xl (36 cores):

go version go1.15.7 linux/amd64
base:~/zstd (1.x✘) ᐅ go test -run TestStreamCompressionDecompressionParallel -count 100
--- FAIL: TestStreamCompressionDecompressionParallel (0.00s)
    --- FAIL: TestStreamCompressionDecompressionParallel/#101 (0.01s)
        zstd_stream_test.go:115: Did not read the same �D != Hello World!
--- FAIL: TestStreamCompressionDecompressionParallel (0.00s)
    --- FAIL: TestStreamCompressionDecompressionParallel/#169 (0.02s)
        zstd_stream_test.go:115: Did not read the same 	���K�	� != Hello World!
FAIL
exit status 1
FAIL	_/home/viq111/zstd	3.005s

However, testing previous Go versions:

base:~/zstd (1.x✘) ᐅ eval "$(~/bin/gimme 1.14)"
go version go1.14 linux/amd64
base:~/zstd (1.x✘) ᐅ go test -run TestStreamCompressionDecompressionParallel -count 100
PASS
ok  	_/home/viq111/zstd	1.873s
base:~/zstd (1.x✘) ᐅ eval "$(~/bin/gimme 1.15)"
go version go1.15 linux/amd64
base:~/zstd (1.x✘) ᐅ go test -run TestStreamCompressionDecompressionParallel -count 100
PASS
ok  	_/home/viq111/zstd	1.842s
base:~/zstd (1.x✘) ᐅ eval "$(~/bin/gimme 1.15.2)"
go version go1.15.2 linux/amd64
base:~/zstd (1.x✘) ᐅ go test -run TestStreamCompressionDecompressionParallel -count 500
PASS
ok  	_/home/viq111/zstd	9.037s
base:~/zstd (1.x✘) ᐅ eval "$(~/bin/gimme 1.15.3)"
go version go1.15.3 linux/amd64
base:~/zstd (1.x✘) ᐅ go test -run TestStreamCompressionDecompressionParallel -count 500
PASS
ok  	_/home/viq111/zstd	8.497s
base:~/zstd (1.x✘) ᐅ eval "$(~/bin/gimme 1.15.4)"
go version go1.15.4 linux/amd64
base:~/zstd (1.x✘) ᐅ go test -run TestStreamCompressionDecompressionParallel -count 100
--- FAIL: TestStreamCompressionDecompressionParallel (0.00s)
    --- FAIL: TestStreamCompressionDecompressionParallel/#165 (0.00s)
        zstd_stream_test.go:115: Did not read the same �D != Hello World!
FAIL
exit status 1
FAIL	_/home/viq111/zstd	2.961s

It seems the test starts failing on machines with many CPU cores with Go versions >= 1.15.4
(either a bug was introduced, or the failure case now happens more frequently).

I've tried updating the vendored C zstd code here:

base:~/zstd (1.x✘) ᐅ git checkout viq111/zstd1.4.8
Switched to branch 'viq111/zstd1.4.8'
Your branch is up-to-date with 'origin/viq111/zstd1.4.8'.
base:~/zstd (viq111/zstd1.4.8✘) ᐅ go test -run TestStreamCompressionDecompressionParallel -count 100
--- FAIL: TestStreamCompressionDecompressionParallel (0.00s)
    --- FAIL: TestStreamCompressionDecompressionParallel/#167 (0.00s)
        zstd_stream_test.go:115: Did not read the same �D != Hello World!
FAIL
exit status 1
FAIL	_/home/viq111/zstd	2.973s

and it still fails, so something on the Go side is happening.
I'll try to dig deeper, but if you have any insights into the Go runtime, I'll gladly take the help.
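
For readers unfamiliar with this kind of test, here is a minimal sketch of a parallel stream round trip, the shape suggested by the test name and the "Did not read the same ... != Hello World!" message in the logs above. It is not the repository's test. As an assumption for illustration, compress/gzip from the Go standard library stands in for the zstd stream API, and roundTrip and countMismatches are hypothetical helper names.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"sync"
)

// roundTrip compresses payload through a streaming writer and reads it back
// through a streaming reader. ASSUMPTION: compress/gzip is only a stand-in
// for the zstd stream API, which is not reproduced here.
func roundTrip(payload []byte) ([]byte, error) {
	var compressed bytes.Buffer
	w := gzip.NewWriter(&compressed)
	if _, err := w.Write(payload); err != nil {
		return nil, err
	}
	if err := w.Close(); err != nil {
		return nil, err
	}

	r, err := gzip.NewReader(&compressed)
	if err != nil {
		return nil, err
	}
	defer r.Close()

	var out bytes.Buffer
	if _, err := out.ReadFrom(r); err != nil {
		return nil, err
	}
	return out.Bytes(), nil
}

// countMismatches runs n round trips concurrently and reports how many
// returned an error or bytes that differ from the input.
func countMismatches(n int, payload []byte) int {
	var (
		wg  sync.WaitGroup
		mu  sync.Mutex
		bad int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			got, err := roundTrip(payload)
			if err != nil || !bytes.Equal(got, payload) {
				mu.Lock()
				bad++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return bad
}

func main() {
	fmt.Println("mismatches:", countMismatches(200, []byte("Hello World!")))
}
```

Running many independent round trips at once is what lets the scheduler spread the work over every available core, which fits the observation that the failure only shows up on machines with many cores.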

evanj avatar evanj commented on June 25, 2024

I think I may have figured out how to reproduce this, and maybe how to fix it. First up: I'm going to submit a change to include a test that fails consistently, then I'll figure out if my "fix" is right ...

evanj avatar evanj commented on June 25, 2024

Wow, this was an amazing adventure. My proposed fix is in #91. If you want to read more about how this bug happens, see https://github.com/evanj/cgouintptrbug. I have wasted FAR too much time on this, but I learned a lot!
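
As general background only, and not a reproduction of the linked write-up or of the fix in #91: the Go `unsafe` documentation states that a uintptr is an integer with no pointer semantics, so the runtime does not update it if the object it was derived from moves. Goroutine stacks can move when they grow. The sketch below is a pure-Go illustration of that rule under those assumptions, with no cgo involved; grow and addrMoved are hypothetical names.

```go
package main

import (
	"fmt"
	"unsafe"
)

// grow recurses with a large frame so the goroutine needs a bigger stack.
//
//go:noinline
func grow(depth int) byte {
	var pad [1024]byte // per-frame padding
	pad[depth%len(pad)] = byte(depth)
	if depth == 0 {
		return pad[0]
	}
	return grow(depth-1) + pad[depth%len(pad)]
}

// addrMoved records the address of a stack variable as a plain integer,
// forces stack growth, and then checks whether the variable still lives at
// that address. The saved uintptr is never converted back to a pointer.
func addrMoved() bool {
	var x int
	before := uintptr(unsafe.Pointer(&x))
	grow(1000) // needs roughly 1 MB of stack
	after := uintptr(unsafe.Pointer(&x))
	return before != after
}

func main() {
	// Typically reports true, but the runtime makes no guarantee either way.
	fmt.Println("stack variable moved:", addrMoved())
}
```

When the address does change, the integer saved in before refers to memory the variable no longer occupies, which is why code that hands such an integer to something expecting a valid address can read or write the wrong bytes.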

pterjan avatar pterjan commented on June 25, 2024

Thanks for the investigation, the fix and the great explanation!

Viq111 avatar Viq111 commented on June 25, 2024

Releases v1.4.5+patch2 & v1.4.8 now have this bug fixed.
Thanks for the report!
