Giter Club home page Giter Club logo

Comments (11)

peterbourgon avatar peterbourgon commented on July 30, 2024

If I time.Sleep or runtime.Gosched after "the circuit should have opened by now", then it works as expected. I guess adding the buffer removed a synchronization point in your handler. Do you consider that a problem?

from hystrix-go.

afex avatar afex commented on July 30, 2024

hi, thanks for your interest in the project!

you've correctly identified the behavioral change here. with a buffered channel for metrics reporting, the associated health measurement is now asynchronous from the command execution which would "trip" the circuit. the buffer was added to the metrics channel based on the load tests i ran for #40

the tradeoff of synchronization for performance was acceptable to me but i am open to a conversation about it if you feel differently. specifically, that production systems have not lost their ability to quickly detect and react to external outages and that the added need for waiting during tests is reasonable. if you think i'm missing something important, please let me know!

from hystrix-go.

peterbourgon avatar peterbourgon commented on July 30, 2024

Unfortunately the time.Sleep or runtime.Gosched isn't guaranteed to make the test pass; the scheduler isn't required to schedule the circuit breaker goroutine at that yield. So the commit changes the circuit breaker's behavior from deterministic to probabilistic. I'm not sure if that's necessarily good or bad, but it does make your breaker unique among the implementations that Go kit provides ā€” the others being sony/gobreaker and streadway/handy/breaker.

From skimming #40, it looks like the performance improvement you got from using the buffered channel was just for the statsd collector. Is that right? If so, could you get the same benefit by making changes to the statsd collector instead?

from hystrix-go.

afex avatar afex commented on July 30, 2024

the change benefits all collectors equally, but the default collector is in-memory so we have more control about guaranteeing its speed. i'm going to play with the idea of adding a bit more structure around the default collecter since it serves as the source of truth during tests. possibly having it be synchronous and any other registered collectors be async.

from hystrix-go.

afex avatar afex commented on July 30, 2024

before i began investigating changes to the library, i attempted to replicate your findings by executing your example program using my local checkout. i found that be7b59e also exhibited the failure behavior you shared for 5b79165. (building your code into a "gokit_bugreport" binary and using while ./gokit_bugreport; do :; done for execution)

i think there may be a misunderstanding about how the channel is being processed. when the channel is unbuffered, the only guarantee go provides us is that the sending code will block until the message has been read, not until the processing for the message is completed. so it wasn't ever really deterministic, and the question isn't whether we need to wait, but how long we should wait before checking circuit health.

can you reproduce my findings on your end?

from hystrix-go.

peterbourgon avatar peterbourgon commented on July 30, 2024

No, I can't reproduce that. Here's my terminal session. Are you removing the built package between checkouts?

from hystrix-go.

afex avatar afex commented on July 30, 2024

looking at your session output it appears you are only running the program once. try using the while loop i describe to run the program continuously until it exits 1.

running the binary instead of go run speeds up the loop considerably. in my tests using this method, a failed run happens almost immediately.

from hystrix-go.

peterbourgon avatar peterbourgon commented on July 30, 2024

No, I'm sorry, I still can't reproduce the failure on be7b59e. Here's the test run as you like it. I let it run about a minute.

ugh ~/tmp/hystr  cd $GOPATH/src/github.com/afex/hystrix-go ; git checkout be7b59e ; cd -
HEAD is now at be7b59e... Merge branch 'master' into loadtest
ugh ~/tmp/hystr rm -rf $GOPATH/pkg/darwin_amd64/github.com/afex/hystrix-go
ugh ~/tmp/hystr go build
ugh ~/tmp/hystr date ; while test $status -eq 0 ; ./hystr 2>/dev/null ; end ; date
Tue Aug 18 09:38:47 CEST 2015
^Cfish: Job 2, './hystr 2>/dev/null ; ' terminated by signal SIGINT (Quit request from job control (^C))
ugh ~/tmp/hystr

Here's the test run with the other commit. It fails immediately.

ugh ~/tmp/hystr cd $GOPATH/src/github.com/afex/hystrix-go ; git checkout 5b79165 ; cd -
Previous HEAD position was be7b59e... Merge branch 'master' into loadtest
HEAD is now at 5b79165... use buffered client for statsd collector to reduce TotalDuration overhead
ugh ~/tmp/hystr rm -rf $GOPATH/pkg/darwin_amd64/github.com/afex/hystrix-go
ugh ~/tmp/hystr go build
ugh ~/tmp/hystr date ; while test $status -eq 0 ; ./hystr 2>/dev/null ; end ; date
Tue Aug 18 09:40:25 CEST 2015
Tue Aug 18 09:40:25 CEST 2015
ugh ~/tmp/hystr

For completeness sake, I ran the same test on a Debian Linux VPS. Same results. Here's be7b59e running for a long time with no failures.

husserl ~/tmp/hystr cd $GOPATH/src/github.com/afex/hystrix-go ; git checkout be7b59e ; cd -
Note: checking out 'be7b59e'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b new_branch_name

HEAD is now at be7b59e... Merge branch 'master' into loadtest
husserl ~/tmp/hystr go build
husserl ~/tmp/hystr rm -rf $GOPATH/pkg/*/github.com/afex/hystrix-go
husserl ~/tmp/hystr rm -rf $GOPATH/pkg/linux_amd64/github.com/afex/hystrix-go
husserl ~/tmp/hystr go build
husserl ~/tmp/hystr date ; while test $status -eq 0 ; ./hystr 2>/dev/null ; end ; date
Tue Aug 18 03:50:02 EDT 2015
^CāŽ

And here's 5b79165 failing immediately.

husserl ~/tmp/hystr cd $GOPATH/src/github.com/afex/hystrix-go ; git checkout 5b79165 ; cd -
Previous HEAD position was be7b59e... Merge branch 'master' into loadtest
HEAD is now at 5b79165... use buffered client for statsd collector to reduce TotalDuration overhead
husserl ~/tmp/hystr rm -rf $GOPATH/pkg/linux_amd64/github.com/afex/hystrix-go
husserl ~/tmp/hystr go build
husserl ~/tmp/hystr date ; while test $status -eq 0 ; ./hystr 2>/dev/null ; end ; date
Tue Aug 18 03:51:07 EDT 2015
Tue Aug 18 03:51:07 EDT 2015
husserl ~/tmp/hystr

And just so we're clear, here's the example.go I'm using as my test.

from hystrix-go.

afex avatar afex commented on July 30, 2024

ok, i believe i found how we are getting different results. i'm assuming you are using go 1.4, while i'm running with 1.5. in 1.5, GOMAXPROCS defaults to the number of cores on the system. if you run your setup with GOMAXPROCS > 1, you'll see the failure i mention. when i used GOMAXPROCS=1 and either 1.4.2 or 1.5, i see the same behavior you do: no failures with be7b59e.

i think this difference in behavior highlights my previous point, and i'd like to hear your thoughts on the matter.

specifically, when using a unbuffered channel (as in be7b59e) go only guarantees that the send will block until the message is received, irrespective of processing done with that message. the command/circuit logic will continue while the metrics are being incremented and the command response does not wait for this to finish before returning to the caller. therefore it is invalid to assume that the metrics for a command execution will be finished processing when that command completes.

using only a single core enables some level of predictability to this processing, but there is no deterministic behavior at a language level that says the metric will be processed by the time the error is returned to the caller. for example, think of the scenario where many new metrics are tracked and the increments become (say, 1ms) slower. there is nothing in be7b59e which guarantees the behavior your test depends on.

in short, i'm saying that this implementation has always been probabilistic. this is my understanding of how go handles communication over channels. does it match yours?

from hystrix-go.

peterbourgon avatar peterbourgon commented on July 30, 2024

(sigh) Yep, I should have seen this. Thanks for your investigation. I'll manage it in my unit test.

from hystrix-go.

afex avatar afex commented on July 30, 2024

thanks for taking the time to back-and-forth this with me!

from hystrix-go.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    šŸ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. šŸ“ŠšŸ“ˆšŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ā¤ļø Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.