Comments (11)
If I time.Sleep or runtime.Gosched after "the circuit should have opened by now", then it works as expected. I guess adding the buffer removed a synchronization point in your handler. Do you consider that a problem?
from hystrix-go.
hi, thanks for your interest in the project!
you've correctly identified the behavioral change here. with a buffered channel for metrics reporting, the associated health measurement is now asynchronous from the command execution which would "trip" the circuit. the buffer was added to the metrics channel based on the load tests i ran for #40
the tradeoff of synchronization for performance was acceptable to me but i am open to a conversation about it if you feel differently. specifically, that production systems have not lost their ability to quickly detect and react to external outages and that the added need for waiting during tests is reasonable. if you think i'm missing something important, please let me know!
from hystrix-go.
Unfortunately the time.Sleep or runtime.Gosched isn't guaranteed to make the test pass; the scheduler isn't required to schedule the circuit breaker goroutine at that yield. So the commit changes the circuit breaker's behavior from deterministic to probabilistic. I'm not sure if that's necessarily good or bad, but it does make your breaker unique among the implementations that Go kit provides ā the others being sony/gobreaker and streadway/handy/breaker.
From skimming #40, it looks like the performance improvement you got from using the buffered channel was just for the statsd collector. Is that right? If so, could you get the same benefit by making changes to the statsd collector instead?
from hystrix-go.
the change benefits all collectors equally, but the default collector is in-memory so we have more control about guaranteeing its speed. i'm going to play with the idea of adding a bit more structure around the default collecter since it serves as the source of truth during tests. possibly having it be synchronous and any other registered collectors be async.
from hystrix-go.
before i began investigating changes to the library, i attempted to replicate your findings by executing your example program using my local checkout. i found that be7b59e also exhibited the failure behavior you shared for 5b79165. (building your code into a "gokit_bugreport" binary and using while ./gokit_bugreport; do :; done
for execution)
i think there may be a misunderstanding about how the channel is being processed. when the channel is unbuffered, the only guarantee go provides us is that the sending code will block until the message has been read, not until the processing for the message is completed. so it wasn't ever really deterministic, and the question isn't whether we need to wait, but how long we should wait before checking circuit health.
can you reproduce my findings on your end?
from hystrix-go.
No, I can't reproduce that. Here's my terminal session. Are you removing the built package between checkouts?
from hystrix-go.
looking at your session output it appears you are only running the program once. try using the while loop i describe to run the program continuously until it exits 1.
running the binary instead of go run
speeds up the loop considerably. in my tests using this method, a failed run happens almost immediately.
from hystrix-go.
No, I'm sorry, I still can't reproduce the failure on be7b59e. Here's the test run as you like it. I let it run about a minute.
ugh ~/tmp/hystr cd $GOPATH/src/github.com/afex/hystrix-go ; git checkout be7b59e ; cd -
HEAD is now at be7b59e... Merge branch 'master' into loadtest
ugh ~/tmp/hystr rm -rf $GOPATH/pkg/darwin_amd64/github.com/afex/hystrix-go
ugh ~/tmp/hystr go build
ugh ~/tmp/hystr date ; while test $status -eq 0 ; ./hystr 2>/dev/null ; end ; date
Tue Aug 18 09:38:47 CEST 2015
^Cfish: Job 2, './hystr 2>/dev/null ; ' terminated by signal SIGINT (Quit request from job control (^C))
ugh ~/tmp/hystr
Here's the test run with the other commit. It fails immediately.
ugh ~/tmp/hystr cd $GOPATH/src/github.com/afex/hystrix-go ; git checkout 5b79165 ; cd -
Previous HEAD position was be7b59e... Merge branch 'master' into loadtest
HEAD is now at 5b79165... use buffered client for statsd collector to reduce TotalDuration overhead
ugh ~/tmp/hystr rm -rf $GOPATH/pkg/darwin_amd64/github.com/afex/hystrix-go
ugh ~/tmp/hystr go build
ugh ~/tmp/hystr date ; while test $status -eq 0 ; ./hystr 2>/dev/null ; end ; date
Tue Aug 18 09:40:25 CEST 2015
Tue Aug 18 09:40:25 CEST 2015
ugh ~/tmp/hystr
For completeness sake, I ran the same test on a Debian Linux VPS. Same results. Here's be7b59e running for a long time with no failures.
husserl ~/tmp/hystr cd $GOPATH/src/github.com/afex/hystrix-go ; git checkout be7b59e ; cd -
Note: checking out 'be7b59e'.
You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.
If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:
git checkout -b new_branch_name
HEAD is now at be7b59e... Merge branch 'master' into loadtest
husserl ~/tmp/hystr go build
husserl ~/tmp/hystr rm -rf $GOPATH/pkg/*/github.com/afex/hystrix-go
husserl ~/tmp/hystr rm -rf $GOPATH/pkg/linux_amd64/github.com/afex/hystrix-go
husserl ~/tmp/hystr go build
husserl ~/tmp/hystr date ; while test $status -eq 0 ; ./hystr 2>/dev/null ; end ; date
Tue Aug 18 03:50:02 EDT 2015
^Cā
And here's 5b79165 failing immediately.
husserl ~/tmp/hystr cd $GOPATH/src/github.com/afex/hystrix-go ; git checkout 5b79165 ; cd -
Previous HEAD position was be7b59e... Merge branch 'master' into loadtest
HEAD is now at 5b79165... use buffered client for statsd collector to reduce TotalDuration overhead
husserl ~/tmp/hystr rm -rf $GOPATH/pkg/linux_amd64/github.com/afex/hystrix-go
husserl ~/tmp/hystr go build
husserl ~/tmp/hystr date ; while test $status -eq 0 ; ./hystr 2>/dev/null ; end ; date
Tue Aug 18 03:51:07 EDT 2015
Tue Aug 18 03:51:07 EDT 2015
husserl ~/tmp/hystr
And just so we're clear, here's the example.go I'm using as my test.
from hystrix-go.
ok, i believe i found how we are getting different results. i'm assuming you are using go 1.4, while i'm running with 1.5. in 1.5, GOMAXPROCS defaults to the number of cores on the system. if you run your setup with GOMAXPROCS > 1, you'll see the failure i mention. when i used GOMAXPROCS=1 and either 1.4.2 or 1.5, i see the same behavior you do: no failures with be7b59e.
i think this difference in behavior highlights my previous point, and i'd like to hear your thoughts on the matter.
specifically, when using a unbuffered channel (as in be7b59e) go only guarantees that the send will block until the message is received, irrespective of processing done with that message. the command/circuit logic will continue while the metrics are being incremented and the command response does not wait for this to finish before returning to the caller. therefore it is invalid to assume that the metrics for a command execution will be finished processing when that command completes.
using only a single core enables some level of predictability to this processing, but there is no deterministic behavior at a language level that says the metric will be processed by the time the error is returned to the caller. for example, think of the scenario where many new metrics are tracked and the increments become (say, 1ms) slower. there is nothing in be7b59e which guarantees the behavior your test depends on.
in short, i'm saying that this implementation has always been probabilistic. this is my understanding of how go handles communication over channels. does it match yours?
from hystrix-go.
(sigh) Yep, I should have seen this. Thanks for your investigation. I'll manage it in my unit test.
from hystrix-go.
thanks for taking the time to back-and-forth this with me!
from hystrix-go.
Related Issues (20)
- Callback on Circuit State Change HOT 1
- Does the project have a maintainer? HOT 4
- How to update config an runtime? HOT 1
- function trap
- can this package update to go mod?
- A little confusion of Lock
- A few questions...
- High memory usage issues
- can hystrix-go add config at runtime? hystrix.ConfigureCommand HOT 1
- Get the original error? HOT 1
- circuit may open even though ErrorPercentThreshold is bigger than 100
- method executed within hystrix behaving weiredly
- Incorrect build instructions
- Prometheus support for metric collection HOT 1
- How to not set MaxConcurrentRequests?
- Provide context for runFunc which is cancelled during timeouts HOT 2
- Panic when Timeout if setup output variable in function
- metricExchange.Monitor and poolMetrics.Monitor goroutine leak
- I need name info in Update func Callback
- I find the hystrix has an erro, I use hystrix.Do method in my code, once the hystrix circuit is open, it will runs the runFunc and fallbackFunc forever. In my understanding, once the circuit is open, the code will run fallbackFunc unless the SleepWindow is open. I find another question is the code will lead to memory leak
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
š Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. ššš
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ā¤ļø Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from hystrix-go.