carbonzipper's People

Contributors

azhiltsov, civil, deniszh, dgryski, gksinghjsr, gunnihinn, jaderdias, kanatohodets, korservick, nnuss, szibis, zdykstra


carbonzipper's Issues

carbonzipper fails to start if it can't connect to graphite

2015/12/02 13:07:05 starting carbonzipper 9054756
2015/12/02 13:07:05 setting GOMAXPROCS= 24
2015/12/02 13:07:05 Setting concurrencyLimit 2048
2015/12/02 13:07:05 Using graphite host 127.0.0.1:3002
2015/12/02 13:07:05 unable to connect to to graphite: 127.0.0.1:3002:dial tcp 127.0.0.1:3002: getsockopt: connection refused

This shouldn't be necessary; I guess it should just keep trying to connect rather than fail to start.
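A minimal sketch of what that could look like (hypothetical helper, not carbonzipper's actual startup code): dial the graphite host in the background and retry with backoff instead of exiting.

package main

import (
	"log"
	"net"
	"time"
)

// connectGraphiteWithRetry is a hypothetical helper: instead of exiting when
// the graphite host is unreachable at startup, keep retrying in the
// background with exponential backoff.
func connectGraphiteWithRetry(addr string) {
	go func() {
		backoff := time.Second
		for {
			conn, err := net.DialTimeout("tcp", addr, 5*time.Second)
			if err == nil {
				conn.Close()
				log.Printf("connected to graphite: %s", addr)
				return
			}
			log.Printf("unable to connect to graphite: %s: %v (retrying in %v)", addr, err, backoff)
			time.Sleep(backoff)
			if backoff < time.Minute {
				backoff *= 2
			}
		}
	}()
}

func main() {
	connectGraphiteWithRetry("127.0.0.1:3002")
	time.Sleep(10 * time.Second) // keep the demo process alive briefly
}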

New release?

Hi Damian, can you tag a new release so I can build, deploy, etc? I want to give carbonserver a spin in production today :D

unable to merge ovalues

Seeing this message in carbonzipper logs:

request: /render/?format=protobuf&from=1488213577&target=big.long.metric.name&until=1488299977: unable to merge ovalues: len(values)=8640 but len(ovalues)=1440

Not sure what to do here, or what info you need to diagnose. Any help would be appreciated.
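For context, those two lengths are consistent with two backends answering the same 24-hour window at different resolutions (say, 10-second versus 60-second steps); a quick illustration, not project code:

package main

import "fmt"

func main() {
	from, until := 1488213577, 1488299977
	window := until - from     // 86400 seconds, i.e. 24 hours
	fmt.Println(window / 8640) // 10 -> one backend answered with a 10s step
	fmt.Println(window / 1440) // 60 -> the other answered with a 60s step
}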

carbonzipper 0.72 exit code -1

Hi,

I have this picture of our graphite platform:
[screenshot of the graphite platform layout, 2017-08-28]

I use graphite-web 1.0.2, carbonapi 0.8.0 and carbonzipper 0.70, because carbonzipper 0.72 fails.

When graphite-web sends a query with a wildcard, carbonzipper dies; these are the only logs I have:

child process died with exit code -1
W0825 14:45:46.199808   894 logging.cpp:91] RAW: Received signal SIGTERM from process 10326 of user 0; exiting

Thanks

weird outage this morning

I am running a slightly weird config while I try to migrate:

carbon-c-relay going to two separate clusters:
a) 3 hosts in carbon_ch hash, go-carbon/carbonserver
b) 4 hosts in jump fnva1 hash, go-carbon/carbonserver

carbonzipper is set up to read from all 7 backend hosts.

This morning, one of the jump_fnva1 machines went offline. At that time, I started to see really strange spikes of both write timeouts from carbon-c-relay, as well as carbonzipper reporting read timeouts.

See these graphs:

[screenshot: graphs of write and read timeouts, 2017-09-08 8:50 AM]

As soon as I got the go-carbon host back up, things went back to normal. But this behavior is strange, and not what I expected.

Does anyone know what could cause this?

carbonzipper 0.74 not returning results for multiple queries

I have the two queries below: target=local.monitoring.U.rg.adc01.servers.ossgmsdataadc0101.cache.notConfirmedShardLenMax&target=local.monitoring.U.rg.adc01.servers.ossgmsdataadc0101.cache.maxSize
But zipper returns values only for the first part of the query.

2018-09-27T16:31:25.838-0500 DEBUG render got render request {"memory_usage_bytes": 0, "handler": "render", "carbonzipper_uuid": "2bcfb82f-2257-4543-a284-06813b3e45cb", "carbonapi_uuid": "", "request": "/render/?format=pickle&local=1&noCache=1&from=1538083585&until=1538083885&target=local.monitoring.U.rg.adc01.servers.ossgmsdataadc0101.cache.notConfirmedShardLenMax&target=local.monitoring.U.rg.adc01.servers.ossgmsdataadc0101.cache.maxSize&now=1538083885"}
2018-09-27T16:31:25.838-0500 DEBUG render querying servers {"memory_usage_bytes": 0, "handler": "render", "carbonzipper_uuid": "2bcfb82f-2257-4543-a284-06813b3e45cb", "carbonapi_uuid": "", "handler": "multiGet", "servers": ["http://172.16.0.162:8080", "http://172.16.0.163:8080"], "uri": "/render/?format=protobuf&from=1538083585&target=ocal.monitoring.U.rg.adc01.servers.ossgmsdataadc0101.cache.notConfirmedShardLenMax&until=1538083885"}
2018-09-27T16:31:26.518-0500 ERROR render query error {"memory_usage_bytes": 0, "handler": "render", "carbonzipper_uuid": "2bcfb82f-2257-4543-a284-06813b3e45cb", "carbonapi_uuid": "", "handler": "multiGet", "handler": "singleGet", "query": "http://172.16.0.163:8080//render/?format=protobuf&from=1538083585&target=ocal.monitoring.U.rg.adc01.servers.ossgmsdataadc0101.cache.notConfirmedShardLenMax&until=1538083885", "error": "Get http://172.16.0.163:8080/render/?format=protobuf&from=1538083585&target=local-dev-vergil.monitoring.US.rg.adc01.servers.ossgmsdataadc0101a.cache.notConfirmedShardLenMax&until=1538083885: dial tcp 172.16.0.163:8080: getsockopt: no route to host"}
2018-09-27T16:31:26.518-0500 DEBUG zipper_render decoded response {"name": "ocal.monitoring.U.rg.adc01.servers.ossgmsdataadc0101.cache.notConfirmedShardLenMax", "decoded": [{"name":"ocal.monitoring.U.rg.adc01.servers.ossgmsdataadc0101.cache.notConfirmedShardLenMax","startTime":1538083800,"stopTime":1538084100,"stepTime":300,"values":[4],"isAbsent":[false]}]}
2018-09-27T16:31:26.518-0500 DEBUG zipper_render only one decoded response to merge {"name": "local.monitoring.U.rg.adc01.servers.ossgmsdataadc0101.cache.notConfirmedShardLenMax"}
2018-09-27T16:31:26.518-0500 INFO access request served {"handler": "render", "carbonzipper_uuid": "2bcfb82f-2257-4543-a284-06813b3e45cb", "carbonapi_uuid": "", "format": "pickle", "target": "local.monitoring.U.rg.adc01.servers.ossgmsdataadc0101.cache.notConfirmedShardLenMax", "memory_usage_bytes": 0, "http_code": 200, "runtime_seconds": 0.67930205}

Q: troubleshoot carbonzipper -> carbonserver (via go-carbon) timeouts?

Hi,

First, I love this project! I'm in the middle of implementing this, and currently have this setup:

grafana -> carbonapi -> carbonzipper -> carbonserver on 3 servers

This morning I've seen a few timeouts being reported by carbonzipper. Nothing in carbonserver stats or logs that I can see.

Where should I be looking to see what's going on here? Thanks!

Fault Tolerance for backend failures?

If you query against carbonzipper from your frontend graphite webapp, how does it handle backend servers going down? In the traditional graphite stack, I have run into issues whereby the webapp slows to a crawl when one of its cluster servers goes down. Is the behavior the same with carbonzipper?

JSON input support

carbonserver has a mode to produce JSON data instead of pickles. Since JSON is more efficient (and cleaner) for Go to produce, it would be nice if zipper could read JSON instead of pickle, which would probably optimise the interaction between the two a bit.

Improve documentation

If anybody else is going to use the zipper stack, we need to actually document how to set the whole thing up, including configuration and testing that all the moving parts are connected happily.

Proxy Backend logic

For some other databases it would be nice to have "proxy" backend logic: a separate list of backends where you are 100% sure that all of them have the same set of metrics.

This is useful for databases that support distributed tables.

info target produces empty hash

The /info target of zipper produces an empty result ({}), while the carbonserver (store) produces the full info set. I suspect this one got broken after the protocol upgrade.

logarithmic request time metrics

The request time histogram metrics should be log-scale, so that we capture small request times in much more detail than slower ones. Typically we see nothing taking more than 3 seconds, while we'd like a more in-depth view of what the distribution under 1 second looks like.
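A minimal sketch of log-scale bucketing (the base and bucket count are assumptions, not the project's actual metric layout): exponential bucket boundaries give sub-second requests most of the resolution.

package main

import (
	"fmt"
	"math"
	"time"
)

// logBucket returns the index of an exponential (log2) bucket for a request
// duration, so sub-second requests spread across many buckets while
// multi-second outliers collapse into a few.
func logBucket(d time.Duration, base time.Duration, buckets int) int {
	if d <= base {
		return 0
	}
	i := int(math.Ceil(math.Log2(float64(d) / float64(base))))
	if i >= buckets {
		return buckets - 1
	}
	return i
}

func main() {
	for _, d := range []time.Duration{5 * time.Millisecond, 80 * time.Millisecond, 600 * time.Millisecond, 3 * time.Second} {
		fmt.Printf("%-8v -> bucket %d\n", d, logBucket(d, time.Millisecond, 16))
	}
}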

Reload support

When the config is modified, we want a reload option so that connections won't get lost. Otherwise people can see the zipper not responding.

We also need a way to check that the config is valid.

Make hostname configurable

Hi.

If you run carbonzipper in a Docker container you will get a lot of interesting hostnames.

Maybe we can use a template for metric names, like go-carbon does?

graph-prefix = "carbon.agents.{host}"
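A minimal sketch of that kind of template expansion, assuming a go-carbon-style {host} placeholder (the helper name is illustrative):

package main

import (
	"fmt"
	"os"
	"strings"
)

// expandGraphPrefix replaces a go-carbon-style "{host}" placeholder with the
// local hostname, with dots swapped out so they don't add extra metric levels.
func expandGraphPrefix(template string) string {
	host, err := os.Hostname()
	if err != nil {
		host = "unknown"
	}
	host = strings.Replace(host, ".", "_", -1)
	return strings.Replace(template, "{host}", host, -1)
}

func main() {
	fmt.Println(expandGraphPrefix("carbon.agents.{host}"))
}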

Concurrent requests to same metric crash the program

Run current HEAD, make it talk to any store, and make concurrent requests to any single metric with cache disabled:

target='something'
while true; do
    curl -s "http://127.0.0.1:8081/render/?target=$target&noCache=1&format=json" > /dev/null &
done

This will crash the carbonzipper process with a close of a closed channel; see [1].

The stack trace points to a manual call of a method that had been defer-called above, and which does close a channel. This is a red herring, as the issue is still reproducible after removing the manual call.

The real issue could be that the QueryItem cache keys are constructed just from the request data, so two requests for the same target that come in close enough to each other may be able to double-write a *QueryItem to the same cache key before one of them is locked. Once the cache key gets filled in, two or more goroutines then close the same channel.

Running with cache disabled only serves to exhibit the issue more quickly; we see this in production at Booking.com with the cache enabled.

[1] Stack trace:

panic: close of closed channel
goroutine 8297 [running]:
panic(0xf05d80, 0x107ab90)
/usr/lib64/go/src/runtime/panic.go:551 +0x3d9 fp=0xc429405110 sp=0xc429405070 pc=0x431629
runtime.closechan(0xc432f70840)
/usr/lib64/go/src/runtime/chan.go:333 +0x2b3 fp=0xc429405170 sp=0xc429405110 pc=0x405e23
github.com/go-graphite/carbonapi/vendor/github.com/go-graphite/carbonzipper/zipper/cache.(*QueryItem).StoreAbort(0xc421176b40)
/home/redacted/go/src/github.com/go-graphite/carbonapi/vendor/github.com/go-graphite/carbonzipper/zipper/cache/query.go:59 +0xb8 fp=0xc4294051b8 sp=0xc429405170 pc=0xce9608
github.com/go-graphite/carbonapi/vendor/github.com/go-graphite/carbonzipper/zipper/broadcast.(*BroadcastGroup).Fetch(0xc4201f8960, 0x1084500, 0xc428615ac0, 0xc4211027c0, 0x0, 0x0, 0x0)
/home/redacted/go/src/github.com/go-graphite/carbonapi/vendor/github.com/go-graphite/carbonzipper/zipper/broadcast/broadcast_group.go:258 +0xcc2 fp=0xc429405918 sp=0xc4294051b8 pc=0xceda72
github.com/go-graphite/carbonapi/vendor/github.com/go-graphite/carbonzipper/zipper/broadcast.(*BroadcastGroup).doSingleFetch(0xc4204750e0, 0x1084500, 0xc4222d7500, 0xc4222100c0, 0x10894a0, 0xc4201f8960, 0xc4211027c0, 0xc422f29bc0, 0xc422f29b00)
/home/redacted/go/src/github.com/go-graphite/carbonapi/vendor/github.com/go-graphite/carbonzipper/zipper/broadcast/broadcast_group.go:182 +0x8ff fp=0xc429405f98 sp=0xc429405918 pc=0xcebe1f
runtime.goexit()
/usr/lib64/go/src/runtime/asm_amd64.s:2361 +0x1 fp=0xc429405fa0 sp=0xc429405f98 pc=0x463051
created by github.com/go-graphite/carbonapi/vendor/github.com/go-graphite/carbonzipper/zipper/broadcast.(*BroadcastGroup).Fetch
/home/redacted/go/src/github.com/go-graphite/carbonapi/vendor/github.com/go-graphite/carbonzipper/zipper/broadcast/broadcast_group.go:214 +0x920

Very high memory consumption

Hello here,

we are using carbonzipper 0.7.2, in combination with carbonapi (0.8.0) and go-carbon (0.9.1).

Our cluster consists of 3 nodes proxied by carbon-c-relay.

We recently upgraded from an old version of the stack, and since then we have been seeing a very high memory consumption pattern:

pmap shows a total of 44201100K, and top shows around 25% usage on a very beefy server.

This is our current configuration:

carbonzipper

maxProcs: 16

timeouts:
    global: "10s"
    afterStarted: "2s"

concurrencyLimit: 0
maxIdleConnsPerHost: 100
expireDelaySec: 10

carbonapi

concurency: 20
cache:
   type: "mem"
   size_mb: 1024
   defaultTimeoutSec: 60

cpus: 2
tz: ""

sendGlobsAsIs: true

maxBatchSize: 1000

Any suggestions on how best to approach the problem?

Many thanks!

support for some other databases

It would be useful to add support for other databases that have a compatible data model (this currently depends on the refactoring going on in the grpcNew branch). Examples:

  • metrictank
  • graphouse

maybe there are more.

At the least, all further refactoring should be done with this idea in mind, and we should make it easier to add custom data model support to the zipper.

dep not working

problem 1

dep has switched to using TOML for its manifest and lock file, right after carbonzipper and carbonapi started using dep.

From a clean env (dep not installed), make would fail:

$ make
/go/bin/dep
dep ensure
could not find project Gopkg.toml, use dep init to initiate a manifest
Makefile:21: recipe for target 'dep' failed
make: *** [dep] Error 1

If I install an old version of dep (e.g. 4105d3a), dep doesn't complain about this. Maybe you should update your local dep and re-run dep init.

problem 2

In a clean env, dep ensure would fail with the following error:

$ dep ensure
solve error: No versions of github.com/gogo/protobuf met constraints:
	master: failed to create repository cache for https://github.com/gogo/protobuf with err:
unable to get repository: Cloning into '/go/pkg/dep/sources/https---github.com-gogo-protobuf'...

	v0.4: Could not introduce github.com/gogo/protobuf@v0.4, as it is not allowed by constraint master from project github.com/go-graphite/carbonzipper.
	v0.3: Could not introduce github.com/gogo/protobuf@v0.3, as it is not allowed by constraint master from project github.com/go-graphite/carbonzipper.
	v0.2: Could not introduce github.com/gogo/protobuf@v0.2, as it is not allowed by constraint master from project github.com/go-graphite/carbonzipper.
	v0.1: Could not introduce github.com/gogo/protobuf@v0.1, as it is not allowed by constraint master from project github.com/go-graphite/carbonzipper.
	master: failed to create repository cache for https://github.com/gogo/protobuf with err:
unable to get repository: fatal: destination path '/go/pkg/dep/sources/https---github.com-gogo-protobuf' already exists and is not an empty directory.

	bigendian: Could not introduce github.com/gogo/protobuf@bigendian, as it is not allowed by constraint master from project github.com/go-graphite/carbonzipper.
	bigskip: Could not introduce github.com/gogo/protobuf@bigskip, as it is not allowed by constraint master from project github.com/go-graphite/carbonzipper.
	gopherjs: Could not introduce github.com/gogo/protobuf@gopherjs, as it is not allowed by constraint master from project github.com/go-graphite/carbonzipper.
	moretests: Could not introduce github.com/gogo/protobuf@moretests, as it is not allowed by constraint master from project github.com/go-graphite/carbonzipper.
	proto3.2.0: Could not introduce github.com/gogo/protobuf@proto3.2.0, as it is not allowed by constraint master from project github.com/go-graphite/carbonzipper.
	ptypes: Could not introduce github.com/gogo/protobuf@ptypes, as it is not allowed by constraint master from project github.com/go-graphite/carbonzipper.
	revert-271-issue-222: Could not introduce github.com/gogo/protobuf@revert-271-issue-222, as it is not allowed by constraint master from project github.com/go-graphite/carbonzipper.
	std: Could not introduce github.com/gogo/protobuf@std, as it is not allowed by constraint master from project github.com/go-graphite/carbonzipper.
	wkt: Could not introduce github.com/gogo/protobuf@wkt, as it is not allowed by constraint master from project github.com/go-graphite/carbonzipper.
ensure Solve(): No versions of github.com/gogo/protobuf met constraints:
	master: failed to create repository cache for https://github.com/gogo/protobuf with err:
unable to get repository: Cloning into '/go/pkg/dep/sources/https---github.com-gogo-protobuf'...

	v0.4: Could not introduce github.com/gogo/protobuf@v0.4, as it is not allowed by constraint master from project github.com/go-graphite/carbonzipper.
	v0.3: Could not introduce github.com/gogo/protobuf@v0.3, as it is not allowed by constraint master from project github.com/go-graphite/carbonzipper.
	v0.2: Could not introduce github.com/gogo/protobuf@v0.2, as it is not allowed by constraint master from project github.com/go-graphite/carbonzipper.
	v0.1: Could not introduce github.com/gogo/protobuf@v0.1, as it is not allowed by constraint master from project github.com/go-graphite/carbonzipper.
	master: failed to create repository cache for https://github.com/gogo/protobuf with err:
unable to get repository: fatal: destination path '/go/pkg/dep/sources/https---github.com-gogo-protobuf' already exists and is not an empty directory.

	bigendian: Could not introduce github.com/gogo/protobuf@bigendian, as it is not allowed by constraint master from project github.com/go-graphite/carbonzipper.
	bigskip: Could not introduce github.com/gogo/protobuf@bigskip, as it is not allowed by constraint master from project github.com/go-graphite/carbonzipper.
	gopherjs: Could not introduce github.com/gogo/protobuf@gopherjs, as it is not allowed by constraint master from project github.com/go-graphite/carbonzipper.
	moretests: Could not introduce github.com/gogo/protobuf@moretests, as it is not allowed by constraint master from project github.com/go-graphite/carbonzipper.
	proto3.2.0: Could not introduce github.com/gogo/protobuf@proto3.2.0, as it is not allowed by constraint master from project github.com/go-graphite/carbonzipper.
	ptypes: Could not introduce github.com/gogo/protobuf@ptypes, as it is not allowed by constraint master from project github.com/go-graphite/carbonzipper.
	revert-271-issue-222: Could not introduce github.com/gogo/protobuf@revert-271-issue-222, as it is not allowed by constraint master from project github.com/go-graphite/carbonzipper.
	std: Could not introduce github.com/gogo/protobuf@std, as it is not allowed by constraint master from project github.com/go-graphite/carbonzipper.
	wkt: Could not introduce github.com/gogo/protobuf@wkt, as it is not allowed by constraint master from project github.com/go-graphite/carbonzipper.

I'm not familiar with dep, so I don't know what's going wrong...

Merge logic should try harder to merge series when one is of a different length.

mergeValues() will set responseLengthMismatch = true on the first length mismatch and prevent further attempts to merge. This can prevent merging from a following series that does have the same length.

There is a TODO nearby that hints at a proposed solution.

We believe we've seen this when requesting data right at a rollup boundary, where a race condition on the carbonstores determines whether we get their data at higher or lower resolution.
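A minimal sketch of the more tolerant behaviour the TODO seems to hint at (illustrative types, not the actual mergeValues signature): skip a response whose length doesn't match and keep trying the remaining ones, instead of aborting on the first mismatch.

package main

import "fmt"

type response struct {
	Values   []float64
	IsAbsent []bool
}

// mergeTolerant fills gaps in dst from any response whose length matches,
// skipping (rather than aborting on) responses of a different resolution.
func mergeTolerant(dst *response, others []response) {
	for _, o := range others {
		if len(o.Values) != len(dst.Values) {
			continue // length mismatch: skip this one, keep trying the rest
		}
		for i := range dst.Values {
			if dst.IsAbsent[i] && !o.IsAbsent[i] {
				dst.Values[i] = o.Values[i]
				dst.IsAbsent[i] = false
			}
		}
	}
}

func main() {
	dst := response{Values: []float64{1, 0, 3}, IsAbsent: []bool{false, true, false}}
	others := []response{
		{Values: []float64{9, 9}, IsAbsent: []bool{false, false}},            // wrong length, skipped
		{Values: []float64{1, 2, 3}, IsAbsent: []bool{false, false, false}},  // same length, fills the gap
	}
	mergeTolerant(&dst, others)
	fmt.Println(dst.Values) // [1 2 3]
}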

More intelligent ovalues mismatch resolution

Currently the zipper takes the first response received when there is a mismatch in granularity between the copies it receives. This is due to a configuration issue on the backends, but nevertheless it would be nice if the zipper could handle it more intelligently, so that we see more data and have less flapping (data is there, on the next load it isn't). For example, the zipper could return the copy that has the fewest missing values (as a percentage?) instead of the first one received.
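A minimal sketch of the "fewest missing values" selection suggested above (illustrative types, not the zipper's actual response structs):

package main

import "fmt"

type response struct {
	Name     string
	Values   []float64
	IsAbsent []bool
}

// pickMostComplete returns the copy with the smallest fraction of absent
// points, instead of simply taking the first response received.
func pickMostComplete(copies []response) response {
	best, bestRatio := copies[0], 2.0
	for _, c := range copies {
		absent := 0
		for _, a := range c.IsAbsent {
			if a {
				absent++
			}
		}
		ratio := float64(absent) / float64(len(c.IsAbsent))
		if ratio < bestRatio {
			best, bestRatio = c, ratio
		}
	}
	return best
}

func main() {
	a := response{Name: "60s copy", Values: make([]float64, 1440), IsAbsent: make([]bool, 1440)}
	b := response{Name: "10s copy", Values: make([]float64, 8640), IsAbsent: make([]bool, 8640)}
	for i := 0; i < 4000; i++ { // pretend the high-resolution copy has gaps
		b.IsAbsent[i] = true
	}
	fmt.Println(pickMostComplete([]response{b, a}).Name) // 60s copy
}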

Use of caching in carbonzipper together with sendGlobsAsIs in carbonapi leads to "timeout waiting for more responses"

Hi!

We have faced the following issue:
if we enable sendGlobsAsIs in carbonapi and set expireDelaySec>0 in carbonzipper, then a request with a relatively large number of metrics (about 65) fails in carbonzipper:

2017-09-11T16:16:56.296Z        INFO    access  request served  {"handler": "find", "format": "protobuf", "target": "dc1.collectd.deployment.some_server.*.cpu.*.percent.*", "carbonzipper_uuid": "3af6bea0-b0d6-4a06-92ae-b8bef43d4058", "carbonapi_uuid": "40f0310b-95e6-4ffc-8a56-42d5ead76677", "http_code": 200, "runtime_seconds": 0.003241923}

2017-09-11T16:17:01.315Z        WARN    render  timeout waiting for more responses      {"memory_usage_bytes": 0, "handler": "render", "carbonzipper_uuid": "ae19164a-3f97-48a5-8cc7-3807818944f1", "carbonapi_uuid": "40f0310b-95e6-4ffc-8a56-42d5ead76677", "handler": "multiGet", "uri": "/render/?format=protobuf&from=1504973816&target=dc1.collectd.deployment.some_server.%2A.cpu.%2A.percent.%2A&until=1505146616", "timeouted_servers": ["http://192.168.113.70:8083", "http://192.168.113.85:8083", ...], "answers_from_servers": []}

2017-09-11T16:17:01.315Z        ERROR   access  request failed  {"handler": "render", "carbonzipper_uuid": "ae19164a-3f97-48a5-8cc7-3807818944f1", "carbonapi_uuid": "40f0310b-95e6-4ffc-8a56-42d5ead76677", "format": "protobuf", "target": "dc1.collectd.deployment.some_server.*.cpu.*.percent.*", "memory_usage_bytes": 0, "reason": "No responses fetched from upstream", "http_code": 500, "runtime_seconds": 5.018973257}

2017-09-11T16:17:01.315Z        WARN    slow    Slow Request    {"time": 5.019006852, "url": "/render/?format=protobuf&from=1504973816&target=dc1.collectd.deployment.some_server.%2A.cpu.%2A.percent.%2A&until=1505146616"}

...
2017-09-11T16:17:01.335Z        ERROR   render  error reading body      {"memory_usage_bytes": 0, "handler": "render", "carbonzipper_uuid": "ae19164a-3f97-48a5-8cc7-3807818944f1", "carbonapi_uuid": "40f0310b-95e6-4ffc-8a56-42d5ead76677", "handler": "multiGet", "handler": "singleGet", "query": "http://192.168.113.85:8083//render/?format=protobuf&from=1504973816&target=dc1.collectd.deployment.some_server.%2A.cpu.%2A.percent.%2A&until=1505146616", "error": "context canceled"}
...

(afterStarted: "5s" in this example)

If the number of metrics is more than maxBatchSize (in carbonapi), everything is fine, which makes sense.

If we set expireDelaySec: 0 (with sendGlobsAsIs enabled) then carbonzipper works fine:

2017-09-11T16:17:25.266Z        INFO    access  request served  {"handler": "find", "format": "protobuf", "target": "dc1.collectd.deployment.some_server.*.cpu.*.percent.*", "carbonzipper_uuid": "cdf92d40-0e09-4630-b83d-c21234f6087f", "carbonapi_uuid": "0882c166-6878-4cd6-a670-eeadd962fc64", "http_code": 200, "runtime_seconds": 0.002556903}

2017-09-11T16:17:25.819Z        INFO    access  request served  {"handler": "render", "carbonzipper_uuid": "698b8799-8f99-4061-a4d4-1e4f86c0a70f", "carbonapi_uuid": "0882c166-6878-4cd6-a670-eeadd962fc64", "format": "protobuf", "target": "dc1.collectd.deployment.some_server.*.cpu.*.percent.*", "memory_usage_bytes": 9959448, "http_code": 200, "runtime_seconds": 0.552167464}

And as you can see runtime_seconds is relatively small.

Could you please help with this issue?

Thanks in advance.

extract logging code

Currently the trivial logging code is duplicated across each program in the zipper stack. Extract it from one and import it into all the others.

format=pickle returns empty data set

Pointing graphite-web at carbonzipper via CLUSTER_SERVERS in local_settings.py

2017/01/26 20:53:06 request:  /metrics/find/?local=1&format=pickle&query=%2A
2017/01/26 20:53:06 querying servers= [http://10.20.14.4:8080 http://10.20.28.2:8080] uri= /metrics/find/?format=protobuf&local=1&query=%2A

Receiving an empty response and no debugging information. If I manually query the carbonserver instances for format={pickle,protobuf,json} I get valid data.

If I change:

2017/01/26 20:53:06 request:  /metrics/find/?local=1&format=pickle&query=%2A

TO

2017/01/26 20:53:06 request:  /metrics/find/?local=1&format=json&query=%2A

I get data.

I just pulled the latest master and built using golang 1.7.4 from golang.org.

Running carbonzipper with carbonserver on custom ports not working?

Carbonzipper and carbonserver from latest master.

I would like to test this setup next to a standard graphite-web setup, but I have some problems.

2015/08/30 13:27:14 starting carbonzipper (development version)
2015/08/30 13:27:14 setting GOMAXPROCS= 2
2015/08/30 13:27:14 Using graphite host loadbalancer-graphite-instances:2013
2015/08/30 13:27:14 listening on :8080
2015/08/30 13:27:14 singleGet: error querying  10.1.16.128:8080 / /metrics/find/?format=protobuf&query=%2A : unsupported protocol scheme ""
2015/08/30 13:33:04 singleGet: error querying  10.1.16.128:8080 / /render/?format=protobuf&from=1440940588&now=1440941188&target=diamond.fluentd.%2A.%2A.processcpu.fluentd&until=1440941188 : unsupported protocol scheme ""
2015/08/30 13:33:04 render: error querying backends for: /render/?target=diamond.fluentd.*.*.processcpu.fluentd&format=pickle&from=1440940588&until=1440941188&now=1440941188 backends: [10.1.16.128:8080]

Config for carbonzipper running on graphite-web/api instances:

{
  "Backends": [
    "10.1.16.128:8080"
  ],
  "GraphiteHost": "loadbalancer-graphite-instances:2013",
  "MaxProcs": 2,
  "Port":     8080,
  "Buckets":  10,
  "TimeoutMs": 10000,
  "TimeoutMsAfterAllStarted": 2000,

  "MaxIdleConnsPerHost": 100
}

Communication is working to carbonserver from carbonzipper hosts:

telnet 10.1.16.128 8080
Trying 10.1.16.128...
Connected to 10.1.16.128.
Escape character is '^]'.

And nothing in carbonserver logs:

2015/08/30 13:12:21 starting carbonserver (development build)
2015/08/30 13:12:21 reading whisper files from: /opt/graphite/storage/whisper
2015/08/30 13:12:21 maximum brace expansion set to: 10
2015/08/30 13:12:21 set GOMAXPROCS=2
2015/08/30 13:12:21 listening on :8080 

Carbonserver is running with these parameters on the graphite instances that hold the whisper files (different instances than carbonzipper):

/usr/bin/carbonserver -logdir=/var/log/carbonserver/ -maxexpand=10 -maxprocs=2 -p=8080 -scanfreq=0 -w=/opt/graphite/storage/whisper -stdout=true -vv=true

From what I was able to gather from the code, the configs should look like I describe. What is wrong with this setup? Any suggestions?
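For what it's worth, Go's HTTP client returns exactly this "unsupported protocol scheme \"\"" error when the request URL has no scheme, so one thing to check is whether the backend entries need an explicit http:// prefix (a guess, not a confirmed fix). A minimal standalone reproduction:

package main

import (
	"fmt"
	"net/http"
	"net/url"
)

func main() {
	// If the configured backend is just "10.1.16.128:8080", the resulting
	// request URL has an empty scheme and the Go HTTP client refuses it:
	u := url.URL{Host: "10.1.16.128:8080", Path: "/metrics/find/", RawQuery: "format=protobuf&query=%2A"}
	_, err := http.Get(u.String())
	fmt.Println(err) // ... unsupported protocol scheme ""

	// With an explicit scheme the request is at least attempted:
	u.Scheme = "http"
	_, err = http.Get(u.String())
	fmt.Println(err)
}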

Limit number of concurrent connections per backend

The carbonapi can generate very large queries which can easily overwhelm the backend. We need a way to limit the damage it can do, probably by having a maximum number of outbound requests for a particular storage instance.
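A minimal sketch of one way to do that (hypothetical, not the zipper's actual limiter): a buffered channel used as a counting semaphore per backend, so no more than a fixed number of requests are in flight to one storage instance at a time.

package main

import (
	"fmt"
	"net/http"
	"sync"
)

// limitedBackend caps the number of in-flight requests to one storage host.
type limitedBackend struct {
	base string
	sem  chan struct{} // counting semaphore
}

func newLimitedBackend(base string, max int) *limitedBackend {
	return &limitedBackend{base: base, sem: make(chan struct{}, max)}
}

func (b *limitedBackend) get(uri string) (*http.Response, error) {
	b.sem <- struct{}{}        // acquire a slot (blocks when max requests are in flight)
	defer func() { <-b.sem }() // release it
	return http.Get(b.base + uri)
}

func main() {
	be := newLimitedBackend("http://127.0.0.1:8080", 4)
	var wg sync.WaitGroup
	for i := 0; i < 16; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if resp, err := be.get("/metrics/find/?format=protobuf&query=%2A"); err == nil {
				resp.Body.Close()
			} else {
				fmt.Println(err)
			}
		}()
	}
	wg.Wait()
}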

fetchCarbonsearchResponse should have retry or fallback logic

It either needs to retry the query to carbonsearch, or fail the parent query if carbonsearch fails to respond in time or dies in the middle of a query.
One other option could be to fall back to another carbonsearch instance (config change needed).
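A minimal sketch of the retry-then-fallback idea (hypothetical names and endpoints, not the current fetchCarbonsearchResponse):

package main

import (
	"errors"
	"fmt"
	"io/ioutil"
	"net/http"
	"time"
)

// fetchWithFallback tries each carbonsearch instance in turn, retrying once
// per instance, and only fails the parent query when all of them fail.
func fetchWithFallback(instances []string, uri string, timeout time.Duration) ([]byte, error) {
	client := &http.Client{Timeout: timeout}
	var lastErr error
	for _, base := range instances {
		for attempt := 0; attempt < 2; attempt++ {
			resp, err := client.Get(base + uri)
			if err != nil {
				lastErr = err
				continue
			}
			body, err := ioutil.ReadAll(resp.Body)
			resp.Body.Close()
			if err != nil {
				lastErr = err
				continue
			}
			return body, nil
		}
	}
	if lastErr == nil {
		lastErr = errors.New("no carbonsearch instances configured")
	}
	return nil, lastErr
}

func main() {
	body, err := fetchWithFallback(
		[]string{"http://search-1:8070", "http://search-2:8070"}, // illustrative instance names
		"/metrics/find/?format=protobuf&query=servers.%2A",
		2*time.Second,
	)
	fmt.Println(len(body), err)
}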
