devshawn / kafka-connect-healthcheck
A simple healthcheck wrapper to monitor Kafka Connect.
License: Apache License 2.0
Currently, it monitors all connectors and tasks for a given worker or, if no worker is given, all connectors.
It would be nice to be able to pass in a list of connectors to monitor.
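The request above could be sketched roughly as follows. This is only an illustration, not the project's implementation: the function name `select_connectors` and the idea of passing the list as a comma-separated string are assumptions made for the example.

```python
from typing import Iterable, List, Optional


def select_connectors(all_connectors: Iterable[str],
                      monitored: Optional[str] = None) -> List[str]:
    """Return the connectors to check.

    `monitored` is a hypothetical comma-separated allowlist such as
    "sink-a, source-b". When it is missing or empty, every connector
    is kept, which matches the current "monitor everything" behavior.
    """
    names = list(all_connectors)
    if not monitored:
        return names
    # Ignore stray whitespace and empty entries such as "a,,b".
    wanted = {name.strip() for name in monitored.split(",") if name.strip()}
    if not wanted:
        return names
    # Preserve the order in which the connectors were reported.
    return [name for name in names if name in wanted]
```

A connector named in the allowlist but not present on the cluster is silently skipped here; a real implementation might prefer to report that as unhealthy.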
When I run kafka-connect-healthcheck I get "kafka-connect-healthcheck: command not found". How do I fix it?
When a pod has trouble connecting to the Kafka server due to "consumer poll timeout has expired", kafka-connect-healthcheck continues to report a "live" status.
2020-08-13 17:32:45,750 INFO || [Worker clientId=connect-1, groupId=1] Member connect-1-85bcb553-57c6-4e0b-8131-234973432ed5 sending LeaveGroup request to coordinator REDACTED (id: 2147483644 rack: null) due to consumer poll timeout has expired. This means the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time processing messages. You can address this either by increasing max.poll.interval.ms or by reducing the maximum size of batches returned in poll() with max.poll.records. [org.apache.kafka.clients.consumer.internals.AbstractCoordinator]
This appears to happen because the HTTP server reports no errors when the /status endpoints are queried for all of the connectors, and because the worker is not assigned to any connectors.
However, hitting a "details" endpoint for a connector leads to an HTTP 500 timeout error.
One way to address this issue would be to attempt to fetch the details for at least one connector (which seems to fail in the above situation). I will open a PR that implements this.
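A minimal sketch of that idea, assuming the "details" endpoint is the Kafka Connect REST call `GET /connectors/{name}` and that `GET /connectors` returns a JSON array of connector names. The function names `http_get` and `details_probe` are hypothetical and are not taken from the project or from the PR mentioned above.

```python
import json
import urllib.error
import urllib.parse
import urllib.request
from typing import Callable, Tuple


def http_get(url: str, timeout: float = 5.0) -> Tuple[int, str]:
    """Return (status_code, body) for a GET request.

    HTTP error responses (such as 500) are returned rather than raised.
    Connection-level failures (urllib.error.URLError) still propagate.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as exc:
        return exc.code, exc.read().decode("utf-8", errors="replace")


def details_probe(base_url: str,
                  get: Callable[[str], Tuple[int, str]] = http_get) -> dict:
    """Fetch the details of one connector and report whether that worked."""
    status, body = get(f"{base_url}/connectors")
    if status != 200:
        return {"healthy": False,
                "message": f"GET /connectors returned HTTP {status}"}

    names = json.loads(body)
    if not names:
        # Nothing to probe, so this check cannot detect a problem.
        return {"healthy": True, "message": "No connectors to probe."}

    # Probe only the first connector; one failing request is enough
    # to expose the situation described above.
    name = urllib.parse.quote(names[0], safe="")
    status, _ = get(f"{base_url}/connectors/{name}")
    if status != 200:
        return {"healthy": False,
                "message": f"GET /connectors/{names[0]} returned HTTP {status}"}
    return {"healthy": True, "message": "Connector details fetched."}
```

The `get` parameter exists so the probe can be exercised without a running Connect worker, for example by passing a function that returns `(500, "timeout")` for the details URL.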
{"healthy": false, "message": "Exception raised while attempting to calculate health result, assuming unhealthy.", "error": "HTTPConnectionPool(host='localhost', port=8083): Max retries exceeded with url: /connectors (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f685e6b8950>: Failed to establish a new connection: [Errno 111] Connection refused'))", "failure_states": ["FAILED"]}
Note: I can directly query the status of connectors on port 8083.
Hi, I was wondering if there is a way to configure the maximum number of retries, perhaps through an environment variable.
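One way such a setting might look is sketched below. The environment variable name `HEALTHCHECK_MAX_RETRIES`, the default of 3, and both function names are assumptions made for illustration; the source does not say that the tool reads any such variable.

```python
import os
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def max_retries_from_env(var: str = "HEALTHCHECK_MAX_RETRIES",
                         default: int = 3) -> int:
    """Read a retry count from a (hypothetical) environment variable.

    Falls back to `default` when the variable is unset, not an
    integer, or negative.
    """
    raw = os.environ.get(var, "")
    try:
        value = int(raw)
    except ValueError:
        return default
    return value if value >= 0 else default


def call_with_retries(func: Callable[[], T], retries: int,
                      delay: float = 0.0) -> T:
    """Call `func`, retrying up to `retries` extra times on any exception."""
    attempts = retries + 1
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            # Re-raise once the final attempt has failed.
            if attempt == attempts - 1:
                raise
            time.sleep(delay)
    raise RuntimeError("unreachable: retries must be >= 0")
```

A health check could then wrap its request to Kafka Connect as `call_with_retries(do_request, max_retries_from_env())`, where `do_request` stands for whatever function performs the call.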
Would love to see this support a healthz endpoint
https://kubernetes.io/docs/reference/using-api/health-checks/