Comments (13)

Artimi avatar Artimi commented on May 11, 2024 1

I'm having another issue where I would expect the client to reconnect. Sometimes the connection gets into a weird state in which I cannot get any answer from the node, so I get KafkaTimeoutError. This results in an endless heartbeat loop, because the coordinator calls GroupCoordinator.ensure_coordinator_known and it fails while sending the request. This repeats every 40.1 s because of the timeout AIOKafkaConnection._request_timeout_ms == 40000 plus GroupCoordinator._retry_backoff_ms == 100. It seems that AIOKafkaConnection.connected() still returns true, because the connection is not deleted in Client.send() (via ready and _get_conn).
I think the client should reconnect on KafkaTimeoutError, because as it stands it just stays disconnected and we have to restart the service. After a restart everything runs smoothly until KafkaTimeoutError appears again.

Here is an excerpt from our logs:

March 28th 2017, 19:17:04.985    Error sending GroupCoordinatorRequest_v0 to node 0 [KafkaTimeoutError] -- marking coordinator dead
March 28th 2017, 19:17:04.986    Group Coordinator Request failed: KafkaTimeoutError
March 28th 2017, 19:17:45.088    Error sending GroupCoordinatorRequest_v0 to node 0 [KafkaTimeoutError] -- marking coordinator dead
March 28th 2017, 19:17:45.088    Group Coordinator Request failed: KafkaTimeoutError
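
The 40.1 s period can be checked against the two timestamps in the excerpt. The following is only a small arithmetic sketch: the two constants are the values quoted above, and the helper names are made up for illustration.

```python
from datetime import datetime

# Values quoted in the comment above (not read from aiokafka itself).
REQUEST_TIMEOUT_MS = 40000  # AIOKafkaConnection._request_timeout_ms
RETRY_BACKOFF_MS = 100      # GroupCoordinator._retry_backoff_ms


def expected_period_s(request_timeout_ms, retry_backoff_ms):
    """One failed attempt lasts the request timeout plus the retry backoff."""
    return (request_timeout_ms + retry_backoff_ms) / 1000


def observed_period_s(first, second):
    """Seconds between two same-day log timestamps of the form HH:MM:SS.mmm."""
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(second, fmt) - datetime.strptime(first, fmt)
    return delta.total_seconds()


expected = expected_period_s(REQUEST_TIMEOUT_MS, RETRY_BACKOFF_MS)
observed = observed_period_s("19:17:04.985", "19:17:45.088")
print(expected, observed)
```

The few extra milliseconds in the observed gap would be consistent with ordinary scheduling overhead between attempts.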

from aiokafka.

tvoinarovskyi avatar tvoinarovskyi commented on May 11, 2024

Hey there, that's new, it should just reconnect. How do you actually reproduce this?

To answer your question of where it reconnects:

Artimi avatar Artimi commented on May 11, 2024

Ah, I'm sorry for the wrong conclusion. It wasn't working for me because I didn't catch exceptions from get() and commit(). It seems that aiokafka is working flawlessly. The only thing that confused me is the missing reconnect.backoff.ms. I searched for it in the codebase and didn't find it, so I concluded that aiokafka does not reconnect. Sorry for my mistake and thanks for your response.
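
For readers hitting the same symptom, here is a rough sketch of a consume loop that survives a failed call instead of dying on the first exception. Everything in it is hypothetical: FlakyConsumer, TransientError and consume are stand-ins written for this example, not aiokafka classes, and a real loop would catch whichever error types the client actually raises.

```python
import asyncio


class TransientError(Exception):
    """Hypothetical stand-in for a recoverable client error such as a timeout."""


class FlakyConsumer:
    """Hypothetical consumer stub: get() fails once, on the call numbered fail_on."""

    def __init__(self, messages, fail_on):
        self._messages = list(messages)
        self._calls = 0
        self._fail_on = fail_on
        self.committed = 0

    async def get(self):
        self._calls += 1
        if self._calls == self._fail_on:
            raise TransientError("request timed out")
        return self._messages.pop(0)

    async def commit(self):
        self.committed += 1


async def consume(consumer, count, backoff_s=0.01):
    """Handle `count` messages, backing off and retrying after a transient error."""
    handled = []
    while len(handled) < count:
        try:
            message = await consumer.get()
            handled.append(message)
            await consumer.commit()
        except TransientError:
            # Without this handler the loop would stop at the first failure.
            await asyncio.sleep(backoff_s)
    return handled


consumer = FlakyConsumer(["a", "b", "c"], fail_on=2)
handled = asyncio.run(consume(consumer, 3))
print(handled, consumer.committed)
```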

tvoinarovskyi avatar tvoinarovskyi commented on May 11, 2024

@Artimi NP). Glad it was resolved.
Somehow we end up using retry_backoff_ms for both reconnect and error backoff. Don't know if that's a big issue yet.

tvoinarovskyi avatar tvoinarovskyi commented on May 11, 2024

Could you provide the full configuration for the consumer?

Artimi avatar Artimi commented on May 11, 2024

Sure:

self._consumer = aiokafka.AIOKafkaConsumer(
    loop=loop,
    bootstrap_servers=hosts,
    group_id=consumer_group,
    enable_auto_commit=False,
    auto_offset_reset='earliest',
    heartbeat_interval_ms=3 * 1000,
    metadata_max_age_ms=30 * 1000,
)

tvoinarovskyi avatar tvoinarovskyi commented on May 11, 2024

Then where did you get GroupCoordinator._retry_backoff_ms == 10000?

Artimi avatar Artimi commented on May 11, 2024

You are right, sorry. It should be 100, which is the default value. I edited the comment.

tvoinarovskyi avatar tvoinarovskyi commented on May 11, 2024

So to sum it up, there is a case where Kafka can still have the socket in an open state, but requests time out. That's weird, but probably possible. So the fix should be to discard connections that have timed out.

Artimi avatar Artimi commented on May 11, 2024

Yes, exactly. Just delete the connection if a request times out and reconnect. This should do no harm, right?
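
The idea of dropping a connection after a timeout can be sketched roughly as follows. This is not the actual aiokafka change: FakeConnection and FakeClient are hypothetical stand-ins, and the only assumption carried over from the thread is that a client caches one connection per node and keeps reusing it while it looks connected.

```python
import asyncio


class FakeConnection:
    """Hypothetical stand-in for a broker connection; not aiokafka's class."""

    def __init__(self, hang):
        self._hang = hang
        self.closed = False

    async def send(self, request):
        if self._hang:
            # Socket looks open, but no reply ever arrives.
            await asyncio.sleep(3600)
        return "response to " + request

    def close(self):
        self.closed = True


class FakeClient:
    """Hypothetical client that caches one connection per node id."""

    def __init__(self, request_timeout_s):
        self._conns = {}
        self._request_timeout_s = request_timeout_s
        self.connects = 0

    def _get_conn(self, node_id):
        if node_id not in self._conns:
            # Only the very first connection hangs, to simulate a stale socket.
            self._conns[node_id] = FakeConnection(hang=(self.connects == 0))
            self.connects += 1
        return self._conns[node_id]

    async def send(self, node_id, request):
        conn = self._get_conn(node_id)
        try:
            return await asyncio.wait_for(conn.send(request), self._request_timeout_s)
        except asyncio.TimeoutError:
            # Discard the timed-out connection so the next call reconnects.
            conn.close()
            self._conns.pop(node_id, None)
            raise


async def demo():
    client = FakeClient(request_timeout_s=0.05)
    first = "ok"
    try:
        await client.send(0, "GroupCoordinatorRequest")
    except asyncio.TimeoutError:
        first = "timeout"
    # The cached connection was dropped, so this call opens a fresh one.
    second = await client.send(0, "GroupCoordinatorRequest")
    return first, second, client.connects


result = asyncio.run(demo())
print(result)
```

In this sketch the error is still re-raised to the caller, so existing retry and backoff logic keeps working; the only difference is that the retry no longer reuses the stale connection.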

Artimi avatar Artimi commented on May 11, 2024

Hey @Drizzt1991, thanks for the merge. I just wanted to ask when you plan to make the next release.

tvoinarovskyi avatar tvoinarovskyi commented on May 11, 2024

@Artimi Many thanks for submitting issues, it really helps. The release will be this week, probably on the weekend.

tvoinarovskyi avatar tvoinarovskyi commented on May 11, 2024

@Artimi The release is in progress. It should be OK as of 0.2.2.
