Which Faktory package and version? 1.6.1 Which Faktory worker package a

Duplicated scheduled jobs with multiple faktory servers (closed)

alondahari commented on August 20, 2024

Duplicated scheduled jobs with multiple faktory servers

from faktory.

Comments (13)

mperham commented on August 20, 2024

Faktory doesn't support multi-server clustering. You can set up Redis replicas for data redundancy but only one Faktory can be operational at a time.

alondahari commented on August 20, 2024

Oh that's not great. It's a bit scary to have one point of failure like that. I believe we contacted you about this earlier and you said it would be possible to have two servers behind a load balancer, but I'll double check that.

mperham commented on August 20, 2024

You can, but they need to run completely independently. For instance, if you have two with the same configuration, they will both fire cron jobs. There's no concept of primary/standby or cluster leadership.

alondahari commented on August 20, 2024

Hi again, we updated our configuration to have "primary" and "secondary" servers, with only the primary scheduling cron jobs. This seems to work for scheduling jobs, but for some reason we're seeing jobs complete and then being enqueued again, about half an hour after the job completed. Any ideas about what would be the cause of that? The servers are sharing one Redis (ElastiCache) instance.

Any help would be most appreciated!

mperham commented on August 20, 2024

I think 30 minutes is the default job reservation time. If you fetch job A from server 1 but don't ack it within 30 minutes, Faktory will re-enqueue job A for re-execution under the assumption that the worker failed somehow.
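
The reservation behaviour described above can be sketched as a toy model. This is only an illustration under stated assumptions, not Faktory's implementation: `ToyServer`, `fetch`, `ack`, and `reap` are hypothetical names, and time is passed in explicitly as seconds so the example is deterministic.

```python
from dataclasses import dataclass, field

# 30 minutes, the default reservation time suggested in the comment above.
DEFAULT_RESERVE_FOR = 1800


@dataclass
class ToyServer:
    """Toy model of reservation tracking (hypothetical, not Faktory's code)."""
    queue: list = field(default_factory=list)
    working: dict = field(default_factory=dict)  # jid -> reservation deadline

    def fetch(self, now, reserve_for=DEFAULT_RESERVE_FOR):
        """Hand out the next job and start its reservation clock."""
        if not self.queue:
            return None
        jid = self.queue.pop(0)
        self.working[jid] = now + reserve_for
        return jid

    def ack(self, jid):
        """Mark a reserved job as done; unknown JIDs are an error."""
        if jid not in self.working:
            raise LookupError(f"no reservation for {jid}")
        del self.working[jid]

    def reap(self, now):
        """Re-enqueue every job whose reservation expired without an ack."""
        expired = [jid for jid, deadline in self.working.items() if deadline <= now]
        for jid in expired:
            del self.working[jid]
            self.queue.append(jid)
        return expired


server = ToyServer(queue=["job-a"])
server.fetch(now=0)              # worker takes job-a and never acks it
print(server.reap(now=1799))     # still reserved
print(server.reap(now=1800))     # reservation expired, job is back in the queue
```

In this model a job that finishes but is never acknowledged looks exactly like a crashed worker, which matches the "enqueued again about half an hour later" symptom.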

alondahari commented on August 20, 2024

Interesting. I wonder why it's not sending an ACK then. It also doesn't seem like it should be caused by a multi-server setup.

mperham commented on August 20, 2024

Is it possible your worker is FETCHing from Server 1 but ACKing to Server 2?

alondahari commented on August 20, 2024

It is possible, but wouldn't it update the same Redis record?

mperham commented on August 20, 2024

Faktory keeps a list of outstanding jobs in memory, keyed by JID. If it's not in its records, it'll return an error. You can argue that Faktory should go to Redis for this operation but as Faktory is not designed to run in parallel, that's the way it works today.
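
To make the failure mode concrete, here is a minimal sketch of two server processes that share one queue store but each keep their own in-memory set of outstanding JIDs. All names (`SharedStore`, `ToyServer`) are hypothetical and the model is deliberately simplified; it only illustrates why an ACK sent to the other server cannot succeed.

```python
class SharedStore:
    """Stands in for the single Redis instance both servers point at."""

    def __init__(self, jids):
        self.queue = list(jids)


class ToyServer:
    """Hypothetical server: shared queue, but per-process reservation records."""

    def __init__(self, name, store):
        self.name = name
        self.store = store    # shared between servers
        self.working = set()  # in memory only, keyed by JID

    def fetch(self):
        jid = self.store.queue.pop(0)
        self.working.add(jid)
        return jid

    def ack(self, jid):
        if jid not in self.working:
            # This server never handed the job out, so it has no record of it.
            raise LookupError(f"{self.name}: no reservation for {jid}")
        self.working.remove(jid)


store = SharedStore(["job-a"])
server1 = ToyServer("server-1", store)
server2 = ToyServer("server-2", store)

jid = server1.fetch()          # FETCH goes to server 1
try:
    server2.ack(jid)           # ACK is routed to server 2 by the load balancer
except LookupError as err:
    print("ack rejected:", err)

print(jid in server1.working)  # server 1 still thinks the job is outstanding
```

Because server 1 never sees the ACK, its reservation eventually times out and the job is re-enqueued, even though the worker completed it.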

alondahari commented on August 20, 2024

Hmmm, I'm not sure how to proceed here then... Would it be possible to change the implementation there? Do you agree that there should be a capability of having redundancy on the Faktory server?

mperham commented on August 20, 2024

Faktory does not support clustering. You can use Redis replicas to get a real-time backup. I'm not planning any changes here; redundancy is worthwhile but would add a tremendous amount of complexity to the system. Are you trying to solve a real problem or an imaginary one? Is Faktory reliability really an issue for you?

alondahari commented on August 20, 2024

It's not an issue we faced, but I wouldn't call it imaginary. We have redundancy with almost every part of our infrastructure, being proactive about mitigating issues rather than waiting until something fails.

Faktory reliability is an issue because, if the server goes down, enqueueing jobs will fail, which will result in errors affecting our users directly.

Our other option is to have a fallback of saving jobs to another database if the enqueueing fails, but I would consider that a hacky workaround. Open to any suggestions you might have.

mperham commented on August 20, 2024

The one option you can roll yourself is two shards. You can run two independent Faktorys in two different AZs/DCs and have your clients use some algorithm to choose their nearest Faktory. If that Faktory is unavailable, they can push to the backup Faktory.

But at the end of the day, Faktory is a primary datastore. Just as you don't really expect your app to be usable when Postgres goes down, you should expect similar semantics here. I haven't had a report of Faktory crashing in over a year now? So I hope you can test and verify Faktory's reliability.
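
The two-shard suggestion above could look roughly like the following on the client side. This is a sketch under assumptions: `push_with_failover`, `ShardUnavailable`, and the stub push functions are hypothetical, and a real client would wrap whatever push call its Faktory library provides and translate connection errors into `ShardUnavailable`.

```python
class ShardUnavailable(Exception):
    """Raised by a push function when its shard cannot be reached."""


def push_with_failover(job, shards):
    """Try each (name, push) pair in preference order.

    Returns the name of the shard that accepted the job, or raises
    ShardUnavailable if every shard failed.
    """
    errors = []
    for name, push in shards:
        try:
            push(job)
            return name
        except ShardUnavailable as exc:
            errors.append(f"{name}: {exc}")
    raise ShardUnavailable("all shards failed: " + "; ".join(errors))


# Stub shards for illustration: the nearest one is down, the backup is up.
accepted = []


def nearest_push(job):
    raise ShardUnavailable("connection refused")


def backup_push(job):
    accepted.append(job)


job = {"jid": "job-a", "jobtype": "SendEmail", "args": [42]}
used = push_with_failover(job, [("nearest", nearest_push), ("backup", backup_push)])
print(used, accepted)
```

Since the shards are independent, each worker would need to fetch from both and send each ACK back to the shard the job came from; otherwise the FETCH/ACK mismatch discussed earlier in the thread reappears.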
