
Comments (30)

foxx avatar foxx commented on May 24, 2024 13

Or you could drop Django :)

from django-rq.

aaugustin avatar aaugustin commented on May 24, 2024 8

FWIW I've been using this for years:

from django.db import connection
from rq.worker import Worker


class ConnectionClosingWorker(Worker):
    """
    RQ Worker that closes the database connection before forking.

    See also https://github.com/ui/django-rq/issues/17

    """

    def execute_job(self, job, queue):
        connection.close()
        super().execute_job(job, queue)

It's been too long since I wrote it to remember the details, but it solved this issue and hasn't caused problems in a largish rq deploy. By "largish", I mean "we're moving off Heroku because we hit their 300-dyno limit".

from django-rq.

selwin avatar selwin commented on May 24, 2024 1

I've been doing some more thinking about this; here are a few ideas:

Alternative 1

Writing a wrapper function for async functions - this doesn't require any changes to RQ but could be tedious if every single enqueued django function on Heroku needs to explicitly close db connection.

from django.db import connection

def func():
    pass  # the actual job goes here

def rq_func():
    func()
    connection.close()

queue.enqueue(rq_func)
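To make Alternative 1 less tedious, the close call could be factored into a decorator. This is only a sketch: `close_after` and `FakeConnection` are hypothetical names, and the stand-in object replaces `django.db.connection` so that the snippet runs without Django. In a real project the callable would presumably be something like `lambda: connection.close()`.

```python
import functools


def close_after(close):
    """Build a decorator that calls `close()` once the wrapped job returns or raises."""
    def decorator(func):
        # wraps() keeps __module__/__qualname__, so the job can still be found by name
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            finally:
                close()  # runs on success and on failure
        return wrapper
    return decorator


class FakeConnection:
    """Stand-in for django.db.connection (assumption: only close() matters here)."""

    def __init__(self):
        self.close_calls = 0

    def close(self):
        self.close_calls += 1


fake_connection = FakeConnection()


@close_after(lambda: fake_connection.close())
def add(a, b):
    return a + b


@close_after(lambda: fake_connection.close())
def fail():
    raise ValueError("job failed")
```

Unlike the plain wrapper above, the `finally` clause also closes the connection when the job raises.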

Alternative 2

Implementing a job callback pattern. However, this idea was rejected in rq/rq#155, and I'm also not a big fan of the notion of having callbacks in RQ.

queue.enqueue(f, callback=close_connection)

Alternative 3

Add a job_class argument to Worker to allow the use of custom job classes. In this case, Django RQ can define its own Job class to deal with this issue. It would look something like:

from rq.job import Job
from django.db import connection

class ConnectionClosingJob(Job):

    def perform(self):
        result = super(ConnectionClosingJob, self).perform()
        connection.close()
        return result

# Somewhere in Django-RQ

worker = Worker(connection=connection, job_class=ConnectionClosingJob)

@nvie, what do you think?

from django-rq.

selwin avatar selwin commented on May 24, 2024

I've never written a Heroku application before so this will be hard for me to debug. Does explicitly closing the connection on your queued function fix it? I did a little bit of googling and it seems that this problem is not specific to Django/RQ. There's a resque thread discussing the same problem https://github.com/defunkt/resque/issues/367

from django-rq.

davecap avatar davecap commented on May 24, 2024

Yea I saw that thread too... I'm surprised I haven't seen any traces of this problem with Django though. I was kind-of wondering if I could implement a similar fix to the gem mentioned in that thread. What they're doing is adding the disconnect on some sort of "after_job" callback. Does such a thing exist with RQ?

from django-rq.

selwin avatar selwin commented on May 24, 2024

No, there's no such thing as "after_job" in RQ. We can propose adding one, as well as the ability to use a different worker class since this is already in RQ's roadmap (implementing concurrent workers).

However, we first need to make sure that the proposed solution does indeed solve the problem. Can you check whether running "close_connection" in the function being queued fixes this?

from django-rq.

davecap avatar davecap commented on May 24, 2024

Yep. I just tested it on one of my async functions and I don't see the EOF errors.

from django-rq.

acjay avatar acjay commented on May 24, 2024

I'm experiencing the same issue. I'm glad I checked, because I had no idea what the cause of the issue was. I think alternatives 2 or 3 would be good. I would think of it less as a callback than as a hook. I would ideally like to do other post-job housekeeping.

from django-rq.

acjay avatar acjay commented on May 24, 2024

Just to add, I'm not sure Alternative 1 really works when func() is a bound method. In my use case, I'm binding a method at run time and queuing that method directly. I'm refactoring to try to pass a wrapper function instead that will recover the method reference job-side and do the necessary cleanup, but it's really gumming up the code.

from django-rq.

nvie avatar nvie commented on May 24, 2024

I don't understand the problem in the first place. Stripping away django-rq and RQ from the equation, what's the core of the problem we're looking at here? Isn't it that Django implicitly opens a connection, but never explicitly closes it until its Python process is terminated (which never happens in an RQ worker situation)?

from django-rq.

davecap avatar davecap commented on May 24, 2024

From what I understand it's a harmless postgres error message that occurs when no Terminate message is received. I wonder if enabling "autocommit" mode for the psycopg2 DB backend would fix it.

from django-rq.

acjay avatar acjay commented on May 24, 2024

@nvie That's one manifestation of the problem, but to me, the overall problem is that I want to be able to do other post-task work. In my case, I'm manually tracking tasks in flight so that I can prevent duplicate tasks with the same arguments from running. In the current paradigm, I have to schedule a wrapper function that calls the actual task function and then performs this cleanup. But since there are multiple actual task functions that might be called, and since I can't pass function references as arguments, my wrapper function has to do some serious gymnastics to recreate the task function reference worker-side. The whole thing would be much easier if there were hooks for post-task work. I really like Alternative 3, personally.
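For readers wondering what such a wrapper might look like, here is a rough sketch of the "recreate the reference worker-side" approach. It assumes the task is a module-level callable identified by a dotted path; `resolve`, `run_tracked` and `finished` are hypothetical names, not part of RQ or django-rq, and bound methods are not handled.

```python
import importlib

# Hypothetical bookkeeping: dotted paths of tasks that have finished.
finished = []


def resolve(dotted_path):
    """Turn 'package.module.attr' into the object it names."""
    module_path, _, attr = dotted_path.rpartition(".")
    return getattr(importlib.import_module(module_path), attr)


def run_tracked(task_path, *args, **kwargs):
    """Look up the task by name, run it, then always do the post-task work."""
    task = resolve(task_path)
    try:
        return task(*args, **kwargs)
    finally:
        # Post-task housekeeping (e.g. clearing an "in flight" marker) goes here.
        finished.append(task_path)
```

The wrapper itself would be the function handed to the queue, with the dotted path passed as a plain string argument, so only one generic wrapper is needed regardless of how many task functions exist.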

from django-rq.

fangsterr avatar fangsterr commented on May 24, 2024

The problem I believe is that RQ forks its worker processes from the main process, and django shares its database connection from the parent process to the child process. http://stackoverflow.com/questions/8242837/django-multiprocessing-and-database-connections sums up the problem pretty well.
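The sharing described here can be reproduced without Django or PostgreSQL. The sketch below is an illustration under stated assumptions: it is POSIX-only, `shared_socket_demo` is a hypothetical name, and a `socket.socketpair()` stands in for the database connection and the database server.

```python
import os
import socket


def shared_socket_demo():
    """Fork with an open socket and return what the peer receives."""
    # `client` stands in for a database connection opened before the fork;
    # `server` plays the role of the database backend.
    client, server = socket.socketpair()

    pid = os.fork()
    if pid == 0:
        # Child: no new connection was made, yet `client` is usable here
        # because fork() copied the file descriptor.
        try:
            client.sendall(b"child;")
        finally:
            os._exit(0)  # exit right away without running the parent's cleanup

    os.waitpid(pid, 0)          # wait until the child has written and exited
    client.sendall(b"parent;")  # the parent writes to the very same socket

    expected = len(b"child;parent;")
    received = b""
    while len(received) < expected:
        chunk = server.recv(64)
        if not chunk:
            break
        received += chunk

    client.close()
    server.close()
    return received
```

The peer sees one byte stream and cannot tell which process wrote which part. With a real database connection, two processes talking over the same socket is what confuses the protocol, which is why the suggestions in this thread revolve around closing the connection before forking.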

from django-rq.

selwin avatar selwin commented on May 24, 2024

We have decided to implement the fix in RQ itself instead of django-rq. RQ will allow the user to use a custom Worker class, which we can then use to tell the worker to close the db connection before forking.

There's an open issue tracking this feature at rq/rq#233. Feel free to contribute or chime in on the discussion in that thread.

from django-rq.

abulte avatar abulte commented on May 24, 2024

I came here because I was worried about the "could not receive data from client: Connection reset by peer" message from my django/django-rq app running on Heroku.

I applied the tip described here to create "clean" connections when the worker is launched and avoid the "forked connections" syndrome that seems to be the root cause: rq/rq#233 (comment)

But I still get the connection reset message on Heroku. Did I miss something?

from django-rq.

selwin avatar selwin commented on May 24, 2024

@abulte mind showing me a snippet of your code that manages postgres connection? If I remember correctly @fdr also wanted to put this up on Heroku's docs so we should definitely work out a reliable way of doing this and properly document it.

from django-rq.

abulte avatar abulte commented on May 24, 2024

Here's my snippet, directly inspired from the mentioned link:

from rq import Worker
from django.db import close_connection

class AAWorker(Worker):

    def fork_and_perform_job(self, *args, **kwargs):
        close_connection()
        super(AAWorker, self).fork_and_perform_job(*args, **kwargs)

I should add that it did not solve the "connection reset" problem.

I would be very glad to have the heroku docs for doing that correctly ;-)

Thanks for your help.

from django-rq.

foxx avatar foxx commented on May 24, 2024

@abulte THANK YOU dude, this just saved me so much time and effort.

EDIT: Sorry, scratch that response, this didn't work for me, I think I have a different bug :/ Still thank you though lol.

from django-rq.

fdr avatar fdr commented on May 24, 2024

I went ahead and updated the Heroku docs like I promised I would. Somehow I missed the notification the first time around. Sorry for the long delay:

https://devcenter.heroku.com/articles/forked-pg-connections#django-rq

from django-rq.

selwin avatar selwin commented on May 24, 2024

@fdr thanks!


from django-rq.

abulte avatar abulte commented on May 24, 2024

@fdr hi. You seem to have updated the docs with the snippet I provided on #17 (comment). But this snippet does not solve the "connection reset" like I said... At least in my experience.

from django-rq.

fdr avatar fdr commented on May 24, 2024

Does it solve any other inconvenient symptom?

from django-rq.

abulte avatar abulte commented on May 24, 2024

AFAIK no.

from django-rq.

fdr avatar fdr commented on May 24, 2024

Oh. Pity, I guess I'll remove it, then. Thanks for clarifying that for me.

from django-rq.

abulte avatar abulte commented on May 24, 2024

Thanks for trying ;-) Maybe @selwin has a proper way to do this?

from django-rq.

selwin avatar selwin commented on May 24, 2024

I've never used Heroku before, so no :(. Does closing the db connection right after the worker is started and after job completion help?

I'll reopen this thread since we don't have a proper solution for this issue.

from django-rq.

fdr avatar fdr commented on May 24, 2024

Heroku is irrelevant here, other than my having recommended opening the bug, because I've seen people get mystified by file descriptor corruption caused by fork() a bunch of times...

So, any pg/fork based worker is affected unless one moves connection creation to post-fork.

from django-rq.

selwin avatar selwin commented on May 24, 2024

Ah ok, got it.


from django-rq.

sachinkagarwal avatar sachinkagarwal commented on May 24, 2024

I used this with django-rq:

In every function enqueued to run by a worker, I added connection.close() at the end. Now the postgres logs no longer contain "LOG: could not receive data from client: Connection reset by peer".

More importantly, I think those errors were causing the PG archive log to grow, so fixing this mattered for keeping the PG backup from running away.

from django-rq.

mpetyx avatar mpetyx commented on May 24, 2024

Any idea or update on this problem?
I have the exact same issue and tried some of the hacks mentioned above, but nothing works on Heroku.
I need to figure out how to create the connection post-fork.
Any ideas?
Otherwise I would have to completely drop django-rq and go for Celery or something similar...

from django-rq.
