Comments (30)
Or you could drop Django :)
from django-rq.
FWIW I've been using this for years:
from django.db import connection
from rq.worker import Worker

class ConnectionClosingWorker(Worker):
    """
    RQ Worker that closes the database connection before forking.
    See also https://github.com/ui/django-rq/issues/17
    """
    def execute_job(self, job, queue):
        connection.close()
        super().execute_job(job, queue)
It's been too long since I wrote it to remember the details, but it solved this issue and hasn't caused problems in a largish rq deploy. By "largish", I mean "we're moving off Heroku because we hit their 300-dyno limit".
from django-rq.
I've been doing some more thinking about this; here are a few ideas:
Alternative 1
Writing a wrapper function for async functions. This doesn't require any changes to RQ, but it could be tedious if every single enqueued Django function on Heroku needs to explicitly close the db connection.
from django.db import connection

def func():
    pass

def rq_func():
    func()
    connection.close()

queue.enqueue(rq_func)
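To cut down on the tedium, the wrapping could be done once in a decorator. This is a minimal sketch, assuming a hypothetical helper named `with_cleanup` (not part of RQ or django-rq). A stand-in object replaces `django.db.connection` so the sketch runs without Django. Unlike the wrapper above, it runs the cleanup in a `finally` block so the connection is also closed when the job raises; that is a choice made for this sketch.

```python
import functools


def with_cleanup(cleanup):
    """Build a decorator that always calls `cleanup()` after the job function.

    `with_cleanup` is a hypothetical helper name, not an RQ or django-rq API.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            finally:
                # Runs whether the job returned or raised.
                cleanup()
        return wrapper
    return decorator


class StandInConnection:
    """Stand-in for django.db.connection so the sketch runs without Django."""

    def __init__(self):
        self.close_calls = 0

    def close(self):
        self.close_calls += 1


connection = StandInConnection()


@with_cleanup(connection.close)
def add(a, b):
    return a + b
```

In a Django project the decorator would presumably be given `django.db.connection.close`, and the decorated function would be enqueued as usual, e.g. `queue.enqueue(add, 2, 3)`.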
Alternative 2
Implementing a job callback pattern. However, this idea was rejected in rq/rq#155, and I'm also not a big fan of the notion of having callbacks in RQ.
queue.enqueue(f, callback=close_connection)
Alternative 3
Add a job_class argument to Worker to allow the use of custom job classes. In this case, Django RQ can define its own Job class to deal with this issue. It would look something like:
from rq.job import Job
from django.db import connection

class ConnectionClosingJob(Job):
    def perform(self):
        result = super(ConnectionClosingJob, self).perform()
        connection.close()
        return result

# Somewhere in Django-RQ (proposed API; redis_connection is the worker's
# Redis connection, not django.db.connection)
worker = Worker(connection=redis_connection, job_class=ConnectionClosingJob)
@nvie, what do you think?
from django-rq.
I've never written a Heroku application before so this will be hard for me to debug. Does explicitly closing the connection on your queued function fix it? I did a little bit of googling and it seems that this problem is not specific to Django/RQ. There's a resque thread discussing the same problem https://github.com/defunkt/resque/issues/367
from django-rq.
Yea I saw that thread too... I'm surprised I haven't seen any traces of this problem with Django though. I was kind-of wondering if I could implement a similar fix to the gem mentioned in that thread. What they're doing is adding the disconnect on some sort of "after_job" callback. Does such a thing exist with RQ?
from django-rq.
No, there's no such thing as "after_job" in RQ. We can propose adding one, as well as the ability to use a different worker class since this is already in RQ's roadmap (implementing concurrent workers).
However, we first need to ensure that the proposed solution does indeed solve the problem. Can you check whether running "close_connection" in the function being queued fixes this?
from django-rq.
Yep. I just tested it on one of my async functions and I don't see the EOF errors.
from django-rq.
I'm experiencing the same issue. I'm glad I checked, because I had no idea what the cause of the issue was. I think alternatives 2 or 3 would be good. I would think of it less as a callback than as a hook. I would ideally like to do other post-job housekeeping.
from django-rq.
Just to add, I'm not sure Alternative 1 really works when func() is a bound method. In my use case, I'm binding a method at run time and queuing that method directly. I'm refactoring to try to pass a wrapper function instead that will recover the method reference job-side and do the necessary cleanup, but it's really gumming up the code.
from django-rq.
I don't understand the problem in the first place. Stripping away django-rq and RQ from the equation, what's the core of the problem we're looking at here? Isn't it that Django implicitly opens a connection but never explicitly closes it until its Python process is terminated (which never happens in an RQ worker situation)?
from django-rq.
From what I understand it's a harmless postgres error message that occurs when no Terminate message is received. I wonder if enabling "autocommit" mode for the psycopg2 DB backend would fix it.
from django-rq.
@nvie That's one manifestation of the problem, but to me, the overall problem is that I want to be able to do other post-task work. In my case, I'm manually tracking tasks in flight so that I can prevent duplicate tasks with the same arguments from running. In the current paradigm, I have to schedule a wrapper function that calls the actual task function and then performs this cleanup. But since there are multiple actual task functions that might be called, and since I can't pass function references as arguments, my wrapper function has to do some serious gymnastics to recreate the task function reference worker-side. The whole thing would be much easier if there were hooks for post-task work. I really like Alternative 3, personally.
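For readers trying to picture the wrapper described above, here is a minimal sketch of one possible shape. It assumes the task is identified by a dotted import path passed as a string. The names `resolve`, `run_tracked`, and `in_flight` are hypothetical, and the in-process set only stands in for whatever shared storage the real tracking uses.

```python
import importlib

# Hypothetical in-process registry. Workers run in separate processes, so a
# real deployment would need shared storage instead of a module-level set.
in_flight = set()


def resolve(dotted_path):
    """Turn 'package.module.func' into the function object it names."""
    module_name, _, attr = dotted_path.rpartition(".")
    return getattr(importlib.import_module(module_name), attr)


def run_tracked(dotted_path, *args):
    """Run the named task, then do the post-task housekeeping.

    The arguments must be hashable because they form part of the key.
    """
    key = (dotted_path, args)
    in_flight.add(key)
    try:
        return resolve(dotted_path)(*args)
    finally:
        # Post-task cleanup happens whether the task returned or raised.
        in_flight.discard(key)
```

With this shape only `run_tracked` is enqueued, and the real task travels as a plain string, e.g. `run_tracked("operator.add", 2, 3)`. A post-job hook in the worker would make the indirection unnecessary.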
from django-rq.
The problem I believe is that RQ forks its worker processes from the main process, and django shares its database connection from the parent process to the child process. http://stackoverflow.com/questions/8242837/django-multiprocessing-and-database-connections sums up the problem pretty well.
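The descriptor sharing behind this can be seen without any database. This is a minimal sketch, assuming a Unix system where `os.fork` is available. A pipe stands in for the Postgres socket, and `inherited_fd_demo` is an illustrative name.

```python
import os


def inherited_fd_demo():
    """Show that a forked child can use a descriptor opened by its parent."""
    read_fd, write_fd = os.pipe()  # opened in the parent, before forking
    pid = os.fork()
    if pid == 0:
        # Child: the descriptor was never reopened here, yet it is usable
        # because fork() duplicated the parent's descriptor table.
        os.close(read_fd)
        os.write(write_fd, b"written by child")
        os._exit(0)
    # Parent: wait for the child, then read what it wrote to the shared pipe.
    os.close(write_fd)
    os.waitpid(pid, 0)
    data = os.read(read_fd, 64)
    os.close(read_fd)
    return data
```

A database socket is inherited the same way, so parent and child can end up talking to the server over one connection unless it is closed or recreated around the fork.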
from django-rq.
We have decided to implement the fix in RQ itself instead of django-rq. RQ will allow the user to use a custom Worker class, where we can then tell the worker to close the db connection before forking.
There's an open issue tracking this feature at rq/rq#233. Feel free to contribute or chime in on the discussion in that thread.
from django-rq.
I came here because I was worried about the "could not receive data from client: Connection reset by peer" message on my django/django-rq app running on Heroku.
I applied the tip described here to create "clean" connections when the worker is launched and avoid the "forked connections" syndrome that seems to be the root cause: rq/rq#233 (comment)
But I still get the connection reset message on Heroku. Did I miss something?
from django-rq.
@abulte mind showing me a snippet of your code that manages postgres connection? If I remember correctly @fdr also wanted to put this up on Heroku's docs so we should definitely work out a reliable way of doing this and properly document it.
from django-rq.
Here's my snippet, directly inspired from the mentioned link:
from rq import Worker
from django.db import close_connection

class AAWorker(Worker):
    def fork_and_perform_job(self, *args, **kwargs):
        close_connection()
        super(AAWorker, self).fork_and_perform_job(*args, **kwargs)
I should add that it did not solve the "connection reset" problem.
I would be very glad to have the heroku docs for doing that correctly ;-)
Thanks for your help.
from django-rq.
@abulte THANK YOU dude, this just saved me so much time and effort.
EDIT: Sorry, scratch that response, this didn't work for me, I think I have a different bug :/ Still thank you though lol.
from django-rq.
I went ahead and updated the Heroku docs like I promised I would. Somehow I missed the notification the first time around. Sorry for the long delay:
https://devcenter.heroku.com/articles/forked-pg-connections#django-rq
from django-rq.
@fdr thanks!
On Sep 5, 2014, at 6:42 AM, Daniel Farina wrote:
> I went ahead and updated the Heroku docs like I promised I would. Somehow I missed the notification the first time around. Sorry for the long delay:
> https://devcenter.heroku.com/articles/forked-pg-connections#django-rq
from django-rq.
@fdr hi. You seem to have updated the docs with the snippet I provided on #17 (comment). But this snippet does not solve the "connection reset" like I said... At least in my experience.
from django-rq.
Does it solve any other inconvenient symptom?
from django-rq.
AFAIK no.
from django-rq.
On Fri, Sep 5, 2014 at 7:42 AM, Alexandre Bulté wrote:
> AFAIK no.
Oh. Pity, I guess I'll remove it, then. Thanks for clarifying that for me.
from django-rq.
Thanks for trying ;-) Maybe @selwin has a proper way to do this?
from django-rq.
I've never used Heroku before so no :(. Does closing db connection right after worker is started and after job completion help?
I'll reopen this thread since we don't have a proper solution for this issue.
from django-rq.
On Fri, Sep 5, 2014 at 9:31 AM, Selwin Ong wrote:
> I've never used Heroku before so no :(. Does closing db connection right after worker is started and after job completion help?
> I'll reopen this thread since we don't have a proper solution for this issue.
Heroku is irrelevant in this, other than my making the recommendation to open the bug because I've seen people get mystified by file descriptor corruption because of fork() a bunch of times...
So, any pg/fork based worker is affected unless one moves connection creation to post-fork.
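One way to read "moves connection creation to post-fork" is to create the connection lazily and remember which process created it. This is a minimal sketch, assuming a hypothetical `PerProcessConnection` holder and a caller-supplied `factory` (neither is an RQ, Django, or psycopg2 API):

```python
import os


class PerProcessConnection:
    """Lazily create a connection and recreate it in any forked child.

    Hypothetical helper for illustration; `factory` is any zero-argument
    callable that opens a fresh connection.
    """

    def __init__(self, factory):
        self._factory = factory
        self._conn = None
        self._owner_pid = None

    def get(self):
        pid = os.getpid()
        if self._conn is None or self._owner_pid != pid:
            # First use in this process, or we are a forked child holding
            # the parent's object: open a connection of our own.
            self._conn = self._factory()
            self._owner_pid = pid
        return self._conn


# Stand-in factory so the sketch runs without a database.
holder = PerProcessConnection(factory=object)
```

The sketch drops the inherited connection object rather than closing it. Whether that is safe depends on the driver, so treat it as an assumption.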
from django-rq.
Ah ok, got it.
Win~
On Sat, Sep 6, 2014 at 11:25 PM, Daniel Farina wrote:
> On Fri, Sep 5, 2014 at 9:31 AM, Selwin Ong wrote:
>> I've never used Heroku before so no :(. Does closing db connection right after worker is started and after job completion help?
>> I'll reopen this thread since we don't have a proper solution for this issue.
> Heroku is irrelevant in this other than I making the recommendation to open the bug because I've seen people get mystified by file descriptor corruption because of fork() a bunch of times...
> So, any pg/fork based worker is affected unless one moves connection creation to post-fork.
from django-rq.
I used this with django-rq:
In every function enqueued to run by a worker, I added the connection.close() at the end. Now the postgres logs do not have the "LOG: could not receive data from client: Connection reset by peer".
More importantly, I think those errors were causing the PG archive log to become larger. So fixing this took on importance for the PG backup not running away.
from django-rq.
Any idea or update on this problem?
I have the exact same issue, tried some of the hacks mentioned above, but nothing works for heroku..
I need to figure out how to post_fork a connection..
Any ideas?
Otherwise I would have to completely drop django-rq and go for celery or something similar...
from django-rq.