Comments (6)

vukasin avatar vukasin commented on August 31, 2024

Interesting. The fork() system call shouldn't fail unless there is absolutely no memory left on the system (fork creates a new process, so it really doesn't care about the parent's memory). All resources for the subprocess are released by the OS when the process terminates. What might be leaking are the actual Subprocess objects with the stdout/stderr strings (which could be large but not huge). Are you handling SIGCHLD in your process? That could potentially cause PIDs to leak. Could you do a

ps aufxwww > pids.txt

and post the pids.txt file (or at least the relevant subtree)? The size of the output from the subprocesses would be useful information too.
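For anyone checking such a listing for leaked PIDs: children that have exited but were never waited on show up in ps with a STAT value starting with Z (and usually a "<defunct>" suffix). Below is a minimal sketch that scans ps output for those rows. The helper name find_zombies is hypothetical and not part of tornado_subprocess, and it assumes the standard 11-column "ps aux" layout shown by the command above.

```python
def find_zombies(ps_output):
    """Return (pid, command) pairs for rows whose STAT column starts with 'Z'.

    Assumes the 11-column `ps aux` layout:
    USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    """
    zombies = []
    lines = ps_output.strip().splitlines()
    for line in lines[1:]:  # skip the header row
        # Split into at most 11 fields so COMMAND keeps its internal spaces.
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue  # not a complete process row
        pid, stat, command = fields[1], fields[7], fields[10]
        if stat.startswith("Z"):
            zombies.append((int(pid), command))
    return zombies
```

Running it over the contents of pids.txt and getting an empty list would suggest that unreaped children are not the cause.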

from tornado-subprocess.

heynemann avatar heynemann commented on August 31, 2024

I'm not handling SIGCHLD (or at least not that I'm aware of). The portion of my code that calls tornado_subprocess is:

self.pipe = Subprocess(self.on_response, timeout=self.timeout * 1.1 / 1000, args=["node", vm_file])
self.pipe.start()

Then in the on_response method:

def on_response(self, *args, **kw):
    if isinstance(args[0], (tuple, list, set)):
        status, stdout, stderr, has_timed_out = args[0]
    else:
        status, stdout, stderr, has_timed_out = args

    del self.pipe

pids.txt:

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 2 0.0 0.0 0 0 ? S 2012 0:00 [kthreadd]
root 3 0.0 0.0 0 0 ? S 2012 0:09 _ [ksoftirqd/0]
root 4 0.0 0.0 0 0 ? S 2012 0:00 _ [kworker/0:0]
root 5 0.0 0.0 0 0 ? S 2012 0:00 _ [kworker/u:0]
root 6 0.0 0.0 0 0 ? S 2012 0:00 _ [migration/0]
root 7 0.0 0.0 0 0 ? S 2012 0:04 _ [watchdog/0]
root 8 0.0 0.0 0 0 ? S< 2012 0:00 _ [cpuset]
root 9 0.0 0.0 0 0 ? S< 2012 0:00 _ [khelper]
root 10 0.0 0.0 0 0 ? S 2012 0:00 _ [kdevtmpfs]
root 11 0.0 0.0 0 0 ? S< 2012 0:00 _ [netns]
root 12 0.0 0.0 0 0 ? S 2012 0:00 _ [xenwatch]
root 13 0.0 0.0 0 0 ? S 2012 0:00 _ [xenbus]
root 14 0.0 0.0 0 0 ? S 2012 0:01 _ [sync_supers]
root 15 0.0 0.0 0 0 ? S 2012 0:00 _ [bdi-default]
root 16 0.0 0.0 0 0 ? S< 2012 0:00 _ [kintegrityd]
root 17 0.0 0.0 0 0 ? S< 2012 0:00 _ [kblockd]
root 18 0.0 0.0 0 0 ? S< 2012 0:00 _ [ata_sff]
root 19 0.0 0.0 0 0 ? S 2012 0:00 _ [khubd]
root 20 0.0 0.0 0 0 ? S< 2012 0:00 _ [md]
root 21 0.0 0.0 0 0 ? S 2012 0:18 _ [kworker/0:1]
root 23 0.0 0.0 0 0 ? S 2012 0:00 _ [kworker/u:1]
root 24 0.0 0.0 0 0 ? S 2012 0:00 _ [khungtaskd]
root 25 0.0 0.0 0 0 ? S 2012 0:00 _ [kswapd0]
root 26 0.0 0.0 0 0 ? SN 2012 0:00 _ [ksmd]
root 27 0.0 0.0 0 0 ? S 2012 0:00 _ [fsnotify_mark]
root 28 0.0 0.0 0 0 ? S 2012 0:00 _ [ecryptfs-kthrea]
root 29 0.0 0.0 0 0 ? S< 2012 0:00 _ [crypto]
root 37 0.0 0.0 0 0 ? S< 2012 0:00 _ [kthrotld]
root 38 0.0 0.0 0 0 ? S 2012 0:00 _ [khvcd]
root 57 0.0 0.0 0 0 ? S< 2012 0:00 _ [devfreq_wq]
root 155 0.0 0.0 0 0 ? S 2012 0:19 _ [jbd2/xvda1-8]
root 156 0.0 0.0 0 0 ? S< 2012 0:00 _ [ext4-dio-unwrit]
root 553 0.0 0.0 0 0 ? S 2012 0:00 _ [kjournald]
root 690 0.0 0.0 0 0 ? S 2012 0:08 _ [flush-202:1]
root 1 0.0 0.0 24340 2172 ? Ss 2012 0:01 /sbin/init
root 177 0.0 0.0 25384 1256 ? S 2012 0:00 mountall --daemon
root 241 0.0 0.0 17232 584 ? S 2012 0:00 upstart-udev-bridge --daemon
root 246 0.0 0.0 21592 1180 ? Ss 2012 0:00 /sbin/udevd --daemon
root 294 0.0 0.0 21464 668 ? S 2012 0:00 _ /sbin/udevd --daemon
root 295 0.0 0.0 21464 628 ? S 2012 0:00 _ /sbin/udevd --daemon
root 367 0.0 0.0 15188 380 ? S 2012 0:00 upstart-socket-bridge --daemon
root 433 0.0 0.0 7264 1020 ? Ss 2012 0:00 dhclient3 -e IF_METRIC=100 -pf /var/run/dhclient.eth0.pid -lf /var/lib/dhcp/dhclient.eth0.leases -1 eth0
root 589 0.0 0.0 49956 2756 ? Ss 2012 0:05 /usr/sbin/sshd -D
root 11097 0.0 0.0 73360 3560 ? Ss 11:43 0:00 _ sshd: ubuntu [priv]
ubuntu 11221 0.0 0.0 73360 1676 ? S 11:43 0:00 _ sshd: ubuntu@pts/0
ubuntu 11222 10.4 0.2 25848 8288 pts/0 Ss 11:43 0:00 _ -bash
ubuntu 11322 0.0 0.0 16848 1204 pts/0 R+ 11:43 0:00 _ ps aufxwww
102 600 0.0 0.0 23816 916 ? Ss 2012 0:00 dbus-daemon --system --fork --activation=upstart
syslog 608 0.0 0.0 253716 1816 ? Sl 2012 1:14 rsyslogd -c5
root 663 0.0 0.0 14504 912 tty4 Ss+ 2012 0:00 /sbin/getty -8 38400 tty4
root 670 0.0 0.0 14504 924 tty5 Ss+ 2012 0:00 /sbin/getty -8 38400 tty5
root 676 0.0 0.0 14504 924 tty2 Ss+ 2012 0:00 /sbin/getty -8 38400 tty2
root 677 0.0 0.0 14504 920 tty3 Ss+ 2012 0:00 /sbin/getty -8 38400 tty3
root 679 0.0 0.0 14504 916 tty6 Ss+ 2012 0:00 /sbin/getty -8 38400 tty6
root 686 0.0 0.0 4328 636 ? Ss 2012 0:00 acpid -c /etc/acpi/events -s /var/run/acpid.socket
root 687 0.0 0.0 19112 904 ? Ss 2012 0:01 cron
daemon 688 0.0 0.0 16908 376 ? Ss 2012 0:00 atd
whoopsie 709 0.0 0.0 187588 2792 ? Ssl 2012 0:00 whoopsie
root 712 0.0 0.3 59764 11808 ? Ss 2012 2:30 /usr/bin/python /usr/bin/supervisord
1001 31629 4.5 14.0 679248 541084 ? S 00:15 31:04 _ fightcode - waiting
1001 31630 4.3 14.0 677100 538872 ? S 00:15 30:01 _ fightcode - waiting
1001 31631 20.6 14.1 680548 542248 ? S 00:15 141:57 _ fightcode - waiting
1001 31644 19.6 14.4 694388 556052 ? S 00:15 135:36 _ fightcode - waiting
root 805 0.0 0.0 14504 916 tty1 Ss+ 2012 0:00 /sbin/getty -8 38400 tty1
root 3855 0.0 0.0 62820 1252 ? Ss 03:59 0:00 nginx: master process /usr/sbin/nginx
www-data 3856 0.0 0.0 63592 2948 ? S 03:59 0:09 _ nginx: worker process

Another server's pids.txt:

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 2 0.0 0.0 0 0 ? S 2012 0:00 [kthreadd]
root 3 0.0 0.0 0 0 ? S 2012 0:08 _ [ksoftirqd/0]
root 4 0.0 0.0 0 0 ? S 2012 0:00 _ [kworker/0:0]
root 5 0.0 0.0 0 0 ? S 2012 0:00 _ [kworker/u:0]
root 6 0.0 0.0 0 0 ? S 2012 0:00 _ [migration/0]
root 7 0.0 0.0 0 0 ? S 2012 0:03 _ [watchdog/0]
root 8 0.0 0.0 0 0 ? S< 2012 0:00 _ [cpuset]
root 9 0.0 0.0 0 0 ? S< 2012 0:00 _ [khelper]
root 10 0.0 0.0 0 0 ? S 2012 0:00 _ [kdevtmpfs]
root 11 0.0 0.0 0 0 ? S< 2012 0:00 _ [netns]
root 12 0.0 0.0 0 0 ? S 2012 0:00 _ [xenwatch]
root 13 0.0 0.0 0 0 ? S 2012 0:00 _ [xenbus]
root 14 0.0 0.0 0 0 ? S 2012 0:01 _ [sync_supers]
root 15 0.0 0.0 0 0 ? S 2012 0:00 _ [bdi-default]
root 16 0.0 0.0 0 0 ? S< 2012 0:00 _ [kintegrityd]
root 17 0.0 0.0 0 0 ? S< 2012 0:00 _ [kblockd]
root 18 0.0 0.0 0 0 ? S< 2012 0:00 _ [ata_sff]
root 19 0.0 0.0 0 0 ? S 2012 0:00 _ [khubd]
root 20 0.0 0.0 0 0 ? S< 2012 0:00 _ [md]
root 21 0.0 0.0 0 0 ? S 2012 0:14 _ [kworker/0:1]
root 23 0.0 0.0 0 0 ? S 2012 0:00 _ [kworker/u:1]
root 24 0.0 0.0 0 0 ? S 2012 0:00 _ [khungtaskd]
root 25 0.0 0.0 0 0 ? S 2012 0:00 _ [kswapd0]
root 26 0.0 0.0 0 0 ? SN 2012 0:00 _ [ksmd]
root 27 0.0 0.0 0 0 ? S 2012 0:00 _ [fsnotify_mark]
root 28 0.0 0.0 0 0 ? S 2012 0:00 _ [ecryptfs-kthrea]
root 29 0.0 0.0 0 0 ? S< 2012 0:00 _ [crypto]
root 37 0.0 0.0 0 0 ? S< 2012 0:00 _ [kthrotld]
root 38 0.0 0.0 0 0 ? S 2012 0:00 _ [khvcd]
root 57 0.0 0.0 0 0 ? S< 2012 0:00 _ [devfreq_wq]
root 157 0.0 0.0 0 0 ? S 2012 0:16 _ [jbd2/xvda1-8]
root 158 0.0 0.0 0 0 ? S< 2012 0:00 _ [ext4-dio-unwrit]
root 549 0.0 0.0 0 0 ? S 2012 0:00 _ [kjournald]
root 707 0.0 0.0 0 0 ? S 2012 0:07 _ [flush-202:1]
root 1 0.0 0.0 24340 2136 ? Ss 2012 0:01 /sbin/init
root 179 0.0 0.0 25384 1176 ? S 2012 0:00 mountall --daemon
root 245 0.0 0.0 17232 580 ? S 2012 0:00 upstart-udev-bridge --daemon
root 248 0.0 0.0 21568 1136 ? Ss 2012 0:00 /sbin/udevd --daemon
root 316 0.0 0.0 21432 712 ? S 2012 0:00 _ /sbin/udevd --daemon
root 336 0.0 0.0 21564 716 ? S 2012 0:00 _ /sbin/udevd --daemon
root 367 0.0 0.0 15188 372 ? S 2012 0:00 upstart-socket-bridge --daemon
root 421 0.0 0.0 7264 1020 ? Ss 2012 0:00 dhclient3 -e IF_METRIC=100 -pf /var/run/dhclient.eth0.pid -lf /var/lib/dhcp/dhclient.eth0.leases -1 eth0
root 586 0.0 0.0 49956 2616 ? Ss 2012 0:04 /usr/sbin/sshd -D
root 1342 0.0 0.0 73360 3556 ? Ss 11:45 0:00 _ sshd: ubuntu [priv]
ubuntu 1470 0.0 0.0 73360 1672 ? S 11:45 0:00 _ sshd: ubuntu@pts/0
ubuntu 1471 8.1 0.2 25852 8284 pts/0 Ss 11:45 0:00 _ -bash
ubuntu 1571 0.0 0.0 16848 1196 pts/0 R+ 11:45 0:00 _ ps aufxwww
syslog 598 0.0 0.0 253716 1812 ? Sl 2012 0:58 rsyslogd -c5
102 603 0.0 0.0 23916 968 ? Ss 2012 0:00 dbus-daemon --system --fork --activation=upstart
root 659 0.0 0.0 14504 856 tty4 Ss+ 2012 0:00 /sbin/getty -8 38400 tty4
root 666 0.0 0.0 14504 860 tty5 Ss+ 2012 0:00 /sbin/getty -8 38400 tty5
root 672 0.0 0.0 14504 852 tty2 Ss+ 2012 0:00 /sbin/getty -8 38400 tty2
root 673 0.0 0.0 14504 860 tty3 Ss+ 2012 0:00 /sbin/getty -8 38400 tty3
root 675 0.0 0.0 14504 852 tty6 Ss+ 2012 0:00 /sbin/getty -8 38400 tty6
root 681 0.0 0.0 4328 632 ? Ss 2012 0:00 acpid -c /etc/acpi/events -s /var/run/acpid.socket
root 683 0.0 0.0 19112 904 ? Ss 2012 0:00 cron
daemon 684 0.0 0.0 16908 376 ? Ss 2012 0:00 atd
whoopsie 698 0.0 0.0 187588 2732 ? Ssl 2012 0:00 whoopsie
root 705 0.0 0.3 59764 11776 ? Ss 2012 2:05 /usr/bin/python /usr/bin/supervisord
1001 21528 7.0 13.9 672148 533908 ? S 00:16 48:36 _ fightcode - waiting
1001 21529 7.3 13.9 672804 534488 ? S 00:16 50:51 _ fightcode - waiting
1001 21542 18.7 14.1 681328 543060 ? R 00:16 129:16 _ fightcode - waiting
1001 21549 18.9 14.0 679064 540696 ? S 00:16 130:21 _ fightcode - waiting
root 800 0.0 0.0 34620 1212 ? S 2012 0:22 /usr/bin/monit -c /etc/monit/monitrc
root 822 0.0 0.0 14504 860 tty1 Ss+ 2012 0:00 /sbin/getty -8 38400 tty1
root 27158 0.0 0.0 62820 1248 ? Ss 03:59 0:00 nginx: master process /usr/sbin/nginx
www-data 27159 0.0 0.0 63588 2732 ? S 03:59 0:07 _ nginx: worker process

The output from the subprocesses is actually quite large. I run a node.js script that outputs a potentially VERY big JSON object.

Thanks a LOT for helping us find what's going on. Anything else you need from us I can provide.

vukasin avatar vukasin commented on August 31, 2024

If that JSON is very big, check whether the Subprocess instances get freed properly. As a workaround, you can add the following to the handler function:

    def on_response(self, *args, **kw):
        if isinstance(args[0], (tuple, list, set)):
            status, stdout, stderr, has_timed_out = args[0]
        else:
            status, stdout, stderr, has_timed_out = args
        del self.pipe
        self.streams = []

This will force the returned objects to be released.

vukasin avatar vukasin commented on August 31, 2024

hmm... this could explain our problem:

http://stackoverflow.com/questions/1216794/python-subprocess-popen-erroring-with-oserror-errno-12-cannot-allocate-memory

heynemann avatar heynemann commented on August 31, 2024

Still having the same errors. It happens after a while. I'm going to try something different. I might have found an issue with our server.

It has ZERO swap space. We don't even have swap mounted. Gonna try getting a swap disk mounted.

Thanks for all your help. I'll let you know how it goes.
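On Linux, the swap situation can be confirmed by reading /proc/meminfo. A rough sketch follows, with the caveat that parse_meminfo and has_swap are hypothetical helper names (not part of tornado_subprocess or the standard library) and that it assumes the usual "Key:   value kB" line format. The connection to the error is that fork() from a large parent may be refused with ENOMEM when the kernel cannot account for a copy of the parent's address space, which is more likely with no swap configured.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Values are returned as they appear (kB for most keys); lines that do
    not look like 'Key: <number> ...' are skipped.
    """
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # no colon, not a meminfo entry
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[key.strip()] = int(parts[0])
    return values


def has_swap(text):
    """Return True if the meminfo text reports a non-zero SwapTotal."""
    return parse_meminfo(text).get("SwapTotal", 0) > 0


# Example usage on a Linux host (left commented out; the path is Linux-specific):
# with open("/proc/meminfo") as f:
#     print(has_swap(f.read()))
```

If has_swap reports False on a box whose workers each hold several hundred MB, adding swap or launching the child from a smaller helper process are the usual things to try.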

vukasin avatar vukasin commented on August 31, 2024

OK, I'm closing this issue. I'll reopen it if it pops up again.
