
Comments (8)

grosser commented on September 25, 2024

My first guess would be that Passenger uses forking to spawn workers, which might mess up parallel in some way or kill parallel's child processes.

Can you reproduce it by running Passenger locally?
Does it still happen if you only sleep 10 inside the loop?
Does it still happen if you do an ActiveRecord::Base.connection.reconnect! inside the loop?

from parallel.

gloacai commented on September 25, 2024

Let me clone off a new vm of my dev environment so I can get passenger and apache installed. This may take a bit.


grosser commented on September 25, 2024

Just Passenger should do; Apache should not have anything to do with it. It should be possible locally :)

gloacai commented on September 25, 2024

(Edited: I originally wrote this way too late at night to catch typos.)

Well, more confusion but some progress.

I got Apache and Passenger set up on a local box and everything works fine.

On the ops server, I then ran a plain old Ruby test file from the command line that calls parallel, and it works fine.

Next, I created a new method on the model called "test" to simplify the code to be looked at:

def test
  Rails.logger.info "Start test"
  require 'parallel'
  Parallel.map((1..100)) do |item|
    Rails.logger.info "#{item}"
  end
  Rails.logger.info "Stop test"
  ActiveRecord::Base.connection.reconnect!
end

This logs all items (not necessarily in order, because they run in parallel) but never writes "Stop test" to the log; instead it spits out a bunch of Phusion Passenger error messages. I noticed that if I drop the max item value down into the low 20s, it starts running through everything completely. If I keep dropping the max item value even more, it seems to run through completely more often. This is a server (possibly virtual) that we're renting from a hosting provider, and /proc/cpuinfo lists 32 CPUs, all of the same type. My local test virtual machine with Phusion Passenger only has 2 CPUs. A max value of 22/23 looks like the threshold: everything above fails and everything below works, but values right around there sometimes work and sometimes don't.

Moving ActiveRecord::Base.connection.reconnect! inside the loop, before the logger statement, appears to make no difference. In this example, even though it's a method of an ActiveRecord model, I'm not actually querying the database at all.

I then decided to play with the :in_processes parameter. With the call changed to the line below, I can get it to run everything.
Parallel.map((1..100), :in_processes => 2) do |item|

If I increase :in_processes to 23 or higher, I get the Phusion Passenger error messages again before it can log "Stop test". As in the case without :in_processes, the mid 20s seem to be unstable, with 22/23 being the threshold.

As a workaround I guess I can use :in_processes => 20, but I'm concerned that 22/23 seems to be an arbitrary number and that I'm actually running into some kind of instability that may hit me again down the road.

So it looks like the code is definitely forking new processes, which run their portion of the loop correctly; the problem occurs when they finish and control returns to a single process. Is this something that is handled in your code, or should I be poking the Phusion Passenger folks?
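To make the "fork, run, come back to a single process" step concrete, here is a minimal sketch of a fork-based map using only the Ruby standard library. It is an illustration of the general technique, not the parallel gem's actual implementation; `fork_map` is a hypothetical name. Each child sends its result back through a pipe, and a child that dies before replying shows up in the parent as an empty read.

```ruby
# Illustration only: a tiny fork-and-collect map, not the parallel gem's code.
# Each child computes one result, marshals it down a pipe, and exits.
def fork_map(items)
  workers = items.map do |item|
    reader, writer = IO.pipe
    reader.binmode
    writer.binmode
    pid = fork do
      reader.close
      writer.write(Marshal.dump(yield(item)))
      writer.close
      exit!(0)                  # skip at_exit handlers inherited from the parent
    end
    writer.close                # parent keeps only the read end
    [pid, reader]
  end

  # Join step: collect one result per child, in order.
  workers.map do |pid, reader|
    data = reader.read          # "" if the child died without writing anything
    reader.close
    Process.wait(pid)
    raise "dead worker (pid #{pid})" if data.empty?
    Marshal.load(data)
  end
end

puts fork_map([1, 2, 3]) { |n| n * 2 }.inspect   # prints [2, 4, 6]

begin
  # Simulate an external kill: the child SIGKILLs itself before replying.
  fork_map([1]) { |_n| Process.kill(:KILL, Process.pid) }
rescue RuntimeError => e
  puts e.message
end
```

In this sketch the children can all do their work correctly and the failure still only becomes visible at the join, which matches the symptom of "Stop test" never being logged.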

Phusion Passenger error messages, if anyone cares. As far as I can tell, this is the standard internal server error, except that I don't have a 500.shtml page to display to the user.

Started GET "/500.shtml" for 75.161.107.220 at 2013-04-18 00:35:00 -0700

ActionController::RoutingError (No route matches [GET] "/500.shtml"):
actionpack (3.2.11) lib/action_dispatch/middleware/debug_exceptions.rb:21:in `call'
actionpack (3.2.11) lib/action_dispatch/middleware/show_exceptions.rb:56:in `call'
railties (3.2.11) lib/rails/rack/logger.rb:32:in `call_app'
railties (3.2.11) lib/rails/rack/logger.rb:16:in `block in call'
activesupport (3.2.11) lib/active_support/tagged_logging.rb:22:in `tagged'
railties (3.2.11) lib/rails/rack/logger.rb:16:in `call'
actionpack (3.2.11) lib/action_dispatch/middleware/request_id.rb:22:in `call'
rack (1.4.5) lib/rack/methodoverride.rb:21:in `call'
rack (1.4.5) lib/rack/runtime.rb:17:in `call'
activesupport (3.2.11) lib/active_support/cache/strategy/local_cache.rb:72:in `call'
rack (1.4.5) lib/rack/lock.rb:15:in `call'
rack-cache (1.2) lib/rack/cache/context.rb:136:in `forward'
rack-cache (1.2) lib/rack/cache/context.rb:245:in `fetch'
rack-cache (1.2) lib/rack/cache/context.rb:185:in `lookup'
rack-cache (1.2) lib/rack/cache/context.rb:66:in `call!'
rack-cache (1.2) lib/rack/cache/context.rb:51:in `call'
railties (3.2.11) lib/rails/engine.rb:479:in `call'
railties (3.2.11) lib/rails/application.rb:223:in `call'
railties (3.2.11) lib/rails/railtie/configurable.rb:30:in `method_missing'
passenger (3.0.19) lib/phusion_passenger/rack/request_handler.rb:96:in `process_request'
passenger (3.0.19) lib/phusion_passenger/abstract_request_handler.rb:516:in `accept_and_process_next_request'
passenger (3.0.19) lib/phusion_passenger/abstract_request_handler.rb:274:in `main_loop'
passenger (3.0.19) lib/phusion_passenger/rack/application_spawner.rb:206:in `start_request_handler'
passenger (3.0.19) lib/phusion_passenger/rack/application_spawner.rb:171:in `block in handle_spawn_application'
passenger (3.0.19) lib/phusion_passenger/utils.rb:470:in `safe_fork'
passenger (3.0.19) lib/phusion_passenger/rack/application_spawner.rb:166:in `handle_spawn_application'
passenger (3.0.19) lib/phusion_passenger/abstract_server.rb:357:in `server_main_loop'
passenger (3.0.19) lib/phusion_passenger/abstract_server.rb:206:in `start_synchronously'
passenger (3.0.19) lib/phusion_passenger/abstract_server.rb:180:in `start'
passenger (3.0.19) lib/phusion_passenger/rack/application_spawner.rb:129:in `start'
passenger (3.0.19) lib/phusion_passenger/spawn_manager.rb:253:in `block (2 levels) in spawn_rack_application'
passenger (3.0.19) lib/phusion_passenger/abstract_server_collection.rb:132:in `lookup_or_add'
passenger (3.0.19) lib/phusion_passenger/spawn_manager.rb:246:in `block in spawn_rack_application'
passenger (3.0.19) lib/phusion_passenger/abstract_server_collection.rb:82:in `block in synchronize'
<internal:prelude>:10:in `synchronize'
passenger (3.0.19) lib/phusion_passenger/abstract_server_collection.rb:79:in `synchronize'
passenger (3.0.19) lib/phusion_passenger/spawn_manager.rb:244:in `spawn_rack_application'
passenger (3.0.19) lib/phusion_passenger/spawn_manager.rb:137:in `spawn_application'
passenger (3.0.19) lib/phusion_passenger/spawn_manager.rb:275:in `handle_spawn_application'
passenger (3.0.19) lib/phusion_passenger/abstract_server.rb:357:in `server_main_loop'
passenger (3.0.19) lib/phusion_passenger/abstract_server.rb:206:in `start_synchronously'
passenger (3.0.19) helper-scripts/passenger-spawn-server:99:in `<main>'


grosser commented on September 25, 2024

I'd say that if it's not reproducible without Passenger, try poking the Passenger folks; maybe they know about some weird trick they are pulling that might interfere with this.

It would be very helpful to have a simple rack app that fails. I made a simple example here, https://github.com/grosser/parallel_passenger, but it works fine...


gloacai commented on September 25, 2024

Well, I took a break from this issue and have come back to it after hearing that the error popped up again today, infrequently, even though it's only set to use 15 processes. This time I poked around the system instead of just the app and Phusion Passenger. I set the process count back up to 30 so I could reliably force the error to appear. Lo and behold, every time the process dies and the application log gives the Parallel::DeadWorker message, I see this in the system's /var/log/messages:

May 4 13:00:07 vps6638 kernel: [11050141.364538] OOM killed process 12998 (ruby) vm:1815004kB, rss:100204kB, swap:0kB
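For anyone who wants to correlate these kernel lines with dead workers automatically, here is a small sketch that extracts the pid and memory figures. The regular expression is written against the exact line format shown above, which differs between kernels, and `parse_oom_line` is a hypothetical helper name.

```ruby
# Hypothetical helper: parse an "OOM killed process" kernel line in the
# format quoted above. Other kernels word this message differently.
OOM_LINE = /OOM killed process (?<pid>\d+) \((?<cmd>[^)]+)\) vm:(?<vm_kb>\d+)kB, rss:(?<rss_kb>\d+)kB, swap:(?<swap_kb>\d+)kB/

def parse_oom_line(line)
  m = OOM_LINE.match(line)
  return nil unless m           # not an OOM line in this format

  {
    pid: m[:pid].to_i,
    cmd: m[:cmd],
    vm_kb: m[:vm_kb].to_i,
    rss_kb: m[:rss_kb].to_i,
    swap_kb: m[:swap_kb].to_i
  }
end

line = 'May 4 13:00:07 vps6638 kernel: [11050141.364538] OOM killed process 12998 (ruby) vm:1815004kB, rss:100204kB, swap:0kB'
info = parse_oom_line(line)
puts "pid #{info[:pid]} (#{info[:cmd]}) rss #{info[:rss_kb] / 1024} MB"
# prints: pid 12998 (ruby) rss 97 MB
```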

Reproducing the error while running top alongside, I do see free memory drop like a rock. The virtual server claims to have 1 GB of memory and already uses ~400 MB of it at rest. I'm still surprised that calling this specific part of the app uses up the remaining 600 MB.
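As a rough back-of-the-envelope check on those numbers: with about 600 MB free and roughly 100 MB resident per ruby process (the rss figure in the kernel line), only a handful of workers fit if each one really needs its full RSS. The sketch below assumes a hypothetical helper `max_workers`; it is not part of parallel, and the real limit depends on how much memory forked workers share with the parent.

```ruby
# Hypothetical helper, not part of the parallel gem: estimate how many
# workers fit into free memory, bounded by a hard cap.
def max_workers(free_mb:, per_worker_mb:, cap:)
  by_memory = free_mb / per_worker_mb   # integer division
  [[by_memory, cap].min, 1].max         # always allow at least one worker
end

# Figures from this thread: ~600 MB free, ~100 MB RSS per ruby process,
# and 32 CPUs as the upper bound.
puts max_workers(free_mb: 600, per_worker_mb: 100, cap: 32)   # prints 6
```

This is a conservative estimate. The thresholds reported earlier in the thread were higher, which would be consistent with workers sharing pages with the parent, but that is a guess rather than something measured here.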

I guess I need to do the following:

  • Figure out a way to profile its memory usage for a single instance.
  • Figure out whether there is any way to restructure variables to limit the memory used.
  • Figure out how to tell the OOM killer not to kill my ruby processes.
  • See how hard it is to move to REE to get copy_on_write_friendly, which you mention on the header page.
  • Worst case, bug the hosting provider to give us more memory.
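For the first item in the list, one low-tech option on Linux is to read the process's resident set size from /proc before and after the expensive step. A minimal sketch, assuming Linux's /proc/<pid>/status layout; `rss_kb_from_status` and `current_rss_kb` are hypothetical helper names, and the array below is only a stand-in for the real workload.

```ruby
# Linux-only sketch: read a process's resident set size (VmRSS) from /proc.
def rss_kb_from_status(status_text)
  line = status_text.each_line.find { |l| l.start_with?('VmRSS:') }
  line && line[/\d+/].to_i      # nil if there is no VmRSS line
end

def current_rss_kb(pid = Process.pid)
  path = "/proc/#{pid}/status"
  return nil unless File.readable?(path)   # e.g. not running on Linux

  rss_kb_from_status(File.read(path))
end

before = current_rss_kb
data = Array.new(200_000) { |i| "row #{i}" }   # stand-in for the real workload
after = current_rss_kb
puts "RSS grew by #{after - before} kB for #{data.size} rows" if before && after
```

Running this inside a single worker, rather than under 30 of them, gives a per-instance figure to compare against the free memory on the box.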


grosser commented on September 25, 2024

Or use 1.9.3 ;)


gloacai commented on September 25, 2024

Actually, we're already on 1.9.3p374. Let me look at the patch notes up to p392 to see whether anything released since then might help, or whether 2.0.0 might.
