Comments (9)
@blowfishpro Your comment helped me figure out what was happening and sent me on a wild chase...
I've noticed that sometimes garbage collection interrupts the flame graphs in Speedscope, showing up as a separate stack 2 frames deep. Similarly, IO waiting appears as a `<main>` stack, forcing the user to check the edges of the graph to figure out the root cause.
But I remembered that in the past, flamegraphs in speedscope would show a single `(garbage collection)` frame at the bottom of an existing stack instead of creating a new stack.
Looking around in the speedscope importer for stackprof, I found this:

```js
if (stack.length === 1 && stack[0].name === '(garbage collection)') {
  stack = prevStack.concat(stack)
}
```
This means that when the stack consists of a single frame named `(garbage collection)`, it is concatenated onto the previous stack instead of being shown as a new one, which stops it from interrupting the previous stack.
I did some hacking and got it to concatenate both garbage collection and IO wait onto the previous stack, while also dropping the remaining frames since they don't provide extra information:

```js
if (stack[0].name === '(garbage collection)') {
  stack = prevStack.concat(stack[0])
} else if (stack[0].name === '<main>') {
  stack = prevStack.concat({...stack[0], name: "(io wait)"})
}
```
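The same re-rooting idea can be expressed independently of speedscope. Below is a minimal pure-Ruby sketch of the transformation, operating on samples as arrays of frame names ordered root-to-leaf; the frame names `(garbage collection)` and `<main>` are the ones stackprof emits, everything else (the function name, the sample data) is illustrative:

```ruby
# Sketch: re-root "interrupting" samples onto the previous sample's stack.
# A sample rooted at "(garbage collection)" keeps only that one frame,
# stacked on top of the previous stack; a sample rooted at "<main>" is
# replaced by a single "(io wait)" frame on top of the previous stack.
def reroot(samples)
  prev = []
  samples.map do |stack|
    case stack.first
    when "(garbage collection)"
      # drop the remaining GC frames; they add no information
      prev + [stack.first]
    when "<main>"
      # rename the idle main-thread root to something meaningful
      prev + ["(io wait)"]
    else
      prev = stack
      stack
    end
  end
end
```

For example, `reroot([["a", "b"], ["(garbage collection)", "mark"], ["<main>", "select"]])` returns `[["a", "b"], ["a", "b", "(garbage collection)"], ["a", "b", "(io wait)"]]`, so neither interruption starts a new root in the flamegraph.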
Now even flamegraphs spanning many seconds are presented without root interruptions.
This is how garbage collection is displayed in the flamegraph, and here is how it looks for IO wait (screenshots not reproduced here).
The original PR changing the stackprof importer in speedscope was merged 5 years ago (jlfwong/speedscope#85), and it seems to be a behavior they are interested in keeping (jlfwong/speedscope#178 (comment)).
I will try to submit a PR for these improvements in speedscope, but I'm also wondering if this behavior should instead be part of the stackprof result object, offered as an option similar to how it offers `ignore_gc`. What is your opinion?
@nateberkopec I came across this while trying to improve the output of rack-mini-profiler flamegraphs. Since you are the new maintainer and have extensive experience profiling apps, do you see any pitfalls with the proposed approach?
from stackprof.
I've been wanting this too. When profiling Rails controller actions, you will end up with your app server's thread showing up if it is multithreaded. The only workaround I have found is to use unicorn, which is not threaded (puma and webrick are).
I was seeing something similar with Phusion Passenger. It's not really multithreaded by default, but it does process requests on a thread that's not the main thread. Every time the application is waiting on I/O, it shows up as whatever method the main thread is running (which is always the same), and you have to scroll to the edges of that period to figure out what the app actually called.
OMG! that's gorgeous!
I've presented more stackprof-generated flamegraphs to more people than probably anyone else in the Ruby community, and I have to explain this every time and it confuses the heck out of people. Your behavior, as screenshotted, is a far superior visualization.
If this is an option, we would turn it on by default in RMP, which is probably good enough, I think.
@technicalpickles I think for this, your best option is to actually use the included middleware:
https://github.com/tmm1/stackprof/blob/master/lib/stackprof/middleware.rb
The documentation for that is probably a bit lacking though:
https://github.com/tmm1/stackprof#run
but it should behave as a normal middleware, and you shouldn't get the bloat, as each request should be run on a single thread. I haven't used this myself directly, but in a Rails project, it should be as simple as:

```ruby
require 'stackprof/middleware'
app.middleware.insert_before 0, StackProf::Middleware, {:enabled => true, ...}
```
Hope this helps!
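For a Rails app specifically, those two lines would typically live in an initializer. Here is a minimal, untested sketch; the option names are the ones documented in stackprof's README, but the file path and the values are illustrative:

```ruby
# config/initializers/stackprof.rb (illustrative path)
require 'stackprof/middleware'

Rails.application.config.middleware.insert_before(
  0, StackProf::Middleware,
  enabled: true,        # profile every request
  mode: :wall,          # wall-clock sampling, so IO wait is visible
  interval: 1000,       # sample every 1000 microseconds
  save_every: 5,        # flush a .dump file every 5 requests
  path: 'tmp/stackprof' # directory the dumps are written to
)
```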
@NickLaMuro I've been using code pretty similar to the middleware, but I'm not seeing anything obvious in there that would have the other bloat.
I was chatting with @nateberkopec about how to try to eliminate puma from the flamegraph, but he said you'd see that with any multithreaded webserver, which includes webrick, and that unicorn is the only option right now.
After implementing this in speedscope, it started to look like the wrong place to have this logic. It makes sense for stackprof to have the option to clean up its own output.
I looked into the C extension in stackprof, but my C is a bit rusty, and I was worried this might introduce extra overhead, so I opted to start a proof of concept provided as a method in StackProf::Report. I will create a PR shortly.
The script below will clean up GC and IO stacks from Puma in single and cluster modes. Since we have to detect which method the main thread is running for each server, this can get tricky and might differ between servers and server versions...
I welcome feedback. Feel free to execute the script by calling it with:

```
ruby cleanup_io_wait_and_gc_stacks.rb -s source_file.json -d destination_file.json
```

https://gist.github.com/tiagotex/3d1dd48c26b36a5013dcbd84401f38b8
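The gist above is the working version; as a rough, simplified illustration of the GC half of the idea, the sketch below decodes stackprof's raw sample encoding (assumed here to be repeated groups of `[depth, frame_ids..., count]`, frames ordered root first), re-roots samples whose first frame resolves to `(garbage collection)` on top of the previous sample's stack, and re-encodes. The helper names and the sample data are made up for illustration:

```ruby
# Decode a stackprof-style raw array into [frames, count] pairs.
# Assumed layout: repeated groups of [depth, frame_id * depth, count].
def decode_raw(raw)
  stacks = []
  i = 0
  while i < raw.length
    depth = raw[i]
    frames = raw[i + 1, depth]
    count = raw[i + 1 + depth]
    stacks << [frames, count]
    i += depth + 2
  end
  stacks
end

# Re-encode [frames, count] pairs back into the flat raw layout.
def encode_raw(stacks)
  stacks.flat_map { |frames, count| [frames.length, *frames, count] }
end

# frame_names maps frame ids to names, like result[:frames] does.
# Samples rooted at "(garbage collection)" are re-rooted on top of the
# previous sample's stack instead of starting a new root.
def merge_gc_into_previous(raw, frame_names)
  prev_frames = []
  cleaned = decode_raw(raw).map do |frames, count|
    if frame_names[frames.first] == "(garbage collection)"
      frames = prev_frames + [frames.first]
    else
      prev_frames = frames
    end
    [frames, count]
  end
  encode_raw(cleaned)
end
```

For example, with `names = { 1 => "main", 2 => "work", 3 => "(garbage collection)" }` and `raw = [2, 1, 2, 5, 1, 3, 2]` (a `[main, work]` stack sampled 5 times, then a lone GC sample counted twice), `merge_gc_into_previous(raw, names)` returns `[2, 1, 2, 5, 3, 1, 2, 3, 2]`: the GC sample becomes `[main, work, GC]`.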
This might be relevant. Has anyone attempted to test stackprof on ruby 3.2 yet?
Linking: ruby/ruby#7784