
yjit-bench

Small set of benchmarks and scripts for the YJIT Ruby JIT compiler project, which lives in the Shopify/yjit repository.

The benchmarks are found in the benchmarks directory. Individual Ruby files in benchmarks are microbenchmarks. Subdirectories under benchmarks are larger macrobenchmarks. Each benchmark relies on a harness found in ./harness/harness.rb. The harness controls the number of times a benchmark is run, and writes timing values into an output file.
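
For illustration, a minimal microbenchmark file only needs to call the harness's run_benchmark entry point. A sketch (the iteration count passed in is just a hint, and the require line assumes the harness directory is on the load path via -I, as shown later):

# Hypothetical minimal microbenchmark, e.g. benchmarks/my_micro.rb.
require "harness"  # resolves to harness/harness.rb when run with -Iharness

run_benchmark(20) do
  # The harness times each execution of this block and records it.
  (1..100_000).reduce(:+)
end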

The optional run_benchmarks.rb script traverses the benchmarks directory and runs each benchmark it finds. It reads the output file written by the benchmarking harness and, at the end, writes the results to multiple files -- CSV, text and JSON -- so that they can be easily viewed or graphed in any spreadsheet editor.

Installation

Clone this repository:

git clone https://github.com/Shopify/yjit-bench.git yjit-bench

Benchmarking YJIT

yjit-bench supports benchmarking any Ruby implementation. But if you want to benchmark YJIT, follow these instructions to build and install YJIT.

If you installed it under chruby with the name ruby-yjit, enable it before running ./run_benchmarks.rb:

chruby ruby-yjit

Usage

To run all the benchmarks and record the data:

cd yjit-bench
./run_benchmarks.rb

This runs for a few minutes and produces a table like this in the console (the results below are not up to date):

-------------  -----------  ----------  ---------  ----------  -----------  ------------
bench          interp (ms)  stddev (%)  yjit (ms)  stddev (%)  interp/yjit  yjit 1st itr
30k_ifelse     2372.0       0.0         447.6      0.1         5.30         4.16
30k_methods    6328.3       0.0         963.4      0.0         6.57         6.25
activerecord   171.7        0.8         144.2      0.7         1.19         1.15
binarytrees    445.8        2.1         389.5      2.5         1.14         1.14
cfunc_itself   105.7        0.2         58.7       0.7         1.80         1.80
fannkuchredux  6697.3       0.1         6714.4     0.1         1.00         1.00
fib            245.3        0.1         77.1       0.4         3.18         3.19
getivar        97.3         0.9         44.3       0.6         2.19         0.98
lee            1269.7       0.9         1172.9     1.0         1.08         1.08
liquid-render  204.5        1.0         172.4      1.3         1.19         1.18
nbody          121.9        0.1         121.6      0.3         1.00         1.00
optcarrot      6260.2       0.5         4723.1     0.3         1.33         1.33
railsbench     3827.9       0.9         3581.3     1.3         1.07         1.05
respond_to     259.0        0.6         197.1      0.4         1.31         1.31
setivar        73.1         0.2         53.3       0.7         1.37         1.00
-------------  -----------  ----------  ---------  ----------  -----------  ------------

The interp/yjit column is the ratio of the average time taken by the interpreter over the average time taken by YJIT after a number of warmup iterations. Results above 1 represent speedups. For instance, 1.14 means "YJIT is 1.14 times as fast as the interpreter".
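
For example, using the fib row above: 245.3 ms / 77.1 ms ≈ 3.18, which matches its interp/yjit column.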

Specific categories

By default, run_benchmarks.rb runs all three benchmark categories, i.e. --category headline,other,micro. You can run only the benchmarks in specific categories:

./run_benchmarks.rb --category micro

You can also run only the headline benchmarks with the --headline option:

./run_benchmarks.rb --headline

Specific benchmarks

To run one or more specific benchmarks and record the data:

./run_benchmarks.rb fib lee optcarrot

Running a single benchmark

This is the easiest way to run a single benchmark. It requires no setup at all and assumes nothing about the Ruby you are benchmarking. It's also convenient for profiling, debugging, etc., especially since all benchmarked code runs in that one process.

ruby benchmarks/some_benchmark.rb

Ruby options

By default, yjit-bench benchmarks the Ruby used to run run_benchmarks.rb. If that Ruby supports the --yjit option, it compares two Ruby commands, -e "interp::ruby" and -e "yjit::ruby --yjit". However, if you specify -e yourself, you can override which Rubies are benchmarked.

# "xxx::" prefix can be used to specify a shorter name/alias, but it's optional.
./run_benchmarks.rb -e "ruby" -e "yjit::ruby --yjit"

# You could also measure only a single Ruby
./run_benchmarks.rb -e "3.1.0::/opt/rubies/3.1.0/bin/ruby"

# With --chruby, you can easily specify rubies managed by chruby
./run_benchmarks.rb --chruby "3.1.0" --chruby "3.1.0+YJIT::3.1.0 --yjit"

# ";" can be used to specify multiple executables in a single option
./run_benchmarks.rb --chruby "3.1.0;3.1.0+YJIT::3.1.0 --yjit"

YJIT options

You can use --yjit_opts to specify YJIT command-line options:

./run_benchmarks.rb --yjit_opts="--yjit-version-limit=10" fib lee optcarrot

Running pre-init code

It is possible to use run_benchmarks.rb to run arbitrary code before each benchmark run using the --with-pre-init option.

For example, to run benchmarks with GC.auto_compact enabled, create a pre-init.rb file containing GC.auto_compact = true and pass it to the benchmarks like this:

./run_benchmarks.rb --with-pre-init=./pre-init.rb

This file will then be passed to the underlying Ruby interpreter with -r.
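
For example, the pre-init file from the GC.auto_compact example above would contain just:

# pre-init.rb -- required into the benchmarked Ruby via -r before the benchmark runs
GC.auto_compact = true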

Harnesses

You can find several test harnesses in this repository:

  • harness - the normal default harness, with duration controlled by warmup iterations and time/count limits
  • harness-perf - a simplified harness that runs for exactly the hinted number of iterations
  • harness-bips - a harness that measures iterations/second until stable
  • harness-continuous - a harness that adjusts iteration batch sizes so the benchmark runs in stable-sized batches
  • harness-stats - a harness that counts method calls and loop iterations
  • harness-warmup - a harness which runs as long as needed to find warmed up (peak) performance

To use one of these, run a benchmark script directly, specifying the harness directory with -I:

ruby -Iharness benchmarks/railsbench/benchmark.rb

There is also a robust but complex CI harness in the yjit-metrics repo.

Iterations and duration

With the default harness, the number of iterations and duration can be controlled by the following environment variables:

  • WARMUP_ITRS: The number of warm-up iterations, ignored in the final comparison (default: 15)
  • MIN_BENCH_ITRS: The minimum number of benchmark iterations (default: 10)
  • MIN_BENCH_TIME: The minimum number of seconds to run the benchmark (default: 10)
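
The same variables apply when running a benchmark directly with the default harness, for example:

WARMUP_ITRS=5 MIN_BENCH_ITRS=20 MIN_BENCH_TIME=30 ruby -Iharness benchmarks/fib.rb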

You can also use --warmup, --bench, or --once to set these environment variables:

# same as: WARMUP_ITRS=2 MIN_BENCH_ITRS=3 MIN_BENCH_TIME=0 ./run_benchmarks.rb railsbench
./run_benchmarks.rb railsbench --warmup=2 --bench=3

# same as: WARMUP_ITRS=0 MIN_BENCH_ITRS=1 MIN_BENCH_TIME=0 ./run_benchmarks.rb railsbench
./run_benchmarks.rb railsbench --once

There is also a handy script for running a benchmark just once (i.e. WARMUP_ITRS=0 MIN_BENCH_ITRS=1 MIN_BENCH_TIME=0), for example with the --yjit-stats command-line option:

./run_once.sh --yjit-stats benchmarks/railsbench/benchmark.rb

Using perf

There is also a harness that uses Linux perf. By default, it just runs a fixed number of iterations. If the PERF environment variable is set, it starts the given perf subcommand after warmup.

# Use `perf record` for both warmup and benchmark
perf record ruby --yjit-perf=map -Iharness-perf benchmarks/railsbench/benchmark.rb

# Use `perf record` only for benchmark
PERF=record ruby --yjit-perf=map -Iharness-perf benchmarks/railsbench/benchmark.rb

This is the only harness that uses the num_itrs_hint argument of run_benchmark.
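
Conceptually, such a fixed-iteration harness boils down to something like this (a simplified sketch, not the actual harness-perf source):

# Simplified run_benchmark: do exactly the hinted number of iterations.
# The real harness also handles warmup and launching the perf subcommand.
def run_benchmark(num_itrs_hint)
  num_itrs_hint.times { yield }
end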

Measuring memory usage

The --rss option of run_benchmarks.rb lets you measure RSS (resident set size) after the benchmark iterations.

./run_benchmarks.rb --rss

Rendering a graph

The --graph option of run_benchmarks.rb lets you render the benchmark results as a graph.

# Write a graph to data/output_XXX.png (it will print the path)
./run_benchmarks.rb --graph

Installation

Before using this option, you may need to install the dependencies of Gruff:

# macOS
brew install imagemagick

# Ubuntu
sudo apt-get install libmagickwand-dev

Changing font size

You can regenerate a graph with misc/graph.rb, adjusting its font sizes:

Usage: misc/graph.rb [options] CSV_PATH
        --title SIZE                 title font size
        --legend SIZE                legend font size
        --marker SIZE                marker font size
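
For example, to regenerate a graph from a CSV written by run_benchmarks.rb (the font sizes here are just an illustration):

ruby misc/graph.rb --legend 20 --marker 14 data/output_XXX.csv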

Disabling CPU Frequency Scaling

To disable CPU frequency scaling with an Intel CPU, edit /etc/default/grub or /etc/default/grub.d/50-cloudimg-settings.cfg and add intel_pstate=no_hwp to GRUB_CMDLINE_LINUX_DEFAULT. It’s a space-separated list.
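
The resulting line might look like this (the quiet splash options are just a common default, not something you need to add):

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash intel_pstate=no_hwp"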

Then:

sudo update-grub
sudo reboot
sudo sh -c 'echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo'

To verify things worked:

  • cat /proc/cmdline and check that the intel_pstate=no_hwp parameter is in there
  • ls /sys/devices/system/cpu/intel_pstate/ and check that hwp_dynamic_boost does not exist
  • cat /sys/devices/system/cpu/intel_pstate/no_turbo should say 1



yjit-bench's Issues

Sequel YJIT regression

Hi.

I'm curious - what is the reason for the recent regression in Sequel benchmarks where there is no speedup from YJIT?

Create new rails edge benchmark based on lobste.rs

The goal is to create something that uses rails edge, is more representative of a "real" modern rails app, and is also more challenging, a bigger hill to climb than railsbench, to help us work on optimizing harder problems and edge cases.

Some notes from the meeting where we discussed this idea:

  • To be built directly in the yjit-bench repo (name the benchmark lobsters?)
  • Try to make it a larger app
    • Large number of routes to exercise the router
    • John Hawthorn said: make a big routing table, many routes go nowhere, increase megamorphism
    • Multiple models, exercise different shapes
    • Use ERB
    • Try with and without caching.
    • Render a long list of things, a bit of nesting
      • Posts with comments
      • Minimize I/O
      • sqlite-mem backend
    • ActiveRecord associations
    • Can we make this benchmark bigger/more complex than ruby-lsp according to --yjit-stats metrics?
      • ruby-lsp generates about 3MiB of inline code.
      • railsbench sits at ~2200 ISEQs compiled. We could aim for 4000+ or more.
  • Exercise more rails features that are commonly used
  • Could start from lobsters
    https://github.com/lobsters/lobsters
    • Aaron says they have a tool to generate fake data
    • Commit the fake data in the repo
    • Get it on rails edge
    • Exercise many endpoints, many code paths

@noahgibbs will get started in a branch
@casperisfine, @eileencodes and @jhawthorn offered to lend a hand
@maximecb is happy to join in on pairing sessions

The ruby-lsp benchmark fails on ruby master

With a fresh ruby-dev I get this error:

$ ./run_benchmarks.rb ruby-lsp --once
Running benchmark "ruby-lsp" (1/1)
/Users/rwstauner/.rubies/ruby-dev/bin/ruby -I harness /Users/rwstauner/src/github.com/Shopify/yjit-bench/benchmarks/ruby-lsp/benchmark.rb
ruby 3.4.0dev (2024-01-18T15:35:46Z master 00814fd672) [arm64-darwin23]
Command: bundle check 2> /dev/null || bundle install
The Gemfile's dependencies are satisfied
/Users/rwstauner/src/github.com/Shopify/yjit-bench/benchmarks/ruby-lsp/benchmark.rb:24:in `block in <main>': undefined method `size' for nil (NoMethodError)

  rc_last = rc.size
              ^^^^^
        from /Users/rwstauner/src/github.com/Shopify/yjit-bench/harness/harness.rb:29:in `block in run_benchmark'
        from /Users/rwstauner/.rubies/ruby-dev/lib/ruby/3.4.0+0/benchmark.rb:313:in `realtime'
        from /Users/rwstauner/src/github.com/Shopify/yjit-bench/harness/harness.rb:29:in `run_benchmark'
        from /Users/rwstauner/src/github.com/Shopify/yjit-bench/benchmarks/ruby-lsp/benchmark.rb:18:in `<main>'
Command "/Users/rwstauner/.rubies/ruby-dev/bin/ruby -I harness /Users/rwstauner/src/github.com/Shopify/yjit-bench/benchmarks/ruby-lsp/benchmark.rb" failed in directory /Users/rwstauner/src/github.com/Shopify/yjit-bench
./run_benchmarks.rb:46:in `check_call': RuntimeError (RuntimeError)
        from ./run_benchmarks.rb:280:in `block in run_benchmarks'
        from ./run_benchmarks.rb:226:in `each'
        from ./run_benchmarks.rb:226:in `each_with_index'
        from ./run_benchmarks.rb:226:in `run_benchmarks'
        from ./run_benchmarks.rb:428:in `block in <main>'
        from ./run_benchmarks.rb:427:in `each'
        from ./run_benchmarks.rb:427:in `<main>'

We are currently using ruby-lsp 0.4.1.
If I upgrade to the latest 0.13.4 and then fix up the code so it can run, the results are incorrect:

diff --git a/benchmarks/ruby-lsp/Gemfile.lock b/benchmarks/ruby-lsp/Gemfile.lock
index adcb143..34b4766 100644
--- a/benchmarks/ruby-lsp/Gemfile.lock
+++ b/benchmarks/ruby-lsp/Gemfile.lock
@@ -17,7 +17,7 @@ GEM
     parser (3.2.2.3)
       ast (~> 2.4.1)
       racc
-    prettier_print (1.2.0)
+    prism (0.19.0)
     racc (1.7.1)
     rack (3.0.4.2)
     rainbow (3.1.1)
@@ -42,14 +42,12 @@ GEM
       activesupport (>= 4.2.0)
       rack (>= 1.1)
       rubocop (>= 1.33.0, < 2.0)
-    ruby-lsp (0.4.1)
+    ruby-lsp (0.13.4)
       language_server-protocol (~> 3.17.0)
-      sorbet-runtime
-      syntax_tree (>= 6, < 7)
+      prism (>= 0.19.0, < 0.20)
+      sorbet-runtime (>= 0.5.10782)
     ruby-progressbar (1.11.0)
-    sorbet-runtime (0.5.10679)
-    syntax_tree (6.0.0)
-      prettier_print (>= 1.2.0)
+    sorbet-runtime (0.5.11205)
     tzinfo (2.0.6)
       concurrent-ruby (~> 1.0)
     unicode-display_width (2.3.0)
diff --git a/benchmarks/ruby-lsp/benchmark.rb b/benchmarks/ruby-lsp/benchmark.rb
index fe13954..d7fc514 100644
--- a/benchmarks/ruby-lsp/benchmark.rb
+++ b/benchmarks/ruby-lsp/benchmark.rb
@@ -17,19 +17,16 @@
 # These benchmarks are representative of the three main operations executed by the Ruby LSP server
 run_benchmark(200) do
   # File parsing
-  document = RubyLsp::Document.new(content)
+  document = RubyLsp::RubyDocument.new(source: content, version: 1, uri: URI(file_uri))
 
   # Running RuboCop related requests
-  rc = RubyLsp::Requests::Diagnostics.new(file_uri, document).run
+  rc = RubyLsp::Requests::Diagnostics.new(document).perform
   rc_last = rc.size
 
   # Running SyntaxTree visitor requests
-  hl = RubyLsp::Requests::SemanticHighlighting.new(
-    document,
-    encoder: RubyLsp::Requests::Support::SemanticTokenEncoder.new,
-  ).run
-  hl_last = hl.data.size
+  hl = RubyLsp::Requests::SemanticHighlighting.new(Prism::Dispatcher.new).perform
+  hl_last = hl.size
 end
 
-raise("ruby-lsp benchmark: the RuboCop diagnostics test is returning the wrong answer!") if rc_last != 34
-raise("ruby-lsp benchmark: the Semantic Highlighting test is returning the wrong answer!") if hl_last != 1160
+raise("ruby-lsp benchmark: the RuboCop diagnostics test is returning the wrong answer: #{rc_last}") if rc_last != 34
+raise("ruby-lsp benchmark: the Semantic Highlighting test is returning the wrong answer: #{hl_last}") if hl_last != 1160

It raises with:

ruby-lsp benchmark: the RuboCop diagnostics test is returning the wrong answer: 0 (RuntimeError)

Railsbench: figure out configuration that's compatible with YJIT, and preferably with other Rubies

Right now Railsbench is weirdly incompatible with everything. Specifically, Bundler can't seem to find the Digest gems - it doesn't think 3.0.1pre exists since it's a default gem of prerelease Ruby. And it can't find digest-3.0.0 to install it even though it's on Rubygems.

Just running bin/rails doesn't work because the Gemfile has to say 3.0.0 (Bundler doesn't think 3.0.1pre exists) or Ruby will activate the wrong gem first:

/Users/noah/.rubies/ruby-yjit-metrics-prod/lib/ruby/3.1.0/bundler/runtime.rb:300:in `check_for_activated_spec!': You have already activated digest 3.0.1.pre, but your Gemfile requires digest 3.0.0. Since digest is a default gem, you can either remove your dependency on it or try updating to a newer version of bundler that supports digest as a default gem. (Gem::LoadError)
	from /Users/noah/.rubies/ruby-yjit-metrics-prod/lib/ruby/3.1.0/bundler/runtime.rb:29:in `block in setup'
	from /Users/noah/.rubies/ruby-yjit-metrics-prod/lib/ruby/3.1.0/bundler/spec_set.rb:158:in `each'
	from /Users/noah/.rubies/ruby-yjit-metrics-prod/lib/ruby/3.1.0/bundler/spec_set.rb:158:in `each'
	from /Users/noah/.rubies/ruby-yjit-metrics-prod/lib/ruby/3.1.0/bundler/runtime.rb:24:in `map'

If 3.0.0 is installed manually (can't do it with bundle), we get a different error that doesn't obviously Google well or have an obvious-to-me cause:

noah@Noahs-MacBook-Pro-3 railsbench % bundle exec bin/rails console
/Users/noah/.gem/ruby/3.1.0/gems/globalid-0.4.2/lib/global_id/uri/gid.rb:176:in `<module:URI>': uninitialized class variable @@schemes in URI
Did you mean?  scheme_list (NameError)
	from /Users/noah/.gem/ruby/3.1.0/gems/globalid-0.4.2/lib/global_id/uri/gid.rb:6:in `<top (required)>'
	from /Users/noah/.gem/ruby/3.1.0/gems/activesupport-6.0.3.7/lib/active_support/dependencies.rb:324:in `require'
	from /Users/noah/.gem/ruby/3.1.0/gems/activesupport-6.0.3.7/lib/active_support/dependencies.rb:324:in `block in require'
	from /Users/noah/.gem/ruby/3.1.0/gems/activesupport-6.0.3.7/lib/active_support/dependencies.rb:291:in `load_dependency'
	from /Users/noah/.gem/ruby/3.1.0/gems/activesupport-6.0.3.7/lib/active_support/dependencies.rb:324:in `require'
	from /Users/noah/.gem/ruby/3.1.0/gems/globalid-0.4.2/lib/global_id/global_id.rb:6:in `<top (required)>'
	from /Users/noah/.gem/ruby/3.1.0/gems/activesupport-6.0.3.7/lib/active_support/dependencies.rb:324:in `require'
	from /Users/noah/.gem/ruby/3.1.0/gems/activesupport-6.0.3.7/lib/active_support/dependencies.rb:324:in `block in require'
	from /Users/noah/.gem/ruby/3.1.0/gems/activesupport-6.0.3.7/lib/active_support/dependencies.rb:291:in `load_dependency'
	from /Users/noah/.gem/ruby/3.1.0/gems/activesupport-6.0.3.7/lib/active_support/dependencies.rb:324:in `require'
	from /Users/noah/.gem/ruby/3.1.0/gems/globalid-0.4.2/lib/global_id.rb:1:in `<top (required)>'

I think the problem is that digest is still a default gem, but Bundler seems to not have been updated for that. This may be a case where we need to use an older Bundler.

run_benchmarks.rb includes all data twice

We include unscaled data, then the same data multiplied by 1000 for milliseconds. It's confusing and wasteful. But I'm not sure if anybody currently relies on it, so I'm loath to just remove it quietly.

Add compile-time benchmark

We'd like to use 30k_ifelse or similar as a compile-time benchmark. The right way to do it is probably to run with --yjit-call-threshold=1 for 1 iteration. But doing that doesn't fit well into our current harness structure. We'd like to be able to pass back arbitrary data, not just runs.
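
As a rough manual approximation today, run_once.sh (documented in the README above) can forward the flag to Ruby; a sketch, assuming a YJIT-enabled Ruby and the usual microbenchmark path:

./run_once.sh --yjit --yjit-call-threshold=1 benchmarks/30k_ifelse.rb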

So this would require a bit of harness work for both yjit-bench and yjit-metrics, but then we could use our existing 30k_ifelse benchmark as a nice compile-time benchmark and track it.

Add liquid parsing/compilation benchmark

We're currently benchmarking liquid rendering, but we haven't looked at all at the parsing/compilation of liquid templates. It would be interesting to add that as a benchmark as well. This should also be a headline benchmark.

Tagging @k0kubun since we discussed this.

Add new `liquid-c` headline benchmark

This is a suggestion from Jimmy Miller. According to him, YJIT makes Liquid-C quite a bit faster, giving an extra boost above vanilla Liquid. That makes it worthwhile to track the performance of liquid-c, which will likely behave differently from vanilla liquid.

Add Bundler benchmark

Alan points out that a "run Bundler with a big Gemfile but with all relevant gems already installed" benchmark would address a fair number of real-world compile time worries. If our compile time, or similar startup time, gets too slow then this benchmark would clearly show the real-world impact.

Verify ruby-lsp is working during run

I'm seeing results from @eightbitraptor that look a lot like the ruby-lsp benchmark is failing quickly and just not noticing, so it returns really fast, really unstable results.

So I figure I'll add some verification to ensure that it's generating the right results, for some definition of that phrase. Probably a "save the result from the last iteration, do a few basic checks on the value" sort of thing.

Find or create pure Ruby JSON parser benchmark

It could be nice to have a small JSON parser benchmark that is in pure Ruby and uses the getbyte method, so that we can make sure that YJIT can accelerate parsers written in pure Ruby.
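
For illustration, the byte-at-a-time hot loop such a parser would stress might look like this (a hypothetical sketch, not code from this repo):

require "stringio"

# Count string delimiters one byte at a time; a real parser would
# dispatch on each byte to build values instead of just counting.
def count_quote_bytes(json)
  io = StringIO.new(json)
  count = 0
  while (b = io.getbyte)
    count += 1 if b == 0x22 # '"'
  end
  count
end

count_quote_bytes('{"a": [1, 2, 3]}') # => 2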

Following a discussion with @jhawthorn in the Ruby future meeting.

The erubi_rails benchmark defines new classes in every request

This initialization code is run once for every request (verified by adding p [:FakeDiscourseController_init, object_id, self] as the first line of initialize):

def initialize
  @stub_assigns = {}

  # The topic_view seems to hold the details for *this* view of *this* topic.
  # It holds most of the interesting stubs for this view.
  topic_struct = Struct.new(:id, :slug)
  topic = topic_struct.new(4321, "lunches_forever")
  topic_view_struct = Struct.new(:topic, :title, :posts, :url, :image_url, :post_custom_fields, :link_counts, :prev_page, :next_page, :read_time, :like_count, :published_time, :page_title, :print)
  topic_view_obj = topic_view_struct.new(topic, "Lunches Forever", [], "/fake/topic/url", "https://fake.image.url/fake.png", {}, {}, nil, nil, 0.0, 0, DateTime.now, "Viewing Page", false)
  def topic_view_obj.summary(*ignored); ""; end # Summary for meta tags
  @stub_assigns[:topic_view] = topic_view_obj

  # We'll create a user and some posts and then add them to the topic view object.
  user_struct = Struct.new(:name, :username)
  u = user_struct.new("Ali S. Fakenamington", "ali_s")
  post_body = "what I had for lunch was a sandwich. But then the day before,<br/>\n" * 5 + "..."
  post_struct = Struct.new(:user, :id, :topic, :action_code, :image_url, :created_at, :version, :post_number, :hidden, :cooked, :like_count, :reply_count)
  post = post_struct.new(u, 1234, topic_view_obj, nil, "https://fake.image.url/fake.png", DateTime.now, 1, 1, false, post_body, 0, 0)
  3.times { topic_view_obj.posts << post }

  # Now add breadcrumbs and tags, which also get instance variables.
  @stub_assigns[:breadcrumbs] = [ { url: "https://placeholder", color: "blue", name: "PageDivision" } ] * 3
  tag_struct = Struct.new(:name)
  @stub_assigns[:tags] = [ tag_struct.new("tag-name") ] * 3
end

That is likely unintended, and as a result there are 10 Struct.new calls per request, which involves creating new classes, defining methods dynamically, etc.

Here is the output I got on 3.1.0:

$ cd benchmarks/erubi_rails
$ ruby -I../../harness $PWD/benchmark.rb
ruby 3.1.0p0 (2021-12-25 revision fb4df44d16) [x86_64-linux]
Calling `DidYouMean::SPELL_CHECKERS.merge!(error_name => spell_checker)' has been deprecated. Please call `DidYouMean.correct_error(error_name, spell_checker)' instead.
[:FakeDiscourseController_init, 15640, #<FakeDiscourseController:0x00000000007a30>]
[:FakeDiscourseController_init, 15660, #<FakeDiscourseController:0x00000000007a58>]
[:FakeDiscourseController_init, 15780, #<FakeDiscourseController:0x00000000007b48>]
[:FakeDiscourseController_init, 15820, #<FakeDiscourseController:0x00000000007b98>]
[:FakeDiscourseController_init, 15860, #<FakeDiscourseController:0x00000000007be8>]
[:FakeDiscourseController_init, 15900, #<FakeDiscourseController:0x00000000007c38>]
[:FakeDiscourseController_init, 15940, #<FakeDiscourseController:0x00000000007c88>]
[:FakeDiscourseController_init, 15980, #<FakeDiscourseController:0x00000000007cd8>]
[:FakeDiscourseController_init, 16020, #<FakeDiscourseController:0x00000000007d28>]
[:FakeDiscourseController_init, 16060, #<FakeDiscourseController:0x00000000007d78>]
[:FakeDiscourseController_init, 16100, #<FakeDiscourseController:0x00000000007dc8>]
[:FakeDiscourseController_init, 16140, #<FakeDiscourseController:0x00000000007e18>]
[:FakeDiscourseController_init, 16180, #<FakeDiscourseController:0x00000000007e68>]
[:FakeDiscourseController_init, 16220, #<FakeDiscourseController:0x00000000007eb8>]
... 100 times
itr #1: 76ms

A large part of the benchmark seems spent in that initialization (~17% on 3.1.0, likely more with JITs).

cc @noahgibbs

Add one `hexapdf` benchmark

Apparently YJIT already shows speedups on this project, and it's open source. It's also "real-world" software that happens to not directly be a web use case. I was thinking that we could add just one of the larger benchmarks from their repository. Generally, longer-running benchmarks make for more accurate timing.

https://github.com/gettalong/hexapdf
https://github.com/gettalong/hexapdf/tree/master/benchmark

@noahgibbs would you be ok with taking this issue? I know you're busy with the upstreaming so this can wait. Otherwise we could ask Kevin or someone else on the team.

Create erb template rendering benchmark

I've been told that the erb template rendering system generates Ruby code. I'd be curious to add an erb template rendering benchmark to see how it performs. This would be good to have, particularly since erb is used by Rails.

The template(s) we use for the benchmark should have realistic complexity, ideally based on an HTML template. We should start by checking whether erb comes with performance benchmarks.

We may be able to optimize YJIT for erb, and we may also be able to get erb to alter the code it's generating to be more YJIT friendly based on our findings.

Setup scripts for root-required benchmarks

If a benchmark in a directory (e.g. railsbench/benchmark.rb, not getivar.rb) provides a setup.rb (or setup.sh?) we should have a simple way to run the top-level runner with sudo or root and do privileged setup (and shutdown, see below.)

This is useful for e.g. installing Rails app dependencies, such as for Discourse. It would also permit starting up and shutting down Docker instances. And since we probably want to mess with Docker in between Ruby configs (e.g. run YJIT versus no-JIT for a Rails app), that means we'll need the runner to retain root privileges. It's not a one-and-done install.

We may also want to assume benchmarks with a setup script require sudo/root, and just not run them if you don't have it. Or we could allow the benchmark to specify that it's root-required some other way, not just "if it has a setup script." If we can tell it needs root, we won't need to change how run_benchmarks.rb works -- if we run run_benchmarks.rb without root, it would keep running all non-root-required benchmarks like it does now. If we ran it with sudo, it would also pick up the only-possible-with-sudo benchmarks.

We'd want to make sure the error message is decent if they specify a sudo-only benchmark without using sudo. But that's easy enough.

Need to fix hexapdf size detection

I'm using an extremely rough metric for correct hexapdf output. It's definitely nondeterministic. Apparently the current metric is occasionally wrong and hexapdf generates a size we don't like. That makes sense -- I previously just ran it a bunch of times to get "typical" sizes and then added a check.

During a benchmark run today, I saw this once:

itr #1: 2718ms
/Users/noah/yjit/yjit-bench/benchmarks/hexapdf/benchmark.rb:43:in `block in <main>': Incorrect size 569799 for file /tmp/hexapdf-result-001.pdf! (RuntimeError)
	from /Users/noah/yjit/yjit-bench/benchmarks/hexapdf/benchmark.rb:41:in `each'
	from /Users/noah/yjit/yjit-bench/benchmarks/hexapdf/benchmark.rb:41:in `<main>'
Exception in benchmark: nil, Ruby: ruby-yjit-metrics-prod, Error: RuntimeError / "Failure in benchmark test harness, exit status: 1"

I'll look into it and see if I can get a more solid bound on sizes and/or other ways to detect correct vs incorrect output.

Bundler version in Gemfile.lock should be compatible with earlier Rubies

I used the Bundler built into YJIT (Ruby 3.1 prerelease) when updating the Gemfile.lock last time, and Bundler gets picky about that. It's a much better idea to retain compatibility with other Rubies.

It would be nice to just allow any Bundler version, but that requires either hand-editing Gemfile.lock on every commit or using a very old Bundler version exclusively when updating it. Both "fixes" are quite annoying. On the plus side, Gemfile.lock is really simple and hand-editable.

Anyway: for now I'll pick an older Bundler version and specify it in Gemfile.lock.

Better solution for running benchmarks once

Often we want to run benchmarks for just a single iteration to gather some stats or test something.

Currently, our solution for this is ad hoc. Some benchmarks like optcarrot and railsbench have a run_once.rb script, which duplicates some of the code in benchmark.rb. This is problematic because not only does it duplicate code, it's also missing the logic for auto-installing dependencies in railsbench, for example.

Example:
https://github.com/Shopify/yjit-bench/blob/main/benchmarks/railsbench/run_once.rb

ruby --yjit-stats benchmarks/railsbench/run_once.rb

Maybe a better solution would be to have a run-once harness, so we can do something like:

ruby --yjit-stats -I./harness-once benchmarks/railsbench/benchmark.rb
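
Such a harness could be tiny; conceptually (a sketch, not existing code):

# harness-once/harness.rb (hypothetical): run the block exactly once,
# equivalent to WARMUP_ITRS=0 MIN_BENCH_ITRS=1 MIN_BENCH_TIME=0.
def run_benchmark(_num_itrs_hint)
  yield
end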

@noahgibbs maybe you have an opinion on this?

nice is not applied, railsbench fails to bundle install

Hello,

I've been trying to run this and noticed that the generated command:

setarch x86_64 -R nice -20 taskset -c 11 ruby --yjit -I ./harness benchmarks/optcarrot/benchmark.rb

actually causes the process to have a nice level of 19, the lowest priority, when tried locally on Linux.

It seems one needs nice -n -40 (can be tested with nice -n -40 sleep 1000) to actually use a negative nice level, and that also needs sudo (otherwise: nice: cannot set niceness: Permission denied and it has no effect).

BTW, I also noticed run_benchmarks.rb uses 4 spaces of indentation, which is quite unusual for Ruby. I'd make a PR for it, but it's probably best if you change it to avoid any conflicts.

Finally, my run failed with:

$ ./run_benchmarks.rb
...
Running benchmark "railsbench" (12/13)
setarch x86_64 -R nice -20 taskset -c 11 ruby --yjit -I ./harness benchmarks/railsbench/benchmark.rb
Could not find concurrent-ruby-1.1.8 in any of the sources
Run `bundle install` to install missing gems.
./run_benchmarks.rb:13:in `check_call': RuntimeError (RuntimeError)
	from ./run_benchmarks.rb:220:in `block in run_benchmarks'
	from ./run_benchmarks.rb:188:in `each'
	from ./run_benchmarks.rb:188:in `each_with_index'
	from ./run_benchmarks.rb:188:in `run_benchmarks'
	from ./run_benchmarks.rb:282:in `<main>'
zsh: exit 1     ./run_benchmarks.rb

So I guess one should run bundle install in benchmarks/railsbench first (either documented in the README or done by the harness).
bundle install currently fails there, though, due to mimemagic 0.3.5 being yanked.

Jekyll fails with latest prerelease Ruby due to taint-mode checking

The most recent Ruby removes support for taint/untaint (https://stackoverflow.com/questions/12165664/what-are-the-rubys-objecttaint-and-objecttrust-methods). That's fine, they're very rarely used.

But some gems support taint-mode, which means they use those APIs, which means they break with latest prerelease Ruby. For instance, Liquid. Liquid uses taint mode up through version 4.0.3. Latest released Jekyll constrains Liquid to be ~>4.0. So: every released version of the Jekyll gem is broken with prerelease Ruby after around 27th Dec.

The Jekyll benchmark has been difficult in a number of ways, and I'm going to handle this, short-term, by turning off Jekyll in yjit-metrics. But we'll probably want a longer-term solution of some kind for yjit-bench.

GraphQL benchmark

GraphQL is important to Shopify. It's a good idea to have a benchmark for that.

The OptCarrot benchmark used in this repo leaks Fibers

I noticed the OptCarrot benchmark used in this repo does something quite unusual:

run_benchmark(5) do
  rom_path = File.join(__dir__, "examples/Lan_Master.nes")
  argv = ["--headless", "--frames", 200, "--no-print-video-checksum", rom_path]
  nes = Optcarrot::NES.new(argv).run
end

That creates a new Optcarrot::NES every time, which also means it leaks one Fiber per instance/run.
CRuby has the feature/behavior/bug of hard-killing (skipping ensure code in) Fibers when they are no longer reachable (not sure how reliable that is, though), but in general that's not really feasible without native coroutines: oracle/truffleruby#959 (comment)
And it's a proper mess semantically speaking.

I would suggest to either:

  • Run OptCarrot as it is meant to be run (AFAIK), i.e., try to achieve >= 60 FPS: reuse the same Optcarrot::NES instance and keep rendering more frames. The OptCarrot README shows a 3000-frame plot to show peak performance. Lan_Master.nes settles into a consistent frame loop after some startup.
  • Fix OptCarrot to actually dispose of the PPU Fiber so that the benchmark is more sensible to run with multiple Optcarrot::NES instances. That still means some extra I/O etc. (to load the cartridge) and maybe other issues, as the code seems not designed to be run like that.

Many deoptimizations/invalidations when a benchmark ends

I've noticed this on TruffleRuby:

[engine] opt inval. id=1137  Integer#==                                                                                                                                                         |Timestamp 15414910326666|Src (core):1|Reason hierarchy is unmodified Array

And many many other invalidations after the last iteration was measured.

So the good news is it shouldn't affect the captured times.
The bad news is this may cause the process to take a while longer to terminate, may cause more memory to be used due to extra invalidations & compilations, etc.

Given that the above means a module was included or prepended into Integer, this is almost certainly a side effect of requiring 'json' so late:

require "json"

Maybe we could just serialize to json ourselves to avoid this, given the data we dump is so simple (RUBY_DESCRIPTION and floats). Any thoughts on that?

Requiring json early doesn't seem good, especially for smaller benchmarks.
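
A hand-rolled serializer could be as small as this (a sketch; it assumes the payload really is just RUBY_DESCRIPTION, which is ASCII, plus an array of float timings):

# Hypothetical replacement for the harness's JSON dump. String#inspect
# on an ASCII-only string happens to be valid JSON for this simple case.
def dump_results(times)
  "{\"ruby\":#{RUBY_DESCRIPTION.inspect},\"times\":[#{times.map(&:to_f).join(",")}]}"
end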

Of course it's not great that the json gem does such things (https://github.com/flori/json/blob/248bc5bf7ea597d7a20396a8def855a1988bf490/lib/json/common.rb#LL69C11-L69C18 and https://github.com/flori/json/blob/248bc5bf7ea597d7a20396a8def855a1988bf490/lib/json/pure/generator.rb#L285), but I don't think we can do anything about that here. In fact, if it did not include modules but just defined the methods directly, it wouldn't be an issue for per-class & method-name caches like TruffleRuby's (and I think CRuby's too).

Add a Railsbench-style benchmark using latest edge Rails

We should add a benchmark that tracks latest Edge Rails. That way we'll notice if YJIT has a large performance change relative to Rails, and whether Rails has added code we do especially well or poorly on.

Previous issue description

Railsbench: update to Rails 6.1

This will require updating the Gemfile and doing a "bundle update" at a minimum. There may be other changes as well.

Rails 6.1 made some changes to template rendering that may be relevant to us (rails/rails#43713). Thanks for pointing this out, @byroot!

Renaming the optcarrot benchmark or measuring only rendering and not startup/shutdown

The optcarrot benchmark in this repo:
https://github.com/Shopify/yjit-bench/blob/main/benchmarks/optcarrot/benchmark.rb
measures something quite different from the original:
https://github.com/mame/optcarrot#benchmark-example

The original benchmark clearly measures the FPS/frames-per-second.

The benchmark in this repo instead measures:

  • Create a new NES and associated objects
  • load the cartridge from disk
  • emulate 3 seconds/180 frames
  • stop the emulator

as a single measurement.

I think this is so different that it's worth making that clear in the benchmark name.
Maybe optcarrot-start-stop or something like that?

Of course another option would be to run optcarrot as the original benchmark, measuring FPS.
That would be IMHO a better fix, instead of running the benchmark in a non-standard way.

I understand this other way might make the benchmark a little more stable (#34), but it also makes it much less realistic; who plays for 3 seconds and then switches off the NES?
IMHO the workload stabilizes soon enough, so with a bit of warmup it's not much of an issue to measure FPS and keep the NES instance running, as done in https://github.com/oracle/truffleruby/blob/master/bench/optcarrot/optcarrot.rb or in the original benchmark: https://github.com/mame/optcarrot/blob/22838002b9f091bc0c27520ecaa6daff9683a212/lib/optcarrot/nes.rb#L89
As you can see at https://eregon.me/blog/2016/11/28/optcarrot.html, the FPS for CRuby is very stable.
