Comments (24)
On our codebase it only takes 3.5 minutes to run 8x in parallel but it takes 12 minutes to do it serially. I'm sure with inbuilt support for parallelizing it can get even better.
Adding built-in support for parallel execution requires a lot of changes. If this wasn't the case we would have done it by now. :-) Frankly, I doubt we'll every get there - I'm certainly not eager to work on this myself.
from rubocop.
Any chance of revisiting this decision?
$ rubocop --version
0.44.1
$ rubocop -L | time xargs -P`sysctl -n hw.ncpu` -n250 rubocop -a
...
221.84 real 1473.85 user 21.34 sys
$ rubocop -L | time xargs -P`sysctl -n hw.ncpu` -n250 rubocop -a
...
216.42 real 1469.30 user 20.55 sys
$ time rubocop -a
...
real 11m57.188s
user 11m43.502s
sys 0m11.112s
On our codebase it only takes 3.5 minutes to run 8x in parallel but it takes 12 minutes to do it serially. I'm sure with inbuilt support for parallelizing it can get even better.
from rubocop.
Any chance of reopening this? One of my projects has ~2300 files. It takes around 2 minutes to run Rubocop currently with our configuration on my 2.6 GHz Intel Core i7 Macbook Pro, and only using one out of 8 cores seems suboptimial.
from rubocop.
Is there any interest in revisiting this? Our codebase has around 1,600 files to check, and writing a quick script to split the file list into several groups and check each group in a separate process roughly halves the time it takes to check our codebase (1m30s down to 45s).
from rubocop.
I'm going to run some benchmarks!
from rubocop.
Taking a look, I think that the time it takes to run the cops is trivial compared to the I/O time to read each file. The performance was actually worse when trying to parallelize the cops (this may be different with JRuby).
However, parallelizing the file reads, I was able to check the rubocop source 8 seconds faster by switching 2 lines of code (using the Parallel gem):
# ...
require 'parallel'
# ...
#target_files(args).each do |file|
Parallel.each(target_files(args)) do |file|
# ...
For some reason it's not incrementing the files checked when it runs in parallel.
With a little more tinkering, I may be able to set it up so that the -p flag would use the parallel method. This would allow the brave people to use it while not causing problems for anyone else.
from rubocop.
This sounds very promising. I support your suggestion about the -p flag.Β
β
Cheers,
Bozhidar
On Sat, May 4, 2013 at 9:04 PM, Ryan Boland [email protected]
wrote:
Taking a look, I think that the time it takes to run the cops is trivial compared to the I/O time to read each file. The performance was actually worse when trying to parallelize the cops (this may be different with JRuby).
However, parallelizing the file reads, I was able to check the rubocop source 8 seconds faster by switching 2 lines of code (using the Parallel gem):require 'parallel' # ... #target_files(args).each do |file| Parallel.each(target_files(args)) do |file| #...
For some reason it's not incrementing the files checked when it runs in parallel.With a little more tinkering, I may be able to set it up so that the -p flag would use the parallel method. This would allow the brave people to use it while not causing problems for anyone else.
Reply to this email directly or view it on GitHub:
#117 (comment)
from rubocop.
Btw, @bolandrm, you might also take a look at Celluloid. I haven't used it, but I've hearing it's a pretty good gem if you're looking for a higher level threading API.
from rubocop.
@bolandrm If you use Parallel, you should use Parallel.map
and return report
at the end of the block.
That way you can loop Report#entries when processing is finished and get the correct number of offenses :)
Don't try to change non-local variables in the block, that causes trouble.
What's the speedup if you use processes instead of threads?
from rubocop.
@bolandrm See jurriaan/rubocop@de02724b75d1 for how to fix incrementing files and offences
from rubocop.
@jurriaan Good call, thanks.
I was thinking that Parallel used processes by default (if possible on the particular system). Their documentation isn't that great. I was going to check out Celluloid.
from rubocop.
Celluloid looks interesting, never used it though. Looking forward to your results!
from rubocop.
You really do not need Celluloid here... Parallel is more than enough. I would suggest simply marshalling the offences, then using Parallel#map and then displaying them in the main process.
from rubocop.
@bolandrm Are you working on this right now? I could look into this if you want to?
from rubocop.
@jurriaan If you can cook something up soon it would be great. I would have done that myself by now, but I'm currently tied up porting cops to Parser.
from rubocop.
@jurriaan please feel free to take this over. I was planning on finishing it eventually but I'm really busy at the moment.
On May 15, 2013, at 1:48 AM, Bozhidar Batsov [email protected] wrote:
@jurriaan If you can cook something up soon it would be great. I would have done that myself by now, but I'm currently tied up porting cops to Parser.
β
Reply to this email directly or view it on GitHub.
from rubocop.
I'll be closing this one, since @edzhelyov and I have an idea that would render the need for running cops in parallel obsolete.
from rubocop.
Fine, I didn't have the time to implement it, and the code was changing too much for me. What's the new idea? :)
from rubocop.
We'll traverse the AST only once in a dispatch-like class, that would propagate the parser events to the interested cops(all cops that are implemented as parser processors). Since most of the execution time is currently spend in traversing the AST over and over in each cop that approach should speed up things significantly. There are a few cops for which this is not applicable, but overall there should be a huge speed improvement.
from rubocop.
Sounds great! π
from rubocop.
@bbatsov π I was also thinking the behavior βtraversing the AST over and over in each copβ wastes too much CPU. Though, in exchange for the waste, we could implement each cop without considering side effects on other cops.
Actually I was about to refactor the inspection and parsing logic of CLI
, but probably I should leave them for now?
from rubocop.
The inspection and parsing logic should definitely be moved out of the CLI
anyways. Probably you should discuss your refactoring plans with @edzhelyov. Maybe it would be best if you wrote an email about them on the mailing list.
from rubocop.
@jcoglan What about the caching mechanism introduced in v0.34.2? Doesn't that solve your problem?
from rubocop.
@jonas054 Yes, I've tried that now and it makes a huge difference. Thanks :)
from rubocop.
Related Issues (20)
- False positive of `Style/RedundantLineContinuation`
- Cop idea: Disallow having `yield` in a `lambda`
- Regression in v.1.63.3 HOT 6
- False positive `Lint/Syntax` with ruby 3.3.1 and anonymous rest arg/keyword argument forwarding HOT 2
- False positive for `Layout/HeredocIndentation` with `parser_prism` HOT 2
- False positive for `Layout/LineEndStringConcatenationIndentation` with Prism
- `Lint/ShadowingOuterLocalVariable` does not take into account `if`/`else` branches HOT 1
- `class Foo; bar do` will crash Metrics/BlockLength HOT 2
- `Style/ArgumentsForwarding` enforce anonymous bad autocorrect
- regression in 1.63.4 -- "Could not locate Gemfile" HOT 2
- False Positive for `Lint/ImplicitStringConcatenation` with prism HOT 1
- `Style/HashEachMethods` mistakenly reported for zipped arrays HOT 2
- New cops warning still show up even after they have been configured
- `cop.documentation_url` is nil for extension cops
- Question about Lint/ToEnumArguments HOT 1
- Metrics/AbcSize with `auto-gen-only-exclude` option
- Unexpected change to Style/Copyright from #12804 HOT 4
- uninitialized constant Bundler::SharedHelpers (NameError) HOT 13
- `Style/MapIntoArray` misinterprets my class design HOT 3
- Rubocop::Config#gem_version_to_major_minor_float doesn't work for Rails 8.0
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
π Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. πππ
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google β€οΈ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from rubocop.