webmozarts / console-parallelization
Enables the parallelization of Symfony Console commands.
License: MIT License
With the current code:
$pwd = $_SERVER['PWD'];
$scriptName = $_SERVER['SCRIPT_NAME'];
return str_starts_with($scriptName, $pwd)
? $scriptName
: $pwd.DIRECTORY_SEPARATOR.$scriptName;
If the script name is absolute and lies outside the working directory, e.g. /usr/local/bin/box, the working directory is prepended anyway, so the resolved script name will be a nonsensical path.
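One possible way to avoid this is to return absolute script names unchanged. The sketch below is only an illustration: resolveScriptPath() is a hypothetical helper, not part of the library, and its test for "absolute" (a leading "/" or a Windows drive letter) is an assumption.

```php
<?php

/**
 * Hypothetical helper, not part of the library: resolves the script path
 * while leaving absolute script names untouched.
 */
function resolveScriptPath(string $pwd, string $scriptName): string
{
    // Assumption: a path is absolute if it starts with "/" (Unix) or with a
    // drive letter followed by a slash or backslash (Windows).
    $isAbsolute = 1 === preg_match('~^(/|[A-Za-z]:[\\\\/])~', $scriptName);

    if ($isAbsolute) {
        return $scriptName;
    }

    // Relative script names are resolved against the working directory.
    return rtrim($pwd, '/\\').DIRECTORY_SEPARATOR.$scriptName;
}
```

With this approach, a script name that already starts with the working directory is also returned as-is, since such a name is necessarily absolute.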
Hello,
I have just had a look at this project, which seems useful. As a major version v2 is under development, I looked at it (branch main) and found that some resources are still missing from the GitHub repository (for beta 2).
Checks
Namespaces do not follow the Composer PSR-4 structure
Unable to test it in real conditions. Or did I miss something?
PS: I tried with the following composer.json file:
{
"require": {
"webmozarts/console-parallelization": "^2@dev"
}
}
Right now, fetchItems() is only called once, which is fine.
However, we also want to use the ParallelizationTrait to process queue-like structures, or at least do another run if some items couldn't be processed for any reason.
Adding a loop probably wouldn't require much refactoring: fetchItems() would be called again and again until it returns an empty result.
Optionally, fetchItems() could take an argument containing the round number (first round: 0, second round: 1, ...) so that the developer can implement a stop after the first round.
As an alternative solution, fetchItems() could remain the same (= default implementation), and an optional fetchNextItems() could be implemented to support queue processing.
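The proposed loop could look roughly like the sketch below. It is independent of the trait: processInRounds() and both callables are hypothetical names used only for illustration, and the round number is passed to the fetch callable as suggested above.

```php
<?php

/**
 * Hypothetical sketch: calls $fetchItems repeatedly until it returns an
 * empty list. This function does not exist in the library.
 *
 * @param callable $fetchItems  receives the round number (0, 1, ...) and returns an array of items
 * @param callable $processItem receives a single item
 *
 * @return int number of rounds that yielded items
 */
function processInRounds(callable $fetchItems, callable $processItem): int
{
    $round = 0;

    // Keep fetching until a round yields no items.
    while ([] !== ($items = $fetchItems($round))) {
        foreach ($items as $item) {
            $processItem($item);
        }

        ++$round;
    }

    return $round;
}

// Usage: a queue that is drained two items at a time.
$queue = ['a', 'b', 'c'];
$processed = [];

$rounds = processInRounds(
    static function (int $round) use (&$queue): array {
        return array_splice($queue, 0, 2);
    },
    static function (string $item) use (&$processed): void {
        $processed[] = $item;
    }
);
```

A command that only wants a single pass can return an empty array whenever the round number is greater than 0.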
I can provide a PR if you like.
Hi guys,
I just want to thank you for that amazing bundle, which is very likely to be part of Pimcore soon!
That's why I want to ask if there already exists a (rough) roadmap for the future releases?
If runTolerantSingleCommand() were protected, custom error handling could be added for child processes.
Currently, commands that use the Parallelization trait must be executed in bin/.
Example: cd ~/www/bin && bin/console pimcore:thumbnails:image --processes 1
-> works.
Example: cd ~www && ~/www/bin/console pimcore:thumbnails:image --processes 1
-> error message: Expected a string. Got: boolean.
Reason: $consolePath = realpath(getcwd().'/bin/console'); returns false if the script is executed from the home directory (Debian).
I don't have a solution right now, but according to https://www.php.net/manual/en/function.getcwd.php:
On some Unix variants, getcwd() will return FALSE if any one of the parent directories does not have the readable or search mode set, even if the current directory does.
I just stumbled upon your new ItemBatchIterator implementation: https://github.com/webmozarts/console-parallelization/blob/master/src/ItemBatchIterator.php
Actually, we use a very similar concept to iterate over various types of files (XML, CSV, Excel, arrays, etc.) and rely on PHP's Iterator and Countable interfaces.
interface IteratorInterface extends \Iterator, \Countable {
}
Probably it would be a good idea to also let the ItemBatchIterator implement a similar interface to make the solution very generic and allow multiple types of iterators?
See https://github.com/webmozarts/console-parallelization/blob/master/composer.json#L19:
Should it probably be "symfony/console": "^3.0 || ^4.0 || ^5.0"... instead of "symfony/console": "3.0 || ^4.0 || ^5.0"...?
As Symfony 4+ more or less deprecates injecting the whole container, wouldn't it make sense for the trait to ask for whatever it needs (a few parameters and the logger, IIRC) rather than requiring a getContainer method? That way these could be injected and used from a command registered as a service.
p is currently used as the shortcut for the number of concurrent processes that are started by the trait:
https://github.com/webmozarts/console-parallelization/blob/master/src/Parallelization.php#L139
p is also a quite commonly used input option shortcut (parent, part, ...) in existing commands.
Please consider picking another, less commonly used shortcut, or just remove the shortcut.
When writing a new logger just to decorate one or two methods, it is quite verbose to do so. It would make sense IMO to include a LoggerDecorator which:
This way the user could extend this class and override just the desired method.
$numberOfSegments = (int) ceil($numberOfItems / $segmentSize);
Assert::positiveInteger($numberOfSegments);
When double quotes are used in input options, the command outputs do not work properly anymore.
Example:
bin/console app:ecommerce:bootstrap --list-condition="o_id=20"
causes an error message, and the console output doesn't work properly anymore.
As a simple workaround we wrapped string values into quotes:
$optionString .= ' --'.$name.'='.(is_string($arrayValue) ? sprintf('"%s"', $arrayValue) : $arrayValue);
With that approach, several combinations worked:
bin/console app:ecommerce:bootstrap --list-condition="o_id=20"
bin/console app:ecommerce:bootstrap --list-condition="o_id in('20')"
bin/console app:ecommerce:bootstrap --list-condition='o_id in("20")'
We should discuss whether to add that simple improvement, or add and test some kind of escaper solution, such as
https://github.com/symfony/symfony/blob/master/src/Symfony/Component/Yaml/Escaper.php (which is unfortunately marked as "internal").
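Another option worth evaluating is PHP's built-in escapeshellarg(), which quotes a value for the current platform's shell. The sketch below assumes that the child command line is handed to a shell as a single string; buildOptionString() is a hypothetical helper and not the trait's actual code.

```php
<?php

/**
 * Hypothetical helper: serializes option values for a child command line.
 *
 * @param array $options option name => scalar value
 */
function buildOptionString(array $options): string
{
    $optionString = '';

    foreach ($options as $name => $value) {
        // escapeshellarg() wraps the value in quotes and escapes embedded
        // quotes according to the current platform's shell rules.
        $optionString .= ' --'.$name.'='.escapeshellarg((string) $value);
    }

    return $optionString;
}

echo buildOptionString(['list-condition' => 'o_id in("20")']), PHP_EOL;
```

Unlike wrapping values in double quotes by hand, this also covers values that themselves contain quote characters. The exact quoting style differs between Unix and Windows.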
Hi,
I encountered an error using this package with a number of items greater than the segment size.
The child processes return the following message on each line :
===== Process Output=========
Could not open input file: bin/console
Using realpath() for the console path fixes the problem
On Windows systems an error occurs because PHP_EOL is "\r\n" there, not "\n".
See line 109 of \Webmozarts\Console\Parallelization\ProcessLauncher (PHP_EOL should be used there instead of "\n").
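The issue does not show the affected line, so the following is only a sketch under the assumption that the code splits text into lines. Instead of depending on either "\n" or PHP_EOL, a platform-independent split accepts all common line endings. splitLines() is a hypothetical helper, not part of ProcessLauncher.

```php
<?php

/**
 * Hypothetical helper: splits text into lines regardless of whether the
 * line ending is "\r\n" (Windows), "\n" (Unix) or "\r".
 *
 * @return string[]
 */
function splitLines(string $output): array
{
    // "\r\n" comes first so that it is consumed as one separator.
    $lines = preg_split('/\r\n|\r|\n/', $output);

    // preg_split() returns false on failure; fall back to the raw text.
    return false === $lines ? [$output] : $lines;
}
```

This swaps the suggested PHP_EOL fix for a regular expression, which also handles output produced on a different platform than the one reading it.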
It's unlikely you want to log all the notices of running the command in parallel to your regular log files. However, you likely want to log the item failure with your traditional logger.
This is standard in bash and I thought it was specific to bash only, but apparently the Symfony console does the same too:
if ($this->autoExit) {
if ($exitCode > 255) {
$exitCode = 255;
}
exit($exitCode);
}
If an exception is thrown in runSingleCommand(), the child process currently stops.
There is no out-of-the-box solution to react in the parent. Is there a way to forward specific exceptions to the parent, so that the parent can handle the exceptions?
Why are items cast to strings in https://github.com/webmozarts/console-parallelization/blob/master/src/Parallelization.php#L367?
In my use cases an item is an array or a data object, which we have to serialize into a string, e.g. by applying serialize(), and afterwards unserialize in runSingleCommand().
Perhaps the serialization/unserialization part could be done by the trait out of the box?
@webmozart I have a process with many items which I can't run in multiple processes. As far as I can see there is no way to spawn child processes for every item, is there? In the steps there is a lot of work done which uses a lot of RAM. With child processes, every data usage would be removed after one run.
Is there any way to trigger the single items within child processes without having multiple processes?
Hello,
Is it possible to provide documentation for using ::getParallelExecutableFactory?
It's not clear, and I don't understand how to use it. I have many commands that used the runBeforeFirstCommand function before the upgrade.
Regards,
Louis
fidry/console
Hi. When PWD is not set in the $_SERVER variables, this error comes up:
In ParallelExecutorFactory.php line 437:
mb_strpos() expects parameter 2 to be string, null given
It is caused in this method of class ParallelExecutorFactory:
private static function getScriptPath(): string
{
    $pwd = $_SERVER['PWD'];
    $scriptName = $_SERVER['SCRIPT_NAME'];
    return 0 === mb_strpos($scriptName, $pwd)
        ? $scriptName
        : $pwd.DIRECTORY_SEPARATOR.$scriptName;
}
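A possible guard is to fall back to getcwd() when PWD is missing and to fail with a clear message if neither is available. The function below is a hypothetical standalone variant for illustration, not the library's fix. It takes the server array as a parameter so that it can be tried without touching $_SERVER.

```php
<?php

/**
 * Hypothetical variant of getScriptPath() that tolerates a missing PWD.
 *
 * @param array $server usually $_SERVER
 */
function getScriptPathFrom(array $server): string
{
    $scriptName = (string) ($server['SCRIPT_NAME'] ?? '');

    // Fall back to getcwd() when PWD is not set; getcwd() may return false.
    $pwd = $server['PWD'] ?? getcwd();

    if (!is_string($pwd) || '' === $pwd) {
        throw new RuntimeException('Could not determine the working directory.');
    }

    return 0 === strpos($scriptName, $pwd)
        ? $scriptName
        : $pwd.DIRECTORY_SEPARATOR.$scriptName;
}
```

Calling getScriptPathFrom($_SERVER) would mirror the original method while avoiding the null argument.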
Thanks, kind regards!
Tim
Sometimes you want to provide a custom argument in your command and combine it with several options.
With the current solution (v1.0.0) it is difficult to add custom arguments, as the trait already has an internal "item" argument.
protected function configure()
{
parent::configure();
$this
->setDescription('Processes the preparation and/or update-index queue.')
->addArgument('queue', InputArgument::REQUIRED | InputArgument::IS_ARRAY)
;
self::configureParallelization($this);
$this
->addOption('tenant', null, InputOption::VALUE_REQUIRED | InputOption::VALUE_IS_ARRAY)
//...
}
Error:
Cannot add an argument after an array argument.
Do you think it is possible to replace the item argument and define it as an option instead?
We need to use the argument as it is because we don't want to break compatibility in Pimcore.
Trait:
->addArgument(
'item',
InputArgument::OPTIONAL,
'The item to process'
)
Found:
OUT Failed to process the item "..." ...
This is incorrect as the item has a name and that name should be used instead.