Base library for a lexer that can be used in Top-Down, Recursive Descent Parsers.
This lexer is used in Doctrine Annotations and in Doctrine ORM (DQL).
Home Page: https://www.doctrine-project.org/projects/lexer.html
License: MIT License
For some reason this repository is not tested on CI.
Hello,
I'm hitting a critical error with version 1.1.0.
It seems to fail when the application initializes.
[2019-08-01 11:20:41] php.CRITICAL: preg_split() expects parameter 2 to be string, float given {"exception":"[object] (Symfony\Component\Debug\Exception\FatalThrowableError(code: 0): preg_split() expects parameter 2 to be string, float given at /var/www/clients/client1/web8/web/sfserv2018/vendor/doctrine/lexer/lib/Doctrine/Common/Lexer/AbstractLexer.php:255)"} []
[2019-08-01 11:20:41] request.CRITICAL: Uncaught PHP Exception TypeError: "preg_split() expects parameter 2 to be string, float given" at /var/www/clients/client1/web8/web/sfserv2018/vendor/doctrine/lexer/lib/Doctrine/Common/Lexer/AbstractLexer.php line 255 {"exception":"[object] (Symfony\Component\Debug\Exception\FatalThrowableError(code: 0): preg_split() expects parameter 2 to be string, float given at /var/www/clients/client1/web8/web/sfserv2018/vendor/doctrine/lexer/lib/Doctrine/Common/Lexer/AbstractLexer.php:255)"} []
Hi,
Line 255 contains preg_split($this->regex, $input, -1, $flags).
It should be preg_split($this->regex, (string) $input, -1, $flags), due to PHP 7.3 type casting.
Could you fix this, please?
Thanks
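The cast proposed above can be tried in isolation. This is a minimal sketch, assuming the calling file declares strict_types=1 (the mode in which PHP raises a TypeError when a float is passed to a string parameter of an internal function). The function name splitInput and the regex are hypothetical and are not taken from AbstractLexer.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of doctrine/lexer: splits the input into
 * [value, offset] pairs after casting it to a string.
 *
 * @param mixed $input
 */
function splitInput(string $regex, $input): array
{
    $flags = PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_OFFSET_CAPTURE;

    // Without the (string) cast, a float argument triggers a TypeError here
    // when strict types are enabled.
    return preg_split($regex, (string) $input, -1, $flags);
}

// Illustrative regex: capture numbers, skip whitespace.
$matches = splitInput('/([0-9]+(?:\.[0-9]+)?)|\s+/', 1.5);
```

The cast only hides the symptom at the call site. Whether the lexer should accept non-string input at all is a separate design question.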
When you have a lexer with formats containing multiple characters the reset position doesn't work as expected. Position is an internal pointer to the position of the lexer in the tokens array and not to the position as exposed in the token itself.
private function parseNamedReference(): string
{
    $startPosition = $this->lexer->token['position'];
    while ($this->lexer->moveNext()) {
    }

    $this->lexer->resetPosition($startPosition);
    $this->lexer->moveNext();
    $this->lexer->moveNext();
}
In the example above I would expect resetPosition to take me back to the position on method entry. But since my tokens span multiple characters, this doesn't work.
A fix would be to set the index of each token. Like this:
$this->tokens[$match[1]] = [
    'value' => $match[0],
    'type' => $type,
    'position' => $match[1],
];
However, this would break the step process that uses $this->position++.
Another solution could be to keep a map between the token position and its location in the tokens array. This would have an impact on memory usage, since it would require an extra array of integers.
I would be happy to provide a patch to fix this issue, but I would like some guidance on what is expected in this library. Any change in resetPosition would be a breaking change, as it would change the behavior of this lib.
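The map idea described above could look roughly like the following. This is a minimal sketch under stated assumptions: TokenCursor, resetToOffset, and next are hypothetical names for a standalone class, not the AbstractLexer API, and the token arrays are hand-written sample data.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for a lexer's token list; not the AbstractLexer API.
final class TokenCursor
{
    /** @var array<int, array{value: string, position: int}> */
    private $tokens;

    /** @var array<int, int> character offset => index in $tokens */
    private $indexByOffset = [];

    /** @var int index of the next token to return */
    private $index = 0;

    public function __construct(array $tokens)
    {
        $this->tokens = array_values($tokens);
        foreach ($this->tokens as $i => $token) {
            // The extra integer array mentioned above: one entry per token.
            $this->indexByOffset[$token['position']] = $i;
        }
    }

    public function resetToOffset(int $offset): void
    {
        if (!isset($this->indexByOffset[$offset])) {
            throw new InvalidArgumentException("No token starts at offset $offset");
        }
        $this->index = $this->indexByOffset[$offset];
    }

    public function next(): ?array
    {
        return $this->tokens[$this->index++] ?? null;
    }
}

$cursor = new TokenCursor([
    ['value' => 'foo', 'position' => 0],
    ['value' => '::', 'position' => 3],
    ['value' => 'bar', 'position' => 5],
]);

// Consume everything, then jump back using a character offset.
$cursor->next();
$cursor->next();
$cursor->next();
$cursor->resetToOffset(3);
$token = $cursor->next();
```

Stepping still uses a plain integer index, so the $this->position++ style of advancing is unaffected. Only the reset path consults the map.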
Today, getCatchablePatterns should return an array, and the matching values are passed into getType.
It could be really handy to have an indexed array and pass the index of the matching pattern into the getType function, like this:
In this example, I used the documented DQL lexer (not sure if the code is 100% right, but it's for the example)
  protected function getCatchablePatterns()
  {
      return [
-         '[a-z_][a-z0-9_]*\:[a-z_][a-z0-9_]*(?:\\\[a-z_][a-z0-9_]*)*', // aliased name
-         '[a-z_\\\][a-z0-9_]*(?:\\\[a-z_][a-z0-9_]*)*', // identifier or qualified name
-         '(?:[0-9]+(?:[\.][0-9]+)*)(?:e[+-]?[0-9]+)?', // numbers
-         "'(?:[^']|'')*'", // quoted strings
-         '\?[0-9]*|:[a-z_][a-z0-9_]*', // parameters
+         'aliasedName' => '[a-z_][a-z0-9_]*\:[a-z_][a-z0-9_]*(?:\\\[a-z_][a-z0-9_]*)*', // aliased name
+         'idOrQualifiedName' => '[a-z_\\\][a-z0-9_]*(?:\\\[a-z_][a-z0-9_]*)*', // identifier or qualified name
+         'numbers' => '(?:[0-9]+(?:[\.][0-9]+)*)(?:e[+-]?[0-9]+)?', // numbers
+         'quotedString' => "'(?:[^']|'')*'", // quoted strings
+         'parameters' => '\?[0-9]*|:[a-z_][a-z0-9_]*', // parameters
      ];
  }

  protected function getType(&$value, $patternIndex)
  {
      $type = self::T_NONE;

      switch (true) {
          // Recognize numeric values
-         case (is_numeric($value)):
+         case ('numbers' === $patternIndex):
              if (strpos($value, '.') !== false || stripos($value, 'e') !== false) {
                  return self::T_FLOAT;
              }

              return self::T_INTEGER;

          // Recognize quoted strings
-         case ($value[0] === "'"):
+         case ('quotedString' === $patternIndex):
              $value = str_replace("''", "'", substr($value, 1, strlen($value) - 2));

              return self::T_STRING;

          // Recognize identifiers, aliased or qualified names
-         case (ctype_alpha($value[0]) || $value[0] === '_' || $value[0] === '\\'):
+         case ('idOrQualifiedName' === $patternIndex || 'aliasedName' === $patternIndex):
              $name = 'Doctrine\ORM\Query\Lexer::T_' . strtoupper($value);

              if (defined($name)) {
                  $type = constant($name);

                  if ($type > 100) {
                      return $type;
                  }
              }

              if (strpos($value, ':') !== false) {
                  return self::T_ALIASED_NAME;
              }

              if (strpos($value, '\\') !== false) {
                  return self::T_FULLY_QUALIFIED_NAME;
              }

              return self::T_IDENTIFIER;

          // Recognize input parameters
-         case ($value[0] === '?' || $value[0] === ':'):
+         case ('parameters' === $patternIndex):
              return self::T_INPUT_PARAMETER;

          // Recognize symbols
          case ($value === '.'):
              return self::T_DOT;
          case ($value === ','):
              return self::T_COMMA;
          case ($value === '('):
              return self::T_OPEN_PARENTHESIS;
          case ($value === ')'):
              return self::T_CLOSE_PARENTHESIS;
          case ($value === '='):
              return self::T_EQUALS;
          case ($value === '>'):
              return self::T_GREATER_THAN;
          case ($value === '<'):
              return self::T_LOWER_THAN;
          case ($value === '+'):
              return self::T_PLUS;
          case ($value === '-'):
              return self::T_MINUS;
          case ($value === '*'):
              return self::T_MULTIPLY;
          case ($value === '/'):
              return self::T_DIVIDE;
          case ($value === '!'):
              return self::T_NEGATE;
          case ($value === '{'):
              return self::T_OPEN_CURLY_BRACE;
          case ($value === '}'):
              return self::T_CLOSE_CURLY_BRACE;

          // Default
          default:
              // Do nothing
      }

      return $type;
  }
I can of course open a PR on that if you are OK with the idea.
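One way a lexer could learn which keyed pattern matched is to wrap each pattern in a named capture group. The sketch below is an assumption about a possible approach, not how AbstractLexer works today. The function matchedPatternKey is hypothetical, and the patterns are simplified from the example above.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: returns the key of the first pattern that matches
 * the whole value, or null when none does.
 *
 * Assumes keys are valid group names and patterns never match the empty
 * string or contain the "~" delimiter.
 *
 * @param array<string, string> $patterns
 */
function matchedPatternKey(array $patterns, string $value): ?string
{
    $groups = [];
    foreach ($patterns as $key => $pattern) {
        $groups[] = sprintf('(?P<%s>%s)', $key, $pattern);
    }

    $regex = '~^(?:' . implode('|', $groups) . ')$~i';
    if (preg_match($regex, $value, $match) !== 1) {
        return null;
    }

    // Groups that did not take part in the match are empty or absent.
    foreach (array_keys($patterns) as $key) {
        if (($match[$key] ?? '') !== '') {
            return $key;
        }
    }

    return null;
}

$patterns = [
    'numbers' => '[0-9]+(?:\.[0-9]+)?',
    'quotedString' => "'(?:[^']|'')*'",
    'parameters' => '\?[0-9]*|:[a-z_][a-z0-9_]*',
];

$numberKey = matchedPatternKey($patterns, '42');
$stringKey = matchedPatternKey($patterns, "'it''s'");
$paramKey  = matchedPatternKey($patterns, ':name');
$symbolKey = matchedPatternKey($patterns, '*');
```

Symbols such as `*` match none of the keyed patterns, so a getType built on this idea would still need the value-based cases for them.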
Could you please tag a release on https://packagist.org/ that supports PHP 7.1, at commit db47018?
Many applications still run on Symfony 3.4 and PHP 7.1.
Upgrading from 1.0.0 to 1.1.0 led to unexpected breaks because of the new PHP version requirement.
Per https://www.doctrine-project.org/projects/lexer.html, it shows that the latest minor releases are not maintained. Is this true? 1.2.0 was released a few months ago. Maybe the Doctrine website needs to be updated?
After upgrading dependencies, which moved this package from 1.2.1 to 1.2.2, phpstan now complains about invalid types for a method call.
I commented directly on the commit: 863bff4#r67532655
In the TYPO3 implementation we have a call that uses the Lexer's public class constants, which are integers, to check things, like:
$isNextToken = $lexer->isNextTokenAny([Lexer::T_INDEX, Lexer::T_KEY]);
phpstan now complains because this does not match the docblock type of the param, which changed from array to string[].
I'm not sure whether the docblock should be changed to int[]|string[] or mixed[] to mitigate this, and I'm not sure about the whole implementation. If token keywords should be strings, the class constants should be changed too.
Example error message from phpstan:
1177 Parameter #1 $tokens of method
Doctrine\Common\Lexer\AbstractLexer::isNextTokenAny() expects
array<string>, array<int, int> given.
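For illustration, a docblock that accepts both integer and string token types might look like the sketch below. The function isAnyOf and the constants T_INDEX and T_KEY with their values are hypothetical stand-ins, not the actual AbstractLexer method or the TYPO3 constants.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not AbstractLexer::isNextTokenAny().
 *
 * @param int|string|null   $lookaheadType type of the next token, if any
 * @param array<int|string> $types         accepted token types
 */
function isAnyOf($lookaheadType, array $types): bool
{
    // Strict comparison: the integer 102 and the string '102' are different types.
    return $lookaheadType !== null && in_array($lookaheadType, $types, true);
}

// Illustrative integer constants, mirroring integer-typed token constants.
const T_INDEX = 101;
const T_KEY = 102;

$isNextToken = isAnyOf(T_KEY, [T_INDEX, T_KEY]);
```

With a union element type in the docblock, a static analyser would accept both an array of integer constants and an array of strings at the call site.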