Comments (11)
Presumably the reason for using regex to get the length of the string was to prevent a dep on mb_string
yet allow unicode characters. You could do something like this instead:
if ($this->maximumKeyLength !== Capabilities::UNLIMITED_KEY_LENGTH
&& count(preg_split('/./u', $key, null)) > $this->maximumKeyLength
) {
This is much slower than preg_match
but at least works. For a key 10000 😄's long and 10000 iterations:
preg_split
is ~3spreg_match
is ~0.001s
Doing a length comparison:
mb_strlen
is ~0.3sstrlen
is ~0.0001s
Fixing this problem for the .1% of users who need massive key sizes will negatively impact the 99.9% majority speed-wise.
Perhaps something like this (not unicode accurate but gives a conservative limit). I'm not aware of a PCRE constant that provides a concrete limit to quantifier limits.
protected const UNICODE_KEY_LENGTH_CHECK_LIMIT = 3000;
if ($this->maximumKeyLength !== Capabilities::UNLIMITED_KEY_LENGTH) {
if (
$this->maximumKeyLength > self::UNICODE_KEY_LENGTH_CHECK_LIMIT
&& strlen($key) > $this->maximumKeyLength
) {
// Exception
} else if (preg_match('/^.{'.($this->maximumKeyLength + 1).',}/u', $key)) {
// Exception
}
}
Another approach would be to remove the check entirely for key lengths over a certain size. Rational being that you require modifying the default memory_limit
and have a niche use-case so are likely technical enough to enforce key length yourself. Also, the underlying adapter would throw an exception should you exceed its limits.
protected const UNICODE_KEY_LENGTH_CHECK_LIMIT = 3000;
if (
$this->maximumKeyLength !== Capabilities::UNLIMITED_KEY_LENGTH
&& strlen($key) < self::UNICODE_KEY_LENGTH_CHECK_LIMIT
&& preg_match('/^.{'.($this->maximumKeyLength + 1).',}/u', $key)
) {
// Exception
}
from laminas-cache.
Running into the same issue:
PHP Warning: preg_match(): Compilation failed: number too big in {} quantifier at offset 9 in /vendor/laminas/laminas-cache/src/Psr/SimpleCache/SimpleCacheDecorator.php on line 338
- laminas/laminas-cache-storage-adapter-redis 1.2.0
- laminas/laminas-cache 2.12.1
from laminas-cache.
Reading the PSR-16 specifications it seems the key should not be bigger then 64, This was previously imposed in validateKey
method
// explicit verification
https://github.com/laminas/laminas-cache/blob/2.11.x/src/Psr/SimpleCache/SimpleCacheDecorator.php
// new capabilities verification
https://github.com/laminas/laminas-cache/blob/2.12.x/src/Psr/SimpleCache/SimpleCacheDecorator.php#L337
The fix that I would suggest is to not allow more than 64 characters even if the storage capabilities allows it.
from laminas-cache.
Another issue that I see is that now Storages that may have capability less then 64 characters as maximum length will trigger an exception "The maximum key length capability must allow at least 64 characters".
If the storage must support at least 64 bytes in key and the PSR-16 indicates that the key must not be more than 64 bytes ... then it does not make any sense to have a property that stores the maximumKeyLength
. If the storage is validated, that means the hard limit is by default 64.
I believe that if there is a maximum key validation that means storages with less then 64 bytes for the maximum key length should be allowed.
from laminas-cache.
I have created a Pull request. #155
from laminas-cache.
I think we need an optimization of the check regarding the key length.
I dont think that regex is the best way to verify a maximum length of a string.
PSR-16 states that at least 64 characters have to be supported. Thats the minimum, not the maximum.
You can read about this here: https://www.php-fig.org/psr/psr-16/
Key - A string of at least one character that uniquely identifies a cached item. Implementing libraries MUST support keys consisting of the characters
A-Z
,a-z
,0-9
,_
, and.
in any order in UTF-8 encoding and a length of up to 64 characters. Implementing libraries MAY support additional characters and encodings or longer lengths, but MUST support at least that minimum. Libraries are responsible for their own escaping of key strings as appropriate, but MUST be able to return the original unmodified key string. The following characters are reserved for future extensions and MUST NOT be supported by implementing libraries:{}()/\@:
There are also integration tests which checks that:
https://github.com/php-cache/integration-tests/blob/824633c9547dc3c7ac5faeca03dd82939cdeb73c/src/SimpleCacheTest.php#L422
https://github.com/php-cache/integration-tests/blob/824633c9547dc3c7ac5faeca03dd82939cdeb73c/src/CachePoolTest.php#L122
from laminas-cache.
Okay, I've had some investigations and came to a conclusion:
For the 2.x versions, the redis adapter v1 is in charge.
To fix this problem, we wont be able to totally avoid mb_strlen
but I think I got something which at least will decrease the need for it.
However, due to the fact that this component does not yet require mbstring
as an extension, we cannot rely on it. Using the symfony polyfill was an idea, but it uses iconv
which could be missing as well.
So instead of changing the regex handling here, I will prepare a new minor version of the redis adapter which decreases the maximum key length to 65534
which will ensure that the regex of the decorator won't blow up.
In v2 of the redis adapter, the 512 MB will be re-added.
In v3 of the cache component, mbstring
will be required (see #157) and the following code might work good enough:
// This already exists tho
if ($this->maximumKeyLength === Capabilities::UNLIMITED_KEY_LENGTH) {
return;
}
// a single UTF-8 character may use up to 4 characters, so we do a naive calculation here to verify if that limit is exceeded
// keep in mind that for the redis adapter, a cache key must exceed 128000000 characters (128MB) to not pass this check!!!
$maximumNonUnicodeKeyLength = $this->maximumKeyLength / 4;
if (!isset($key[$maximumNonUnicodeKeyLength + 1])) {
return;
}
// current functionality
$validator = function (string $key): bool {
return ! (bool) preg_match('/^.{'. $this->maximumKeyLength . ',}$/u',$key);
};
// if the key length exceeds the maximum quantifier length, use `mb_strlen`
if ($this->maximumKeyLength > 65535) {
$validator = function (string $key): bool {
return mb_strlen($key, 'UTF-8') <= $this->maximumKeyLength;
};
}
if ($validator($key)) {
return;
}
throw new SimpleCacheInvalidArgumentException(sprintf(
'Invalid key "%s" provided; key is too long. Must be no more than %d characters',
$key,
$this->maximumKeyLength
));
What do you think?
from laminas-cache.
Maybe just use if else
to avoid declaring an anonymous function that is immediately overridden ?
from laminas-cache.
From PSR-16, the enforcement of the minimum key length is already achieved with the current code and doesn't exceed regex limits.
Redis allows for 512 MB for key length, which is a size limit, not a character length. I believe the current implementation (https://github.com/laminas/laminas-cache-storage-adapter-redis/pull/14/files) is incorrect as a unicode 512000000 character string exceeds this.
If the adapters returned a max byte size for the key, you can derive the max character length with multiplication and only need strlen
. This removes any headaches with character encoding and puts the onus on the adapter to report its capabilities. Perhaps a flag to say which calculation its using would be helpful for BC.
The goal of caching is performance and mb_strlen
not only requires slower introspection of the source, but it also carries the baggage of a new extension. By changing to byte limit instead of character length you'll be able to achieve validation while retaining speed.
The check for max limit is maybe out of the remit of the PSR-16 code considering as enforcement hasn't been recommended, although the simple error message is convenient.
from laminas-cache.
@SorX14 good point. Using mb_strlen
on a 4-byte character would lead to a result of 1
.
So maybe not depending on mbstring at all would be a good thing. Maybe using utf8_decode
along with strlen
as suggested by @ghostwriter would be a better approach. But utf8 characters which will consume more than a Single Byte wont be detected either as these are converted to a single ?
as of how I understand the documentation of utf8_decode
.
Maybe we keep the maximum key length of redis at 65534
as it should be long enough anyway. If some one needs longer keys, they should redis directly rather than a library. 🤷🏼♂️
from laminas-cache.
Closed with #158
from laminas-cache.
Related Issues (20)
- Documentation: Add integration examples for `mezzio` and `laminas-mvc` HOT 10
- Upgrade to PHP 8 does not work because of dependencies requirements HOT 3
- Migration to PHP 8.0 with replace leads to unexpected errors HOT 8
- [RFC]: PSR compatibility via dedicated satellites HOT 1
- [RFC]: remove conflict with symfony/console HOT 5
- Cannot add laminas cache to my project. HOT 3
- PHP 8.1.1 support HOT 6
- A cache storage implementation is required, while the CaptureCache pattern does not require one HOT 2
- CacheItemPoolDecorator does not properly handle multiple unsaved deferred items HOT 3
- RFC: Configure AdapterPluginManager from the config HOT 9
- Use `Clock`-Interface (PSR-20) in `PSR-6` `CacheItem`
- Dependency Dashboard
- psr/cache V3 and psr/simple-cache V3 compatibilities HOT 1
- laminas-cache is still using Laminas\Cache\StorageFactory::factory() which no longer exists HOT 6
- Provide native support for PSR-20 HOT 1
- [RFC]: Detach serializer plugin from `laminas-serializer` HOT 4
- Add `cache` configuration key documentation HOT 3
- Update `vimeo/psalm` to v5.16 once available HOT 1
- Remove virtual package requirement `laminas/laminas-cache-storage-implementation` HOT 1
- laminas-feed still relevant for this component? HOT 4
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from laminas-cache.