Comments (4)
These are Unicode surrogate code points. They don't correspond to a valid Unicode character. This is mojibake. You can't split them meaningfully into characters if they are not characters in the first place. What exactly is your expectation here?
from doc-en.
From the PCRE docs:
In addition to checking the format of the string, there is a check to ensure that all code points lie in the range U+0 to U+10FFFF, excluding the surrogate area. The so-called "non-character" code points are not excluded because Unicode corrigendum # 9 makes it clear that they should not be.
Characters in the "Surrogate Area" of Unicode are reserved for use by UTF-16, where they are used in pairs to encode code points with values greater than 0xFFFF. The code points that are encoded by UTF-16 pairs are available independently in the UTF-8 and UTF-32 encodings. (In other words, the whole surrogate thing is a fudge for UTF-16 which unfortunately messes up UTF-8 and UTF-32.)
We may want to document that surrogates are not supported; converting to UTF-8 first may yield the desired result.
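To illustrate the point about converting first, here is a minimal Python sketch (not PHP, and not how PCRE itself performs the check). It combines a UTF-16 surrogate pair into the code point it encodes, then shows that the resulting UTF-8 is valid while a lone surrogate written out in UTF-8 form is rejected. The helper names `combine_surrogates` and `is_valid_utf8` and the sample pair `D83D DE00` are made up for this example.

```python
def combine_surrogates(high: int, low: int) -> int:
    """Combine a UTF-16 surrogate pair into the code point it encodes."""
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError("not a high surrogate")
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError("not a low surrogate")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8 (surrogates are rejected)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical sample input: the surrogate pair D83D DE00.
code_point = combine_surrogates(0xD83D, 0xDE00)
utf8_bytes = chr(code_point).encode("utf-8")

# The high surrogate U+D83D on its own, written out in UTF-8 form.
lone_surrogate_bytes = b"\xed\xa0\xbd"

print(hex(code_point))                      # 0x1f600
print(is_valid_utf8(utf8_bytes))            # True
print(is_valid_utf8(lone_surrogate_bytes))  # False
```

Python's strict UTF-8 decoder applies the same rule as the PCRE check quoted above: code points in the surrogate area are not valid in UTF-8, so the pair has to be combined into a single code point before encoding.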
Java works with UTF-16; PHP's PCRE works with UTF-8.
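As a rough illustration of that difference, the following Python sketch encodes one code point above U+FFFF both ways. The sample character U+1F600 is an arbitrary choice for this example. In UTF-16 it becomes two surrogate code units; in UTF-8 it is a single four-byte sequence with no surrogates involved.

```python
text = "\U0001F600"  # arbitrary sample code point above U+FFFF

# UTF-16 (big-endian, no BOM): split the bytes into 16-bit code units.
utf16_bytes = text.encode("utf-16-be")
utf16_units = [
    int.from_bytes(utf16_bytes[i:i + 2], "big")
    for i in range(0, len(utf16_bytes), 2)
]

# UTF-8: the same code point as one four-byte sequence.
utf8_bytes = text.encode("utf-8")

print([hex(u) for u in utf16_units])  # ['0xd83d', '0xde00']
print(utf8_bytes.hex())               # f09f9880
```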