Comments (10)
Although it isn't fully clear here what went wrong, I added a check on the json_encode
result and now throw an exception when it fails. I think that makes similar errors clearer for the user in the future.
from meilisearch-php.
The user should make sure that the content sent to MS has a proper encoding. If the JSON encoding fails, the code should definitely throw an exception, good point.
I would support throwing a dedicated exception, e.g. FailedJsonEncodingException, with a proper message (taken from json_last_error_msg).
from meilisearch-php.
It's not working with some Bulgarian, German, Spanish, French markdown files... and some English ones.
Bulgarian: 1972-03-01.0300.md
German: 2015-12-05.1205.md
French: 1986-12-14.1214.md
I think the English files are using a strange apostrophe: ’
instead of '
or maybe this quotation mark: “
instead of "
? I'm not sure why they use different quote characters, but we have lots of translators, so they probably have unusual keyboards.
I am having trouble determining exactly which character is breaking the json_encode
function... but I also found that you can pass a flag to overcome this error:
json_encode($body, JSON_INVALID_UTF8_SUBSTITUTE)
The flag JSON_UNESCAPED_UNICODE
didn't seem to fix the error, but JSON_INVALID_UTF8_SUBSTITUTE
was OK.
It seems like you can test it with this:
$test = utf8_decode("Düsseldorf");
json_encode($test);
=> false
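A minimal sketch of the behaviour described above, assuming PHP 7.2 or later with the mbstring extension. The Latin-1 byte string is written out directly instead of going through utf8_decode, and all variable names are only illustrative:

```php
<?php
// "Düsseldorf" as ISO-8859-1 bytes: 0xFC on its own is not valid UTF-8.
$latin1 = "D\xFCsseldorf";

// Default behaviour: json_encode() returns false and records the error.
$plain = json_encode($latin1);
$error = json_last_error_msg();

// JSON_INVALID_UTF8_SUBSTITUTE replaces each invalid byte with U+FFFD
// instead of failing, so the call returns a JSON string.
$substituted = json_encode($latin1, JSON_INVALID_UTF8_SUBSTITUTE);

// mb_check_encoding() can flag the offending string before it is encoded,
// which helps to find out which document is the problem.
$isValidUtf8 = mb_check_encoding($latin1, 'UTF-8');

var_dump($plain, $error, $substituted, $isValidUtf8);
```

Note that substitution silently alters the text, so checking the input first is probably the safer choice when the content has to stay intact.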
from meilisearch-php.
@alallema @codedge Although the issue is closed now (thanks!), I've been trying to find out why this occurs in my code when you're not able to replicate it. I have an update which might be useful to know.
Since MeiliSearch has a 1000 word limit, I was splitting my long articles into chunks and uploading all these chunks to the index. When you use str_split,
it splits the string at the byte level, not the character level... so it can cut a multibyte UTF-8 character in half, as it's not a Unicode-aware function.
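A small sketch of the difference, assuming PHP 7.4 or later with the mbstring extension (mb_str_split is not available before that). The chunk length of 2 is only chosen to force a split inside "ü":

```php
<?php
$text = "Düsseldorf";

// str_split() counts bytes: "ü" is two bytes (0xC3 0xBC) in UTF-8, so a
// chunk length of 2 cuts it in half and both pieces are invalid UTF-8.
$byteChunks = str_split($text, 2);

// mb_str_split() counts characters, so every chunk stays valid UTF-8.
$charChunks = mb_str_split($text, 2, 'UTF-8');

var_dump(json_encode($byteChunks)); // bool(false)
var_dump(json_encode($charChunks, JSON_UNESCAPED_UNICODE));
```

Splitting on word or sentence boundaries would avoid the problem as well and may give better search results than fixed-length chunks.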
from meilisearch-php.
Hi @tao!
Thanks a lot for raising this error and coming up with a solution. If you want to create a PR to fix it, you're very welcome to. If not, no problem, I will do it.
Sorry about this!
from meilisearch-php.
I haven't created a pull request because I'm not sure how you prefer to handle errors... and whether you want to throw a custom error and add a note to the docs.
from meilisearch-php.
Maybe something like this:
public function patch($path, $body = null, $query = [])
{
    $data = $this->prepareRequest($body);
    $request = $this->requestFactory->createRequest(
        'PATCH',
        $this->baseUrl.$path.$this->buildQueryString($query)
    )->withBody($this->streamFactory->createStream($data));

    return $this->execute($request);
}

private function prepareRequest($body)
{
    $request = json_encode($body);

    // json_encode() returns false on failure; compare strictly so that
    // valid payloads such as "0" are not treated as errors.
    if (false === $request) {
        throw new \Exception(json_last_error_msg());
    }

    return $request;
}
And the response will be something like this:
Exception
Malformed UTF-8 characters, possibly incorrectly encoded
at vendor/meilisearch/meilisearch-php/src/Http/Client.php:153
149▕ private function prepareRequest($body)
150▕ {
151▕ $request = json_encode($body);
152▕
➜ 153▕ if (!$request) throw new \Exception(json_last_error_msg());
154▕
155▕ return $request;
156▕ }
157▕
from meilisearch-php.
@tao,
I forgot to ask: can you provide a document so I can try to reproduce the error?
Thanks again
from meilisearch-php.
Hi @tao,
For the PR, I suggest, like @codedge, that you create a custom Exception
like FailedJsonEncodingException
with the specific message from json_last_error_msg
. You can take InvalidArgumentException
as an example.
Your code for fixing the bug looks good.
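As a rough sketch of that suggestion: the class name comes from this thread, while the base class, the message prefix, and the encodeBody helper are assumptions made for illustration and not the library's actual API:

```php
<?php
// Hypothetical exception class; extending \RuntimeException is an assumption.
class FailedJsonEncodingException extends \RuntimeException
{
}

// Hypothetical stand-alone helper mirroring prepareRequest() from this thread.
function encodeBody($body): string
{
    $json = json_encode($body);

    // Compare strictly with false: a valid payload such as "0" is a falsy string.
    if (false === $json) {
        throw new FailedJsonEncodingException(
            'Encoding payload to JSON failed: ' . json_last_error_msg(),
            json_last_error()
        );
    }

    return $json;
}

try {
    // 0xFC is a Latin-1 "ü" and therefore invalid UTF-8.
    encodeBody(['city' => "D\xFCsseldorf"]);
} catch (FailedJsonEncodingException $e) {
    echo $e->getMessage(), PHP_EOL;
}
```

A dedicated class lets callers catch encoding failures separately from HTTP or API errors.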
But again, we love working with the community, so if you want to create a PR, that would be great! If you don't want to or don't have time, that's okay. Don't hesitate to ask if you have questions, and check our contributing guidelines.
Thanks a lot
from meilisearch-php.
Hi @tao!
I tried to reproduce your issue but I can't. If I put your files in document["content"], it works:
$contents[0] = file_get_contents('1972-03-01.0300.md');
$contents[1] = file_get_contents('1986-12-14.1214.md');
$contents[2] = file_get_contents('2015-12-05.1205.md');

$documents = [
    ['id' => 123, 'lang' => 'Bulgarian'],
    ['id' => 456, 'lang' => 'French'],
    ['id' => 1, 'lang' => 'German'],
];

for ($index = 0; $index <= 2; $index++) {
    $documents[$index]["content"] = $contents[$index];
}

$client->index('mdFile')->addDocuments($documents);
And if I convert your md files into JSON files, it works too.
I think I'm missing something. Can you explain how to reproduce your problem?
Thank you
from meilisearch-php.