
Comments (8)

yc-wang00 commented on June 8, 2024

Thanks for your active and quick response! I really appreciate it. I tried your workaround and got it working for my case.

I am also curious about the underlying issue, so please keep me updated if you find the cause later.

Thanks again. I really like this project; keep up the good work!

from surrealdb.

phughk commented on June 8, 2024

This question can probably be better answered by @emmanuel-keller, but on the network side we may adjust the limit on network message size to allow this in the future.


emmanuel-keller commented on June 8, 2024

The Rust SDK currently has a hard limit on the maximum size of a message.

pub(crate) const MAX_MESSAGE_SIZE: usize = 64 << 20; // 64 MiB

Workaround 1 - Use u16:

In this case, a possible option (if it fits) would be to use u16 rather than u32. That will reduce the size of the statement and make it pass.

Workaround 2 - Use HTTP:

The HTTP connection has a different size limit. You may try with:

 let db = Surreal::new::<Http>("localhost:3301").await?;

Workaround 3 - array::push

You can split the vector into smaller arrays and use the array::push function to build the final vector.

Follow up

We are going to make this configurable in the SDK. Note that you will also have to increase the value on the server, using the environment variable:

export SURREAL_WEBSOCKET_MAX_MESSAGE_SIZE=134000000


yc-wang00 commented on June 8, 2024

I believe you've reached a hard limit in the Rust SDK, related to the maximum size of a message.

pub(crate) const MAX_MESSAGE_SIZE: usize = 64 << 20; // 64 MiB

Would it be possible to use u16 ? Or do you indeed need u32?

Thanks for your response!

Re. (1) Ah, I see. Yes, I do need u32: this array stores tokens (LLM context), and a Llama 3 tokenizer's vocabulary size can go up to 128256, which does not fit in u16.

Re. (2) I tried the HTTP connection and it gives me the same error.

Re. (3) For workaround 3, could you give more context on how I should use array::push?

Re. follow-up: can I configure the environment variable right now, or only in a future version?
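To make point (1) concrete, here is a minimal sketch of the range check. The helper name `fits_in_u16` is made up for illustration, and the largest token id is assumed to be the vocabulary size minus one.

```rust
/// Returns true if `value` can be stored in a u16 without loss.
/// Hypothetical helper, used only for this illustration.
fn fits_in_u16(value: u32) -> bool {
    u16::try_from(value).is_ok()
}

fn main() {
    // Vocabulary size quoted above; the largest token id is assumed to be size - 1.
    let vocab_size: u32 = 128_256;
    let max_token_id = vocab_size - 1;

    println!("u16::MAX          = {}", u16::MAX);
    println!("largest token id  = {}", max_token_id);
    println!("fits in u16       = {}", fits_in_u16(max_token_id));
}
```

Since u16 tops out at 65535, any token id above that would be rejected by the conversion, which is why workaround 1 does not apply here.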


yc-wang00 commented on June 8, 2024

pub(crate) const MAX_MESSAGE_SIZE: usize = 64 << 20; // 64 MiB

Also, regarding this line: the max size looks like 64 MB, but u32 * array size (2_000_000) should be only 8 MB if my math is correct. Just wondering if this is expected?


emmanuel-keller commented on June 8, 2024

array::push has an operator equivalent: +=

Here is how you can do that:

CREATE foo:1 SET bar = [1, 2, 3];
[[{ bar: [1, 2, 3], id: foo:1 }]]

UPDATE foo:1 SET bar += [4, 5, 6];
[[{ bar: [1, 2, 3, 4, 5, 6], id: foo:1 }]]

SELECT * FROM foo;
[[{ bar: [1, 2, 3, 4, 5, 6], id: foo:1 }]]
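On the client side, the same pattern could be driven from Rust by splitting the vector into chunks and issuing one statement per chunk. The sketch below only builds the statement text, so it runs without a database; the function name `chunked_statements` and the chunk size are assumptions made for illustration, not part of the SDK.

```rust
/// Build one SurrealQL statement per chunk: a CREATE for the first chunk,
/// then `+=` UPDATEs that append the remaining chunks.
/// Hypothetical helper; `record` and `field` are supplied by the caller.
fn chunked_statements(record: &str, field: &str, values: &[u32], chunk_size: usize) -> Vec<String> {
    // `chunks` panics on a zero size, so fail early with a clearer message.
    assert!(chunk_size > 0, "chunk_size must be non-zero");

    values
        .chunks(chunk_size)
        .enumerate()
        .map(|(i, chunk)| {
            // Render the chunk as a comma-separated array literal body.
            let items = chunk
                .iter()
                .map(|v| v.to_string())
                .collect::<Vec<_>>()
                .join(", ");
            if i == 0 {
                format!("CREATE {record} SET {field} = [{items}];")
            } else {
                format!("UPDATE {record} SET {field} += [{items}];")
            }
        })
        .collect()
}

fn main() {
    let tokens: Vec<u32> = (1..=6).collect();
    // A chunk size of 3 reproduces the two statements shown above.
    for stmt in chunked_statements("foo:1", "bar", &tokens, 3) {
        println!("{stmt}");
    }
}
```

In a real application each chunk would be sent as its own request, so that no single message exceeds the limit. Formatting values into the statement text is done here only to keep the example dependency-free.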

Server side, you can already change the configuration using the environment variable (since 1.4.x).

Also, regarding this line: the max size looks like 64 MB, but u32 * array size (2_000_000) should be only 8 MB if my math is correct. Just wondering if this is expected?

That's a good point. It all depends on the serialisation. Which version of the server and the SDK are you using?


yc-wang00 commented on June 8, 2024

That's a good point. It all depends on the serialisation. Which version of the server and the SDK are you using?

I am using

  • surrealdb 1.4.2 for linux on x86_64
  • sdk: surrealdb = "1.4.2" (in Cargo.toml)


emmanuel-keller commented on June 8, 2024

The current binary serialization is not optimal for large vectors. Due to the parsing process, the Vec<u32> is currently transformed into a Vec<Value>. The Value structure itself is an enum, which, in turn, points to another enum structure (Number) that represents numbers, which are internally stored as 64 bits. These structures are also versioned. So, each number is likely to consist of 64 bits + 2 ordinals (8 bits each) + 2 version holders (8 bits each), totaling about 96 bits per element. For 2,000,000 elements, this brings us close to 24 MB.

That said, 24 MB is still under the 64 MiB limit, so we are still investigating this issue.
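The arithmetic behind this estimate can be checked with a short sketch. The per-element figure is the rough estimate from this comment, not a measured serialized size, and the names `ESTIMATED_BITS_PER_ELEMENT` and `estimated_payload_bytes` are made up for illustration.

```rust
/// Limit quoted from the SDK earlier in this thread.
const MAX_MESSAGE_SIZE: usize = 64 << 20; // 64 MiB

/// Rough estimate from the comment above: a 64-bit number,
/// plus 2 enum ordinals and 2 version holders of 8 bits each.
const ESTIMATED_BITS_PER_ELEMENT: usize = 64 + 2 * 8 + 2 * 8;

/// Estimated serialized size in bytes for `len` numbers (hypothetical helper).
fn estimated_payload_bytes(len: usize) -> usize {
    len * ESTIMATED_BITS_PER_ELEMENT / 8
}

fn main() {
    let len = 2_000_000;
    let raw = len * std::mem::size_of::<u32>();
    let estimated = estimated_payload_bytes(len);

    println!("raw Vec<u32> payload : {raw} bytes");
    println!("estimated serialized : {estimated} bytes");
    println!("MAX_MESSAGE_SIZE     : {MAX_MESSAGE_SIZE} bytes");
    println!("under the limit      : {}", estimated <= MAX_MESSAGE_SIZE);
}
```

Under these assumptions the raw payload is 8,000,000 bytes and the estimate is 24,000,000 bytes, both below the 67,108,864-byte limit, which is consistent with the conclusion that the per-element overhead alone does not explain the failure.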
