
Comments (5)

ray6080 commented on September 26, 2024

Hi @saschamcdonald, thanks for sharing this. We will take a look into it.

from kuzu.

ray6080 commented on September 26, 2024

Hi @saschamcdonald, thanks for reporting this! We've made significant changes to the rel storage since v0.1.0, which is likely why the behaviour is inconsistent between these two versions, though the exception is not expected. We'd love to look into the exception and fix it. Is it possible for you to share the parquet file with us?


saschamcdonald commented on September 26, 2024

Re: "share the parquet file with us?" Unfortunately I have a GDPR issue, as the data could be personally identifiable. From reviewing the changes between versions of KuzuDB referenced in the raised issue, it could be that ingesting a large parquet file (relative to the client's available memory) into kuzu v0.2.0 does not chunk the ingest, in terms of spilling memory to disk, as effectively as kuzu v0.0.12 does with its default settings. I'll try and create test data over the next couple of weeks to offer a repeatable test data set for the team. I think memory-to-disk spill is potentially the issue. In the interim is there a debug level I can set to capture more info for the team? - brb.


ray6080 commented on September 26, 2024

Hi @saschamcdonald, thanks for the info. Is it possible to share the rel table schema and some statistical information about the dataset, such as the number of nodes, the number of rels, and some degree distribution info? We can then try to reproduce this locally on our side.
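
Editor's sketch: the statistics requested above (node count, rel count, degree distribution) can be computed without sharing any row values, which may help where the data itself cannot leave the client. The following is a minimal illustration only, assuming the rel table can be read as an iterable of (source, destination) pairs. The function name `rel_statistics` is hypothetical, loading the pairs from the parquet file is left out, and nodes that have no rels at all are not counted.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, Tuple


def rel_statistics(edges: Iterable[Tuple[Hashable, Hashable]]) -> Dict[str, object]:
    """Summarise a rel list as counts only, without returning any node keys.

    Hypothetical helper: `edges` is assumed to yield (source, destination) pairs.
    """
    out_degree: Counter = Counter()
    in_degree: Counter = Counter()
    num_rels = 0
    for src, dst in edges:
        out_degree[src] += 1
        in_degree[dst] += 1
        num_rels += 1

    # Only nodes that appear in at least one rel are visible here.
    nodes = set(out_degree) | set(in_degree)

    # Histograms map a degree value to the number of nodes with that degree.
    out_hist = Counter(out_degree.get(n, 0) for n in nodes)
    in_hist = Counter(in_degree.get(n, 0) for n in nodes)

    return {
        "num_nodes": len(nodes),
        "num_rels": num_rels,
        "max_out_degree": max(out_degree.values(), default=0),
        "max_in_degree": max(in_degree.values(), default=0),
        "out_degree_histogram": dict(sorted(out_hist.items())),
        "in_degree_histogram": dict(sorted(in_hist.items())),
    }


if __name__ == "__main__":
    # Tiny made-up sample; real input would come from the rel file.
    sample = [("a", "b"), ("a", "c"), ("b", "c"), ("a", "b")]
    print(rel_statistics(sample))
```

The result contains only aggregate counts, so it could be pasted into the issue alongside the rel table schema.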

> I'll try and create test data over the next couple of weeks to offer a repeatable test data set for the team.

That would be much appreciated!

> In the interim is there a debug level I can set to capture more info for the team?

Unfortunately, we currently don't have a way to collect more debugging info without compiling from source and running inside a debugger.


saschamcdonald commented on September 26, 2024

@ray6080 Here is a small repo containing my code, which generates test data and makes the environment for the issue repeatable. Hopefully it is useful for the team:
https://github.com/saschamcdonald/ch_06_kuzudb_tests

