
Comments (6)

psaxton avatar psaxton commented on June 2, 2024 1

@jasonbosco Thanks for the quick response. I've updated to Typesense 26.0 and added --db-compaction-interval=21600 to the arguments. Disk usage has held at around 8-9 GB since then, but that seems quite high for the number of documents. It could just be leftover cruft from 0.25.2. I've wiped the persistent storage, restarted the cluster, and started a fresh index, and will let it stew over the weekend to see what happens.

from typesense.

jasonbosco avatar jasonbosco commented on June 2, 2024

We've improved disk usage in v26.0 for this specific write pattern of creating new timestamped collections and deleting the old ones, as the scraper does.

Could you try upgrading to it, and then setting db-compaction-interval = 21600 as a Typesense server parameter, to see if that helps?
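As a quick sanity check on the value, here is a minimal sketch that converts an interval in hours into the flag form used in this thread. It assumes the parameter is expressed in seconds; `compaction_flag` is a hypothetical helper and not part of Typesense.

```python
def compaction_flag(hours):
    """Format the compaction interval flag, assuming the value is in seconds."""
    seconds = int(hours * 3600)
    return f"--db-compaction-interval={seconds}"


# 6 hours corresponds to the 21600 suggested above.
print(compaction_flag(6))
```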


psaxton avatar psaxton commented on June 2, 2024

After rebuilding all the data under 26.0, the disk usage under /usr/share/typesense/data is much steadier and consistent across cluster nodes. We are noticing that one node seems to be keeping everything in ./db, while the other two appear to be moving data over to ./state/snapshot, despite all containers being started with identical parameters. Any ideas why that may be?

typesense-0:

typesense-0:/$ cd /usr/share/typesense/data
typesense-0:/usr/share/typesense/data$ df -h .
Filesystem      Size  Used Avail Use% Mounted on
/dev/nvme2n1    9.8G  1.5G  8.3G  16% /usr/share/typesense/data
typesense-0:/usr/share/typesense/data$ du -h --max-depth=2
4.0K    ./models
236K    ./state/log
1.5G    ./state/snapshot
8.0K    ./state/meta
1.5G    ./state
16K     ./lost+found
3.9M    ./db/archive
30M     ./db
4.0K    ./meta/archive
5.3M    ./meta
1.5G    .

typesense-1:

typesense-1:/$ cd /usr/share/typesense/data
typesense-1:/usr/share/typesense/data$ df -h .
Filesystem      Size  Used Avail Use% Mounted on
/dev/nvme4n1    9.8G  1.6G  8.2G  16% /usr/share/typesense/data
typesense-1:/usr/share/typesense/data$ du -h --max-depth=2
236K    ./state/log
8.0K    ./state/meta
1.6G    ./state/snapshot
1.6G    ./state
4.0K    ./models
16K     ./lost+found
3.9M    ./db/archive
30M     ./db
4.0K    ./meta/archive
5.3M    ./meta
1.6G    .

typesense-2:

typesense-2:/$ cd /usr/share/typesense/data
typesense-2:/usr/share/typesense/data$ df -h .
Filesystem      Size  Used Avail Use% Mounted on
/dev/nvme3n1    9.8G  1.5G  8.3G  16% /usr/share/typesense/data
typesense-2:/usr/share/typesense/data$ du -h --max-depth=2
16K     ./lost+found
3.9M    ./db/archive
1.5G    ./db
4.0K    ./models
4.0K    ./meta/archive
5.3M    ./meta
236K    ./state/log
680K    ./state/snapshot
8.0K    ./state/meta
928K    ./state
1.5G    .


kishorenc avatar kishorenc commented on June 2, 2024

Can you tell me what type of disk you are using for the data directory?

When a write arrives, it is written to the Raft log and also to the store in the db directory. Every hour, a snapshot is taken, in which the contents of the db directory are hard linked into the state/snapshot directory (a hard link is an additional directory entry pointing at the same inode, so the data is not duplicated on disk). When Typesense is restarted, we replace the db directory with the contents of the db from state/snapshot. You can confirm this behavior on a Typesense server on your localhost.

So ideally, the db directory should hold more or less the same data as state/snapshot, unless a lot of writes have happened since the last snapshot.
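The hard-link behavior described above can be reproduced with a minimal sketch. It creates a throwaway directory layout that mimics db and state/snapshot, hard links one file between them, and then sums sizes while counting each inode only once, which is how GNU du treats hard-linked files by default. The file name and the `dir_usage` and `demo` helpers are hypothetical and exist only for illustration; nothing here touches a real Typesense data directory.

```python
import os
import tempfile


def dir_usage(root, seen):
    """Sum file sizes under root, skipping inodes already recorded in `seen`."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.stat(os.path.join(dirpath, name))
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue  # same inode already counted under another path
            seen.add(key)
            total += st.st_size
    return total


def demo():
    with tempfile.TemporaryDirectory() as data_dir:
        db = os.path.join(data_dir, "db")
        snapshot = os.path.join(data_dir, "state", "snapshot")
        os.makedirs(db)
        os.makedirs(snapshot)

        # Hypothetical store file; the name is made up for this example.
        src = os.path.join(db, "000001.sst")
        with open(src, "wb") as f:
            f.write(b"x" * 1_000_000)

        # Hard link: a second directory entry for the same inode.
        dst = os.path.join(snapshot, "000001.sst")
        os.link(src, dst)

        same_inode = os.stat(src).st_ino == os.stat(dst).st_ino
        link_count = os.stat(src).st_nlink

        # Visit db first, then snapshot.
        seen = set()
        db_first = (dir_usage(db, seen), dir_usage(snapshot, seen))

        # Visit snapshot first, then db; report as (db, snapshot).
        seen = set()
        snap_size = dir_usage(snapshot, seen)
        db_size = dir_usage(db, seen)
        snapshot_first = (db_size, snap_size)

        return same_inode, link_count, db_first, snapshot_first


if __name__ == "__main__":
    print(demo())
```

In this sketch the shared bytes are attributed to whichever directory is visited first, and the other directory reports zero. Whether traversal order accounts for the differing du listings posted earlier in this thread is an assumption and has not been confirmed here.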


psaxton avatar psaxton commented on June 2, 2024
