There is currently no documentation about the persistence mechanism used for hyperdex.


Documentation about persistence about hyperdex HOT 5 CLOSED

rescrv commented on July 19, 2024

Documentation about persistence

from hyperdex.

Comments (5)

rescrv commented on July 19, 2024

Does hyperdex persist data to disk?

Data is flushed to disk in the background.

Does hyperdex persist spaces greater than memory onto disk?

Yes.

How does hyperdex handle requests for items currently stored on disk?

Currently, it relies upon memory-backed files. We are very careful to keep all data needed for a particular request in one location so that it will be faulted in together.
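To make the idea concrete, here is a minimal sketch that assumes "memory-backed files" means memory-mapped files. It is not HyperDex code: the `read_record` helper and the record layout are hypothetical. It only shows why keeping a key and its value contiguous lets a single page fault bring both into memory.

```python
import mmap
import os
import tempfile

def read_record(path, offset, length):
    """Read `length` bytes at `offset` through a read-only memory mapping."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # Slicing touches the backing pages; the OS faults them in
            # from disk on demand if they are not already resident.
            return m[offset:offset + length]

# Hypothetical layout: the key and its value are stored back to back,
# so they tend to land on the same page(s) and are faulted in together.
record = b"key1" + b"value-for-key1"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x00" * 4096)  # unrelated data occupying an earlier page
    tmp.write(record)
    path = tmp.name

result = read_record(path, 4096, len(record))
os.remove(path)
```

If the key lived on one page and the value on another far away in the file, the same request could stall on two separate faults instead of one.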

What format does hyperdex persist data to disk?

A custom binary format. It's not set in stone yet, but the general structure will be similar to what is in hyperdisk/ now.

Is data persisted to disk synchronously or asynchronously flushed?

We persist data to disk asynchronously, and rely upon the higher-level replication of HyperDex to ensure that up to $f$ node failures will not impact durability.

Is there a mechanism to re-start a cluster, without data-loss, from a catastrophic full-cluster failure?

In order to truly protect against data loss from a full-cluster failure, we'd need to persistently log every message. This cost is unacceptably high. Although we do not handle full-cluster failure right now, we intend to guard against such failures through periodic snapshots, and a cluster-wide recovery mechanism.

Is there any possibility for the disk storage to become corrupted? Are there mechanisms in place to recover from corrupted storage?

There is always this possibility. Making this robust is among our short-to-medium-term priorities, both through adequate checksumming and by enabling fetches from secondary replicas.
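As a rough illustration of how checksumming and replica fallback could fit together, the sketch below frames each record with a length and a CRC32, and tries replicas in order until one copy verifies. The record framing and the function names are assumptions made for this example, not HyperDex's on-disk format.

```python
import struct
import zlib

HEADER = struct.Struct("<II")  # payload length, CRC32 of payload

def encode_record(payload: bytes) -> bytes:
    """Prefix the payload with its length and CRC32."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload

def decode_record(blob: bytes) -> bytes:
    """Return the payload, or raise ValueError if the record is damaged."""
    if len(blob) < HEADER.size:
        raise ValueError("record truncated")
    length, expected = HEADER.unpack_from(blob, 0)
    payload = blob[HEADER.size:HEADER.size + length]
    if len(payload) != length or zlib.crc32(payload) != expected:
        raise ValueError("checksum mismatch: record is corrupt")
    return payload

def read_with_fallback(replica_blobs):
    """Try each replica's copy in order and return the first that verifies."""
    for blob in replica_blobs:
        try:
            return decode_record(blob)
        except ValueError:
            continue  # this copy is damaged; try the next replica
    raise ValueError("all replicas hold corrupt copies")

good = encode_record(b"hello")
damaged = bytearray(good)
damaged[-1] ^= 0xFF  # simulate corruption of the last payload byte
value = read_with_fallback([bytes(damaged), good])
```

The checksum only detects corruption; recovery in this sketch depends entirely on another replica holding an intact copy.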

I'll consider this comment a good start on documenting this aspect of HyperDex.


roja commented on July 19, 2024

Fantastic response :)


fclairamb commented on July 19, 2024

Is there any news on this?

In order to truly protect against data loss from a full-cluster failure, we'd need to persistently log every message.

On your website is published a performance comparison page which starts with:

Picking a NoSQL data store is a difficult task. To help with this process, we provide the results of a comparison between HyperDex and other popular NoSQL systems.

On this page, HyperDex clearly beats Cassandra and MongoDB. But Cassandra does persistently store every change made.

This changes a lot of things, even for read performance, as Cassandra might have to search for data among different SSTables (an interesting discussion about it).

Maybe you should add something about it in your benchmark so that it actually helps us pick the right NoSQL data store.

I'm very interested in HyperDex for two main reasons:

Its searchable capabilities (it's great, there are so many new ways to organize data on nosql clusters with it)
Low memory footprint (especially compared to cassandra or mongodb)

But these full-cluster failures can happen (power outage, cascading failure, simple distributed human mistake, etc.) and I don't know that many people who would accept this kind of trade-off.

==> Is there any plan to improve this?
Periodic snapshots aren't really a solution unless they are associated with some change log.


rescrv commented on July 19, 2024

It's worth noting that in the year since this bug report was opened, HyperDex has undergone significant change and much of what was written in the bug report under-sells HyperDex's capabilities. Right now we use LevelDB which does write directly to disk on each write, not asynchronously.

It's also worth noting that Cassandra's default settings, and more importantly, the settings we use in the benchmark do not persistently store every change made upon acknowledgement. Data is only written to disk periodically. It's unclear from the documentation whether it's even passed to a "write" call when the operation completes. In HyperDex, the data is indeed passed to the OS with "write" when you receive an acknowledgement.
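The distinction drawn here, between handing data to the OS with "write" and forcing it to stable storage, can be sketched with plain POSIX-style calls. This is an illustration only; `append_record` and its `durable` flag are hypothetical and do not describe what HyperDex, LevelDB, or Cassandra do internally.

```python
import os
import tempfile

def append_record(fd, data, durable=False):
    """Append `data` to `fd`; with durable=True also request a flush to the device."""
    view = memoryview(data)
    while len(view) > 0:
        # os.write hands bytes to the OS page cache and may write fewer
        # bytes than requested, so loop until everything is accepted.
        written = os.write(fd, view)
        view = view[written:]
    if durable:
        # fsync asks the OS to push the file's dirty pages to the storage device.
        os.fsync(fd)

fd, path = tempfile.mkstemp()
try:
    append_record(fd, b"put k1 v1\n")                # acknowledged after write()
    append_record(fd, b"put k2 v2\n", durable=True)  # acknowledged after fsync()
finally:
    os.close(fd)

with open(path, "rb") as f:
    contents = f.read()
os.remove(path)
```

Data that has only been passed to `write` generally survives a crash of the process, since the kernel still holds it, but may be lost if the machine loses power before the OS flushes it.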

Because HyperDex uses LevelDB, it too has to search for data among different sstables. This alone is not a reason for slow read operations. The organization of Cassandra's SSTs means they must look in more tables for a get, possibly many more. LevelDB maintains invariants that put a hard upper bound on the number of tables to look in.
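A simplified model of a leveled lookup shows where such a bound comes from. In LevelDB's design, tables in level 0 may have overlapping key ranges, while tables within each deeper level cover disjoint ranges, so a get consults at most every level-0 table plus one table per deeper level. The `Table` class and `leveled_get` function below are a toy sketch, not LevelDB's API.

```python
from bisect import bisect_left

class Table:
    """Toy stand-in for an SSTable: an immutable key/value map with a key range."""
    def __init__(self, entries):
        self.entries = dict(entries)
        self.min_key = min(self.entries)
        self.max_key = max(self.entries)

    def covers(self, key):
        return self.min_key <= key <= self.max_key

def leveled_get(level0, levels, key):
    """Return (value, tables_consulted) for `key`.

    level0: tables ordered newest first; key ranges may overlap.
    levels: list of levels, each a list of tables sorted by key with disjoint ranges.
    """
    consulted = 0
    for table in level0:
        if table.covers(key):
            consulted += 1
            if key in table.entries:
                return table.entries[key], consulted
    for tables in levels:
        # Disjoint, sorted ranges: binary search finds the single candidate table.
        max_keys = [t.max_key for t in tables]
        i = bisect_left(max_keys, key)
        if i < len(tables) and tables[i].covers(key):
            consulted += 1
            if key in tables[i].entries:
                return tables[i].entries[key], consulted
    return None, consulted

level0 = [Table({"b": "new-b"}), Table({"a": "1", "c": "3"})]
levels = [
    [Table({"a": "old", "d": "4"}), Table({"m": "13", "z": "26"})],
    [Table({"e": "5", "q": "17"})],
]
hit_deep = leveled_get(level0, levels, "q")     # ("17", 2)
hit_shallow = leveled_get(level0, levels, "b")  # ("new-b", 1)
```

In this model the number of tables consulted can never exceed `len(level0) + len(levels)`, regardless of how many tables each deeper level contains.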

We're working on full-cluster backup solutions usable for intra-datacenter, or inter-datacenter failover, allowing you to keep multiple hot spare clusters in different datacenters, and then bring failed clusters up-to-date.

If you'd like to continue this discussion, I'm happy to do so, but ask that we either take it to the HyperDex-discuss mailing list, or open another bug report as the topic has drifted from the original questions. Unless you're actually reporting a bug, the discussion list is where I'd prefer to take it.


fclairamb commented on July 19, 2024

My point was simply that cassandra doesn't leave dirty data that it can't handle afterwards. It indeed doesn't flush the changes to logs instantly.

You're right, I should continue this on the hyperdex-discuss group. Thank you for your very complete answer.
