Comments (10)
The remote peer only needs to be aware of the latest record to be able to verify and build a replica of the source db's entries. Therefore, only the latest addition is available via 'update'. For example:
await db1.add('a')
await db1.add('b')
// The oplog is now 'a' <- 'b', i.e. 'b' follows 'a'.
const db2 = await orbitdb2.open(db1.address)
db2.events.on('update', (entry) => {
  // 'update' only delivers 'b' because 'b' follows 'a'. db2 does not need to be notified of 'a'; 'b' is enough to determine and replicate the correct oplog entries.
})
await db1.add('c') // This fires db2's update listener again because db1's oplog has been appended to; this time (entry) will be 'c'.
I'll provide a link to https://github.com/orbitdb/orbitdb/blob/main/docs/OPLOG.md which may provide clarity.
from orbitdb.
Thank you. This clarifies it. Is there any chance to get the full delta of changes? I have a business process that has to post-process only new documents on each node. For example, could I get some current value, counter, hash or anything else before opening the other database and then compute the delta on my own? I see these options on the iterator:
log.iterator({ amount: 1 }) -> get the last update
gt: return all entries after the entry with the specified hash (exclusive) -> get all new updates
from orbitdb.
Yes, the gt iterator param is one option.
If items are coming in from one peer, you could loop over the database using iterator, processing each item until you reach the most recently "processed" document.
If items are being synced from various peers, you may need to keep a record of the documents already processed and use it to determine whether a document requires processing.
However, the best method will be determined by your technical requirements.
Also, remember, if you are working at the oplog level you are working with operations. You may want to consider working with the data at the database level.
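The single-peer approach described above (loop until you reach the most recently processed document) could be sketched roughly as follows. This is only an illustration under assumptions: `collectNewEntries` and `fakeIterator` are hypothetical names, the entries are assumed to arrive newest first, and each entry is assumed to carry a `hash` property. The stand-in generator is used here so the sketch runs without OrbitDB.

```javascript
// Hypothetical helper: collect every entry that is newer than the last
// entry this peer has already processed.
// Assumption: `entries` is an (async) iterable that yields newest entries first.
async function collectNewEntries (entries, lastProcessedHash) {
  const fresh = []
  for await (const entry of entries) {
    // Stop as soon as the previously processed entry is reached.
    if (entry.hash === lastProcessedHash) break
    fresh.push(entry)
  }
  // Return oldest first so the entries can be processed in order.
  return fresh.reverse()
}

// Stand-in for a real iterator (illustrative data only).
async function * fakeIterator () {
  yield { hash: 'h3', value: 'c' }
  yield { hash: 'h2', value: 'b' }
  yield { hash: 'h1', value: 'a' }
}
```

With these stand-ins, `await collectNewEntries(fakeIterator(), 'h1')` resolves to the entries for 'h2' and 'h3', in that order. If `lastProcessedHash` is never found, every entry is returned, which may or may not be what you want.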
from orbitdb.
If items are coming in from one peer you could loop over the database using iterator
Does this mean that calling log.iterator({ amount: 1 }) before connecting to any other peer's DB does not give the latest hash of the database? If it does return the latest element, I would use it as the starting point while syncing with all other databases.
Also, remember, if you are working at the oplog level you are working with operations. You may want to consider working with the data at the database level.
But at the database level I do not have something like log.iterator({ amount: 1 }) to get the last document, and I also have no option to get documents newer than a specific hash, right?
from orbitdb.
You have the iterator available at the db level, yes. The params available to you will depend on the data store being used. I would recommend looking at https://api.orbitdb.org/ for more information about the functions available.
from orbitdb.
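My logic now is to get the log heads and then iterate over all log entries after them. This seems to work; I can get all new changes:
// Local database on the second peer (not used in the delta logic below).
const db2 = await orbitdb2.open('subscription-data', { type: 'documents' })

// Open the replica without syncing to capture the heads before any new entries arrive.
let db2a = await orbitdb2.open(db1.address, { sync: false })
const heads = await db2a.log.heads()
await db2a.close()

// Reopen with syncing enabled and list everything newer than the recorded head.
db2a = await orbitdb2.open(db1.address)
db2a.events.on('update', async (entry) => {
  let filter = {}
  if (heads.length > 0) {
    // Only the first head is used; with several heads this may not cover every branch.
    filter = { gt: heads[0].hash }
  }
  for await (const record of db2a.log.iterator(filter)) {
    console.log('new entry', record)
  }
})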
But according to the documentation, log.heads() returns an array. I'm wondering why multiple heads are possible and what I'm missing here. Also, I open the database with sync: false to get the original state, close it and open it again. Somehow this feels inefficient.
from orbitdb.
But the documentation of log.heads() returns an array.
You can have multiple heads because peers can add records at the same (logical) time.
I'm not sure why you are working at the log level. Why not process records from the document store then mark them as processed?
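To illustrate why a log can have more than one head, here is a toy model, not OrbitDB's implementation. It assumes each entry lists its predecessors in a `next` array and treats a head as any entry that no other entry points to; `findHeads` and the sample data are hypothetical.

```javascript
// Toy model of an append-only log: each entry references its predecessors via `next`.
// A head is an entry that no other entry references.
function findHeads (entries) {
  const referenced = new Set(entries.flatMap((e) => e.next))
  return entries.filter((e) => !referenced.has(e.hash))
}

// Two peers both append on top of 'a' before seeing each other's entry.
const concurrent = [
  { hash: 'a', next: [] },
  { hash: 'b', next: ['a'] }, // appended by peer 1
  { hash: 'c', next: ['a'] } // appended by peer 2 at the same logical time
]
const twoHeads = findHeads(concurrent).map((e) => e.hash) // ['b', 'c']

// A later entry that references both branches joins them into a single head.
const merged = [...concurrent, { hash: 'd', next: ['b', 'c'] }]
const oneHead = findHeads(merged).map((e) => e.hash) // ['d']
```

In this model, a filter based on a single head hash only describes one branch, which is one reason to treat the heads as a set when computing a delta.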
from orbitdb.
The first issue I see is that I do not have write access to the synced database replica. I assume your approach would be to overwrite the document with the same document plus a marker attribute in the JSON structure, i.e. write access is necessary, right? My understanding is that the database I open with orbitdb.open is the same instance as the remote one, just a local copy, and that changes to this copy would also be reflected in the original database of the owning peer.
Another issue I see is that I will miss deletions. How can I see a DELETE operation? I assume I cannot iterate over all documents and do a full database comparison to find out which documents have been deleted. My business process needs to know this, hence I was hoping to get all updates, including deletions, through the update listener.
Also, if I do this, the generic marker would be seen by other peers as well, and these peers would then think that they have already processed the document on their end, which is not the case. I could add a peer-specific marker instead, but this would mean having 100 specific marker attributes in the document to support 100 peers. And all peers would need write access.
Also, from what I see in the code, the query function uses the iterator as well, and I think I cannot afford a complete database scan. If I have 10000 documents, it looks like the iterator looks into all 10000 documents. If a specific query were possible, like an SQL query with a WHERE clause, then I could consider this. Am I wrong with my assumptions here?
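One way to picture the concerns above is to keep the "processed" marker in local, per-peer state instead of in the replicated document, and to look at the operation type of each log entry so that deletions are not missed. The following is a rough sketch under assumptions, not a recommended design: `LocalProcessor` is a hypothetical class, and the entry shape `{ hash, payload: { op, key, value } }` with `op` values 'PUT' and 'DEL' is assumed here and should be checked against the store you actually use.

```javascript
// Hypothetical per-peer bookkeeping kept outside the replicated database,
// so no write access to the remote database is needed and other peers
// never see this peer's markers.
class LocalProcessor {
  constructor () {
    // Hashes of the log entries this peer has already handled.
    this.processed = new Set()
  }

  // Assumption: entries look like { hash, payload: { op, key, value } }.
  // Returns true if the entry was handled now, false if it was seen before.
  handle (entry, { onPut, onDelete }) {
    if (this.processed.has(entry.hash)) return false
    const { op, key, value } = entry.payload
    if (op === 'PUT') {
      onPut(key, value)
    } else if (op === 'DEL') {
      onDelete(key)
    }
    this.processed.add(entry.hash)
    return true
  }
}
```

The set lives in memory only in this sketch; a real process would presumably need to persist it somewhere local so that a restart does not trigger reprocessing.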
from orbitdb.
What you're describing seems more like a software architecture problem rather than a bug or issue with OrbitDB. If you're looking for help with integration of OrbitDB with your software systems, you may get better traction on the OrbitDB Lobby where other developers may have run into similar implementation issues.
from orbitdb.
I have added a question in the Lobby. I will add it here if my problem can be solved.
from orbitdb.