hyperhyperspace-core's Introduction

Hyper Hyper Space

An offline-first shared data library for creating p2p apps that work in the browser (and now also NodeJs).

tl;dr

This library helps you create distributed data structures, mostly for p2p applications. It works like an object store, where objects have to follow some conventions to enable secure remote sync. You can see an example here. More info (including how to run the example) below.

Intro

In the same way in which the Internet bridges networks together, the Hyper Hyper Space attempts to create a cryptographically secure append-only distributed data layer that makes information universally accessible. We follow two guiding principles:

  • Make all data local: always read and modify data locally
  • Communicate only through data sync: do not use APIs or any form of remoting

You can read our White Paper to find out more about how this works.

This project is experimental. All APIs may change, bugs exist and the crypto has not been audited.

Playground

You can play with the library, using it to synchronize a plain javascript object from your browser's console, in this playground page.

Examples

To create datatypes that can be shared using HHS, you need to extend the HashedObject and MutableObject classes. You can learn more in the Data model section below, or jump to a few examples in this repo.

To run the example chat app, clone the examples repo and do

yarn build

yarn start

If you're using Windows, replace start with winstart above.

Objectives

Enable the creation of p2p apps that work in-browser, without requiring any infrastructure, and that are as practical and functional as centralized apps. Find abstractions and algorithms that make creating and reasoning about these apps intuitive and predictable. Explore new models for online collaboration platforms that follow the p2p model yet are frictionless to use for the general public, and are

  • respectful of everyone's privacy and data ownership rights
  • transparent in their handling of information

by default.

Data model

HHS uses an immutable typed-objects local storage model. Objects are both retrieved and cross-referenced using a structural hash of their contents as their id (a form of content-based addressing).
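
To make content-based addressing concrete, here is a minimal sketch that does not use the library. The helpers canonicalize and structuralHash, the choice of SHA-256 and the object shapes are all assumptions made for illustration; the library's actual hashing scheme may differ.

```typescript
import { createHash } from "crypto";

// Hypothetical helper (not the library's API): serialize a JSON-like value
// with object keys sorted, so structurally equal values yield the same string.
function canonicalize(value: unknown): string {
  if (Array.isArray(value)) {
    return "[" + value.map((item) => canonicalize(item)).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const fields = Object.keys(obj)
      .sort()
      .map((key) => JSON.stringify(key) + ":" + canonicalize(obj[key]));
    return "{" + fields.join(",") + "}";
  }
  return JSON.stringify(value); // strings, numbers, booleans, null
}

// The id of an object is a hash of its canonical contents.
function structuralHash(value: unknown): string {
  return createHash("sha256").update(canonicalize(value)).digest("hex");
}

const message = { type: "Message", text: "hi" };
const sameMessage = { text: "hi", type: "Message" }; // same contents, other key order
const messageId = structuralHash(message);

// Cross-reference: the reply points at the message by its hash.
const reply = { type: "Reply", inReplyTo: messageId, text: "hello" };
const replyId = structuralHash(reply);

console.log(messageId === structuralHash(sameMessage)); // true
console.log(messageId === replyId); // false
```

Because the id is derived from the contents, changing any field yields a different id, which is why objects addressed this way are treated as immutable.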

Mutability is implemented using CRDTs. Identities and data authentication are cryptographic.
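
As an illustration of CRDT-based mutability, the sketch below models a set whose state is derived from an unordered collection of operations. It is a generic observed-remove style set written for this example, not the library's MutableSet; the types AddOp, RemoveOp and the function materialize are hypothetical names.

```typescript
// Each operation is an immutable record. A removal names the add operations
// it cancels, so the outcome does not depend on delivery order.
type AddOp = { kind: "add"; id: string; element: string };
type RemoveOp = { kind: "remove"; id: string; removedAdds: string[] };
type SetOp = AddOp | RemoveOp;

// Derive the current contents of the set from all known operations.
function materialize(ops: SetOp[]): string[] {
  const removed = new Set<string>();
  for (const op of ops) {
    if (op.kind === "remove") {
      for (const addId of op.removedAdds) removed.add(addId);
    }
  }
  const elements = new Set<string>();
  for (const op of ops) {
    if (op.kind === "add" && !removed.has(op.id)) elements.add(op.element);
  }
  return Array.from(elements).sort();
}

const ops: SetOp[] = [
  { kind: "add", id: "op1", element: "apple" },
  { kind: "add", id: "op2", element: "pear" },
  { kind: "remove", id: "op3", removedAdds: ["op1"] },
  { kind: "add", id: "op4", element: "apple" }, // re-added after the removal
];

// Peers that hold the same operations reach the same state, in any order.
const inOrder = materialize(ops);
const reversed = materialize(ops.slice().reverse());
console.log(inOrder, reversed);
```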

Objects and their references form an immutable DAG, a fact that is used for data replication in the HHS p2p mesh.

You can read more about the HHS data model, including code samples, here.

Mesh network

A peer in the HHS mesh network is a pair containing an identity (i.e. a typed identity object per the data model above) and an endpoint (URL). The in-browser networking used by HHS is based on WebRTC. While this allows direct browser-to-browser data streams, WebRTC connection establishment needs the two parties to exchange a few messages out-of-band, using a signalling server. We have developed a tiny service for this (77 lines of Python at the moment). While everyone can run their own, we are providing a public instance running at the URL wss://mypeer.net:443.

To listen for peer connections, the browser will form an endpoint using the signalling server URL and some arbitrary information (usually involving its identity hash, but that is determined by the app), and connect to the signalling server over a websocket. To connect to another peer, the browser will open a websocket to the other peer's signalling server. Two peers don't need to use the same signalling server to be able to connect.
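
The following sketch shows one way a peer and its endpoint could be represented. The endpoint layout (signalling server URL followed by a path segment), the helper names makeEndpoint and signallingServerOf, and the host signalling.example.org are assumptions for illustration only; the library defines its own endpoint format.

```typescript
// A peer, as described above: an identity plus an endpoint URL.
type Peer = { identityHash: string; endpoint: string };

// Hypothetical endpoint format: "<signalling server URL>/<app-chosen info>".
function makeEndpoint(signallingServer: string, info: string): string {
  const base = signallingServer.replace(/\/+$/, "");
  return base + "/" + encodeURIComponent(info);
}

// Recover the signalling server to open a websocket to when connecting to a peer.
// Note: the URL parser drops default ports (e.g. 443 for wss).
function signallingServerOf(endpoint: string): string {
  const url = new URL(endpoint);
  return url.protocol + "//" + url.host;
}

const alice: Peer = {
  identityHash: "alice-identity-hash",
  endpoint: makeEndpoint("wss://mypeer.net:443", "alice-identity-hash"),
};
const bob: Peer = {
  identityHash: "bob-identity-hash",
  endpoint: makeEndpoint("wss://signalling.example.org:8443", "bob-identity-hash"),
};

// Alice and Bob listen on different signalling servers; to reach Bob,
// Alice connects to the server named in Bob's endpoint.
console.log(alice.endpoint);
console.log(signallingServerOf(bob.endpoint));
```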

Peer groups use simple randomized algorithms to choose how peers interconnect to each other within the group, epidemic gossip to discover any new state, and cryptographically secured deltas to send missing operations back and forth.

Apps will configure groups of peers, and the HHS mesh provides primitives for effortlessly synchronizing objects within each peer group (this boils down to synchronizing their sets of CRDT operations for each shared object).
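
A rough sketch of what synchronizing sets of operations amounts to: each peer advertises the hashes of the operations it holds, and the other side requests whatever it is missing. The class PeerState and the function gossipRound are hypothetical; the real mesh also verifies the received operations cryptographically and chooses peers at random, both of which are omitted here.

```typescript
type OpHash = string;

// In-memory stand-in for the operations one peer holds for a shared object.
class PeerState {
  private ops = new Map<OpHash, string>(); // op hash -> op payload

  add(hash: OpHash, payload: string): void {
    this.ops.set(hash, payload);
  }

  // Gossip: tell the other side which ops we have.
  advertise(): OpHash[] {
    return Array.from(this.ops.keys()).sort();
  }

  // Which of the advertised ops are we missing?
  missing(advertised: OpHash[]): OpHash[] {
    return advertised.filter((hash) => !this.ops.has(hash));
  }

  // Serve the requested ops that we actually hold.
  fetch(hashes: OpHash[]): Array<[OpHash, string]> {
    const found: Array<[OpHash, string]> = [];
    for (const hash of hashes) {
      const payload = this.ops.get(hash);
      if (payload !== undefined) found.push([hash, payload]);
    }
    return found;
  }

  receive(entries: Array<[OpHash, string]>): void {
    for (const [hash, payload] of entries) this.ops.set(hash, payload);
  }
}

// One exchange between two peers: afterwards both hold the union of ops.
function gossipRound(a: PeerState, b: PeerState): void {
  const wantedByA = a.missing(b.advertise());
  const wantedByB = b.missing(a.advertise());
  a.receive(b.fetch(wantedByA));
  b.receive(a.fetch(wantedByB));
}

const peerA = new PeerState();
const peerB = new PeerState();
peerA.add("h1", "add apple");
peerA.add("h2", "add pear");
peerB.add("h2", "add pear");
peerB.add("h3", "remove apple");
gossipRound(peerA, peerB);
console.log(peerA.advertise(), peerB.advertise());
```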

Spaces

A space is a data unit that can be shared and discovered easily. It has a root object that can be used to bootstrap and synchronize the space.

Project status

There is a demo of a simple fully in-browser p2p chat app running here. However, the library has been fully rewritten since that demo was created.

Re-wiring the demo to use the current version of the library is currently WIP. Check out the Account library in the next section.

Using outside the browser

If you need to use this library directly in NodeJs, outside of a web browser, you need to import @hyper-hyper-space/node-env.

hyperhyperspace-core's People

Contributors

jdb8, joigno, micahscopes, potherca, sbazerque, sbillig, ylebre

hyperhyperspace-core's Issues

Support recursive sync

The new mutation events should allow a correct implementation of sync'ing in recursive mode.

Support for changing the default PeerGroup in MeshNode

If no peer group is specified in the MeshNode, a default peer group is created per-object. Add support for accessing the default peer group for a given object, and for changing the default peer group per node (e.g. after setting, and if no peer group is supplied to sync, that peer group should be used).

Stop & start space sync seems to fail

Stopping & starting the sync of a space (for example by choosing "No" in the Sync column, then "Yes" again) seems to leave the space in a non-sync'ing state (or at least that is what it looks like in the UI).

Improve hashing of public keys

When hashing a public key in the process of creating an identity, remove all whitespace / newlines before doing the actual hash, in order to make the process deterministic.
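
A minimal sketch of the proposed normalization, assuming the key is available as a PEM-style string. The function name hashPublicKey and the use of SHA-256 are illustrative choices, not the library's actual implementation, and the key material below is a made-up placeholder.

```typescript
import { createHash } from "crypto";

// Strip all whitespace (spaces, tabs, CR, LF) before hashing, so the same key
// hashes identically regardless of line wrapping or line-ending conventions.
function hashPublicKey(pem: string): string {
  const normalized = pem.replace(/\s+/g, "");
  return createHash("sha256").update(normalized, "utf8").digest("hex");
}

// Placeholder key material, wrapped in two different ways.
const unixStyle = "-----BEGIN PUBLIC KEY-----\nMFkwEwYH\nKoZIzj0C\n-----END PUBLIC KEY-----\n";
const windowsStyle = "-----BEGIN PUBLIC KEY-----\r\nMFkwEwYHKoZIzj0C\r\n-----END PUBLIC KEY-----";

console.log(hashPublicKey(unixStyle) === hashPublicKey(windowsStyle)); // true
```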

Make the Store & its backends support checkpointing

Right now, whenever obj.loadAllChanges() is called, all the mutations are loaded from the store and applied. It'd be very convenient to enable mutable types to save a checkpoint to the store when they so choose, and to have loadAllChanges() automatically use the most recent checkpoint, applying only those mutations that happened after that point.
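
The idea can be sketched with a toy counter type. Everything here is hypothetical (CounterStore, saveCheckpoint, and a linear seq number standing in for the real ordering of mutations, which in HHS is a DAG rather than a sequence); only the name loadAllChanges is taken from the issue, and this version is a free function rather than the library's method.

```typescript
type Mutation = { seq: number; delta: number };
type Checkpoint = { lastSeq: number; value: number };

// Toy store: mutations are appended in seq order, plus an optional checkpoint.
class CounterStore {
  mutations: Mutation[] = [];
  checkpoint: Checkpoint | undefined = undefined;
}

// Start from the most recent checkpoint (if any) and apply only later mutations.
function loadAllChanges(store: CounterStore): { value: number; applied: number } {
  let value = store.checkpoint ? store.checkpoint.value : 0;
  const after = store.checkpoint ? store.checkpoint.lastSeq : -1;
  let applied = 0;
  for (const mutation of store.mutations) {
    if (mutation.seq > after) {
      value += mutation.delta;
      applied += 1;
    }
  }
  return { value, applied };
}

// Record the current state so that later loads can skip older mutations.
function saveCheckpoint(store: CounterStore): void {
  const { value } = loadAllChanges(store);
  const count = store.mutations.length;
  const lastSeq = count > 0 ? store.mutations[count - 1].seq : -1;
  store.checkpoint = { lastSeq, value };
}

const store = new CounterStore();
store.mutations.push({ seq: 0, delta: 1 }, { seq: 1, delta: 2 }, { seq: 2, delta: 3 });
saveCheckpoint(store);
store.mutations.push({ seq: 3, delta: 4 });
console.log(loadAllChanges(store)); // { value: 10, applied: 1 }
```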

Dump and restore spaces to JSON

Dump and restore Hyper Hyper Space spaces to and from JSON literals.

So starting from the space's entry point, it should be possible to dump all its internal state into a single JSON-encoded string. The JSON should have enough metadata to recover the space without any additional information. This is fairly similar to what Context and especially LiteralContext do. It may be easier to go through a (memory-backed?) store to do the saving, and then use its contents to populate a LiteralContext to be included in the generated JSON. For loading it back up again, maybe it would be easier to load the contents of the LiteralContext back into a store (may be optional - if omitted we could just create a memory-backed one) than to attempt to populate all the objects recursively and automatically.

The generated JSON could be a first file format for importing / exporting spaces to and from files.
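
One possible shape for such a dump, sketched without the library: a Map stands in for the memory-backed store, and the types StoredObject and SpaceDump, the functions dumpSpace and restoreSpace, and the format tag are all hypothetical. The actual implementation would presumably build on Context and LiteralContext as described above.

```typescript
// Stand-in for a stored literal: its value plus the hashes it references.
type StoredObject = { value: unknown; references: string[] };
type SpaceDump = { format: string; root: string; objects: Record<string, StoredObject> };

// Walk the references from the root and collect every reachable object.
function dumpSpace(store: Map<string, StoredObject>, root: string): string {
  const objects: Record<string, StoredObject> = {};
  const pending: string[] = [root];
  while (pending.length > 0) {
    const hash = pending.pop() as string;
    if (Object.prototype.hasOwnProperty.call(objects, hash)) continue;
    const obj = store.get(hash);
    if (obj === undefined) throw new Error("missing object " + hash);
    objects[hash] = obj;
    for (const ref of obj.references) pending.push(ref);
  }
  const dump: SpaceDump = { format: "space-dump-sketch-v1", root, objects };
  return JSON.stringify(dump);
}

// Load a dump into a store; a fresh memory-backed one is created if omitted.
function restoreSpace(
  json: string,
  store: Map<string, StoredObject> = new Map<string, StoredObject>()
): { root: string; store: Map<string, StoredObject> } {
  const dump = JSON.parse(json) as SpaceDump;
  for (const hash of Object.keys(dump.objects)) store.set(hash, dump.objects[hash]);
  return { root: dump.root, store };
}

const source = new Map<string, StoredObject>();
source.set("room", { value: { name: "lobby" }, references: ["msg1", "msg2"] });
source.set("msg1", { value: { text: "hi" }, references: [] });
source.set("msg2", { value: { text: "hello" }, references: ["msg1"] });
source.set("other", { value: { text: "not part of this space" }, references: [] });

const restored = restoreSpace(dumpSpace(source, "room"));
console.log(restored.root, Array.from(restored.store.keys()).sort());
```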

Comparison with other p2p efforts

Hi, just read the white paper, and this project seems great.

There are a lot of similar p2p projects with similar goals and principles currently going on.

I suggest we list similar projects here and discuss the differences, overlaps and potential for collaboration & integration. After all, interoperability within one ecosystem is worthless if it's only another isolated platform that competes with the others without interoperating with them.

Off the top of my head:

Make sync status visible from outside the MeshNode

Even though the real sync state of objects cannot be known (because of network availability, gossip propagation time, etc.), we should make the discovered state observable from outside the sync agent.

(Don't forget to update the react bindings as well)

Assumptions/questions after a few days of reading

I'm really excited to be learning HHS, as I can see a lot of great, very intelligent work has gone into it already. From reading the documentation it sounds like it will work for the app I'm building. Now I'm trying to understand how I would use HHS to implement the data model for my app. I've used many different databases but nothing like p2p. I'm very new to this space (no depth of knowledge of Merkle DAGs, spanning trees, WebRTC etc...but I get graphs). I'm trying to bridge the gap between what I want to do and how HHS works. Here's what I've been able to piece together after a few days with a fresh set of eyes. My apologies in advance if I missed anything obvious in the docs. If anyone is able to confirm or correct my understanding and point me in the right direction I would really appreciate it! Open source examples are always welcome.


High Level

I can put my static SPA app on a CDN and peers will sync app data among themselves. A signaling server is needed to aid in this process, but load on the server is not high. HHS is like a toolbox with tools to enable this.

A Store is an abstraction for actual data persistence on a peer, for instance IndexedDB in the browser. An app's data model is made up of many interlinked spaces. Sync logic takes data in a 'Space' and updates the store on a peer using the verify function of each class. The full graph of spaces is represented across the network of peers. Individual peers only have the spaces they edited or requested. Access is granted or revoked using the CausalSet. The whitepaper says CapabilitySet but I think the API has changed since it was written.

Spaces

Spaces can be nested within other spaces by referencing those spaces. Spaces can be discovered independently (of any particular app) via a 3-word code. A space is an abstraction for exactly one grouping of data, "a chat room, an article, a blog, an ecommerce store, etc." A class (written by a dev) defines that grouping of data, a potential combination of literals and references to other spaces/class instances, based on business logic. A space is implemented when a class inherits from either an immutable HashedObject or a mutable abstraction, i.e. MutableSet, which is a MutableObject.

When to use one or the other, I'm not sure. Seems like a good use-case for HashedObject would be an error log. Or would an error log be represented as a HashedObject nested inside a MutableSet? What are some good use-cases for an immutable space? Will most app data be mutable?

"They can be universally looked up using 3-word codes, like suburb-suburb-awake." If a space really is a class instance, does that mean every single class instance needs a 3-word code? Is this what the Shuffle class is for? If I code a BlogSet class and an Article class, and there are 3 articles in the blog, does that mean there must be 4 spaces total for the blog? Is there ever a situation where the blog would be just one space?

Data Model

(my main question...how to model my app data?)

It seems like an application data model would represent a graph, with the application (for a particular user) pointing to some logical root. Various spaces/classes would then be organized under the root in a way that makes sense according to the business logic. Each user would have varying levels of access to all or parts of this node, e.g.:

App Data (for Acme Employee: Bob) =>

Acme Corp {
  Meta: {orgSpace: "acme-corp-best"}
  EmployeeSet -> Employee: {name, startDate, title, boss...}
  CustomerSet -> Customer: {name, address...}
  OrderSet -> Order: {date, total, Lines -> LineItem: {product, qty, price, amount, total}, customer...}
  ProductSet -> Product: {name, priceSet...}
  OutsideMeetingSet -> Meeting: {mtgDateTime, discussion, Invites -> Invite:{orgSpace: "abc-corp-great"}}
}

*Bob may be granted access to all or parts of this graph depending on his role at Acme.
*This is not the app I want to build, just an example.

Peer Discovery

"Peers in the network form application-defined groups over which mutable objects are synchronized. The method for obtaining peers is also application defined..." I'm not sure what this means. Let's say both Acme Corp ("acme-corp-best") and ABC Corp ("abc-corp-great") are pointing to their respective Org graphs. Does HHS have logic, or does the app need custom logic, to make sure Acme employees are primarily trying to sync with other Acme employees and not ABC employees, other than for shared data like an OutsideMeeting discussion Forum which both companies can access? How does an app define peer groups? Implicitly? Or is there example code for this?

Other questions:

  • I'm not able to find this link. Does it still exist somewhere?
  • How does HHS handle a class being modified, i.e. adding or removing an attribute on an existing Space/Class?
  • With the library being relatively new, are there abstractions I should avoid for now? Any pitfalls to be aware of?
  • I'm at 984kb for the es module...very hefty for the type of app I want to build. Uglify may be able to cut that in half. Are there plans to reduce that or has anyone attempted tree-shaking?
  • Are users able to log in seamlessly from any device? If so, how secure is their password?
  • Besides the examples within the HHS github, are there any others that would serve as good examples?
  • "...the Hyper Hyper Space project proposes a framework for universal information access." I think P2P is one way the little guy can compete with BigTech because of the potential for scalability. But, ironically, completely open data gives that monopoly a warm welcome through the front door. In the client/server model, companies can protect (some) data from competitors indefinitely. But patents expire after a number of years. Let's say the little guy builds a really great platform and corresponding graph. Is there a way to protect that investment from a well-funded competitor?
  • Are there any performance benchmarks for HHS?
  • How many signaling servers are required for x number of users?
  • Gun uses a relay server to assist in data persistence and availability. Does HHS factor this in?
  • Are there plans in the works for documentation of HHS classes and various use-cases or comparisons with other database models...something to help bridge that mental model gap?
  • From p2p-chat-cli "the chat room is lost forever"...Is this because the app is syncing but not persisting to a store?

Again, great work on HHS and thank you to anyone who can help confirm/deny assumptions and answer questions. It looks very promising!

ES Module support

Hello, I was wondering if you have any plans to provide an ES module that could be requested via a CDN?

I am having trouble building this on my local machine because it looks like the wrtc library only runs on Node 14 and under. I am on a machine that is running Node 15 currently.

This library looks fantastic and would love to explore it more!

Create an identity management module

Right now, Identity objects are HashedObject derivatives and are stored as any other object. They contain some information about the holder of the identity (just an info map; I usually use just two fields, info.name and info.type, but it's really application defined) and a key pair. When an object that has been assigned an author (by means of obj.setAuthor(id)) is saved in the store, that id object will be looked up in the store and used to cryptographically sign obj. Hence the store is functioning as an identity & key store as well. A trick is used to prevent the private part of the key pair from being synchronized whenever the identity is sent to other peers: the hash of the private part is replaced by a custom computation that's done on-the-fly, so that the Identity object has no hash-references pointing to the private key that would make the synchronizer actually send it.

I'd like to move the identity / signing part to a different module, and allow the application to impose limitations on what can be signed with each loaded identity. I'd also like to make this more explicit, and not rely on any exceptional behavior.

Add support for marking some prevOps as obsolete

Right now it is not possible to indicate that some mutation operations can safely be ignored. I'd like to add another special field to MutationOp that is similar to prevOps but, instead of indicating that this op comes after some others, it indicates that such previous ops are obsolete and that they should not be fetched.

The sync agent should take this field into consideration when backtracking from the current state, and should not ask for ops that have been obsoleted.

Since in the most common case this cannot be done without breaking BFT, try to do it in a way that doesn't impose the burden of checking that there is no op-obsolescence on all the types that don't support it (maybe a parameter passed to MutableObject's constructor enabling obsolescence?).
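
One reading of this proposal, sketched as a traversal over an in-memory map of ops. The field name obsoleteOps, the function opsToFetch and the decision to stop backtracking at an obsoleted op are assumptions made for illustration; the sketch ignores the BFT concerns mentioned above, and an op reached before the op that obsoletes it would still be fetched.

```typescript
// prevOps: ops this one comes after. obsoleteOps (hypothetical field name):
// ops that no longer need to be fetched.
type MutationOp = { hash: string; prevOps: string[]; obsoleteOps: string[] };

// Backtrack from the current heads, skipping ops that a visited op marked obsolete.
// The map stands in for what remote peers would return when asked for an op.
function opsToFetch(heads: string[], ops: Map<string, MutationOp>): string[] {
  const obsolete = new Set<string>();
  const visited = new Set<string>();
  const order: string[] = [];
  const queue = heads.slice();
  while (queue.length > 0) {
    const hash = queue.shift() as string;
    if (visited.has(hash) || obsolete.has(hash)) continue;
    const op = ops.get(hash);
    if (op === undefined) continue;
    visited.add(hash);
    order.push(hash);
    for (const old of op.obsoleteOps) obsolete.add(old);
    for (const prev of op.prevOps) {
      if (!obsolete.has(prev)) queue.push(prev);
    }
  }
  return order;
}

const history = new Map<string, MutationOp>();
history.set("root", { hash: "root", prevOps: [], obsoleteOps: [] });
history.set("b", { hash: "b", prevOps: ["root"], obsoleteOps: [] });
history.set("x", { hash: "x", prevOps: ["root"], obsoleteOps: [] });
history.set("c", { hash: "c", prevOps: ["b", "x"], obsoleteOps: ["b"] });

console.log(opsToFetch(["c"], history)); // [ 'c', 'x', 'root' ]
```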
