
go-ipld-deprecated's Introduction

This package is deprecated!

See instead:

go-ipld

standard-readme compliant

The Go implementation of IPLD

This is the Go implementation of the IPLD spec.

WIP

Install

TODO

Usage

TODO

Contribute

Feel free to join in. All welcome. Open an issue!

This repository falls under the IPFS Code of Conduct.

Want to hack on IPFS?

License

MIT

go-ipld-deprecated's People

Contributors

jbenet, mildred, richardlitt, stebalien, warpfork, whyrusleeping


go-ipld-deprecated's Issues

Question about multicodec header and object hash

I am wondering how to make sure we won't be changing every hash of every object once we switch from protobufs to ipld. We have the multicodec to handle that transition, but that doesn't work for hashing, because hash(protobuf) != hash(multicodec_header + protobuf).

In that case:

  • do we want to transfer the full object with the multicodec header, but when computing hash, ignore the header if it is pointing to the old protobuf format?
  • do we want to transfer the same object we have now, without the multicodec header (hashing is not changed) and detect when the header is absent and decode it as protobuf?

I'd lean towards the second solution, and this needs to be implemented (I can't see any hint that this is already implemented in the code). Or perhaps it shouldn't be implemented in go-ipld but rather in the caller package, when dealing with an error (header mismatch or something like that).

@jbenet, what do you think? Do we need to implement a fallback mechanism? In go-ipld or elsewhere?
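The second option above could be sketched like this: check whether the data starts with a multicodec header, and if not, fall back to treating it as a legacy protobuf object. This is a minimal sketch; the header bytes and the `splitHeader` helper are hypothetical (real multicodec headers are length-prefixed paths), but it shows why hashing only the payload keeps legacy hashes stable.

```go
package main

import (
	"bytes"
	"fmt"
)

// legacyHeader is a hypothetical multicodec header; real headers are
// length-prefixed paths such as "/mdagv1".
var legacyHeader = []byte("/mdagv1\n")

// splitHeader returns the multicodec header (if present) and the raw
// payload. Hashing only the payload keeps old protobuf hashes stable.
func splitHeader(data []byte) (header, payload []byte) {
	if bytes.HasPrefix(data, legacyHeader) {
		return legacyHeader, data[len(legacyHeader):]
	}
	// No header: assume a legacy protobuf object (second option above).
	return nil, data
}

func main() {
	header, payload := splitHeader([]byte("/mdagv1\nrawbytes"))
	fmt.Printf("header=%q payload=%q\n", header, payload)
}
```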

protocol buffer compatibility

There is something that puzzles me in coding/pb: the JSON format into which we deserialize the protocol buffer object. A unixfs directory would look like this in the new IPLD format:

{
  "file.txt": {
    "mlink": "<hash>"
  }
}

But the format generated from the protocol buffer looks like this:

{
  "links": [
    { "name": "file.txt", "hash": "<hash>", "size": <size> }
  ],
  "data": "<data>"
}

The problem is that links will be addressed completely differently. Today you can get the file at the path /ipfs/<hash of pb object>/file.txt; once we use IPLD, the file will no longer be accessible at that path but rather at /ipfs/<hash of pb object>/files/0.

How do you plan on making the JSON format you generate from the protocol buffer object compatible with the new IPLD format? I still don't get it.
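One way to keep the legacy paths working would be to re-key the protobuf link list by link name when generating the IPLD view. This is only a sketch of that idea, not the actual coding/pb implementation; the `Link` struct fields and the `byName` helper are hypothetical.

```go
package main

import "fmt"

// Link mirrors a protobuf merkledag link (hypothetical field set).
type Link struct {
	Name string
	Hash string
	Size uint64
}

// byName re-keys a pb-style link list by link name, so the legacy path
// /ipfs/<hash>/file.txt can still resolve against the IPLD-style object.
func byName(links []Link) map[string]map[string]interface{} {
	out := make(map[string]map[string]interface{})
	for _, l := range links {
		out[l.Name] = map[string]interface{}{"mlink": l.Hash, "size": l.Size}
	}
	return out
}

func main() {
	m := byName([]Link{{Name: "file.txt", Hash: "<hash>", Size: 4}})
	fmt.Println(m["file.txt"]["mlink"])
}
```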

Namespacing/Addressing thoughts

I was wondering about using/supporting this for a key addressing scheme.

Consider these references for a key:

dbname/tld/domain/ipld/somekey/childkey
domain.tld/dbname/ipld/somekey/childkey
ipld.domain.tld/dbname/somekey/childkey
somekey.ipld.domain.tld/dbname/childkey
childkey.somekey.ipld.domain.tld/dbname/

Everything but the first one could have http:// thrown in front of it and work.

Users wanting DNS compatible databases should put a {"tld" : {"domain": {"ipld": {link}}}} key link at the top of their db that links to the rest of their database. This takes care of potential data value requests for those parts of the keyspace hierarchy that were created primarily for making the DNS query work....

From there "dbaddress" is some top level ultimate database starting point in the IPLD infrastructure.

While this database might be linked to by some other database (possibly multiple even), the way it was referenced makes it impractical to try and "go up" from here.

....

Everything coming before the first / is treated as the root context for database operations. The optional :portid after the domain name is an "allowed but ignored" part of the root context.
Everything after dbaddress is treated as the application's referenced keyname at that db root context.

Extended query response objects could be something like:

for childkey.somekey.ipld.domain.tld/dbname

{
  "database": "dbaddress",
  "root": "/tld/domain/ipld/somekey/childkey/",
  "relid": "",
  "key": "/tld/domain/ipld/somekey/childkey",
  "value": { "key": "datavalue" }
}

for ipld.domain.tld/dbname/somekey/childkey

{
  "database": "dbaddress",
  "root": "/tld/domain/ipld/",
  "relid": "somekey/childkey",
  "key": "/tld/domain/ipld/somekey/childkey",
  "value": { "key": "datavalue" }
}

for dbname/tld/domain/ipld/somekey/childkey

{
  "database": "dbaddress",
  "root": "/",
  "relid": "tld/domain/ipld/somekey/childkey",
  "key": "/tld/domain/ipld/somekey/childkey",
  "value": { "key": "datavalue" }
}

This approach treats every key in the database as a potential database root for a query, and it mimics, and is compatible with, the familiar http namespaces.


Because this naming convention enables any key in the hierarchy to be referenced as a root, it might be leveraged to create access control points for authorization. Just because any key could be made a root doesn't mean the system has to accept that any particular root is valid for this database.
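The DNS-style references above can be parsed mechanically: reverse the host labels to get the root context, treat the first path segment as the dbname, and the remainder as the relative key. This sketch covers only the DNS-style forms (not the `dbname/...` form); `parseRef` is a hypothetical helper illustrating the examples above.

```go
package main

import (
	"fmt"
	"strings"
)

// parseRef splits a DNS-style reference like
// "ipld.domain.tld/dbname/somekey/childkey" into the root context
// (reversed host labels), the relative key, and the full key,
// following the query-response examples above.
func parseRef(ref string) (root, relid, key string) {
	parts := strings.SplitN(ref, "/", 2)
	labels := strings.Split(parts[0], ".")
	// Reverse the host labels: ipld.domain.tld -> /tld/domain/ipld/
	for i := len(labels) - 1; i >= 0; i-- {
		root += "/" + labels[i]
	}
	root += "/"
	if len(parts) == 2 {
		// rest[0] is the dbname; everything after it is the relative key.
		rest := strings.SplitN(parts[1], "/", 2)
		if len(rest) == 2 {
			relid = rest[1]
		}
	}
	key = strings.TrimSuffix(root, "/")
	if relid != "" {
		key += "/" + relid
	}
	return root, relid, key
}

func main() {
	fmt.Println(parseRef("ipld.domain.tld/dbname/somekey/childkey"))
}
```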

Resolve remaining IPLD questions

@mildred and i discussed where IPLD should go and what to do with the JSON-LD trickiness encountered. We should finish resolving what the plan is and document it here.

Implement the IPLD spec

This issue is here to track the differences between the current implementation and the IPLD spec:

  • ipfs/specs#59 (not yet decided): Convert protobuf messages to the specified format
  • ipfs/specs#61 (not yet decided): Encode in CBOR using tags
  • ipfs/specs#62 or ipfs/specs#60 (not yet decided): Implement path resolution and walking
  • ipfs/specs#63 and #15: Test that reading a protocol buffer message without the multicodec header works.
  • ipfs/specs#64 (not yet decided): Use the correct key for link names (link vs mlink)
  • remove everything about escaping as it seems we don't need it any longer
  • implement changes to spec (as in ipld/js-ipld-dag-cbor#17 and ipfs/specs#101)
  • decide the structure of the code that we were discussing before

If I missed anything, please tell me so I can add it to the list.

@context vs @type

@jbenet quick question that was left unanswered during the meeting:

@context can make our JSON blobs look nicer, but it also means that other people would have to use our context or create their own while importing the @type's from our @context, since there can be only one context per JSON blob.

What would be the best way for people to merge our @context into their @context?

//cc @whyrusleeping

rebase on go-multicodec

https://github.com/jbenet/go-multicodec is ready. we need to pull it in and rebase go-ipld on top of it. We should basically remove https://github.com/ipfs/go-ipld/tree/master/coding.

Tricky parts:

  • ugorji codec allowed us to specify the map type to use when decoding. looks like the new decoders do not. not sure what we can do about that without walking the whole data structure and converting everything.
  • make sure decoding different wire types (cbor, json) to the same object works ok.
  • make a package for marshaling + hashing (multihash) an ipld object. (maybe coding pkg can turn into this?)
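The marshaling + hashing package from the last bullet could be as small as this. It is only a sketch: a real implementation would use go-multicodec for the wire format and go-multihash for the digest, so plain JSON and SHA-256 stand in here, and `marshalAndHash` is a hypothetical name.

```go
package main

import (
	"crypto/sha256"
	"encoding/json"
	"fmt"
)

// marshalAndHash sketches the proposed coding package: marshal an IPLD
// node to its wire form, then hash the marshaled bytes. JSON and
// SHA-256 are stand-ins for multicodec and multihash.
func marshalAndHash(node map[string]interface{}) ([]byte, [32]byte, error) {
	wire, err := json.Marshal(node)
	if err != nil {
		return nil, [32]byte{}, err
	}
	return wire, sha256.Sum256(wire), nil
}

func main() {
	wire, digest, _ := marshalAndHash(map[string]interface{}{"comment": "hi"})
	fmt.Printf("%s %x\n", wire, digest[:4])
}
```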

notes

these are all equivalent.

(1) no context, direct linked data link.

{
  "author": { 
    "@type": "/ipfs/<hash-of-mlink-schema>/mlink", 
    "@value": "<hash-of-author>"
  },
  "committer": { 
    "@type": "/ipfs/<hash-of-mlink-schema>/mlink", 
    "@value": "<hash-of-committer>"
  },
  "object": { 
    "@type": "/ipfs/<hash-of-mlink-schema>/mlink", 
    "@value": "<hash-of-object>"
  },
  "comment": "hello there this is a comment."
}

(2) mdag general context, use mlink shortcut.

define general mdag context

{
  "@context": {
    "mlink": "/ipfs/<hash-of-mlink-schema>/mlink"
  }
}

use it in commits

{
  "@context": "/ipfs/<hash-of-mdag-schema>/mdag",
  "author": { 
    "@type": "mlink", 
    "@value": "<hash-of-author>"
  },
  "committer": { 
    "@type": "mlink", 
    "@value": "<hash-of-committer>"
  },
  "object": { 
    "@type": "mlink", 
    "@value": "<hash-of-object>"
  },
  "comment": "hello there this is a comment."
}

(3) specific commit context. well typed

define commit context

{
  "@context": [
    "/ipfs/<hash-of-mdag-schema>/mdag", // import mdag.
    {
      "author": { "@type": "mlink" },
      "committer": { "@type": "mlink" },
      "object": { "@type": "mlink" },
      "comment": { "@type": "string" }
    }
  ]
}

use it in commits

{
  "@context": "/ipfs/<hash-of-commit-schema>/commit",
  "author": "<hash-of-author>",
  "committer": "<hash-of-committer>",
  "object": "<hash-of-object>",
  "comment": "hello there this is a comment."
}

IPLD Data Model

So, reading #4, there seem to be a few separate issues:

  • what gets encoded into the wire format
  • how that gets represented in a human-readable format (JSON, YAML, etc)
  • how IPLD maps to language-specific datatypes

A major deficiency in JSON is its lack of (user-defined) datatypes. Several workarounds have been proposed, each reserving a special key in every JSON object.

It looks like the @context key proposed in #4 is trying to achieve the same thing.

Whilst this is a reasonable solution for encoding into JSON, I don't think it should be a fundamental part of IPLD, as other representations have proper support for representing type information.

It would be nice if these features could be supported by IPLD.

So, on the wire, we could have CBOR-tagged data, encoding this into JSON would give something like:

{"@some_reserved_key":"identifier for Person type",
 "name":"David"}

or in YAML:

!Person { name: David }

and mapping into native (say, JS) datatypes:

class Person {
  constructor(name) {
    this.name = name;
  }
}
const object = ipld.decode(data, {'person type': Person});

CC: @jbenet
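The registry idea in the JS snippet could look like this in Go: dispatch on a reserved key (standing in for a CBOR tag) to look up a constructor for the application type. All names here (`@tag`, `decoders`, `decodeTagged`) are hypothetical illustration, not an existing API.

```go
package main

import "fmt"

// Person is an application type a tagged value might decode into.
type Person struct{ Name string }

// decoders maps a type identifier (e.g. a CBOR tag or reserved-key
// value) to a constructor; the registry and its keys are hypothetical.
var decoders = map[string]func(map[string]interface{}) interface{}{
	"Person": func(m map[string]interface{}) interface{} {
		return Person{Name: m["name"].(string)}
	},
}

// decodeTagged dispatches on a reserved "@tag" key, mirroring the JS
// ipld.decode(data, {'person type': Person}) idea above.
func decodeTagged(m map[string]interface{}) interface{} {
	if tag, ok := m["@tag"].(string); ok {
		if mk, ok := decoders[tag]; ok {
			return mk(m)
		}
	}
	// Unknown tag or no tag: return the raw map unchanged.
	return m
}

func main() {
	v := decodeTagged(map[string]interface{}{"@tag": "Person", "name": "David"})
	fmt.Println(v)
}
```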

Implement missing features to include IPLD in IPFS

Perhaps we don't need to do anything, but I would think we are missing a few things to integrate IPLD into IPFS.

If I'm not mistaken, the go-ipld package should replace the github.com/ipfs/go-ipfs/merkledag package. Is that right?

If so, I'd think we should modify the old merkledag package to look a bit more like go-ipld (remove external access to struct members, rename a few things to match go-ipld function names if needed). That will also require modifying the core of ipfs.

Then, we must implement any missing function (if any) in go-ipld, and make the switch.

Does that sound like a good idea? @jbenet, anyone?

Replace the Node type (which is a map) with an interface ?

Currently, the Node type is defined as:

type Node map[string]Node

This makes all IPLD users very dependent on the implementation of the Node object, and forces IPLD to parse everything in the JSON/CBOR data in order to put it in the map. The alternative is to provide only a handle on the data, decode it on demand, and let the Node consumer store just the things it needs.

Say an application just needs some small keys on many Node objects that otherwise contain huge chunks of data. We can imagine a mode of operation where the huge data is left on disk while the file is merely traversed to look for the small keys the application needs. The big data shouldn't need to be kept in memory if we don't use it.

To allow this, I'd propose to make Node an interface equipped with a walk function (that would walk the JSON-like data structure, not local paths as the current walk function does):

type Node interface {
    Walk(walker WalkFun) error
}

And then:

// requires "errors" from the standard library
const (
    TokenInt = 1 << iota
    TokenFloat
    TokenString
    TokenObjectKey
    TokenArrayIndex
)

var (
    SkipObject = errors.New("skip object")
    SkipValue  = errors.New("skip value")
)

type WalkFun func(err error, path []interface{}, tokenType int, token interface{}) error

An example of decoding the following JSON:

{
  "a": 1,
  "b": [ "c", "d" ]
}

Would result in the walk function being called this way:

WalkFun(nil, []interface{}{}, TokenObjectKey, "a")
WalkFun(nil, []interface{}{"a"}, TokenInt, 1)
WalkFun(nil, []interface{}{}, TokenObjectKey, "b")
WalkFun(nil, []interface{}{"b"}, TokenArrayIndex, 0)
WalkFun(nil, []interface{}{"b", 0}, TokenString, "c")
WalkFun(nil, []interface{}{"b"}, TokenArrayIndex, 1)
WalkFun(nil, []interface{}{"b", 1}, TokenString, "d")

This would be a very low level block that will be used to implement higher level functions, like path walking within an object, or decoding to application specific formats.
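A toy in-memory implementation of that walk, assuming the token constants above, could look like this. It is only a sketch over Go maps and slices (a real one would stream from the CBOR/JSON decoder), and `walkValue` is a hypothetical helper name.

```go
package main

import "fmt"

const (
	TokenInt = 1 << iota
	TokenFloat
	TokenString
	TokenObjectKey
	TokenArrayIndex
)

type WalkFun func(err error, path []interface{}, tokenType int, token interface{}) error

// walkValue emits the token sequence described above for maps, slices,
// ints and strings held in memory.
func walkValue(path []interface{}, v interface{}, fn WalkFun) error {
	// copyPath avoids aliasing the shared backing array between siblings.
	copyPath := func(elem interface{}) []interface{} {
		return append(append([]interface{}{}, path...), elem)
	}
	switch val := v.(type) {
	case map[string]interface{}:
		for k, child := range val {
			if err := fn(nil, path, TokenObjectKey, k); err != nil {
				return err
			}
			if err := walkValue(copyPath(k), child, fn); err != nil {
				return err
			}
		}
	case []interface{}:
		for i, child := range val {
			if err := fn(nil, path, TokenArrayIndex, i); err != nil {
				return err
			}
			if err := walkValue(copyPath(i), child, fn); err != nil {
				return err
			}
		}
	case int:
		return fn(nil, path, TokenInt, val)
	case string:
		return fn(nil, path, TokenString, val)
	}
	return nil
}

func main() {
	doc := map[string]interface{}{"a": 1}
	walkValue(nil, doc, func(err error, path []interface{}, tt int, tok interface{}) error {
		fmt.Println(path, tt, tok)
		return nil
	})
}
```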

To provide compatibility with ipfs/merkledag, we could add to the interface a few functions:

type NodeMdagCompatible interface {
    Node
    Links() []Link
    Data() []byte
}

The Links and Data functions would traverse the object using the walk function to provide the list of links and the data bytes required by ipfs/merkledag users.

Add deprecated notice

This module is out of date, and as far as I can tell, not likely to be updated soon. Should we add a deprecated notice?

Make current protobuf mapping

We need to make a mapping from the current protobuf-based ipfs object to the new ipld object, i.e. make an ipld document out of it. Important note: when marshaling it, it should continue to marshal to protobuf, so that links are preserved, objects don't change, and so on. And it should skip the multicodec header when hashing protobuf, so the links continue to work.
