Giter Club home page Giter Club logo

Comments (30)

HouzuoGuo avatar HouzuoGuo commented on May 17, 2024

At the moment, two reasons exist for the document to be internally recognized as map[string]interface{}:

  • GetIn function traverses serialized JSON document structure to find values to be indexed/unindexed
  • To make use of query results, documents have their unique ID chucked in and then written to disk.

How about we introduce the following two new APIs, for embedded usage:

  • (col *Col) Insert(object interface{}) (documentID uint64)
  • (col *Col) Read(documentID uint64, object interface{})
  • (col *Col) Update(documentID uint64, newObject interface{})

For Insert and Update, the implementation serializes the user defined structure and manipulate the result string to put in the ID.

For Read, it will simply deserialize the document JSON into the structure.

How does it sound?

from tiedot.

jbenet avatar jbenet commented on May 17, 2024

Hey! the interface methods look good to me. 👍

Well, i think that adding the PK to the object is not a good idea. It dirties up the user's data. For instance, usually one expects data put into a database to be returned exactly the same. More concretely:

checksum1 = sha(val1)
db.put(key, val1)
val2 = db.get(key)
checksum2 = sha(val2)
assert val1 == val2
assert checksum1 == checksum2

Adding the key modifies the document, and breaks modularity. I think either:

  • don't put the pk in the dict. pk should only be a handle (like a file descriptor) to the data.
  • wrap the data in an outer dictionary with the key. Store {'pk': <pk>, 'data': <user's data>}. only return <user's data>.
  • if you really want to return the pk to the users in the dict, then allow users to use the pk field in their structs (with pk defaulting to 0 if not there), and to change the pk name. (e.g. if I want to use id or key because i have some other id scheme, i should be able to).

also, looking at the source, looks like there isn't a good way to address arrays with paths in GetIn. If I'm understanding correctly, GetIn tries all array items. (e.g. foo,bar returns 1 in {"foo": [{"baz":2}, "b", {"bar":1}]} -- am I right?) Perhaps may be better to explicitly address the arrays with indices, like mongo/bson do (foo,2,bar above).

from tiedot.

HouzuoGuo avatar HouzuoGuo commented on May 17, 2024

You are right, GetIn tries all array items to find the interested document value. I am aware of the Mongo feature, but rarely see it being used in real applications.

Regarding the doc ID - right now it is called "_pk". We may change the name to "@pk" for example - that is valid for JSON but invalid to be deserialized into Go structure.

How do you type thumb up? Is it a unicode character?

from tiedot.

jbenet avatar jbenet commented on May 17, 2024

rarely see it being used in real applications.

So, what happens here: events,date from {"events": [{"date":1}, {"date":2}, {"date":3}]} ? There's no way to explicitly address the individual events, so I couldn't ask to index on "the date of the first event".

that is valid for JSON but invalid to be deserialized into Go structure.

Something that will return {"_pk": 1, "other": "stuff"} and {"other": "stuff"} depending on the API that's used? This seems inconsistent to me.

Let me ask this, why do you want to return the _pk in the first place? is there any place where it gets returned that you didn't previously have it already?

:+1: -> 👍 :)

from tiedot.

alexandrestein avatar alexandrestein commented on May 17, 2024

Hey everyone.

Return id is useful when you show the user multiple elements and he select one of those. (for update or open)
The simplest way to do that is to hide the element id and when user click, the app get the related id.

The "_id" (as MongoDB) or "_key" sounds good to me.

I would prefer to have the id wrap in the object by default if the struct define a named field like "_id" for example. As "mgo" (GoLang MongoDB driver) works.

What about that?

from tiedot.

HouzuoGuo avatar HouzuoGuo commented on May 17, 2024

OK, now index on particular array element sounds useful.

What about changing the ID attribute name from "_pk" into "@id"? Because Go structure cannot have member "@id" attribute, therefore the JSON deserializer will happily skip that one.

Quite a few NoSQL engines put document ID in the document itself, like Alexandre said. In the case of tiedot, it helps a lot with query performance.

from tiedot.

HouzuoGuo avatar HouzuoGuo commented on May 17, 2024

If you don't mind, I shall change the ID attribute name from "_pk" to "@id" straight away.

from tiedot.

HouzuoGuo avatar HouzuoGuo commented on May 17, 2024

Completed @id change: 3cc156b

from tiedot.

HouzuoGuo avatar HouzuoGuo commented on May 17, 2024

Sorry about the delay.
The three new functions will make it into nextgen branch.

Also in nextgen branch, "@id" will no longer be physically stored in document itself, but rather only maintained in lookup table.

from tiedot.

HouzuoGuo avatar HouzuoGuo commented on May 17, 2024

these APIs will make their way into version 3.1 =)

from tiedot.

stunti avatar stunti commented on May 17, 2024

Hi.
I am new to tiedot but I can't find these functions with that signature. Are they implemented under a different name?
Thanks

from tiedot.

HouzuoGuo avatar HouzuoGuo commented on May 17, 2024

I am terribly sorry, very unfortunately the features did not make their way into the 3.1. For now map[string]interface{} is still the only supported document structure.

Here are some usage examples:
https://github.com/HouzuoGuo/tiedot/blob/master/example.go

Note that: document ID is no longer an attribute in the document content itself.

from tiedot.

geoah avatar geoah commented on May 17, 2024

Seems that these have been introduced in 3.2, is this stable to use or not yet ready?

from tiedot.

HouzuoGuo avatar HouzuoGuo commented on May 17, 2024

@geoah This is a trivial feature and I forgot about it during 3.2, sorry! Actually the interface between 3.0 and 3.2 have not changed.
It should be really simple to write, let me see if I can get it done in the next several days.

from tiedot.

kylewolfe avatar kylewolfe commented on May 17, 2024

Hi Houzuo,

I'm interested in using tiedot a little more and have a few questions on this subject. In reference to :

At the moment, two reasons exist for the document to be internally recognized as map[string]interface{}:

GetIn function traverses serialized JSON document structure to find values to be indexed/unindexed
To make use of query results, documents have their unique ID chucked in and then written to disk.

The method Col.ForEachDoc (https://godoc.org/github.com/HouzuoGuo/tiedot/db#Col.ForEachDoc) returns a []byte and in turn basically a json document that can quickly be unmarshaled using json.Unmarshal. I've found that using json to unmarshal is far better than something that does a map[string]interface{} -> struct conversion.

My questions: 1) how far along are you with any refactor to the API? 2) is the design final? 3) if not, and this []byte json document is indeed how the data comes off the wire. may I suggest creating a new data type of []byte that has a method to convert return a map[string]interface{}?

I think an API that allows both map[string]interface{} AND a JSON compatible struct in / out would be amazing.

A pain point I currently have is that for me to read from tiedot in to a struct I have to do something like this:

// convert each row back in to a Session type
    for id := range queryResult {
        doc, err := col.Read(id)
        if err != nil {
            Logger.Fatalln(err)
        }
        doc["Id"] = id

        ns := &Session{}
        err = mapToStruct(doc, ns)
        if err != nil {
            Logger.Fatalln(err)
        }
        s = append(s, *ns)
    }

In drivers such as mgo, there is a mechanism for unmarshaling the Id directly in the the given struct (here I had to inject it into my map before unmarshaling). You'll also notice that s and queryResult had to be allocated ahead of time. If a type is returned that has the ability to convert its self to both map[string]interface as well as []byte, I can skip that and go right in to unmarshaling using my preferred package.

from tiedot.

HouzuoGuo avatar HouzuoGuo commented on May 17, 2024

Thank you very much for the suggestions @kylewolfe and sorry I have not been active with the project development.

For your use case I'd like to propose four new functions:

  • An Insert function (doc interface{})(id int, err error)
  • An Update function (id int, doc interface{}) (err error)
  • A Read function (id int, destination interface{}) (err error)
  • Another Read function (id int, idAttrName string, destination interface{}) (err error)

I hope they'll cover your use cases. What do you suggest to be their names?

If they look good to you I should be able to implement them within a day or two.

from tiedot.

kylewolfe avatar kylewolfe commented on May 17, 2024

Thanks for the quick reply!

Your proposed methods would use json to (un)marshal?
What would idAttrName be used for?

Is the idea of a single return type that can do multiple things out of the realm of possibility? Or maybe just too much of a breaking change? Ideally an interface such as this would be better, so that Id is handled by (un)marshaling:

type Foo struct {
    Id int
    Foo, Bar string
}

f := &Foo{Foo: "foo", Bar: "bar"}
if err := col.Insert(f); err != nil {
    panic(err)
}
fmt.Println(f.Id) // has new id

f.Foo = "baz"
if err = col.update(f); err != nil {
    panic(err)
}

mgo behaves this way so that I do not have to take an extra step to populate Id, if it exists. What are your thoughts?

I'd be happy to start digging in and trying to make some updates to allow for this. I have a high need for embedded db usage, so it would probably be beneficial for me to poke around and understand a little more about tiedot at a lower level.

from tiedot.

kylewolfe avatar kylewolfe commented on May 17, 2024

Going back to the new return type. If []byte indeed does come off the wire (as I see it in ForEachDoc) and you also need map[string]interface for indexing purposes, etc. Then a type something like this could be used across the whole application:

type Row []byte {}
func (r Row) ToMap() map[string]interface{} // allows you to do what you need for indexing

Then during ForEachDoc, both json unmarshaling and any existing functionality around map[string]interface{} can be used.

from tiedot.

HouzuoGuo avatar HouzuoGuo commented on May 17, 2024

All right, I misunderstood your original use case regarding mgo. It will be a little difficult to support mgo style API, but I'll try my best to make the existing APIs more pleasant.

json.Marshal/Unmarshal uses JSON byte array instead of string, which happens to be compatible with the representation of JSON documents in the storage. However the higher level API in Col such as Col.Read() returns map[string]interface{} which is inconvenient.

So starting from Col.Read I propose two more functions (names are not final)

Read2(id int, destination interface{}) reads document at specified ID and deserialise it to destination, structure of your choice.

Read3(id int, idAttrName string, destination interface{}) is similar to above, but it also artificially introduces (or modifies) an ID attribute denoted by the idAttrName, so that deserialised structure will have the document ID as well.

And finally, to make Insert and Update easier to use, they will also take interface{} instead of map[string]interface{} as the document input. These two functions will convert the input into index-capable map[string]interface{} automatically, for maintaining indexes.

So in the end the new APIs behave similar to mgo API in terms of supporting any structure as document and treatment of document ID in the deserialised structure.

How does that sound?

from tiedot.

kylewolfe avatar kylewolfe commented on May 17, 2024

I think I may have a crack at this tonight. Rather than what you describe in Read3, how about utilizing a structs tag feature within reflect?

type Foo struct {
    Id int `tiedot: id`
    Foo, Bar string
}

I'm assuming your looking to keep API compatibility for awhile longer (before v4 if at all?)

from tiedot.

HouzuoGuo avatar HouzuoGuo commented on May 17, 2024

it will definitely be ideal if tiedot can make use of the tag, although I do not know how. Please feel free to experiment with the feature and I look forward to hear from your findings.

from tiedot.

kylewolfe avatar kylewolfe commented on May 17, 2024

Here's a little show and tell of what I have so far (not ready for a PR): master...kylewolfe:api_break

I titled the branch api_break, thinking that I would write up what an ideal final API would look like, regardless of breaking changes, but then I realized you could actually switch to a format in which mgo uses (doc interface{}) and still keep backward compatibility.

Insert now supports map[string]interface, the new type of Document, []byte and any pointer to a struct. If a pointer to a struct is given, it will search for a tiedot tag of Id, and update that struct directly, allowing for my earlier example.

Once tested (and other functions converted), this should allow for:

type Foo struct {
    Id int `tiedot: id`
    Foo string
}

f := &Foo{Foo: "bar"}
if _, err := col.Insert(f); err != nil {
    panic(err)
}
// f.Id now has the new id

f.Foo = "baz"
if _, err := col.Update(f.Id,  f); err != nil { // Ideally, the first argument wouldn’t be needed, but oh well
    panic(err)
}

What are your thoughts on this direction?

from tiedot.

HouzuoGuo avatar HouzuoGuo commented on May 17, 2024

That looks even better than I previously thought! well done!
I added you into collaborator list so feel free to push changes.

from tiedot.

kylewolfe avatar kylewolfe commented on May 17, 2024

Excellent! I'll move my work in to a branch on the official repo and continue to play. Do you have a standard or thought on breaking changes? I realized that some more thought needs to go in to a few other methods (ForEachDoc, for example, I believe would have to receive a breaking change in order to support struct unmarshaling)

from tiedot.

HouzuoGuo avatar HouzuoGuo commented on May 17, 2024

Well I think people are quite familiar with the idea that softwares evolve and breaking changes will occur . As long as the actual feature isn't taken away I wouldn't mind changing some function signatures.

from tiedot.

arafath-mk avatar arafath-mk commented on May 17, 2024

Hi.. What is the status of this issue..?

from tiedot.

HouzuoGuo avatar HouzuoGuo commented on May 17, 2024

@arafath-mk sorry, little progress has been made regarding this issue, although it should not be a difficult matter.
@kylewolfe hello! do you by any chance have a moment to take a look at this?

from tiedot.

kylewolfe avatar kylewolfe commented on May 17, 2024

Unfortunately, I opted to use a different solution for my project. As a result I did not get around to finishing my work on this issue. My partial work is still available though (master...kylewolfe:api_break) if someone wants to pick up the baton.

from tiedot.

phpfs avatar phpfs commented on May 17, 2024

Hey there - any news on adding a function to directly insert a struct? :)

from tiedot.

HouzuoGuo avatar HouzuoGuo commented on May 17, 2024

thanks for your interest in this project @phpfs , this project started as a short programming exercise, i'm very sorry that i didn't have a chance to take it higher :|

from tiedot.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.