nds's Introduction

nds

Package github.com/qedus/nds is a datastore API for the Google App Engine (GAE) Go Runtime Environment that uses memcache to cache all datastore requests. It is compatible with both Classic and Managed VM products. This package guarantees strong cache consistency when using nds.Get* and nds.Put*, meaning you will never get data from a stale cache.

The exposed parts of this API are the same as the official one distributed by Google (google.golang.org/appengine/datastore). Underneath, however, github.com/qedus/nds uses a caching strategy similar to the GAE Python NDB API. In fact, the caching strategy used here even fixes one or two of the Python NDB caching consistency bugs.

You can find the API documentation at http://godoc.org/github.com/qedus/nds.

One other benefit is that the standard datastore.GetMulti, datastore.PutMulti and datastore.DeleteMulti functions only allow you to work with a maximum of 1000, 500 and 500 entities per call respectively. The nds.GetMulti, nds.PutMulti and nds.DeleteMulti functions in this package allow you to work with as many entities as you need (within timeout limits) by concurrently calling the appropriate datastore function until your request is fulfilled.
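The batching idea described above can be sketched as follows. This is a minimal illustration, not the package's actual implementation: fetchBatch is a hypothetical stand-in for a size-limited backend call such as datastore.GetMulti, the integer keys are placeholders, and the limit of 1000 is the GetMulti figure quoted above.

```go
package main

import (
	"fmt"
	"sync"
)

// maxPerCall mirrors the per-call limit quoted above for datastore.GetMulti.
const maxPerCall = 1000

// fetchBatch is a hypothetical stand-in for a size-limited backend call.
// It rejects oversized batches and otherwise "loads" each key by doubling it.
func fetchBatch(keys []int, dst []int) error {
	if len(keys) > maxPerCall {
		return fmt.Errorf("batch of %d exceeds limit %d", len(keys), maxPerCall)
	}
	for i, k := range keys {
		dst[i] = k * 2
	}
	return nil
}

// getMulti splits keys into batches of at most maxPerCall and fetches the
// batches concurrently, writing results into the matching positions of dst.
func getMulti(keys []int) ([]int, error) {
	dst := make([]int, len(keys))
	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		firstErr error
	)
	for lo := 0; lo < len(keys); lo += maxPerCall {
		hi := lo + maxPerCall
		if hi > len(keys) {
			hi = len(keys)
		}
		wg.Add(1)
		go func(lo, hi int) {
			defer wg.Done()
			// Each goroutine writes to a disjoint range of dst.
			if err := fetchBatch(keys[lo:hi], dst[lo:hi]); err != nil {
				mu.Lock()
				if firstErr == nil {
					firstErr = err
				}
				mu.Unlock()
			}
		}(lo, hi)
	}
	wg.Wait()
	return dst, firstErr
}

func main() {
	keys := make([]int, 2500)
	for i := range keys {
		keys[i] = i
	}
	vals, err := getMulti(keys)
	fmt.Println(len(vals), vals[2499], err)
}
```

With 2500 keys the sketch issues three concurrent batches (1000, 1000 and 500 keys) and reassembles the results in the original order.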

How To Use

You can use this package in exactly the same way you would use google.golang.org/appengine/datastore. However, it is important that you use nds.Get*, nds.Put*, nds.Delete* and nds.RunInTransaction exclusively throughout your code. Do not mix these functions with their google.golang.org/appengine/datastore equivalents, or you risk getting stale datastore entities from github.com/qedus/nds.

Ultimately all you need to do is find/replace the following in your codebase:

  • datastore.Get -> nds.Get
  • datastore.Put -> nds.Put
  • datastore.Delete -> nds.Delete
  • datastore.RunInTransaction -> nds.RunInTransaction

Versions

Versions are specified using Go Modules.

  • Version 1.x.x: Can be found on branch master and uses the google.golang.org/appengine package.
  • Version 2.x.x: Can be found on branch v2. This is a major update and takes advantage of the new cloud.google.com/go/datastore package provided by Google. It is currently in an experimental state and we welcome contributions.

nds's People

Contributors

dependabot-preview[bot], dependabot[bot], derekperkins, jongillham, shkw, someone1, stephanos

nds's Issues

Tag v2 branch as v2.0.0

Trying to get this to work on a go1.10 project and I think the release needs to be tagged to work.

No cache hits

I'm having issues where nothing ever seems to be a cache hit, despite repeated Get calls for the same database key. Below is my appstats profile for 4 consecutive calls, the last 3 of which should be cache hits.
[image: appstats profile]

Is there some way for me to check or turn on debugging flags to show me why the cache is being invalidated?

This is running on the devserver 1.9.12.

script to install current Go SDK

I recently had the problem that each time the Go SDK version changed, my Travis build script stopped working until I manually upgraded it to the latest version. I have since written a small script that automatically pulls the latest, correct Go SDK. If you accept that it might break at any time (since the URL is not officially documented), it might be of interest to you: https://gist.github.com/stephanos/d48fd3500614bd83e63e.

Cheers! :)

memcache: server error

We just had this show up in our logs from nds: "memcache: server error". Any idea what might be causing that?

Migrating data when the underlying struct changes

We now have to put on the big boy pants and start migrating some data that is already present in datastore. The struct has since changed: a few columns were added (which is not a problem) and some columns were removed (which is a problem).

When trying to load such an entity from datastore (via nds; to keep the cache consistent) we will end up with an error.

Any recommendations on how to go about this?

customize memcache expiration

I'm about to integrate my hrd library with nds and noticed that it is currently not possible to specify a custom memcache expiration time.

Is this something you think might be worthwhile?

Memcache Calls Doubled

I'm running https://github.com/mjibson/appstats to track my RPC calls. When I use datastore directly, it functions as expected and there is one call to the datastore.

As soon as I make the exact same nds call, it makes 4 calls to memcache instead of the two that I'd expect.

Do you know what could be causing that problem?

Support full datastore API

If the full datastore API were supported, nds would be a drop-in replacement. Is there any problem with the missing funcs being wrappers around the datastore equivalent?

Support Managed VMs

Since App Engine will probably move more and more towards Managed VMs (currently beta) it might be a good idea to support both platforms, vanilla App Engine and Managed VMs.

Note: The problem is that the import path has changed, "appengine" becomes "google.golang.org/appengine", see related forum discussion.

Add In-Memory Cache

I was thinking back to David Symonds's talk on high performance apps, so I revisited it, and this slide jumped out at me.
http://talks.golang.org/2013/highperf.slide#19

He shows memcache being 20x faster than datastore, and RAM 1000x faster than memcache. That got me thinking that it might be fantastic to add an in-memory cache option to nds. I'm thinking that you would initialize nds in the Init and provide a fixed memory size. Maybe I'm willing to allocate 5% of the total instance memory to this extra cache layer. Then any datastore lookup would first check RAM, then memcache and then finally datastore. If nds isn't ever initialized with a RAM cache, there would be no difference in how it functions.

I found this package earlier today that looks like it should handle all the heavy lifting of promoting and expiring cache items. Since it's from CloudFlare, I trust that it is highly performant and well coded, especially looking at their benchmarks.
http://godoc.org/github.com/cloudflare/golibs/lrucache

This obviously isn't a must-have; nds has already made huge strides for us. I wanted to toss the idea out for future development. What are your thoughts?
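The lookup order proposed above (RAM, then memcache, then datastore) could be sketched like this. It is a minimal illustration under stated assumptions: layer and tieredGet are hypothetical names, each tier is modelled as a plain map, and the sketch deliberately ignores memory limits, eviction and cache consistency, all of which a real implementation would need to address.

```go
package main

import (
	"errors"
	"fmt"
)

var errMiss = errors.New("not found")

// layer is a hypothetical key/value tier (RAM, memcache or datastore).
type layer struct {
	name string
	data map[string]string
	hits int
}

// get returns the value for key and counts the hit, or errMiss.
func (l *layer) get(key string) (string, error) {
	v, ok := l.data[key]
	if !ok {
		return "", errMiss
	}
	l.hits++
	return v, nil
}

// tieredGet checks each layer from fastest to slowest and copies a hit
// back into every faster layer that missed.
func tieredGet(layers []*layer, key string) (string, error) {
	for i, l := range layers {
		v, err := l.get(key)
		if err != nil {
			continue
		}
		for _, faster := range layers[:i] {
			faster.data[key] = v
		}
		return v, nil
	}
	return "", errMiss
}

func main() {
	ram := &layer{name: "ram", data: map[string]string{}}
	mc := &layer{name: "memcache", data: map[string]string{}}
	ds := &layer{name: "datastore", data: map[string]string{"k": "v"}}
	layers := []*layer{ram, mc, ds}

	tieredGet(layers, "k") // served by the datastore, then cached above it
	tieredGet(layers, "k") // served from RAM
	for _, l := range layers {
		fmt.Println(l.name, l.hits)
	}
}
```

If the RAM layer is simply left out of the slice, lookups behave as before, which matches the "no difference if never initialized" goal in the proposal.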

Tests fail when using nds

I have a reasonably large App Engine Classic app using nds. In production, it works well without any issues. However, when running my tests via goapp test ./... I get a datastore failure with an error of datastore: invalid key.

However, if I run the failing test directly via goapp test -run=FailingTestName ./..., the test that failed in the broad run passes.

If I modify the number of retries that my app will perform on a datastore fail to a higher number (say 6), the broad test goapp test ./... passes.

If I substitute datastore. for all my nds. function calls, the tests pass.

I am using the latest version of nds, with App Engine 1.9.40 on a 2014 Macbook Pro.

What further information can I give to help solve this problem?

ErrFieldMismatch in GetMulti returns only one result

When using GetMulti, if there is a difference between datastore and the struct used to load the data into, datastore returns the data, with an error type ErrFieldMismatch - cannot load field "X" into a "Y": no such struct field.
The result from nds for N structs that trigger this error, is only a single populated struct, and N-1 empty structs (no fields populated), with N errors.
It should be all the structs with their values, and N errors.

I believe that this can be fixed by changing https://github.com/qedus/nds/blob/master/get.go#L359 to cacheItems[index].err = err
I can provide a PR if needed.
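The expected behaviour (all N structs populated, plus N errors) can be illustrated with a small stand-alone sketch. Everything here is hypothetical and independent of the nds source: fieldMismatchError only imitates datastore.ErrFieldMismatch, and load and loadMulti are illustrative helpers. The point is that each error is recorded at the index of the entity that caused it, while loading continues for the remaining entities.

```go
package main

import "fmt"

// fieldMismatchError is a hypothetical analogue of datastore.ErrFieldMismatch.
type fieldMismatchError struct{ field string }

func (e *fieldMismatchError) Error() string {
	return fmt.Sprintf("cannot load field %q: no such struct field", e.field)
}

// user is the destination struct; it has no field for the "legacy" property.
type user struct{ Name string }

// load copies known properties into dst and reports an unknown one, while
// still populating every field that does match.
func load(props map[string]string, dst *user) error {
	var err error
	for name, value := range props {
		switch name {
		case "Name":
			dst.Name = value
		default:
			err = &fieldMismatchError{field: name}
		}
	}
	return err
}

// loadMulti loads every entity and records each error at the index of the
// entity that caused it, so one mismatch never blanks out the other results.
func loadMulti(entities []map[string]string, dst []user) []error {
	errs := make([]error, len(entities))
	for i, props := range entities {
		errs[i] = load(props, &dst[i])
	}
	return errs
}

func main() {
	entities := []map[string]string{
		{"Name": "ann", "legacy": "x"},
		{"Name": "bob", "legacy": "y"},
	}
	dst := make([]user, len(entities))
	errs := loadMulti(entities, dst)
	for i := range dst {
		fmt.Println(dst[i].Name, errs[i])
	}
}
```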

Possible stale cache when using RunInTransaction with specific order of memcache failure.

This issue revolves around RunInTransaction which locks all memcache keys that use Put and Delete within the transaction prior to committing to the datastore.

The following scenario would produce stale results when calling Get.

  • RunInTransaction locks memcache successfully.
  • Memcache fails/locks are evicted.
  • Memcache recovers.
  • Another client calls Get on the entities that were supposed to be locked and repopulates memcache.
  • RunInTransaction commits to the datastore and memcache is left with stale entity values.

I currently don't see any way of completely eliminating this consistency issue. Note that it does not affect Get, Put or Delete when used outside of RunInTransaction.

Note that this issue does not affect consistency guarantees when just using nds.Get* and nds.Put* functions.
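The five steps above can be replayed in a small deterministic simulation. This is only a model under stated assumptions: the cache and datastore are plain maps, lockMarker is a hypothetical sentinel for a memcache lock item, and get assumes a reader that skips repopulating the cache while a lock is present. It does not reproduce the package's real code paths or what happens to the lock after the commit.

```go
package main

import "fmt"

// lockMarker is a hypothetical sentinel standing in for a memcache lock item.
const lockMarker = "<lock>"

type sim struct {
	cache map[string]string
	store map[string]string
}

// get models a cache-aside read: serve from the cache when possible; on a
// miss, read the datastore and repopulate the cache. A lock marker blocks
// repopulation.
func (s *sim) get(key string) string {
	v, ok := s.cache[key]
	if ok && v != lockMarker {
		return v
	}
	fromStore := s.store[key]
	if !ok { // no lock present, so the reader is free to fill the cache
		s.cache[key] = fromStore
	}
	return fromStore
}

// run replays the scenario; evictLock controls whether the lock is lost.
func run(evictLock bool) (cached, stored string) {
	s := &sim{
		cache: map[string]string{},
		store: map[string]string{"k": "old"},
	}
	s.cache["k"] = lockMarker // 1. RunInTransaction locks memcache
	if evictLock {
		delete(s.cache, "k") // 2-3. the lock is evicted and memcache recovers
	}
	s.get("k")           // 4. another client calls Get
	s.store["k"] = "new" // 5. the transaction commits to the datastore
	return s.cache["k"], s.store["k"]
}

func main() {
	cached, stored := run(false)
	fmt.Println(cached, stored) // the lock survived, nothing stale was cached
	cached, stored = run(true)
	fmt.Println(cached, stored) // the cache holds "old" while the store holds "new"
}
```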

Gracefully handle memcache quota limits

We started randomly seeing memcache quota errors that ended up failing our Put operations. We're paying customers, so I'm not sure why those started showing, but it seems reasonable to me to have nds bypass memcache and go directly to datastore on this error. What do you think?

nds item not stored warning

In production on a low load service, I'm getting the following error about once a day:

nds:lockMemcache AddMulti memcache: item not stored

My understanding is that it's simply a case of saving to memcache failing, in which case it only means that the next Get would have to get the contents from the datastore instead of memcache.

If this is indeed the case, please close this issue, but on the off chance that my understanding is incorrect, I thought it would be useful to have a record.

Remove all dependence on datastore.PropertyList for serialisation.

datastore.PropertyList is used to serialize into local memory and memcache. This is probably less efficient than using GOB with memcache and just keeping the interface within local memory.
Note that the memcache namespace prefix will have to be changed if this is fixed.
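As a rough idea of what GOB-based serialisation for memcache could look like, here is a minimal round trip using the standard library's encoding/gob. The property type is a hypothetical, simplified stand-in for datastore.Property: the real type holds its value as an interface, so concrete value types would additionally need to be registered with gob.Register.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// property is a hypothetical, simplified stand-in for datastore.Property.
type property struct {
	Name    string
	Value   string
	NoIndex bool
}

// encode serialises props into a byte slice suitable for a cache value.
func encode(props []property) ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(props); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decode restores the properties from a cached byte slice.
func decode(b []byte) ([]property, error) {
	var props []property
	err := gob.NewDecoder(bytes.NewReader(b)).Decode(&props)
	return props, err
}

func main() {
	in := []property{
		{Name: "Title", Value: "hello"},
		{Name: "Body", Value: "world", NoIndex: true},
	}
	b, err := encode(in)
	if err != nil {
		panic(err)
	}
	out, err := decode(b)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(out), out[1].Name, out[1].NoIndex)
}
```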

Query Support

Do you have any intention of supporting Queries via GetAll or Run? I can see how that would be much more challenging than Get/Put/Delete.

[v2] Possible marshal/unmarshaling bug

I've been getting warnings from the logger I placed in the onError function:

nds:loadCache setValue: datastore: cannot load field "rg" into a "models.UserAccount": multiple-valued property requires a slice field type

I think there's an opportunity to update the tests to ensure that any logs made are expected, otherwise to fail. This may help uncover situations in which we think we're pulling from the cache but are really failing at a later step (e.g. setValue) and falling back to the datastore.

Calling AddMulti for a single entity instead of Add

I'm occasionally getting a warning of: nds:lockMemcache AddMulti memcache: item not stored

While I'm fine with the occasional Memcache error being thrown, this is happening when saving a single entity. I'm not sure why nds is calling AddMulti for a single entity instead of Add.

Panic with datastore.PropertyLoadSaver

I get a panic with the stack trace shown below with "nds". It seems that "x *Record" is nil in "Load".

====================================
// Loads a record from the datastore
func (x *Record) Load(ps []datastore.Property) error {
    x.props = make([]datastore.Property, 0, len(ps)) // <== line storage.go:165
    x.Created = time.Now()
    x.Modified = x.Created
    ....

=================================
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xb code=0x1 addr=0x0 pc=0x47b88a]

goroutine 31 [running]:
panic(0x8ff760, 0xc82000a070)
/home/tim/Dokumente/go_appengine/goroot/src/runtime/panic.go:481 +0x3e6
main66787.(*Record).Load(0x0, 0xc820285c70, 0x5, 0x5, 0x0, 0x0)
storage.go:165 +0x75a
github.com/qedus/nds.setValue(0x944b20, 0xc8201e23a0, 0x196, 0xc820285c70, 0x5, 0x5, 0x0, 0x0)
/home/tim/Dokumente/gopath/src/github.com/qedus/nds/nds.go:155 +0x1e4
github.com/qedus/nds.loadDatastore(0x7f81cd1d36e8, 0xc820270ae0, 0xc8201df720, 0x1, 0x1, 0x7f81cd1cf0a8, 0x800d60, 0x0, 0x0)
/home/tim/Dokumente/gopath/src/github.com/qedus/nds/get.go:339 +0x5c7
github.com/qedus/nds.getMulti(0x7f81cd1d36e8, 0xc820270ae0, 0xc8202424d0, 0x1, 0x15, 0x800d60, 0xc820273340, 0x97, 0x0, 0x0)
/home/tim/Dokumente/gopath/src/github.com/qedus/nds/get.go:164 +0x3cf
github.com/qedus/nds.GetMulti.func1(0x7f81cd1d36e8, 0xc820270ae0, 0xc8202768d0, 0x1, 0x1, 0xc8202768e0, 0x0, 0xc8202424d0, 0x1, 0x15, ...)
/home/tim/Dokumente/gopath/src/github.com/qedus/nds/get.go:78 +0x19c
created by github.com/qedus/nds.GetMulti
/home/tim/Dokumente/gopath/src/github.com/qedus/nds/get.go:81 +0x32b

Possible stale cache when using Delete and a memcache lock fails.

nds.Delete currently locks memcache and then deletes its entity from the datastore. If the following situation occurs then a stale entity will be left in memcache:

  • Delete is called and locks the memcache item to be deleted.
  • The memcache lock is evicted from memcache.
  • Get is called and loads the entity into memcache.
  • Delete removes the entity from the datastore.

Therefore the entity will no longer be in the datastore but will be in memcache.

Note that this issue does not affect consistency guarantees when just using nds.Get* and nds.Put* functions.

Different namespaces used for memcache and datastore

If ctx and key have different namespaces in any kind of nds request, nds uses the key's namespace for the datastore request, as intended, but issues the memcache requests without first creating a new context carrying that namespace. The memcache requests therefore run against a different namespace.

New Methods CachePut & CachePutMulti

It's not uncommon that when I use nds.Put, I know that I'm going to be needing it soon after. This is especially true in my data pipeline. Since a Put doesn't add an item to memcache, I was thinking that it would be great to have CachePut / CachePutMulti functions to go ahead and preemptively stick them into memcache.

The alternative of course would be to have Put / PutMulti always insert items into Memcache.

Thoughts?

It is not possible to detect if nested contexts are transactional.

Take the following situation where I create a custom context:

type myContext struct {
    appengine.Context
} 

If I do this:

nds.RunInTransaction(c, func(tc appengine.Context) error {
    mc := &myContext{tc}

    // The following calls to the NDS API will not be executed
    // on the transactional code path.
    nds.Get(mc, key, val)
    nds.Put(mc, key, val)
    nds.Delete(mc, key)
    return nil
}, nil)

In the current API, there is no way for the nds.Get/Put/Delete functions to know that the mc context is a transactional context. Note that if the context is not wrapped then nds.Get/Put/Delete works as expected.

The only nice way I can see around this subtle issue is to change the NDS API as shown in this context branch: https://github.com/qedus/nds/tree/context

I wrap appengine.Contexts a lot to add extra functionality that must be bound to a single context/request. The changes I propose to the API will allow me to continue doing this without fear that I am not in a transaction when I think I am.

There is some more discussion here: https://groups.google.com/forum/#!topic/golang-nuts/xrxjgCGzUfs

Change locking policy for Transaction

This was driving me up the wall: nds treats transactions differently from the datastore package.

Specifically, it keeps an internal lock which is never unlocked after a commit or rollback. Any code that uses the transaction context afterwards therefore deadlocks, which differs from how the datastore package handles things.

Comments around this specify:

// tx.Unlock() is not called as the tx context should never be called
// again so we rather block than allow people to misuse the context.

Should we reconsider this policy or better document the difference in behavior from the datastore package?
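For discussion, one possible alternative to blocking forever is to fail fast once the transaction has finished. The sketch below is hypothetical and not taken from nds or the datastore package: the tx type, its methods and errTxDone are illustrative names only. The mutex is always released, and a done flag turns any later use into an error.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errTxDone is a hypothetical error for use of a finished transaction.
var errTxDone = errors.New("transaction already committed or rolled back")

// tx is an illustrative transaction that guards its state with a mutex.
type tx struct {
	mu      sync.Mutex
	done    bool
	pending []string
}

// put queues a key, or fails fast if the transaction has finished.
func (t *tx) put(key string) error {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.done {
		return errTxDone
	}
	t.pending = append(t.pending, key)
	return nil
}

// commit marks the transaction as finished and releases the lock, so later
// calls return errTxDone instead of blocking.
func (t *tx) commit() error {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.done {
		return errTxDone
	}
	t.done = true
	return nil
}

func main() {
	txn := &tx{}
	fmt.Println(txn.put("a"))
	fmt.Println(txn.commit())
	fmt.Println(txn.put("b")) // misuse is reported instead of deadlocking
}
```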

Support saving of entities to memcache in putMulti()

If I understand correctly, entities are currently not stored to memcache on Put().

Suppose we run in a transaction and need to update some entities. The current execution order is then:

  1. User code calls nds.RunInTransaction()
  2. Get entity by key from datastore (not stored to memcache)
  3. Change entities
  4. User code calls nds.PutMulti() with transactional context
  5. nds.putMulti() adds items to tx.lockMemcacheItems (memcache not changed)
  6. nds.putMulti() calls datastorePutMulti()
  7. nds.putMulti() executes a deferred function that removes tx.lockMemcacheItems from memcache.

First of all, it's not clear why we create lockItem() at line 129, as it does not seem to be used further in the current implementation.

Second, just removing items from memcache results in an unnecessary paid Get() operation on the following request, when the results could potentially be retrieved from the "free" memcache. For example, I have a chat application, and whenever a user writes to the bot and the bot sends a response I do the following:

  1. On request, the Chat entity is retrieved outside of a transaction
  2. On response, the Chat entity is updated within a transaction (if state changed)

Currently the first Get() never benefits from memcache because it is always a miss. This increases response time and costs.

So the suggestion is:

  1. Before putting, nds should try to lock items in memcache.
  2. If the pre-put locks were successful, nds tries to put the items to memcache using CAS.
  3. If locking or putting to memcache failed, try to delete from memcache as is currently done.
  4. Unrelated improvement: if the delete from memcache failed, create a deferred task for removing the items from memcache. That could mitigate temporary memcache unavailability.

I know caching is hard to get right and probably I'm missing something.

edit: fixed urls to permalinks
