
Comments (4)

karlseguin commented on June 24, 2024

I haven't touched this (or even Go) in a long time. But everything you say sounds right.

Let me verbosely go through my thoughts:

1 - It sounds like you want a Fetch method that does tracking and is atomic. Looking at the existing Fetch, I'm not sure why that wasn't pushed down to the bucket (and thus made atomic). It seems like if we pushed the existing Fetch down to the bucket, you could add a TrackingFetch that behaves a lot like TrackingGet, i.e. TrackingFetch would call Fetch and then track() the item.
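As a rough sketch (hypothetical, not existing code: it assumes the existing Fetch body has been moved into the bucket as a fetch method, so the miss check and the store happen under the bucket's lock):

// Hypothetical sketch: bucket.fetch is assumed to hold the bucket's lock
// across the miss check, the call to fetch, and the store.
func (c *Cache) TrackingFetch(key string, duration time.Duration, fetch func() (interface{}, error)) (TrackedItem, error) {
  item, err := c.bucket(key).fetch(key, duration, fetch)
  if err != nil {
    return NilTracked, err
  }
  // same as TrackingGet: pin the item so the gc won't reap it while the
  // caller holds a reference
  item.track()
  return item, nil
}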

2 - There appears to be a race condition, yes. I racked my brain on this for a few minutes and didn't see a way out. But what if, on gc, we set the refCount to -1? We can use a CompareAndSwap to favor the get:

// If `track` is called BEFORE this, CompareAndSwap will return false because refCount
// will be 1, not 0.
// If `track` is called AFTER this, refCount will be set to -1, which we can leverage (see next block).
if !c.tracking || atomic.CompareAndSwapInt32(&item.refCount, 0, -1) {
  // this is all the same
  c.bucket(item.key).delete(item.key)
  c.size -= item.size
  c.list.Remove(element)
  item.promotions = -2
}

Now we just need to guard against -1:

// track will return false if, after adding 1, the count is 0 (which means our GC
// ran and set it to -1)
func (i *Item) track() bool {
  return atomic.AddInt32(&i.refCount, 1) != 0
}

...

func (c *Cache) TrackingGet(key string) TrackedItem {
  ...
  // if tracking fails, the item was just gc'd; return NilTracked
  if !item.track() {
    return NilTracked
  }
  return item
}

???


alecbz commented on June 24, 2024

> That said I can see it's a pretty specific request. I'm happy to make a PR.

FWIW, I don't think the general problem (only-once initialization) is too specific at all; we have the same use case, as I imagine a lot of other people do.


eli-darkly commented on June 24, 2024

This is one of our requirements, and I agree with @alecbz that it's a pretty common one. Generally speaking it's just coalescing of requests, which is something you'd definitely want to do if you have many goroutines contending for the same expensive resource through the cache.

We're currently implementing it using singleflight. Basically, instead of this—

var myCacheStuff struct {
	cache *ccache.Cache
	ttl   time.Duration
}

cacheEntry, err := myCacheStuff.cache.Fetch(
	key,
	myCacheStuff.ttl,
	func() (interface{}, error) {
		return doExpensiveQuery(key)
	},
)

—we do this:

var myCacheStuff struct {
	cache            *ccache.Cache
	ttl              time.Duration
	requestCoalescer singleflight.Group
}

cacheEntry, err := myCacheStuff.cache.Fetch(
	key,
	myCacheStuff.ttl,
	func() (interface{}, error) {
		value, err, _ := myCacheStuff.requestCoalescer.Do(key, func() (interface{}, error) {
			return doExpensiveQuery(key)
		})
		return value, err
	},
)

This seems to work OK, but it is probably less efficient than it would be if this logic were built into the cache: singleflight.Group has to maintain its own concurrent map to keep track of the active requests, which (1) is redundant, since the cache already has an index, and (2) is one big map rather than a set of buckets, so contention is probably higher.
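One way to approximate the bucketing from outside the cache is to shard the singleflight groups by key hash. This is purely a sketch; the shard count and the FNV hash are arbitrary assumptions, not anything either library provides:

import (
	"hash/fnv"

	"golang.org/x/sync/singleflight"
)

// Sketch only: shard the coalescer the same way the cache shards its buckets,
// so concurrent misses on different keys contend on different maps.
type shardedCoalescer struct {
	shards [16]singleflight.Group // shard count is an arbitrary choice
}

func (s *shardedCoalescer) Do(key string, fn func() (interface{}, error)) (interface{}, error, bool) {
	h := fnv.New32a()
	h.Write([]byte(key)) // FNV-1a is just an example hash
	return s.shards[h.Sum32()%uint32(len(s.shards))].Do(key, fn)
}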


eli-darkly commented on June 24, 2024

I should point out that my solution above isn't atomic: it has a race condition. If two goroutines see the initial cache miss at the same time, goroutine 1 calls requestCoalescer.Do first, and goroutine 2 is for whatever reason slow enough that it doesn't reach requestCoalescer.Do until the first call to Do has already completed, then goroutine 2 will run the query again, since singleflight only guards against the same thing being done twice at the same time, not against it being done an extra, unnecessary time in general. That may not be very significant, since the whole premise here is that the query is slow (otherwise we wouldn't care about a redundant one), but it is an inherent limitation of doing the coalescing outside of the cache.
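One way to shrink that window (again just a sketch, not something singleflight itself provides) is to re-check the cache inside the coalesced function, so a goroutine that reaches Do after an earlier call has completed picks up the value that Fetch already stored:

value, err, _ := myCacheStuff.requestCoalescer.Do(key, func() (interface{}, error) {
	// Re-check before querying: another goroutine's Fetch may have stored
	// the value between our cache miss and this point. Get and Expired are
	// part of the ccache API; note that returning the cached value causes
	// the enclosing Fetch to Set it again, which refreshes its TTL.
	if item := myCacheStuff.cache.Get(key); item != nil && !item.Expired() {
		return item.Value(), nil
	}
	return doExpensiveQuery(key)
})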

