
fastcache's Introduction


fastcache - fast thread-safe in-memory cache for a big number of entries in Go

Features

  • Fast. Performance scales on multi-core CPUs. See benchmark results below.
  • Thread-safe. Concurrent goroutines may read and write into a single cache instance.
  • Fastcache is designed for storing a big number of entries without GC overhead.
  • Fastcache automatically evicts old entries when reaching the maximum cache size set on its creation.
  • Simple API (see the usage sketch below).
  • Simple source code.
  • Cache may be saved to file and loaded from file.
  • Works on Google AppEngine.
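
A minimal usage sketch (New, Set, Get and Del are the calls referenced throughout this page):

package main

import (
	"fmt"

	"github.com/VictoriaMetrics/fastcache"
)

func main() {
	// Create a cache limited to 32MB.
	c := fastcache.New(32 * 1024 * 1024)

	c.Set([]byte("key"), []byte("value"))

	// Get appends the found value to dst (nil here) and returns the result.
	v := c.Get(nil, []byte("key"))
	fmt.Println(string(v)) // value

	c.Del([]byte("key"))
}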

Benchmarks

Fastcache performance is compared with BigCache, standard Go map and sync.Map.

GOMAXPROCS=4 go test github.com/VictoriaMetrics/fastcache -bench='Set|Get' -benchtime=10s
goos: linux
goarch: amd64
pkg: github.com/VictoriaMetrics/fastcache
BenchmarkBigCacheSet-4      	    2000	  10566656 ns/op	   6.20 MB/s	 4660369 B/op	       6 allocs/op
BenchmarkBigCacheGet-4      	    2000	   6902694 ns/op	   9.49 MB/s	  684169 B/op	  131076 allocs/op
BenchmarkBigCacheSetGet-4   	    1000	  17579118 ns/op	   7.46 MB/s	 5046744 B/op	  131083 allocs/op
BenchmarkCacheSet-4         	    5000	   3808874 ns/op	  17.21 MB/s	    1142 B/op	       2 allocs/op
BenchmarkCacheGet-4         	    5000	   3293849 ns/op	  19.90 MB/s	    1140 B/op	       2 allocs/op
BenchmarkCacheSetGet-4      	    2000	   8456061 ns/op	  15.50 MB/s	    2857 B/op	       5 allocs/op
BenchmarkStdMapSet-4        	    2000	  10559382 ns/op	   6.21 MB/s	  268413 B/op	   65537 allocs/op
BenchmarkStdMapGet-4        	    5000	   2687404 ns/op	  24.39 MB/s	    2558 B/op	      13 allocs/op
BenchmarkStdMapSetGet-4     	     100	 154641257 ns/op	   0.85 MB/s	  387405 B/op	   65558 allocs/op
BenchmarkSyncMapSet-4       	     500	  24703219 ns/op	   2.65 MB/s	 3426543 B/op	  262411 allocs/op
BenchmarkSyncMapGet-4       	    5000	   2265892 ns/op	  28.92 MB/s	    2545 B/op	      79 allocs/op
BenchmarkSyncMapSetGet-4    	    1000	  14595535 ns/op	   8.98 MB/s	 3417190 B/op	  262277 allocs/op

The MB/s column here actually means millions of operations per second. As you can see, fastcache is faster than BigCache in all cases. Fastcache is faster than the standard Go map and sync.Map on workloads with inserts.

Limitations

  • Keys and values must be byte slices. Other types must be marshaled before storing them in the cache.
  • Big entries with sizes exceeding 64KB must be stored via the distinct SetBig/GetBig API (see the sketch after this list).
  • There is no cache expiration. Entries are evicted from the cache only on cache size overflow. Entry deadline may be stored inside the value in order to implement cache expiration.
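
A short sketch of the big-entry API (a fragment, reusing the cache c from the usage sketch above):

big := make([]byte, 256*1024) // a 256KB value - too big for plain Set
c.SetBig([]byte("big-key"), big)
got := c.GetBig(nil, []byte("big-key")) // mirrors Get's append-to-dst signature
_ = got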

Architecture details

The cache uses ideas from BigCache:

  • The cache consists of many buckets, each with its own lock. This helps scale performance on multi-core CPUs, since multiple CPUs may concurrently access distinct buckets.
  • Each bucket consists of a hash(key) -> (key, value) position map and 64KB-sized byte slices (chunks) holding encoded (key, value) entries. Each bucket contains only O(chunksCount) pointers. For instance, a 64GB cache would contain ~1M pointers, while a similarly-sized map[string][]byte would contain ~1B pointers for short keys and values, which would lead to huge GC overhead.

64KB-sized chunks reduce memory fragmentation and the total memory usage compared to a single big chunk per bucket. Chunks are allocated off-heap if possible. This reduces total memory usage, because the GC collects unused memory more frequently without any need for GOGC tweaking.
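
In code, a bucket looks roughly like this (a simplified sketch assembled from the issue snippets further down this page, assuming a sync import; not the complete real struct):

type bucket struct {
	mu     sync.RWMutex      // per-bucket lock; distinct buckets don't contend
	chunks [][]byte          // 64KB chunks holding encoded (key, value) entries
	m      map[uint64]uint64 // hash(key) -> packed (generation, position) of the entry
	idx    uint64            // current write offset inside chunks
	gen    uint64            // incremented on every ring-buffer wrap-around
}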

FAQ

What is the difference between fastcache and other similar caches like BigCache or FreeCache?

  • Fastcache is faster. See benchmark results above.
  • Fastcache uses less memory due to lower heap fragmentation. This allows saving many GBs of memory on multi-GB caches.
  • Fastcache API is simpler. The API is designed to be used in zero-allocation mode.

Why doesn't fastcache support cache expiration?

Because we don't need cache expiration in VictoriaMetrics. Cached entries inside VictoriaMetrics never expire. They are automatically evicted on cache size overflow.

It is easy to implement cache expiration on top of fastcache by caching values with marshaled deadlines and verifying deadlines after reading these values from the cache.
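
A sketch of that approach; setWithTTL and getNotExpired are hypothetical helpers, not part of the fastcache API:

import (
	"encoding/binary"
	"time"

	"github.com/VictoriaMetrics/fastcache"
)

// setWithTTL prepends an 8-byte big-endian deadline to the value.
func setWithTTL(c *fastcache.Cache, k, v []byte, ttl time.Duration) {
	buf := make([]byte, 8+len(v))
	binary.BigEndian.PutUint64(buf, uint64(time.Now().Add(ttl).UnixNano()))
	copy(buf[8:], v)
	c.Set(k, buf)
}

// getNotExpired verifies the deadline after reading the value.
func getNotExpired(c *fastcache.Cache, k []byte) ([]byte, bool) {
	buf := c.Get(nil, k)
	if len(buf) < 8 {
		return nil, false // missing or malformed entry
	}
	if uint64(time.Now().UnixNano()) > binary.BigEndian.Uint64(buf) {
		return nil, false // expired
	}
	return buf[8:], true
}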

Why doesn't fastcache support advanced features such as thundering herd protection or callbacks on entry eviction?

Because these features would complicate the code and would make it slower. Fastcache source code is simple - just copy-paste it and implement the feature you want on top of it.

fastcache's People

Contributors

dokterbob, fmstephe, grbit, hnakamur, holiman, panmari, richardartoul, tenmozes, valyala, wizardishungry, zchee

fastcache's Issues

Discussion about the timing of getChunk()

Each bucket limits its memory to 1/512 of the total.
E.g. if I need a 512MB cache, each bucket gets only 1MB.
If many keys go to one bucket, a lot of data will be overwritten once its chunks are full.

I don't have any data showing this will happen,
but logically it would be better not to limit bucket memory to 1/512 of the total.

In fastcache.go, func (b *bucket) Set, line 335:

if chunkIdxNew >= uint64(len(b.chunks)) { // len(b.chunks) here could instead use the chunk count used across all buckets

I hope I said it clearly with my Chinglish. :-)
Thanks.

fastcache doesn't store values with a length of exactly 512k

I've run into an issue where fastcache doesn't store a value with a length of exactly 512k (512,000 bytes):

package main

import (
	"fmt"

	"github.com/VictoriaMetrics/fastcache"
)

func main() {
	cache := fastcache.New(1024)

	var key, val []byte

	key = []byte("k")
	val = make([]byte, 512000)

	cache.SetBig(key, val)
	val = cache.GetBig(nil, key)

	if val == nil {
		fmt.Println("Value is missing")
	} else {
		fmt.Println("Value is OK")
		fmt.Println(len(val))
	}
}

The code prints "Value is missing", but if I create val as make([]byte, 512000+1), then I can set and get the value.

What is the purpose of "dst" parameter in Get and HasGet?

Hi!

I'm having trouble wrapping my head around the purpose of having a dst parameter in the Get and HasGet methods.

As far as I can tell, the returned value has the dst byte slice prepended to it. But what is the use case for such logic?

I was originally expecting this dst to be a destination buffer that the value bytes would be written into (not appended to) whenever a match is found, thus saving on memory allocations, as the slice size won't change.
That's where I got confused.

I would appreciate any hint on how dst is supposed to be used, and maybe you already have plans to introduce a Get API with a destination buffer parameter for cases when the (max) value size is known in advance.

Thanks!
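
For what it's worth, the append-to-dst style is typically used to reuse one buffer across many lookups; a sketch (c is a *fastcache.Cache, keys and process are placeholders):

var buf []byte
for _, k := range keys {
	// Reuse buf's backing array: once it has grown large enough,
	// further lookups stop allocating.
	buf = c.Get(buf[:0], k)
	process(buf)
}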

Bug when setting idxNew in the Set function

func (b *bucket) Set(k, v []byte, h uint64) {
	...
    b.mu.Lock()
	idx := b.idx
	idxNew := idx + kvLen
	chunkIdx := idx / chunkSize
	chunkIdxNew := idxNew / chunkSize
	if chunkIdxNew > chunkIdx { 
		if chunkIdxNew >= uint64(len(b.chunks)) { 
			idx = 0
			idxNew = kvLen
			chunkIdx = 0
			b.gen++ 
			if b.gen&((1<<genSizeBits)-1) == 0 {
				b.gen++
			}
		} else {
			// actually idx = idxNew
			idx = chunkIdxNew * chunkSize
			// bug here: idxNew is set again, to chunkIdxNew*chunkSize + kvLen
			idxNew = idx + kvLen
			chunkIdx = chunkIdxNew
		} 
		b.chunks[chunkIdx] = b.chunks[chunkIdx][:0]
	}
    ...
    // set idxNew
    b.idx = idxNew
}

I marked the bug in the comments. In the else block, idxNew is set a second time, which wastes kvLen bytes of memory.

Support larger cache sizes

The maxbytes parameter to New() is an int, which effectively limits the cache size to ~2GB. It would be nice if it supported int64 so we could create very large caches.

2 Questions on fastcache

There is no cache expiration. Entries are evicted from the cache only on cache size overflow. Entry deadline may be stored inside the value in order to implement cache expiration.

  1. Does this mean FIFO eviction, or that ALL entries are evicted from the cache?

  2. Are the 64KB chunks designed for all value sizes (e.g. 512B values, 1KB values, etc.) or purely for 64KB values only?
    E.g. if my values are mostly 512 bytes in size, should I manually lower the chunk size to 512B inside the program? And if so, should I also change the values for SetBig (the distinct API)?

can't compile to wasm

I'm trying to use this package in a wasm environment, but it fails miserably:

     imports github.com/VictoriaMetrics/fastcache
        imports golang.org/x/sys/unix: build constraints exclude all Go files in /Users/x/go/pkg/mod/golang.org/x/sys@<version>/unix

How do I write a package that I can adjust the

Basically, I would like to do my own pre-processing around fastcache, so I can assign the size of the cache inside my program instead of directly on fastcache.

How do I do this?


import (
	"fmt"

	"github.com/VictoriaMetrics/fastcache"
)

type Cache struct {
	cache *fastcache.Cache
}

func New(maxBytes int) *Cache {
	if maxBytes <= 0 {
		panic(fmt.Errorf("maxBytes must be greater than 0; got %d", maxBytes))
	}
	//      var c fastcache.Cache
	//      maxBucketBytes := uint64((maxBytes + bucketsCount - 1) / bucketsCount)
	//      for i := range c.buckets[:] {
	//              c.buckets[i].Init(maxBucketBytes)
	//      }
	return &Cache{cache: fastcache.New(maxBytes)}
}

func (c *Cache) Set(k, v []byte) {
	// do something before delegating to fastcache's Set
	c.cache.Set(k, v)
}

Great work! I understand that the ringbuffer store "duplicate key"...

Great work! I understand that the ring buffer stores "duplicate keys", so is there a version that stores only one key, so that it doesn't "pollute" the buffer?

Or how do I get this done? I'm using fastcache as a +1 counter, and it is not ideal since the buffer just keeps growing this way.

Many allocations

I use almost exclusively freecache as a cache and wanted to give fastcache a try. So I cloned it and ran the tests:

/go/src/github.com/VictoriaMetrics/fastcache$ go test -bench=.
goos: linux
goarch: amd64
pkg: github.com/VictoriaMetrics/fastcache
BenchmarkSetBig-4 100000 21733 ns/op 12061.66 MB/s 15 B/op 0 allocs/op
BenchmarkGetBig-4 100000 23658 ns/op 11469.90 MB/s 26 B/op 0 allocs/op
BenchmarkBigCacheSet-4 100 12304106 ns/op 5.33 MB/s 3197467 B/op 71 allocs/op
BenchmarkBigCacheGet-4 200 7980345 ns/op 8.21 MB/s 2123027 B/op 131107 allocs/op
BenchmarkBigCacheSetGet-4 50 20196867 ns/op 6.49 MB/s 6919267 B/op 131215 allocs/op
BenchmarkCacheSet-4 300 4975663 ns/op 13.17 MB/s 19053 B/op 38 allocs/op
BenchmarkCacheGet-4 300 3846946 ns/op 17.04 MB/s 19013 B/op 37 allocs/op
BenchmarkCacheHas-4 300 4020827 ns/op 16.30 MB/s 19013 B/op 37 allocs/op
BenchmarkCacheSetGet-4 200 12254412 ns/op 10.70 MB/s 28624 B/op 58 allocs/op
BenchmarkStdMapSet-4 100 16950404 ns/op 3.87 MB/s 387533 B/op 65559 allocs/op
BenchmarkStdMapGet-4 300 3453446 ns/op 18.98 MB/s 42698 B/op 226 allocs/op
BenchmarkStdMapSetGet-4 30 83693808 ns/op 1.57 MB/s 680041 B/op 65613 allocs/op
BenchmarkSyncMapSet-4 50 33187992 ns/op 1.97 MB/s 3594094 B/op 264812 allocs/op
BenchmarkSyncMapGet-4 500 3318527 ns/op 19.75 MB/s 25449 B/op 791 allocs/op
BenchmarkSyncMapSetGet-4 50 20310120 ns/op 6.45 MB/s 3594073 B/op 264811 allocs/op
BenchmarkSaveToFile/concurrency_1-4 5 233652213 ns/op 143.61 MB/s 21166353 B/op 3128 allocs/op
BenchmarkSaveToFile/concurrency_2-4 10 124151525 ns/op 270.27 MB/s 21314589 B/op 3143 allocs/op
BenchmarkSaveToFile/concurrency_4-4 20 92345918 ns/op 363.36 MB/s 21610400 B/op 3168 allocs/op
BenchmarkSaveToFile/concurrency_8-4 20 96639664 ns/op 347.21 MB/s 21610358 B/op 3168 allocs/op

I was wondering why I get about 38 allocs/op when your benchmark above shows 2.

Support timeout for SaveToFile/SaveToFileConcurrent

In our project, fastcache consumes 40~50GB, and it takes about 7 minutes to save the cache to disk, which is too long.
Since we could benefit from saving only part of the cache to disk, it would be nice if SaveToFile/SaveToFileConcurrent accepted a Context to support timeouts. What do you think?

Iterations by indexes

Hi.

For example:

set("username:victoria", "value")
set("username:metrics", "value")
...

scan("username:*", 10) // last 10

Is it possible to add this much-needed feature?

If not, please advise how to iterate over the data without losing fastcache's high speed and quality.

Create a slice (like an index) with bucket indexes, keys or xxhashes, then iterate over all elements of the slice and Get the values? (A sketch of this idea follows.)
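
A sketch of that last idea, an external key index (indexedCache is a hypothetical wrapper, not part of fastcache; assumes bytes and sync are imported):

type indexedCache struct {
	mu   sync.Mutex
	keys [][]byte // insertion-ordered copies of the keys
	c    *fastcache.Cache
}

func (ic *indexedCache) Set(k, v []byte) {
	ic.mu.Lock()
	ic.keys = append(ic.keys, append([]byte(nil), k...))
	ic.mu.Unlock()
	ic.c.Set(k, v)
}

// scan calls f for the values of the last n keys matching prefix.
func (ic *indexedCache) scan(prefix []byte, n int, f func(k, v []byte)) {
	ic.mu.Lock()
	defer ic.mu.Unlock()
	for i := len(ic.keys) - 1; i >= 0 && n > 0; i-- {
		if k := ic.keys[i]; bytes.HasPrefix(k, prefix) {
			f(k, ic.c.Get(nil, k))
			n--
		}
	}
}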

Thanks

Cache expiration recommendations

It is easy to implement cache expiration on top of fastcache by caching values with marshaled deadlines and verifying deadlines after reading these values from the cache.

While I completely agree with this statement, it only handles "cache hit" expiration. How would you recommend implementing a scheduler that cleans up the cache once in a while?

The main issue is that I see neither a way to "iterate over all cache keys" nor a way to "delete all by value" (e.g. by expiration) in order to implement this.

Currently, it seems that the only possible way to achieve this is by creating an additional data structure (e.g. a thread-safe map[string]int with tokenKey -> expirationVal, or, reversed, a map[int][]string with expirationKey -> []tokenVal) in order to have some sort of iteration over keys/expirations.

However, this adds extra complexity (it has to be kept in sync, so it's basically holding a cache for another cache) and memory (duplicating the keys), especially in cases with large keys (I'm using ~1000-char JWTs as keys; there's no need for a value in this scenario, just the expiration, which is usually quite long, e.g. up to 24 hours).
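
A sketch of that auxiliary structure (expiringKeys is hypothetical; assumes sync and time are imported):

type expiringKeys struct {
	mu       sync.Mutex
	deadline map[string]int64 // key -> unix-nano expiration
	c        *fastcache.Cache
}

// cleanup would run periodically, e.g. driven by a time.Ticker.
func (e *expiringKeys) cleanup() {
	now := time.Now().UnixNano()
	e.mu.Lock()
	defer e.mu.Unlock()
	for k, d := range e.deadline {
		if d < now {
			e.c.Del([]byte(k))
			delete(e.deadline, k)
		}
	}
}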

@valyala Looking forward to hearing your take on this; I would really like to use fastcache in our products.

Clarification: k and v contents may be modified after returning from Set.

Several methods have a similar descriptor to the following in the docs:

k and v contents may be modified after returning from Set.

Upon first reading, I assumed that the Set() method would modify the k and v byte slices, and that copies should be passed into the fastcache methods.

I couldn't believe it, and only after spending some time digging in the source did I realize that these sentences probably mean that the fastcache methods themselves store copies of the k and v slices, and that you don't have to worry about reusing the originals.

Just a heads up: I found the particular wording confusing; perhaps consider a slightly different wording.
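
In code, the guarantee seems to amount to this (a sketch; c is a *fastcache.Cache):

k := []byte("key")
v := []byte("value")
c.Set(k, v) // Set stores copies of k and v
// Safe: the cache holds its own copies, so the caller may
// freely reuse or modify the original slices afterwards.
v[0] = 'X'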

Some confusion while reading the source code

    b.gen++
    if b.gen&((1<<genSizeBits)-1) == 0 {
        b.gen++
    }

Hello, developers of fastcache. I am learning how this library works by reading the source code, and I have some confusion about the code of the function Get() in fastcache.go, line 341.

In my opinion, the gen field in bucket means the number of reset rounds that the chunks have experienced.
And if gen reaches the maximum 1<<genSizeBits, it should be reset to 1 rather than simply incremented.

Is my understanding correct?

Should fastcache add LoadOrStore() like sync.Map?

Just like sync.Map:
vv, ok := c.LoadOrStore(k, v)
Then we could know whether the key already exists in the concurrent race case, and avoid a duplicated set operation.
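
A sketch of what this could look like built on top of fastcache with an external lock (loadOrStore is hypothetical, and it assumes the HasGet method mentioned elsewhere on this page returns a found flag):

var mu sync.Mutex

func loadOrStore(c *fastcache.Cache, k, v []byte) ([]byte, bool) {
	mu.Lock()
	defer mu.Unlock()
	if existing, ok := c.HasGet(nil, k); ok {
		return existing, true // key already present
	}
	c.Set(k, v)
	return v, false
}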

[Feature] Notify when data evicted

Use a channel to get notified about evicted entries.

I hope for something with the following effect:

cache := fastcache.New(1024 * 1024 * 1024)
evictChan := make(chan []byte)
cache.NotifyEvicted(evictChan)

select {
case bt := <-evictChan:
	// do something with bt here
	_ = bt
}

contribution to the workingsetcache.go

// Deletion is sort of missing, so here it is:
func (c *Cache) Del(key []byte) {
	curr := c.curr.Load().(*fastcache.Cache)
	curr.Del(key)
	prev := c.prev.Load().(*fastcache.Cache)
	prev.Del(key)
}

Vulnerability: CVE-2022-29526

A vulnerability has been discovered in one of the library's dependencies.

Go before 1.17.10 and 1.18.x before 1.18.2 has Incorrect Privilege Assignment. When called with a non-zero flags parameter, the Faccessat function could incorrectly report that a file is accessible.

Link

Possible to remove the allocations? Make them 0?

If the BigCache allocations can't be removed, maybe the normal cache ones can? There are 2 allocs/op now... is it possible to remove them?

BenchmarkBigCacheSet-4 2000 10566656 ns/op 6.20 MB/s 4660369 B/op 6 allocs/op
BenchmarkBigCacheGet-4 2000 6902694 ns/op 9.49 MB/s 684169 B/op 131076 allocs/op
BenchmarkBigCacheSetGet-4 1000 17579118 ns/op 7.46 MB/s 5046744 B/op 131083 allocs/op
BenchmarkCacheSet-4 5000 3808874 ns/op 17.21 MB/s 1142 B/op 2 allocs/op
BenchmarkCacheGet-4 5000 3293849 ns/op 19.90 MB/s 1140 B/op 2 allocs/op
BenchmarkCacheSetGet-4 2000 8456061 ns/op 15.50 MB/s 2857 B/op 5 allocs/op

How many bytes of overhead for storing one data item?

How many bytes of overhead are needed for storing one data item, both for small values and for large 64KB values? How many additional bytes are needed per entry? Can you please put this in the FAQ on the front page?
Thanks

A way to completely disable auto eviction.

I'm looking for a fast cache without auto eviction by time or by filling up. Does your cache have any way to keep all data persistent in the cache as long as the Delete method is not triggered?

Any possibility of OOM?

I don't see any configuration settings for limiting the number of bytes or slices/elements allocated.

Thanks for the software by the way

Memory leak when using `LoadFromFile` multiple times

I have a service that loads a fastcache from file every N minutes. Every time this happens, the program leaks memory. This was easily observable by simply loading a cache multiple times in the same program and then using htop/ActivityMonitor to watch the steady increase in memory. In my particular use case, the cache is rather large (~18M keys, 662MB on disk).

The snippet to reproduce this is straightforward:

package main

import (
	"fmt"
	"os"
	"os/signal"
	"runtime"
	"syscall"
	"time"

	"github.com/VictoriaMetrics/fastcache"
)

func main() {
	cachePath := "path_to_cache"

	tick := time.NewTicker(15 * time.Second)
	sig := make(chan os.Signal, 2)
	signal.Notify(sig, syscall.SIGINT, syscall.SIGTERM)
	fmt.Println(os.Getpid())
	for {
		select {
		case <-tick.C:
			newCache, err := fastcache.LoadFromFile(cachePath)
			if err != nil {
				panic(err)
			}
			_ = newCache
			runtime.GC()
			fmt.Println("reloading cache done")
		case <-sig:
			tick.Stop()
			return
		}
	}
}

After some profiling and debugging, I've found that the use of getChunk() with memmap is the culprit. When we reload the cache, we never call putChunk, only getChunk, which causes the library to continuously allocate memory via memmap that is never munmapped. To verify this, I added a panic to putChunk, which was never triggered.

I think one solution might be to add LoadFromFile as a method on the Cache object as well, so that the chunks from the old buckets can be reused when loading a new cache. I might be able to help with a PR if time permits, but in the meantime I thought you should be aware of this issue.

Is it thread-safe? My program shows that it is not

//result not correct
func TestName1(t *testing.T) {

	c := fastcache.New(1024 * 1024)

	wg := sync.WaitGroup{}

	for i := 0; i < 1000; i++ {

		wg.Add(1)
		go func(i2 int) {

			result := c.Get(nil, []byte(`k`))

			r, _ := strconv.Atoi(string(result))
			r++
			c.Set([]byte(`k`), []byte(strconv.Itoa(r)))

			wg.Done()
		}(i)
	}

	wg.Wait()

	result := c.Get(nil, []byte(`k`))
	log.Println(result)
	log.Println(string(result))

}

//result is correct 
func TestName2(t *testing.T) {

	c := fastcache.New(1024 * 1024)

	wg := sync.WaitGroup{}

	mutex := sync.Mutex{}

	for i := 0; i < 1000; i++ {

		wg.Add(1)
		go func(i2 int) {

			mutex.Lock()
			result := c.Get(nil, []byte(`k`))

			r, _ := strconv.Atoi(string(result))
			r++
			c.Set([]byte(`k`), []byte(strconv.Itoa(r)))

			mutex.Unlock()

			wg.Done()
		}(i)
	}

	wg.Wait()

	result := c.Get(nil, []byte(`k`))
	log.Println(result)
	log.Println(string(result))

}


Just curious about SetBig

Why not add SetBig here:

if len(k) >= (1<<16) || len(v) >= (1<<16) {
	SetBig(k, v) // what's wrong with calling SetBig here, instead of requiring a separate SetBig call?
}

Because I don't know whether my cache value will exceed 64KB, I'll do:

if len(k) >= (1<<16) || len(v) >= (1<<16) {
	fastcache.SetBig(k, v)
} else {
	fastcache.Set(k, v)
}

Any problems with the above? With reference to the function below:

func (b *bucket) Set(k, v []byte, h uint64) {
	setCalls := atomic.AddUint64(&b.setCalls, 1)
	if setCalls%(1<<14) == 0 {
		b.Clean()
	}

	if len(k) >= (1<<16) || len(v) >= (1<<16) {
		// Too big key or value - its length cannot be encoded
		// with 2 bytes (see below). Skip the entry.
		return
	}

build error: constant 1099511627776 overflows int

Hi valyala, thanks for this great project!

When I tried to build fastcache for the 386 target, I got the error in the title.

I figured it out, and here is my patch:

diff --git a/fastcache.go b/fastcache.go
index e97cc2e..4ac93c9 100644
--- a/fastcache.go
+++ b/fastcache.go
@@ -12,6 +12,8 @@ const bucketsCount = 512
 
 const chunkSize = 64 * 1024
 
+const maxBucketSize uint64 = 1 << 40
+
 // Stats represents cache stats.
 //
 // Use Cache.UpdateStats for obtaining fresh stats from the cache.
@@ -140,8 +142,8 @@ type bucket struct {
 }
 
 func (b *bucket) Init(maxBytes uint64) {
-	if maxBytes >= (1 << 40) {
-		panic(fmt.Errorf("too big maxBytes=%d; should be smaller than %d", maxBytes, 1<<40))
+	if maxBytes >= maxBucketSize {
+		panic(fmt.Errorf("too big maxBytes=%d; should be smaller than %d", maxBytes, maxBucketSize))
 	}
 	maxChunks := (maxBytes + chunkSize - 1) / chunkSize
 	b.chunks = make([][]byte, maxChunks)
diff --git a/file.go b/file.go
index c642feb..dcaa299 100644
--- a/file.go
+++ b/file.go
@@ -187,8 +187,8 @@ func (b *bucket) Load(r io.Reader, maxChunks uint64) error {
 	}
 
 	maxBytes := maxChunks * chunkSize
-	if maxBytes >= (1 << 40) {
-		return fmt.Errorf("too big maxBytes=%d; should be smaller than %d", maxBytes, 1<<40)
+	if maxBytes >= maxBucketSize {
+		return fmt.Errorf("too big maxBytes=%d; should be smaller than %d", maxBytes, maxBucketSize)
 	}
 	chunks := make([][]byte, maxChunks)
 	chunksLen, err := readUint64(r)

Is it possible to merge this?

Why do methods not return error?

The methods of this package don't return any errors, for example the Set() method:

fastcache/fastcache.go

Lines 145 to 149 in 8835719

func (c *Cache) Set(k, v []byte) {
	h := xxhash.Sum64(k)
	idx := h % bucketsCount
	c.buckets[idx].Set(k, v, h)
}

This in turn calls the bucket's Set() method, which silently skips caching entries at these lines:

fastcache/fastcache.go

Lines 308 to 312 in 8835719

if len(k) >= (1<<16) || len(v) >= (1<<16) {
	// Too big key or value - its length cannot be encoded
	// with 2 bytes (see below). Skip the entry.
	return
}

fastcache/fastcache.go

Lines 319 to 323 in 8835719

if kvLen >= chunkSize {
	// Do not store too big keys and values, since they do not
	// fit a chunk.
	return
}

Is there a design decision as to why these methods don't return an error?

Why doesn't fastcache handle hash collisions?

In fastcache.go, func (b *bucket) Set, line 359:
b.m[h] = idx | (b.gen << bucketSizeBits)

I think the code should be:

if oldValue, ok := b.m[h]; ok {
	// do something here, or the previous key with the same hash will be lost
}

thanks.

API shortcoming on nil values

We've been integrating fastcache into our project and @holiman found an interesting corner case that fastcache handles correctly internally, but fails to properly surface it to the user.

In our specific use case, values associated with keys can be empty. So what we would like to do is insert key->nil mappings and be able to retrieve them. This works halfway:

cache := fastcache.New(500 * 1024)

fmt.Println(cache.Has([]byte("a")))
cache.Set([]byte("a"), nil)
fmt.Println(cache.Has([]byte("a")))

^ correctly returns false and then true.


However, if I use Get, things don't look so well:

cache := fastcache.New(500 * 1024)

fmt.Println(cache.Get(nil, []byte("a")) == nil)
cache.Set([]byte("a"), nil)
fmt.Println(cache.Get(nil, []byte("a")) == nil)

^ returns nil in both cases. You could even set []byte{} and both would still return nil.


Why is this an issue? Because we cannot differentiate between a value being inside-the-cache-and-empty and a value missing-from-the-cache, requiring database access. The current API only allows using Has-then-Get to cover this, but that's not thread-safe anymore.

The fault is a combination of https://github.com/VictoriaMetrics/fastcache/blob/master/fastcache.go#L386, because Go's append will keep dst as nil if appending an empty slice; and https://github.com/VictoriaMetrics/fastcache/blob/master/fastcache.go#L161, because it discards the found flag and only returns dst.

One solution would be to change the append line so it special-cases the scenario where both dst and the found value are nil, but that won't really work if the user supplies a non-nil dst.

Our best solution so far is to add a second Get method that returns not only the cached value, but also a flag indicating whether it was found. This way we don't break the API.


Does this seem good to you? We can open a PR if you want.
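
For reference, such a second Get method would look like this at the call site (a sketch, assuming the HasGet method mentioned earlier on this page returns the found flag):

v, ok := cache.HasGet(nil, []byte("a"))
if !ok {
	// genuinely missing from the cache - go to the database
}
_ = v // may be present yet empty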

Found mis-handled disk corruption or other internal bug

I am a heavy user of this cache in production and have run into a bug. We save our cache to disk and have O(1GB) of cache per node, storing many small and medium-sized objects using the SetBig function.

Yesterday we ran into a machine in a "zombie" state, piling up tens of thousands of goroutines doing cache lookups. They were all attempting to acquire the RLock to read a bucket in the cache.

Four goroutines were attempting to write to the cache and had panicked. Due to a quirk in our usage, we have a defer that will, under some circumstances, eventually also write to the cache, and it runs in the same goroutine where the panic happened. This results in deadlock.

The panic happened at fastcache.go:318, the first line in this snippet:

chunk := b.chunks[chunkIdx]
if chunk == nil {
    chunk = getChunk()
    chunk = chunk[:0]
}

I'm not sure how this would come up, but instead of crashing it should be handled by returning an error that corruption was found. This may not be possible given the general lack of checksums in the library.

I don't have a great solution here, but I thought you should know about this. On our end, we are wrapping our usage with a defer function that will nuke the cache from disk, sync the directory, and call os.Exit.

Questions about how to check the cache usage and about specifying the cache size

Hi Team,

I've developed an application and used fastcache as an in-memory data store. So far, everything works fine. One thing I'm struggling to figure out is how to see the cache usage in megabytes.

So,

  1. Is there a way to get the cache usage (not the allocation) and/or the free quota?
  2. If it's not possible at the moment, would such a feature be made available in the future?

There's some confusion in the documentation as well, when creating a cache using fastcache.New(maxBytes)
(https://pkg.go.dev/github.com/VictoriaMetrics/fastcache#New)

  1. If we need a cache of 64MB, should we specify maxBytes as 64 or 64000000?
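
Judging by the calls elsewhere on this page (e.g. fastcache.New(1024 * 1024 * 1024) for a 1GB cache), maxBytes appears to be a plain byte count, so a 64MB cache would presumably be:

cache := fastcache.New(64 * 1024 * 1024) // 64MB

For questions 1 and 2, the Stats/UpdateStats names visible in the 386 patch above ("Use Cache.UpdateStats for obtaining fresh stats from the cache") suggest a starting point, though the exact stats fields are not shown on this page.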
