Comments (17)

rueian commented on September 4, 2024

Hi @xiluoxi,

There are three new fields on rueidis.ClientOption introduced in v0.0.76 that can affect performance: ReadBufferEachConn, WriteBufferEachConn, and PipelineMultiplex.

Increasing ReadBufferEachConn and WriteBufferEachConn requires more memory but saves TCP system calls.
Increasing PipelineMultiplex uses more TCP connections to pipeline commands to one Redis node; this uses more CPU but can lower latencies and cache contention.

You can try increasing or decreasing them to see how they affect performance and find the best values for your case.
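
For example, a minimal sketch of tuning these fields (the values here are illustrative only, and the address is a placeholder; adjust both to your setup):

package main

import "github.com/rueian/rueidis"

func main() {
	client, err := rueidis.NewClient(rueidis.ClientOption{
		InitAddress:         []string{"127.0.0.1:6379"}, // placeholder address
		ReadBufferEachConn:  1 << 20,                    // larger read buffer per connection
		WriteBufferEachConn: 1 << 20,                    // larger write buffer per connection
		PipelineMultiplex:   2,                          // use more connections per node for pipelining
	})
	if err != nil {
		panic(err)
	}
	defer client.Close()
}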

This is the result of the same code and the same machines as the previous simulation, but with v0.0.76:
[screenshot: v0.0.76 benchmark results]
Now it achieves 28x read throughput, and latencies are also improved, though with more goroutines used.

xiluoxi commented on September 4, 2024

It gets better when I restart the service.

rueian commented on September 4, 2024

@xiluoxi, thank you for reporting this. I will look into that as soon as possible.

rueian commented on September 4, 2024

Hi @xiluoxi, please try the new v0.0.74. The memory leak in the LRU cache should be fixed.

xiluoxi commented on September 4, 2024

@rueian After the service runs for a while, the memory usage still rises.

xiluoxi commented on September 4, 2024

In the test, memory usage increased abnormally, from 4.1% to 21%. On the other hand, the performance of rueidis improved.

rueian commented on September 4, 2024

In the test, memory usage increased abnormally, from 4.1% to 21%. On the other hand, the performance of rueidis improved.

Hi @xiluoxi, just to clarify: do you expect it to keep using 4.1% of memory? How long did it take to reach 21% of memory?

On the other hand, the performance of rueidis improved.

Do you mean that although the memory leak issue remains, the latency issue is solved?

xiluoxi commented on September 4, 2024

The latency issue still exists.

xiluoxi commented on September 4, 2024

At the same time, a new problem has emerged: in a highly concurrent write scenario, memory usage increases rapidly when Redis fails or processes commands slowly.

xiluoxi commented on September 4, 2024

The latency issue still exists.

In a highly concurrent read scenario, the latency is > 100ms.

xiluoxi commented on September 4, 2024

There are two rueidis clients in my service, connecting to different Redis servers: one for reading and one for writing. I'm not sure whether they can influence each other.

rueian commented on September 4, 2024

They should not affect each other. What is the relationship between these two Redis servers? Are they part of a Redis Cluster?

BTW, are you using DoCache or DoMultiCache to send commands?
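
For reference, the two call shapes look roughly like this (a rough sketch; the key names and TTLs are placeholders, and it assumes a connected client, a ctx, and the time package as in the program later in this thread):

// Single cached read, served from the client-side cache when possible:
resp := client.DoCache(ctx, client.B().Get().Key("k1").Cache(), time.Minute)

// Multiple cached reads pipelined together:
resps := client.DoMultiCache(ctx,
	rueidis.CT(client.B().Get().Key("k1").Cache(), time.Minute),
	rueidis.CT(client.B().Get().Key("k2").Cache(), time.Minute),
)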

xiluoxi commented on September 4, 2024

They should not affect each other. What is the relationship between these two Redis servers? Are they part of a Redis Cluster?

BTW, are you using DoCache or DoMultiCache to send commands?

No Redis Cluster; they are on two different servers. I am using DoCache.

rueian commented on September 4, 2024

Hi @xiluoxi,

At the same time, a new problem has emerged: in a highly concurrent write scenario, memory usage increases rapidly when Redis fails or processes commands slowly.

This may be caused by the fact that the command builder currently does not reuse the command buffers of previously failed commands, due to some racing problems. This may take some time to improve.

In a highly concurrent read scenario, the latency is > 100ms.

I have done some tests on Google Cloud, but I am still not able to reproduce your situation.

I created two instances in the same zone of Google Cloud; their specs:

  1. n2d-highcpu-4 (4core, 4G ram, AMD Rome, ip: 10.140.0.52)
  2. n2-highcpu-8 (8core, 8G ram, Intel Cascade Lake, ip: 10.140.0.51)

The first machine ran Redis 7.0.4 + Prometheus + Grafana.
The second machine ran the following program, using rueidis v0.0.75 and compiled with Go 1.19:

package main

import (
	"context"
	"fmt"
	"math/rand"
	"net/http"
	"strconv"
	"time"

	"github.com/go-redis/redis/v9"
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
	"github.com/prometheus/client_golang/prometheus/promhttp"
	"github.com/rueian/rueidis"
)

// prepData returns n distinct numeric strings in shuffled order.
func prepData(n int) []string {
	data := make([]string, n)
	for i := range data {
		data[i] = strconv.Itoa(i)
	}
	rand.Shuffle(len(data), func(i, j int) { data[i], data[j] = data[j], data[i] })
	return data
}

const (
	keyCount   = 1000000 
	readers    = 8
	writers    = 2
	useGoRedis = false // please change it
	cacheSize  = 512 * (1 << 20) // 512 MB
	addr       = "10.140.0.52:6379" // please change it
)

func main() {
	rand.Seed(time.Now().UnixNano())
	bucket := []float64{250, 500, 750, 1000, 2500, 5000, 7500, 10000, 25000, 50000, 75000, 100000, 250000, 500000, 750000, 1000000}

	wl := promauto.NewHistogram(prometheus.HistogramOpts{Name: "micro_write_latency", Buckets: bucket})
	rl := promauto.NewHistogram(prometheus.HistogramOpts{Name: "micro_read_latency", Buckets: bucket})

	// Expose the Prometheus metrics endpoint on :2112.
	go func() {
		http.Handle("/metrics", promhttp.Handler())
		http.ListenAndServe(":2112", nil)
	}()

	rc, err := rueidis.NewClient(rueidis.ClientOption{
		InitAddress:       []string{addr},
		CacheSizeEachConn: cacheSize,
	})
	if err != nil {
		panic(err)
	}

	gc := redis.NewUniversalClient(&redis.UniversalOptions{
		Addrs: []string{addr},
	})

	ctx := context.Background()

	goredisWrite := func(key, data string) error {
		return gc.Set(ctx, key, data, 0).Err()
	}
	goredisRead := func(key string) error {
		return gc.Get(ctx, key).Err()
	}
	rueidisWrite := func(key, data string) error {
		return rc.Do(ctx, rc.B().Set().Key(key).Value(data).Build()).Error()
	}
	rueidisCache := func(key string) error {
		return rc.DoCache(ctx, rc.B().Get().Key(key).Cache(), time.Hour).Error()
	}

	var wfn func(key, data string) error
	var rfn func(key string) error

	if useGoRedis {
		wfn = goredisWrite
		rfn = goredisRead
	} else {
		wfn = rueidisWrite
		rfn = rueidisCache
	}

	writeFn := func(keys, data []string) {
		for i, k := range keys {
			ts := time.Now()
			err := wfn(k, data[i])
			wl.Observe(float64(time.Since(ts).Microseconds()))
			if err != nil {
				panic(err)
			}
		}
	}
	readFn := func(keys []string) {
		for _, k := range keys {
			ts := time.Now()
			err := rfn(k)
			rl.Observe(float64(time.Since(ts).Microseconds()))
			if err != nil {
				panic(err)
			}
		}
	}

	// Seed Redis with the initial dataset using one big DoMulti pipeline.
	{
		keys := prepData(keyCount)
		data := prepData(keyCount)
		commands := make(rueidis.Commands, len(keys))
		for i := range commands {
			commands[i] = rc.B().Set().Key(keys[i]).Value(data[i]).Build()
		}
		ts := time.Now()
		for _, resp := range rc.DoMulti(ctx, commands...) {
			if err := resp.Error(); err != nil {
				panic(err)
			}
		}
		fmt.Println("ready", time.Since(ts))
	}

	// Close whichever client is not exercised by this run.
	if useGoRedis {
		rc.Close()
	} else {
		gc.Close()
	}

	// Start writer and reader goroutines; each loops over its own shuffled keys forever.
	for i := 0; i < writers; i++ {
		go func() {
			keys := prepData(keyCount)
			data := prepData(keyCount)
			for {
				writeFn(keys, data)
			}
		}()
	}
	for i := 0; i < readers; i++ {
		go func() {
			keys := prepData(keyCount)
			for {
				readFn(keys)
			}
		}()
	}
	time.Sleep(time.Hour)
}

This program records latency metrics for 8 concurrent readers and 2 concurrent writers that keep reading and writing 1,000,000 keys.

The result of goredis v9:
[screenshot: goredis v9 benchmark results]

The result of rueidis v0.0.75 with an additional 512MB client-side cache:
[screenshot: rueidis v0.0.75 benchmark results]

While rueidis indeed used more memory for client-side caching, it achieved 14x the read throughput of goredis (887874/61978) in this case, with P99 latencies < 0.5ms and no memory leak.

Would you mind sharing more details about your machine/network specs and traffic patterns, such as concurrency, read/write ratio, cache-hit ratio, and average key/value sizes, so that I can help find the cause of your problem?

xiluoxi commented on September 4, 2024

You can try this case: read and write the same key with high concurrency, mostly reading.
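
A minimal sketch of that access pattern (the key name is hypothetical, assuming a connected rueidis client and ctx as in the earlier program) could be:

// Many concurrent cached reads of one hot key...
for i := 0; i < 16; i++ {
	go func() {
		for {
			client.DoCache(ctx, client.B().Get().Key("hot").Cache(), time.Hour)
		}
	}()
}
// ...plus a few concurrent writers updating the same key.
for i := 0; i < 2; i++ {
	go func() {
		for {
			client.Do(ctx, client.B().Set().Key("hot").Value("v").Build())
		}
	}()
}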

rueian commented on September 4, 2024

You can try this case: read and write the same key with high concurrency, mostly reading.

Hi @xiluoxi, the previous simulation I posted reads and writes the same set of keys with high concurrency, mostly reading.
Would you mind sharing more details about your machine spec? For example, how many CPUs does each machine have? It would also be helpful to know your typical key and value sizes.

xiluoxi commented on September 4, 2024

Hi @xiluoxi,

There are three new fields on rueidis.ClientOption introduced in v0.0.76 that can affect performance: ReadBufferEachConn, WriteBufferEachConn, and PipelineMultiplex.

Increasing ReadBufferEachConn and WriteBufferEachConn requires more memory but saves TCP system calls. Increasing PipelineMultiplex uses more TCP connections to pipeline commands to one Redis node; this uses more CPU but can lower latencies and cache contention.

You can try increasing or decreasing them to see how they affect performance and find the best values for your case.

This is the result of the same code and the same machines as the previous simulation, but with v0.0.76: [screenshot: v0.0.76 benchmark results] Now it achieves 28x read throughput, and latencies are also improved, though with more goroutines used.

Thanks, I will try.
