Comments (7)

purnesh42H commented on July 23, 2024

Thanks @zejunlitg for the question. I will take a look and get back to you

from grpc-go.

purnesh42H commented on July 23, 2024

"It's my belief that this is caused by the fact that the underlying TCP connection is closed on the client side, but the client side still tried to write to it."

Could you clarify what you mean by the above? The "client preface" is the string that every new connection from a client must send first. This error indicates a failure while writing that initial client message (the client preface) to establish the gRPC connection. The specific error "use of closed network connection" suggests that the TCP connection was closed unexpectedly.

zejunlitg commented on July 23, 2024

@purnesh42H
AFAIK, this error happens within golang's net package:

conn, err := net.Dial("tcp", ":8888")
if err != nil {
  log.Println("dial error:", err)
  return
}

// close the connection here
conn.Close()

// then try to write over the connection; this returns the error
// 'write tcp x.x.x.x:PORT_SRC->x.x.x.x:PORT_DST: use of closed network connection'
buf := []byte("hello")
_, err = conn.Write(buf)
log.Println("write error:", err)

That's why I said the connection is closed on the client side. I hope this clarifies.

I agree that this happens unexpectedly; that is exactly what happened. Can you help me understand what I can do when it happens? Do I:

  1. retry calling the same RPC
  2. re-create the gRPC client, then call the same RPC with the new client
  3. instead of retrying manually as in points 1 and 2, use gRPC's built-in retry configuration?

Which one is preferred, and why does it work?

purnesh42H commented on July 23, 2024

@zejunlitg please refer to the retry documentation for more details, if you have not already done so.

Meanwhile, could you provide more details on the following?

  1. Your code based on the retry example, with your modifications (if any)
  2. What is the reason for the transport failure? Is there anything wrong with the server? See "How to turn on logging".

zejunlitg commented on July 23, 2024

@purnesh42H I've read the retry documentation and it does not answer my question. That's why I'm posting here for a dev answer. Unless I missed it in the doc, to be very explicit, the question is:
does gRPC retry handle the fact that the network connection gets unexpectedly closed? This involves implementation details that the doc does not reveal.

RE 1:
I copied the retry policy from the Go example:

// The config applies per method, or to all methods under a service.
// RetryableStatusCodes takes gRPC status code names.
var retryPolicy = `{
    "methodConfig": [{
        "name": [{"service": "grpc.examples.echo.Echo"}],
        "waitForReady": true,

        "retryPolicy": {
            "MaxAttempts": 4,
            "InitialBackoff": ".01s",
            "MaxBackoff": ".01s",
            "BackoffMultiplier": 1.0,
            "RetryableStatusCodes": [ "UNAVAILABLE" ]
        }
    }]
}`
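One thing worth checking with a policy string like this: standard JSON has no comment syntax, so `//` lines placed inside the raw string would make it invalid for a strict parser. The sketch below uses only `encoding/json` to show the difference; the assumption here is that the service config parser is at least as strict as `encoding/json`, and the constant and function names are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// withComments mimics a service config that keeps explanatory comments
// inside the JSON text itself.
const withComments = `{
    "methodConfig": [{
        // config per method or all methods under service
        "name": [{"service": "grpc.examples.echo.Echo"}]
    }]
}`

// withoutComments is the same fragment with the comment removed.
const withoutComments = `{
    "methodConfig": [{
        "name": [{"service": "grpc.examples.echo.Echo"}]
    }]
}`

// isStrictJSON (hypothetical helper) reports whether s is valid JSON
// according to encoding/json.
func isStrictJSON(s string) bool {
	return json.Valid([]byte(s))
}

func main() {
	fmt.Println("with comments valid:", isStrictJSON(withComments))
	fmt.Println("without comments valid:", isStrictJSON(withoutComments))
}
```

Keeping the explanatory comments in Go code, outside the backtick string, avoids the problem.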

The example then uses the grpc.NewClient() API, which takes the target address as its first argument:

conn, err := grpc.NewClient(*addr, grpc.WithTransportCredentials(insecure.NewCredentials()), grpc.WithDefaultServiceConfig(retryPolicy))

The only difference in my usage is that I'm calling this:

grpc.Dial(endPoint, DialOptions()...)

Here are the options we're using. I'm plugging grpc.WithDefaultServiceConfig(retryPolicy) in here:

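func DialOptions() []grpc.DialOption {
	// Start from the default connection backoff and cap the delay at 5s.
	bc := backoff.DefaultConfig
	bc.MaxDelay = 5 * time.Second
	return []grpc.DialOption{
		grpc.WithTransportCredentials(insecure.NewCredentials()),
		grpc.WithConnectParams(grpc.ConnectParams{
			Backoff: bc,
		}),
		grpc.WithDefaultCallOptions(CallOptions()...),
		// The retry policy described above is plugged in here.
		grpc.WithDefaultServiceConfig(retryPolicy),
	}
}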

RE 2: I have no idea about the reason. According to the server log, the RPC call was never received. We have set up an interceptor that logs when an RPC is received and when it finishes, so a call that reaches the server is normally logged. When this issue happened, no relevant log was found on the server side. As I mentioned before, this is a rare issue that is difficult to reproduce. Regardless, we still want to know the best-practice course of action. We can manually call the same RPC after some sleep, or we can use gRPC's built-in retry mechanism. I'm still not sure whether the former or the latter would work; if you can provide some insight, I'd appreciate it.

purnesh42H commented on July 23, 2024

Thanks for the details. I will get back to you on transport retries. Meanwhile, to answer your other question, one way to reproduce a network failure on the client preface write is to provide a custom dialer that returns a net.Conn implementation whose Write() is overridden to fail. See WithContextDialer.

purnesh42H commented on July 23, 2024

@zejunlitg in the example retry client, the retry policy lists UNAVAILABLE in RetryableStatusCodes, which is the status code for a client preface write failure, so the client will retry. As mentioned above, you can verify this by providing your own custom dialer that returns a net.Conn implementation.

So, to answer your question, retry policies are the recommended way to deal with transient failures. However, it is preferable to fetch the retry configuration (which is part of the service config) from the name resolver rather than defining it on the client side.

Feel free to reopen the issue if you have any more questions.
