Giter Club home page Giter Club logo

Comments (12)

netogallo avatar netogallo commented on July 19, 2024

Hi, I am trying to get involved in the development of CH and thought about starting with this issue. I just wanted to ask if I understand the issue at a technical level:

Each Node has an EndPoint (localEndPoint) and every time newLocalNode is called, a new endpoint is created for that node so there is one node for every endpoint (even though it is theoretically possible to have multiple nodes for a single endpoint using the createBareLocalNode function). So the issue consists of implementing a function closeConnectionsTo which given a node it should call the close function on all connections in the LocalNodeState? I know nodes and endpoints are different, but I don't find any place where endpoints and connections opened with the endpoint can be associated except the node to which the endpoint belongs. If some light could be shed here would be nice.

Thank you

from distributed-process.

edsko avatar edsko commented on July 19, 2024

No, closeConnectionTo is a purely Network.Transport level construct. You'd be working with Network.Transport and Network.Transport.TCP. You'd also have to think very carefully about how this construct would work in potentially other network transports. This one isn't particularly easy.

from distributed-process.

hyperthunk avatar hyperthunk commented on July 19, 2024

So @edsko it sounds like what we're asking for is a means of tracking all the connections associated with a specific Endpoint and having a kind of bulk-close. And that needs to work at the transport level.

I think the abstraction isn't too hard to get right. If you move the mechanism into the policy layer (i.e., Network.Transport API) and basically provide the data structures and manipulation logic there, then the utility layer (e.g., Network.Transport.Foo or whatever) can simply call into these.

Better still, move the API calls into the policy layer itself and parameterise them by the implementation itself. Say for example, that we have some type class (I know, I'm obsessed with these right!) that we can work with. Somewhere in Network.Transport we might see this kind of code...

type CreateTransportResult = Either (TransportError NewEndPointErrorCode) EndPoint

data TransportInternals a = TransportInternals a

class TransportProvider a where    
  createNew :: a -> TransportConfiguration -> IO CreateTransportResult
  getInternals  :: a -> Transport -> TransportInternals a

createTransport :: (TransportProvider a) => a -> TransportConfiguration -> IO CreateTransportResult
createTransport provider = createNew provider config

Now you can do something similar (with or without the type classes) so that the implementations have to provide some sort of EndPointProvider as well. The key to this is to make sure that the functions which deal with an endpoint are defined in the policy layer, where we can handle tracking them properly. Quite where we choose to maintain this state is debatable. We could use shared memory or go some other way, e.g.,

data EndPoint = EndPoint {
    provider :: EndPointProvider   -- provides the callback functions
  , address :: EndPointAddress
  , connections :: MVar (StrictList Connection)
    --- etc
  }

connectEndpoint :: EndPoint -> Reliability -> ConnectHints -> IO (Either (TransportError ConnectErrorCode) Connection)
connectEndpoint ep r h = do
  addr <- address ep
  impl <- provider ep
  -- let the backend do the real world
  conn <- backendConnect impl addr r h
  case conn of
      Left _ -> return conn
      Right cn -> appendSTM (connections ep) cn >> return conn

disconnectEndpoint :: EndPoint -> IO ()   --- probably want some return value but hey...
disconnectEndpoint ep = do
  conns <- connections ep
  foldrSTM conns closeConnection   -- we can collect the outcomes here...
  writeSTM conns newEmptyStrictList  

Anyway, that's just a sketch, but hopefully you see my point. By moving some of the mechanism into Network.Transport we can handle these kinds of things in that layer, and leave the particulars of mechanism and utility (where you're dealing with specific implementation details) to the backends.

So gentlemen - how does that sound? Certainly it's a bigger refactoring than just implementing closeConnectionTo but does it seem like the right approach in principle?

And if so, how do you feel about picking this up and making a start @netogallo? We'll step in and help as needed of course, as and when we're able to.

from distributed-process.

edsko avatar edsko commented on July 19, 2024

No, I don't think that's the right way to go. You need to add closeConnectionTo to Network.Transport alright, but nothing else. None of the above data structures should go there, because different transport implementations might implemented this in entirely different ways. For instance, Network.Transport.TCP goes to great lengths to make sure there is at most a single TCP connection between any two endpoints. So, in that case, closeConnectionTo is simply a matter of closing that socket, and dealing with the fallout.

Network.Transport is tiny -- it is just the API specification really, and it should probably stay that way.

You are welcome to try and modify Network.Transport.TCP but be aware that it is a very complicated piece of code -- at least, by my standards. Probably the hardest bit of code I've ever written. I don't think closeConnectionTo should be too difficult to add, but the issues are subtle so I can't say for sure.

from distributed-process.

edsko avatar edsko commented on July 19, 2024

As regards getting the abstraction right -- you will need to talk to people involved with other kinds of transports. Is it realistic to demand something like closeConnectionTo in a UDP transport? (Probably.) What about Infiniband (CCI)? (Less sure there.)

from distributed-process.

hyperthunk avatar hyperthunk commented on July 19, 2024

None of the above data structures should go there, because different transport implementations might implemented this in entirely different ways. For instance, Network.Transport.TCP goes to great lengths to make sure there is at most > a single TCP connection between any two endpoints

Hmn, good point, I hadn't thought about that. Clearly it's much better for Network.Transport.TCP to just close the socket, so folding over the lightweight connections and closing them is pointless.

I don't think closeConnectionTo should be too difficult to add, but the issues are subtle so I can't say for sure.

Well if your comment about just closing the socket is right then it doesn't sound too onerous. Where would you expect the API to live? On EndPoint like so?

data EndPoint = EndPoint {
  receive :: IO Event
  --- snip
  disconnectEndpoint :: IO ()

What is the different between closeConnectionTo and closeEndPoint :: IO () that already exists? Is it because basically the other providers might want to tear their connections down without closing the endpoint itself? What's the use-case for that? The TODO in Node.hs says that you're completely disconnecting from the EndPoint because of the error, if I'm reading it right.

It does sound like for TCP that closeConnectionTo == closeEndPoint anyway, perhaps with some added error/exception handling.

from distributed-process.

hyperthunk avatar hyperthunk commented on July 19, 2024

As regards getting the abstraction right -- you will need to talk to people involved with other kinds of transports. Is it realistic to demand something like closeConnectionTo in a UDP transport? (Probably.) What about Infiniband (CCI)? (Less sure there.)

Yes indeed.

What's the use-case for that?

Just realised - you're closing the connections to the other endpoint but not closing the local end!

from distributed-process.

edsko avatar edsko commented on July 19, 2024

Just realised - you're closing the connections to the other endpoint but not closing the local end!

Yes, this is not closeEndpoint. The Network.Transport API has this concept of a 'bundle' of connections between two endpoints. closeConnectionTo closes such a bundle in one go, without closing the endpoints themselves or connections they might have to other endpoints.

Note that that "in one go" is in serious need of expansion. How synchronous is this operation? What about data currently in the network? Etc. Etc. These are not easy questions to answer.

from distributed-process.

netogallo avatar netogallo commented on July 19, 2024

Implementing such function does seem to require careful consideration since users will be expecting somewhat homogeneous behavior in all transports which might be tricky. I will probably return to this issue once I have better understanding about transports, especially since I am not a networks guy. Thanks for the valuable feedback, I believe this discussion will be very useful to tackle the problem, but for the moment I will review CH a little more to figure out where my skills can be most useful. Thank you for the feedback.

from distributed-process.

hyperthunk avatar hyperthunk commented on July 19, 2024

@edsko - we deal with these kinds of problems every day at RabbitMQ and they are indeed very tricky to get right. In fact, I'd say I spend > 30% of my engineering efforts hardening the broker against situations that arise because of unreliable networks, and that's just considering TCP/IP.

The problem with making this operation synchronous is that it can block for an arbitrary period of time, depending on the semantics of the underlying network/transport/protocol/etc. That's not good news when you're trying to shut down, upgrade, restart or whatever. One solution is timeouts, but then you're never sure just how long they should be and of course this should be a matter of policy. For connected protocols like TCP, using heartbeats is usually the right solution, with configurable parameters to control the frequency. Even for disconnected protocols this is usually a good idea.

Making this operation asynchronous is fine, in theory, but there would need to be some kind of completion callback or result on which the caller can block if required.

from distributed-process.

edsko avatar edsko commented on July 19, 2024

Yes, agreed on all the above. My main point is: don't rush into this one :)

from distributed-process.

mboes avatar mboes commented on July 19, 2024

This issue was moved to haskell-distributed/network-transport#25

from distributed-process.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.