
wendy's People

Contributors

alecthomas, edsrzf, jessta


wendy's Issues

difficult to determine bound port when using auto-assigned port

I'm using the following to bind Wendy to all interfaces and pick a free port at random:

node := wendy.NewNode(id, "0.0.0.0", global_ip, region, 0)

This works, according to netstat, but Wendy doesn't seem to update the Node struct with the actual port that gets bound. I don't really want to have to parse netstat output to figure out what port I'm on; it seems like it'd be better if Wendy just saved that info back into the Node after Listen() is called.
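For reference, here is a minimal sketch of how the assigned port can be read back using only Go's standard library. It does not touch Wendy's API; the helper name boundPort is hypothetical, and the idea is only that Wendy could do something equivalent after Listen() binds its socket.

```go
package main

import (
	"fmt"
	"net"
)

// boundPort listens on addr and returns the listener together with the port
// the operating system actually assigned. (Hypothetical helper, not Wendy API.)
func boundPort(addr string) (net.Listener, int, error) {
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		return nil, 0, err
	}
	tcpAddr, ok := ln.Addr().(*net.TCPAddr)
	if !ok {
		ln.Close()
		return nil, 0, fmt.Errorf("unexpected address type %T", ln.Addr())
	}
	return ln, tcpAddr.Port, nil
}

func main() {
	// Port 0 asks the OS to pick any free port.
	ln, port, err := boundPort("127.0.0.1:0")
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	defer ln.Close()
	fmt.Println("bound to port", port)
}
```

The port comes straight from the listener's own address, so there is no need to parse netstat output.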

Proposal: Duplicate to nearby nodes

The Pastry paper makes no mention of duplicating information to nearby Nodes, though some applications the authors built on top of Pastry (PAST, for example) reference the ability to store redundant copies of information in nearby nodes.

I'd propose adding a new Duplicate field to the message body. There are three ways I could see this working:

  • When a message is received with the Duplicate field set to n, where n is an integer above 0, it dispatches a copy of the message (with the Duplicate field set to 0) to the n Nodes on either side of the leaf set. So if a message arrived with the Duplicate field set to 4, the message would be mirrored to the 8 closest Nodes to the receiving Node.
  • When a message is received with the Duplicate field set to true, it checks its redundancy configuration (set with SetRedundancy) and uses that as n to copy to the n Nodes on either side of the leaf set.
  • A combination: the Duplicate field determines the number of Nodes to store the information on, but the redundancy configuration on the Node can set a maximum value.

Option 1 allows for a message to determine its priority on a sliding scale. It could be important enough to store redundantly, but not necessarily important enough to copy to the entire leaf set. Or it could be important enough to copy to the entire leaf set. It's up to the sender to decide that.

Option 2 allows a Node to determine its level of redundancy. A message either needs to be stored redundantly or it does not; there is no sliding scale here. This allows for Nodes to know more about the number of connections they'll be opening when they receive a message like this. I'm not sure what benefit this offers, really, except that it limits the damage a rogue Node in the cluster could do.

Option 3 kind of allows both: a message still gets a say in how important it is, but the Node can lock down how many connections it opens for redundancy messages.

Thoughts? Feedback? Anyone feel strongly about this, either way?

Redundancy can also be achieved by applications using the OnForward callback to store copies of the information as it traverses the cluster, but that doesn't carry guarantees quite as strong. If a message is delivered in a single hop, that information now lives only on a single Node, which makes it volatile. This proposal would let applications create a strong guarantee about the redundancy of their information.
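To make the three options easier to compare, here is a rough sketch of option 3, which reduces to option 1 when the Node sets no maximum. Everything in it is hypothetical: the function names, the use of strings as Node identifiers, and the convention that a maximum of 0 means "no ceiling" are assumptions for illustration, not part of Wendy.

```go
package main

import "fmt"

// replicaCount returns how many Nodes on each side of the leaf set should
// receive a copy. requested is the message's Duplicate value; nodeMax is the
// Node's configured ceiling, where 0 is assumed to mean "no ceiling".
func replicaCount(requested, nodeMax int) int {
	if requested <= 0 {
		return 0
	}
	if nodeMax > 0 && requested > nodeMax {
		return nodeMax
	}
	return requested
}

// pickReplicas takes up to n entries from each side of a leaf set.
// left and right are assumed to be ordered closest-first.
func pickReplicas(left, right []string, n int) []string {
	if n < 0 {
		n = 0
	}
	targets := make([]string, 0, 2*n)
	for _, side := range [][]string{left, right} {
		k := n
		if k > len(side) {
			k = len(side)
		}
		targets = append(targets, side[:k]...)
	}
	return targets
}

func main() {
	left := []string{"L1", "L2", "L3", "L4"}
	right := []string{"R1", "R2", "R3", "R4"}
	// The message asks for 4 per side, but this Node caps redundancy at 2.
	n := replicaCount(4, 2)
	fmt.Println(pickReplicas(left, right, n))
}
```

The copies sent to the chosen Nodes would carry Duplicate set to 0, as described in option 1, so they are not mirrored again.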

join announcement is delayed

Upon joining a cluster, I noticed that the actual announcement is significantly delayed. In particular, the debug output shows that a node has joined a full 20 seconds before the OnNodeJoin or OnNewLeaves events are fired.

I tracked this down to line 646 of cluster.go. My understanding from reading through the code and the debug output is that the join process goes like this (please slap me if I've gone awry):

  • send join message to cluster
  • cluster accepts join message and node into cluster (debug output says "Node ... joined!")
  • cluster sends state tables back to the new node
  • the new node doesn't know it has successfully joined yet, waits for 2 * NETWORK_TIMEOUT and announces presence
  • after announcement is sent, the node proclaims itself joined
  • the OnNodeJoin and OnNewLeaves events are fired for all nodes in the cluster

I'm guessing that there is a reason why there is a delay set to 2 * NETWORK_TIMEOUT, but I'm not sure what it is. (Truthfully, my networking skills are pretty poor, so I dare not hazard a guess.)

I would be very happy to work on a fix for this problem, but I'm not sure what the fix would look like yet. Therefore, I am seeking guidance. :-)

My inclination is to try to announce the node's presence immediately and, if that fails, try again after a longer timeout. I just don't know what "if it fails" means in this context.

Thanks!

Exit when port is in use

In some use cases (https://gist.github.com/4306474) it's possible to get a really confusing error message (throw: all goroutines are asleep - deadlock!) if a port is in use when Wendy attempts to use it. Need to investigate and make sure that this is just me being dumb in how I wrote the test script, not a shortcoming of Wendy's.
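As a point of comparison, this is roughly how a port conflict surfaces with Go's standard library alone. The helper listenOrReport is hypothetical and this is not how Wendy's Listen() is written; it only shows that the bind error is available and can be reported directly instead of ending in a deadlock message.

```go
package main

import (
	"fmt"
	"net"
)

// listenOrReport tries to bind addr and wraps any failure in a message that
// names the address, so the caller can exit cleanly instead of blocking.
func listenOrReport(addr string) (net.Listener, error) {
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		return nil, fmt.Errorf("cannot listen on %s: %w", addr, err)
	}
	return ln, nil
}

func main() {
	first, err := listenOrReport("127.0.0.1:0")
	if err != nil {
		fmt.Println(err)
		return
	}
	defer first.Close()

	// Binding the same address a second time should fail: the port is taken.
	second, err := listenOrReport(first.Addr().String())
	if err != nil {
		fmt.Println(err)
		return
	}
	second.Close()
	fmt.Println("unexpectedly bound the port twice")
}
```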

is there an idiom for ignoring regions?

To me, ignoring regions means that the "global IP" should always be used and the "local IP" should never be used. One sensible way to do this is to use the same region for every node (the empty string, perhaps) and always set the local and global IP addresses to be the same. However, judging by this code, the local IP ends up getting used. That's probably OK, but it does seem a little weird.

So I guess my question is: is this idiom acceptable? I think it might be cleaner to special-case a zero value. For example, if the region is empty, then the global IP is always used. Or if the local IP is empty, then the global IP is always used.

I realize this is a small point, but I was slightly confused by it until I read the full story on what regions were.
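To illustrate the zero-value idea, here is a small sketch of what the address selection could look like. The peer type and the dialIP function are hypothetical stand-ins, not Wendy's actual Node type or routing code, and the sketch applies both suggested special cases at once.

```go
package main

import "fmt"

// peer is a hypothetical stand-in for a node's addressing fields.
type peer struct {
	Region   string
	LocalIP  string
	GlobalIP string
}

// dialIP picks the address self should use to reach other. An empty region
// or an empty local IP is treated as "ignore regions" and always yields the
// global IP; otherwise the local IP is used only within the same region.
func dialIP(self, other peer) string {
	if other.LocalIP == "" || other.Region == "" || self.Region != other.Region {
		return other.GlobalIP
	}
	return other.LocalIP
}

func main() {
	me := peer{Region: "", LocalIP: "10.0.0.1", GlobalIP: "203.0.113.1"}
	them := peer{Region: "", LocalIP: "10.0.0.2", GlobalIP: "203.0.113.2"}
	fmt.Println(dialIP(me, them))
}
```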

Take another look at the Neighborhood Set

For many hundreds of thousands (or millions) of nodes, the memory cost and network communication cost of storing the entire network on every node is prohibitive. It also runs counter to the spirit of the paper. The Neighborhood Set was originally omitted from the implementation to keep its complexity to a minimum, but it probably deserves another look at this point.

Switch to a trie

As @edsrzf and I have been discussing, a trie would probably be a better data structure choice for the routing table.
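For readers unfamiliar with the idea, here is a very small sketch of a 16-way trie keyed on hexadecimal ID digits. It is only an illustration under assumptions: the type and method names are hypothetical, IDs are assumed to be lowercase hex strings, and nothing here reflects the design actually discussed for Wendy.

```go
package main

import "fmt"

// trieNode is one level of a 16-way trie keyed on hexadecimal digits.
type trieNode struct {
	children [16]*trieNode
}

// hexVal converts one lowercase hex digit to its value.
func hexVal(c byte) (int, bool) {
	switch {
	case c >= '0' && c <= '9':
		return int(c - '0'), true
	case c >= 'a' && c <= 'f':
		return int(c-'a') + 10, true
	}
	return 0, false
}

// insert stores id and reports false, without changing the trie, if id
// contains anything other than lowercase hex digits.
func (t *trieNode) insert(id string) bool {
	for i := 0; i < len(id); i++ {
		if _, ok := hexVal(id[i]); !ok {
			return false
		}
	}
	cur := t
	for i := 0; i < len(id); i++ {
		v, _ := hexVal(id[i])
		if cur.children[v] == nil {
			cur.children[v] = &trieNode{}
		}
		cur = cur.children[v]
	}
	return true
}

// sharedPrefix returns how many leading digits of key are shared with at
// least one stored ID.
func (t *trieNode) sharedPrefix(key string) int {
	cur := t
	n := 0
	for n < len(key) {
		v, ok := hexVal(key[n])
		if !ok || cur.children[v] == nil {
			break
		}
		cur = cur.children[v]
		n++
	}
	return n
}

func main() {
	root := &trieNode{}
	for _, id := range []string{"a1f3", "a1b2", "c000"} {
		root.insert(id)
	}
	fmt.Println(root.sharedPrefix("a1b9"))
}
```

Prefix-based routing asks "which known Node shares the longest prefix with this key", and a trie answers that by walking one digit per level.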

Wendy crashes when telnetted on listening port.

Hi All,

I am writing an application based on Wendy, but Wendy crashes badly when an internet scan hits it.
The cluster starts and listens on the port (30000 in the example), but if I just do:

echo quit | telnet localhost 30000

the result is:

wendy(c29bf87c157cb90742b2d1553e7d09b6) 2015/04/03 16:54:43 invalid character 'q' looking for beginning of value
panic: invalid character 'q' looking for beginning of value

goroutine 19 [running]:
_/home/uriel/Documents/gitio/averno/legione.(*debugWendy).OnError(0x8b3e98, 0x7fa52f32d110, 0xc20801e000)
    /home/uriel/Documents/gitio/averno/legione/laterali.go:28 +0x74
github.com/secondbit/wendy.(*Cluster).fanOutError(0xc208076280, 0x7fa52f32d110, 0xc20801e000)
    /home/uriel/Documents/gitio/golib/src/github.com/secondbit/wendy/cluster.go:393 +0x1ca
github.com/secondbit/wendy.(*Cluster).handleClient(0xc208076280, 0x7fa52f32d088, 0xc208038070)
    /home/uriel/Documents/gitio/golib/src/github.com/secondbit/wendy/cluster.go:441 +0x1aa
created by github.com/secondbit/wendy.(*Cluster).Listen
    /home/uriel/Documents/gitio/golib/src/github.com/secondbit/wendy/cluster.go:305 +0xc2f

goroutine 1 [select (no cases)]:
_/home/uriel/Documents/gitio/averno/legione.Initialize()
    /home/uriel/Documents/gitio/averno/legione/legione.go:55 +0x647
main.main()
    /home/uriel/Documents/gitio/averno/main.go:30 +0x1f

goroutine 17 [syscall, locked to thread]:
runtime.goexit()
    /usr/local/go/src/runtime/asm_amd64.s:2232 +0x1

goroutine 5 [IO wait]:
net.(*pollDesc).Wait(0xc208010220, 0x72, 0x0, 0x0)
    /usr/local/go/src/net/fd_poll_runtime.go:84 +0x47
net.(*pollDesc).WaitRead(0xc208010220, 0x0, 0x0)
    /usr/local/go/src/net/fd_poll_runtime.go:89 +0x43
net.(*netFD).readFrom(0xc2080101c0, 0xc20807db08, 0x400, 0x400, 0x0, 0x0, 0x0, 0x7fa52f32ba08, 0xc20800a9b0)
    /usr/local/go/src/net/fd_unix.go:269 +0x4a1
net.(*UDPConn).ReadFromUDP(0xc208038030, 0xc20807db08, 0x400, 0x400, 0xc20803ac00, 0x0, 0x0, 0x0)
    /usr/local/go/src/net/udpsock_posix.go:67 +0x124
github.com/prestonTao/upnp.(*SearchGateway).send(0xc20801e600, 0xc20803e120)
    /home/uriel/Documents/gitio/golib/src/github.com/prestonTao/upnp/SearchGatewayMsg.go:82 +0x6ea
created by github.com/prestonTao/upnp.(*SearchGateway).Send
    /home/uriel/Documents/gitio/golib/src/github.com/prestonTao/upnp/SearchGatewayMsg.go:31 +0x7a

goroutine 8 [IO wait]:
net.(*pollDesc).Wait(0xc208010290, 0x72, 0x0, 0x0)
    /usr/local/go/src/net/fd_poll_runtime.go:84 +0x47
net.(*pollDesc).WaitRead(0xc208010290, 0x0, 0x0)
    /usr/local/go/src/net/fd_poll_runtime.go:89 +0x43
net.(*netFD).readFrom(0xc208010230, 0xc20807eb08, 0x400, 0x400, 0x0, 0x0, 0x0, 0x7fa52f32ba08, 0xc20800aa48)
    /usr/local/go/src/net/fd_unix.go:269 +0x4a1
net.(*UDPConn).ReadFromUDP(0xc208038048, 0xc20807eb08, 0x400, 0x400, 0xc20803acc0, 0x0, 0x0, 0x0)
    /usr/local/go/src/net/udpsock_posix.go:67 +0x124
github.com/prestonTao/upnp.(*SearchGateway).send(0xc20801e6a0, 0xc20803e180)
    /home/uriel/Documents/gitio/golib/src/github.com/prestonTao/upnp/SearchGatewayMsg.go:82 +0x6ea
created by github.com/prestonTao/upnp.(*SearchGateway).Send
    /home/uriel/Documents/gitio/golib/src/github.com/prestonTao/upnp/SearchGatewayMsg.go:31 +0x7a

goroutine 10 [select]:
github.com/secondbit/wendy.(*Cluster).Listen(0xc208076280, 0x0, 0x0)
    /home/uriel/Documents/gitio/golib/src/github.com/secondbit/wendy/cluster.go:296 +0xcda
_/home/uriel/Documents/gitio/averno/legione.func·001()
    /home/uriel/Documents/gitio/averno/legione/legione.go:42 +0x61
created by _/home/uriel/Documents/gitio/averno/legione.Initialize
    /home/uriel/Documents/gitio/averno/legione/legione.go:47 +0x640

goroutine 11 [IO wait]:
net.(*pollDesc).Wait(0xc208010300, 0x72, 0x0, 0x0)
    /usr/local/go/src/net/fd_poll_runtime.go:84 +0x47
net.(*pollDesc).WaitRead(0xc208010300, 0x0, 0x0)
    /usr/local/go/src/net/fd_poll_runtime.go:89 +0x43
net.(*netFD).accept(0xc2080102a0, 0x0, 0x7fa52f32ba08, 0xc20800aca8)
    /usr/local/go/src/net/fd_unix.go:419 +0x40b
net.(*TCPListener).AcceptTCP(0xc208038068, 0x645d60, 0x0, 0x0)
    /usr/local/go/src/net/tcpsock_posix.go:234 +0x4e
net.(*TCPListener).Accept(0xc208038068, 0x0, 0x0, 0x0, 0x0)
    /usr/local/go/src/net/tcpsock_posix.go:244 +0x4c
github.com/secondbit/wendy.func·001(0x7fa52f32d000, 0xc208038068, 0xc20803e300)
    /home/uriel/Documents/gitio/golib/src/github.com/secondbit/wendy/cluster.go:286 +0x37
created by github.com/secondbit/wendy.(*Cluster).Listen
    /home/uriel/Documents/gitio/golib/src/github.com/secondbit/wendy/cluster.go:294 +0x9d0

Is there a specific reason for this, or is it a bug? Is there a way to avoid it?

thanks in advance

Uriel

https://github.com/uriel-fanelli/averno
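One observation from the trace above: the panic is raised inside the application's own OnError method (laterali.go:28), which Wendy reaches through fanOutError from handleClient. So it appears that Wendy reported the malformed input as an error and the callback turned it into a panic. Below is a minimal sketch of a callback that logs instead. The loggingApp type is hypothetical, the OnError(err error) signature is inferred from the trace rather than checked against Wendy's source, and decodeMessage only imitates reading one JSON message.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"strings"
)

// loggingApp is a hypothetical application type. Its OnError method is
// assumed to match the single-error callback seen in the stack trace.
type loggingApp struct {
	errors int
}

// OnError counts and logs the error instead of panicking, so one malformed
// connection does not take the whole process down.
func (a *loggingApp) OnError(err error) {
	a.errors++
	log.Printf("wendy error (ignored): %v", err)
}

// decodeMessage imitates reading one JSON message from a connection.
func decodeMessage(raw string) (map[string]interface{}, error) {
	var msg map[string]interface{}
	if err := json.NewDecoder(strings.NewReader(raw)).Decode(&msg); err != nil {
		return nil, err
	}
	return msg, nil
}

func main() {
	app := &loggingApp{}
	// "quit" is what the telnet test sends; it is not valid JSON.
	if _, err := decodeMessage("quit\n"); err != nil {
		app.OnError(err)
	}
	fmt.Println("still running, errors seen:", app.errors)
}
```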

Better end-to-end integration tests

We currently test all the pieces of the algorithm, but don't have automated tests for the overall algorithm itself. If Wendy is ever going to exit alpha status and be usable in production, we need better tests that show that the algorithm itself works as expected, in addition to all its component parts.

Request for Feedback: Naming and customising

As Pastry grows, it's going to want to move away from its original purpose, which was to be a vanilla implementation of the Pastry DHT algorithm.

This raises two questions: should we allow it to evolve outside the original specification? If so, should we change the name?

The first question addresses what the community wants. This package grew far more popular than we ever anticipated, so I'm not sure where the value proposition with the community lies. Do people like it because it enables distributed computing, meaning we can stray from the paper as much as we like, as long as it functions on the same basic principles? Or do people like it because it tries hard to remain true to the spirit of the paper, in which case we should focus our efforts on fulfilling the designs of the paper?

The second is ultimately dependent on the first. If we decide the software should be limited to the scope of the paper, I think naming it after the paper is appropriate. If, however, the software should be limited only to providing a package for distributed networks, I feel it might be disingenuous or misleading to retain the name Pastry, if that is not what the software actually is anymore. The alternate name I would use, were I to change the name, would be "Wendy"--this package was developed to power "Peter", our distributed messaging framework, so "Wendy" would fit in well with that theme. If such a name change were to occur, how should it be handled? Should we fork the project, and link from the README to the new project? Should we (as the project is only a few days old) simply rename the repo, and hope for the best? Any thoughts or opinions? Fortunately, the attention was all linked to http://secondbit.org/blog/introducing-pastry/ and http://secondbit.org/pastry, so we shouldn't strand too many people or break too many links if the GitHub repo changes its name.

This is clearly not just my pet project anymore, so I'll be trying to gauge the interest and opinions of the community as we do things with the package. I'm loath to set up a mailing list for the package, but I feel like we may have to. Any opinions on that?

"Detected race condition"

I get the following error quite often when a new node joins an existing cluster:

Leaf set changed:  [0xc200106f50]
Node joined:  34623439663430632d663238632d3462
wendy(616a736b64686661736b6a6664686173) 2013/01/18 09:41:49 Detected race condition. 34623439663430632d663238632d3462 sent the message at 2013-01-18 09:41:49.681600935 -0500 EST, but we last updated our state tables at 2013-01-18 09:41:49.68245834 -0500 EST. Notifying 34623439663430632d663238632d3462.

This is during development, when both nodes are running on the same machine.

Is this expected? If so, perhaps using a less "scary" term than "race condition" might be good.
