pants's Introduction

Pants is complete and no longer actively maintained.

Pants is a lightweight framework for writing asynchronous network applications in Python. Pants is simple, fast and elegant.

Pants is available under the Apache License, Version 2.0

Docs

Check out the documentation at pantspowered.org

Install

Pants can be installed using pip:

pip install pants

You can also grab the latest code from the git repository:

git clone git://github.com/ecdavis/pants

Pants requires Python 2.7 - Python 3 is not yet supported.

Examples

Here's an absurdly simple example - an echo server:

from pants import Engine, Server, Stream

class Echo(Stream):
    def on_read(self, data):
        self.write(data)

Server(Echo).listen(4040)
Engine.instance().start()

Want a stupidly fast web server? Got you covered:

from pants.web import Application

app = Application()

@app.route('/')
def hello(request):
    return "Hello, World!"

app.run()

pants's People

Contributors

aphonicchaos, ecdavis, stendec

pants's Issues

Painless Address Families (IPv4, IPv6, and UNIX)

I propose that the creation of sockets in instances of _Channel, Stream, and Datagram be delayed until an address family is known (rather than in __init__() as happens currently), allowing any instance to connect to either an IPv4, an IPv6, or a unix address.

Based upon the input given to the call to either connect() or listen(), it can be determined which address family is in use, and an appropriate socket object may be created at that time. This would make the pants.unix module unnecessary, and make things generally much easier on the end developer.

Valid Address Formats:

  • If a single string is given as an address, the address family is AF_UNIX. An exception can, at that point, be raised on platforms where unix sockets are unavailable.
  • Given a tuple or list of four elements, the address family is AF_INET6, formatted as (host, port, flowinfo, scopeid). The last two may be None if flowinfo and scopeid are unavailable. An exception can, at that point, be raised on platforms where socket.has_ipv6 is False.
  • For a tuple or list of just two elements, it is less clear. Generally, such a tuple will represent an AF_INET address. However, an IPv6 address may also be supplied without flowinfo and scopeid. As such, the given host should be checked for validity as both an IPv4 and an IPv6 address. If it is neither, it is assumed to be a hostname, which should then be looked up. Both A and AAAA records should be checked, and a family chosen based on the resulting address.

Examples:

  • AF_UNIX: "/tmp/mysql.sock"
  • AF_INET: ("", 80), ("10.1.1.52", 80), ("www.google.com", 80)
  • AF_INET6: ("", 80, None, None), ("::1", 80), ("ipv6.google.com", 80)

Any addresses that do not match one of the provided address formats should raise an exception.
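
The dispatch rules above can be sketched with the standard socket module alone. This is only an illustration, not Pants code: the helper name address_family is hypothetical, and it returns None for a hostname instead of performing the A/AAAA lookup the proposal calls for.

```python
import socket


def address_family(address):
    """Guess the socket family implied by an address (hypothetical helper)."""
    if isinstance(address, str):
        # A single string is treated as a UNIX socket path.
        if not hasattr(socket, "AF_UNIX"):
            raise ValueError("UNIX sockets are unavailable on this platform")
        return socket.AF_UNIX

    if isinstance(address, (tuple, list)):
        if len(address) == 4:
            # (host, port, flowinfo, scopeid) is always IPv6.
            if not socket.has_ipv6:
                raise ValueError("IPv6 is unavailable on this platform")
            return socket.AF_INET6

        if len(address) == 2:
            host = address[0]
            if host == "":
                # An empty host is taken to mean "any IPv4 interface",
                # following the AF_INET examples above.
                return socket.AF_INET
            for family in (socket.AF_INET, socket.AF_INET6):
                try:
                    socket.inet_pton(family, host)
                except socket.error:
                    continue
                return family
            # Neither literal form parsed, so assume a hostname that
            # still needs an A/AAAA lookup before a family is chosen.
            return None

    raise ValueError("unrecognised address format: %r" % (address,))
```

A caller would create the socket only once this returns a family, and would resolve the hostname first when it returns None.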

Outdated examples in docs folder

They're not linked anywhere, but the examples in docs/examples/ are old and broken. They should be deleted from the repo, as there are plenty of good examples throughout the current docs.

pants-0.10.0: _update_addr() is too specific for Channel

The _update_addr() method (and the remote_addr and local_addr attributes) are too protocol-specific for the Channel class and should be moved up one level to the Stream/Datagram classes. This will pose further problems in the future when we want to implement AF_UNIX protocols, as they have different address formats (so it will be difficult to have a Stream class, for instance, that works with both AF_INET and AF_UNIX).

Data Received from Closing Socket is Ignored

Because of how the read event works, any buffered data will be destroyed if the socket is closed while reading, rather than being processed and sent on to on_read before on_close is called. I just noticed this while implementing a basic, raw HTTP client as an example for someone.

The following code hits upon the problem:

import sys

from pants import Stream, engine


class RawHttpClient(Stream):
    def on_connect(self):
        self.write("\r\n".join([
            "GET %s HTTP/1.1" % self.path,
            "Host: %s" % self.host,
            "Connection: close",

            # Finish with an extra blank line.
            "\r\n"
            ]))

        self.read_delimiter = "\r\n\r\n"

    def on_read(self, data):
        print "Got headers!"
        print data
        print "-" * 80

        self.read_delimiter = None
        self.on_read = self.read_body

    def read_body(self, data):
        sys.stdout.write(data)

    def on_close(self):
        print ""
        print "-" * 80
        print "Done."
        print ""

        engine.stop()


client = RawHttpClient()
client.host = "www.google.com"
client.path = "/"


client.connect(("www.google.com", 80))
print client

engine.start()

I'd expect this to print the full source code of the Google homepage, but the last chunk gets cut off.

Multiple engine support

Add support for multiple engines. It should be sufficient to pass "engine" as a kwarg to _Channel.__init__() and use that value rather than Engine.instance().

Datagram should switch to listening mode when data is written

SOCK_DGRAM sockets automatically bind to a local address when data is written to them. Pants should alter its state to reflect this by setting _listening to True and updating events (if necessary) either directly before a call to sendto() or directly after. Not sure which.

Run Pants in its own thread

Add the ability to run Pants in its own thread. This should have absolutely no effect on the usability or performance of "regular" Pants and, ideally, should be as simple as passing Engine.start to a Thread object.

In practice it will probably be slightly more complicated due to the global nature of the engine. One possibility is requiring the developer to create a new Engine instance and pass that instance to all their channels before starting it up in a thread. This seems like a fairly unpleasant API, though, and would probably mean the whole thing would break if someone didn't follow the requirement...

Quandary!

Resolve error over-reaction.

Pants' default reaction to an error on a channel is to close the channel. This is not necessary in all cases, and contributes to a sharp decline in performance at high scale.

Improve the timer API

Timers are kinda weird. Improve them.

Possibilities:

  • Use datetimes rather than UNIX timestamps.
  • Allow the use of seconds or timedelta delays.
  • Base expiry time off actual current time rather than latest poll time.
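
As a rough sketch of the second and third possibilities, a single helper could accept either form of delay and anchor it to the real clock. The name expiry_time and its now parameter are hypothetical and not part of the Pants timer API.

```python
import time
from datetime import timedelta


def expiry_time(delay, now=None):
    """Return an absolute expiry timestamp for a delay (hypothetical helper).

    delay may be a number of seconds or a timedelta. The result is based
    on the actual current time unless now is supplied explicitly.
    """
    if now is None:
        now = time.time()
    if isinstance(delay, timedelta):
        delay = delay.total_seconds()
    return now + float(delay)
```

Passing now explicitly is only there to make the helper easy to test. A datetime-based variant would return a datetime instead of a float.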

Multicast Support

Datagram instances should be able to easily handle multicast.

Sending

For sending multicast packets, there should be a convenience function that sets the IP_MULTICAST_TTL option. Fortunately, Datagram.listen() already makes it easy to call bind() on the socket. However, a helper that performs both steps at once may be useful. Perhaps include IP_MULTICAST_TTL in listen()?

Receiving

Receiving multicast is much more involved than sending, given that a socket must subscribe itself to the multicast group it's interested in. This could be done in a register() method, or perhaps a connect() method to keep with standard method names.
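
For reference, both halves come down to a setsockopt() call on a plain UDP socket. The sketch below uses only the standard socket and struct modules. The helper names are hypothetical, and nothing here reflects how Datagram would actually expose the feature.

```python
import socket
import struct


def make_mreq(group, interface="0.0.0.0"):
    """Build the membership request used by IP_ADD_MEMBERSHIP.

    It is the packed group address followed by the packed address of
    the local interface ("0.0.0.0" lets the OS choose).
    """
    return socket.inet_aton(group) + socket.inet_aton(interface)


def set_multicast_ttl(sock, ttl=1):
    """Sending side: limit how many hops outgoing packets may travel."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL,
                    struct.pack("b", ttl))


def join_group(sock, group, interface="0.0.0.0"):
    """Receiving side: subscribe a bound UDP socket to a multicast group."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    make_mreq(group, interface))
```

A register() or connect() method as suggested above would presumably wrap something like join_group().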

Fix time precision problems on Windows

The precision of the time.time() function on Windows is too low for Pants to accurately execute cycle functions within 0.01 seconds of when they should be executed. This may affect other platforms too, but it does not affect OS X or Linux.

One possible solution is to use the clock() function to provide extra precision. This could be done for Windows only, or for all platforms. Further experimentation is required.

Auto-Flush

Try flushing all data at the end of every Engine.poll call and see what happens.

Non-Blocking Connect

Stream.connect needs to use pants.dns for resolving hostnames, rather than relying on the socket.connect internals to do it, as socket.connect uses a blocking function internally, and thus itself blocks--even with non-blocking sockets.

If pants.dns fails to resolve a hostname, Pants could then fall back to sending the domain name to connect and hoping the OS itself has better luck resolving the name. There could be a keyword argument for connect to force it to use the OS for resolving as well.

This will require a function capable of differentiating between a domain and an IP address (with support for both IPv4 and IPv6). Such a function should actually be built into pants.dns at some point.

Update exception syntax to be more modern.

Currently, the prevalent exception-catching syntax throughout the codebase is:

except Exception, err:

it should be updated to the more modern equivalent:

except Exception as err:

Use repr(self) in log messages

Rather than doing:

log.info("Something happened on %s #%d" % (self.__class__.__name__, self.fileno))

do:

log.info("Something happened on %r" % self)

and simply implement a __repr__ method in _Channel that returns an appropriate identifier string. Avoid using fileno, because it can be None sometimes.
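
One possible shape for that method is sketched below. The classes are stand-ins reduced to what the example needs, and using id(self) as the identifier is an assumption, not something the issue specifies.

```python
class _Channel(object):
    """Stand-in for the real class, for illustration only."""

    def __init__(self):
        self.fileno = None  # may stay None until a socket exists

    def __repr__(self):
        # Identify the channel by class name and object id, so the
        # result is usable even when fileno is None.
        return "<%s at 0x%x>" % (self.__class__.__name__, id(self))


class Stream(_Channel):
    pass


message = "Something happened on %r" % Stream()
```

Because __repr__ reads the class name at call time, subclasses such as Stream get a sensible identifier without overriding anything.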

SSL support

What the title says. SSL support in core Pants.

pants.contrib.web - Per-Domain Routing

There should be an easy way to route requests with different Host headers to different handlers. Either a new class that does the splitting before requests reach Application instances, or a modification to Application.route and the routing internals.

The routing internals would probably be best:

from pants.contrib.web import *
app = Application()

@app.route("www.example.com/")
def hello():
    return "www.example.com's homepage!"

@app.route(".example.com/")
def hello2():
    return "Subdomains of example.com's homepage!"

Routes will always start with a /, so it can be assumed that any text before the first / is a domain for Host matching. Further, requests to example.com should, unless otherwise specified, be assumed equivalent to www.example.com, thus making www the default subdomain.
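
The splitting rule is simple enough to sketch in a few lines. Both helper names are hypothetical, the sketch ignores any port in the Host header, and it does not implement the www default described above.

```python
def split_route(route):
    """Split "domain/path" into (domain, path); domain is None if absent."""
    domain, _, path = route.partition("/")
    return (domain or None), "/" + path


def host_matches(pattern, host):
    """Check a Host header value against a route's domain pattern."""
    if pattern is None:
        return True  # route was declared without a domain
    if pattern.startswith("."):
        return host.endswith(pattern)  # ".example.com" covers subdomains
    return host == pattern
```

With these, "www.example.com/" splits into ("www.example.com", "/") and a plain "/about" keeps working as a route for every host.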

Fix contrib/DNS breakage.

Recent changes to Stream/Datagram will probably have caused some minor breakage in various contrib modules as well as pants.util.dns - fix any issues that have arisen. Potential issues with:

  • Separation of Stream into Stream and StreamServer.
  • Address format changes - both in remote_addr/local_addr attributes and in connect/listen/write methods, but might have caused more widespread issues.
  • Removal of pants.datagram.sendto function.
  • Any use of the Channel class (there shouldn't be any) will now have to use the _Channel class.

Add some sort of error reporting method.

There should be a standard interface for errors involving a Channel to be passed to user code when the error is the result of a non-blocking function call, or any other situation in which a raised exception would not reach user code.

For example, if a channel fails to connect to a remote host because the provided host name is invalid, there isn't a clear way to return that at this point. I suppose it should be implemented as something like Channel.on_error.

Issues are, how should different errors be identified? As instances of exceptions passed as an argument to that function? And what about closing a channel if it's active when the error occurs?

Additional Types for read_delimiter

read_delimiter is nice, but it isn't as flexible as it could be. To that end, the following types should be accepted:

1. Regular Expressions

import re
from pants import Connection

class Test(Connection):
    def on_connect(self):
        self.read_delimiter = re.compile(r"(?:Hello|Goodbye) world.")

Should on_read be passed the match object in addition to the data?

2. struct_delimiter

struct is a useful module, and in many cases, people will use it to read received network data. I propose that a new type, struct_delimiter, be created to make it easy to read binary network protocols.

from pants import Connection, struct_delimiter

class Test(Connection):
    def on_connect(self):
        self.read_delimiter = struct_delimiter("!I2H")
    def on_read(self, id, length, stuff):
        pass

Using struct_delimiter should automatically parse the data before sending it to on_read. Additionally, if a byte-order isn't specified in the construction string, network order should be assumed for obvious reasons.
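
A minimal sketch of such a type, built on struct.Struct from the standard library, might look like this. The consume() method is a hypothetical name for whatever hook the read loop would call. It is not an existing Pants interface.

```python
import struct


class struct_delimiter(object):
    """Sketch of the proposed delimiter type (not an existing Pants class)."""

    def __init__(self, fmt):
        # Assume network byte order when the format does not name one.
        if not fmt.startswith(("@", "=", "<", ">", "!")):
            fmt = "!" + fmt
        self._struct = struct.Struct(fmt)
        self.size = self._struct.size

    def consume(self, buffer):
        """Return (values, rest), or (None, buffer) if more data is needed."""
        if len(buffer) < self.size:
            return None, buffer
        values = self._struct.unpack(buffer[:self.size])
        return values, buffer[self.size:]
```

The unpacked tuple could then be passed to on_read as positional arguments, matching the on_read(self, id, length, stuff) signature in the example above.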

PEP-8 compliance

Clean up the source so that it is PEP-8 compliant where appropriate.

Python 3 Support

My py3 branch[1] implements support for Python 3 in Pants with the caveat that pants.http hasn't been ported. I've created this issue as a place to centralize discussion around the port until @ecdavis creates a py3 branch to link a pull request to. By merging into another branch, we avoid the possibility of me destroying 3 years of prior work.

[1] https://github.com/aspidites/pants

WebSockets standard

WebSockets implementation needs to be rewritten to adhere to the standard. Have fun!

pants-0.10.0: Complete core documentation.

Docstrings for the core modules must be completed before the 0.10.0 release. Documents for the modules themselves must be drafted, as well as a simple walkthrough and some examples.

HTTPClient Revamp

I really need to fix up HTTPClient with better functionality for it to be useful. At this point, the interface feels hackish to me, and it doesn't even support everything an HTTP client should support.

Desired Features

  • Full HTTP/1.1 Support
  • HTTP Proxies
  • Basic/Digest Authentication
  • Multipart Request Bodies, including Files.
  • Persistent Cookies
  • Redirection History
  • Full Unicode Support

I'm unsure at this point if the callback upon receiving a request should return two values, (status, response), or simply a response object with a non-standard status to represent errors such as connection timeouts, invalid responses, or redirect loops.

For inspiration, I should check out requests (https://github.com/kennethreitz/requests), since that seems really popular these days.

sendfile is broken on OS X

So sendfile doesn't work properly on OS X. It will send some number of bytes but return 0 as the number of bytes sent. This stops Pants from updating the offset, causing it to resend the same bytes repeatedly until the client disconnects or OS X decides to return the number of bytes it actually sent.

Tested this in a variety of ways:

  • When nbytes is 0, sendfile seems to always return 0 bytes sent.
  • When nbytes is size of file, sendfile seems to always return 0 bytes sent.
  • When nbytes is 65535, sendfile starts off returning 65535 bytes sent but then it becomes intermittent - sometimes returning the correct number, other times 0.
  • When nbytes is 1024, sendfile seems to work properly.

The key thing to remember is that when it reports that it has sent 0 in these cases it has actually sent data to the remote host! This can be confirmed by running a fileserver and using wget or a browser to download a file. You will find that the download finishes (i.e. the client has received all the bytes it expected to receive and closed the connection) but the file hash does not match the original. Since the offset is not updated, this leads me to conclude that the same chunks of the file are getting sent repeatedly.

pants-0.10.0: Standardise use of "status methods" vs. attribute access

Channel, Stream and Datagram have various "status methods" (closed(), active(), connected(), listening()) that are simply wrappers for direct attribute access. Using these methods internally can result in significant performance loss due to the overhead of calling a method vs. accessing an attribute directly many times a second.

Use of these methods should be eliminated internally and it should be noted in their implementations that they are for external use only.

Fix Datagram and DNS and re-enable Pants resolve

Datagram is woefully outdated and needs to be completely rewritten and brought up to Pants standard. Once this has been done, DNS needs to be fixed to work with it.

Pants resolve was disabled in commit c2d2f08 and once Datagram and DNS have been fixed, it should be re-enabled.

Internal buffer size limit.

If the developer specifies a string read_delimiter that is never found, the internal _recv_buffer attribute will keep growing and growing. It may be a good idea to put an upper limit on its size to prevent memory usage from exploding in those cases.
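
The check itself could be as small as the sketch below. The function, the exception and the max_size parameter are all hypothetical. How Pants should react when the limit is hit (close the channel, call an error handler) is left open here.

```python
class BufferLimitExceeded(Exception):
    """Raised when buffered data outgrows the configured limit."""


def buffer_incoming(buffer, data, delimiter, max_size):
    """Append data and return (chunk, rest) once the delimiter appears.

    Returns (None, buffer) while the delimiter has not been seen, and
    raises BufferLimitExceeded if the buffer grows past max_size first.
    """
    buffer += data
    if delimiter in buffer:
        chunk, _, rest = buffer.partition(delimiter)
        return chunk, rest
    if len(buffer) > max_size:
        raise BufferLimitExceeded("%d bytes buffered without a delimiter"
                                  % len(buffer))
    return None, buffer
```

Checking the limit only after looking for the delimiter means a large but complete message is still delivered.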

Change the point at which on_close() is called

Currently, _Channel.on_close() is called after the channel has been completely cleaned up. This means that it's impossible to access any channel-specific state (buffered data, addresses, filenos, etc.) in on_close() callbacks.

Consider altering the code to call on_close() before cleanup starts.

Update WebSockets for 0.10.

Currently, the websocket code is old, broken, and ugly. It should be updated to both work on 0.10, and use the latest websocket draft. It should also have better integration with the other HTTP stuff, and maybe a debugger.

pants-0.10.0: Rename Channel._readable and Channel._writable

The Channel._readable and Channel._writable attributes have misleading names. When _readable is False, the Channel will not read data until it receives a read event from the Engine. Similarly, when _writable is False, data will not be written until a write event arrives. When _readable or _writable are True, however, the Channel will simply read/write data without bothering to check if it can. If an error occurs, _readable/_writable is set to False and it begins waiting for the event again.

It may be necessary to change the meaning of these attributes to make the code clearer. I.e. if the attributes were named _wait_for_read_event and _wait_for_write_event the current values would need to be reversed.

pants-0.10.0: close()/end() method duplication

Both the Stream and Datagram classes implement a close() method that is responsible for cleaning up attributes specified in the Channel class. Perhaps a better approach to this would be to write a close() method in Channel that cleans up its own attributes and then to override that method in Stream and Datagram classes.

Similarly, the end() methods in Stream and Datagram are identical, and could easily be moved into the Channel class.

Code duplication in network/unix modules.

There is significant code duplication in the convenience classes for network and Unix streams - it would be nice to find some way to reduce the duplication to make the code easier to maintain, wouldn't it?

hosts Files

pants.dns should be able to read the OS's hosts file and obey it for A or AAAA record queries where appropriate, rather than its current behavior of simply ignoring the file and sending every query to a DNS server.
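
Parsing the file is the easy part. The sketch below assumes the usual hosts layout (an address followed by one or more names, with # starting a comment) and uses a hypothetical parse_hosts helper. Locating the file on each platform and wiring the result into pants.dns are not shown.

```python
def parse_hosts(text):
    """Map each hostname in hosts-file text to its list of addresses."""
    table = {}
    for line in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        for name in names:
            table.setdefault(name.lower(), []).append(address)
    return table


sample = "127.0.0.1 localhost\n::1 localhost ip6-localhost # loopback\n"
hosts = parse_hosts(sample)
```

A resolver could consult this table before sending a query, then pick the IPv4 or IPv6 entries depending on whether an A or AAAA record was requested.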

pants-0.10.0: Rename _send() to write()

Channel.write() is a convenience method spawned by a rather silly idea. In practice, using write() results in performance loss and overriding it is no simpler than it would be if it implemented the actual writing functionality. The "overridable" write() should be removed and replaced with what is currently named _send(). Some of the contrib code will need to be changed, as it is naughty and currently uses _send().

Pants -really- needs tests

A good test suite should be written that exercises a large portion of the codebase. I think nose is probably best, and it's okay to require it for running tests. doctests would also be useful in certain conditions.
