hijk's People

Contributors

audreyt, avar, avarbkng, dgryski, dsteinbrunner, gugod, jackdoe, kaoru, mstevens, zakame

hijk's Issues

A request debug mode

Provide a mechanism to write request traces to a user-specified path.
I'm thinking of something similar to 'tcpflow -e http'. It should probably store one request per file, with file names that contain a connection id and a timestamp, so users can tell what happened in which order within a connection.

IMHO this should be global and/or local:

Local (log specific requests):

use Hijk;
use Hijk::DEBUG trace => "/tmp/hijk_debug/";
Hijk::DEBUG::request({...}); # logged
Hijk::request({...}); # not logged

Global (log all requests):

use Hijk;
use Hijk::DEBUG trace => "/tmp/hijk_debug/", global => 1;
Hijk::DEBUG::request({...}); # logged
Hijk::request({...}); # logged
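
The "one request per file" idea could be sketched roughly as below. This is only an illustration, not part of Hijk: the helper name trace_request, the file-name layout, and the per-process sequence counter are all assumptions made for the example.

```perl
use strict;
use warnings;
use File::Path qw(make_path);
use File::Spec;
use Time::HiRes qw(time);

# Assumption: a per-process counter keeps file names unique even when
# two requests share the same timestamp.
my $trace_seq = 0;

# Hypothetical helper: write one raw request to its own file under $dir.
# The file name carries the connection id, a timestamp and a sequence number.
sub trace_request {
    my ($dir, $connection_id, $raw_request) = @_;
    make_path($dir) unless -d $dir;
    my $name = sprintf("%s-%.6f-%04d.txt", $connection_id, time(), ++$trace_seq);
    my $path = File::Spec->catfile($dir, $name);
    open(my $fh, '>', $path) or die "Cannot open $path: $!";
    binmode($fh);
    print {$fh} $raw_request;
    close($fh) or die "Cannot close $path: $!";
    return $path;
}
```

Sorting the files for one connection id by name then gives the order of requests within that connection.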

Provide a way to manage the socket cache

There are some potentially nasty issues with connection pileups in the
socket cache. Imagine the following scenario:

  • I have an LB pool of 50 servers
  • Before I connect I always resolve that.lb-pool and have Hijk
    connect to the IP
  • My DNS library gives me a new IP every 3 seconds

In a long-lived child process, each child will end up with 50 open
connections to that.lb-pool, even though it's only using one of the
servers at a time.

Furthermore, if you don't do the resolving yourself but instead pass a
hostname, you have the opposite problem: now you only ever connect to
one server for the lifetime of the child. The assumption behind
getting a consistent resolution for 3 seconds is that we can take
boxes out of the pool and know that connections to them will stop
within 3 seconds plus whatever the max request time is.

I think this is best solved with a combination of documenting the
problem, and just changing the interface to:

my %connection_cache;
# ...
Hijk::request(
    connection_cache => \%connection_cache,
);
# Later, clear open connections, say every few requests.
# shutdown() takes a second argument; 2 closes both directions.
shutdown(delete $connection_cache{$_}, 2) for keys %connection_cache;

The "head" we return should be an ArrayRef[Str], not a HashRef[Str]

Can't believe I didn't notice this before, but we don't support servers returning multiple headers, e.g. multiple Set-Cookie headers. This needs to be fixed, but is obviously a breaking API change. We could migrate to it with an option to request(), or just change it. Does anyone besides us use this module anyway? We could just guard calls to it with a test for the version.
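
To make the difference concrete, here is a rough sketch of a header parser that keeps repeated headers by returning a flat name/value list. The function names parse_head and header_values are made up for this example and are not Hijk's API; the parser also ignores obsolete header line folding.

```perl
use strict;
use warnings;

# Hypothetical parser: turn a raw header block into a flat array ref of
# name, value, name, value, ... so repeated headers are all preserved.
sub parse_head {
    my ($raw) = @_;
    my @head;
    for my $line (split /\r?\n/, $raw) {
        next unless length $line;
        my ($name, $value) = $line =~ /^([^:\s]+):[ \t]*(.*?)[ \t]*$/
            or next;    # skips malformed lines; a real parser might die here
        push @head, $name, $value;
    }
    return \@head;
}

# Collect every value for one header name (case-insensitive match).
sub header_values {
    my ($head, $wanted) = @_;
    my @values;
    for (my $i = 0; $i < @$head; $i += 2) {
        push @values, $head->[$i + 1] if lc $head->[$i] eq lc $wanted;
    }
    return @values;
}
```

With a HashRef[Str], the second Set-Cookie line would overwrite the first; with the flat list, header_values($head, 'Set-Cookie') returns both.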

Hijk should check the return value of inet_aton() before using it

Filing this before I forget, Hijk does this given a hostname:

$ perl -MSocket=sockaddr_in,inet_aton -e 'sockaddr_in(80, inet_aton("doesnotexist"))'
Bad arg length for Socket::pack_sockaddr_in, length is 0, should be 4 at /home/v-perlbrew/perl5/perlbrew/perls/perl-5.19.6/lib/5.19.6/x86_64-linux/Socket.pm line 833.

Which leads to very confusing error messages from Hijk when you're using it to talk to e.g. a HAProxy pool that's been depleted.
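
A minimal sketch of the check, assuming we want a clear "CANNOT RESOLVE" style message instead of the pack_sockaddr_in error. The helper name sockaddr_for and the optional resolver argument are inventions for this example (the resolver is injectable only so the failure path can be exercised without real DNS).

```perl
use strict;
use warnings;
use Socket qw(inet_aton pack_sockaddr_in);

# Hypothetical helper: resolve $host and build the packed socket address,
# dying with a readable message when resolution fails.
# $resolve defaults to Socket::inet_aton, which returns undef on failure.
sub sockaddr_for {
    my ($host, $port, $resolve) = @_;
    $resolve ||= \&inet_aton;
    my $packed_ip = $resolve->($host);
    die "CANNOT RESOLVE: $host\n" unless defined $packed_ip;
    return pack_sockaddr_in($port, $packed_ip);
}
```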

win32 support

Should we support the win32 platform in some way? Let's have some discussion.

Provide structured return codes

I now have code that wraps this library that basically has to do:

my $res;
my $is_timeout;
eval {
    $res = Hijk::request(...);
    1;
} or do {
    my $error = $@ || "Zombie Error";
    $is_timeout = 1 if $error =~ /^(?:READ TIMEOUT|CONNECT TIMEOUT|select\(2\) error|send error)/;
};

And it's quite likely that I've missed something. I think it would
suck less if the interface was something like this:

# Won't die unless something's terribly wrong, i.e. not on just a
# normal expected every once in a while timeout
my $res = Hijk::request(...);
if (exists $res->{error}) {
    if ($res->{error} == Hijk::Error::CONNECT_TIMEOUT or
        $res->{error} == Hijk::Error::READ_TIMEOUT) {
        # handle timeouts
    } else {
        die "PANIC: No idea what to do with a Hijk $res->{error} error";
    }
} else {
    # We know at this point that we have a real HTTP reply
    if ($res->{status} == 200) {
    } elsif ($res->{status} == 500) {
        # ... it could even be a 500 (which is not a $res->{error})
    }
}

That way users wouldn't have to regex-match against arbitrary internal
error messages, and would instead just check for an optional "error"
key in the returned hash.
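
Until something like that exists, the wrapping can at least be centralised. The following is a sketch only: the function name request_with_error_key, the symbolic names on the right-hand side, and the error_message key are assumptions, while the patterns are taken from the regex quoted above.

```perl
use strict;
use warnings;

# Assumed mapping from die-message prefixes to symbolic error names.
my @ERROR_PATTERNS = (
    [ qr/^CONNECT TIMEOUT/    => 'CONNECT_TIMEOUT' ],
    [ qr/^READ TIMEOUT/       => 'READ_TIMEOUT'    ],
    [ qr/^select\(2?\) error/ => 'SELECT_ERROR'    ],
    [ qr/^send error/         => 'SEND_ERROR'      ],
);

# Run $request (a code ref) and return a hash ref. Known errors come back
# under an "error" key; anything unrecognised is re-thrown.
sub request_with_error_key {
    my ($request) = @_;
    my $res;
    my $ok = eval { $res = $request->(); 1 };
    return $res if $ok;
    my $message = $@ || 'Zombie Error';
    for my $entry (@ERROR_PATTERNS) {
        my ($pattern, $name) = @$entry;
        return { error => $name, error_message => $message }
            if $message =~ $pattern;
    }
    die $message;
}
```

A caller would pass something like sub { Hijk::request(...) } and then only look at $res->{error}.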

Hijk Tests Hang for Days

Since the v0.28 release, our RPM building job has been hanging for days at a time on the Hijk tests. Output:

- Building Hijk-0.28
2019/01/16-08:42:53 main (type Net::Server::HTTP -> MultiType -> Net::Server::PreFork) starting! pid(8435)
Resolved [*]:13796 to [::]:13796, IPv6
Not including resolved host [0.0.0.0] IPv4 because it will be handled by [::] IPv6
Binding to TCP port 13796 on host :: with IPv6
Group Not Defined.  Defaulting to EGID '1999 135 1999'
User Not Defined.  Defaulting to EUID '489'
2019/01/16-08:42:55 Server closing!
HTTP::Server::PSGI: Accepting connections at http://0:14492/
    # Connecting to a wrong port: 14492
    # Dying message: Connection refused at /home/jenkins-slave/workspace/CentOS6-Perl-5.20/BUILD/Hijk-0.28/blib/lib/Hijk.pm line 289.
127.0.0.1 - - [16/Jan/2019:08:43:10 -0800] "GET /?t=5 HTTP/1.1" 200 32 "-" "-"
127.0.0.1 - - [16/Jan/2019:08:43:15 -0800] "GET /?t=5 HTTP/1.1" 200 33 "-" "-"

And it's just stopped there, hanging out, nothing happening. I was unable to replicate when building manually; could there be something waiting on a TTL or something?

test failures on MSWin32 platform

(Forwarded from personal email)

My system:
Strawberry perl on Win7 64bit
(This is perl 5, version 18, subversion 2 (v5.18.2) built for
MSWin32-x64-multi-thread)

Thanks!
Goetz

cpan> install Hijk
Running install for module 'Hijk'
Running make for G/GU/GUGOD/Hijk-0.13.tar.gz
  Has already been unwrapped into directory
C:\apps\strawberry-perl\cpan\build\Hijk-0.13-aPXOUN
  Has already been made
Running make test
C:\apps\strawberry-perl\perl\bin\perl.exe "-MExtUtils::Command::MM"
"-MTest::Harness" "-e" "undef *Test::Harness::Switches; test_harness(0,
'inc', 'blib\lib', 'blib\arch')" t/*.t
t/build_http_message.t ................... ok
t/live-connect-timeout.t ................. skipped: Enable live testing
by setting env: TEST_LIVE=1
t/live-elasticsearch.t ................... skipped: Enable live testing
by setting env: TEST_LIVE=1
t/live-google.t .......................... skipped: Enable live testing
by setting env: TEST_LIVE=1
t/live-invalid-domain.t .................. skipped: Enable live testing
by setting env: TEST_LIVE=1
t/live-plack.t ........................... skipped: Enable live testing
by setting env: TEST_LIVE=1
t/parse-http-connection-close-message.t ..
#   Failed test at t/parse-http-connection-close-message.t line 36.
t/parse-http-connection-close-message.t .. 1/? #          got: '0'
#     expected: '200'

#   Failed test at t/parse-http-connection-close-message.t line 37.
#          got: undef
#     expected: ''

#   Failed test at t/parse-http-connection-close-message.t line 39.
#     Structures begin differing at:
#          $got = undef
#     $expected = HASH(0x8aeb80)

#   Failed test 'threw Regexp ((?^:0 bytes))'
#   at t/parse-http-connection-close-message.t line 50.
# expecting: Regexp ((?^:0 bytes))
# found: normal exit
# Looks like you failed 4 tests of 4.
t/parse-http-connection-close-message.t .. Dubious, test returned 4
(wstat 1024, 0x400)
Failed 4/4 subtests
t/parse-http-message.t ...................
#   Failed test at t/parse-http-message.t line 34.
t/parse-http-message.t ................... 1/? #          got: '0'
#     expected: '200'

#   Failed test at t/parse-http-message.t line 35.
#          got: undef
#     expected: 'OHAI'

#   Failed test at t/parse-http-message.t line 37.
#     Structures begin differing at:
#          $got = undef
#     $expected = HASH(0x80e330)
# Looks like you failed 3 tests of 3.
t/parse-http-message.t ................... Dubious, test returned 3
(wstat 768, 0x300)
Failed 3/3 subtests

Test Summary Report
-------------------
t/parse-http-connection-close-message.t (Wstat: 1024 Tests: 4 Failed: 4)
  Failed tests:  1-4
  Non-zero exit status: 4
t/parse-http-message.t                 (Wstat: 768 Tests: 3 Failed: 3)
  Failed tests:  1-3
  Non-zero exit status: 3
Files=8, Tests=27,  1 wallclock secs ( 0.06 usr +  0.01 sys =  0.08 CPU)
Result: FAIL
Failed 2/8 test programs. 7/27 subtests failed.
dmake.exe:  Error code 131, while making 'test_dynamic'
  GUGOD/Hijk-0.13.tar.gz
  C:\apps\strawberry-perl\c\bin\dmake.exe test -- NOT OK
//hint// to see the cpan-testers results for installing this module, try:
  reports GUGOD/Hijk-0.13.tar.gz
Running make install
  make test had returned bad status, won't install without force
Stopping: 'install' failed for 'Hijk'.
Failed during this command:
 GUGOD/Hijk-0.13.tar.gz                       : make_test NO

cpan>

Extend the error constants to cover read & select errors

It sucks that with every Hijk release I have to go and read the source
and carefully update some regexes catching its error values.

I have this error handling wrapping Hijk, which I see isn't complete
anymore with 0.14 (the all-caps stuff is re-thrown Hijk::Error::*):

if ($error =~ /^CONNECT TIMEOUT/s) {
    $http_had_connect_timeout = 1;
} elsif ($error =~ /^READ TIMEOUT/s) {
    $http_had_read_timeout = 1;
} elsif ($error =~ /^CANNOT RESOLVE/s) {
    $http_had_resolve_failure = 1;
} elsif ($error =~ /(?:send error|select\(\) error)/) {
    $http_had_send_or_select_error = 1;
} elsif ($error =~ /Failed to read http (?:body|head) from socket/) {
    $http_had_read_error = 1;
}

This is the 0.14 status of where Hijk will die:

20 matches for "die" in buffer: Hijk.pm
 38:            die "Failed to read http " .( $decapitated ? "body": "head" ). " from socket. errno = $!"
 41:        die "Failed to read http " .( $decapitated ? "body": "head" ). " from socket. Got 0 bytes back, which shouldn't happen"
103:                die "Failed to read chunked body from socket. errno = $!"
106:            die "Failed to read chunked body from socket. Got 0 bytes back, which shouldn't happen <$buf> <$current_buf>"
172:    socket($soc, PF_INET, SOCK_STREAM, $tcp_proto) || die "Failed to construct TCP socket: $!";
173:    my $flags = fcntl($soc, F_GETFL, 0) or die "Failed to set fcntl F_GETFL flag: $!";
174:    fcntl($soc, F_SETFL, $flags | O_NONBLOCK) or die "Failed to set fcntl O_NONBLOCK flag: $!";
177:        die "Failed to connect $!";
187:            die "select() error on constructing the socket: $!";
192:        die $!;
253:            die "select() error before write(): $!";
261:            die "send error ($r) $!";
272:        die $err;

I've found that in practice when the service I'm using Hijk to query
goes down the read/connect timeouts are also associated with
send/select errors & read errors for those connections that are in
progress. I've never had any of the other errors during "normal"
production operations.

I think we should create new constants for this and wrap this, but
what should they be? Just:

Hijk::Error::SEND_ERROR
Hijk::Error::SELECT_ERROR
Hijk::Error::READ_ERROR

Should we extend the error interface to also pass along an
error_message in the cases where we're now dying with some message
mentioning $!?

Should we also include the socket construction errors? Or is that too
obscure? IMO we should wrap any errors that can happen during "normal"
operations, i.e. when you're querying some service and it goes down on
the other end, but it's probably not worth exhaustively wrapping all
possible errors, e.g. if we can't construct a socket we should
probably just die.

Any thoughts on this? I can trivially hack up a patch to do this, but
thought I'd start a discussion on what the interface should be, and
how far we should go with this.
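
As a starting point for that discussion, one possible shape is sketched below. Everything here is an assumption: the package name My::HttpError is deliberately not Hijk::Error, the bit-flag values are one design choice among several, and error_result is a made-up helper showing how an error_message could ride along with the code.

```perl
use strict;
use warnings;

# Hypothetical package, named so as not to imply this is Hijk's real API.
package My::HttpError;
use constant {
    CONNECT_TIMEOUT => 1 << 0,
    READ_TIMEOUT    => 1 << 1,
    SEND_ERROR      => 1 << 2,
    SELECT_ERROR    => 1 << 3,
    READ_ERROR      => 1 << 4,
};
# Convenience mask: matches either kind of timeout.
use constant TIMEOUT => CONNECT_TIMEOUT | READ_TIMEOUT;

package main;

# Build the error-shaped return value, carrying the "$!" text along
# instead of dying with it.
sub error_result {
    my ($code, $errno_text) = @_;
    return { error => $code, error_message => $errno_text };
}
```

Powers of two let a caller test a whole group at once, e.g. $res->{error} & My::HttpError::TIMEOUT, while plain equality checks on a single constant keep working.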

SSL support?

There was a bit of discussion on our security team about adding SSL support to Hijk. Filing a bug so we have a place for discussion.

Ability to specify connect and read timeouts separately, maybe total request timeout

I.e. now we have:

timeout => seconds

It would be nice to have:

connect_timeout => seconds,
read_timeout => seconds,

And, this is more complex but also having a:

total_read_timeout => seconds,

Would be very nice. I.e. you'd have something like:

# We should have a connection almost instantly
connect_timeout => 0.05,

# It might take the server a bit to send us something back, it's
# thinking
read_timeout      => 0.50,

# The read timeout is on a per-hunk basis, so e.g. if it's 0.5s
# the server could just SLOWLY send us a new hunk every 0.4
# seconds forever, resulting in us hanging forever. A
# total_read_timeout would use Time::HiRes::time() or something
# like that to see what the total time is that we've been spending
# on the accumulated read timeouts, and exit when that's exceeded.
total_read_timeout => 2,

The total_read_timeout isn't essential, and arguably that should just
be done by whatever calls Hijk, using an alarm. I just threw it in there
because we had a chat about it the other day.
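
For reference, the per-hunk versus total distinction could look something like the sketch below. It is not how Hijk reads from sockets; read_with_deadline and the error names it returns are assumptions for illustration, and it simply reads until EOF rather than parsing HTTP.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical helper: read from $fh until EOF, enforcing a per-read
# timeout ($read_timeout) and an overall deadline ($total_read_timeout).
sub read_with_deadline {
    my ($fh, $read_timeout, $total_read_timeout) = @_;
    my $deadline = time() + $total_read_timeout;
    my $buf = '';
    while (1) {
        my $remaining = $deadline - time();
        return { error => 'TOTAL_READ_TIMEOUT', body => $buf } if $remaining <= 0;

        # Never wait longer than what is left of the total budget.
        my $capped = $remaining < $read_timeout;
        my $wait   = $capped ? $remaining : $read_timeout;

        my $rin = '';
        vec($rin, fileno($fh), 1) = 1;
        my $nfound = select($rin, undef, undef, $wait);
        return { error => 'SELECT_ERROR', body => $buf } if $nfound < 0;
        if ($nfound == 0) {
            my $which = $capped ? 'TOTAL_READ_TIMEOUT' : 'READ_TIMEOUT';
            return { error => $which, body => $buf };
        }

        # Append the next hunk to what we already have.
        my $n = sysread($fh, $buf, 4096, length $buf);
        return { error => 'READ_ERROR', body => $buf } unless defined $n;
        return { body => $buf } if $n == 0;    # EOF
    }
}
```

A server that trickles one hunk every 0.4 seconds never trips the 0.5 second read timeout here, but it does run into the total deadline.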

One thing that makes just splitting up the connect/read timeouts
slightly painful from an API perspective is that you can't just have
one or the other: if you set one you must set both. This is because
the read timeout only works on a non-blocking socket, which we
only set up if we have a timeout for the connect.

Provide a way to only use HTTP 1.0

I found that when having Hijk talk to nginx, turning keep-alive off resulted in no slowdown whatsoever for this app I'm running. Linux + nginx socket setup is really fast.

Also after finding that 8071b49 resolved the issue I reported in http://lists.unbit.it/pipermail/uwsgi/2013-December/006795.html I don't need to use the cached socket facility at all.

A lot of complexity in socket management in the library would just go away if it didn't have to deal with cached sockets.
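
On the wire, the request side of this is small. The sketch below builds an HTTP/1.0 request, where the connection is closed after the response unless keep-alive is explicitly negotiated. build_http10_request is a made-up name and not Hijk's own message builder.

```perl
use strict;
use warnings;

# Hypothetical builder for a minimal HTTP/1.0 request message.
sub build_http10_request {
    my (%args) = @_;
    my $method = $args{method} || 'GET';
    my $path   = $args{path}   || '/';
    my $body   = defined $args{body} ? $args{body} : '';
    my @lines  = ("$method $path HTTP/1.0");
    push @lines, "Host: $args{host}" if defined $args{host};
    push @lines, "Content-Length: " . length($body) if length $body;
    # The two empty strings produce the blank line that ends the head.
    return join("\r\n", @lines, '', '') . $body;
}
```

With no cached sockets to manage, each request would open a socket, send a message like this, read until EOF, and close.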
