
hackney - HTTP client library in Erlang

Copyright (c) 2012-2023 Benoît Chesneau.

Version: 1.20.1

hackney

hackney is an HTTP client library for Erlang.


Main features:

  • no message passing (except for asynchronous responses): response is directly streamed to the current process and state is kept in a #client{} record.
  • binary streams
  • SSL support
  • Keepalive handling
  • basic authentication
  • stream the response and the requests
  • fetch a response asynchronously
  • multipart support (streamed or not)
  • chunked encoding support
  • Can send files using the sendfile API
  • Optional socket pool
  • REST syntax: hackney:Method(URL) (where Method can be get, post, put, delete, ...)

Supported versions of Erlang are 22.3 and above. It is reported to work with versions from 19.3 to 21.3.

Note: This is a work in progress, see the TODO for more information on what still needs to be done.

Useful modules are:

  • hackney: main module. It contains all HTTP client functions.

  • hackney_http: HTTP parser in pure Erlang. This parser can parse HTTP responses and requests in a streaming fashion. If the parser type is not set, it autodetects whether it is parsing a request or a response.

  • hackney_headers: Module to manipulate HTTP headers.

  • hackney_cookie: Module to manipulate cookies.

  • hackney_multipart: Module to encode/decode multipart.

  • hackney_url: Module to parse and create URIs.

  • hackney_date: Module to parse HTTP dates.
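
As a quick illustration of the URL helper, here is a hedged sketch. The #hackney_url{} record and its fields (host, port, path) come from hackney_lib.hrl and may differ slightly between hackney versions:

```erlang
%% Sketch only: assumes the #hackney_url{} record from hackney_lib.hrl
%% exposes host, port and path fields, as in recent hackney versions.
-include_lib("hackney/include/hackney_lib.hrl").

parse_example() ->
    Url = hackney_url:parse_url(<<"https://friendpaste.com/some/path?q=1">>),
    #hackney_url{host = Host, port = Port, path = Path} = Url,
    {Host, Port, Path}.
```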

Read the NEWS file to get the last changelog.

Installation

Download the sources from our Github repository

To build the application simply run 'rebar3 compile'.

To run tests run 'rebar3 eunit'. To generate doc, run 'rebar3 edoc'.

Or add it to your rebar config:

{deps, [
    ....
    {hackney, ".*", {git, "git://github.com/benoitc/hackney.git", {branch, "master"}}}
]}.

Basic usage

The basic usage of hackney is:

Start hackney

hackney is an OTP application. You have to start it first before using any of the functions. The hackney application will start the default socket pool for you.

To start in the console run:


$ ./rebar3 shell

It is suggested that you install rebar3 user-wide as described here. This fixes zsh (and maybe other shells) escript-related bugs. Also this should speed things up.

> application:ensure_all_started(hackney).
ok

It will start hackney and all of the applications it depends on:

application:start(crypto),
application:start(public_key),
application:start(ssl),
application:start(hackney).

Or add hackney to the applications property of your .app in a release
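
For a release, listing hackney in your application resource file might look like the following sketch (the myapp name, description and version are placeholders):

```erlang
%% myapp.app.src -- placeholder application. Listing hackney under
%% `applications` makes the release boot it (and its dependencies).
{application, myapp,
 [{description, "example app using hackney"},
  {vsn, "0.1.0"},
  {applications, [kernel, stdlib, hackney]},
  {modules, []}
 ]}.
```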

Simple request

Do a simple request that will return a client state:

Method = get,
URL = <<"https://friendpaste.com">>,
Headers = [],
Payload = <<>>,
Options = [],
{ok, StatusCode, RespHeaders, ClientRef} = hackney:request(Method, URL,
                                                        Headers, Payload,
                                                        Options).

The request method returns the tuple {ok, StatusCode, Headers, ClientRef} or {error, Reason}. A ClientRef is simply a reference to the current request that you can reuse.

If you prefer the REST syntax, you can also do:

hackney:Method(URL, Headers, Payload, Options)

where Method can be any HTTP method in lowercase.

Read the body

{ok, Body} = hackney:body(ClientRef).

hackney:body/1 fetches the body. To fetch it chunk by chunk you can use the hackney:stream_body/1 function:

read_body(MaxLength, Ref, Acc) when MaxLength > byte_size(Acc) ->
	case hackney:stream_body(Ref) of
		{ok, Data} ->
			read_body(MaxLength, Ref, << Acc/binary, Data/binary >>);
		done ->
			{ok, Acc};
		{error, Reason} ->
			{error, Reason}
	end.

Note: you can also fetch a multipart response using the functions hackney:stream_multipart/1 and hackney:skip_multipart/1.

Note 2: using the with_body option will return the body directly instead of a reference.
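
For example, a sketch of the with_body option (same request as in "Simple request" above, but the body comes back directly in the result tuple):

```erlang
%% With the `with_body` option, request/5 returns the body itself
%% instead of a client reference. Assumes hackney is started.
{ok, StatusCode, RespHeaders, Body} =
    hackney:request(get, <<"https://friendpaste.com">>, [], <<>>,
                    [with_body]).
```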

Reuse a connection

By default all connections are created and closed dynamically by hackney, but sometimes you may want to reuse the same reference for your connections. This is especially useful if you want to handle a few requests serially.

A closed connection will automatically be reconnected.

To create a connection:

Transport = hackney_ssl,
Host = << "friendpaste.com" >>,
Port = 443,
Options = [],
{ok, ConnRef} = hackney:connect(Transport, Host, Port, Options).

To create a connection that will use an HTTP proxy use hackney_http_proxy:connect_proxy/5 instead.

To get the local and remote IP and port of a connection:

> hackney:peername(ConnRef).
> hackney:sockname(ConnRef).

Make a request

Once you have created a connection, use the hackney:send_request/2 function to make a request:

ReqBody = << "{	\"snippet\": \"some snippet\" }" >>,
ReqHeaders = [{<<"Content-Type">>, <<"application/json">>}],
NextPath = <<"/">>,
NextMethod = post,
NextReq = {NextMethod, NextPath, ReqHeaders, ReqBody},
{ok, _, _, ConnRef} = hackney:send_request(ConnRef, NextReq),
{ok, Body1} = hackney:body(ConnRef).

Here we post a JSON payload to '/' on the friendpaste service to create a paste.

If your connection supports keepalive, the connection will be kept open until you close it explicitly.

Send a body

hackney helps you send different payloads by passing different terms as the request body:

  • {form, PropList} : To send a form
  • {multipart, Parts} : to send your body using the multipart API. Parts follow this format:
    • eof: end the multipart request
    • {file, Path}: to stream a file
    • {file, Path, ExtraHeaders}: to stream a file
    • {file, Path, Name, ExtraHeaders} : to send a file with DOM element name and extra headers
    • {Name, Content}: to send a full part
    • {Name, Content, ExtraHeaders}: to send a full part
    • {mp_mixed, Name, MixedBoundary}: to notify that we start a part with mixed multipart content
    • {mp_mixed_eof, MixedBoundary}: to notify that we end a part with mixed multipart content
  • {file, File} : To send a file
  • Bin: To send a binary or an iolist

Note: to send a chunked request, just add the Transfer-Encoding: chunked header to your headers. Binary and iolist bodies will then be sent using chunked encoding.
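
A couple of hedged examples of the body terms above (the example.com URLs and the file path are placeholders):

```erlang
%% Send a urlencoded form.
{ok, _, _, Ref1} =
    hackney:post(<<"https://example.com/form">>, [],
                 {form, [{<<"name">>, <<"value">>}]}, []),

%% Stream a local file; hackney sets the content length for you.
{ok, _, _, Ref2} =
    hackney:post(<<"https://example.com/upload">>, [],
                 {file, "/tmp/data.bin"}, []).
```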

Send the body by yourself

While the default is to send the request directly and fetch the status and headers, if the body is set to the atom stream, the request and send_request functions will return {ok, Client}. You can then use hackney:send_body/2 to stream the request body and hackney:start_response/1 to initialize the response.

Note: The function hackney:start_response/1 will only accept a Client that is waiting for a response (with a response state equal to the atom waiting).

Ex:

ReqBody = << "{
      \"id\": \"some_paste_id2\",
      \"rev\": \"some_revision_id\",
      \"changeset\": \"changeset in unidiff format\"
}" >>,
ReqHeaders = [{<<"Content-Type">>, <<"application/json">>}],
Path = <<"https://friendpaste.com/">>,
Method = post,
{ok, ClientRef} = hackney:request(Method, Path, ReqHeaders, stream, []),
ok  = hackney:send_body(ClientRef, ReqBody),
{ok, _Status, _Headers, ClientRef} = hackney:start_response(ClientRef),
{ok, Body} = hackney:body(ClientRef).

Note: to send a multipart body in a streaming fashion use the hackney:send_multipart_body/2 function.

Get a response asynchronously

Since version 0.6, hackney can fetch the response asynchronously using the async option:

Url = <<"https://friendpaste.com/_all_languages">>,
Opts = [async],
LoopFun = fun(Loop, Ref) ->
        receive
            {hackney_response, Ref, {status, StatusInt, Reason}} ->
                io:format("got status: ~p with reason ~p~n", [StatusInt,
                                                              Reason]),
                Loop(Loop, Ref);
            {hackney_response, Ref, {headers, Headers}} ->
                io:format("got headers: ~p~n", [Headers]),
                Loop(Loop, Ref);
            {hackney_response, Ref, done} ->
                ok;
            {hackney_response, Ref, Bin} ->
                io:format("got chunk: ~p~n", [Bin]),
                Loop(Loop, Ref);

            Else ->
                io:format("else ~p~n", [Else]),
                ok
        end
    end.

{ok, ClientRef} = hackney:get(Url, [], <<>>, Opts),
LoopFun(LoopFun, ClientRef).

Note 1: When {async, once} is used the socket will receive only once. To receive the other messages use the function hackney:stream_next/1.
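
A sketch of {async, once}: receive exactly one message, then explicitly ask for the next one (assumes hackney is started and the host is reachable):

```erlang
{ok, Ref} = hackney:get(<<"https://friendpaste.com">>, [], <<>>,
                        [{async, once}]),
receive
    {hackney_response, Ref, {status, Status, _Reason}} ->
        io:format("status: ~p~n", [Status])
end,
%% No further messages are delivered until we ask for the next one.
ok = hackney:stream_next(Ref).
```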

Note 2: Asynchronous responses automatically checkout the socket at the end.

Note 3: At any time you can go back and receive your response synchronously using the function hackney:stop_async/1. See the example test_async_once2 for the usage.

Note 4: When the option {follow_redirect, true} is passed to the request, you will receive the following messages on valid redirection:

  • {redirect, To, Headers}
  • {see_other, To, Headers} for status 303 and POST requests.

Note 5: You can send the messages to another process by using the option {stream_to, Pid} .
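
For instance, a sketch of routing the asynchronous messages to another process (the receiver loop here is a minimal placeholder):

```erlang
%% Messages are delivered to Receiver instead of the calling process.
Receiver = spawn(fun Loop() ->
        receive
            {hackney_response, _Ref, done} -> ok;
            {hackney_response, _Ref, _Msg} -> Loop()
        end
    end),
{ok, _RequestRef} = hackney:get(<<"https://friendpaste.com">>, [], <<>>,
                                [async, {stream_to, Receiver}]).
```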

Use the default pool

Hackney uses socket pools to reuse connections globally. By default, hackney uses a pool named default. You may want to use different pools in your application which allows you to maintain a group of connections. To use a different pool, do the following:

Method = get,
URL = <<"https://friendpaste.com">>,
Headers = [],
Payload = <<>>,
Options = [{pool, mypool}],
{ok, StatusCode, RespHeaders, ClientRef} = hackney:request(Method, URL, Headers,
                                                        Payload, Options).

By adding the tuple {pool, mypool} to the options, hackney will use the connections stored in that pool. The pool gets started automatically the first time it is used. You can also explicitly configure and start the pool like this:

PoolName = mypool,
Options = [{timeout, 150000}, {max_connections, 100}],
ok = hackney_pool:start_pool(PoolName, Options),

timeout is how long we keep the connection alive in the pool; max_connections is the maximum number of connections maintained in the pool. Each connection in a pool is monitored and closed connections are removed automatically.

To close a pool do:

hackney_pool:stop_pool(PoolName).

Note: Sometimes you want to disable the default pool in your app without having to set the client option each time. You can now do this by setting the hackney application environment key use_default_pool to false. This means that hackney will not use socket pools unless specifically requested using the pool option as described above.
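
This can be set from code before hackney starts, or in your sys.config; a minimal sketch:

```erlang
%% In code, before the hackney application is started:
application:set_env(hackney, use_default_pool, false),
{ok, _} = application:ensure_all_started(hackney).

%% Or the sys.config equivalent:
%% [{hackney, [{use_default_pool, false}]}].
```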

To disable socket pools for a single request, specify the option {pool, false}.

Use a custom pool handler

Since version 0.8 it is possible to use your own pool handler to maintain the connections in hackney.

A pool handler is a module that handles the hackney_pool_handler behaviour.

See for example hackney_disp, a load-balanced pool dispatcher based on dispcount.

Note: for now you cannot force the pool handler on a per-client basis.

Automatically follow a redirection

If the option {follow_redirect, true} is given to the request, the client will automatically follow redirections and retrieve the body. The maximum number of redirections can be set using the {max_redirect, Max} option. Default is 5.

The client will follow redirects on 301, 302 & 307 if the method is get or head. If another method is used, the tuple {ok, maybe_redirect, Status, Headers, Client} will be returned. It will only follow a 303 redirect (see other) if the method is POST.

The last location is stored in the location property of the client state.

ex:

Method = get,
URL = "http://friendpaste.com/",
ReqHeaders = [{<<"accept-encoding">>, <<"identity">>}],
ReqBody = <<>>,
Options = [{follow_redirect, true}, {max_redirect, 5}],
{ok, S, H, Ref} = hackney:request(Method, URL, ReqHeaders,
                                     ReqBody, Options),
{ok, Body1} = hackney:body(Ref).

Use SSL/TLS with self-signed certificates

Hackney uses CA bundles adapted from Mozilla by certifi. Recognising organisation-specific (self-signed) certificates is possible by providing the necessary ssl_options. Note that ssl_options overrides all options passed to the ssl module.

ex (>= Erlang 21):

CACertFile = <path_to_self_signed_ca_bundle>,
CrlCheckTimeout = 5000,
SSLOptions = [
{verify, verify_peer},
{versions, ['tlsv1.2']},
{cacertfile, CACertFile},
{crl_check, peer},
{crl_cache, {ssl_crl_cache, {internal, [{http, CrlCheckTimeout}]}}},
{customize_hostname_check,
  [{match_fun, public_key:pkix_verify_hostname_match_fun(https)}]}],

Method = get,
URL = "https://my-organisation/",
ReqHeaders = [],
ReqBody = <<>>,
Options = [{ssl_options, SSLOptions}],
{ok, S, H, Ref} = hackney:request(Method, URL, ReqHeaders,
                                  ReqBody, Options),

%% To provide client certificate:

CertFile = <path_to_client_certificate>,
KeyFile = <path_to_client_private_key>,
SSLOptions1 = SSLOptions ++ [
{certfile, CertFile},
{keyfile, KeyFile}
],
Options1 = [{ssl_options, SSLOptions1}],
{ok, S1, H1, Ref1} = hackney:request(Method, URL, ReqHeaders,
                                     ReqBody, Options1).

Proxy a connection

HTTP Proxy

To use an HTTP tunnel, add the option {proxy, ProxyUrl} where ProxyUrl can be a simple URL or a {Host, Port} tuple. If you need to authenticate, set the option {proxy_auth, {User, Password}}.
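
A hedged sketch of these options (the proxy host, port and credentials are placeholders):

```erlang
%% Route the request through an HTTP proxy with basic auth.
ProxyOptions = [{proxy, {"127.0.0.1", 8888}},
                {proxy_auth, {<<"user">>, <<"secret">>}}],
{ok, _, _, Ref} =
    hackney:get(<<"https://friendpaste.com">>, [], <<>>, ProxyOptions).
```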

SOCKS5 proxy

Hackney supports connecting via a SOCKS5 proxy. To set a SOCKS5 proxy, use the following settings:

  • {proxy, {socks5, ProxyHost, ProxyPort}}: to set the host and port of the proxy to connect.
  • {socks5_user, Username}: to set the user used to connect to the proxy
  • {socks5_pass, Password}: to set the password used to connect to the proxy

SSL and TCP connections can be forwarded via a SOCKS5 proxy. hackney automatically upgrades to an SSL connection if needed.
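
A sketch combining the settings above (the proxy address and credentials are placeholders):

```erlang
%% Connect through a SOCKS5 proxy with username/password auth.
Socks5Options = [{proxy, {socks5, <<"127.0.0.1">>, 1080}},
                 {socks5_user, <<"proxy_user">>},
                 {socks5_pass, <<"proxy_pass">>}],
{ok, _, _, Ref} =
    hackney:get(<<"https://friendpaste.com">>, [], <<>>, Socks5Options).
```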

Metrics

Hackney can collect the metrics listed below.

You can enable metrics collection by adding a mod_metrics entry to hackney's app config. Metrics are disabled by default. The module specified must have an API matching that of the hackney metrics module.

To use folsom, specify {mod_metrics, folsom}; if you want to use exometer, specify {mod_metrics, exometer}. Ensure that folsom or exometer is in your code path and has been started.
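
A sketch of enabling metrics via the application environment (this must happen before the hackney application starts):

```erlang
%% Pick folsom as the metrics backend, then start everything.
application:set_env(hackney, mod_metrics, folsom),
{ok, _} = application:ensure_all_started(folsom),
{ok, _} = application:ensure_all_started(hackney).
```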

Generic Hackney metrics

Name Type Description
hackney.nb_requests counter Number of running requests
hackney.total_requests counter Total number of requests
hackney.finished_requests counter Total number of requests finished

Metrics per Hosts

Name Type Description
hackney.HOST.nb_requests counter Number of running requests
hackney.HOST.request_time histogram Request time
hackney.HOST.connect_time histogram Connect time
hackney.HOST.response_time histogram Response time
hackney.HOST.connect_timeout counter Number of connect timeout
hackney.HOST.connect_error counter Number of connect errors
hackney_pool.HOST.new_connection counter Number of new pool connections per host
hackney_pool.HOST.reuse_connection counter Number of reused pool connections per host

Metrics per Pool

Name Type Description
hackney_pool.POOLNAME.take_rate meter Rate at which a connection is retrieved from the pool
hackney_pool.POOLNAME.no_socket counter Count of new connections
hackney_pool.POOLNAME.in_use_count histogram How many connections from the pool are used
hackney_pool.POOLNAME.free_count histogram Number of free sockets in the pool
hackney_pool.POOLNAME.queue_count histogram Number of queued clients

Contribute

For issues, comments or feedback please create an issue.

Notes for developers

If you want to contribute patches or improve the docs, you will need to build hackney using the rebar_dev.config file. It can also be built using the Makefile:

$ rebar3 update
$ rebar3 compile

To run the hackney test suite locally, it is necessary to install httpbin.

An example installation using virtualenv:

$ mkvirtualenv hackney
$ pip install gunicorn httpbin

Running the tests:

$ gunicorn --daemon --pid httpbin.pid httpbin:app
$ rebar3 eunit
$ kill `cat httpbin.pid`

Modules

hackney
hackney_app
hackney_bstr
hackney_connect
hackney_connection
hackney_connections
hackney_cookie
hackney_date
hackney_headers
hackney_headers_new
hackney_http
hackney_http_connect
hackney_local_tcp
hackney_manager
hackney_metrics
hackney_multipart
hackney_pool
hackney_pool_handler
hackney_request
hackney_response
hackney_socks5
hackney_ssl
hackney_stream
hackney_sup
hackney_tcp
hackney_trace
hackney_url
hackney_util

hackney's Issues

add `raw/1` function

To extract the latest socket and latest downloaded buffer:

{Status, Transport, Socket, Buffer} = raw(Client),
  • Status: waiting_response, on_status, on_headers, on_body,
  • Transport: The current transport module
  • Socket: the current socket
  • Buffer: Data fetched but not yet processed

This feature can be useful when you want to create a simple proxy, rerouting on the headers and the status line and continue to forward the connection for example.

add {async, once} support

It would be interesting, when receiving a response asynchronously, to stream it once, process the result, then tell the request worker to send the next part.

ex:

Url = <<"https://friendpaste.com/_all_languages">>,
Opts = [{async, once}],


LoopFun = fun(Ref) ->
        receive
            {Ref, {status, StatusInt, Reason}} ->
                io:format("got status: ~p with reason ~p~n", [StatusInt,
                                                              Reason]);
            {Ref, {headers, Headers}} ->
                io:format("got headers: ~p~n", [Headers]);
            {Ref, done} ->
                ok;
            {Ref, Bin} ->
                io:format("got chunk: ~p~n", [Bin]);
            Else ->
                io:format("else ~p~n", [Else])
        end
    end.

{ok, {response_stream, StreamRef}} = hackney:get(Url, [], <<>>, Opts),

LoopFun(StreamRef),
next_stream(StreamRef),
LoopFun(StreamRef),
....

While we are here we should have a way to stop an asynchronous request to continue it synchronously if needed.

ets_lru in included_applications breaks using hackney in a release

When I try to build a release (with exrm, but that shouldn't matter so much) I get a failure if I depend on hackney:

Errors generating release 
          Duplicated application included: 
    ets_lru included in idna and hackney

hackney depends on idna. hackney uses benoitc/ets_lru, idna uses cloudant/ets_lru. Because of this, the release can't determine which of those should serve as the application, so it bombs.

Removing ets_lru from included_applications gets me past the error when building the release. I haven't finished the release yet, but I assume all works well after that change.

Please update rebar

The rebar in your repository is too old and contains the bug that makes it incompatible with erlang.mk (basically the deps it pulls are always in ./deps instead of REBAR_DEPS_DIR environment variable). Thanks!

hackney:connect/4 failing with binary URL

The following fails:

$ hackney:connect(hackney_tcp_transport, <<"localhost">>, 5680, []).
** exception error: no function clause matching string:to_lower(<<"localhost">>) (string.erl, line 493)
     in function  hackney_pool:checkout/4 (src/hackney_pool.erl, line 54)
     in call from hackney_connect:socket_from_pool/4 (src/hackney_connect.erl, line 145)
     in call from hackney_connect:connect/5 (src/hackney_connect.erl, line 28)

Whereas supplying a string is just fine:

{ok, ConnRef} = hackney:connect(hackney_tcp_transport, "localhost", 5680, []).

The README uses binaries to specify the URL, so that's what I was expecting as well -- https://github.com/benoitc/hackney#to-create-a-connection

Sidenote: the var Transport in the linked README example should be assigned to hackney_tcp_transport instead of hackney_tcp_protocol.

hackney_manager crash during upgrade

I use couchbeam in my application.

When trying to hot upgrade my application (release_handler:install_release), I encountered this error:

2013-12-29 20:09:12.160 [error] <0.751.0> gen_server hackney_manager terminated with reason: no function clause matching hackney_manager:handle_call(which_children, {<0.713.0>,#Ref<0.0.5.230363>}, {dict,156,52,64,32,260,156,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[[<0.6479.2>|#Ref<0.0.3.7650>],...],...},...}}) line 165
2013-12-29 20:09:12.160 [error] <0.751.0> CRASH REPORT Process hackney_manager with 0 neighbours exited with reason: no function clause matching hackney_manager:handle_call(which_children, {<0.713.0>,#Ref<0.0.5.230363>}, {dict,156,52,64,32,260,156,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[[<0.6479.2>|#Ref<0.0.3.7650>],...],...},...}}) line 165 in gen_server:terminate/6 line 744
2013-12-29 20:09:12.161 [error] <0.713.0> release_handler: {'EXIT',{{function_clause,[{hackney_manager,handle_call,[which_children,{<0.713.0>,#Ref<0.0.5.230363>},{dict,156,52,64,32,260,156,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[[<0.6479.2>|#Ref<0.0.3.7650>],[<0.30458.2>|#Ref<0.0.4.9951>],[<0.31087.2>|#Ref<0.0.4.18161>]],[[<0.5540.3>|#Ref<0.0.4.106767>]],[[<0.23821.0>|#Ref<0.0.1.16877>],[<0.25925.0>|#Ref<0.0.1.39188>],[<0.1317.1>|#Ref<0.0.1.132772>],[<0.28491.1>|#Ref<0.0.2.155142>],[<0.13980.2>|#Ref<0.0.3.93600>],[<0.30467.2>|#Ref<0.0.4.10072>]],[[<0.10411.0>|#Ref<0.0.0.120075>]],[[<0.14014.2>|#Ref<0.0.3.93730>]],[[<0.13537.0>|#Ref<0.0.0.158821>],[<0.25944.0>|#Ref<0.0.1.39364>],[<0.3799.3>|#Ref<0.0.4.85130>]],[[<0.25929.0>|#Ref<0.0.1.39234>],[<0.1321.1>|#Ref<0.0.1.132817>]],[[<0.14033.2>|#Ref<0.0.3.93926>],[<0.29468.2>|#Ref<0.0.3.260101>]],[[<0.23756.0>|#Ref<0.0.1.16029>],[<0.9579.1>|#Ref<0.0.1.222401>],[<0.28298.2>|#Ref<0.0.3.245104>]],[[<0.519.3>|#Ref<0.0.4.45752>]],[[<0.25933.0>|#Ref<0.0.1.39275>],[<0.1325.1>,...],...],...},...}}],...},...]},...}}
error during a which_children call to hackney_manager (<0.751.0>). [State: running] Exiting ... 

2013-12-29 20:09:12.161 [error] <0.750.0> Supervisor hackney_sup had child hackney_manager started with hackney_manager:start_link() at <0.751.0> exit with reason no function clause matching hackney_manager:handle_call(which_children, {<0.713.0>,#Ref<0.0.5.230363>}, {dict,156,52,64,32,260,156,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[[<0.6479.2>|#Ref<0.0.3.7650>],...],...},...}}) line 165 in context child_terminated
2013-12-29 20:10:17.561 [error] <0.733.0> gen_server hackney_manager terminated with reason: no function clause matching hackney_manager:handle_call(which_children, {<0.695.0>,#Ref<0.0.0.548>}, {dict,0,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[],[],[],[],[],...}}}) line 165

I had to shutdown the app and install it to upgrade.

Any ideas on why this is happening?

hackney_pool is freezing because of some ssl fsm stuff

Hi,

I started measuring performance and analysing some issues with sudden hackney_pool freezing when a lot of https requests happening.

I found a workaround in hackney 0.7 (6e5bd6d) but before opening a pull request I found out that a lot of changes have been made to the lib since that version. After merging, my fix no longer works. It seems the lag is now somewhere in find_connection/3: Transport:setopts.

To test this I use a custom pool [{timeout, 150000}, {pool_size, 50}] and simply spawn 100 processes, each making an https request. Ideally each of them should be handled, but I'm getting this error for more than half of them:

{timeout,{gen_server,call,[<0.1236.0>,{checkout,{"_bucket_.s3.amazonaws.com",443,hackney_ssl_transport},<0.3643.0>}]}}

The hackney_pool process then hangs with a long message queue for some time and then cleans up after 20 seconds or so.

Here is an fprof result of hackney_pool process trace:

25> fprof:analyse([no_callers]).
Processing data...
Creating output...
%% Analysis results:
{  analysis_options,
 [{callers, false},
  {sort, acc},
  {totals, false},
  {details, true}]}.

%                                               CNT       ACC       OWN
[{ totals,                                      812,30854.914,   15.027}].  %%%


%                                               CNT       ACC       OWN
[{ "<0.1236.0>",                                812,undefined,   15.027}].   %%

{  {proc_lib,init_p_do_apply,3},                  0,30842.595,    0.000}.
{  {gen_server,loop,6},                          51,30842.595,    1.029}.
{  {gen_server,decode_msg,8},                    52,30842.560,    1.015}.
{  {gen_server,handle_msg,5},                    52,30842.508,    2.446}.
{  suspend,                                      91,30827.568,    0.000}.
{  {hackney_pool,handle_call,3},                 52,30008.235,    1.079}.
{  {hackney_pool,find_connection,3},             51,30006.672,    1.598}.
{  {hackney_ssl_transport,setopts,2},             2,30000.775,    0.011}.
{  {ssl,setopts,2},                               2,30000.764,    0.021}.
{  {ssl_connection,sync_send_all_state_event,2},   3,30000.494,    0.019}.
{  {ssl_connection,set_opts,2},                   2,30000.475,    0.009}.
{  {gen_fsm,sync_send_all_state_event,3},         3,30000.475,    0.021}.
{  {gen,call,4},                                  3,30000.448,    0.012}.
{  {gen,do_call,4},                               3,30000.436,    1.478}.
{  {dict,find,2},                                52,    4.550,    1.401}.
{  {dict,get_slot,2},                            54,    1.387,    0.934}.
{  {dict,get_bucket,2},                          52,    1.055,    0.443}.
{  {gen_server,reply,2},                         51,    0.939,    0.939}.
{  {dict,find_val,2},                            52,    0.724,    0.724}.
{  {dict,get_bucket_s,2},                        52,    0.612,    0.612}.
{  {hackney_pool,store_connection,3},             1,    0.479,    0.027}.
{  {erlang,phash,2},                             54,    0.453,    0.453}.
{  garbage_collect,                               2,    0.274,    0.274}.
{  {proplists,expand,2},                          2,    0.268,    0.032}.
{  {dict,store,3},                                2,    0.145,    0.030}.
{  {proplists,expand_0,2},                        6,    0.119,    0.042}.
{  {proplists,expand_1,3},                        4,    0.077,    0.018}.
{  {dict,on_bucket,3},                            2,    0.070,    0.032}.
{  {proplists,'-expand/2-lc$^0/1-0-',1},          6,    0.063,    0.049}.
{  {proplists,expand_2,4},                        8,    0.059,    0.059}.
{  {hackney_ssl_transport,controlling_process,2},   1,    0.041,    0.005}.
{  {ssl,controlling_process,2},                   1,    0.036,    0.004}.
{  {proplists,key_uniq,1},                        2,    0.035,    0.015}.
{  {erlang,setelement,3},                         9,    0.034,    0.034}.
{  {ssl_connection,new_user,2},                   1,    0.032,    0.004}.
{  {dict,maybe_expand,2},                         2,    0.028,    0.008}.
{  {proplists,key_uniq_1,2},                      4,    0.020,    0.020}.
{  {dict,maybe_expand_aux,2},                     2,    0.020,    0.014}.
{  {proplists,flatten,1},                         4,    0.019,    0.019}.
{  {erlang,send,3},                               3,    0.018,    0.018}.
{  {erlang,monitor,2},                            3,    0.017,    0.017}.
{  {dict,'-store/3-fun-0-',3},                    2,    0.015,    0.008}.
{  {proplists,property,1},                        4,    0.014,    0.014}.
{  {erlang,exit,1},                               2,    0.012,    0.012}.
{  {erlang,send_after,3},                         1,    0.010,    0.010}.
{  {dict,store_bkt_val,3},                        2,    0.007,    0.007}.
{  {erlang,demonitor,2},                          1,    0.006,    0.006}.
{  {dict,size,1},                                 1,    0.005,    0.005}.
{  undefined,                                     0,    0.000,    0.000}.

Async request

Is it possible to receive async response on a different process than self? I mean, is there any way to pass a pid to receive the async response, status, etc?

Happy Holidays! 🎅 🎆

hackney doesn't work with proxies configured to accept GET (no CONNECT) requests

At the moment hackney always makes a CONNECT request when working through a proxy. In fact, most proxies that can be found on the Internet don't accept CONNECT; they work with GET only. For CONNECT they reply "403 Forbidden". See http://en.wikipedia.org/wiki/HTTP_tunnel - "HTTP Tunneling without using CONNECT" section.

For example 202.107.222.50:80 (not sure if it will work at the time when you read this issue though) works fine when set in browser but returns 403 when used in hackney:request. Difference in connection method can be seen in Wireshark.

It would be nice to add an option to use such proxies and have a way to specify which method should be used - CONNECT or GET.

edown dependency

Isn't edown a doc dependency? Any chance it can be split out to a separate rebar config? That way projects that depend on hackney wouldn't need to fetch edown at all.

Chunked Response

After my attempt to use hackney to handle a chunked response failed to work properly -- it would stream the body fine, but it would not respect the size of the chunks, instead simply returning what it had read even if that wasn't a full chunk -- I started looking into the code.

It looks to me like that is exactly what it is doing. Even though te_chunked tracks the size of what's been received so far and keeps a buffer, when that is returned to transfer_decode it simply calls content_decode, and stream_body_recv is not called again to continue receiving until the chunk is complete.

The only time it would call stream_body_recv again is if the data read in is <<>> causing it to return the atom 'more'. But that should not be the only case, correct?

Am I missing something?

add cookie support

hackney should support cookies:

  • provide a way to parse them from the headers
  • provide a way to add them to request headers

support Expect header

hackney should support the Expect header when sending a request:

  • send the Expect header,
  • if the server answers 100 Continue, send the body, then start to read the response
  • else: return the response immediately

This should work for both streaming and non-streaming requests. It would skip the need to send the full body before waiting for the response.
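The desired flow can be approximated today with hackney's streaming API; a hedged sketch (the URL is made up, and the automatic wait for "100 Continue" is exactly the missing piece this issue asks for):

```erlang
%% Send the headers first (the body is streamed later), including Expect.
{ok, Ref} = hackney:request(put, <<"http://localhost:8000/upload">>,
                            [{<<"Expect">>, <<"100-continue">>}],
                            stream, []),
%% Ideally hackney would wait for "100 Continue" here before the body
%% is streamed; today it does not.
ok = hackney:send_body(Ref, <<"chunk1">>),
ok = hackney:send_body(Ref, <<"chunk2">>),
{ok, _Status, _Headers, Ref2} = hackney:start_response(Ref).
```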

Error request

what am I doing wrong?

16> Method = get,
16> URL = <<"https://friendpaste.com">>,
16> Headers = [],
16> Payload = <<>>,
16> Options = [],
16> {ok, StatusCode, RespHeaders, ClientRef} = hackney:request(Method, URL,
16>                                                         Headers, Payload,
16>                                                         Options).
** exception error: bad argument
     in function  ets:lookup/2
        called as ets:lookup(hackney_pool,default)
     in call from hackney_pool:find_pool/2 (src/hackney_pool.erl, line 170)
     in call from hackney_pool:checkout/4 (src/hackney_pool.erl, line 57)
     in call from hackney_connect:socket_from_pool/4 (src/hackney_connect.erl, line 149)
     in call from hackney_connect:connect/5 (src/hackney_connect.erl, line 32)
     in call from hackney:request/5 (src/hackney.erl, line 223)
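The badarg in ets:lookup(hackney_pool, default) means the pool's ETS table does not exist yet, i.e. the hackney application was never started. Starting it first should make the same request succeed:

```erlang
%% Start hackney and its dependencies before the first request.
{ok, _Started} = application:ensure_all_started(hackney),
%% hackney:start() achieves the same thing in a shell.

{ok, StatusCode, RespHeaders, ClientRef} =
    hackney:request(get, <<"https://friendpaste.com">>, [], <<>>, []).
```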

Get current location after redirect

Is it possible to get the current location (url) after redirect when {follow_redirect, true}? Say if a page redirects from http://foo.com to http://bar.com and

{ok, 200, _, Ref} = hackney:request(get, "http://foo.com", [], <<>>, [{follow_redirect, true}]).

how can I get bar.com from Ref?
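Newer hackney versions expose hackney:location/1 for this; assuming a release that ships it, something like:

```erlang
{ok, 200, _Headers, Ref} = hackney:request(get, "http://foo.com", [], <<>>,
                                           [{follow_redirect, true}]),
%% Returns the last location seen for this request,
%% e.g. <<"http://bar.com">> after the redirect.
FinalLocation = hackney:location(Ref).
```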

Error compile Hackney

When I try rebar compile I get this error:
hackney_socks5.erl:12: can't find include lib "kernel/src/inet_dns.hrl"

Error when the server closes directly after returning a reply

Hit this error while trying Hackney against a bad webserver:

When the server returns the body and closes the TCP socket, the following is returned to the caller:

{error,{closed,<<"{\"ok\":\"ok\"}">>},
            [],
            {client,hackney_tcp_transport,"localhost",58520,netloc,[],
                    #Port<0.2643>,infinity,false,5,false,nil,undefined,
                    connected,on_header,nil,normal,
                    #Fun<hackney_request.send.2>,waiting,4096,
                    <<"{\"ok\":\"ok\"}">>,
                    {1,0},
                    nil,nil,nil,nil}} 

In this case the server replies with {"ok":"ok"} and then closes the socket.

Attached is a server that behaves this way: https://gist.github.com/4024014

This server is able to serve data to curl and chrome.

make hackney an http entry point for erlang applications

Right now hackney uses the classic API you can find in any HTTP client: a request method and some REST entry points. Also, for now hackney is defined per node and can (mostly) only be used locally.

Instead we should be able to create HTTP processes on which you can send a request, and retrieve your response from it from any Erlang node.

An Entry point would be a registered process associated to an Host, Port, Options:

 {ok, Pid} = new_client(ClientName, Host, Port, Options)
 {ok, Pid} = get_client(ClientName)

 ClientName = {local,Name} | {global,GlobalName} | {via,Module,ViaName}
 Name = atom()
 GlobalName = ViaName = term()

On which you can send and retrieve infos:

Ref = make_ref(),
From = self(),
Pid ! {From, Ref, Method, Path, Headers},
receive
     {Ref, ok} ->
            ...
end

The methods will receive a new optional ClientName argument.
It would be possible to pause or stop any request.
It would be possible to redirect the request to another process/node.
One pool of connections per client will be maintained.

splitting hackney ?

With the introduction of the new features (multipart, header parsing, cookies, HTTP parser), it comes to mind that such tools could be used in other contexts than hackney. You could, for example, build your own HTTP server, parse HTTP over another protocol, or other things.

The way I see it, hackney could be split in 2 libraries:

  • hackney_lib: generic HTTP toolkit. no rename would be done and the modules will still use the hackney_ prefix.
  • hackney: the client, using hackney_lib to parse the different HTTP protocols.

thoughts?

Strange problem with recurring fetches of content

I have a series of processes that roughly do this:

fetch_json_result(Url) ->
    Options = [{follow_redirect, true}, {pool, default}],
    {ok, _SC, RH, Client} = hackney:request(get, Url, [], <<>>, Options),
    Body = try hackney:body(Client) of
        {ok, Body0, _} -> Body0
    after
        hackney:close(Client)
    end.

I have these processes (there are about 20 of them) go fetch some JSON for processing. For a while all seems to be fine, but after about 10 minutes all I get back is

{badmatch, {error,emfile}}

The badmatch is obviously due to me not caring whether the process dies or not. But I wonder why hackney stops working after a few minutes of activity? Restarting my process does not help. I did not try to restart the hackney pool, but I thought I shouldn't have to.

Is there anything that I can do?
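emfile is the OS "too many open file descriptors" error, which usually means sockets are leaking; a common cause is an error path where the response body is never consumed, so the connection is never returned to the pool. A hedged rework of the loop above that always drains or skips the body (hackney:skip_body/1 exists in current releases; the exact shape of hackney:body/1's return value varies between versions):

```erlang
fetch_json_result(Url) ->
    Options = [{follow_redirect, true}, {pool, default}],
    case hackney:request(get, Url, [], <<>>, Options) of
        {ok, 200, _Headers, Client} ->
            %% Reading the body releases the socket back to the pool.
            hackney:body(Client);
        {ok, _Status, _Headers, Client} ->
            %% Drain the body on non-200 too, so the socket isn't leaked.
            hackney:skip_body(Client),
            {error, bad_status};
        {error, _Reason} = Err ->
            Err
    end.
```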

Async requests and reading body

I have tried to use hackney in asynchronous mode but I don't understand how to deal with the body properly: for a short answer (returning 503 in this test case) I get the body in Reason of {hackney_response, Ref, {status, Code, Reason}}, and after that I fail to call hackney:body as the body has already been read.
Real-life responses could be lengthy, so they will need to be read with stream_body, but what to do with the short ones?
Currently I'm using synchronous calls, but for some use cases asynchronous requests would be useful.
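One pattern that works for both short and long responses: in async mode, skip hackney:body/1 entirely and accumulate the binary chunks from the messages. A sketch using the message shapes documented in the README:

```erlang
%% Call as: {ok, Ref} = hackney:request(get, Url, [], <<>>, [async]),
%%          collect(Ref, <<>>).
collect(Ref, Acc) ->
    receive
        {hackney_response, Ref, {status, _Code, _Reason}} ->
            collect(Ref, Acc);
        {hackney_response, Ref, {headers, _Headers}} ->
            collect(Ref, Acc);
        {hackney_response, Ref, done} ->
            {ok, Acc};
        {hackney_response, Ref, {error, Reason}} ->
            {error, Reason};
        {hackney_response, Ref, Bin} when is_binary(Bin) ->
            collect(Ref, <<Acc/binary, Bin/binary>>)
    after 5000 ->
        {error, timeout}
    end.
```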

Don't send the url fragments in the request

Currently, given a URL with a fragment, like http://stackoverflow.com/questions/146159/is-fortran-faster-than-c/146221#146221, hackney will send the fragment #146221 in the request itself (on the wire).

Some web servers do tolerate that, but stackoverflow is one example that doesn't.

I have kind of a solution here:
https://github.com/gdamjan/hackney/commit/9b564203c011ec2389770bacfeb808c2d6766da5
that instead of sending RawPath, recombines the URL from the Path and Query.

Send headers and body together

I am using hackney in production.
I need to send headers and body together for performance in my case.
Is it possible to merge a patch that enables sending them together when possible?

pool_size or max_connections?

It seems somewhat confusing: in NEWS.md I see pool_size mentioned as deprecated, but README.md does not mention max_connections at all, and hackney.app.src uses it.

Add async events for upload

Currently, the async mode only generates events once the remote server responds. I'd like to be able to monitor the progress of a large upload using POST. This means generating events for the submission phase of the request as well as the response. At the very least, "chunk" events and an event for when the whole request has been transmitted (but before the response has been received). Is this a possibility?

Does "Reuse a connextion" preserve state

Hi,

I find the doc unclear about whether or not hackney preserves state when reusing a connection, e.g. reads cookies and sends them back on the next request.

I've read the code and I'm quite sure it does (and if it didn't, there would be little point in reusing a connection), but I think the readme should be more explicit about it. That's what I was looking for when I found hackney, and I bet many people are looking for this functionality.

I'm sorry for not providing a pull request, but I'm not fluent enough in English.

edit: btw, the readme file mentions hackney_http_proxy:connect_proxy/5 which doesn't seem to exist anymore.

HTTPS requests varying runtime

Any ideas why HTTPS requests through a proxy could show variable duration?
At the moment I don't have a minimal test case to reproduce it, but our Common Test suite shows runtimes from 8s to 2m50s, with the HTTPS tests taking most of the time.
In our tests we make requests to a local mock through a mocked CONNECT proxy, both implemented using mochiweb, and the SSL tunnel is created using standard OTP ssl.
I'll try to minimize the test case, but if there are any known issues with HTTPS I'd like to know.

Case clause error when using a proxy

Hi,

I'm trying to use hackney through a proxy and I've found a weird behaviour: in hackney_http_connect.erl I can read ProxyPort = proplists:get_value(connect_port, Opts). So connect_port is the key for an integer in the proplist.

But the Opts variable is passed along to do_handshake/4, and there again we see the same proplists read: ProxyPort = proplists:get_value(connect_port, Options). Yet the case expression matches booleans only:

    HostHdr = case ProxyPort of
        true ->
            list_to_binary(Host);
        false ->
            iolist_to_binary([Host, ":", integer_to_list(Port)])
    end,

See hackney_http_connect.erl line 140.

Thanks for reading.

add unit tests

Unit tests should be added to hackney.

  • all tests should use the tools provided by Erlang (eunit)
  • They should test the HTTP parser by checking expected responses against queries: run cowboy on a dynamic port, send requests to it and check the responses. I am thinking of having a list of Args that are given to hackney:request/5 and sent to known resources in cowboy; then we check the returned response.
  • Test the stream supervision

Memory usage

I have a program that's pretty simple: it downloads a list of remote video files; for each one, it downloads the file locally, then uploads it with a POST to a different host.

Running in a single process, this loop steadily increases the OS process RAM usage. The last time I killed it off it was at around 2.5GB resident / 4.5GB virtual.

I don't know anything about memory profiling in Erlang, and I don't know for sure that this is related to hackney. But since downloading and uploading are the only things this program does involving large amounts of data, I was wondering whether hackney might hang on to data under some circumstances.

Can't start hackney

Hello,

Can't start hackney using application:start(hackney).

1> application:start(hackney).
{error,{"no such file or directory","dispcount.app"}}

dispcount application is defined as included_applications in hackney.app.src

Wrong typespec for hackney:stream_body?

The README implies that hackney:stream_body returns:

{ok, binary()} | done | {error, term()}

The actual -spec says that it returns:

ok | stop | {error, term()}

From what I can tell, the typespec is incorrect. Happy to provide a PR if that's indeed the case ๐Ÿ˜„
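For reference, a read loop written against the README's documented contract ({ok, Data} | done | {error, Reason}), which is what callers actually rely on:

```erlang
%% Accumulate the response body via repeated hackney:stream_body/1 calls.
read_body(Ref, Acc) ->
    case hackney:stream_body(Ref) of
        {ok, Data} ->
            read_body(Ref, <<Acc/binary, Data/binary>>);
        done ->
            {ok, Acc};
        {error, _Reason} = Err ->
            Err
    end.
```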

hackney:body/1 returns {error, closed}

When the tracker returns a body with unknown (unspecified) size and disconnects, hackney:body/1 calls stream_body_recv/1, which tries to get more data by calling recv(Client). But the whole body is already in the accumulator, the socket has been closed by the remote host, and the call returns {error, closed}.

Startup inconsistency

application:start(hackney) doesn't seem to start the pool; hackney:start() does.

Also, a missing clause found:

diff --git a/src/hackney_pool.erl b/src/hackney_pool.erl
index 7d38681..7eec494 100644
--- a/src/hackney_pool.erl
+++ b/src/hackney_pool.erl
@@ -151,7 +151,7 @@ do_start_pool(Name, Options) ->
     case supervisor:start_child(hackney_sup, Spec) of
         {ok, Pid} ->
             Pid;
-        already_started ->
+        {error, {already_started, _}} ->
             find_pool(Name, Options)
     end.

Hackney should not attempt to parse the body on HEAD requests

"The metainformation contained
in the HTTP headers in response to a HEAD request SHOULD be identical
to the information sent in response to a GET request."

"If the new field values
indicate that the cached entity differs from the current entity (as
would be indicated by a change in Content-Length, Content-MD5, ETag
or Last-Modified), then the cache MUST treat the cache entry as
stale."

Basically, a HEAD request will still return a Content-Length but no body. In that case, parsing the body should return an empty binary; currently it returns a timeout error.

I am using hackney master. I can provide a pull request to fix the issue, but I would like to know how you would prefer to fix it. One option is to pass the HTTP method directly as an argument to start_response; another is to store the method in the client record as well.

Thanks!

[discuss] Return a request reference instead of an opaque record

Currently, when you're not creating an async request, hackney returns:

 {ok, Status, Header, Client} = hackney:request(get, <<"http://someurl">>).

or when you stream a body:

  {ok, Client}  =  hackney:request(get, <<"http://someurl">>,  [], stream).

Each successive call into hackney returns a new opaque Client record. For example, when you fetch a body:

 {ok, Body, Client1} = hackney:body(Client)

This has the advantage that no process is created to handle the request: a client record basically wraps a socket to encode and decode the data sent to and received from the remote HTTP server. So there is no message passing.

But the recent async feature now returns a stream reference instead of a client record:

 {ok, StreamRef} = hackney:request(get, <<"http://someurl">>, [], <<>>, [async]).

A new supervised stream process is created when the response starts, and you receive the messages containing this stream reference. Until the response starts, the request is sent like a synchronous request, without message passing.

What I would like to discuss is the possibility of only returning a stream reference instead of an opaque client record. A new process would be created each time to handle a request, and the request body would be streamed by sending messages to it.

The main advantages of this change would be:

  • keep the API coherent: always return one kind of response
  • you don't have to care about keeping the client object around
  • ease the code: the functions would be rewritten to only handle message passing

The main disadvantage is having to use message passing to talk to your request pid, which uses a little more memory and slows the process down slightly.

Any feedback about this would be really appreciated. Let me know :)

response_stream?

On documentation:

async: receive the response asynchronously The function return {ok, response_stream, StreamRef}}. When {async, once} is used the response will be received only once. To receive the other messages use the function hackney:stream_next/1

But there's no response_stream response anywhere in the code. AFAIK it's sending {ok, StreamRef}. Is that correct?
