typhoeus / typhoeus
Typhoeus wraps libcurl in order to make fast and reliable requests.
Home Page: http://rubydoc.info/github/typhoeus/typhoeus
License: MIT License
The first line of the extconf.rb hard codes Typhoeus to either PPC or 32-bit x86. SL is 64-bit and so the gem install works (because you can compile 32-bit stuff) but the gem fails to load with 64-bit native Ruby.
The docs indicate that Typhoeus::Request.new(...).response will work, but it does not. (Did it ever?) The way it works now is response = Typhoeus::Request.run(...), and the docs should reflect that.
In 0.1.28, headers are duped/mangled. Here's what I'm seeing in a Hoptoad notification (header on first line, what it's being set to on the second):
HTTP_USER_AGENT
Typhoeus - http://github.com/pauldix/typhoeus/tree/master, Typhoeus - http://github.com/pauldix/typhoeus/tree/master
HTTP_X_API_TOKEN
cc1892665e521c6692248cffbd7edd17, cc1892665e521c6692248cffbd7edd17
This is with something like:
Typhoeus::Request.get("https://url",
  :headers => {
    "X-API-TOKEN" => "3c0b3b94e163759821a6b4752cc23c5c"
  }
)
Seems something is adding them to an array and joining on ", " when the request is built. The same request works fine with 0.1.27 and with curl on the command line.
I was assuming response.code = 0 meant a timeout expired, but in some cases of my app response.body is non-empty but response.code is set to 0. How is this possible? How can I be sure to distinguish between a completion and a timeout?
After a few hours of successful requests from our application, Typhoeus starts responding with code 0. The response to any subsequent request is instant, yet curl, Net::HTTP and wget all show the destination is reachable while the app fails this way. The behaviour looks like a false timeout and persists until the application is restarted. I'm attempting to write a test to prove this, but because the failure is sporadic it's not simple to recreate.
Using libcurl 7.19.7 and Ruby 1.9.1
It would be nice if when an object that responded to :each
was passed as the request_body
to an easy object, it would be iterated over and sent to the server.
This would allow for streaming the request (useful for uploading large files, etc).
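A minimal sketch of what such a body object could look like, assuming a hypothetical ChunkedBody class (it is not part of Typhoeus, and the library would still need to iterate over it as requested above). It wraps any IO and yields fixed-size chunks from #each:

```ruby
require "stringio"

# Hypothetical wrapper (not a Typhoeus class): exposes an IO as an object
# that responds to #each and yields fixed-size chunks, so the whole body
# never has to be held in memory at once.
class ChunkedBody
  include Enumerable

  def initialize(io, chunk_size = 16 * 1024)
    @io = io
    @chunk_size = chunk_size
  end

  def each
    return enum_for(:each) unless block_given?
    # IO#read(n) returns nil at end of file, which ends the loop.
    while (chunk = @io.read(@chunk_size))
      yield chunk
    end
  end
end

body = ChunkedBody.new(StringIO.new("a" * 10), 4)
chunks = body.to_a
```

For a real upload the StringIO would be replaced with File.open(path, "rb").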
Refer to this gist : http://gist.github.com/425250
the first request will set the header :
[...]
POST / HTTP/1.1
Host: www.google.com
Accept: */*
TEST: TEST
Content-Length: 0
Content-Type: application/x-www-form-urlencoded
[...]
In the second request, queued in multi, headers are not set :
[...]
POST / HTTP/1.1
Host: www.google.com
Accept: */*
Content-Length: 0
Content-Type: application/x-www-form-urlencoded
[...]
It seems that Typhoeus disregards the HTML encoding type and returns a string in the default encoding, which may be invalid. In my code, I work around it by force encoding downloaded strings, but it seems like reading the headers is the right solution.
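The workaround described here can be sketched in plain Ruby. The helper name apply_charset is hypothetical, and the sketch only looks at the Content-Type response header; it does not inspect any meta tag inside the HTML itself:

```ruby
# Hypothetical helper: tag a response body with the charset named in the
# Content-Type header. Falls back to binary when no charset is declared
# or when Ruby does not know the declared name.
def apply_charset(body, content_type)
  charset = content_type.to_s[/charset=["']?([^;"'\s]+)/i, 1]
  encoding = begin
    charset ? Encoding.find(charset) : Encoding::BINARY
  rescue ArgumentError
    Encoding::BINARY
  end
  # force_encoding relabels the bytes; it does not transcode them.
  body.dup.force_encoding(encoding)
end

raw = "caf\xC3\xA9".b # bytes as they arrive off the wire
text = apply_charset(raw, "text/html; charset=utf-8")
untagged = apply_charset(raw, nil)
```

Calling text.valid_encoding? afterwards is a cheap way to detect servers that declare one charset and send another.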
We are experiencing segfaults on a linux server. These are averaging maybe 36 hours apart on a lightly loaded system, just guessing but it looks like once in about 1000 requests through typhoeus.
We can try to gather information for you, but we can't seem to force the issue (we'll try a little harder)
The reality is that we're under a time constraint here. If we can't get this addressed quickly we'll have to drop something else into the critical piece for the time being. Nevertheless, we'll do what we can to help in any way you can think of. There's a chunk of code using typhoeus that is not critical (it doesn't matter if it fails every now and again, since we can restart it and no user will ever know the difference).
Here's the first part of the trace information:
/usr/local/lib/ruby/gems/1.9.1/gems/typhoeus-0.1.13/lib/typhoeus/multi.rb:20: [BUG] Segmentation fault
ruby 1.9.1p243 (2009-07-16 revision 24175) [x86_64-linux]
-- control frame ----------
c:0090 p:---- s:0441 b:0441 l:000440 d:000440 CFUNC :multi_perform
c:0089 p:0019 s:0438 b:0438 l:000437 d:000437 METHOD /usr/local/lib/ruby/gems/1.9.1/gems/typhoeus-0.1.13/lib/typhoeus/multi.rb:20
c:0088 p:0023 s:0435 b:0435 l:000434 d:000434 METHOD /usr/local/lib/ruby/gems/1.9.1/gems/typhoeus-0.1.13/lib/typhoeus/hydra.rb:65
c:0087 p:0066 s:0432 b:0432 l:000431 d:000431 METHOD /usr/local/lib/ruby/gems/1.9.1/gems/typhoeus-0.1.13/lib/typhoeus/request.rb:97
c:0086 p:0031 s:0426 b:0426 l:000425 d:000425 METHOD /usr/local/lib/ruby/gems/1.9.1/gems/typhoeus-0.1.13/lib/typhoeus/request.rb:106
Among other things, the docs don't say what exception is raised on a timeout.
pauldix-feedzirra-0.0.12/lib/feedzirra/feed.rb:232: warning: multiple values for a block parameter (2 for 1)
expects 2 values
You really need to use rb_thread_select instead of select in typhoeus_multi.c. Look at a recent version of curb, and their implementation in curb_multi.c.
I'm having problems with the run() method sometimes never returning when a large number of requests are queued (more than 2000).
At first I caught a redirection loop coming from one of the urls, so I specified the :max_redirects option on my Typhoeus::Request objects, but the problem came back after a while.
Are there any other similar pitfalls I should look for that could cause this behavior? Note that the urls I use come from various sources I do not control.
I can gdb the thing when it happens, and the code is stuck in select(). I'll definitely get a backtrace tomorrow when the problem re-occurs. Anything else I should get to produce a thorough bug report?
Typhoeus.cache expects a get/set caching API as provided by MemCache. Support Typhoeus.cache = Rails.cache to make Rails caching very simple. Better yet, autodetect Rails.cache (aka RAILS_CACHE) and use it by default. Rails's cache stores use read/write instead of get/set.
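Until something like that exists, a thin adapter could bridge the two interfaces. This is a sketch under assumptions: ReadWriteCacheAdapter and FakeStore are hypothetical names, and the set(key, value, timeout) signature is assumed from the MemCache-style API mentioned above rather than taken from the Typhoeus source:

```ruby
# Hypothetical adapter: presents a read/write store (the interface Rails
# cache stores expose) through a MemCache-style get/set interface.
class ReadWriteCacheAdapter
  def initialize(store)
    @store = store
  end

  def get(key)
    @store.read(key)
  end

  # The timeout is forwarded as :expires_in, the option name used by
  # ActiveSupport cache stores.
  def set(key, value, timeout = nil)
    options = timeout ? { :expires_in => timeout } : {}
    @store.write(key, value, options)
  end
end

# Stand-in store so the sketch runs without Rails.
class FakeStore
  def initialize
    @data = {}
  end

  def read(key)
    @data[key]
  end

  def write(key, value, _options = {})
    @data[key] = value
    true
  end
end

cache = ReadWriteCacheAdapter.new(FakeStore.new)
cache.set("http://example.com/", "cached body", 60)
hit = cache.get("http://example.com/")
miss = cache.get("http://example.com/other")
```

In a Rails app the FakeStore would be replaced by Rails.cache.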
It would be nice to have a Gzip/Deflate feature for consuming web services over HTTP that return a large amount of data.
Related to my question on SO: http://stackoverflow.com/questions/2833829/fast-ruby-http-library-for-large-xml-downloads
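In the meantime, a gzip-compressed body can be decompressed on the client with Ruby's standard library. The helper name maybe_gunzip is hypothetical; the sketch detects gzip by its two magic bytes (0x1F 0x8B) rather than by reading the Content-Encoding header, and it does not handle deflate:

```ruby
require "zlib"
require "stringio"

# Hypothetical helper: return the body unchanged unless it starts with
# the gzip magic bytes, in which case decompress it.
def maybe_gunzip(body)
  return body unless body.getbyte(0) == 0x1F && body.getbyte(1) == 0x8B
  Zlib::GzipReader.new(StringIO.new(body)).read
end

compressed = Zlib.gzip("<html>hello</html>")
restored = maybe_gunzip(compressed)
untouched = maybe_gunzip("plain text")
```

A body that prints as "\037\213\b..." is the same case: \037\213 is the octal form of those two magic bytes.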
It would be convenient to be able to use a string for :params, as I'm using Typhoeus with dumps of http traffic. The current implementation makes me tear the string apart and potentially futz with the encoding, when really what I would just like to do is :params=>"foo=longparam&bar=thatyoucanjustuse". Thoughts?
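One way the requested behaviour could look, as a sketch only: query_string is a hypothetical helper, not an existing Typhoeus method. A String is passed through untouched, while a Hash is encoded with the standard library:

```ruby
require "uri"

# Hypothetical helper: accept :params as either a Hash or an already
# encoded query string, and return the query string to send.
def query_string(params)
  case params
  when nil then ""
  when String then params # already encoded, pass through untouched
  when Hash then URI.encode_www_form(params)
  else raise ArgumentError, "unsupported params: #{params.class}"
  end
end

from_string = query_string("foo=longparam&bar=thatyoucanjustuse")
from_hash = query_string(:foo => "a b", :bar => "c")
```

Passing the string through unchanged is the point of the request: it avoids a decode and re-encode round trip that could alter captured traffic.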
Typhoeus basically rules and I'd love to use it for everything in my app, but I really need to know the last URL the request reached after all the redirects it goes through (mainly URL shorteners), so I've been using curb in parts. I tried using the Typhoeus response Location headers, but that proved problematic. Having it built into the response would be awesome.
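libcurl itself reports this value through CURLINFO_EFFECTIVE_URL, so a built-in accessor seems feasible. As a stopgap, the final URL can be derived from the Location header of each hop; part of what makes that approach problematic is that Location values may be relative. A minimal sketch, assuming a hypothetical effective_url helper that is handed the Location values in order:

```ruby
require "uri"

# Hypothetical helper: given the starting URL and the Location header of
# each redirect hop (in order), compute the final URL. Relative Location
# values are resolved against the URL that produced them.
def effective_url(start_url, locations)
  locations.reduce(URI.parse(start_url)) do |current, location|
    current.merge(location)
  end.to_s
end

final = effective_url(
  "http://short.example/abc",
  ["http://example.com/a/b", "/c", "d?x=1"]
)
```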
can you cross compile this with rake-compiler or similar so that we can have native windows versions?
Bit of an edge case this but it was found when posting JSON containing multi byte characters.
Line 173 of lib/typhoeus/easy.rb sets the post field size from data.length, which counts characters. Calling data.bytesize instead works for this case.
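The difference is easy to see in plain Ruby (1.9 or later). The JSON string below is made up for illustration:

```ruby
# A JSON payload containing one multi-byte character (U+00E9).
data = "{\"name\":\"caf\u00E9\"}"

chars = data.length   # number of characters
bytes = data.bytesize # number of bytes actually sent on the wire
```

Here chars is 15 and bytes is 16, so a size based on length would cut the last byte off the body.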
The Easy response includes the Status-Line section of the response in the Headers section, which puts an invalid header in the headers_hash. It looks something like: {"HTTP/1.1 200 OK"=>nil}. Here is a reproducible example: http://gist.github.com/586968
I believe this is due to the curl bindings here: http://github.com/pauldix/typhoeus/blob/master/ext/typhoeus/typhoeus_easy.c#L108
Curl includes the Status-Line in the header data, while Typhoeus treats that as only the header fields.
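A sketch of header parsing that skips the Status-Line, in plain Ruby. The parse_headers name is hypothetical and this is not the Typhoeus implementation; it also keeps only the last value when a header repeats:

```ruby
# Hypothetical parser: build a header hash from the raw header data that
# curl hands back, ignoring the Status-Line (any line starting with
# "HTTP/") and blank lines.
def parse_headers(raw)
  raw.split(/\r?\n/).each_with_object({}) do |line, headers|
    next if line.empty? || line.start_with?("HTTP/")
    name, value = line.split(/:\s*/, 2)
    headers[name] = value
  end
end

raw = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\nContent-Length: 12\r\n\r\n"
headers = parse_headers(raw)
```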
typhoeus uses rack/utils here:
http://github.com/pauldix/typhoeus/blob/master/lib/typhoeus.rb#L3
but rack is not a gem dependency, so using typhoeus in a non-rack project produces load errors.
Thanks!
--AQ
Hi,
I keep getting the following error from utils.rb [line 5]:
incompatible encoding regexp match (UTF-8 regexp with ASCII-8BIT string)
This is the culprit:
/([^ a-zA-Z0-9_.-]+)/u
I'm not sure why it needs a fixed encoding regexp, when I remove the "u" flag it works without killing the whole process; however I have no idea how this change might affect everything else.
Any ideas?
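One possible direction, sketched under assumptions: the escape method below is a stand-in written for this example (it is not the code from the library), it needs Ruby 2.0 or later for String#b, and it uses a binary regexp (the n flag) on a binary copy of the input so that the string and the regexp always agree on encoding:

```ruby
# Hypothetical escape helper: work on raw bytes so that UTF-8 and
# ASCII-8BIT input are treated the same way.
def escape(string)
  string.b.gsub(/([^ a-zA-Z0-9_.-]+)/n) do
    # Percent-encode every byte of the matched run.
    "%" + $1.unpack("H2" * $1.bytesize).join("%").upcase
  end.tr(" ", "+")
end

from_utf8 = escape("caf\u00E9 au lait")
from_binary = escape("\xFF".b)
```

Whether dropping the u flag is safe elsewhere depends on what the callers pass in, which this sketch does not answer.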
I am on a Mac and I am using curl (7.21.2) installed via brew. When I install typhoeus, it's compiled against an older version (7.19.7, maybe the default for Snow Leopard).
When I install curb, it uses the newer version correctly, so I grabbed the extconf from there, put it in my typhoeus fork and compiled, and now I have it installed using 7.21.2.
https://github.com/rafaelss/typhoeus/blob/master/ext/typhoeus/extconf.rb (using last line from original extconf.rb)
My thinking is: curb's extconf uses curl-config to determine which version and settings should be used, and it would be good if typhoeus used the same approach.
What do you think?
I'm using a rails webservice that reads the raw post data on the server side using request.raw_post.
I am able to send the data using curl like this:
cat sample.xml | curl -X POST -d @- -u (username):(apikey) https://foo.bar/api/profiles/batch_create -H 'Content-type: text/xml'
And it works. I can't seem to find a way to get it to work using this library.
e = Typhoeus::Easy.new
e.verbose = 1
e.headers = AuthorizationHeader.merge({:content_type => 'text/xml'})
e.url = Url
e.method = :post
e.request_body = File.read(file)
e.post_data = File.read(file)
e.perform
Doesn't work. The rails server gets nothing back from 'request.raw_post'
The verbose output looks like
POST /api/profiles/batch_create HTTP/1.1
Host: foo.bar
Accept: */*
Authorization: Basic blahblah
content_type: text/xml
Content-Length: 6471
Content-Type: application/x-www-form-urlencoded
Expect: 100-continue

< HTTP/1.1 400 Bad Request
< Date: Tue, 23 Mar 2010 00:55:18 GMT
< Server: Apache/2.2.12 (Ubuntu)
< X-Powered-By: Phusion Passenger (mod_rails/mod_rack) 2.2.11
< X-Runtime: 127
< Cache-Control: no-cache
< Content-Length: 268
< Status: 400
< Connection: close
< Content-Type: application/xml; charset=utf-8
<
It looks like curl is using the 100-continue header, but I don't want that. I want to send it all as a single request. Any way to do that?
We have a custom PURGE method for sending cache purge requests to Varnish. Typhoeus converts everything non-standard to a DELETE in Typhoeus::Easy#method=. Change the line to:
set_option(OPTION_VALUES[:CURLOPT_CUSTOMREQUEST], method.to_s.upcase)
and our purges start working again.
I get the following error if I run typhoeus in a multi threaded app
/usr/local/lib/ruby/gems/1.9.1/gems/typhoeus-0.1.23/lib/typhoeus/multi.rb:20:in `multi_perform': error on thread select (RuntimeError)
from /usr/local/lib/ruby/gems/1.9.1/gems/typhoeus-0.1.23/lib/typhoeus/multi.rb:20:in `perform'
from /usr/local/lib/ruby/gems/1.9.1/gems/typhoeus-0.1.23/lib/typhoeus/hydra.rb:70:in `run'
from /usr/local/lib/ruby/gems/1.9.1/gems/typhoeus-0.1.23/lib/typhoeus/request.rb:100:in `run'
from /usr/local/lib/ruby/gems/1.9.1/gems/typhoeus-0.1.23/lib/typhoeus/request.rb:105:in `get'
Hey Paul,
Great library! Definitely the best http library around for Ruby.
I'm having a problem though. You specifically ignore the :params option if the method is :post. This breaks our code that used to work fine with Net::HTTP. I'm not sure if there is something in the HTTP spec that says query params are disallowed for POST, but it seems better to just add them to the url for all methods, as most HTTP servers I'm familiar with make them available for all methods.
What do you think?
Cheers,
Justin
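The behaviour asked for here could be sketched as follows. url_with_params is a hypothetical helper written for illustration; it appends the params to the URL regardless of HTTP method and leaves the request body alone:

```ruby
require "uri"

# Hypothetical helper: append params to the query string of a URL,
# keeping any query the URL already has. Intended to be applied for
# every HTTP method, POST included.
def url_with_params(url, params)
  return url if params.nil? || params.empty?
  uri = URI.parse(url)
  extra = URI.encode_www_form(params)
  uri.query = [uri.query, extra].compact.join("&")
  uri.to_s
end

plain = url_with_params("http://example.com/items", "b" => "2")
merged = url_with_params("http://example.com/items?a=1", "b" => "2")
```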
The params hash cannot have symbol keys, as they get sorted, and #sort doesn't know what to do with symbols.
Since it's fairly idiomatic in Ruby to use :symbols, I think this should be supported. Will work on this asap.
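A possible shape for the fix, as a sketch: convert every key to a String before sorting. The sorted_params name is hypothetical. This also covers hashes that mix String and Symbol keys, which cannot be compared with each other:

```ruby
# Hypothetical helper: return the params as [key, value] pairs sorted by
# key, with every key converted to a String first.
def sorted_params(params)
  params.map { |key, value| [key.to_s, value] }.sort_by { |key, _value| key }
end

pairs = sorted_params(:b => 1, "a" => 2, :c => 3)
```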
I'm adding this here as a marker. I don't mind submitting a doc patch for this, but I have an imminent release coming up in the next couple of weeks.
In trying to use Hydra#stub for testing, I found some undocumented behaviors:
(1) The stubs are matched to requests. It's not explicitly written in the README, but it will only stub requests that match what you declare. (There is no "stub all" feature that raises an exception if you do not declare a matching stub.)
(2) Stubs do not clear after you use them. When stubbing the singleton, Typhoeus::Hydra.hydra, you will need a call to #clear_stubs.
(3) Stub matching is in Typhoeus::HydraMock#matches?(request) ... it isn't clear whether this is a "public" or "private" API. For my purpose, it was easier to monkeypatch that for a brute-force stubbing.
Hi,
it would be handy to have a changelog file in the project, could you add one please?
thanks
In the section Advanced Authentication, the example shown is:
e = Typhoeus::Request.get("http://example.com",
:username => 'username',
:password => 'password',
:method => :ntlm)
However, the ":method" parameter in this instance would refer to the HTTP method, not the authentication method; the correct parameter name for authentication method should be :auth_method
For some reason, the first https request on Heroku will succeed, but later requests fail with code = 0. Here's how I'm testing:
gist: http://gist.github.com/278601
This is especially rough for other gems built on top of typhoeus, like raws (a Ruby AWS client) which accesses AWS via a https RESTful interface.
As mentioned here: http://groups.google.com/group/typhoeus/browse_frm/thread/856471c70d745011#
Copypasta:
There would be a method to do it with a modification to the library.
Basically, the easy.c would have to have a callback for chunks of
files. It's a non-trivial modification to make requiring changes to
the C section of the library and some additions to the Ruby part.
Someone else had asked about this a while ago for a slightly different
reason: to stream the download to a streaming parser like Yajl for
JSON or Nokogiri::Reader for XML. It's something I'd like to support
at some point.
Is there a way to know the true response time instead of the incremented total time?
Setting max concurrency to 1 solves this but it defeats the point of asynchronous requests.
Any ideas?
Hi, was just wondering have you given any thought to using ruby fibers with this. Would that have any positive effect on performance?
It might just be my own confusion, but in order to get this running under Windows I had to jump through some hoops. Thought I'd jot down my steps here, maybe some things in the gem install or instructions can change.
Environment: Ruby 1.8.6 [i386-mingw32] (w/ Ruby MinGW Dev Kit installed)
Theoretically, I should be able to download the latest Windows libcurl (curl-7.19.7-devel-mingw32.zip), unzip it, copy the DLL bin/libcurl.dll into c:\ruby\bin, and install this gem no problem. However, this command does not work:
C:\>gem install typhoeus -- --with-curl=c:\curl-7.19.7-devel-mingw32
The reason it doesn't work is because Typhoeus' extconf.rb has some logic I don't quite understand, which skips checking the --with-curl option if compiling under MinGW. It looks like it might be as-yet-unfinished cross-compilation logic?
To fix this problem, I cloned from github to c:\typhoeus and commented out all the "if MinGW" logic in extconf.rb. (That is, I forced it to use my --with-curl option.)
After editing extconf:
C:\typhoeus>gem build typhoeus.gemspec
C:\typhoeus>gem install typhoeus-0.1.18.gem -- --with-curl=c:\curl-7.19.7-devel-mingw32
Installed without a hitch and works fine.
So, this gem WILL compile and work just fine under Windows using MinGW, but as of version 0.1.18, you need to hack it a bit. Hopefully that can be streamlined in a future version.
Hi,
apparently, using blocks and AR is not a good idea with hydra.
I was trying to do something like :
@users.each do |user|
  r = Typhoeus::Request.new("http://localhost", :method => :post, :params => {:user_id => user.id, :user_name => user.name})
  r.on_complete do |response|
    user.push_to_webservice if response.success? # change state, with something like state machine
  end
  hydra.queue r
end
hydra.run
This leads to a memory leak, and sometimes params get overridden and completely broken.
I'm not sure this can be corrected; I'm mostly creating this issue so the problem is on record.
It appears that Typhoeus::Easy isn't responding to callbacks, here is my usage
e = Typhoeus::Easy.new
e.auth = {
  :username => 'myusername',
  :password => 'mypassword',
  :method => Typhoeus::Easy::AUTH_TYPES[:CURLAUTH_DIGEST]
}
e.headers = { :Accept => 'application/json' }
e.verbose = 0
e.url = "http://my.url"
e.method = :get
e.on_success {
  puts 'oh yes!'
}
e.on_failure {
  puts 'oh no!'
}
e.perform
I've been able to work around this for now by adding conditions based on the response code, but was hoping to use the callbacks if possible. Am I missing something?
Using Typhoeus to access https://graph.facebook.com takes about 9 seconds for the first connect, and instantaneous thereafter. Using Net::HTTP is instantaneous every time.
I was playing with the curl binary and Facebook has a very slow handshaking process for SSL. Is there some way to fix this within Typhoeus to make it skip handshaking?
Response times when specifying a HEAD method on a request are very slow, sometimes over a minute.
For all other methods using the same connection, the response times are normal.
FWIW, the HEAD request completes successfully.
environment: Ubuntu 10.04, Ruby 1.8.7 p249
running:
Typhoeus::Request.get("http://www.pauldix.net").body
shows:
"\037\213\b\000\000\000\000\000\000\377\325}\331r\333H\266\340\263\025Q\377\220\305\212\366\322W\340*R\233%\267\274T[u\275\265\245n\267\343\306\rE\022H\222\260@$\214\004D\261\356\364D\377\306D\314\274N\304|\307\374I\177\311\234%\023HR\224%\313\222\253\306]-\211D.'\317~N\236L<\376\361\371\333g\307\037\337\275\020\223b\232\210w\177...
I'm running on Fedora 13 and cannot load the typhoeus gem. The output is:
irb(main):001:0> require 'rubygems'
=> true
irb(main):002:0> require 'typhoeus'
TypeError: can't convert Array into String
from /usr/lib/ruby/gems/1.8/gems/rack-1.2.1/lib/rack/utils.rb:138:in `union'
from /usr/lib/ruby/gems/1.8/gems/rack-1.2.1/lib/rack/utils.rb:138
from /usr/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:31:in `gem_original_require'
from /usr/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:31:in `require'
from /usr/lib/ruby/gems/1.8/gems/typhoeus-0.1.31/lib/typhoeus.rb:3
from /usr/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:36:in `gem_original_require'
from /usr/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:36:in `require'
from (irb):2
Can somebody help me?
If you have Ruby 1.8.5, the RSTRING_PTR macro isn't defined. You can work around this by putting these in ext/typhoeus/native.h (maybe someone should wrap them in #ifndef RSTRING_PTR):
#define RSTRING_PTR(s) (RSTRING(s)->ptr)
#define RSTRING_LEN(s) (RSTRING(s)->len)
FYI: Centos 5 comes with Ruby 1.8.5, which is why I'm using that version.
As I understand it, using Easy/Multi should generally not be done. The README doesn't mention anything about it, though (actually, it includes examples of their use!).
bblimke is looking into adding Typhoeus support to WebMock (probably through the same stubbing integration points VCR uses), but is concerned that it only works for Hydra. It'd be good to clarify the typhoeus maintainer's recommendation for Easy/Multi.
Support for multipart_form_post would be cool. I'd love to replace all my curb usage with Typhoeus.
Hi,
Could you explain how to specify options like CURLOPT_SSLCERT in an Easy request?
I really don't know how to pass custom libcurl options.
Thanks !
What license are you using? Maybe you need to add a file? MIT? GPL?
If I understand correctly, Typhoeus request TIMEOUT parameter is for the entire request, starting from the initial connection up to the end of the request.
I see that Curl has two timeout values, CURLOPT_CONNECTTIMEOUT and CURLOPT_TIMEOUT. If someone has time to expose CONNECT_TIMEOUT it would be great. My problem is that when the Typhoeus TIMEOUT param is set too low, large files get truncated, and when it is set too high, too much time is spent on unreachable or problematic hosts.
I might even dig in the code and provide a diff later this week, but being totally new to gems and ruby in general it might take a while to get it right.
I tried using typhoeus to download some web pages, but some pages have issues and never return, so typhoeus doesn't return either. I could only find on_complete. How can I record the URLs that failed so that we can re-download them?
To be clear: is there some way, like an on_error callback, to log such failures?
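One way to approach this with only a completion callback is to inspect the response code inside it. The sketch below runs without Typhoeus: FakeResponse and record_failure are hypothetical stand-ins, and treating code 0 as "never completed" follows the reports earlier on this page rather than any documented guarantee:

```ruby
# Stand-in for a response object; only the two fields used here.
FakeResponse = Struct.new(:url, :code)

failed_urls = []

# Hypothetical completion handler: remember every URL that did not
# finish with a 2xx status so it can be queued again later.
record_failure = lambda do |response|
  failed_urls << response.url unless (200..299).cover?(response.code)
end

responses = [
  FakeResponse.new("http://example.com/ok", 200),
  FakeResponse.new("http://example.com/missing", 404),
  FakeResponse.new("http://example.com/timeout", 0)
]
responses.each(&record_failure)
```

The same block body could be passed to on_complete, with failed_urls re-queued once the first run finishes.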