kenpratt / wikipedia-client
Ruby client for the Wikipedia API
Home Page: http://github.com/kenpratt/wikipedia-client
License: MIT License
When running:
require 'wikipedia'
Wikipedia.find('? (Lost)')
Results in this runtime error:
..../gems/wikipedia-client-1.15.0/lib/wikipedia/page.rb:12:in `page': undefined method `[]' for nil:NilClass (NoMethodError)
Currently this is a read-only client. Are login and edit features being considered for the future?
Latest builds show a regular failure:
Wikipedia::Client.find page (Edsger_Dijkstra)
should collect the image urls (FAILED - 1)
The URL list probably just needs to be updated.
Ideally, HTTP requests made in tests should be stubbed, so the code doesn't need to keep getting updated when someone makes a change on Wikipedia. But that's a bigger task, and probably merits a different issue.
$ ruby -v
ruby 2.7.0p0 (2019-12-25 revision 647ee6f091) [x86_64-darwin19]
$ rails -v
Rails 6.0.3.2
$ gem list | grep wiki
wikipedia (0.2)
wikipedia-client (1.10.0)
Loading rails console
throws this error:
Traceback (most recent call last):
bin/rails: Bootsnap::LoadPathCache::FallbackScan
47: from bin/rails:4:in `<main>'
46: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/activesupport-6.0.3.2/lib/active_support/dependencies.rb:324:in `require'
45: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/activesupport-6.0.3.2/lib/active_support/dependencies.rb:291:in `load_dependency'
44: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/activesupport-6.0.3.2/lib/active_support/dependencies.rb:324:in `block in require'
43: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/bootsnap-1.4.5/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:30:in `require'
42: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/bootsnap-1.4.5/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:21:in `require_with_bootsnap_lfi'
41: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/bootsnap-1.4.5/lib/bootsnap/load_path_cache/loaded_features_index.rb:92:in `register'
40: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/bootsnap-1.4.5/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:22:in `block in require_with_bootsnap_lfi'
39: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/bootsnap-1.4.5/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:22:in `require'
38: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/railties-6.0.3.2/lib/rails/commands.rb:18:in `<main>'
37: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/railties-6.0.3.2/lib/rails/command.rb:46:in `invoke'
36: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/railties-6.0.3.2/lib/rails/command/base.rb:69:in `perform'
35: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/thor-0.20.3/lib/thor.rb:387:in `dispatch'
34: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/thor-0.20.3/lib/thor/invocation.rb:126:in `invoke_command'
33: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/thor-0.20.3/lib/thor/command.rb:27:in `run'
32: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/railties-6.0.3.2/lib/rails/commands/console/console_command.rb:101:in `perform'
31: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/railties-6.0.3.2/lib/rails/command/actions.rb:14:in `require_application_and_environment!'
30: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/railties-6.0.3.2/lib/rails/command/actions.rb:22:in `require_application!'
29: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/activesupport-6.0.3.2/lib/active_support/dependencies.rb:324:in `require'
28: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/activesupport-6.0.3.2/lib/active_support/dependencies.rb:291:in `load_dependency'
27: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/activesupport-6.0.3.2/lib/active_support/dependencies.rb:324:in `block in require'
26: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/bootsnap-1.4.5/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:30:in `require'
25: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/bootsnap-1.4.5/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:21:in `require_with_bootsnap_lfi'
24: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/bootsnap-1.4.5/lib/bootsnap/load_path_cache/loaded_features_index.rb:92:in `register'
23: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/bootsnap-1.4.5/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:22:in `block in require_with_bootsnap_lfi'
22: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/bootsnap-1.4.5/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:22:in `require'
21: from /Users/helix/learn/config/application.rb:7:in `<main>'
20: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/2.7.0/bundler.rb:174:in `require'
19: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/2.7.0/bundler/runtime.rb:58:in `require'
18: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/2.7.0/bundler/runtime.rb:58:in `each'
17: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/2.7.0/bundler/runtime.rb:69:in `block in require'
16: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/2.7.0/bundler/runtime.rb:69:in `each'
15: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/2.7.0/bundler/runtime.rb:74:in `block (2 levels) in require'
14: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/bootsnap-1.4.5/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:30:in `require'
13: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/bootsnap-1.4.5/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:21:in `require_with_bootsnap_lfi'
12: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/bootsnap-1.4.5/lib/bootsnap/load_path_cache/loaded_features_index.rb:92:in `register'
11: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/bootsnap-1.4.5/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:22:in `block in require_with_bootsnap_lfi'
10: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/bootsnap-1.4.5/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:22:in `require'
9: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/wikipedia-0.2/lib/wikipedia.rb:9:in `<main>'
8: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/activesupport-6.0.3.2/lib/active_support/dependencies.rb:324:in `require'
7: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/activesupport-6.0.3.2/lib/active_support/dependencies.rb:291:in `load_dependency'
6: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/activesupport-6.0.3.2/lib/active_support/dependencies.rb:324:in `block in require'
5: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/bootsnap-1.4.5/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:26:in `require'
4: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/bootsnap-1.4.5/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:40:in `rescue in require'
3: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/bootsnap-1.4.5/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:21:in `require_with_bootsnap_lfi'
2: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/bootsnap-1.4.5/lib/bootsnap/load_path_cache/loaded_features_index.rb:89:in `register'
1: from /Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/bootsnap-1.4.5/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:22:in `block in require_with_bootsnap_lfi'
/Users/helix/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/bootsnap-1.4.5/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:22:in `require': cannot load such file -- htmlentities (LoadError)
Page.text was added in this commit on May 3, 2014: e3142c1
Yet the 1.4.0 gem pushed to rubygems on June 25, 2015 doesn't include it.
Is there a reason for this?
I'm trying to search using a looser set of keywords without knowing the exact title (I'm sure I'm far from alone and this would be the standard case).
I might even be very close, but I'm not returning the desired article (which exists) purely because I'm not matching the title exactly - this can literally be a case of wrong word order or use of brackets in the title.
Can the API return the most likely search result (e.g. how the web page shows the search results) too?
travis-ci.org will be shut down soon. I suggest moving to GitHub Actions. If there is a reason to stay with Travis, the account must be migrated to travis-ci.com.
If the move to Github Actions is a possibility, I don't mind implementing it.
`<main>': undefined method `find' for Wikipedia:Module (NoMethodError)
I'm trying to duplicate the example in the docs, but get a NoMethodError for find.
My code is simple.
re-pro:
Wikipedia.find( 'Getting Things Done' ).links
output:
["Automatic transmission", "Ben Hammersley", "Business", "Cult following", "David Allen (author)", "Digital object identifier", "Distraction", "Distributed cognition", "Extended mind", "Francis Heylighen"]
expected:
should return the complete list of links instead of only the first 10.
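One possible explanation is that the MediaWiki API returns links in batches (the `pllimit` parameter defaults to 10) and signals further batches through a `continue` object in the response. The following is a minimal sketch of following that continuation, not the gem's implementation: `links_query`, `all_links`, and the injected `fetch` callable are hypothetical names, and the canned responses are illustrative data so the example runs without network access.

```ruby
require 'uri'

# Base parameters for a links query; 'max' asks for the largest allowed batch.
BASE_PARAMS = {
  'action' => 'query', 'format' => 'json', 'prop' => 'links', 'pllimit' => 'max'
}.freeze

# Builds the query string for one batch. `continuation` is the 'continue'
# hash from the previous response (empty for the first request).
def links_query(title, continuation = {})
  URI.encode_www_form(BASE_PARAMS.merge('titles' => title).merge(continuation))
end

# Collects link titles across all batches. `fetch` is a hypothetical callable
# that takes a query string and returns the parsed JSON response as a Hash.
def all_links(title, fetch)
  links = []
  continuation = {}
  loop do
    response = fetch.call(links_query(title, continuation))
    pages = response.dig('query', 'pages') || {}
    pages.each_value do |page|
      links.concat((page['links'] || []).map { |link| link['title'] })
    end
    continuation = response['continue']
    break if continuation.nil? # no 'continue' key means this was the last batch
  end
  links
end

# Two canned batches standing in for real API responses (illustrative only).
batches = [
  { 'continue' => { 'plcontinue' => '1|0|B', 'continue' => '||' },
    'query' => { 'pages' => { '1' => { 'links' => [{ 'title' => 'Automatic transmission' }] } } } },
  { 'query' => { 'pages' => { '1' => { 'links' => [{ 'title' => 'Ben Hammersley' }] } } } }
]
fake_fetch = ->(_query) { batches.shift }
puts all_links('Getting Things Done', fake_fetch).inspect
```

Injecting the fetcher keeps the paging logic separate from the HTTP call, so the loop can be exercised with fixtures.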
This was working great up until this morning. Suddenly, I'm getting a RuntimeError of redirection forbidden.
RuntimeError: redirection forbidden:
http://en.wikipedia.org/w/api.php?action=query&format=json&prop=revisions%7Clinks%7Cextlinks%7Cimages%7Ccategories%7Ccoordinates%7Ctemplates&rvprop=content&inprop=url&titles=Getting%20Things%20Done ->
https://en.wikipedia.org/w/api.php?action=query&format=json&prop=revisions%7Clinks%7Cextlinks%7Cimages%7Ccategories%7Ccoordinates%7Ctemplates&rvprop=content&inprop=url&titles=Getting%20Things%20Done
Tried open-uri-redirections (https://github.com/open-uri-redirections/open_uri_redirections) but I couldn't get it working. Anyone know why this suddenly appeared and what a quick solution could be?
Just added this gem to my Gemfile and received the error "[!] There was an error parsing 'Gemfile': Undefined local variable or method 'wikipedia' for Gemfile. Bundler cannot continue." when trying to run bundle install. I installed the gem in isolation and it seemed to work once, but now this error comes up whenever requiring the gem inside the console too.
Since Wikipedia changed the Page API, the wikipedia 1.0.0 gem on rubygems.org is broken. Would you mind pushing a new release? Thanks!
It seems the entire side Wikipedia Infobox table is ignored when accessing the page.sanitized_content.
How can I pull only a company logo or a person's profile photo from the Infobox?
The rest of the Infobox content would be nice to have as well.
It is good to have tests for all corrected issues, so it is possible to ensure that they will not occur in the future.
Currently PR #46 contains code but no tests.
I need to run multiple clients with different configurations. As I understand the current implementation, the configuration is a "global" singleton class. Hence, each instance of Client uses the same instance of Configuration.
I tried to address that with #99
What do you think?
Using 1.5.0 of the gem, I get the following:
?> p=Wikipedia.find('Milk Duds')
OpenSSL::SSL::SSLError: SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed
from /Users/me/.rvm/rubies/ruby-2.1.3/lib/ruby/2.1.0/net/http.rb:920:in `connect'
from /Users/me/.rvm/rubies/ruby-2.1.3/lib/ruby/2.1.0/net/http.rb:920:in `block in connect'
from /Users/me/.rvm/rubies/ruby-2.1.3/lib/ruby/2.1.0/timeout.rb:76:in `timeout'
from /Users/me/.rvm/rubies/ruby-2.1.3/lib/ruby/2.1.0/net/http.rb:920:in `connect'
from /Users/me/.rvm/rubies/ruby-2.1.3/lib/ruby/2.1.0/net/http.rb:863:in `do_start'
from /Users/me/.rvm/rubies/ruby-2.1.3/lib/ruby/2.1.0/net/http.rb:852:in `start'
from /Users/me/.rvm/rubies/ruby-2.1.3/lib/ruby/2.1.0/open-uri.rb:313:in `open_http'
from /Users/me/.rvm/rubies/ruby-2.1.3/lib/ruby/2.1.0/open-uri.rb:724:in `buffer_open'
from /Users/me/.rvm/rubies/ruby-2.1.3/lib/ruby/2.1.0/open-uri.rb:210:in `block in open_loop'
from /Users/me/.rvm/rubies/ruby-2.1.3/lib/ruby/2.1.0/open-uri.rb:208:in `catch'
from /Users/me/.rvm/rubies/ruby-2.1.3/lib/ruby/2.1.0/open-uri.rb:208:in `open_loop'
from /Users/me/.rvm/rubies/ruby-2.1.3/lib/ruby/2.1.0/open-uri.rb:149:in `open_uri'
from /Users/me/.rvm/rubies/ruby-2.1.3/lib/ruby/2.1.0/open-uri.rb:704:in `open'
from /Users/me/.rvm/rubies/ruby-2.1.3/lib/ruby/2.1.0/open-uri.rb:712:in `read'
from /Users/me/.rvm/gems/ruby-2.1.3/gems/wikipedia-client-1.5.0/lib/wikipedia/client.rb:67:in `request'
from /Users/me/.rvm/gems/ruby-2.1.3/gems/wikipedia-client-1.5.0/lib/wikipedia/client.rb:35:in `request_page'
This gem clearly relies on the htmlentities and hpricot gems, but this is neither documented nor declared in the gemspec. This causes unexpected errors when following the example for the first time. By adding both of those gems to my Gemfile I was able to use this gem, but they really should be installed automatically when you run bundle install.
My ruby-fu is weak, but this line suggests the gem is just reporting as Ruby/version.
This is not compliant with our API etiquette guidelines or expectations. Please provide an informative default user agent that contains a URL to this project, and (for additional cookies) offer users the ability to append their own agent for their project-specific querying.
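A minimal sketch of what such a user agent could look like, assuming a hypothetical `build_user_agent` helper and placeholder version strings (neither is taken from the gem). The request object is only constructed, not sent, so the example runs offline.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: combines a library identifier, the project URL, and an
# optional caller-supplied suffix into one User-Agent string.
def build_user_agent(library_version, app_suffix = nil)
  base = "wikipedia-client/#{library_version} (https://github.com/kenpratt/wikipedia-client)"
  [base, app_suffix].compact.join(' ')
end

uri = URI('https://en.wikipedia.org/w/api.php?action=query&format=json&titles=London')

# '1.0.0' and 'my-app/0.1' are placeholder values for illustration.
request = Net::HTTP::Get.new(uri, 'User-Agent' => build_user_agent('1.0.0', 'my-app/0.1'))
puts request['User-Agent']
```

Keeping the suffix optional lets the default stay informative while projects append their own identifier.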
Hi,
https://github.com/kenpratt/wikipedia-client/blob/master/Gemfile#L8
I'm installing this gem on a Rails 6 project with the rubocop-shopify gem.
This gem seems to conflict with the rubocop-shopify gem, but I can't work out exactly what is causing this, since you only have it defined in the Gemfile and not in the gemspec.
Has anyone else bumped into this?
I'm getting a segmentation fault when trying to run bundle exec rails c or any time I have to load the Rails environment locally.
If I remove the rubocop-shopify gem, it just works.
Any ideas? Thanks!
[BUG] Segmentation fault at 0x0000000000000028
ruby 3.0.0p0 (2020-12-25 revision 95aff21468) [arm64-darwin20]
gem 'rubocop', '= 0.48.1' # wikipedia-client gem dependency
rubocop (~> 1.22) # my rails 6 project dependency
for example
is it possible?
For example Wikipedia.find('https://en.wikipedia.org/wiki/List_of_wars_before_1000').links #=> ["1st millennium", "1st millennium BC", "464 BC Sparta earthquake", "Abbasid Caliphate", "Abbasid Revolution", "Abd Allah ibn al-Zubayr", "Abi-Eshuh", "Acarnania", "Achaea (ancient region)", "Achaean League"]
but there are probably hundreds of links on this page.
The following methods in page.rb have no test coverage:
editurl
extlinks
image_descriptionurl
image_urls
image_descriptionurls
coordinates
raw_data
image_metadata
templates
The following methods in wikipedia.rb have no test coverage:
find_image
find_image
Full coverage report, using SimpleCov:
wikipedia-client_coverage.zip
I can take this task up myself.
The gem seems to use deprecated and obsolete method calls on the URI module.
Run this code in Ruby 2.7.3 or Ruby 3.0.1
require 'wikipedia'
page = Wikipedia.find 'Getting Things Done'
Ruby 2.7.3 - URI.escape is obsolete (warning)
.../gems/2.7.0/gems/wikipedia-client-1.12.0/lib/wikipedia/client.rb:109: warning: URI.escape is obsolete
Ruby 3.0.1 - undefined method encode for URI:Module (NoMethodError) (fatal)
.../gems/3.0.0/gems/wikipedia-client-1.12.0/lib/wikipedia/client.rb:109:in `encode': undefined method `encode' for URI:Module (NoMethodError)
I see this was mentioned in b9e4b4a but then reverted in 0adc083, which I believe was the right call, since CGI.escape is not a 1-to-1 replacement. The problem still needs fixing, though.
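To illustrate why CGI.escape is not a drop-in replacement, here is a small comparison of how three standard-library helpers encode the same title. This is a sketch of the difference only; which helper the gem should adopt is not something the commits above settle.

```ruby
require 'cgi'
require 'erb'
require 'uri'

title = 'Getting Things Done'

# CGI.escape targets form encoding, so a space becomes "+".
puts CGI.escape(title)                     # Getting+Things+Done

# URI.encode_www_form_component follows the same form-encoding rule.
puts URI.encode_www_form_component(title)  # Getting+Things+Done

# ERB::Util.url_encode percent-encodes the space as "%20", matching the
# "%20" that appears in the request URLs quoted in the redirection report.
puts ERB::Util.url_encode(title)           # Getting%20Things%20Done
```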
How to access a random article from Wikipedia?
Thanks!
A method for returning the summary of a Wikipedia article is badly needed.
There might also be more file extensions that do not work, which, presumably, at least include any image file types that cannot be displayed in the browser.
For sure the method works with .jpg, .png, and .svg.
A potential "solution":
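# TIFF originals are served as lossy JPEG thumbnails; request the 800px rendition.
if page['thumbnail'] && page['thumbnail']['source'] =~ /\.tif\.jpg/i
  page['thumbnail']['source'].sub(/lossy-page1-\d+px/, 'lossy-page1-800px')
# Otherwise strip the /thumb segment and the trailing size component.
elsif page['thumbnail']
  page['thumbnail']['source'].sub(%r{/thumb}, '').sub(%r{/[^/]*$}, '')
end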
However this would always be the link to the image at 800px, which might not always work...
I get this annoying error (using Ruby 2.0.0 from RubyInstaller under Windows 7):
"SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed (OpenSSL::SSL::SSLError)". There are lots of suggestions for how to fix this, but I could find none that worked for wikipedia-client. Any hints?
I installed and ran rubocop on the codebase:
> 19 files inspected, 351 offenses detected
The full results are here.
Many of these are style decisions rather than offenses, and can be filtered out by using the right preferences in a .rubocop.yml file. However, some are genuine issues and could use some work.
I would be happy to take this up myself (if you see the merit in it), but I'll need occasional help and reviews from more experienced contributors whenever it comes to a style decision.
Hi.
Sometimes it's producing "can't convert String into Integer"
wikipedia-client-1.3.0/lib/wikipedia/page.rb, line 10
Rather a feature request than an issue. If you call the Wikipedia API through
https://en.wikipedia.org/w/api.php?action=query&titles=London&prop=pageimages&format=json&pilimit=5&pithumbsize=800
you can get a thumbnail of the main image. In that case, either the width or the height of the image is 800px.
Thumbnails can be very useful for images that are too large, e.g. the main image of Munich. The following code only returns the original image from Wikipedia, which is several MB in size:
wikipedia_city_page = Wikipedia.find("München")
@image_url = wikipedia_city_page.main_image_url
=> Retrieved image URL of Munich
It would be really great if we could also get a thumbnail of that image through the wikipedia gem.
Thanks a lot!
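As a sketch of the request described above, the query URL can be assembled from the same parameters that appear in the example URL. The `thumbnail_query_url` helper is a hypothetical name, not part of the gem, and the example only builds the URL without sending it.

```ruby
require 'uri'

# Hypothetical helper: builds a pageimages query like the one quoted above.
# `size` maps to pithumbsize and `limit` to pilimit.
def thumbnail_query_url(title, size: 800, limit: 5)
  query = URI.encode_www_form({
    'action' => 'query', 'titles' => title, 'prop' => 'pageimages',
    'format' => 'json', 'pilimit' => limit, 'pithumbsize' => size
  })
  URI::HTTPS.build(host: 'en.wikipedia.org', path: '/w/api.php', query: query).to_s
end

puts thumbnail_query_url('London')
puts thumbnail_query_url('München', size: 400)
```

Non-ASCII titles such as "München" are percent-encoded by `URI.encode_www_form`, so they can be passed in as-is.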
Does it return all the href values of the a tags in the page? It would be nice if the docs were a little clearer.
For example the page on Visual J++ : https://en.wikipedia.org/wiki/Visual_J%2B%2B.
Not really sure what's going on here...
Our Travis build never fails! Even when the tests do!
It seems bundle exec rake always exits with code 0, no matter what.
You can see this behaviour here, where a build passes, even though one test has obviously failed.
The command "bundle exec rake" exited with 0.
I also tried it locally.
wikipedia-client (development) > bundle exec rake spec
...
<lots of failure>
...
Finished in 0.00658 seconds
21 examples, 1 failure, 20 pending
Failed examples:
rspec ./spec/lib/client_spec.rb:65 # Wikipedia::Client.find page with one section (mocked) should have the correct sanitized intro
wikipedia-client (development) > echo $?
0 #=> Exit code is zero, even though tests failed
Will investigate this further. It might be an rspec version issue, as ours is quite old, and I'm seeing mentions of such errors here:
Since the suite tests against the real Wikipedia, the tests break every time an image related to "Edsger Dijkstra" is changed. What should the approach be in this case?