Comments (7)
Sorry, this is a known issue caused by an incompatibility with recent versions of lxml. I just merged a pull request that catches the error (#5), so try reinstalling pdfquery from github trunk and see if that fixes it.
This is a temporary fix, and it means you can't use xpath_in_bbox as described in the docs. It would be good to have a solution that keeps that functionality with the new lxml. Another pull request (#3) might do it, but I haven't had time to play with it.
from pdfquery.
Hi, thanks for answering. How can I uninstall the version I installed? :) Sorry, I'm a Python newbie.
NVM, I got it with pip uninstall. I will try getting the code from github trunk and will let you know if I can run the example. Thanks.
from pdfquery.
Welcome! First, if you haven't yet, you want to get pip, the Python
package manager, working. See http://www.pip-installer.org . Then (if you have git
installed) you should be able to do something like:
pip uninstall pdfquery
pip install -e git+https://github.com/jcushman/pdfquery.git#egg=pdfquery
This uninstalls the package and installs from source. As far as I know, all
this is doing behind the scenes is adding and removing files to your
site-packages directory
(http://stackoverflow.com/questions/122327/how-do-i-find-the-location-of-my-python-site-packages-directory),
so in theory you could also do that directly.
(As you dig into python you might want to take a look at virtualenv, which
lets you keep a separate set of packages for each project you work on
instead of having them all jammed into site-packages. No need to complexify
it too much at this point though.)
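To see where packages land on a given machine, the install directories can be printed from Python itself. This is a minimal sketch using the standard-library sysconfig module; the helper name package_dirs is made up for illustration, and on Debian/Ubuntu systems pip may still install into a dist-packages directory that differs from what is reported here.

```python
import sysconfig


def package_dirs():
    """Return the directories Python reports for third-party packages."""
    paths = sysconfig.get_paths()
    # 'purelib' is for pure-Python packages, 'platlib' for compiled ones;
    # on many systems both point at the same directory.
    return sorted({paths["purelib"], paths["platlib"]})


if __name__ == "__main__":
    for directory in package_dirs():
        print(directory)
```

Comparing this output with the entries in sys.path shows whether the directory pip wrote to is actually searched at import time.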
from pdfquery.
Jcushman, I followed your instructions and installed pdfquery from the source code on github. Now when I try to run the sample code I get this:
Traceback (most recent call last):
  File "testePdfQuery.py", line 3, in <module>
    pdf = pdfquery.PDFQuery("examples/sample.pdf")
NameError: name 'pdfquery' is not defined
I get this error even though my file contains the import, "import pdfquery".
print sys.path gives me this:
['/home/ubuntu/Downloads', '/usr/local/lib/python2.7/dist-packages/pyquery-1.2.4-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/cssselect-0.8-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/roman-2.0.0-py2.7.egg', '/usr/local/lib/python2.7/dist-packages', '/home/ubuntu/src/pdfquery', '/usr/lib/python2.7', '/usr/lib/python2.7/plat-x86_64-linux-gnu', '/usr/lib/python2.7/lib-tk', '/usr/lib/python2.7/lib-old', '/usr/lib/python2.7/lib-dynload', '/usr/local/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages/PILcompat', '/usr/lib/python2.7/dist-packages/gtk-2.0', '/usr/lib/python2.7/dist-packages/ubuntu-sso-client', '/usr/lib/python2.7/dist-packages/ubuntuone-client', '/usr/lib/python2.7/dist-packages/ubuntuone-control-panel', '/usr/lib/python2.7/dist-packages/ubuntuone-storage-protocol']
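A NameError (rather than an ImportError) usually suggests that the import statement did not run before the name was used, so it can help to check separately whether Python can find the module at all. The sketch below is one way to do that; it assumes Python 3.4 or later (importlib.util.find_spec does not exist on the Python 2.7 used in this thread), and the helper name locate is made up for illustration.

```python
import importlib.util


def locate(module_name):
    """Return the file a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        # Nothing on sys.path provides this module.
        return None
    # For ordinary modules and packages, origin is the path of the source file.
    return spec.origin


if __name__ == "__main__":
    # 'pdfquery' is the module from this thread; it prints None if not installed.
    print(locate("pdfquery"))
```

If this prints a path, the package is importable and the problem is more likely in the script itself, for example an import placed after the first use of the name.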
from pdfquery.
Hmm, that one's tough to diagnose from here. pip I think would probably put
the files in /usr/lib/python2.7/dist-packages, so you would end up with
/usr/lib/python2.7/dist-packages/pdfquery/pdfquery.py and it would import
from there. If it installed as a .egg (really a zip file), you might try
unzipping it.
from pdfquery.
Er, /usr/local/lib/python2.7/dist-packages, I meant.
from pdfquery.
Right, I checked how pip installed it. It is installed as "pdfquery.egg-link". I've tried to unzip it, but no success :(
root@ubuntu-DQ77PRO:/usr/local/lib/python2.7/dist-packages# unzip pdfquery.egg-link
Archive: pdfquery.egg-link
End-of-central-directory signature not found. Either this file is not
a zipfile, or it constitutes one disk of a multi-part archive. In the
latter case the central directory and zipfile comment will be found on
the last disk(s) of this archive.
unzip: cannot find zipfile directory in one of pdfquery.egg-link or
pdfquery.egg-link.zip, and cannot find pdfquery.egg-link.ZIP, period.
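For context, a .egg-link file is not a zip archive. It is a small text file that pip install -e (setuptools "develop" mode) writes, and its first line is the path of the source checkout, which is why unzip fails on it. The sketch below reads that path; the helper names and the demo file contents are assumptions for illustration, with the path borrowed from the sys.path output earlier in the thread.

```python
import os
import tempfile


def read_egg_link(path):
    """Return the first non-empty line of an .egg-link file (the source dir)."""
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line:
                return line
    return None


def demo():
    """Write a stand-in .egg-link file and read it back (contents are assumed)."""
    folder = tempfile.mkdtemp()
    link = os.path.join(folder, "pdfquery.egg-link")
    with open(link, "w") as handle:
        handle.write("/home/ubuntu/src/pdfquery\n.")
    return read_egg_link(link)


if __name__ == "__main__":
    print(demo())
```

On the real system, pointing read_egg_link at /usr/local/lib/python2.7/dist-packages/pdfquery.egg-link should show which directory the editable install refers to.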
from pdfquery.