Comments (7)
I'm afraid that there's no such API. Building such a capability on top of the existing PDFHummus library parsing abilities is possible, though.
But then, if Apache PDFBox has a text stripper, why not use that?
from pdf-writer.
Actually, my app is in C++. Calling Java from C++ works for me, but it is slower overall than calling native C++ code.
I meant the C++ lib. The Node.js module is just a shell around it.
PDFBox is Java-only, so it won't work.
I understand that the PDFHummus library is rich enough that we can write a text stripper function on top of it. I just wanted to check whether such a function already exists.
If I succeed in writing it, I'll post it in one of the forums.
Extracting text with hummus: http://pdfhummus.com/post/156548561656/extracting-text-from-pdf-files
@galkahana does this handle compression?
Now also in C++:
https://github.com/galkahana/pdf-text-extraction
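For readers wondering what "building it on top of the parser" involves, here is a minimal sketch of the innermost step only. It is not taken from the linked repository and does not use any PDFHummus API. It assumes you already have a page's content stream as a decoded string (compressed streams, e.g. FlateDecode, must be decompressed first), and it only collects the literal-string operands of the text-showing operators `Tj`, `TJ`, `'` and `"`. The function name `extractShownStrings` is hypothetical. Real extraction also needs font encodings / ToUnicode maps, hex strings and text positioning, none of which is handled here.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// PDF delimiter check (subset sufficient for this sketch).
static bool isDelimiter(char c) {
    return std::isspace(static_cast<unsigned char>(c)) || c == '(' || c == ')' ||
           c == '<' || c == '>' || c == '[' || c == ']' || c == '/' || c == '%';
}

// Hypothetical helper, NOT part of PDFHummus: returns the raw bytes of literal
// strings passed to Tj, TJ, ' and " in an already decoded content stream.
// Assumption: the bytes are returned as-is, with no font-encoding mapping.
std::vector<std::string> extractShownStrings(const std::string& content) {
    std::vector<std::string> shown;    // strings confirmed by a text operator
    std::vector<std::string> pending;  // string operands since the last operator
    const std::size_t n = content.size();
    std::size_t i = 0;
    while (i < n) {
        const char c = content[i];
        if (c == '(') {
            // Literal string: balanced parentheses plus backslash escapes.
            std::string s;
            int depth = 1;
            ++i;
            while (i < n && depth > 0) {
                const char d = content[i++];
                if (d == '\\' && i < n) {
                    // Simplification: keeps the escaped character itself, which is
                    // right for \( \) and \\ but does not decode \n or octal escapes.
                    s += content[i++];
                } else if (d == '(') {
                    ++depth;
                    s += d;
                } else if (d == ')') {
                    if (--depth > 0) s += d;
                } else {
                    s += d;
                }
            }
            pending.push_back(s);
        } else if (c == '<') {
            // Hex strings and dictionaries are skipped up to the first '>'.
            while (i < n && content[i] != '>') ++i;
            if (i < n) ++i;
        } else if (c == '/') {
            // Name object such as /F1: skip it so it is not read as an operator.
            ++i;
            while (i < n && !isDelimiter(content[i])) ++i;
        } else if (c == '%') {
            // Comment: skip to the end of the line.
            while (i < n && content[i] != '\n' && content[i] != '\r') ++i;
        } else if (std::isalpha(static_cast<unsigned char>(c)) || c == '\'' || c == '"') {
            // Operator token: read up to the next delimiter.
            std::string op;
            while (i < n && !isDelimiter(content[i])) op += content[i++];
            if (op == "Tj" || op == "TJ" || op == "'" || op == "\"") {
                shown.insert(shown.end(), pending.begin(), pending.end());
            }
            pending.clear();  // operands belong to exactly one operator
        } else {
            ++i;  // numbers, brackets and whitespace
        }
    }
    return shown;
}
```

Calling it on `BT /F1 12 Tf 72 700 Td (Hello) Tj ET` yields a single entry, `Hello`. Getting the decoded stream for each page is the part the library's parser would do for you.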