Comments (12)
Has running out of resources actually been an issue for you, or is this more of an academic concern?
I use my Raspberry Pi 4 because my ScanSnap has no WebDAV or FTP feature, and the Pi runs out of resources very quickly.
@rocketraman Can you please add `sem` as an additional requirement in the README? The lack of this information cost me some time debugging the bottleneck, until I found this closed ticket describing the "if sem is installed" code insertion that solves the problem :)
from sane-scan-pdf.
Yup, that was the intended behavior to parallelize the processing. Has running out of resources actually been an issue for you, or is this more of an academic concern? I find it difficult to believe a scanner could scan pages fast enough to cause a problem.
Yeah. We have a Fujitsu that can scan up to 60 PPM. I was doing some testing on a laptop with the scanner on duplex, producing ~78 pages, and it would spawn an absurd number of `tesseract` processes. They consumed two-thirds of the laptop's 16 GB of RAM and kept the CPU pegged at 100%, with every `tesseract` process working at a crawl.
Nice scanner :-) Ok, good thing to fix.
@jarrodsfarrell Probably the easiest way I've found to do this is to use `sem` from the GNU parallel project, but it will introduce another (optional) dependency. It's widely available, so I don't have a problem with adding this, but would that work for your situation?
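For anyone reading along, here is a rough sketch of the idea, not the actual change in the pull request. It assumes `sem` runs each job in the background and blocks once the job limit is reached. The names `SEM_ID`, `MAX_JOBS`, `run_limited`, and `wait_all` are hypothetical, and `touch` stands in for the per-page OCR command. When `sem` is not installed, the sketch falls back to plain unlimited background jobs.

```shell
#!/usr/bin/env bash
# Sketch only: cap concurrent per-page jobs with sem (GNU parallel).
# SEM_ID, MAX_JOBS, run_limited and wait_all are hypothetical names.

SEM_ID="ocr-demo-$$"     # semaphore name, unique to this run
MAX_JOBS=2               # assumed limit; tune for the machine
OUT_DIR="$(mktemp -d)"   # scratch directory for the demo

run_limited() {
  if command -v sem >/dev/null 2>&1; then
    # Blocks while MAX_JOBS jobs are running, then starts the job in the background.
    sem --id "$SEM_ID" -j "$MAX_JOBS" "$@"
  else
    # Fallback without sem: unlimited background jobs.
    "$@" &
  fi
}

wait_all() {
  if command -v sem >/dev/null 2>&1; then
    sem --id "$SEM_ID" --wait   # wait for every job queued under this id
  else
    wait                        # wait for all background children
  fi
}

# Queue one job per "page"; touch is a stand-in for the real OCR step.
for i in 1 2 3 4 5; do
  run_limited touch "$OUT_DIR/page-$i.done"
done
wait_all
```

With a limit in place, a fast scanner can keep queueing pages while only `MAX_JOBS` of them are processed at any one time.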
Having looked at the project's man page, it seems perfectly fine to use, and having another dependency is a non-issue.
@jarrodsfarrell Can you grab the changes in pull #5 and see if that solves your problem? If it works for you, I'll merge it.
Unfortunately we no longer have the 60 PPM scanner, so I'm testing with a 25 PPM model instead.
Regardless, using `sem` seems like a good change overall. I think it even lets the OCR step run a bit faster than starting all the `tesseract` processes at once (less task-switching?), and the pauses between scans are noticeably shorter (the scan process doesn't have to fight as much for resources?). An additional bonus is that the console output keeps moving, which is a good indicator that work is still being done, instead of staying still until the `tesseract` processes begin quitting.
Anyways, should the last argument be erroring like this?
USER@HOST:~/Workspace/sane-scan-pdf$ ./scan -d -m color --crop --deskew --ocr out.pdf
Unknown argument: out.pdf
Never mind. It'd help if I read the documentation.
Thanks for reporting and testing. I'll merge this.
@rocketraman Can you please add `sem` as an additional requirement in the README? The lack of this information cost me some time debugging the bottleneck, until I found this closed ticket describing the "if sem is installed" code insertion that solves the problem :)
It's already listed under optional requirements, but perhaps this issue deserves a more extensive call out.
@MoD01 I added an explanatory line in features for future people in your situation...