
Comments (12)

MoD01 commented on June 27, 2024

> Has running out of resources actually been an issue for you, or is this more of an academic concern?

I use a Raspberry Pi 4 because my ScanSnap has no WebDAV or FTP feature. The Pi's resources run out very quickly.

@rocketraman Can you please add sem as an additional requirement in the README? The lack of this information cost me some time debugging the bottleneck, until I found this closed ticket telling me that installing sem solves the problem :)

from sane-scan-pdf.

rocketraman commented on June 27, 2024

Yup, that was the intended behavior to parallelize the processing. Has running out of resources actually been an issue for you, or is this more of an academic concern? I find it difficult to believe a scanner could scan pages fast enough to cause a problem.

jarrodsfarrell commented on June 27, 2024

Yeah. We have a Fujitsu that can scan up to 60 PPM. I was doing some testing on a laptop with the scanner on duplex, producing ~78 pages, and it would spawn an absurd number of tesseract processes that consumed two thirds of the laptop's 16 GB of RAM and kept the CPU pegged at 100%, with all the tesseract processes working at a crawl.

rocketraman commented on June 27, 2024

Nice scanner :-) Ok, good thing to fix.

rocketraman commented on June 27, 2024

@jarrodsfarrell Probably the easiest way I've found to do this is to use sem from the GNU parallel project, but it will introduce another (optional) dependency. It's widely available, so I don't have a problem with adding this, but would that work for your situation?
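
For readers unfamiliar with the idea, sem acts as a counting semaphore for shell jobs: only a fixed number of jobs hold a slot at once, and the rest wait. The following is a minimal Python sketch of that throttling pattern, not the code in the scan script. The names `run_throttled` and `ocr_page`, the output file names, and the sleep standing in for a tesseract run are all assumptions made for illustration.

```python
import threading
import time


def run_throttled(pages, max_jobs):
    """Start one worker per page, but let at most max_jobs run at a time.

    Returns the (hypothetical) output names in page order and the highest
    number of workers that were observed running simultaneously.
    """
    pages = list(pages)
    slots = threading.BoundedSemaphore(max_jobs)  # the job slots
    lock = threading.Lock()                       # protects the shared state
    state = {"running": 0, "peak": 0}
    outputs = {}

    def ocr_page(page):
        # Block here until one of the max_jobs slots is free.
        with slots:
            with lock:
                state["running"] += 1
                state["peak"] = max(state["peak"], state["running"])
            time.sleep(0.02)  # stand-in for the real, CPU-heavy OCR work
            with lock:
                state["running"] -= 1
                outputs[page] = f"page-{page:03d}.pdf"

    # One thread per page, mirroring "one job per scanned page".
    threads = [threading.Thread(target=ocr_page, args=(p,)) for p in pages]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    return [outputs[p] for p in pages], state["peak"]


if __name__ == "__main__":
    results, peak = run_throttled(range(1, 9), max_jobs=2)
    print(len(results), "pages processed; peak concurrent jobs:", peak)
```

Every page still gets processed, but the number of simultaneous workers never exceeds `max_jobs`, which is what keeps memory and CPU use bounded when the scanner produces pages faster than OCR can finish them.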

jarrodsfarrell commented on June 27, 2024

Having taken a look at the project's man page, it seems perfectly fine to use, and having another dependency is a non-issue.

rocketraman commented on June 27, 2024

@jarrodsfarrell Can you grab the changes in pull #5 and see if that solves your problem? If it works for you, I'll merge it.

jarrodsfarrell commented on June 27, 2024

Unfortunately we no longer have the 60 PPM scanner, so I'm using a 25 PPM model instead.

Regardless, using sem seems like an overall good change. I think it's even letting the OCR step work a bit faster than running all the tesseract processes at once (less task-switching?), and pauses between scans are noticeably shorter (the scan process doesn't have to fight as much for resources?). An additional bonus is that the movement of the console is a good indicator that work is still being done, instead of it staying still until the tesseract processes begin quitting.

Anyways, should the last argument be erroring like this?

USER@HOST:~/Workspace/sane-scan-pdf$ ./scan -d -m color --crop --deskew --ocr out.pdf
Unknown argument: out.pdf

Never mind. It'd help if I read the documentation.

rocketraman commented on June 27, 2024

Thanks for reporting and testing. I'll merge this.

rocketraman commented on June 27, 2024

> @rocketraman Can you please add sem as an additional requirement in the README? The lack of this information cost me some time debugging the bottleneck, until I found this closed ticket telling me that installing sem solves the problem :)

It's already listed under optional requirements, but perhaps this issue deserves a more extensive call out.

rocketraman commented on June 27, 2024

@MoD01 I added an explanatory line in features for future people in your situation...
