Comments (6)

joseavegaa commented on May 24, 2024

Ok, after some testing I found that there is a compatibility issue between the latest version of Tesseract in conda-forge and tesserocr.

The versions on the left are the latest ones, and they fail with ImportError: DLL load failed while importing tesserocr: The specified module could not be found. Downgrading to the builds on the right avoids the error:

libarchive 3.7.2-h6f8411a_0 --> 3.6.2-h6f8411a_1
tesseract 5.3.2-hb328096_1 --> 5.3.2-hae9691c_0

If you use conda install tesseract=5.3.2=hae9691c_0 to specifically install that build, the issue is gone.

This is a temporary fix, and I am not sure whether the error comes from the Tesseract build in conda-forge or whether it is a problem with tesserocr itself.
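To confirm whether pinning the build actually clears the error, a small import check can be run inside the conda environment. This is only a sketch: the helper name check_import is made up for illustration, and it reports whatever message Python raises without trying to identify which DLL is missing.

```python
import importlib


def check_import(module_name):
    """Try to import a module and report whether it loaded."""
    try:
        importlib.import_module(module_name)
    except ImportError as exc:
        # On Windows, a missing dependent DLL surfaces here as
        # "DLL load failed while importing ...".
        return False, str(exc)
    return True, "ok"


if __name__ == "__main__":
    ok, detail = check_import("tesserocr")
    print("tesserocr import:", "succeeded" if ok else "failed: " + detail)
```

Running it once before and once after the pinned install shows whether the build change is what made the difference.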

from tesserocr.

sirfz commented on May 24, 2024

I guess you're installing tesserocr via conda-forge? Unfortunately, the tesserocr build there has been broken for a while (since 2 or 3 versions ago). The original maintainer isn't active on it and I'm not a conda user myself. If anyone wants to take up maintenance responsibilities, that would be great.

icanhasmath commented on May 24, 2024

I also found that libarchive is required to run Tesseract on Windows. I think that as long as you have a working libarchive on PATH (along with its dependencies; I had to add openssl as well), your tesseract.exe should work.

If you want to try our build of the Tesserocr stack you can pull it from here. I've tested it on Linux and Windows using the Post Install steps.

With big thanks to @sirfz for that documentation.

zdenop commented on May 24, 2024

libarchive and curl (which needs openssl) are not needed; they are optional dependencies of tesseract.
libarchive can be used for compressed traineddata, but hardly anybody uses it. People prefer speed over saving space.
curl is used by the tesseract executable, which tesserocr does not wrap, to open online images.
Both features (saving space and reading online images) can be replaced by native Python functions, so adding them as dependencies of tesserocr makes no sense.
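As a rough illustration of the native Python replacement mentioned above, here is a sketch that uses only the standard library. The function names are hypothetical, and it assumes the traineddata file was compressed with gzip, which is only one of the formats one might choose.

```python
import gzip
import shutil
import urllib.request
from pathlib import Path


def fetch_image_bytes(url, timeout=30):
    """Download an image with the standard library instead of curl."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def unpack_traineddata(compressed_path, output_path):
    """Decompress a gzip-compressed traineddata file to a plain file."""
    with gzip.open(compressed_path, "rb") as src, open(output_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return Path(output_path)
```

The downloaded bytes can then be opened as an image in Python and passed to tesserocr, and the unpacked traineddata file can be placed in the tessdata directory as usual.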

icanhasmath commented on May 24, 2024

Tesseract has a build option, DISABLE_ARCHIVE, that disables libarchive; it is set to off by default. If libarchive is not present at build time, the build doesn't throw an error, but tesseract.exe still expects the dependency to be available at start-up. I will try a rebuild without these dependencies.

icanhasmath commented on May 24, 2024

Confirmed that these settings removed the need to ship extra dependencies.

For CMake:

DISABLE_ARCHIVE=ON
DISABLE_CURL=ON

For Autotools:

--without-archive
--without-curl
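
For anyone scripting the rebuild, the settings above could be assembled into configure commands roughly like this. It is a sketch only: the function names and the source and build directory arguments are placeholders, and the option names are taken from this comment, passed to CMake with its usual -D syntax.

```python
def cmake_configure_args(source_dir, build_dir):
    """Build a CMake configure command with libarchive and curl disabled."""
    return [
        "cmake", "-S", source_dir, "-B", build_dir,
        "-DDISABLE_ARCHIVE=ON",
        "-DDISABLE_CURL=ON",
    ]


def autotools_configure_args():
    """Build the equivalent Autotools configure command."""
    return ["./configure", "--without-archive", "--without-curl"]
```

Either list can be handed to subprocess.run(args, check=True) from the Tesseract source tree.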
