
Comments (33)

struCoder avatar struCoder commented on August 20, 2024

@tonydeng
You can download chi_sim or other language data from https://github.com/tesseract-ocr/tessdata
into your /usr/local/Cellar/tesseract/3.04.01_2/share/tessdata directory.
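
Before calling into tess4j it can help to confirm that the downloaded file really sits where you expect. The following is only a minimal sketch using the Java standard library; the class `TessdataCheck` and its methods are made up for illustration and are not part of tess4j, and the default path is simply the one mentioned above.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of tess4j.
final class TessdataCheck {
    /** Returns the expected location of <lang>.traineddata inside a tessdata directory. */
    static Path trainedDataFile(Path tessdataDir, String lang) {
        return tessdataDir.resolve(lang + ".traineddata");
    }

    /** True if the language file exists as a regular, readable file. */
    static boolean isInstalled(Path tessdataDir, String lang) {
        Path file = trainedDataFile(tessdataDir, lang);
        return Files.isRegularFile(file) && Files.isReadable(file);
    }

    public static void main(String[] args) {
        // Default path is the one from the comment above; pass your own as args[0].
        Path tessdata = Paths.get(args.length > 0 ? args[0]
                : "/usr/local/Cellar/tesseract/3.04.01_2/share/tessdata");
        String lang = args.length > 1 ? args[1] : "chi_sim";
        System.out.println(trainedDataFile(tessdata, lang)
                + (isInstalled(tessdata, lang) ? " found" : " missing"));
    }
}
```

A "found" result only tells you the file is present and readable; it does not prove that the native library can load it.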

from tess4j.

4F2E4A2E avatar 4F2E4A2E commented on August 20, 2024

Hey there!

Tesseract 3.04 is currently not supported. Please use one of the latest releases; more info is on the release page.

Getting it running under Linux is very easy. Although I have a Mac, I have never tried tess4j on Mac OS X.

I will give it a try and get back to you.
Please post the information if you find out more.

Best regards,
OJ

On 26.11.2015 at 06:12, fivesmallq wrote:

OS X El Capitan 10.11.1
JDK8_60
tess4j 2.0.1
tesseract 3.04.00

I installed tesseract from brew.

brew reinstall tesseract --all-languages --with-training-tools
The tessdata path is /usr/local/share/ and it contains chi_sim.traineddata.

But when I use tess4j to load chi_sim, I get:

Failed loading language 'chi_sim'
Tesseract couldn't load any languages!

A fatal error has been detected by the Java Runtime Environment:

SIGSEGV (0xb) at pc=0x000000012a54e933, pid=3139, tid=5891

JRE version: Java(TM) SE Runtime Environment (8.0_60-b27) (build 1.8.0_60-b27)

Java VM: Java HotSpot(TM) 64-Bit Server VM (25.60-b23 mixed mode bsd-amd64 compressed oops)

Problematic frame:

C [libtesseract.dylib+0x12933] tesseract::Tesseract::recog_all_words(PAGE_RES*, ETEXT_DESC*, TBOX const*, char const*, int)+0xb9

The JVM crashed. Here is the log: https://gist.github.com/fivesmallq/1f6d349c02e9bbab9b80

eng works fine.

Also, I cloned the tess4j project from GitHub, updated the JUnit test to set the language to chi_sim, and put chi_sim.traineddata in src/main/resources. The same problem appeared.

➜ tessdata git:(master) which tesseract
/usr/local/bin/tesseract
➜ tessdata git:(master) tesseract --list-langs
List of available languages (107):
afr
amh
ara
asm
aze
aze_cyrl
bel
ben
bod
bos
bul
cat
ceb
ces
chi_sim
chi_tra
chr
cym
Using tesseract from the command line works fine.
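
Since the command line and tess4j may end up looking in different data directories, it can be useful to capture the CLI's language list from Java and compare it with what the Java side is configured to load. This is a rough sketch using only the standard library; `ListLangs` is a hypothetical helper, and it assumes a `tesseract` binary is on the PATH. stderr is merged into stdout in case the installed version prints the list there.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of tess4j.
final class ListLangs {
    /** Parses the output of `tesseract --list-langs`, skipping the header line. */
    static List<String> parse(String output) {
        List<String> langs = new ArrayList<>();
        for (String line : output.split("\\R")) {
            String name = line.trim();
            if (name.isEmpty() || name.startsWith("List of available languages")) {
                continue;
            }
            langs.add(name);
        }
        return langs;
    }

    /** Runs a command and returns its combined stdout and stderr. */
    static String run(String... command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).redirectErrorStream(true).start();
        StringBuilder out = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                out.append(line).append('\n');
            }
        }
        process.waitFor();
        return out.toString();
    }

    public static void main(String[] args) throws InterruptedException {
        try {
            List<String> langs = parse(run("tesseract", "--list-langs"));
            System.out.println("chi_sim visible to the CLI: " + langs.contains("chi_sim"));
        } catch (IOException e) {
            // Typically means the tesseract binary is not on the PATH.
            System.out.println("could not run tesseract: " + e.getMessage());
        }
    }
}
```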

Does tess4j currently not support tesseract 3.04.00?

Thank you




fivesmallq avatar fivesmallq commented on August 20, 2024

@4F2E4A2E OK, thank you. I will try it with tesseract 3.03.00 on Ubuntu.


vovtz avatar vovtz commented on August 20, 2024

No need to use Ubuntu, because Tess4J works under OS X as well (at least under 10.10 - Yosemite). You do need to use the current release together with Tesseract 3.03 though.


4F2E4A2E avatar 4F2E4A2E commented on August 20, 2024

Good to know! :)
Did you install tesseract via brew?


vovtz avatar vovtz commented on August 20, 2024

No, I built it from source because (at the time?) version 3.03 was not available in the Homebrew repo.


4F2E4A2E avatar 4F2E4A2E commented on August 20, 2024

Nice thanks.


DavyLin avatar DavyLin commented on August 20, 2024

@4F2E4A2E Hello, I have the same error. Did you solve it?


4F2E4A2E avatar 4F2E4A2E commented on August 20, 2024

There was no error, only the datapath not being loaded correctly. Tell me more and I may be able to help you :)


fivesmallq avatar fivesmallq commented on August 20, 2024

tesseract 3.04.00
leptonica-1.72
libjpeg 8d (libjpeg-turbo 1.3.0) : libpng 1.2.50 : libtiff 4.0.3 : zlib 1.2.8

tess4j-2.0.1 works well on Ubuntu 14.04 and JDK 1.8.0_77


DavyLin avatar DavyLin commented on August 20, 2024

@4F2E4A2E
tesseract 3.0.2
leptonica-1.69
tess4j-1.5
jdk1.7.0_45
datapath=/usr/local/share/

But the error output is:
Failed loading language 'chi_sim'
Tesseract couldn't load any languages!

A fatal error has been detected by the Java Runtime Environment:

SIGSEGV (0xb) at pc=0x000000011c657a8a, pid=1578, tid=5891

JRE version: Java(TM) SE Runtime Environment (7.0_45-b18) (build 1.7.0_45-b18)
Java VM: Java HotSpot(TM) 64-Bit Server VM (24.45-b08 mixed mode bsd-amd64 compressed oops)
Problematic frame:
C [libtesseract.dylib+0xf3a8a] _ZN9tesseract8Classify18CharNormClassifierEP5TBLOBRK6DENORMP20INT_TEMPLATES_STRUCTP13ADAPT_RESULTS+0x7a

Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again

An error report file with more information is saved as:
/Volumes/HDD/work/ideaprojects/zhibird/zbOnline/hs_err_pid1578.log

If you would like to submit a bug report, please visit:
http://bugreport.sun.com/bugreport/crash.jsp
The crash happened outside the Java Virtual Machine in native code.
See problematic frame for where to report the bug.
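
When comparing several crash reports like this one, it may be handy to pull out just the problematic frame to see whether the crash is inside libtesseract. Below is a small sketch that scans an hs_err log for the line after "Problematic frame:". `HsErrFrame` is a hypothetical helper; it assumes the usual layout in which each line of that section may be prefixed with `#`, and it also accepts the prefix-free form pasted above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical helper for reading HotSpot hs_err_pid*.log files.
final class HsErrFrame {
    /** Returns the first non-empty line after "Problematic frame:", without the leading '#'. */
    static Optional<String> problematicFrame(String log) {
        String[] lines = log.split("\\R");
        for (int i = 0; i < lines.length; i++) {
            if (!strip(lines[i]).startsWith("Problematic frame:")) {
                continue;
            }
            for (int j = i + 1; j < lines.length; j++) {
                String candidate = strip(lines[j]);
                if (!candidate.isEmpty()) {
                    return Optional.of(candidate);
                }
            }
        }
        return Optional.empty();
    }

    /** Drops one leading '#' (if present) and surrounding whitespace. */
    private static String strip(String line) {
        return line.replaceFirst("^#", "").trim();
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java HsErrFrame.java <hs_err_pid*.log>");
            return;
        }
        String log = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        String frame = problematicFrame(log).orElse("(no problematic frame found)");
        System.out.println(frame);
        System.out.println("in libtesseract: " + frame.contains("libtesseract"));
    }
}
```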


4F2E4A2E avatar 4F2E4A2E commented on August 20, 2024

Tess4j 1.5? Really? Why don't you use the latest version?

In reply to davylin, 15.04.2016 at 16:47:



DavyLin avatar DavyLin commented on August 20, 2024

@4F2E4A2E I found the information on SourceForge (http://tess4j.sourceforge.net/changelog.html). I think Tess4j 1.5 is built for Tesseract 3.02. I will try the latest version, thank you!


aceyin avatar aceyin commented on August 20, 2024

@4F2E4A2E I got the same issue under Mac OS.
tesseract, tesseract-eng and tesseract-chi-sim were all installed via port; the tesseract version is 3.04.00_3.

The error messages are:

Failed loading language 'chi_sim'
Tesseract couldn't load any languages!

But if I switch the language back to "eng", it works fine ...

The libraries I used are:

<dependency>
    <groupId>net.sourceforge.tess4j</groupId>
    <artifactId>tess4j</artifactId>
    <version>3.2.1</version>
</dependency>
<dependency>
    <groupId>com.github.jai-imageio</groupId>
    <artifactId>jai-imageio-core</artifactId>
    <version>1.3.1</version>
</dependency>
<dependency>
    <groupId>net.java.dev.jna</groupId>
    <artifactId>jna</artifactId>
    <version>4.1.0</version>
</dependency>

Could you please take a look?


aceyin avatar aceyin commented on August 20, 2024

@fivesmallq @DavyLin Have you solved that issue?


fivesmallq avatar fivesmallq commented on August 20, 2024

@aceyin Using tesseract 3.04.01dev with tess4j-3.2.0 may solve this issue.

➜ ~ tesseract -v
tesseract 3.04.01dev
leptonica-1.72
libjpeg 8d : libpng 1.6.20 : libtiff 4.0.6 : zlib 1.2.5


aceyin avatar aceyin commented on August 20, 2024

@fivesmallq I am using tesseract 3.04, and this issue occurs in 3.04 as well ...
I am trying to run the demo on Ubuntu.


fivesmallq avatar fivesmallq commented on August 20, 2024

@aceyin
image

maven:

        <dependency>
            <groupId>net.sourceforge.tess4j</groupId>
            <artifactId>tess4j</artifactId>
            <version>2.0.1</version>
        </dependency>

➜ ~ tesseract -v
tesseract 3.04.01dev
leptonica-1.72
libjpeg 8d : libpng 1.6.20 : libtiff 4.0.6 : zlib 1.2.5


Tested again. tess4j 3.2.1 also works well.


4F2E4A2E avatar 4F2E4A2E commented on August 20, 2024

Hi guys! Did you double-check that the language pack is actually in the data folder you defined?


aceyin avatar aceyin commented on August 20, 2024

@fivesmallq @4F2E4A2E
Thanks guys, but it did not work on my Mac...
Here are the tesseract version and language package:
image


4F2E4A2E avatar 4F2E4A2E commented on August 20, 2024

Maybe you are using the wrong language pack files? https://github.com/tesseract-ocr/langdata/tree/master/chi_sim
If it works with the English language pack, then it should work with the rest.


aceyin avatar aceyin commented on August 20, 2024

@4F2E4A2E @fivesmallq
I tried running tesseract from the command line, and it worked.
test
So I think the problem is something like the path or the classpath.


4F2E4A2E avatar 4F2E4A2E commented on August 20, 2024

Can you confirm whether the English language pack is working?


4F2E4A2E avatar 4F2E4A2E commented on August 20, 2024

@aceyin ?


tonydeng avatar tonydeng commented on August 20, 2024

@aceyin
I have the same problem. May I ask whether you have solved it by now?


zcmgyu avatar zcmgyu commented on August 20, 2024

It also does not work for me on macOS 10.12.3.

I am using the latest Tess4j version:

<dependency>
    <groupId>net.sourceforge.tess4j</groupId>
    <artifactId>tess4j</artifactId>
    <version>3.2.1</version>
</dependency>

I installed tesseract-ocr from brew. Latest version: 3.04.01_2

brew install tesseract --all-languages

Tesseract instance = new Tesseract();
instance.setDatapath("/usr/local/Cellar/tesseract/3.04.01_2/share/tessdata");
instance.setLanguage("jpn");

Error:

Failed loading language 'jpn'
Tesseract couldn't load any languages!


nguyenq avatar nguyenq commented on August 20, 2024

Try with JNA 4.3.0 to see if it makes any difference.

And the path should be set to the parent directory of the tessdata directory:

instance.setDatapath("/usr/local/Cellar/tesseract/3.04.01_2/share/");
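
If you are unsure which form your configuration holds, the suggestion above can be applied programmatically. This is a minimal sketch following that advice for this setup; `Datapath.parentOfTessdata` is a made-up helper, not a tess4j API, and whether other Tess4J or Tesseract versions expect the tessdata directory itself is not something this thread settles.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of tess4j.
final class Datapath {
    /**
     * If the path points at a directory named "tessdata", returns its parent;
     * otherwise returns the normalized path unchanged.
     * A bare relative "tessdata" has no parent and is returned as-is.
     */
    static Path parentOfTessdata(Path path) {
        Path normalized = path.normalize();
        Path name = normalized.getFileName();
        if (name != null && name.toString().equals("tessdata")
                && normalized.getParent() != null) {
            return normalized.getParent();
        }
        return normalized;
    }

    public static void main(String[] args) {
        Path configured = Paths.get(args.length > 0 ? args[0]
                : "/usr/local/Cellar/tesseract/3.04.01_2/share/tessdata");
        // The result is what would be handed to instance.setDatapath(...) as a string.
        System.out.println(parentOfTessdata(configured));
    }
}
```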


zcmgyu avatar zcmgyu commented on August 20, 2024

Thanks Quan Nguyen for your response.
Do you mean I just need to add this dependency to Maven?

<dependency>
    <groupId>net.java.dev.jna</groupId>
    <artifactId>jna</artifactId>
    <version>4.3.0</version>
</dependency>

<dependency>
    <groupId>net.sourceforge.tess4j</groupId>
    <artifactId>tess4j</artifactId>
    <version>3.2.1</version>
</dependency>

And I also changed the path to:

ITesseract instance = new Tesseract(); // JNA Interface Mapping
instance.setDatapath("/usr/local/Cellar/tesseract/3.04.01_2/share");
instance.setLanguage("jpn");

It still throws that error message.

hs_err_pid59284.log.zip

PS: instance.setLanguage("eng"); works fine for me


nguyenq avatar nguyenq commented on August 20, 2024

Yes. Can you try temporarily removing eng.traineddata file and renaming jpn.traineddata to that and see if that would work?

It is really a shot in the dark since we do not have a Mac to investigate the issue. The library has been tested on Windows and Linux. JNA is the only piece that differs between the platforms, since it has native components specific to each OS, so we think tracing through the JNA code in a debugger could help.


zcmgyu avatar zcmgyu commented on August 20, 2024

Thanks for your reply.
But it still throws that error message, even though I renamed the jpn package to eng.
I think the problem comes from the jpn package.
It's a pity that you can't check in a macOS environment.
I suggest you use a Hackintosh on your PC.
hs_err_pid86067.log.zip


nguyenq avatar nguyenq commented on August 20, 2024

Someone commented in #34 that "All language data larger than about 20-25 MB cannot be loaded."
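
To see whether that observation lines up with the languages failing here, one could list the sizes of the installed traineddata files. The sketch below is an illustration only: `TrainedDataSizes` is a hypothetical helper, and the 20 MB cut-off is just the lower end of the range quoted above, not a verified limit.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of tess4j.
final class TrainedDataSizes {
    /** Assumed cut-off: lower end of the "20-25 MB" range quoted in the thread. */
    static final long SUSPECT_BYTES = 20L * 1024 * 1024;

    /** Maps each *.traineddata file name in the directory to its size in bytes. */
    static Map<String, Long> sizes(Path tessdataDir) throws IOException {
        Map<String, Long> result = new TreeMap<>();
        try (DirectoryStream<Path> files =
                Files.newDirectoryStream(tessdataDir, "*.traineddata")) {
            for (Path file : files) {
                result.put(file.getFileName().toString(), Files.size(file));
            }
        }
        return result;
    }

    static boolean isSuspect(long bytes) {
        return bytes >= SUSPECT_BYTES;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : "tessdata");
        if (!Files.isDirectory(dir)) {
            System.out.println("not a directory: " + dir);
            return;
        }
        for (Map.Entry<String, Long> entry : sizes(dir).entrySet()) {
            System.out.printf("%-28s %6.1f MB%s%n", entry.getKey(),
                    entry.getValue() / (1024.0 * 1024.0),
                    isSuspect(entry.getValue()) ? "  <- at or above the reported range" : "");
        }
    }
}
```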


zcmgyu avatar zcmgyu commented on August 20, 2024

Could you fix it in the next version? :D


iseegr8tfuldeadppl avatar iseegr8tfuldeadppl commented on August 20, 2024

Make sure the environment variable TESSDATA_PREFIX is set to your tessdata directory!
(for ex. C:\msys64\mingw32\share\tessdata).
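
A quick way to see what the JVM process actually inherits is to print the variable from Java itself, since an IDE or service may start with a different environment than your shell. This is a small sketch; `TessdataPrefix.describe` is a made-up helper and takes the environment as a map so it can be tried without changing real variables.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;

// Hypothetical helper, not part of tess4j.
final class TessdataPrefix {
    /** Describes the state of TESSDATA_PREFIX in the given environment map. */
    static String describe(Map<String, String> env) {
        String value = env.get("TESSDATA_PREFIX");
        if (value == null || value.isEmpty()) {
            return "TESSDATA_PREFIX is not set";
        }
        Path dir = Paths.get(value);
        if (!Files.isDirectory(dir)) {
            return "TESSDATA_PREFIX is not a directory: " + value;
        }
        return "TESSDATA_PREFIX points at " + dir;
    }

    public static void main(String[] args) {
        // Uses the environment of the current JVM process.
        System.out.println(describe(System.getenv()));
    }
}
```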

