Comments (20)

4F2E4A2E commented on August 20, 2024

Tess4J uses JNA to access the Tesseract native library, so I imagine it would be very difficult to get it running via OpenCL, but nothing is impossible.

So why should we support this? It would not be a feature of our project, but rather a project in itself.

By the way, I guess that for most use cases, and with some "Java thread skills", you won't really need it, and I don't think enterprise-scale organizations like the NSA are using Tess4J for OCR, or are they? ^^

Vulkan!?

from tess4j.

sonik340 commented on August 20, 2024

The company I work for will use Tess4J to read thousands of documents a day. They are interested in performance, but powerful servers could cost an arm and a leg :-) That is why I am wondering about ways to speed up Tess4J. I have found some people who built Tesseract with OpenCL support, so, knowing that Tess4J is "only" an interface, I wonder whether an OpenCL-enabled Tess4J would be possible.

Vulkan: I have only just heard about it. As far as I understand, it is a framework that gives direct access to the graphics card. Never mind that.

4F2E4A2E commented on August 20, 2024

Hey @sonik340, are you still on this?

sonik340 commented on August 20, 2024

Hi,
This is not a priority for now. As far as I know, my company will not pay for special servers. But of course, any improvement would be welcome. Indeed, I never managed to make Tesseract work on multiple threads.

4F2E4A2E commented on August 20, 2024

The Tesseract guys have added support for OpenCL (https://github.com/tesseract-ocr/tesseract/releases). I have no clue about it, but this topic (multi-threaded conversion) interests me a lot. If you are up for it, we could try to get this done. What do you say?

sonik340 commented on August 20, 2024

Sadly, I do not have a lot of spare time. If you have a specific task I can help with, let me know. I am not a Java expert, but I would like to help.

4F2E4A2E commented on August 20, 2024

Do you have any OpenCL example I can build on? In Java!?

sonik340 commented on August 20, 2024

Sadly no. But there is an AMD API that can help do the job. Or maybe Tesseract already uses OpenCL and we just have to call it from Java.

sonik340 commented on August 20, 2024

Here is a very interesting link for Java OpenCL:
http://developer.amd.com/tools-and-sdks/opencl-zone/aparapi/

I found out on Google that we would just have to enable OpenCL in Tesseract, but I did not find out precisely how to do it (it seems to be a shell option).

4F2E4A2E commented on August 20, 2024

Found some information on how to do it natively: http://www.sk-spell.sk.cx/tesseract-meets-the-opencl-first-test

sonik340 commented on August 20, 2024

Fine!
I do not know how you connected Java to Tesseract. Can you give me information on how you did it (the techniques used)? I think it is JNI. If so, I will try to learn more about it. A good example would help.

4F2E4A2E commented on August 20, 2024

Well sure, but for that you just have to take a look at our source code. I still think that we can solve this using threads in Java.

Is the goal merely parallel processing?

sonik340 commented on August 20, 2024

I already tried parallel processing, but I got errors from Tesseract, I think because two instances used the same files.

4F2E4A2E commented on August 20, 2024

Well, this depends on so many things. Real threads in Java are not that easy, but trying to convert the same image many times may not be the best approach.

http://stackoverflow.com/a/6983094/543426

I will try to solve this in Java by extracting text from 100 images in parallel. Is that OK for you?
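
For illustration, here is a minimal sketch of what such a parallel batch could look like, using only the standard `ExecutorService`. It is not the code from this project: the class name, the `extractAll` method and the stand-in engine are all hypothetical, and the OCR call is hidden behind a plain `Function<Path, String>` so the sketch runs without Tess4J on the classpath.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;
import java.util.function.Supplier;

class ParallelOcrSketch {

    /**
     * Runs one OCR task per image on a fixed-size thread pool.
     * engineFactory is called once per task, so no engine is shared between tasks.
     */
    static Map<Path, String> extractAll(List<Path> images,
                                        int threads,
                                        Supplier<Function<Path, String>> engineFactory) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit everything first so the pool can work on several images at once.
            Map<Path, Future<String>> pending = new LinkedHashMap<>();
            for (Path image : images) {
                pending.put(image, pool.submit(() -> engineFactory.get().apply(image)));
            }
            // Collect results in submission order; get() blocks until each task is done.
            Map<Path, String> texts = new LinkedHashMap<>();
            for (Map.Entry<Path, Future<String>> entry : pending.entrySet()) {
                texts.put(entry.getKey(), entry.getValue().get());
            }
            return texts;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("OCR batch failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Path> images = Arrays.asList(Paths.get("eurotext.png"), Paths.get("eurotext_02.png"));
        // Stand-in engine (assumption). With Tess4J, the function could instead wrap
        // a call such as new Tesseract().doOCR(image.toFile()).
        Supplier<Function<Path, String>> fakeEngine = () -> image -> "text of " + image.getFileName();
        extractAll(images, 4, fakeEngine)
                .forEach((image, text) -> System.out.println(image + " -> " + text));
    }
}
```

Creating a fresh engine per task is the simplest way to avoid sharing state between threads. Whether that is cheap enough for a real OCR engine is something to measure rather than assume.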

sonik340 commented on August 20, 2024

I tried to launch one Tesseract instance per image (on different images). The problems seemed to come from a reference file that Tesseract needs.
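
The cause of these errors is not established in this thread. One common precaution, assuming the engine object should not be shared between threads, is to give each worker thread its own instance through a `ThreadLocal`. The sketch below shows only that pattern: `Engine`, `recognize` and `recognizeAll` are hypothetical stand-ins, not Tess4J classes or methods.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class PerThreadEngineSketch {

    /** Hypothetical stand-in for an OCR engine that must not be shared between threads. */
    static final class Engine {
        final int id;
        Engine(int id) { this.id = id; }
        String recognize(String imageName) { return "engine " + id + " read " + imageName; }
    }

    // Counts how many engines were created, only to make the reuse visible.
    static final AtomicInteger CREATED = new AtomicInteger();

    // Each worker thread lazily creates its own engine on first use, then reuses it.
    static final ThreadLocal<Engine> ENGINE =
            ThreadLocal.withInitial(() -> new Engine(CREATED.incrementAndGet()));

    static List<String> recognizeAll(List<String> imageNames, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String name : imageNames) {
                tasks.add(() -> ENGINE.get().recognize(name));
            }
            // invokeAll waits for all tasks and returns futures in task order.
            List<String> results = new ArrayList<>();
            for (Future<String> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("OCR batch failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("a.png", "b.png", "c.png", "d.png");
        recognizeAll(names, 2).forEach(System.out::println);
        System.out.println("engines created: " + CREATED.get());
    }
}
```

With a pool of 2 threads, at most 2 engines are created no matter how many images are processed, which keeps expensive engine setup off the per-image path.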

nguyenq commented on August 20, 2024

There is a nice example of executing doOCR in parallel at https://sourceforge.net/p/tess4j/discussion/1202293/thread/4562eccb/

sonik340 commented on August 20, 2024

So, we just need to upgrade Tesseract inside Tess4J?

4F2E4A2E commented on August 20, 2024

There you go, multi-threaded conversion:

-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running net.sourceforge.tess4j.TestExtractTextFromImageMultiThreaded
Starting conversion with '4' Threads.
Extracted: \tess4j-multi-threads\[...]batch-conversion\eurotext_deskew.png.txt
Extracted: \tess4j-multi-threads\[...]batch-conversion\eurotext_03.png.txt
Extracted: \tess4j-multi-threads\[...]batch-conversion\eurotext_02.png.txt
Extracted: \tess4j-multi-threads\[...]batch-conversion\eurotext_deskew_04.png.txt
Extracted: \tess4j-multi-threads\[...]batch-conversion\eurotext_unlv.png.txt
Extracted: \tess4j-multi-threads\[...]batch-conversion\eurotext_deskew_03.png.txt
Extracted: \tess4j-multi-threads\[...]batch-conversion\eurotext_unlv_04.png.txt
Extracted: \tess4j-multi-threads\[...]batch-conversion\eurotext_unlv_03.png.txt
Extracted: \tess4j-multi-threads\[...]batch-conversion\eurotext.png.txt
Extracted: \tess4j-multi-threads\[...]batch-conversion\eurotext_deskew_02.png.txt
Extracted: \tess4j-multi-threads\[...]batch-conversion\eurotext_unlv_02.png.txt
Extracted: \tess4j-multi-threads\[...]batch-conversion\eurotext_04.png.txt
Finished all threads
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 7.65 sec
Running net.sourceforge.tess4j.TestExtractTextFromImageSingleThreaded
Starting conversion with '1' Threads.
Extracted: \tess4j-multi-threads\[...]batch-conversion\eurotext_unlv.png.txt
Extracted: \tess4j-multi-threads\[...]batch-conversion\eurotext_deskew_04.png.txt
Extracted: \tess4j-multi-threads\[...]batch-conversion\eurotext_deskew_03.png.txt
Extracted: \tess4j-multi-threads\[...]batch-conversion\eurotext_unlv_04.png.txt
Extracted: \tess4j-multi-threads\[...]batch-conversion\eurotext_unlv_03.png.txt
Extracted: \tess4j-multi-threads\[...]batch-conversion\eurotext_unlv_02.png.txt
Extracted: \tess4j-multi-threads\[...]batch-conversion\eurotext_deskew.png.txt
Extracted: \tess4j-multi-threads\[...]batch-conversion\eurotext_04.png.txt
Extracted: \tess4j-multi-threads\[...]batch-conversion\eurotext_deskew_02.png.txt
Extracted: \tess4j-multi-threads\[...]batch-conversion\eurotext_03.png.txt
Extracted: \tess4j-multi-threads\[...]batch-conversion\eurotext_02.png.txt
Extracted: \tess4j-multi-threads\[...]batch-conversion\eurotext.png.txt
Finished all threads
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 10.56 sec

That is roughly 28% less time with 4 threads (7.65 sec vs. 10.56 sec). Please consider reading up on Amdahl's Law and Vogella's tutorial on Java threads.
tess4j-multi-threads.zip
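
To put the two timings in terms of Amdahl's Law, the small calculation below derives the observed speedup and the parallel fraction it implies for 4 threads. It is only a rough sketch: it assumes the two reported times are directly comparable (same images, no warm-up or test-order effects), and the class and method names are made up for illustration.

```java
import java.util.Locale;

class AmdahlSketch {

    /** Observed speedup: single-threaded time divided by multi-threaded time. */
    static double speedup(double singleThreadedSeconds, double multiThreadedSeconds) {
        return singleThreadedSeconds / multiThreadedSeconds;
    }

    /**
     * Inverts Amdahl's Law, S = 1 / ((1 - p) + p / n), to estimate p,
     * the fraction of the work that benefited from running on n threads.
     */
    static double parallelFraction(double speedup, int threads) {
        return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / threads);
    }

    public static void main(String[] args) {
        double s = speedup(10.56, 7.65);   // times taken from the test output
        double p = parallelFraction(s, 4); // the multi-threaded run used 4 threads
        System.out.println(String.format(Locale.ROOT,
                "speedup = %.2f, time saved = %.0f%%, parallel fraction = %.2f",
                s, 100.0 * (1.0 - 1.0 / s), p));
    }
}
```

With the numbers above this gives a speedup of about 1.38 and an implied parallel fraction of about 0.37, which suggests that a large share of this small test run (12 images) was not parallelized.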

4F2E4A2E commented on August 20, 2024

@sonik340, any status update would be appreciated so that we can close this issue.
