
Comments (5)

srisi commented on July 30, 2024

Related: It would be great if we could rectify and deskew images before OCRing them. It seems that Tesseract doesn't do this by default.
For example, in the following image, only the highlighted area was OCRed.
[Image: screenshot from 2019-02-11 10-05-52]

A quick Google search turned up:
An implementation in Python:
https://www.pyimagesearch.com/2017/02/20/text-skew-correction-opencv-python/
It seems that ImageMagick might also be able to do the job:
https://stackoverflow.com/questions/12117644/deskew-and-filter-an-image-for-ocr

from computation_hist.

mscuthbert commented on July 30, 2024

Ah! I was used to Acrobat OCR, which does this. Yes, I think this would be a super task to do.


srisi commented on July 30, 2024

@samimak37 and @meesuekim: I think it would be useful for @ifeife123's task of extracting documents from larger PDFs if the OCR function could handle page ranges, i.e. if you could implement the parameters start_page and end_page so that it creates an OCRed PDF / extracts text only from the selected page range.
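
To make the request concrete, here is a minimal sketch of how start_page and end_page could behave. It assumes 1-based, inclusive page numbers and that the PDF has already been split into a sequence of pages. The names resolve_page_range and ocr_page_range, and the ocr_page callable, are hypothetical and are not part of the existing OCR function.

```python
from typing import Callable, List, Optional, Sequence, TypeVar

Page = TypeVar("Page")


def resolve_page_range(num_pages: int,
                       start_page: Optional[int] = None,
                       end_page: Optional[int] = None) -> range:
    """Return the 1-based, inclusive page numbers to process.

    Assumption: a missing start_page means "from the first page" and a
    missing end_page means "through the last page".
    """
    first = 1 if start_page is None else start_page
    last = num_pages if end_page is None else end_page
    if not 1 <= first <= last <= num_pages:
        raise ValueError(
            f"invalid page range {first}-{last} for a {num_pages}-page document")
    return range(first, last + 1)


def ocr_page_range(pages: Sequence[Page],
                   ocr_page: Callable[[Page], str],
                   start_page: Optional[int] = None,
                   end_page: Optional[int] = None) -> List[str]:
    """Run the (hypothetical) ocr_page callable on the selected pages only."""
    selected = resolve_page_range(len(pages), start_page, end_page)
    # Page numbers are 1-based, sequence indices are 0-based.
    return [ocr_page(pages[number - 1]) for number in selected]


# Usage with a stand-in "OCR" function so the sketch runs without Tesseract.
fake_pages = ["page one", "page two", "page three", "page four"]
texts = ocr_page_range(fake_pages, str.upper, start_page=2, end_page=3)
```

Validating the range up front means a bad start_page or end_page fails before any OCR work starts, which matters when each page takes several seconds.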


srisi commented on July 30, 2024

@samimak37
I think this implements the method that we discussed yesterday (find the angle that maximizes the number of lines that are white or mixed): https://avilpage.com/2016/11/detect-correct-skew-images-python.html
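
As a rough illustration of that idea, here is a simplified sketch that is not taken from the linked post. It treats the page as a set of ink-pixel coordinates, rotates them by each candidate angle, and keeps the angle that leaves the most completely white rows between the first and last ink row. It counts only fully white rows, ignores the "mixed" case, and all function names are made up for this example.

```python
import math
from typing import Iterable, List, Sequence, Tuple

Point = Tuple[float, float]


def white_row_count(ink: Iterable[Point], angle_deg: float) -> int:
    """Rotate ink pixels by angle_deg and count the empty rows that lie
    between the topmost and bottommost ink row."""
    theta = math.radians(angle_deg)
    sin_t, cos_t = math.sin(theta), math.cos(theta)
    # Only the rotated y-coordinate matters for a row profile.
    rows = {round(x * sin_t + y * cos_t) for x, y in ink}
    if not rows:
        return 0
    return (max(rows) - min(rows) + 1) - len(rows)


def estimate_deskew_angle(ink: List[Point],
                          max_angle: float = 10.0,
                          step: float = 0.5) -> float:
    """Try angles in [-max_angle, max_angle] and return the one that
    maximizes the number of white rows."""
    steps = int(round(max_angle / step))
    candidates = [i * step for i in range(-steps, steps + 1)]
    return max(candidates, key=lambda a: white_row_count(ink, a))


def synthetic_skewed_lines(skew_deg: float,
                           line_ys: Sequence[int] = (10, 20, 30),
                           width: int = 100) -> List[Point]:
    """Build three one-pixel-thick 'text lines' tilted by skew_deg."""
    slope = math.tan(math.radians(skew_deg))
    return [(float(x), y0 + x * slope) for y0 in line_ys for x in range(width)]


# Synthetic test page: lines tilted by 5 degrees.
ink = synthetic_skewed_lines(5.0)
best = estimate_deskew_angle(ink)  # rotation that straightens the lines
```

On a real scan the ink pixels would come from a thresholded image, and the sign convention of the angle depends on the image library used for the final rotation, so that part would need checking against actual pages.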


srisi commented on July 30, 2024

I've been messing around today with Tesseract on the command line, primarily with some of my tobacco documents.
TL;DR: the LSTM mode of Tesseract 4 is impressive.

Base image: https://s3.amazonaws.com/comp-hist/docs/1_10_architecture/docs/1/pages/1/1_10_architecture_1_1.png
CLI documentation: https://github.com/tesseract-ocr/tesseract/wiki/Command-Line-Usage

Tesseract 3:

‘ ’ W“. 4'4”?“ ‘

Ir. c I. 1 P3133333 V
. 31333131", P3131331 P1331
_ 11333 21-235 ' ~ 3

1' 033: I1. 1313:3311: _ 1‘

g ' . ‘ 13 33331-33333 3113 on: 31333331333 3113 33:1333 m
"311131313 at 3 333113; 33 133131”, 33:33 81-, '31: 33:33:! to

3:331:13 133 III (mp 3113 3333 33333 13 3311313¢ 30.1313 33333 7‘"

,13 13 33 3333 by 133 3331133 8313333 333331-33 0:333 (180) 333
. 73111 33 3313¢ 133 704 33 3 333—33111 33313 1313 m 333
333133: 311133 13 03311-31 83331-3 31 ‘73 1333333333113 Av3333. 1

. III 333333133 1331 1 3: 2 3373 (3313:3111 1333 33313
'-p:313r I) 13 33116133 20 33 33313336 13 1333 to 3333-30313  133

33333 31 1331: 31311.1 3331.3 1133 you to 3313: 1313 :333331
3333¢ 133 3133: 319333 3133 33133 133 3:3 3331-33111 33331d3r133.

81333:“: {33:3 ,

I. I. Vanna -
1133131331 011-3313:

cc: 9131. C. I. 3133
\/Pro.f. P. 3. 302-33

Tesseract 4 with LSTM and language set to English:
tesseract test.png stdout -l eng --oem 1

Meri 4, esr

Mr, C. M. ¥. Peterson _
Director, Physical Plast
Room 24-205
& Dear wr, Peterson: ; wl es

; In accordance with our  alonsitinn with varices EM
_efticials at a meeting on Thursday, March 21, MIT agreed to

© provide the 18M group with some space in Building 20, This space ict, =

is to be used by the Applied Science Research Group (ASR) we

~. will be using the 704 on a one-shift basis This group has

another office in Central Square at 678 Bsssanhusatis Avenue. ;

IBM requested that 1 or 2 bays (naturally they would
_ prefer 2) in Building 20 be assigned to them to accomodate the
needs of their staff, 1 would like you to enter this request
among the other space bids which you are Tana considering.

Sincerely yours ’

F. M, Verzuh

Assistant Director

ce: pfof . C., F, Floe
Prof. P, M. Morse

per

I'll experiment a bit more.
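
For repeating this kind of comparison over many pages, the command above could be wrapped in a small Python helper. This is only a sketch: build_tesseract_cmd and run_tesseract are hypothetical names, and it assumes the tesseract binary is on the PATH. Passing --oem 0 selects Tesseract 4's legacy engine, which only works if traineddata with legacy support is installed.

```python
import subprocess
from typing import List

# Engine modes accepted by Tesseract 4's --oem flag.
OEM_LEGACY = 0  # legacy engine only
OEM_LSTM = 1    # LSTM neural-net engine only


def build_tesseract_cmd(image_path: str,
                        lang: str = "eng",
                        oem: int = OEM_LSTM) -> List[str]:
    """Build the same command line used above, writing text to stdout."""
    return ["tesseract", image_path, "stdout", "-l", lang, "--oem", str(oem)]


def run_tesseract(image_path: str,
                  lang: str = "eng",
                  oem: int = OEM_LSTM) -> str:
    """Run Tesseract and return the recognized text.

    Raises CalledProcessError if Tesseract exits with an error.
    """
    result = subprocess.run(build_tesseract_cmd(image_path, lang, oem),
                            capture_output=True, text=True, check=True)
    return result.stdout


# Building the command does not require Tesseract to be installed.
cmd = build_tesseract_cmd("test.png")
```

Keeping the command construction separate from the subprocess call makes it easy to log or test the exact invocation without running OCR.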

