ocr4all / larex

A semi-automatic open-source tool for Layout Analysis and Region EXtraction on early printed books.

License: MIT License

Java 52.49% CSS 0.13% JavaScript 44.43% Shell 0.23% SCSS 2.65% Dockerfile 0.06%

larex's Introduction

OCR4all

As suggested by the name, one of the main goals of OCR4all is to allow basically any given user to independently perform OCR on a wide variety of historical printings and obtain high-quality results with reasonable time expenditure. Therefore, OCR4all is explicitly geared towards users with no technical background. If you are one of those users (or if you just want to use the tool and are not interested in the code), please go to the documentation website or the getting started project where you will find test data.

Please note that OCR4all's current main focus is a semi-automatic workflow allowing users to perform OCR even on the earliest printed books, which is a very challenging task that often requires a significant amount of manual interaction, especially when almost perfect quality is desired. Nevertheless, we are working towards increasing the robustness and the degree of automation of the tool. An important cornerstone for this is the recently agreed cooperation with the OCR-D project, which focuses on the mass full-text recognition of historical materials.

This repository contains the code for the main interface and server of the OCR4all project, while the repositories OCR4all/docker_image and OCR4all/docker_base_image cover the creation of a preconfigured Docker image.

For installing the complete project with a docker image, please follow the instructions here.

Mailing List

OCR4all is under active development and consequently, frequent releases containing bug fixes and further functionality can be expected. In order to always be up to date, we highly recommend subscribing to our mailing list where we will always announce notable enhancements.

Built With

Docker - Platform and Software Deployment
Maven - Dependency Management
Spring - Java Framework
Materialize - Front-end Framework
jQuery - JavaScript Library

Included Projects

OCRopus - Collection of document analysis programs
Calamari - OCR Engine based on OCRopy and Kraken
LAREX - Layout analysis on early printed books

Formerly included / inspired by

Kraken - OCR engine for all the languages
nashi - Some bits of javascript to transcribe scanned pages using PageXML

Contact, Authors, and Helping Hands

Dr. Christian Reul (project lead) - mail: [email protected]
Florian Langhanki (user support and guides) - mail: [email protected]

Developers

Dr. Herbert Baier Saip (lead)
Maximilian Nöth (OCR4all, LAREX, and Calamari)
Dr. Christoph Wick (Calamari)
Andreas Büttner (Calamari and nashi)
Kevin Chadbourne (LAREX)
Yannik Herbst (distribution via VirtualBox)
Björn Eyselein (Artifactory and distribution via Docker)

Miscellaneous

Raphaëlle Jung (guides and artwork)
Dr. Uwe Springmann (ideas and feedback)
Prof. Dr. Frank Puppe (ideas and feedback)

Former Project Members

Dennis Christ (OCR4all)
Alexander Hartelt (OCR4all)
Nico Balbach (OCR4all and LAREX)
Christine Grundig (ideas and feedback)
Maximilian Wehner (user support and guides)
...

Funding

Citing OCR4all

If you are using OCR4all please cite:

Reul, C., Christ, D., Hartelt, A., Balbach, N., Wehner, M., Springmann, U., Wick, C., Grundig, C., Büttner, A., Puppe, F.: OCR4all — An open-source tool providing a (semi-) automatic OCR workflow for historical printings. Applied Sciences 9(22) (2019)

@article{reul2019ocr4all,
  title={OCR4all—An open-source tool providing a (semi-) automatic OCR workflow for historical printings},
  author={Reul, Christian and Christ, Dennis and Hartelt, Alexander and Balbach, Nico and Wehner, Maximilian and Springmann, Uwe and Wick, Christoph and Grundig, Christine and B{\"u}ttner, Andreas and Puppe, Frank},
  journal={Applied Sciences},
  volume={9},
  number={22},
  pages={4853},
  year={2019},
  publisher={Multidisciplinary Digital Publishing Institute}
}

larex's Issues

Moving page while drawing a line/region/segment

While drawing a line, region or segment it should be possible to move the page, e. g. by using the arrow keys or something like Shift + left mouse.

Scale problem with different image sizes in one book

There seems to be a problem with images of different pixel sizes.
Cuts, region polygons and fixed segments created in the web view are sometimes processed with the wrong scale factor by the server.
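
The issue does not name the root cause. One plausible source of such a mismatch is a scale factor that is computed once and then reused for pages with different pixel sizes. A minimal sketch of a per-page conversion follows; the function names and the page fields `originalWidth` and `displayWidth` are hypothetical and not taken from the LAREX code:

```javascript
// Hypothetical helper: derive the scale factor from the sizes of ONE page,
// instead of reusing a factor that was computed for another page of the book.
function scaleFactorFor(page) {
  return page.originalWidth / page.displayWidth;
}

// Convert polygon points from web-view coordinates to original image coordinates.
function toImageCoordinates(points, page) {
  const factor = scaleFactorFor(page);
  return points.map(({ x, y }) => ({ x: x * factor, y: y * factor }));
}

// Two pages shown at the same width in the viewer, but scanned at different sizes.
const smallPage = { originalWidth: 1000, displayWidth: 500 };
const largePage = { originalWidth: 3000, displayWidth: 500 };
const cut = [{ x: 10, y: 20 }, { x: 100, y: 20 }];

console.log(toImageCoordinates(cut, smallPage));
console.log(toImageCoordinates(cut, largePage));
```

The same view coordinates map to different image coordinates depending on the page, which is why the factor has to travel with the page rather than with the book.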

Different RoIs for recto/verso

It should be possible to define two separate RoIs for recto/verso (odd/even) pages.

firefox compatibility problems. scrolling via mouse

Scrolling with the mouse wheel in Firefox makes the page vanish.
This seems to be a problem with the value provided by Firefox's scroll wheel.
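
Whether this is the actual cause is not confirmed here, but one known difference between browsers is the unit of the wheel delta: `WheelEvent.deltaMode` can be 0 (pixels), 1 (lines) or 2 (pages). A sketch that normalizes the delta to pixels before it is used for zooming or scrolling; the two conversion constants are assumptions:

```javascript
// Assumed conversion constants; a real viewer would derive them from its layout.
const LINE_HEIGHT_PX = 16;
const PAGE_HEIGHT_PX = 800;

// Normalize a wheel event's vertical delta to pixels.
// deltaMode values follow the WheelEvent interface: 0 = pixel, 1 = line, 2 = page.
function normalizedWheelDelta(event) {
  switch (event.deltaMode) {
    case 1:
      return event.deltaY * LINE_HEIGHT_PX;
    case 2:
      return event.deltaY * PAGE_HEIGHT_PX;
    default:
      return event.deltaY;
  }
}

// Plain objects stand in for real WheelEvents here.
console.log(normalizedWheelDelta({ deltaY: 3, deltaMode: 1 }));
console.log(normalizedWheelDelta({ deltaY: 100, deltaMode: 0 }));
```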

Add reading order to the webview

The web application currently does not support creating or changing the reading order.
Files created by export to PageXML are lacking this order.

Select several segments

When the user clicks somewhere on the page where no segments are located the already selected segments shouldn't get deselected if Ctrl is still pressed.

Jump to page

It should be possible to quickly jump to a certain page. Since the numbering of the pages depends on the file names and might be inconsistent, the best solution would probably be an extendable drop-down menu showing the file names. This menu should also indicate which pages have already been segmented.

Segment select Issues

It should be possible to deselect a single segment while keeping the rest of the selected segments (Ctrl + left click).
Selected segments should get deselected when the user clicks at a spot where no segments are located.
It should be possible to add several segments to a selection by holding down Ctrl and selecting the additional segments by drawing a rectangle.
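
The three rules above can be sketched as pure functions over a set of segment ids. This is only an illustration with hypothetical names, not the viewer's code. The Ctrl exception for clicks on an empty spot follows the related "Select several segments" issue:

```javascript
// Returns the new selection after a left click.
// clickedId is null when the click hit no segment.
function selectionAfterClick(selection, clickedId, ctrlKey) {
  const next = new Set(selection);
  if (clickedId === null) {
    // Empty spot: clear the selection, unless Ctrl is still held.
    if (!ctrlKey) next.clear();
    return next;
  }
  // Plain click: the clicked segment becomes the only selected one.
  if (!ctrlKey) return new Set([clickedId]);
  // Ctrl + click: toggle one segment and keep the rest.
  if (next.has(clickedId)) next.delete(clickedId);
  else next.add(clickedId);
  return next;
}

// Returns the new selection after a rectangle was drawn around idsInBox.
function selectionAfterBox(selection, idsInBox, ctrlKey) {
  // Ctrl + rectangle adds to the selection; a plain rectangle replaces it.
  return ctrlKey ? new Set([...selection, ...idsInBox]) : new Set(idsInBox);
}

const selected = new Set(["s1", "s2", "s3"]);
console.log([...selectionAfterClick(selected, "s2", true)]);
console.log([...selectionAfterBox(selected, ["s4"], true)]);
```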

show filename as tooltip

It would be helpful if a tooltip could show the filename of an image in the left bar of the viewer.

Hover effect during the creation of polygons

The hover effect of polygons is active while creating a new polygon.
This is an unneeded distraction and should be prevented.

Inconsistent behaviour of create/delete region

Create/Delete region sometimes gets ignored. Also happens when using Firefox.
Sometimes the image region can be deleted, which shouldn't be possible.

Open type dropdown with one click

When right clicking on an inactive segment it should become active and the type dropdown should open right away.
Right now it sometimes requires two clicks (left click to activate, right click for dropdown). This seems to depend on the operating system as the problem never (?) occurs when running Windows.

Add flexible select functionality

Add a flexible select functionality

Select everything inside a box
Select everything between last selected and "Shift" selected

Region maxOccurence

The maxOccurence region parameter seems to be buggy (and kinda useless) and should be temporarily disabled/removed.
It's misspelled anyways :-).

Resize/expand and move polygons for regions and fixed segments

A user should be able to move and resize polygons for regions and fixed segments.
Such an operation currently requires the user to delete and recreate the polygon.

Current goals are to add options for:

moving polygons for regions and fixed segments
resizing/expanding rectangle polygons of regions
resizing/expanding polygons of fixed segments

A distinction between rectangular polygons and irregular polygons has to be made.
It must be possible to scale/expand a rectangular polygon at its edges and corners while maintaining a rectangular shape.
Resizing the polygons of fixed segments could be a more complicated issue. Those polygons do not have to be rectangles, and irregular polygons should behave differently from rectangles.
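
For the rectangular case, the requirement can be illustrated with a small sketch: dragging an edge or corner handle only changes the sides that belong to that handle, so the shape stays a rectangle. The handle names ("n", "se", ...) and the rect representation are assumptions made for this example:

```javascript
// rect: { left, top, right, bottom }
// handle: one of "n", "s", "e", "w", "ne", "nw", "se", "sw"
// dx, dy: how far the pointer was dragged
function resizeRect(rect, handle, dx, dy) {
  const next = { ...rect };
  if (handle.includes("w")) next.left += dx;
  if (handle.includes("e")) next.right += dx;
  if (handle.includes("n")) next.top += dy;
  if (handle.includes("s")) next.bottom += dy;
  // Normalize in case a side was dragged past the opposite side.
  return {
    left: Math.min(next.left, next.right),
    top: Math.min(next.top, next.bottom),
    right: Math.max(next.left, next.right),
    bottom: Math.max(next.top, next.bottom),
  };
}

const region = { left: 0, top: 0, right: 10, bottom: 10 };
console.log(resizeRect(region, "se", 5, -3)); // corner: changes right and bottom
console.log(resizeRect(region, "e", 5, -3));  // edge: changes right only
```

Irregular polygons of fixed segments would need a different approach (for example moving single points), which this sketch does not cover.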

Shortcuts for saving

There should be shortcuts for saving (Ctrl+S) and Exporting (Ctrl+E) a segmentation result.

Zoom cap and anti aliasing

I'm not sure if it is necessary to be able to zoom in to about 10,000,000%. A cap might be in order.
Furthermore, anti-aliasing looks weird and should be turned off.
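
A zoom cap can be as simple as clamping the zoom level after every zoom step. The limits below are placeholder values, not values proposed in this issue:

```javascript
// Placeholder limits: 10% to 2000%.
const MIN_ZOOM = 0.1;
const MAX_ZOOM = 20;

// Apply a zoom step and keep the result within the allowed range.
function zoomBy(currentZoom, factor) {
  return Math.min(MAX_ZOOM, Math.max(MIN_ZOOM, currentZoom * factor));
}

console.log(zoomBy(1, 2));      // within the range
console.log(zoomBy(10, 4));     // capped at MAX_ZOOM
console.log(zoomBy(0.2, 0.1));  // capped at MIN_ZOOM
```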

Mark processed pages in the page overview

Already finished (= result has been stored as PageXML) pages should be marked with a green border in the page overview.

Merge several selected segments

It should be possible to merge several selected segments:

Each segment should get connected to its closest neighbour.
After merging, the type dialogue should pop-up so a new type can be assigned right away. Alternatively, the initial type of the biggest single segment could be assigned.

Enable zooming during drawing

It should be possible to zoom while drawing a polygon/rectangle/cut line.

Saving final result

When saving the final result via "Export PageXML" (should be renamed to "Save Result" or something similar) the corresponding page should be marked by a symbol (just like when downloading the PageXML).
When returning to a page whose final result has been already saved, the final result should be displayed instead of the result that was produced by the automatic segmentation.
A keyboard shortcut would be nice.

is there any plan when to release source code

thx very much

Non-permanent ignore regions

While the RoI should always be permanent, ignore rectangles or polygons by default should only be valid for the current page. Optimally, the user should be able to choose between permanent and non-permanent.

Visibility of the functions to move, scale and delete objects.

Moving, scaling and deleting an object can only be done by selecting the object and pressing the keys m, s or del respectively. The access to those functionalities is not obvious enough.

The following change to the context menu is proposed:

Creating a region object or fixed segment could still open the current context menu, while right clicking existing objects would open the enhanced context menu.

Page XML-Export

Add Page XML-Export per page.

error when adding images at runtime

If a new image is added to a book directory while LAREX is running and the book is open in the LAREX editor, an exception is thrown when clicking on the new image (or the last image of the book):
java.lang.IndexOutOfBoundsException: Index: 7, Size: 7
at java.util.LinkedList.checkElementIndex(LinkedList.java:555)
at java.util.LinkedList.get(LinkedList.java:476)
at com.web.model.Book.getPage(Book.java:46)
at com.web.facade.LarexFacade.segmentPages(LarexFacade.java:106)
at com.web.controller.ViewerController.segment(ViewerController.java:86)

UI: improve the display of the create options for region polygons, fixed segments and cuts

The categories for creating a region polygon, fixed segment and cut are not clear enough.
The display seems unintuitive and should be changed.

Assigning a type to several selected segments

It should be possible to assign a new type to several selected segments at once.

Adding fixed segments without re-running the segmentation

It should be possible to add fixed segments without re-running the segmentation.
When the result is saved/exported, fixed segments which have been added after the last segmentation run should be treated as final segments. Obviously, this could lead to overlapping segments, but dealing with this is the responsibility of the user.

Segments opacity occasionally stuck on "selected"

Moving or scaling fixed segments will change the opacity of the segment's body to "selected".
Reopening the page or segmenting will reset it to normal.

~~Edit: Same happens occasionally when working with the reading order.~~

Aborting to move or scale a segment can cause it to be highlighted, even if the cursor is no longer above the segment. Hovering over the segment again will solve this.

Cancel drawing operation when losing focus

e. g.: Switch the page, click on a properties button, ...

Book pages aren't loaded alphabetically

Book pages are not loaded alphabetically and are displayed randomly.
A short sort before the processing will be added.
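
The planned sort belongs on the server. The sketch below only illustrates the idea in JavaScript, with a hypothetical function name. It uses `localeCompare` with the `numeric` option so that "page2" sorts before "page10"; this goes slightly beyond a strictly alphabetical order and is an assumption about what users expect:

```javascript
// Return a sorted copy of the page file names; the input array is left untouched.
function sortPageNames(fileNames) {
  return [...fileNames].sort((a, b) =>
    a.localeCompare(b, undefined, { numeric: true })
  );
}

console.log(sortPageNames(["page10.png", "page2.png", "page1.png"]));
```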

Hide segmentation

It should be possible to hide the current segmentation result (keyboard shortcut...) as it can be distracting when manually marking an image etc.
This setting should stay active until the user disables it (shortcut), runs the segmentation or switches to another page.

Slowdown - Selecting many segments with many points

The application slows down while moving the document if many segments with many points are selected.

A different, more performant design should be used.

Add save/load settings functionality

It should be possible to save and load parameter and region settings for an individual book so they can be used again after closing the tool.

Resizing: catchment area

If a region/segment is very small, it is cumbersome to select its border.
There should be a minimum value for the catchment area while resizing.
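
One way to read this request: the catchment area around a border never shrinks below a fixed number of pixels, no matter how small the region is. In the sketch below the catchment area is assumed to scale with the region's shorter side; both constants and the function names are hypothetical:

```javascript
const MIN_CATCHMENT_PX = 5;      // assumed minimum catchment radius
const RELATIVE_CATCHMENT = 0.1;  // assumed: 10% of the region's shorter side

// Catchment radius for a rectangle { left, top, right, bottom }.
function catchmentRadius(rect) {
  const shorterSide = Math.min(rect.right - rect.left, rect.bottom - rect.top);
  return Math.max(MIN_CATCHMENT_PX, shorterSide * RELATIVE_CATCHMENT);
}

// True if the point lies within the catchment area of the rectangle's border.
function isNearBorder(point, rect) {
  const r = catchmentRadius(rect);
  const insideOuter =
    point.x >= rect.left - r && point.x <= rect.right + r &&
    point.y >= rect.top - r && point.y <= rect.bottom + r;
  const insideInner =
    point.x > rect.left + r && point.x < rect.right - r &&
    point.y > rect.top + r && point.y < rect.bottom - r;
  return insideOuter && !insideInner;
}

const tiny = { left: 0, top: 0, right: 8, bottom: 8 };
console.log(catchmentRadius(tiny));
console.log(isNearBorder({ x: 11, y: 4 }, tiny));
```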

book directory with special characters ignored

If a directory in the book folder contains a 'special character' such as an ö it is ignored in the list of books.

Deleted segments still occur in PageXML result

It seems that already deleted segments still get saved in the PageXML result. Obviously, this shouldn't happen.

Zooming while creating a polygon

It is possible to zoom while creating a polygon.
The polygon stays in relation to the screen instead of the page image.
It would be better if it stayed in relation to the page.
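
A common way to get this behaviour is to store the polygon points in page coordinates and convert them to screen coordinates only when drawing. The sketch assumes a simple view transform (`screen = page * zoom + offset`); the names are hypothetical:

```javascript
// view: { zoom, offsetX, offsetY }
function screenToPage(point, view) {
  return {
    x: (point.x - view.offsetX) / view.zoom,
    y: (point.y - view.offsetY) / view.zoom,
  };
}

function pageToScreen(point, view) {
  return {
    x: point.x * view.zoom + view.offsetX,
    y: point.y * view.zoom + view.offsetY,
  };
}

// The user clicks while the view is at 100% ...
const viewBefore = { zoom: 1, offsetX: 10, offsetY: 10 };
const stored = screenToPage({ x: 110, y: 60 }, viewBefore); // kept in page coordinates

// ... then zooms to 200% while still drawing. The stored point follows the page.
const viewAfter = { zoom: 2, offsetX: 10, offsetY: 10 };
console.log(stored);
console.log(pageToScreen(stored, viewAfter));
```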

Traffic should be reduced

Opening a book currently loads all of its pages in full resolution from the server.
This can lead to unwanted and unneeded web traffic and should be avoided.

Current goals are:

only loading page images on demand
using downscaled images for the page preview

Add region of interest (RoI) mode

In order to exclude the page periphery (if present), a RoI mode is required.
The user should be able to mark the interesting region of the page by drawing a rectangle. The remaining borders should then be marked as an ignore region.
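
Turning the area outside the RoI into ignore regions can be sketched as splitting the page border into up to four rectangles. The representation (`width`/`height` for the page, `left`/`top`/`right`/`bottom` for rectangles) and the function name are assumptions for illustration:

```javascript
// Returns the rectangles that cover everything on the page outside the RoI.
function ignoreRegionsAround(page, roi) {
  const candidates = [
    // strip above the RoI, full page width
    { left: 0, top: 0, right: page.width, bottom: roi.top },
    // strip below the RoI, full page width
    { left: 0, top: roi.bottom, right: page.width, bottom: page.height },
    // strip left of the RoI
    { left: 0, top: roi.top, right: roi.left, bottom: roi.bottom },
    // strip right of the RoI
    { left: roi.right, top: roi.top, right: page.width, bottom: roi.bottom },
  ];
  // Drop empty strips, e.g. when the RoI touches the page edge.
  return candidates.filter((r) => r.right > r.left && r.bottom > r.top);
}

const page = { width: 100, height: 200 };
const roi = { left: 10, top: 20, right: 90, bottom: 200 };
console.log(ignoreRegionsAround(page, roi));
```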

Clicking the File->PageXML Button should download the xml-file only once

When using the PageXML Button of the file menu bar, the page is downloaded twice. This behavior is confirmed on Chrome and Firefox.
The Export as PageXML Button on the side panel works as expected.

Add image segmentation modes

It should be possible to choose between different segmentation modes:

none (detected image contours are ignored)
contour only (the original image contour is used)
straight rect (a straight bounding rectangle is fitted around the detected contour)
rotated rect (a rotated ...)
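
The first three options listed above can be illustrated on a contour given as a list of points. This is only a sketch with hypothetical names and option strings. The rotated rectangle is left out because the issue text elides its description:

```javascript
// contour: array of { x, y } points of a detected image contour
// mode: "none" | "contour" | "straight_rect"
function regionForContour(contour, mode) {
  switch (mode) {
    case "none":
      // Detected image contours are ignored.
      return null;
    case "contour":
      // The original contour is used as-is (returned as a copy).
      return contour.map((p) => ({ ...p }));
    case "straight_rect": {
      // Axis-aligned bounding rectangle around the contour.
      const xs = contour.map((p) => p.x);
      const ys = contour.map((p) => p.y);
      const left = Math.min(...xs);
      const right = Math.max(...xs);
      const top = Math.min(...ys);
      const bottom = Math.max(...ys);
      return [
        { x: left, y: top },
        { x: right, y: top },
        { x: right, y: bottom },
        { x: left, y: bottom },
      ];
    }
    default:
      throw new Error(`Unsupported segmentation mode: ${mode}`);
  }
}

const contour = [{ x: 1, y: 5 }, { x: 4, y: 2 }, { x: 3, y: 9 }];
console.log(regionForContour(contour, "straight_rect"));
```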

Region types in "create region"

Region types which are already in use shouldn't be displayed in the "create region" dialogue.
Region types should be displayed in alphabetical order.

Add and edit reading order

Add the functionality to add and edit a reading order for each page.

Current goals are to add options for:

auto generate a reading order top to bottom
edit the current reading order
hide reading order
creating a new reading order via mouse
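
The first goal, an automatically generated top-to-bottom order, could look roughly like the sketch below. Segments are assumed to be objects with an `id` and a list of `points`; ties on the same height are broken left to right, which is an assumption and not part of the issue:

```javascript
// Topmost and leftmost coordinate of a segment's polygon.
function topLeftOf(segment) {
  return {
    top: Math.min(...segment.points.map((p) => p.y)),
    left: Math.min(...segment.points.map((p) => p.x)),
  };
}

// Returns the segment ids ordered top to bottom (then left to right).
function autoReadingOrder(segments) {
  return segments
    .map((s) => ({ id: s.id, ...topLeftOf(s) }))
    .sort((a, b) => a.top - b.top || a.left - b.left)
    .map((s) => s.id);
}

const segments = [
  { id: "footer", points: [{ x: 50, y: 900 }, { x: 400, y: 950 }] },
  { id: "right", points: [{ x: 300, y: 100 }, { x: 500, y: 800 }] },
  { id: "heading", points: [{ x: 50, y: 10 }, { x: 500, y: 60 }] },
  { id: "left", points: [{ x: 50, y: 100 }, { x: 250, y: 800 }] },
];
console.log(autoReadingOrder(segments));
```

Such an order is only a starting point. Multi-column layouts generally need the manual editing options listed above.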

Move/Scale multiple objects at once

Scaling and moving region polygons or fixed segments is at the moment restricted to the object last selected.
Scaling and moving multiple objects should be implemented and tested.

multiple regions have the same color

Colors for the regions page_number, image, paragraph, marginal and ignore are set manually, the other regions are getting a computer generated color assigned.
This leads to different regions sharing a similar color.

Current goals are:

~~unique colors for every region by default (at least as unique as possible)~~ (03b768e)
~~settings for the user to change a region's color~~ (9bd2288)
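
For reference, one simple way to get well-separated default colors is to spread the hues evenly around the color wheel. This is a generic sketch and not necessarily what the referenced commits do:

```javascript
// Returns `count` CSS color strings with evenly spaced hues.
function distinctColors(count, saturation = 70, lightness = 50) {
  return Array.from({ length: count }, (_, i) => {
    const hue = Math.round((i * 360) / count);
    return `hsl(${hue}, ${saturation}%, ${lightness}%)`;
  });
}

console.log(distinctColors(4));
```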

Add TIF support

The web app currently only supports image types common to web applications.
TIF is not supported.
This should be changed given the importance of the TIF format in digitization workflows.

Keep active thumbnail centered

When selecting a new page the corresponding thumbnail should be centered vertically (if possible) so the user doesn't have to scroll all the time while working through the scans.
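
The "if possible" part can be handled by clamping the scroll position to the scrollable range. A sketch with hypothetical parameter names, all values in pixels:

```javascript
// thumbTop: distance of the thumbnail's top edge from the top of the list content
// thumbHeight: height of the thumbnail
// containerHeight: visible height of the thumbnail bar
// contentHeight: total height of all thumbnails
function scrollTopToCenter(thumbTop, thumbHeight, containerHeight, contentHeight) {
  const centered = thumbTop + thumbHeight / 2 - containerHeight / 2;
  const maxScroll = Math.max(0, contentHeight - containerHeight);
  // Near the start or end of the list the thumbnail cannot be centered.
  return Math.min(maxScroll, Math.max(0, centered));
}

console.log(scrollTopToCenter(1000, 100, 600, 5000)); // middle of the list
console.log(scrollTopToCenter(0, 100, 600, 5000));    // first thumbnail
```

The result would be assigned to the scroll position of the thumbnail container whenever the active page changes.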

Default region positions stop working after being resized or moved

When the position of a default region (e.g. the page-number position at the top of the page) gets resized or moved, it stops working: the segments it contains are no longer classified correctly.
User-created positions can be resized or moved without any problems.