ocr4all.github.io's Introduction

As the name suggests, one of the main goals of OCR4all is to allow virtually any user to independently perform OCR on a wide variety of historical printings and obtain high-quality results with reasonable time expenditure. OCR4all is therefore explicitly geared towards users with no technical background. If you are one of those users (or if you just want to use the tool and are not interested in the code), please go to the documentation website or the getting started project, where you will find test data.

Please note that OCR4all's current main focus is a semi-automatic workflow that allows users to perform OCR even on the earliest printed books. This is a very challenging task that often requires a significant amount of manual interaction, especially when almost perfect quality is desired. Nevertheless, we are working towards increasing the robustness and the degree of automation of the tool. An important cornerstone for this is the recently agreed cooperation with the OCR-D project, which focuses on the mass full-text recognition of historical materials.

This repository contains the code for the main interface and server of the OCR4all project, while the repositories OCR4all/docker_image and OCR4all/docker_base_image cover the creation of a preconfigured Docker image.

For installing the complete project with a docker image, please follow the instructions here.

Mailing List

OCR4all is under active development, so frequent releases containing bug fixes and further functionality can be expected. To stay up to date, we highly recommend subscribing to our mailing list, where we announce notable enhancements.

Built With

Included Projects

  • OCRopus - Collection of document analysis programs
  • Calamari - OCR Engine based on OCRopy and Kraken
  • LAREX - Layout analysis on early printed books

Formerly included / inspired by

  • Kraken - OCR engine for all the languages
  • nashi - Some bits of javascript to transcribe scanned pages using PageXML

Contact, Authors, and Helping Hands

Developers

  • Dr. Herbert Baier Saip (lead)
  • Maximilian Nöth (OCR4all, LAREX, and Calamari)
  • Dr. Christoph Wick (Calamari)
  • Andreas Büttner (Calamari and nashi)
  • Kevin Chadbourne (LAREX)
  • Yannik Herbst (distribution via VirtualBox)
  • Björn Eyselein (Artifactory and distribution via Docker)

Miscellaneous

  • Raphaëlle Jung (guides and artwork)
  • Dr. Uwe Springmann (ideas and feedback)
  • Prof. Dr. Frank Puppe (ideas and feedback)

Former Project Members

  • Dennis Christ (OCR4all)
  • Alexander Hartelt (OCR4all)
  • Nico Balbach (OCR4all and LAREX)
  • Christine Grundig (ideas and feedback)
  • Maximilian Wehner (user support and guides)
  • ...

Funding

Citing OCR4all

If you are using OCR4all please cite:

Reul, C., Christ, D., Hartelt, A., Balbach, N., Wehner, M., Springmann, U., Wick, C., Grundig, C., Büttner, A., Puppe, F.: OCR4all — An open-source tool providing a (semi-) automatic OCR workflow for historical printings. Applied Sciences 9(22), 4853 (2019)

@article{reul2019ocr4all,
  title={OCR4all—An open-source tool providing a (semi-) automatic OCR workflow for historical printings},
  author={Reul, Christian and Christ, Dennis and Hartelt, Alexander and Balbach, Nico and Wehner, Maximilian and Springmann, Uwe and Wick, Christoph and Grundig, Christine and B{\"u}ttner, Andreas and Puppe, Frank},
  journal={Applied Sciences},
  volume={9},
  number={22},
  pages={4853},
  year={2019},
  publisher={Multidisciplinary Digital Publishing Institute}
}

ocr4all.github.io's People

Contributors

dependabot[bot], isa348, l-fl, maxnth, sinab0ck

ocr4all.github.io's Issues

Update links

Hi, I noticed a few misdirecting links:

Remove node_modules from repository

The node_modules directory should be removed from the repository: it bloats the repository size, can lead to unnecessary merge conflicts, and can be recreated locally with yarn install or npm install.
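One possible way to do this is sketched below. It is not taken from the repository; the function name untrack_node_modules is hypothetical, and the sketch assumes it is run from the repository root with git available. It removes node_modules from the index only, so the local copy stays on disk, and adds an ignore rule so the directory is not committed again.

```shell
# Hypothetical helper: stop tracking node_modules and ignore it from now on.
# Assumes the current directory is the root of the git repository.
untrack_node_modules() {
  # Remove the directory from the index only; files on disk stay untouched.
  # --ignore-unmatch avoids an error if nothing under node_modules is tracked.
  git rm -r --cached --quiet --ignore-unmatch node_modules

  # Add the ignore rule once, creating .gitignore if it does not exist yet.
  if ! grep -qxF 'node_modules/' .gitignore 2>/dev/null; then
    echo 'node_modules/' >> .gitignore
  fi

  # Stage the updated ignore file alongside the removal.
  git add .gitignore
}

# Usage: call untrack_node_modules, review the result with `git status`,
# then commit, for example:
#   git commit -m "Remove node_modules from version control"
```

This only affects future commits. The files remain in the existing history, so the repository size does not shrink unless the history is rewritten as well.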

Fix layout for uncommon and mobile window resolutions

There are some minor UI bugs when the page is opened at an uncommon resolution or when the window is gradually made smaller.

  1. When the page is opened at a small, uncommon resolution, the logo text in the navbar overlaps some of the navbar list items.
  2. When the window is made smaller, the switch from the full navbar to the minimized navbar (sidebar) happens too late, so some navbar list items remain inaccessible.

Update Dockerhub

Since we moved from ls6uniwue to uniwuezpd on Docker Hub, the links in the documentation should be updated accordingly.
