Giter Club home page Giter Club logo

core's Introduction

OCR-D/core

Python modules implementing OCR-D specs and related tools

image image image image Docker Automated build image image

Gitter chat

Introduction

This repository contains the python packages that form the base for tools within the OCR-D ecosphere.

All packages are also published to PyPI.

Installation

NOTE Unless you want to contribute to OCR-D/core, we recommend installation as part of ocrd_all which installs a complete stack of OCR-D-related software.

The easiest way to install is via pip:

pip install ocrd

# or just the functionality you need, e.g.

pip install ocrd_modelfactory

All python software released by OCR-D requires Python 3.5 or higher.

Command line tools

NOTE: All OCR-D CLI tools support a --help flag which shows usage and supported flags, options and arguments.

ocrd CLI

ocrd-dummy CLI

A minimal OCR-D processor that copies from -I/-input-file-grp to -O/-output-file-grp

Packages

ocrd_utils

Contains utilities and constants, e.g. for logging, path normalization, coordinate calculation etc.

See README for ocrd_utils for further information.

ocrd_models

Contains file format wrappers for PAGE-XML, METS, EXIF metadata etc.

See README for ocrd_models for further information.

ocrd_modelfactory

Code to instantiate models from existing data.

See README for ocrd_modelfactory for further information.

ocrd_validators

Schemas and routines for validating BagIt, ocrd-tool.json, workspaces, METS, page, CLI parameters etc.

See README for ocrd_validators for further information.

ocrd

Depends on all of the above, also contains decorators and classes for creating OCR-D processors and CLIs.

Also contains the command line tool ocrd.

See README for ocrd for further information.

Testing

Download assets (make assets)

Test with local files: make test

  • Test with local asset server:

    • Start asset-server: make asset-server
    • make test OCRD_BASEURL='http://localhost:5001/'
  • Test with remote assets:

    • make test OCRD_BASEURL='https://github.com/OCR-D/assets/raw/master/data/'

See Also

core's People

Contributors

b2m avatar bertsky avatar cneud avatar finkf avatar j23d avatar kba avatar m3ssman avatar mikegerber avatar stweil avatar witiko avatar wrznr avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.