Leptess

Productive and safe Rust bindings/wrappers for Tesseract and Leptonica.

Build dependencies

Make sure you have clang, Leptonica and Tesseract installed.

Ubuntu

sudo apt-get install libleptonica-dev libtesseract-dev clang

You will also need to install tesseract language data based on your OCR needs:

sudo apt-get install tesseract-ocr-eng

Mac

brew install tesseract leptonica

Windows

On Windows, this library uses Microsoft's vcpkg to provide tesseract.

Install vcpkg and set up user-wide integration; otherwise the vcpkg crate won't be able to find the library.

To install tesseract:

REM from the vcpkg directory

REM 32 bit
.\vcpkg install tesseract:x86-windows

REM 64 bit
.\vcpkg install tesseract:x64-windows

To run the tests, configure the vcpkg crate to find the tesseract library:

SET VCPKGRS_DYNAMIC=true
cargo test

Usage

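// Initialize Tesseract with the English language data; `None` uses the default data path.
let mut lt = leptess::LepTess::new(None, "eng").unwrap();
// Load the image to recognize.
lt.set_image("path/to/page.bmp");
// Run OCR and print the recognized text as UTF-8.
println!("{}", lt.get_utf8_text().unwrap());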
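
When a custom data directory is used instead of `None`, a missing language file is a common reason for initialization to fail. The following is a minimal sketch of a pre-flight check, assuming language data is stored as `<lang>.traineddata` inside the data directory (as with `tests/tessdata/eng.traineddata` in this repository). `traineddata_path` is a hypothetical helper written for illustration; it is not part of leptess.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper (not part of leptess): returns the path to
/// `<tessdata_dir>/<lang>.traineddata` if that file exists.
fn traineddata_path(tessdata_dir: &Path, lang: &str) -> Option<PathBuf> {
    // Assumption: each language is stored as "<lang>.traineddata".
    let candidate = tessdata_dir.join(format!("{}.traineddata", lang));
    if candidate.is_file() {
        Some(candidate)
    } else {
        None
    }
}
```

If the check succeeds, the directory can presumably be passed as the first argument in place of `None`, for example `LepTess::new(Some("tests/tessdata"), "eng")`; check the crate docs for the exact signature in your version.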

For more examples, see the docs and the examples directory.

To run the demos in the examples directory, try:

cargo run --example low_level_ocr_full_page

Development

To run the tests, you will need Tesseract 4.x to match the language data in tests/tessdata/eng.traineddata. See the CircleCI config for how to replicate the setup.
