Flask - image to text server

This project is an image-to-text server written in Python. It consists of a Flask server and a simple web interface for uploading image files. Once uploaded, the images are processed with Optical Character Recognition (OCR) to extract their text content. The extracted text is then appended to a related text file, which can be downloaded as a zip file.
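The last two steps described above (appending extracted text to a related text file and offering the result as a zip file) can be sketched with the standard library alone. This is a minimal sketch, not the code in app.py: the function names append_text and zip_text_files, the output directory layout, and the rule that the text file shares the image's file stem are all assumptions made for illustration.

```python
import io
import zipfile
from pathlib import Path


def append_text(out_dir: Path, image_name: str, text: str) -> Path:
    """Append extracted text to the .txt file that shares the image's stem.

    Hypothetical helper: "scan.png" maps to "scan.txt" inside out_dir.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    txt_path = out_dir / (Path(image_name).stem + ".txt")
    # Append mode keeps text from earlier uploads of the same image name.
    with txt_path.open("a", encoding="utf-8", newline="\n") as fh:
        fh.write(text.rstrip("\n") + "\n")
    return txt_path


def zip_text_files(out_dir: Path) -> bytes:
    """Bundle every .txt file in out_dir into an in-memory zip archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for txt_path in sorted(out_dir.glob("*.txt")):
            # arcname drops the directory part so the zip stays flat.
            archive.write(txt_path, arcname=txt_path.name)
    return buffer.getvalue()
```

In a Flask view, the returned bytes could be wrapped in io.BytesIO and handed to flask.send_file with as_attachment=True and a download_name; whether the project does it this way is not stated here.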

The main focus of the project was to create a simple file-uploading app; however, it became a bit more extensive than initially anticipated (and this is without a proper GUI).

Getting started

Prerequisites

  • Flask (2.2.5 or higher)
  • Python (3.10 or higher)
  • Docker (24.x or higher)
  • Docker Compose (v2.21.x or higher)
  • Windows, macOS, or Linux operating system

Installation and Setup

The project can be set up directly on the device. Just clone the project:

$ git clone https://github.com/tjkrstr/file_uploader.git

and, from the root of the cloned repository, run the app.py file using Python:

$ python3 backend/app.py

A Dockerfile and a docker-compose.yml file are also provided, which make it possible to run the project in a containerized setting. Either use the run script I have created or use Docker Compose directly (the two commands below are alternatives):

$ bash run
$ docker compose up -d

Tesseract setup and install notes:

Tesseract is the tool used to convert pictures to text. It supports a large number of different languages (Tesseract available languages). To install Tesseract on Linux:

$ sudo apt update
$ sudo apt-get install libleptonica-dev tesseract-ocr tesseract-ocr-dev libtesseract-dev python3-pil tesseract-ocr-eng tesseract-ocr-dan

Install the Python packages with pip:

$ pip install tesseract-ocr
$ pip install pytesseract

If you install Tesseract this way, you do not need to set pytesseract.pytesseract.tesseract_cmd in your Python script. You do need it if the script is running on a Windows machine:

pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files (x86)\\Tesseract-OCR\\tesseract.exe'
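The platform distinction above can be folded into one helper so the same script runs on Linux and Windows. The following is a sketch under stated assumptions: resolve_tesseract_cmd and image_to_text are hypothetical names that are not taken from app.py, and the Windows path is simply the one quoted above. The pytesseract and Pillow imports are deferred into the function so the module still loads where they are not installed.

```python
import platform
import shutil

# Windows install path quoted in the notes above; adjust if Tesseract
# was installed somewhere else.
WINDOWS_TESSERACT = r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe"


def resolve_tesseract_cmd(system=None):
    """Return the tesseract executable to use on the given platform."""
    system = system or platform.system()
    if system == "Windows":
        return WINDOWS_TESSERACT
    # On Linux/macOS the package manager puts tesseract on the PATH;
    # fall back to the bare command name if the lookup finds nothing.
    return shutil.which("tesseract") or "tesseract"


def image_to_text(image_path, lang="eng"):
    """Run OCR on one image file and return the extracted text."""
    # Imported lazily: both packages are third-party dependencies.
    import pytesseract
    from PIL import Image

    pytesseract.pytesseract.tesseract_cmd = resolve_tesseract_cmd()
    with Image.open(image_path) as img:
        return pytesseract.image_to_string(img, lang=lang)
```

Calling image_to_text("jpegtest.jpg") would then work unchanged on either platform, provided Tesseract and the requested language pack are installed.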

Improvements and future work

The code itself could be much more efficient. Furthermore, it contains logic that is not currently used but may be in the future. Tesseract is also not perfect: for instance, the test file "jpegtest.jpg" contains only "a"s, but when it is converted to text they are detected as "d"s.

Improvements:

  • The status codes are a bit janky and could have been better integrated with other parts of the logic.
  • More extensive testing...
  • And probably many more...

Future work:

  • Add the possibility of processing PDF files.
  • Add logic that makes it possible to select the language model Tesseract should utilize via the GUI.
  • A graphical user interface (GUI). This was however not the main goal for this project as it mainly serves as a backend foundation for uploading files.
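For the language-selection item above, one possible starting point is a small validation step between the GUI and the OCR call. This is only a sketch of future work, not existing project code: build_lang_arg is a hypothetical name, and the default set of installed languages (eng, dan) is an assumption taken from the apt packages listed in the install notes. Tesseract accepts several languages joined with "+", which is the string format produced here.

```python
def build_lang_arg(requested, installed=("eng", "dan")):
    """Turn user-selected language codes into a Tesseract language string.

    Returns e.g. "eng+dan". Raises ValueError for an empty selection or
    for a code whose language pack is not in `installed`.
    """
    chosen = []
    for code in requested:
        code = code.strip().lower()
        if code not in installed:
            raise ValueError(f"language pack not installed: {code!r}")
        if code not in chosen:  # ignore duplicates, keep the user's order
            chosen.append(code)
    if not chosen:
        raise ValueError("at least one language must be selected")
    return "+".join(chosen)
```

The result can be passed as the lang argument of pytesseract.image_to_string. Instead of a hard-coded default, the installed set could be queried at startup, for example with pytesseract.get_languages() in recent pytesseract versions.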
