opencv-document-scanner's Introduction

Document Scanner

An interactive document scanner built in Python using OpenCV

The scanner takes a poorly scanned image, finds the corners of the document, applies the perspective transformation to get a top-down view of the document, sharpens the image, and applies an adaptive color threshold to clean up the image.

On my test dataset of 280 images, the program correctly detected the corners of the document 92.8% of the time.

This project makes use of the transform and imutils modules from pyimagesearch (which can be accessed here). The UI code for the interactive mode is adapted from poly_editor.py from here.

  • You can manually click and drag the corners of the document to be perspective transformed.

  • The scanner can also process an entire directory of images automatically and save the output in an output directory.

Here are some examples of images before and after scan:

Usage

python scan.py (--images <IMG_DIR> | --image <IMG_PATH>) [-i]
  • The -i flag enables interactive mode, where you will be prompted to click and drag the corners of the document. For example, to scan a single image with interactive mode enabled:
python scan.py --image sample_images/desk.JPG -i
  • Alternatively, to scan all images in a directory without any input:
python scan.py --images sample_images
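The command-line shape shown above (exactly one of --images or --image, plus an optional -i flag) could be expressed with argparse along these lines. This is a sketch under assumptions, not the parser from scan.py: the function name `build_parser`, the help texts, and the `interactive` attribute name are made up for the example.

```python
import argparse


def build_parser():
    """Build a parser matching: scan.py (--images <IMG_DIR> | --image <IMG_PATH>) [-i]."""
    parser = argparse.ArgumentParser(prog="scan.py")

    # Exactly one of --images / --image must be given.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--images", metavar="IMG_DIR",
                        help="directory of images to scan")
    source.add_argument("--image", metavar="IMG_PATH",
                        help="path to a single image to scan")

    # -i switches on interactive corner adjustment.
    parser.add_argument("-i", dest="interactive", action="store_true",
                        help="click and drag the detected corners before scanning")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Using a required mutually exclusive group means argparse itself rejects a call that passes both options or neither, so no extra validation is needed.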

opencv-document-scanner's People

Contributors

andrewdcampbell

opencv-document-scanner's Issues

LSD Error: new_image_double: invalid image size.

I have successfully installed all the requirements, but I'm getting the error "LSD Error: new_image_double: invalid image size." I can't figure out the cause: the message doesn't give a line number or file.

in pylsd/lsd.py: ValueError: could not convert string to float: '29,373552' (locale DE) - when "," is decimal operator

In site-packages/pylsd/lsd.py I had to change the decimal separator from "," to "." for my locale "DE".

A solution which should work for all locales with decimal separator "," and "." uses string2float():

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import ctypes
import os
import sys
from tempfile import NamedTemporaryFile

import numpy as np


from .bindings.lsd_ctypes import lsdlib

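def string2float(string):
    """Parse a number whose decimal separator may be ',' or '.'."""
    t = string
    dot_pos = t.rfind('.')
    comma_pos = t.rfind(',')
    if comma_pos > dot_pos:
        # Comma is the decimal separator: drop '.' grouping, then normalise.
        t = t.replace(".", "")
        t = t.replace(",", ".")
    else:
        # Dot is the decimal separator (or there is none): drop ',' grouping.
        t = t.replace(",", "")
    return float(t)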

def lsd(src, scale=0.8, sigma_scale=0.6, quant=2.0, ang_th=22.5, eps=0.0, density_th=0.7, n_bins=1024, max_grad=255.0):
    """Analyse image with Line Segment Detector.

    Args:
        src (Numpy object) : 2-d grayscale image array (HxW) to analyse.

    Keyword Args:
        scale (double) : Scale the image by Gaussian filter.
        sigma_scale (double) : Sigma for Gaussian filter is computed as sigma = sigma_scale/scale.
        quant (double) : Bound to the quantization error on the gradient norm.
        ang_th (double) : Gradient angle tolerance in degrees.
        eps (double) : Detection threshold, -log10(NFA).
        density_th (double) : Minimal density of region points in rectangle.
        n_bins (int) : Number of bins in pseudo-ordering of gradient modulus.
        max_grad (double) : Gradient modulus in the highest bin. The default value corresponds to the highest gradient modulus on images with gray levels in [0,255].

    Returns:
        A list of line candidates as 5-tuples of (x1, y1, x2, y2, width).
    """
    rows, cols = src.shape
    src = src.reshape(1, rows * cols).tolist()[0]

    lens = len(src)
    src = (ctypes.c_double * lens)(*src)

    with NamedTemporaryFile(prefix='pylsd-', suffix='.ntl.txt', delete=False) as fp:
        fname = fp.name
        fname_bytes = bytes(fp.name) if sys.version_info < (3, 0) else bytes(fp.name, 'utf8')

    lsdlib.lsdGet(src, ctypes.c_int(rows), ctypes.c_int(cols), fname_bytes,
                  ctypes.c_double(scale),
                  ctypes.c_double(sigma_scale),
                  ctypes.c_double(quant),
                  ctypes.c_double(ang_th),
                  ctypes.c_double(eps),
                  ctypes.c_double(density_th),
                  ctypes.c_int(n_bins),
                  ctypes.c_double(max_grad))

    with open(fname, 'r') as fp:
        output = fp.read()
        cnt = output.strip().split(' ')
        count = int(cnt[0])
        dim = int(cnt[1])
        lines = np.array([string2float(each) for each in cnt[2:]])
        lines = lines.reshape(count, dim)

    os.remove(fname)
    return lines

Without this change, the string-to-float conversion fails:

Traceback (most recent call last):
  File "./scan.py", line 335, in <module>
    scanner.scan(im_dir + '/' + im)
  File "./scan.py", line 284, in scan
    screenCnt = self.get_contour(rescaled_image)
  File "./scan.py", line 198, in get_contour
    test_corners = self.get_corners(edged)
  File "./scan.py", line 98, in get_corners
    lines = lsd(img)
  File "/home/benutzer/.local/lib/python3.8/site-packages/pylsd/lsd.py", line 58, in lsd
    lines = np.array([float(each) for each in cnt[2:]])
  File "/home/benutzer/.local/lib/python3.8/site-packages/pylsd/lsd.py", line 58, in <listcomp>
    lines = np.array([float(each) for each in cnt[2:]])
ValueError: could not convert string to float: '29,373552'
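The parsing idea from this issue can be tried in isolation, without pylsd installed. The sketch below repeats the string2float() logic from the report and runs it on the value from the traceback; the sample inputs other than '29,373552' are made up for illustration.

```python
def string2float(string):
    """Parse a number whose decimal separator may be ',' or '.'.

    Whichever of the two characters appears last is treated as the decimal
    separator; the other one is treated as a grouping character and removed.
    """
    dot_pos = string.rfind('.')
    comma_pos = string.rfind(',')
    if comma_pos > dot_pos:
        # e.g. '29,373552' or '1.234,5'
        string = string.replace('.', '').replace(',', '.')
    else:
        # e.g. '29.373552' or '1,234.5'
        string = string.replace(',', '')
    return float(string)


if __name__ == "__main__":
    for text in ['29,373552', '29.373552', '1.234,5', '1,234.5']:
        print(text, '->', string2float(text))
```

One limitation of this heuristic: a string that contains only a grouping comma, such as '1,234' meant as one thousand two hundred thirty-four, is read as 1.234. That does not matter for the output discussed here, where the comma is always the decimal separator.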

I got CV error with python 3.7.2

cv2.error: OpenCV(4.0.1) /build/opencv/src/opencv-4.0.1/modules/imgproc/src/shapedescr.cpp:237: error: (-215:Assertion failed) count >= 0 && (depth == CV_32F || depth == CV_32S) in function 'arcLength'

libraries import error

Hi, I have an issue while trying to install the polygon_interacter and pyimagesearch libraries. How should I handle it?

ImportError: cannot import name 'lsd' from 'lsd'

python3 scan.py --image sample_images/dollar_bill.JPG
Traceback (most recent call last):
  File "scan.py", line 20, in <module>
    from pylsd.lsd import lsd
  File "xx/Library/Python/3.8/lib/python/site-packages/pylsd/__init__.py", line 8, in <module>
    from lsd import lsd
ImportError: cannot import name 'lsd' from 'lsd' (/Users/david/Library/Python/3.8/lib/python/site-packages/lsd/__init__.py)

I got this error

Traceback (most recent call last):
File "D:\anaconda3\lib\site-packages\matplotlib\backends\backend_qt5.py", line 519, in _draw_idle
self.draw()
File "D:\anaconda3\lib\site-packages\matplotlib\backends\backend_agg.py", line 402, in draw
self.figure.draw(self.renderer)
File "D:\anaconda3\lib\site-packages\matplotlib\artist.py", line 50, in draw_wrapper
return draw(artist, renderer, *args, **kwargs)
File "D:\anaconda3\lib\site-packages\matplotlib\figure.py", line 1649, in draw
renderer, self, artists, self.suppressComposite)
File "D:\anaconda3\lib\site-packages\matplotlib\image.py", line 138, in _draw_list_compositing_images
a.draw(renderer)
File "D:\anaconda3\lib\site-packages\matplotlib\artist.py", line 50, in draw_wrapper
return draw(artist, renderer, *args, **kwargs)
File "D:\anaconda3\lib\site-packages\matplotlib\axes\_base.py", line 2628, in draw
mimage._draw_list_compositing_images(renderer, self, artists)
File "D:\anaconda3\lib\site-packages\matplotlib\image.py", line 138, in _draw_list_compositing_images
a.draw(renderer)
File "D:\anaconda3\lib\site-packages\matplotlib\artist.py", line 50, in draw_wrapper
return draw(artist, renderer, *args, **kwargs)
File "D:\anaconda3\lib\site-packages\matplotlib\axis.py", line 1185, in draw
ticks_to_draw = self._update_ticks(renderer)
File "D:\anaconda3\lib\site-packages\matplotlib\axis.py", line 1023, in _update_ticks
tick_tups = list(self.iter_ticks()) # iter_ticks calls the locator
File "D:\anaconda3\lib\site-packages\matplotlib\axis.py", line 967, in iter_ticks
majorLocs = self.major.locator()
File "D:\anaconda3\lib\site-packages\matplotlib\ticker.py", line 1985, in __call__
return self.tick_values(vmin, vmax)
File "D:\anaconda3\lib\site-packages\matplotlib\ticker.py", line 1993, in tick_values
locs = self._raw_ticks(vmin, vmax)
File "D:\anaconda3\lib\site-packages\matplotlib\ticker.py", line 1932, in _raw_ticks
nbins = np.clip(self.axis.get_tick_space(),
File "D:\anaconda3\lib\site-packages\matplotlib\axis.py", line 2154, in get_tick_space
tick = self._get_tick(True)
File "D:\anaconda3\lib\site-packages\matplotlib\axis.py", line 1818, in _get_tick
return XTick(self.axes, 0, '', major=major, **tick_kw)
File "D:\anaconda3\lib\site-packages\matplotlib\axis.py", line 172, in __init__
self.apply_tickdir(tickdir)
File "D:\anaconda3\lib\site-packages\matplotlib\axis.py", line 429, in apply_tickdir
self.stale = True
File "D:\anaconda3\lib\site-packages\matplotlib\artist.py", line 236, in stale
self.stale_callback(self, val)
File "D:\anaconda3\lib\site-packages\matplotlib\artist.py", line 63, in _stale_axes_callback
self.axes.stale = val
File "D:\anaconda3\lib\site-packages\matplotlib\artist.py", line 236, in stale
self.stale_callback(self, val)
File "D:\anaconda3\lib\site-packages\matplotlib\figure.py", line 57, in _stale_figure_callback
self.figure.stale = val
File "D:\anaconda3\lib\site-packages\matplotlib\artist.py", line 236, in stale
self.stale_callback(self, val)
File "D:\anaconda3\lib\site-packages\matplotlib\pyplot.py", line 568, in _auto_draw_if_interactive
if val and matplotlib.is_interactive() and not fig.canvas.is_saving():
RecursionError: maximum recursion depth exceeded
QWidget::paintEngine: Should no longer be called
QPainter::begin: Paint device returned engine == 0, type: 1
QPainter::end: Painter not active, aborted
QWidget::paintEngine: Should no longer be called
QPainter::begin: Paint device returned engine == 0, type: 1

Wrong edge detection in case of no background available

Hello, First of all, I want to say thank you for this wonderful code repository. It really helps me a lot.

I have tried converting many different types of images and it works quite well, but I have found one case where it detects the wrong contour: when I pass an image with little or no background.

Here is the source image: [image: 8]

Here is the final transformed image: [image: 8]

It detected the wrong contour.

Can you please guide me in resolving this issue?

Thanks in advance.

Implementation has been removed due to original code license issues

I am receiving the error below when executing python scan.py --images imageDirectory:

Traceback (most recent call last):
  File "scan.py", line 335, in <module>
    scanner.scan(im_dir + '/' + im)
  File "scan.py", line 284, in scan
    screenCnt = self.get_contour(rescaled_image)
  File "scan.py", line 198, in get_contour
    test_corners = self.get_corners(edged)
  File "scan.py", line 95, in get_corners
    lsd = cv2.createLineSegmentDetector()
cv2.error: OpenCV(4.1.0) C:\projects\opencv-python\opencv\modules\imgproc\src\lsd.cpp:143: error: (-213:The function/feature is not implemented) Implementation has been removed due original code license issues in function 'cv::LineSegmentDetectorImpl::LineSegmentDetectorImpl'

Python version : 3.7.3
OpenCV version : 4.1.0

Is there any workaround for this issue?

Thanks.

Replace dilate with closing or opening

Hi!

I noticed that the edged image has edges larger than the original document boundary.
This happens because of the dilation. It would be better to replace it with closing or opening.
In my tests, closing gives a slightly better score.
cv2.morphologyEx(gray, cv2.MORPH_CLOSE, kernel)

dilated = cv2.dilate(gray, kernel)
