andrewdcampbell / opencv-document-scanner

An interactive document scanner built in Python using OpenCV featuring automatic corner detection, image sharpening, and color thresholding.

Document Scanner

An interactive document scanner built in Python using OpenCV

The scanner takes a poorly scanned image, finds the corners of the document, applies the perspective transformation to get a top-down view of the document, sharpens the image, and applies an adaptive color threshold to clean up the image.
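Before a perspective transformation can be applied, the four detected corners need a consistent order and the output size has to be chosen. The following is a minimal pure-Python sketch of one common way to do that; the helper names `order_corners` and `output_size` are hypothetical, and the `transform` module this project uses may do it differently.

```python
import math


def order_corners(points):
    """Return corners as (top-left, top-right, bottom-right, bottom-left).

    Assumes image coordinates: x grows to the right, y grows downward.
    """
    top_left = min(points, key=lambda p: p[0] + p[1])      # smallest x + y
    bottom_right = max(points, key=lambda p: p[0] + p[1])  # largest x + y
    top_right = min(points, key=lambda p: p[1] - p[0])     # smallest y - x
    bottom_left = max(points, key=lambda p: p[1] - p[0])   # largest y - x
    return top_left, top_right, bottom_right, bottom_left


def output_size(ordered):
    """Pick the width and height of the top-down view from the edge lengths."""
    tl, tr, br, bl = ordered
    width = max(math.dist(tl, tr), math.dist(bl, br))
    height = max(math.dist(tl, bl), math.dist(tr, br))
    return int(width), int(height)


# Four corners in arbitrary order, as a detector might return them.
corners = [(10, 200), (300, 20), (20, 10), (310, 210)]
ordered = order_corners(corners)
size = output_size(ordered)
```

The ordered corners and the chosen size are what a call such as `cv2.getPerspectiveTransform` followed by `cv2.warpPerspective` would then consume.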

On my test dataset of 280 images, the program correctly detected the corners of the document 92.8% of the time.

This project makes use of the transform and imutils modules from pyimagesearch (which can be accessed here). The UI code for the interactive mode is adapted from poly_editor.py from here.

You can manually click and drag the corners of the document to be perspective transformed:

The scanner can also process an entire directory of images automatically and save the output in an output directory:

Here are some examples of images before and after scan:

Usage

python scan.py (--images <IMG_DIR> | --image <IMG_PATH>) [-i]

The -i flag enables interactive mode, where you will be prompted to click and drag the corners of the document. For example, to scan a single image with interactive mode enabled:

python scan.py --image sample_images/desk.JPG -i

Alternatively, to scan all images in a directory without any input:

python scan.py --images sample_images
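The usage line above can be modeled with `argparse`. This is only a sketch of how such an interface might be declared; the function name `build_parser` and the `interactive` attribute are assumptions, and the real `scan.py` may be written differently.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="scan.py")
    # Exactly one of --images / --image must be given.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--images", metavar="IMG_DIR",
                        help="directory of images to scan")
    source.add_argument("--image", metavar="IMG_PATH",
                        help="path to a single image to scan")
    # -i switches on interactive corner adjustment.
    parser.add_argument("-i", dest="interactive", action="store_true",
                        help="click and drag the corners before scanning")
    return parser


# Equivalent to: python scan.py --image sample_images/desk.JPG -i
args = build_parser().parse_args(["--image", "sample_images/desk.JPG", "-i"])
```

Passing both `--image` and `--images`, or neither, makes `argparse` exit with a usage error, which matches the `( ... | ... )` notation in the usage line.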

opencv-document-scanner's Issues

Add a License

Without a license specified, no one can use this code. https://github.com/readme/guides/open-source-licensing

LSD Error: new_image_double: invalid image size.

I have successfully installed all the requirements, but I am getting the error "LSD Error: new_image_double: invalid image size." I am unable to figure out the cause, because the message does not give a line number or file.

in pylsd/lsd.py: ValueError: could not convert string to float: '29,373552' (locale DE) - when "," is decimal operator

In site-packages/pylsd/lsd.py I had to change the decimal separator from "," to "." for my locale "DE".

A solution which should work for all locales with decimal separator "," and "." uses string2float():

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import ctypes
import os
import sys
from tempfile import NamedTemporaryFile

import numpy as np


from .bindings.lsd_ctypes import lsdlib

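def string2float(string):
    """Parse a number whose decimal separator may be "," or "."."""
    # Whichever separator appears last is treated as the decimal separator.
    dot_pos = string.rfind('.')
    comma_pos = string.rfind(',')
    if comma_pos > dot_pos:
        # e.g. "1.234,5": drop "." grouping marks, then turn "," into ".".
        t = string.replace(".", "").replace(",", ".")
    else:
        # e.g. "1,234.5": drop "," grouping marks.
        t = string.replace(",", "")
    return float(t)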

def lsd(src, scale=0.8, sigma_scale=0.6, quant=2.0, ang_th=22.5, eps=0.0, density_th=0.7, n_bins=1024, max_grad=255.0):
    """Analyse image with Line Segment Detector.

    Args:
        src (Numpy object) : 2-d grayscale image array (HxW) to analyse.

    Keyword Args:
        scale (double) : Scale the image by Gaussian filter.
        sigma_scale (double) : Sigma for Gaussian filter is computed as sigma = sigma_scale/scale.
        quant (double) : Bound to the quantization error on the gradient norm.
        ang_th (double) : Gradient angle tolerance in degrees.
        eps (double) : Detection threshold, -log10(NFA).
        density_th (double) : Minimal density of region points in rectangle.
        n_bins (int) : Number of bins in pseudo-ordering of gradient modulus.
        max_grad (double) : Gradient modulus in the highest bin. The default value corresponds to the highest gradient modulus on images with gray levels in [0,255].

    Returns:
        A list of line candidates as 5-tuples of (x1, y1, x2, y2, width).
    """
    rows, cols = src.shape
    src = src.reshape(1, rows * cols).tolist()[0]

    lens = len(src)
    src = (ctypes.c_double * lens)(*src)

    with NamedTemporaryFile(prefix='pylsd-', suffix='.ntl.txt', delete=False) as fp:
        fname = fp.name
        fname_bytes = bytes(fp.name) if sys.version_info < (3, 0) else bytes(fp.name, 'utf8')

    lsdlib.lsdGet(src, ctypes.c_int(rows), ctypes.c_int(cols), fname_bytes,
                  ctypes.c_double(scale),
                  ctypes.c_double(sigma_scale),
                  ctypes.c_double(quant),
                  ctypes.c_double(ang_th),
                  ctypes.c_double(eps),
                  ctypes.c_double(density_th),
                  ctypes.c_int(n_bins),
                  ctypes.c_double(max_grad))

    with open(fname, 'r') as fp:
        output = fp.read()
        cnt = output.strip().split(' ')
        count = int(cnt[0])
        dim = int(cnt[1])
        lines = np.array([string2float(each) for each in cnt[2:]])
        lines = lines.reshape(count, dim)

    os.remove(fname)
    return lines

to avoid a string→float conversion error

Traceback (most recent call last):
  File "./scan.py", line 335, in <module>
    scanner.scan(im_dir + '/' + im)
  File "./scan.py", line 284, in scan
    screenCnt = self.get_contour(rescaled_image)
  File "./scan.py", line 198, in get_contour
    test_corners = self.get_corners(edged)
  File "./scan.py", line 98, in get_corners
    lines = lsd(img)
  File "/home/benutzer/.local/lib/python3.8/site-packages/pylsd/lsd.py", line 58, in lsd
    lines = np.array([float(each) for each in cnt[2:]])
  File "/home/benutzer/.local/lib/python3.8/site-packages/pylsd/lsd.py", line 58, in <listcomp>
    lines = np.array([float(each) for each in cnt[2:]])
ValueError: could not convert string to float: '29,373552'
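To check how the proposed helper behaves, here is a standalone copy of `string2float` with a few sample inputs. The inputs other than `'29,373552'` are made up for illustration.

```python
def string2float(string):
    # Whichever of "." and "," appears last is taken as the decimal separator.
    dot_pos = string.rfind('.')
    comma_pos = string.rfind(',')
    if comma_pos > dot_pos:
        t = string.replace(".", "").replace(",", ".")
    else:
        t = string.replace(",", "")
    return float(t)


german_style = string2float("29,373552")   # value from the traceback above
english_style = string2float("29.373552")
grouped_german = string2float("1.234,5")
grouped_english = string2float("1,234.5")
```

One caveat: a string that contains only a comma, such as `"1,234"`, is ambiguous, and this helper reads the comma as a decimal separator. That should be acceptable as long as the values in the LSD output file carry no thousands separators, which is an assumption here.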

Does it do anything?

Used as suggested, but no new file was created.

dist_point_to_segment is removed in new version of matplotlib.mlab

I am getting the error below. When I checked the API changes linked underneath, I found that the function has been removed.
from matplotlib.mlab import dist_point_to_segment
ImportError: cannot import name 'dist_point_to_segment' from 'matplotlib.mlab

https://matplotlib.org/3.1.0/api/api_changes.html
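One possible workaround is to define the function locally instead of importing it. The following is a sketch of a replacement that keeps the `(p, s0, s1)` argument order of the removed function; it is written from the geometric definition, not copied from matplotlib.

```python
import math


def dist_point_to_segment(p, s0, s1):
    """Distance from point p to the line segment s0-s1 (all 2-D points)."""
    px, py = p
    x0, y0 = s0
    x1, y1 = s1
    dx, dy = x1 - x0, y1 - y0
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        # Degenerate segment: both endpoints coincide.
        return math.hypot(px - x0, py - y0)
    # Project p onto the infinite line, then clamp to the segment.
    t = ((px - x0) * dx + (py - y0) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (x0 + t * dx), py - (y0 + t * dy))


above_middle = dist_point_to_segment((0, 1), (-1, 0), (1, 0))
past_the_end = dist_point_to_segment((3, 0), (-1, 0), (1, 0))
```

The failing import can then be replaced by this definition in the module that needs it.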

I got CV error with python 3.7.2

cv2.error: OpenCV(4.0.1) /build/opencv/src/opencv-4.0.1/modules/imgproc/src/shapedescr.cpp:237: error: (-215:Assertion failed) count >= 0 && (depth == CV_32F || depth == CV_32S) in function 'arcLength'
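The report does not say what triggers the assertion. One frequent cause with OpenCV 4, assumed here and not confirmed by the report, is that `cv2.findContours` returns two values (contours, hierarchy) where OpenCV 3 returned three (image, contours, hierarchy), so code indexing the old position passes the wrong array to `cv2.arcLength`. A version-tolerant helper, similar in spirit to `imutils.grab_contours`, might look like this:

```python
def grab_contours(find_contours_result):
    """Pick the contour list out of a cv2.findContours return value."""
    if len(find_contours_result) == 2:
        # OpenCV 2.4 and 4.x: (contours, hierarchy)
        return find_contours_result[0]
    if len(find_contours_result) == 3:
        # OpenCV 3.x: (image, contours, hierarchy)
        return find_contours_result[1]
    raise ValueError("unexpected findContours return value")


# Stand-ins for real return values, for illustration only.
opencv4_style = (["contour_a", "contour_b"], "hierarchy")
opencv3_style = ("image", ["contour_a", "contour_b"], "hierarchy")
```

If the contours themselves are built by hand, the assertion message also indicates that their dtype must be 32-bit integer or 32-bit float.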

libraries import error

Hi, I am having trouble installing the polygon_interacter and pyimagesearch libraries. How should I handle this?

ImportError: cannot import name 'lsd' from 'lsd'

python3 scan.py --image sample_images/dollar_bill.JPG
Traceback (most recent call last):
  File "scan.py", line 20, in <module>
    from pylsd.lsd import lsd
  File "xx/Library/Python/3.8/lib/python/site-packages/pylsd/__init__.py", line 8, in <module>
    from lsd import lsd
ImportError: cannot import name 'lsd' from 'lsd' (/Users/david/Library/Python/3.8/lib/python/site-packages/lsd/__init__.py)

I got this error

Traceback (most recent call last):
  File "D:\anaconda3\lib\site-packages\matplotlib\backends\backend_qt5.py", line 519, in _draw_idle
    self.draw()
  File "D:\anaconda3\lib\site-packages\matplotlib\backends\backend_agg.py", line 402, in draw
    self.figure.draw(self.renderer)
  File "D:\anaconda3\lib\site-packages\matplotlib\artist.py", line 50, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "D:\anaconda3\lib\site-packages\matplotlib\figure.py", line 1649, in draw
    renderer, self, artists, self.suppressComposite)
  File "D:\anaconda3\lib\site-packages\matplotlib\image.py", line 138, in _draw_list_compositing_images
    a.draw(renderer)
  File "D:\anaconda3\lib\site-packages\matplotlib\artist.py", line 50, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "D:\anaconda3\lib\site-packages\matplotlib\axes\_base.py", line 2628, in draw
    mimage._draw_list_compositing_images(renderer, self, artists)
  File "D:\anaconda3\lib\site-packages\matplotlib\image.py", line 138, in _draw_list_compositing_images
    a.draw(renderer)
  File "D:\anaconda3\lib\site-packages\matplotlib\artist.py", line 50, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "D:\anaconda3\lib\site-packages\matplotlib\axis.py", line 1185, in draw
    ticks_to_draw = self._update_ticks(renderer)
  File "D:\anaconda3\lib\site-packages\matplotlib\axis.py", line 1023, in _update_ticks
    tick_tups = list(self.iter_ticks()) # iter_ticks calls the locator
  File "D:\anaconda3\lib\site-packages\matplotlib\axis.py", line 967, in iter_ticks
    majorLocs = self.major.locator()
  File "D:\anaconda3\lib\site-packages\matplotlib\ticker.py", line 1985, in __call__
    return self.tick_values(vmin, vmax)
  File "D:\anaconda3\lib\site-packages\matplotlib\ticker.py", line 1993, in tick_values
    locs = self._raw_ticks(vmin, vmax)
  File "D:\anaconda3\lib\site-packages\matplotlib\ticker.py", line 1932, in _raw_ticks
    nbins = np.clip(self.axis.get_tick_space(),
  File "D:\anaconda3\lib\site-packages\matplotlib\axis.py", line 2154, in get_tick_space
    tick = self._get_tick(True)
  File "D:\anaconda3\lib\site-packages\matplotlib\axis.py", line 1818, in _get_tick
    return XTick(self.axes, 0, '', major=major, **tick_kw)
  File "D:\anaconda3\lib\site-packages\matplotlib\axis.py", line 172, in __init__
    self.apply_tickdir(tickdir)
  File "D:\anaconda3\lib\site-packages\matplotlib\axis.py", line 429, in apply_tickdir
    self.stale = True
  File "D:\anaconda3\lib\site-packages\matplotlib\artist.py", line 236, in stale
    self.stale_callback(self, val)
  File "D:\anaconda3\lib\site-packages\matplotlib\artist.py", line 63, in _stale_axes_callback
    self.axes.stale = val
  File "D:\anaconda3\lib\site-packages\matplotlib\artist.py", line 236, in stale
    self.stale_callback(self, val)
  File "D:\anaconda3\lib\site-packages\matplotlib\figure.py", line 57, in _stale_figure_callback
    self.figure.stale = val
  File "D:\anaconda3\lib\site-packages\matplotlib\artist.py", line 236, in stale
    self.stale_callback(self, val)
  File "D:\anaconda3\lib\site-packages\matplotlib\pyplot.py", line 568, in _auto_draw_if_interactive
    if val and matplotlib.is_interactive() and not fig.canvas.is_saving():
RecursionError: maximum recursion depth exceeded
QWidget::paintEngine: Should no longer be called
QPainter::begin: Paint device returned engine == 0, type: 1
QPainter::end: Painter not active, aborted
QWidget::paintEngine: Should no longer be called
QPainter::begin: Paint device returned engine == 0, type: 1

Wrong edge detection in case of no background available

Hello, First of all, I want to say thank you for this wonderful code repository. It really helps me a lot.

I have tried converting many different types of images and it works quite well, but I have found one case where it detects the wrong contour: when the image has little or no background around the document.

here is the source image,

here is the final transformed image,

It detected the wrong contour.

Can you please guide me on how to resolve this issue?

Thanks in advance.

Implementation has been removed due to original code license issues

I am receiving the error below when executing python scan.py --images imageDirectory

Traceback (most recent call last):
  File "scan.py", line 335, in <module>
    scanner.scan(im_dir + '/' + im)
  File "scan.py", line 284, in scan
    screenCnt = self.get_contour(rescaled_image)
  File "scan.py", line 198, in get_contour
    test_corners = self.get_corners(edged)
  File "scan.py", line 95, in get_corners
    lsd = cv2.createLineSegmentDetector()
cv2.error: OpenCV(4.1.0) C:\projects\opencv-python\opencv\modules\imgproc\src\lsd.cpp:143: error: (-213:The function/feature is not implemented) Implementation has been removed due original code license issues in function 'cv::LineSegmentDetectorImpl::LineSegmentDetectorImpl'

Python version : 3.7.3
OpenCV version : 4.1.0

Is there any workaround for this issue?

Thanks.
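One way to keep the scanner running on OpenCV builds that lack the detector is to guard the call and fall back to another line detector when it fails. The sketch below only shows the guard; `create_line_detector` is a hypothetical helper, and the choice of fallback (for example the pylsd package that other tracebacks on this page show `scan.py` importing) is left to the caller.

```python
def create_line_detector(cv2_module):
    """Return an OpenCV line segment detector, or None if unavailable.

    cv2_module is the imported cv2 module; it is passed in explicitly so
    the helper can be exercised without OpenCV installed.
    """
    try:
        return cv2_module.createLineSegmentDetector()
    except Exception:
        # cv2.error derives from Exception; builds where the implementation
        # was removed raise it here, as in the traceback above.
        return None


# Intended use (assumes OpenCV is installed):
#   import cv2
#   detector = create_line_detector(cv2)
#   if detector is None:
#       ...  # fall back to a different line detector
```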

Document colors

Hi there!

Is there any possibility to preserve the document colors in the final result? @andrewdcampbell

Kind Regards,
João

Replace dilate to closing or opening

Hi!

I noticed that the edges in the edged image extend beyond the original document boundary.
This happens because of the dilation. It would be better to replace it with closing or opening.
In my tests, closing gives a slightly better score.
cv2.morphologyEx(gray, cv2.MORPH_CLOSE, kernel)

OpenCV-Document-Scanner/scan.py, line 194 in 412fbd2:

dilated = cv2.dilate(gray, kernel)
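The difference between the two operations can be seen on a toy binary image. Closing is a dilation followed by an erosion, so a shape that dilation grows is shrunk back afterwards. The pure-Python sketch below assumes a 3x3 square kernel and ignores out-of-bounds pixels at the border; the helper names are hypothetical and this is not the code path OpenCV uses.

```python
def _neighbourhood(img, r, c):
    """Values in the 3x3 window around (r, c), clipped to the image."""
    rows, cols = len(img), len(img[0])
    return [img[rr][cc]
            for rr in range(max(0, r - 1), min(rows, r + 2))
            for cc in range(max(0, c - 1), min(cols, c + 2))]


def dilate(img):
    # Each pixel becomes the maximum of its neighbourhood.
    return [[max(_neighbourhood(img, r, c)) for c in range(len(img[0]))]
            for r in range(len(img))]


def erode(img):
    # Each pixel becomes the minimum of its neighbourhood.
    return [[min(_neighbourhood(img, r, c)) for c in range(len(img[0]))]
            for r in range(len(img))]


def close(img):
    # Morphological closing: dilation followed by erosion.
    return erode(dilate(img))


# A 7x7 image containing a 3x3 block of ones in the centre.
img = [[1 if 2 <= r <= 4 and 2 <= c <= 4 else 0 for c in range(7)]
       for r in range(7)]

dilated_area = sum(map(sum, dilate(img)))  # block grows to 5x5
closed_area = sum(map(sum, close(img)))    # block is back to 3x3
```

In this toy case dilation alone enlarges the block by one pixel on every side, while closing leaves its outline where it was.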