computervision_ue's Introduction

Unsupervised person localization in wilderness search and rescue

Introduction

This project is part of the exercise class "UE Computer Vision, Oliver Bimber / Indrajit Kurmi, 2021W" at JKU, Austria.

In this lab project, we had to implement an unsupervised person localization algorithm.

Challenge

challenge.png

Data extraction

data_method.png

Methods and solution

0. Image pre-processing

img.png

img_1.png

1. Color channels approach

Method

img_2.png

img_3.png

img.png

img_1.png

  • OpenCV functions cv2.findContours() and cv2.boundingRect()
  • Pad by 2 pixels in each direction
  • Merge overlapping bounding boxes
  • Pad by 7 pixels in x and 4 pixels in y
  • Choose the biggest bounding box of blue and red image, respectively
  • Remove if x<24 or y<18
  • If there are no detections: lower the threshold for the binary image and start again from 1
  • Merge detections overlapping between the two images
  • Pad detections to a minimum size of 38x30 px
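
The box post-processing steps above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: boxes are assumed to be (x, y, w, h) tuples, the format cv2.boundingRect() returns, and all helper names (pad_box, merge_overlapping, biggest_box, pad_to_min_size) are hypothetical. The size filter and the threshold-lowering retry are left out, and clipping to the image borders is not shown.

```python
from typing import List, Tuple

# (x, y, w, h), the tuple layout returned by cv2.boundingRect()
Box = Tuple[int, int, int, int]


def pad_box(box: Box, pad_x: int, pad_y: int) -> Box:
    """Grow a box by pad_x pixels left/right and pad_y pixels up/down."""
    x, y, w, h = box
    return (x - pad_x, y - pad_y, w + 2 * pad_x, h + 2 * pad_y)


def overlaps(a: Box, b: Box) -> bool:
    """True if the two axis-aligned boxes share any area."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def union(a: Box, b: Box) -> Box:
    """Smallest box that contains both a and b."""
    x0, y0 = min(a[0], b[0]), min(a[1], b[1])
    x1 = max(a[0] + a[2], b[0] + b[2])
    y1 = max(a[1] + a[3], b[1] + b[3])
    return (x0, y0, x1 - x0, y1 - y0)


def merge_overlapping(boxes: List[Box]) -> List[Box]:
    """Merge pairs of overlapping boxes until no overlap remains."""
    merged = list(boxes)
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                if overlaps(merged[i], merged[j]):
                    merged[i] = union(merged[i], merged[j])
                    del merged[j]
                    changed = True
                    break
            if changed:
                break
    return merged


def biggest_box(boxes: List[Box]) -> Box:
    """Pick the box with the largest area."""
    return max(boxes, key=lambda b: b[2] * b[3])


def pad_to_min_size(box: Box, min_w: int = 38, min_h: int = 30) -> Box:
    """Enlarge a box (roughly centered) so it is at least min_w x min_h."""
    x, y, w, h = box
    if w < min_w:
        x -= (min_w - w) // 2
        w = min_w
    if h < min_h:
        y -= (min_h - h) // 2
        h = min_h
    return (x, y, w, h)
```

For one color channel, a call sequence matching the list would be: pad every raw box with pad_box(b, 2, 2), run merge_overlapping, pad the results with pad_box(b, 7, 4), and keep biggest_box of what remains. The detections from both channels can then be merged again and passed through pad_to_min_size.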

Advantages and disadvantages

  • Advantage:
    • can distinguish people from other objects by detecting movement
  • Disadvantages:
    • bias towards detecting people wearing blue or red; problems finding people with green clothing
    • cannot detect people that are not moving or moving too little

2. Autoencoder approach

The autoencoder is implemented in anomaly_detection_autoencoder_SAR_JKU.ipynb.

Initial idea

After going through various research papers on anomaly detection, we decided to try an autoencoder approach for this task.

  • Autoencoder -> encoder-decoder system to reconstruct the input as the output.

  • Train a convolutional autoencoder so that it will reconstruct an image from the normal data with a smaller reconstruction error, but reconstruct an image from the anomaly data with a larger reconstruction error

  • Our solution decides if an image is from the normal data or from the anomaly data based on a threshold of the reconstruction error.

  • the model is encouraged to learn to precisely reproduce the most frequently observed characteristics

  • when facing anomalies, the model should worsen its reconstruction performance.

  • after training, the autoencoder will accurately reconstruct normal data, while failing to do so with unfamiliar anomalous data

  • reconstruction error (the error between the original data and its low dimensional reconstruction) is used as an anomaly score to detect anomalies

  • we are aware that autoencoding models can be very good at reconstructing anomalous examples as well, in which case they cannot reliably perform anomaly detection
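
The thresholding idea from the list above can be illustrated with a short sketch. It assumes image batches shaped (N, H, W, C) and uses per-image mean squared error as a stand-in anomaly score; the project itself works with SSIM, and the function names and the threshold value here are hypothetical.

```python
import numpy as np


def reconstruction_errors(originals: np.ndarray, reconstructions: np.ndarray) -> np.ndarray:
    """Per-image mean squared error for batches shaped (N, H, W, C)."""
    diff = originals.astype(np.float64) - reconstructions.astype(np.float64)
    return np.mean(diff ** 2, axis=(1, 2, 3))


def flag_anomalies(errors: np.ndarray, threshold: float) -> np.ndarray:
    """Mark every image whose reconstruction error exceeds the threshold."""
    return errors > threshold


# Toy data: the first image is reconstructed perfectly, the second is not.
originals = np.zeros((2, 4, 4, 3))
reconstructions = originals.copy()
reconstructions[1] += 0.5

errors = reconstruction_errors(originals, reconstructions)
flags = flag_anomalies(errors, threshold=0.1)  # threshold chosen only for this toy example
```

In practice the reconstructions would come from the trained autoencoder, and the threshold would have to be tuned on real data.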

Model Architecture

img_4.png

  • Base is a Convolutional autoencoder for image denoising from official Keras docs
  • Adapted loss for Structural Similarity Index (SSIM)
  • We chose this base because it is
    • Relatively straightforward to tune
    • Simple architecture
    • Sufficient for our image detection problem
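
To make the SSIM-based loss more concrete, here is a simplified NumPy sketch. It computes SSIM from global image statistics with the usual constants C1 = (0.01 L)^2 and C2 = (0.03 L)^2, whereas standard SSIM implementations (for example tf.image.ssim in TensorFlow) average over local windows. The function names are hypothetical and this is not the loss code from the notebook.

```python
import numpy as np


def global_ssim(x: np.ndarray, y: np.ndarray, max_val: float = 1.0) -> float:
    """Single-window SSIM computed over the whole image (simplified)."""
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    c1 = (0.01 * max_val) ** 2
    c2 = (0.03 * max_val) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(numerator / denominator)


def ssim_loss(x: np.ndarray, y: np.ndarray, max_val: float = 1.0) -> float:
    """Loss that is 0 for identical images and grows as similarity drops."""
    return 1.0 - global_ssim(x, y, max_val)
```

Minimizing 1 - SSIM pushes the autoencoder towards reconstructions that preserve luminance, contrast, and structure rather than only per-pixel intensity.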

Implementation and findings

Over the course of the implementation, it became apparent that

  • properly pre-processed images improve the performance of the autoencoder a lot
  • a deep convolutional autoencoder is sufficient to reproduce the images properly
  • the autoencoder should be trained with color images as the color provides most of the information for the task
  • the biggest challenges are the length of training and the threshold on the SSIM differences:
    • training too briefly leaves too many reconstruction errors
    • training too long lets the model reconstruct anomalies as well
    • the threshold has to be tuned to isolate the most useful SSIM differences

img_5.png

Reconstruction worked well

img_6.png

Visualization of activation layers over RGB channels, showing stronger activations for the red and blue channels

img_8.png

Indication of finding the anomalies as desired.

img_9.png

Finding the proper threshold for SSIM differences

img.png


The project was implemented over the course of a semester at university.

In the end we implemented the whole pipeline to fit the corresponding grading criteria.

computervision_ue's People

Contributors

createdd, lisa-schneckenreiter, nikoszka, utawagner

computervision_ue's Issues

fix integration of images in re-usable and understandable code

Let's brush the file up. I added some changes to make it usable on Unix systems; maybe we can add a parameter for that. I also fixed the issue with the validation datasets, which were not merged. The reason was that their subfolders started with "valid" instead of "validation", unlike the test and train folders.
