Semantic Recognized Real-time Camera Style Transfer

Introduction

This repository extends my final project for an Image Recognition course. It aims to build an application that performs semantic-aware, real-time, arbitrary multi-style transfer on camera input. Specifically, it applies different styles to the human and the background dynamically, using a human segmentation technique.
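The idea of styling the human and the background separately can be sketched as a per-pixel blend driven by the segmentation mask. This is only a minimal illustration in image space, assuming the two stylized frames and a soft mask are already available as NumPy arrays; the function name `composite_by_mask` is hypothetical, and the repository may apply the mask differently (for example inside the network).

```python
import numpy as np

def composite_by_mask(human_stylized, background_stylized, human_mask):
    """Blend two stylized frames using a per-pixel human mask.

    human_stylized, background_stylized: float arrays of shape (H, W, 3).
    human_mask: float array of shape (H, W) with values in [0, 1],
        where 1 means "human" and 0 means "background".
    """
    # Add a channel axis so the mask broadcasts over the RGB channels.
    mask = np.clip(human_mask, 0.0, 1.0)[..., np.newaxis]
    return mask * human_stylized + (1.0 - mask) * background_stylized

# Toy 2x2 frame: the left column is "human", the right column is "background".
human = np.full((2, 2, 3), 0.8)
background = np.full((2, 2, 3), 0.2)
mask = np.array([[1.0, 0.0], [1.0, 0.0]])
frame = composite_by_mask(human, background, mask)
```

A soft mask (values strictly between 0 and 1 near the silhouette) gives a smooth transition between the two styles instead of a hard edge.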

Reference Repositories

Human Segmentation

The human segmentation implementation used in this application is borrowed from the repository thuyngch/Human-Segmentation-PyTorch. Specifically, I adopt a UNet with a ResNet18 backbone for segmentation.

Real-time Arbitrary Style Transfer

Three real-time arbitrary style transfer implementations are borrowed from:

  1. naoto0804/pytorch-AdaIN
  2. tyui592/Avatar-Net_Pytorch
  3. GlebBrykin/SANET

Application Architecture

Application architecture

Usage

Download network weights and install timm for segmentation

Pre-trained weights for all style transfer networks are already included in the repository. The segmentation network weights are too large to include, so download them here and place them under the model_checkpoints directory in each of AdaIN_DynamicMask, AvatarNet_DynamicMask and SANet_DynamicMask.

Then install timm for the segmentation network from any of the three subdirectories, e.g.

cd AdaIN_DynamicMask
pip install -e models/pytorch-image-models
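Before starting the application, it may help to confirm that the segmentation weights were copied into all three subdirectories. The following is a small sketch of such a check; the helper `missing_weights` is hypothetical, and the weight file name is not stated here, so it is passed in as a parameter that you must supply yourself.

```python
from pathlib import Path

# Subdirectory names as listed in the instructions above.
SUBPROJECTS = ("AdaIN_DynamicMask", "AvatarNet_DynamicMask", "SANet_DynamicMask")

def missing_weights(repo_root, weight_filename):
    """Return the subprojects that lack the given file in model_checkpoints.

    weight_filename is a placeholder for the name of the downloaded
    segmentation weight file; it is an assumption, not a documented name.
    """
    missing = []
    for name in SUBPROJECTS:
        weight_path = Path(repo_root) / name / "model_checkpoints" / weight_filename
        if not weight_path.is_file():
            missing.append(name)
    return missing
```

Calling `missing_weights(".", "<downloaded file name>")` from the repository root returns an empty list once the file is present in every subproject.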

Start web camera application

For AdaIN_DynamicMask and AvatarNet_DynamicMask, use

python webcam.py --human_style "path to style image for human" --background_style "path to style image for background" --ratio "number between 0 and 1" (optional)

The ratio argument adjusts the style strength: 0 means the output will be the same as the original image, and 1 gives the strongest style effect.
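To illustrate how such a ratio can work, here is a minimal NumPy sketch of the content-style trade-off described in the AdaIN paper: the content features are re-normalized to the style statistics and then linearly interpolated with the original content features. It assumes the ratio behaves like that trade-off weight; the function names are hypothetical, the real networks operate on encoder features followed by a decoder, and AvatarNet may implement its strength control differently.

```python
import numpy as np

def adain(content_feat, style_feat, eps=1e-5):
    """Adaptive instance normalization on (C, H, W) feature maps."""
    c_mean = content_feat.mean(axis=(1, 2), keepdims=True)
    c_std = content_feat.std(axis=(1, 2), keepdims=True) + eps
    s_mean = style_feat.mean(axis=(1, 2), keepdims=True)
    s_std = style_feat.std(axis=(1, 2), keepdims=True) + eps
    # Normalize the content per channel, then rescale to the style statistics.
    return s_std * (content_feat - c_mean) / c_std + s_mean

def blend_with_ratio(content_feat, style_feat, ratio):
    """Interpolate between content features (ratio 0) and AdaIN output (ratio 1)."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    stylized = adain(content_feat, style_feat)
    return ratio * stylized + (1.0 - ratio) * content_feat

# Random stand-ins for encoder features of a content and a style image.
rng = np.random.default_rng(0)
content = rng.normal(size=(4, 8, 8))
style = rng.normal(loc=2.0, scale=3.0, size=(4, 8, 8))
same_as_content = blend_with_ratio(content, style, 0.0)
fully_stylized = blend_with_ratio(content, style, 1.0)
```

With a ratio of 0 the features are passed through unchanged, and with a ratio of 1 each channel takes on the mean of the corresponding style channel.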

For SANet_DynamicMask, the borrowed implementation currently doesn't support style strength adjustment, so the ratio argument has no effect. I intend to add this feature in the future.

Results and evaluation

AdaIN

AdaIN result

AvatarNet

AvatarNet result

SANet

SANet result

Runtime Profile

Runtime profile

The runtime profile was measured on a server with a GTX 1080Ti card by reading one content image repeatedly (because I can't use a camera on the server :( , and my laptop doesn't have a powerful GPU). The content image size is 1280×720 and the style image size is 400×400.
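A measurement of this kind, where a single image is processed repeatedly, can be sketched with a small timing helper. This is an illustrative sketch rather than the script used for the profile; `measure_fps` and `fake_pipeline` are hypothetical names, and the workload below is a stand-in for the real segmentation and style transfer pipeline.

```python
import time

def measure_fps(process_frame, frame, warmup=5, iterations=50):
    """Average frames per second of process_frame on one repeated frame."""
    for _ in range(warmup):
        process_frame(frame)  # warm-up runs are excluded from the timing
    start = time.perf_counter()
    for _ in range(iterations):
        process_frame(frame)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading for extremely cheap workloads.
    return iterations / elapsed if elapsed > 0 else float("inf")

def fake_pipeline(frame):
    """Stand-in workload; replace with the real per-frame processing."""
    return [value * 0.5 for value in frame]

fps = measure_fps(fake_pipeline, list(range(1000)))
```

When the per-frame function runs a PyTorch model on a GPU, it should call `torch.cuda.synchronize()` before returning, because CUDA kernels execute asynchronously and the clock would otherwise be read too early.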

Gatys et al. refers to the first neural style transfer approach, proposed by Gatys et al.

The AvatarNet implementation seems to have an efficiency problem, as indicated by its low GPU utilization.

Reference Paper

  1. L. A. Gatys, A. S. Ecker, and M. Bethge, “Image style transfer using convolutional neural networks,” in Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 2414–2423, 2016.

  2. X. Huang and S. Belongie, “Arbitrary style transfer in real-time with adaptive instance normalization,” in Proceedings of the IEEE International Conference on Computer Vision, pp. 1501–1510, 2017.

  3. L. Sheng, Z. Lin, J. Shao, and X. Wang, “Avatar-Net: Multi-scale zero-shot style transfer by feature decoration,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018.

  4. D. Y. Park and K. H. Lee, “Arbitrary style transfer with style-attentional networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5880–5888, 2019.
