Comments (10)

senguptaumd commented on August 22, 2024

Let us consider the case of a fixed camera where alignment is not necessary. There are two main networks: Segmentation (Deeplab) and Matting.

  • The released code uses Deeplab with a Deep Residual Network backbone for segmentation and runs at 2.7 fps.
  • Since segmentation accuracy is not that important, we replaced Deeplab (Deep Residual Network) with Deeplab (Mobilenet). This raises the speed to 3.5 fps while the quality stays the same.
  • Lastly, we used a simple distillation strategy to refine our matting network. Inspired by MobileNetv1, we changed all the conv layers to depthwise separable convs and removed every conv bias that precedes a batchnorm. Combined with Deeplab (Mobilenet), this results in 5.9 fps, while the quality is still the same.
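To give a feel for why the depthwise separable change helps, here is a rough parameter count for a single layer. The function names and the 256-channel layer are illustrative assumptions, not values from the matting network; only the counting formulas are standard.

```python
def conv_params(c_in: int, c_out: int, k: int, bias: bool = True) -> int:
    """Parameter count of a standard k x k convolution."""
    return k * k * c_in * c_out + (c_out if bias else 0)


def separable_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Depthwise k x k conv followed by a 1 x 1 pointwise conv, both bias-free.

    The bias is dropped because a following batchnorm normalizes each channel
    and has its own learnable shift, which makes a conv bias redundant.
    """
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer size, chosen only for illustration.
standard = conv_params(256, 256, 3)
separable = separable_conv_params(256, 256, 3)
print(standard, separable, round(standard / separable, 1))
```

Fewer parameters and multiply-adds do not translate one-to-one into fps, so the measured numbers above remain the reference.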

We tested this on 960x540 frames with batch size 1 on an RTX 2060 Super.

We are currently working on a complete real-time model that can handle 30 fps, or ideally 60 fps, at 720p resolution. This is still in the research phase, and we will release the code and demo once we have something concrete.
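As a quick sanity check on those targets, the sketch below converts each frame rate mentioned above into a per-frame time budget. The helper name and the labels are made up for illustration; the fps values are the ones quoted in this comment.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


# fps values quoted in the comment above; labels are informal.
configs = [
    ("Deeplab (ResNet) + matting", 2.7),
    ("Deeplab (Mobilenet) + matting", 3.5),
    ("Deeplab (Mobilenet) + distilled matting", 5.9),
    ("real-time target", 30.0),
    ("stretch target", 60.0),
]

for label, fps in configs:
    print(f"{label}: {frame_budget_ms(fps):.1f} ms per frame")
```

Going from 5.9 fps to 30 fps means shrinking the per-frame time by roughly a factor of five, before accounting for the higher 720p resolution.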

from background-matting.

poincarelee commented on August 22, 2024

@mozpp Hi, I am trying to run your code, but the following modules do not exist: ResnetConditionHR_mo, ResnetConditionHR, ResnetConditionHR_mo_4convert, UnetBased_mo.
Did you write these modules yourself? If so, could you share them?

fire17 commented on August 22, 2024

Hey there! Your results look really stunning!
Would also love to get some inference stats :)
Please tell us your average and best fps, which GPU you used, and the image resolution for those timings. Thx a lot and have a good one! Awesome work 💪
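For anyone who wants to report comparable numbers, a minimal timing sketch might look like the following. `measure_fps` and the stand-in workload are hypothetical and not part of the Background-Matting code; in practice `run_once` would wrap one forward pass on one frame.

```python
import time
from typing import Callable, Dict


def measure_fps(run_once: Callable[[], object],
                warmup: int = 5, iters: int = 50) -> Dict[str, float]:
    """Time repeated calls of run_once and report average and best fps."""
    # Warm-up calls are excluded so one-time setup costs do not skew the result.
    for _ in range(warmup):
        run_once()

    durations = []
    for _ in range(iters):
        start = time.perf_counter()
        run_once()
        durations.append(time.perf_counter() - start)

    return {
        "avg_fps": iters / sum(durations),   # frames over total elapsed time
        "best_fps": 1.0 / min(durations),    # fastest single frame
    }


# Stand-in CPU workload so the sketch runs without a model or GPU.
stats = measure_fps(lambda: sum(i * i for i in range(10000)), warmup=2, iters=10)
print(stats)
```

On a GPU, work is launched asynchronously, so `run_once` should wait for the device to finish before returning (in PyTorch, by calling `torch.cuda.synchronize()`); otherwise the timings will look better than they are.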

kwea123 commented on August 22, 2024

What are the segmentation targets? If it's humans only, you don't need as many classes as PASCAL VOC has. Anyway, if you aim for >30 fps, I'd suggest dropping Deeplab and trying other networks; that speed is simply impossible with it.

Apart from the algorithm itself, there are also hardware-level accelerations you can apply, e.g. onnxruntime or TensorRT, which I found useful in my previous experiments with segmentation.

Also, maybe port the segmentation part from TensorFlow to PyTorch so that only one framework is used. I don't know whether data transfer between them is a bottleneck.

jiangjianping commented on August 22, 2024

@fire17,

The typical use case is a video conferencing service like Zoom. The minimum to support should be 640x360 at 20 fps, which is acceptable for a virtual or blurred background.

jiangjianping commented on August 22, 2024

@senguptaumd,

If the resolution is reduced from 540p to 360p, does the 5.9 fps improve in your tests?
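For scale, the pixel counts behind this question can be compared directly. The sketch assumes "540p" means the 960x540 test frames and "360p" means the 640x360 size mentioned earlier in the thread; `pixel_count` is a throwaway helper.

```python
def pixel_count(width: int, height: int) -> int:
    """Number of pixels in one frame."""
    return width * height


p540 = pixel_count(960, 540)   # frame size used in the reported tests
p360 = pixel_count(640, 360)   # assumed meaning of "360p" in this thread
ratio = p540 / p360
print(p540, p360, ratio)
```

A 360p frame therefore has 2.25 times fewer pixels. How much of that shows up as extra fps is not established here, since runtime does not have to scale linearly with pixel count.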

mozpp commented on August 22, 2024

I have some ideas for speeding this up. I will implement them in my fork.

jiangjianping commented on August 22, 2024

@mozpp,

Good news! Besides these optimizations, might a C++ implementation perform better?

senguptaumd commented on August 22, 2024

If you want 360p, it is better to retrain the networks with fewer residual blocks, which will require significantly less memory and also improve the runtime. Also, @kwea123 makes a valid point about Deeplab, and I think you can train a smaller network like Mobilenetv2 to perform only person segmentation on COCO/PASCAL VOC. This will speed up the segmentation.

If you guys manage to come up with a faster implementation, feel free to share the repo. I can link to it from my GitHub page.

mozpp commented on August 22, 2024

I have now implemented a lightweight version in my fork:
https://github.com/mozpp/Background-Matting/blob/master/test_bg-matting_mo.py
It reaches 30 fps for model inference (not counting Deeplab).
Does anyone have good advice for improving the performance further?
