
Hi 👋, I'm M. Mamunur Rashid

I am a computer science researcher specializing in computer graphics and vision. My research interests include VR, 3D animation, image synthesis, and image/video editing.

mamunctg

  • 🔭 I’m currently working on Facial Expression Transfer

Publications

[1] High-fidelity facial expression transfer using part-based local–global conditional GANs

Muhammad Mamunur Rashid, Shihao Wu, Yongwei Nie, Guiqing Li

The Visual Computer (CGI 2023), July 2023

Abstract

We propose a GAN-based facial expression transfer method. It can transfer the facial expression of a reference subject to the source subject while preserving the source identity attributes, such as shape, appearance, and illumination. Our method consists of two modules based on GAN: Parts Generation Networks (PGNs), and Parts Fusion Network (PFN). Instead of training the model on the entire image globally, our key idea is to train different PGNs for different local facial parts independently and then fuse the generated parts together using PFN. To encode the facial expression faithfully, we use a pre-trained parametric 3D head model (called photometric FLAME) to reconstruct realistic head models from both source and reference images. We also extract 3D facial feature points of the reference image to handle extreme poses and occlusions. Based on the extracted contextual information, we use PGNs to generate different parts of the head independently. Finally, PFN is used to fuse all the generated parts together to form the final image. Experiments show that the proposed model outperforms state-of-the-art approaches in faithfully transferring facial expressions, especially when the reference image has a different head pose to the source image. Ablation studies demonstrate the power of using PGNs.
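The generate-then-fuse idea in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: the PGNs and the PFN are learned GANs, whereas here `generate_part` is a hypothetical stand-in that fills a region with a constant value, and `fuse_parts` is a plain mask-weighted average. All function names, the part names, and the tiny 1D "image" are assumptions made for illustration.

```python
from typing import Dict, List

Image = List[float]  # toy 1D "image": one intensity per pixel


def generate_part(value: float, size: int) -> Image:
    """Hypothetical stand-in for one Parts Generation Network (PGN)."""
    return [value] * size


def fuse_parts(parts: Dict[str, Image], masks: Dict[str, List[float]]) -> Image:
    """Stand-in for the Parts Fusion Network: mask-weighted average per pixel."""
    size = len(next(iter(parts.values())))
    fused = []
    for i in range(size):
        weight = sum(masks[name][i] for name in parts)
        if weight == 0:
            fused.append(0.0)  # no part covers this pixel
        else:
            total = sum(parts[name][i] * masks[name][i] for name in parts)
            fused.append(total / weight)
    return fused


# Two toy parts, each responsible for half of a 4-pixel image.
parts = {"eyes": generate_part(0.2, 4), "mouth": generate_part(0.8, 4)}
masks = {"eyes": [1, 1, 0, 0], "mouth": [0, 0, 1, 1]}
print(fuse_parts(parts, masks))  # [0.2, 0.2, 0.8, 0.8]
```

Each part is produced independently and only combined at the end, which mirrors the local-then-global structure the abstract describes; a learned fusion network would replace the fixed averaging rule.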

[2] Nonspeech7k dataset: Classification and analysis of human non-speech sound

Muhammad Mamunur Rashid, Guiqing Li, Chengrui Du

IET Signal Processing, June 2023

Abstract

Human non-speech sounds occur during expressions in a real-life environment. Realising a person's incapability to prompt confident expressions by non-speech sounds may assist in identifying premature disorder in medical applications. A novel dataset named Nonspeech7k is introduced that contains a diverse set of human non-speech sounds, such as the sounds of breathing, coughing, crying, laughing, screaming, sneezing, and yawning. The authors then conduct a variety of classification experiments with end-to-end deep convolutional neural networks (CNN) to show the performance of the dataset. First, a set of typical deep classifiers are used to verify the reliability and validity of Nonspeech7k. Involved CNN models include 1D-2D deep CNN EnvNet, deep stack CNN M11, deep stack CNN M18, intense residual block CNN ResNet34, modified M11 named M12, and the authors’ baseline model. Among these, M12 achieves the highest accuracy of 79%. Second, to verify the heterogeneity of Nonspeech7k with respect to two typical datasets, FSD50K and VocalSound, the authors design a series of experiments to analyse the classification performance of deep neural network classifier M12 by using FSD50K, FSD50K + Nonspeech7k, VocalSound, VocalSound + Nonspeech7k as training data, respectively. Experimental results show that the classifier trained with existing datasets mixed with Nonspeech7k achieves the highest accuracy improvement of 15.7% compared to that without Nonspeech7k mixed. Nonspeech7k is 100% annotated, completely checked, and free of noise.
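The second experiment in the abstract compares a classifier trained on an existing dataset alone with the same classifier trained on that dataset mixed with Nonspeech7k. A minimal sketch of that bookkeeping follows, assuming the caller supplies the accuracies. `accuracy_gains` is a hypothetical helper, the gain is computed as an absolute difference in percentage points (the abstract does not say how its 15.7% figure is defined), and the numbers in the usage example are placeholders, not results from the paper.

```python
from typing import Dict

BASE_DATASETS = ["FSD50K", "VocalSound"]  # dataset names taken from the abstract
MIXED_SUFFIX = " + Nonspeech7k"


def accuracy_gains(accuracy: Dict[str, float]) -> Dict[str, float]:
    """Per base dataset, return accuracy(mixed) - accuracy(base) in percentage points."""
    gains = {}
    for base in BASE_DATASETS:
        gains[base] = accuracy[base + MIXED_SUFFIX] - accuracy[base]
    return gains


# Placeholder accuracies (percent) for illustration only; NOT the paper's results.
example = {
    "FSD50K": 50.0,
    "FSD50K + Nonspeech7k": 60.0,
    "VocalSound": 70.0,
    "VocalSound + Nonspeech7k": 75.0,
}
print(accuracy_gains(example))  # {'FSD50K': 10.0, 'VocalSound': 5.0}
```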

Connect with me:

https://www.linkedin.com/in/mamunur-rashid-b5978153/

Skills (Languages and Tools):

AWS, Blender, C, C++, Docker, Illustrator, Linux, MATLAB, OpenCV, pandas, Photoshop, Python, PyTorch, scikit-learn, Seaborn, TensorFlow

Muhammad Mamunur Rashid's Projects

bae-net

The code for paper "BAE-NET: Branched Autoencoder for Shape Co-Segmentation".

data-augmentation-review

A list of useful data augmentation resources, including less common techniques, libraries, and links to GitHub repos, papers, and other material.

ftcn

[Official] Exploring Temporal Coherence for More General Video Face Forgery Detection (ICCV 2021)

handson-ml2

A series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in Python using Scikit-Learn, Keras and TensorFlow 2.
