xhuvom / faceid

An implementation of YOLO v2 for direct facial recognition within the detection layer.

License: Other

Makefile 0.25% Python 0.64% Shell 0.12% Cuda 8.45% C 90.18% C++ 0.31% Objective-C 0.05%

faceid's Introduction

Darknet

Darknet is an open source neural network framework written in C and CUDA. It is fast, easy to install, and supports CPU and GPU computation.

For more information see the Darknet project website.

For questions or issues please use the Google Group.

YOLO Facial Recognition on the Darknet Framework

Detecting and recognizing faces is a three-step process with automatic annotation

Fork on GitHub: https://github.com/xhuvom/darknetFaceID

This is a YOLO Darknet implementation that detects, recognizes, and tracks multiple faces. It can detect and recognize individual faces simply by training each person's face as a separate class; the algorithm learns the facial features itself. I have tested it with 3 different faces, trained on ~2k individual images per class. After about 60k epochs, the algorithm works pretty well with acceptable accuracy. See a demo video ( https://www.youtube.com/watch?v=UsOi1BfunnU ).

Annotating a large number of images by hand is time-consuming and inefficient for practical prototyping. That's why I have used the fork https://github.com/quanhua92/darknet/ to detect faces in webcam images and annotate any number of images automatically.

Basically, it's a simple three-step process:

Capture
Train
Deploy

Part 1: Capture >>

[i] To detect faces from a live camera feed and annotate them automatically, use the .cfg and .weights files from QuanHua (https://mega.nz/#F!GRV1XKbJ!v8BCsFO8iJVNppiGXY4qMw).

[ii] Only the following lines need to be added to the src/image.c file of this fork: (line #223) to save .jpg images and (line #227) to save annotations in separate folders for each class (also change the class number on line #229).

[iii] After the modifications, run the detector on a live webcam feed or a video file that shows only one particular person's face.

[iv] Repeat the process for every person you want to recognize, modifying the training data location and class number accordingly. About 2k face images per person are enough to recognize individual faces; more data can be added to improve accuracy.

Part 2: Train >>

After capturing each person's face images and annotations in separate training folders, some data preprocessing is required before training.

Image conversion: Convert the jpg images to JPEG for the Darknet framework by running the command [ $ mogrify -format JPEG *jpg ] in your image data directory.

Label conversion: Convert the annotations to VOC data format with the scripts/convert.py script provided in the scripts folder. This generates a training image list file in the same folder for each class. Combine all of those training list files into one file and point to that file in cfg/face.data.
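The label conversion can be sketched as follows. This is a minimal illustration and not the contents of scripts/convert.py: it assumes each captured annotation is a pixel-space box given as (x_min, y_min, x_max, y_max) and converts it to the normalized `class x_center y_center width height` line that Darknet reads from its label files. The function name and the example numbers are hypothetical.

```python
def to_darknet_label(class_id, box, img_w, img_h):
    """Convert a pixel-space box (x_min, y_min, x_max, y_max) to a Darknet label line.

    Darknet expects: <class> <x_center> <y_center> <width> <height>,
    with the four box values normalized by the image width and height.
    """
    x_min, y_min, x_max, y_max = box
    x_center = (x_min + x_max) / 2.0 / img_w
    y_center = (y_min + y_max) / 2.0 / img_h
    width = (x_max - x_min) / float(img_w)
    height = (y_max - y_min) / float(img_h)
    return "%d %.6f %.6f %.6f %.6f" % (class_id, x_center, y_center, width, height)


if __name__ == "__main__":
    # Hypothetical example: one face box in a 640x480 frame, labeled as class 0.
    print(to_darknet_label(0, (160, 120, 480, 360), 640, 480))
```

Each image would get one such line per face in a text file with the same base name as the image; where those files live depends on how your training list paths are set up.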

After preprocessing, modify the class numbers accordingly and create the data/face.names and cfg/face.data files with your desired labels and directories.
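As a rough sketch of this step, the per-class list files can be merged and the two configuration files written with a few lines of Python. The helper names and any paths passed to them are hypothetical; the `classes`, `train`, `names`, and `backup` keys are the ones Darknet reads from a .data file (a `valid` entry can be added the same way if you keep a validation list).

```python
def merge_train_lists(list_paths, merged_path):
    """Concatenate per-class image list files into one training list.

    Returns the number of image paths written.
    """
    count = 0
    with open(merged_path, "w") as out:
        for path in list_paths:
            with open(path) as src:
                for line in src:
                    line = line.strip()
                    if line:  # skip blank lines
                        out.write(line + "\n")
                        count += 1
    return count


def write_face_config(names, train_list, names_path, data_path, backup_dir):
    """Write the .names file (one label per line) and the .data file."""
    with open(names_path, "w") as f:
        f.write("\n".join(names) + "\n")
    with open(data_path, "w") as f:
        f.write("classes = %d\n" % len(names))
        f.write("train = %s\n" % train_list)
        f.write("names = %s\n" % names_path)
        f.write("backup = %s\n" % backup_dir)
```

For example, `merge_train_lists(["person_a/list.txt", "person_b/list.txt"], "train.txt")` followed by `write_face_config(["person_a", "person_b"], "train.txt", "data/face.names", "cfg/face.data", "backup")` would produce a two-class setup; all of these paths are placeholders for your own directories.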

Configure the src/yolo.c file and yolo_kernels, setting the "CLASS_NUM" parameter to your number of classes. Comment out lines #223 and #227 in the "src/image.c" file so that the captured image dataset is not overwritten.

Now prepare a cfg file for training: modify line #224 with the filters and class numbers according to the equation [filters = (classes + coords + 1) * num]. For example, you can modify the "yolo_face.cfg" file according to your parameters.
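The equation can be checked with a small helper. This is only an illustrative sketch: the function name is hypothetical, and the defaults `coords=4` and `num=5` are taken from the [region] section of the face.cfg quoted in the issues further down, where a single class gives filters=30.

```python
def region_filters(classes, coords=4, num=5):
    """Filters for the last convolutional layer before [region].

    Implements filters = (classes + coords + 1) * num from the text above;
    the +1 term is the per-box objectness score.
    """
    return (classes + coords + 1) * num


if __name__ == "__main__":
    for n in (1, 3):
        print("classes=%d -> filters=%d" % (n, region_filters(n)))
```

With these defaults, one class gives 30 filters and three classes give 40, so both `filters` in the last [convolutional] block and `classes` in [region] need to change whenever a person is added.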

Now start training on a GPU with a pretrained ImageNet model (download from https://pjreddie.com/media/files/darknet19_448.conv.23 ) by running the command "./darknet detector train cfg/face.data cfg/face.cfg darknet19_448.conv.23". Checkpoint files will be saved in the "backup" directory specified in the "face.data" file.

Part 3: Deploy >>

After about 120k training epochs, the trained weight files should detect and recognize individual faces with acceptable accuracy when you run the command "./darknet detector demo cfg/face.data cfg/face.cfg your_weight_file.weights".

The same process can be used to recognize facial expressions (demo https://www.youtube.com/watch?v=GMy0Zs8LX-o). The only thing I have added here is the automatic annotation of face images, a task that is quite cumbersome if done by hand.


faceid's Issues

Where is the scripts/convert.py file?

Hi! I'm writing because the second paragraph of 'Part 2: Train' says that we must convert annotations to VOC data format with scripts/convert.py. Unfortunately, this script is not in that folder. Where can I find it?

is it possible to extract features from images?

How to recognize facial expressions?

Incrementally adding new faces

Hi,

I've implemented a solution with yolo3 borrowing some things from your contribution.
I can detect various faces, but training is proving a bit slow for my requirements.
Would it be possible to incrementally add new faces? Meaning, just train the new face on top of the previous weights, so as not to train the model for all classes from scratch every time a new face appears.

Any input on this would be appreciated.

Could you tell me how to run the detector?

I see "[iii] After modifications, run the detector ...", but I don't know how to run the detector. Could you tell me how?

undefined reference to `save_image_jpg'

I seem to get this error when I try to make...

Can't see why... No one else had this?

obj/image.o: In function `draw_detections':
image.c:(.text+0x1d181): undefined reference to `save_image_jpg'

thanks

terminate called after throwing an instance of 'cv::Exception'

Hi! I'm sorry to ask such a repeated question.
To capture my face, I run:

./darknet yolo demo cfg/yolo-face.cfg yolo-face_final.weights "nvcamerasrc ! video/x-raw(memory:NVMM), width=(int)1280, height=(int)720,format=(string)I420, framerate=(fraction)30/1 ! nvvidconv flip-method=2 ! video/x-raw, format=(string)BGRx ! videoconvert ! video/x-raw, format=(string)BGR ! appsink"

but it fails with an error...

=================================================================================

FPS:0.0
Objects:

face: 31%
OpenCV Error: Unspecified error (could not find a writer for the specified extension) in imwrite_, file /home/nvidia/opencv-3.2.0/modules/imgcodecs/src/loadsave.cpp, line 531
terminate called after throwing an instance of 'cv::Exception'
what(): /home/nvidia/opencv-3.2.0/modules/imgcodecs/src/loadsave.cpp:531: error: (-2) could not find a writer for the specified extension in function imwrite_

Aborted

=================================================================================
I'm using an NVIDIA Jetson TX1, and I reinstalled OpenCV 3.2.0.
We have been trying for 2 weeks. Please help me.

error: too few arguments to function ‘cudnnSetConvolution2dDescriptor’

Hi xhuvom, great job posting this; I have been searching for something like it for a week.
However, when I run "make" I get this error:
error: too few arguments to function ‘cudnnSetConvolution2dDescriptor’
cudnnSetConvolution2dDescriptor(l->convDesc, l->pad, l->pad, l->stride, l->stride, 1, 1, CUDNN_CROSS_CORRELATION);
Please help me, thanks!

Every person is detected with a trained person's ID

I have trained with two classes. During detection, the bounding boxes are drawn correctly for the two trained persons in the video, but every other person is detected as one of the trained classes.

How to train

I have followed your instructions up to the training part; however, whenever I execute the train command I get the following results.

mark@mark-G11CD:/media/mark/Data_Application/darknetFaceID$ ./darknet detector train cfg/face.data cfg/face.cfg darknet19_448.conv.23
face
layer filters size input output
0 conv 32 3 x 3 / 1 416 x 416 x 3 -> 416 x 416 x 32
1 max 2 x 2 / 2 416 x 416 x 32 -> 208 x 208 x 32
2 conv 64 3 x 3 / 1 208 x 208 x 32 -> 208 x 208 x 64
3 max 2 x 2 / 2 208 x 208 x 64 -> 104 x 104 x 64
4 conv 128 3 x 3 / 1 104 x 104 x 64 -> 104 x 104 x 128
5 conv 64 1 x 1 / 1 104 x 104 x 128 -> 104 x 104 x 64
6 conv 128 3 x 3 / 1 104 x 104 x 64 -> 104 x 104 x 128
7 max 2 x 2 / 2 104 x 104 x 128 -> 52 x 52 x 128
8 conv 256 3 x 3 / 1 52 x 52 x 128 -> 52 x 52 x 256
9 conv 128 1 x 1 / 1 52 x 52 x 256 -> 52 x 52 x 128
10 conv 256 3 x 3 / 1 52 x 52 x 128 -> 52 x 52 x 256
11 max 2 x 2 / 2 52 x 52 x 256 -> 26 x 26 x 256
12 conv 512 3 x 3 / 1 26 x 26 x 256 -> 26 x 26 x 512
13 conv 256 1 x 1 / 1 26 x 26 x 512 -> 26 x 26 x 256
14 conv 512 3 x 3 / 1 26 x 26 x 256 -> 26 x 26 x 512
15 conv 256 1 x 1 / 1 26 x 26 x 512 -> 26 x 26 x 256
16 conv 512 3 x 3 / 1 26 x 26 x 256 -> 26 x 26 x 512
17 max 2 x 2 / 2 26 x 26 x 512 -> 13 x 13 x 512
18 conv 1024 3 x 3 / 1 13 x 13 x 512 -> 13 x 13 x1024
19 conv 512 1 x 1 / 1 13 x 13 x1024 -> 13 x 13 x 512
20 conv 1024 3 x 3 / 1 13 x 13 x 512 -> 13 x 13 x1024
21 conv 512 1 x 1 / 1 13 x 13 x1024 -> 13 x 13 x 512
22 conv 1024 3 x 3 / 1 13 x 13 x 512 -> 13 x 13 x1024
23 conv 1024 3 x 3 / 1 13 x 13 x1024 -> 13 x 13 x1024
24 conv 1024 3 x 3 / 1 13 x 13 x1024 -> 13 x 13 x1024
25 route 16
26 reorg / 2 26 x 26 x 512 -> 13 x 13 x2048
27 route 26 24
28 conv 1024 3 x 3 / 1 13 x 13 x3072 -> 13 x 13 x1024
29 conv 30 1 x 1 / 1 13 x 13 x1024 -> 13 x 13 x 30
30 detection
Loading weights from darknet19_448.conv.23...Done!
Learning Rate: 0.001, Momentum: 0.9, Decay: 0.0005
Loaded: 0.030896 seconds
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.438415, Avg Recall: -nan, count: 0
1: 9.846994, 9.846994 avg, 0.000100 rate, 0.156046 seconds, 1 images
Loaded: 0.000030 seconds
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.388885, Avg Recall: -nan, count: 0
2: 7.840993, 9.646395 avg, 0.000100 rate, 0.101668 seconds, 2 images
Loaded: 0.000020 seconds
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.304338, Avg Recall: -nan, count: 0
3: 4.401259, 9.121881 avg, 0.000100 rate, 0.101112 seconds, 3 images
Loaded: 0.000020 seconds
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.214405, Avg Recall: -nan, count: 0
4: 1.664008, 8.376094 avg, 0.000100 rate, 0.090299 seconds, 4 images
Loaded: 0.000022 seconds
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.146090, Avg Recall: -nan, count: 0
5: 0.564317, 7.594916 avg, 0.000100 rate, 0.095788 seconds, 5 images
Loaded: 0.000020 seconds
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.091963, Avg Recall: -nan, count: 0
6: 0.164433, 6.851868 avg, 0.000100 rate, 0.094488 seconds, 6 images
Loaded: 0.000020 seconds
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.062839, Avg Recall: -nan, count: 0
7: 0.074724, 6.174154 avg, 0.000100 rate, 0.089121 seconds, 7 images
Loaded: 0.000020 seconds
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.041137, Avg Recall: -nan, count: 0
8: 0.054264, 5.562165 avg, 0.000100 rate, 0.094170 seconds, 8 images
Loaded: 0.000022 seconds

I only have 1 class, which is me, and below is the face.cfg file I used.

`[net]
batch=1
subdivisions=1
width=416
height=416
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.001
max_batches = 120000
policy=steps
steps=-1,100,80000,100000
scales=.1,10,.1,.1

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

#######

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[route]
layers=-9

[reorg]
stride=2

[route]
layers=-1,-3

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=30
activation=linear

[region]
anchors = 0.738768,0.874946, 2.42204,2.65704, 4.30971,7.04493, 10.246,4.59428, 12.6868,11.8741
bias_match=1
classes=1
coords=4
num=5
softmax=1
jitter=.2
rescore=1

object_scale=5
noobject_scale=1
class_scale=1
coord_scale=1

absolute=1
thresh = .6
random=0`

Trying to process an mp4 file...

I'm at the capture stage. I get this error when trying to process a video file.
I don't get this error with regular Darknet, so I wonder if it has to do with the jpg-saving element?

FPS:0.0
Objects:

aeroplane: 56%
OpenCV Error: Unspecified error (could not find a writer for the specified extension) in imwrite_, file /build/opencv-2TNgni/opencv-3.1.0+dfsg1/modules/imgcodecs/src/loadsave.cpp, line 459
terminate called after throwing an instance of 'cv::Exception'
what(): /build/opencv-2TNgni/opencv-3.1.0+dfsg1/modules/imgcodecs/src/loadsave.cpp:459: error: (-2) could not find a writer for the specified extension in function imwrite_

[1] 2908 abort (core dumped) ./darknet detector demo cfg/coco.data fddb-face/yolo-face.cfg me.mp4

how to make this

obj/image.o: In function `draw_detections':
image.c:(.text+0x1ba20): undefined reference to `save_image_jpg'
collect2: error: ld returned 1 exit status
Makefile:63: recipe for target 'darknet' failed
make: *** [darknet] Error 1

===============================================================

I'm using an NVIDIA Jetson TX1, and I installed OpenCV 3.4.1.

I downloaded darknetFaceID and only modified the image.c file.

I also changed the Makefile to OPENCV=0.

We are now stuck on this issue and have been trying for 5 days. Please help.

Thank you for helping.

How to decide for unknown people?

Hi,

I understand that we need a dataset for known people, but how do we decide that a face is unknown when the network has been trained for n classes (n = number of known people)?
