face-crop-plus's Issues

Show a graceful warning when encountering a non-image file in the input folder

If the script encounters a non-image file, it emits a warning and then crashes with the following traceback:

Processing:  50% 1/2 [00:03<00:03,  3.72s/it]/usr/local/lib/python3.10/dist-packages/face_crop_plus/utils.py:264: UserWarning: Could not read the image input/.ipynb_checkpoints
  warnings.warn(f"Could not read the image {path}")
Processing:  50% 1/2 [00:03<00:03,  3.72s/it]
Traceback (most recent call last):
  File "/usr/local/bin/face-crop-plus", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/face_crop_plus/__main__.py", line 278, in main
    cropper.process_dir(input_dir, output_dir)
  File "/usr/local/lib/python3.10/dist-packages/face_crop_plus/cropper.py", line 909, in process_dir
    list(imap)
  File "/usr/local/lib/python3.10/dist-packages/tqdm/std.py", line 1182, in __iter__
    for obj in iterable:
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 873, in next
    raise value
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/usr/local/lib/python3.10/dist-packages/face_crop_plus/cropper.py", line 817, in process_batch
    images, _, paddings = as_batch(images, self.resize_size)
  File "/usr/local/lib/python3.10/dist-packages/face_crop_plus/utils.py", line 342, in as_batch
    return np.stack(img_batch), np.stack(unscales), np.stack(paddings)
  File "<__array_function__ internals>", line 180, in stack
  File "/usr/local/lib/python3.10/dist-packages/numpy/core/shape_base.py", line 422, in stack
    raise ValueError('need at least one array to stack')
ValueError: need at least one array to stack

The first time I saw it, I thought that no image in the input folder had been cropped.

EDIT: I know one should only put images in the input folder, but sometimes hidden files get created there, as happens when running the script from Google Colab (as seen in the example above).
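One way to sidestep the crash on the caller's side is to filter the input folder before handing it to the cropper. The sketch below is not part of face-crop-plus: `list_image_files` and `IMAGE_EXTENSIONS` are hypothetical names, and the extension set is an assumption that should be adjusted to whatever formats the image reader actually supports.

```python
from pathlib import Path

# Assumed set of image extensions; not taken from face-crop-plus itself.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff", ".webp"}


def list_image_files(input_dir):
    """Return sorted paths of regular, non-hidden files with an image extension."""
    paths = []
    for path in sorted(Path(input_dir).iterdir()):
        if path.name.startswith("."):
            continue  # hidden entries such as .ipynb_checkpoints
        if not path.is_file():
            continue  # directories and other non-regular entries
        if path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue  # e.g. notes.txt
        paths.append(path)
    return paths
```

The filtered files could then be copied or symlinked into a clean folder that is passed as the input directory.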

Future Ideas

Some ideas for new features:

  • Allow variable face factor
  • Allow processing a single file with an option to show the processed image (only require matplotlib or ipykernel display if this feature is used)
  • Add on_file fn to allow filtering specific images
  • Allow passing a directory of landmarks and add on_landmarks_file_fn for custom loading
  • Make it more convenient to process files when their full paths are known
  • Add create_splits utility that splits the images based on val_frac and test_frac or reads the file that specifies the splits or multiple files (one for each split) or just accepts 1-3 sets with file names
  • Also add a feature to pass a directory of directories or a list of directories, e.g., we may want to process images in train/val/test subfolders
  • Create Base Dir/Image Processor and make Cropper extend it with its cropping functionalities
  • Try to center the image based on the center of mass of the landmarks instead of the nose (which is what it seems to be doing now?). For instance, when people look sideways, their noses are typically in the middle of the image, but the rest of the face is biased toward either the left or the right side

For example: make it more convenient to process datasets where all the files are inside a single folder: Face Synthetics
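As a rough illustration of the create_splits idea from the list above, here is a minimal sketch that splits a list of file names by val_frac and test_frac. The signature, the seed parameter, and the returned dictionary layout are assumptions made for illustration; the issue does not specify them.

```python
import random


def create_splits(filenames, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle file names deterministically and split them into train/val/test."""
    if val_frac < 0 or test_frac < 0 or val_frac + test_frac > 1:
        raise ValueError("val_frac and test_frac must be >= 0 and sum to at most 1")

    names = sorted(filenames)  # sort first so the result does not depend on input order
    random.Random(seed).shuffle(names)

    n_val = int(len(names) * val_frac)
    n_test = int(len(names) * test_frac)
    return {
        "val": names[:n_val],
        "test": names[n_val:n_val + n_test],
        "train": names[n_val + n_test:],
    }
```

Reading the splits from one or more files, as the idea also mentions, would be a separate code path that bypasses the shuffling.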

Some bugs and enhancement possibilities

To-Do List

Overview

Just a general to-do list and notes for the future update from v1.0.3.

Bugs

  • A warning is thrown if there are no landmarks for a specific filename when the landmarks are provided in advance
  • An error is thrown when paddings is accessed while landmarks are provided, i.e., the initialization paddings = None is missing
  • The standard landmarks change when Cropper is instantiated a second time in a row, i.e., STANDARD_LANDMARKS_5 is modified in _init_landmarks_target
  • Fix typos in documentation

Behavior

  • Change default padding to constant (to avoid heads coming out of heads)
  • Change the default enh_threshold to None (to avoid wasting time on unnecessary enhancement)

Enhancements

  • Custom description argument in process_dir
  • Utility to rename files to ASCII and shorten filenames (in-place and copy over)
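The renaming utility from the list above could look roughly like the following sketch, which covers only the in-place variant. Both `ascii_name` and `rename_to_ascii` are hypothetical names, and the replacement rules (drop non-ASCII characters, collapse everything except letters, digits, dots and hyphens into a single underscore, cap the stem at 64 characters) are assumptions rather than anything the issue specifies.

```python
import re
import unicodedata
from pathlib import Path


def ascii_name(filename, max_stem_len=64):
    """Return an ASCII-only, shortened version of a file name (extension kept)."""
    path = Path(filename)
    # Decompose accented characters, then drop whatever is still non-ASCII.
    stem = unicodedata.normalize("NFKD", path.stem)
    stem = stem.encode("ascii", "ignore").decode("ascii")
    # Collapse runs of spaces, underscores and other symbols into one underscore.
    stem = re.sub(r"[^A-Za-z0-9.-]+", "_", stem).strip("_")
    stem = stem[:max_stem_len] or "file"
    return stem + path.suffix


def rename_to_ascii(directory):
    """Rename files in place; return {old_name: new_name} for the changed files."""
    renamed = {}
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        new_name = ascii_name(path.name)
        if new_name == path.name:
            continue
        target = path.with_name(new_name)
        counter = 1
        while target.exists():  # avoid overwriting an existing file
            candidate = Path(new_name)
            target = path.with_name(f"{candidate.stem}_{counter}{candidate.suffix}")
            counter += 1
        path.rename(target)
        renamed[path.name] = target.name
    return renamed
```

A copy-over variant would follow the same loop with `shutil.copy2` into a destination folder instead of `path.rename`.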

Make face alignment optional

It would be nice to have a way to turn off face alignment to avoid using padding. This would be especially useful for training models where the position of the face/body is not important.

I get this error 50% of the way through a batch.

image = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2RGB)
cv2.error: OpenCV(4.7.0) D:\a\opencv-python\opencv-python\opencv\modules\imgproc\src\color.cpp:182: error: (-215:Assertion failed) !_src.empty() in function 'cv::cvtColor'

CLI arguments (Windows):

face-crop-plus -i I:\AI\imgtmp_1024x768\sharpened\AI\1024_1024x768 --output-size 512 512 -d cuda:0 -b 5 --face-factor 0.75 -o I:\AI\imgtmp_1024x768\sharpened\AI\1024_1024x768\faces

GPU: 4090, CUDA 12.1

I wonder whether it broke on a 0-byte file it tried to parse, or on a broken image?
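The `!_src.empty()` assertion is what `cv2.cvtColor` raises when it is given an empty input, and `cv2.imread` returns `None` instead of raising when it cannot read a file, so an unreadable file is a plausible cause. To check the 0-byte hypothesis without involving OpenCV at all, a small diagnostic along these lines could be run over the input folder. `find_empty_files` is a hypothetical helper, not part of face-crop-plus.

```python
from pathlib import Path


def find_empty_files(input_dir):
    """Return sorted paths of regular files in input_dir that contain zero bytes."""
    return [
        path
        for path in sorted(Path(input_dir).iterdir())
        if path.is_file() and path.stat().st_size == 0
    ]
```

This only detects empty files; a truncated or corrupted image with a non-zero size would need an actual decode attempt to be found.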

Not an issue but maybe some enhancements?

I've run this in WSL2 and in Windows proper, and I'm not even sure whether it is supposed to run on either, so this is just a thought. Here I'm talking about Windows 10 proper:

The _faces dir won't be written if the input dir contains files with certain garbage characters in their filenames, e.g. "―". I have quoted directories in Windows to no avail, e.g.

F:\face-crop-plus>face-crop-plus -i "M:_____Digital\junk 0602" -s 1024 -ff 0.25 -d cuda:0 --det-threshold 0.99 --padding constant

The command above will not generate any output, or will crash, depending on the character in the filename.

Here's the output of processing a directory with the above character in a filename; it bombs after finding these:

Processing: 45%|███████████████████████████████▌ | 14/31 [00:28<00:35, 2.07s/it][ WARN:[email protected]] global loadsave.cpp:244 cv::findDecoder imread_('C:\Users\ML-scrapes\Downloads\Faces-web\Faces _ΓÇò_752DB5E.jpg'): can't open/read file: check file path/integrity

I'm almost certain it's about escaping certain characters between Linux and Windows. Any sort of garbage filename in Windows will also break it. I could easily run dos2unix on the filenames in the pipeline; I just wanted to point out that behavior.

Also, may I recommend making the default output directory a subdirectory of the input dir instead of placing it at the top level? Am I being too picky yet? :)

Again, I'm not complaining; I just find this an incredibly useful tool for my ML work.
