
bbaug's Introduction


BBAug

BBAug is a Python package implementing the Google Brain team's bounding box augmentation policies. The package is aimed at PyTorch users who wish to use these policies when augmenting bounding boxes during model training. Currently, all 4 versions of the policies are implemented. This package builds on top of the excellent image augmentation package imgaug.


Features

  • Implementation of all 4 policies
  • Custom policies
  • Custom augmentations
  • Bounding boxes are removed if they fall outside of the image*
  • Bounding boxes are clipped if they are partially outside the image*
  • For augmentations that imply a direction, e.g. rotation, the direction is randomly determined

*Does not apply to bounding box specific augmentations

To Do

  • Implementation of version 2 of policies (implemented in v0.2)
  • Implementation of version 1 of policies (implemented in v0.2)
  • For bounding box augmentations, apply the probability individually to each box rather than collectively (implemented in v0.4)

Installation

Installation is best done via pip:

pip install bbaug

Prerequisites

  • Python 3.6+
  • PyTorch
  • Torchvision

Description and Usage

For a detailed description of usage, please refer to the Python notebooks provided in the notebooks folder.

An augmentation is defined by 3 attributes:

  • Name: Name of the augmentation
  • Probability: Probability of augmentation being applied
  • Magnitude: The degree of the augmentation (values are integers between 0 and 10)

A sub-policy is a collection of augmentations: e.g.

sub_policy = [('translation', 0.5, 1), ('rotation', 1.0, 9)]

In the above example we have two augmentations in a sub-policy. The translation augmentation has a probability of 0.5 and a magnitude of 1, whereas the rotation augmentation has a probability of 1.0 and a magnitude of 9. The magnitudes do not translate directly into the augmentation values, i.e. a magnitude of 9 does not mean a 9 degree rotation. Instead, a scaling is applied to the magnitude to determine the value passed to the augmentation method. The scaling varies depending on the augmentation used.

A policy is a set of sub-policies:

policies = [
    [('translation', 0.5, 1), ('rotation', 1.0, 9)],
    [('colour', 0.5, 1), ('cutout', 1.0, 9)],
    [('rotation', 0.5, 1), ('solarize', 1.0, 9)]
]

During training, a random sub-policy is selected from the policy and applied to the image, and because each augmentation has its own probability this adds a degree of stochasticity to training.

Augmentations

Each augmentation is referred to by a name string. The augmentations module contains a dictionary mapping each name to a reference of the corresponding augmentation method.

from bbaug.augmentations import NAME_TO_AUGMENTATION
print(NAME_TO_AUGMENTATION) # Shows the dictionary of the augmentation name to the method reference

Some augmentations are applied only to the bounding boxes. Augmentations which have the suffix BBox are only applied to the bounding boxes in the image.
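
As a quick check of which augmentations these are, the mapping can be filtered by that suffix (a minimal sketch, assuming the dictionary keys carry the BBox suffix literally):

from bbaug.augmentations import NAME_TO_AUGMENTATION

# Assumption: bounding-box-only augmentations end with the 'BBox' suffix
bbox_only_augmentations = [name for name in NAME_TO_AUGMENTATION if name.endswith('BBox')]
print(bbox_only_augmentations)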

Listing All Policies Available

To obtain a list of all available policies, run the list_policies method. This returns a list of strings containing the function names of the policy sets.

from bbaug.policies import list_policies
print(list_policies()) # List of policies available

Listing the policies in a policy set

from bbaug.policies import policies_v3
print(policies_v3()) # Will list all the policies in version 3

Visualising a Policy

To visualise a policy on a single image, a visualise_policy method is available in the visuals module.

from bbaug.visuals import visualise_policy
visualise_policy(
    'path/to/image',
    'save/dir/of/augmentations',
    bounding_boxes, # Bounding boxes is a list of lists of bounding boxes in pixels (int): e.g. [[x_min, y_min, x_max, y_max], [x_min, y_min, x_max, y_max]]
    labels, # Class labels for the bounding boxes as an iterable of ints eg. [0, 5]
    policy, # the policy to visualise
    name_to_augmentation, # (optional, default: augmentations.NAME_TO_AUGMENTATION) The dictionary mapping the augmentation name to the augmentation method
)

Policy Container

To help integrate the policies into training, a PolicyContainer class is available in the policies module. The container accepts the following inputs:

  • policy_set (required): The policy set to use
  • name_to_augmentation (optional, default: augmentations.NAME_TO_AUGMENTATION): The dictionary mapping the augmentation name to the augmentation method
  • return_yolo (optional, default: False): Return the bounding boxes in YOLO format, otherwise [x_min, y_min, x_max, y_max] in pixels is returned (see the sketch below)
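
For example, to get YOLO-format boxes back, the container might be instantiated as follows (a minimal sketch; only the policy_set and return_yolo inputs from the list above are used):

from bbaug import policies

# Assumption: with return_yolo=True, apply_augmentation returns YOLO-format
# boxes instead of pixel [x_min, y_min, x_max, y_max] values
yolo_container = policies.PolicyContainer(
    policies.policies_v3(),
    return_yolo=True,
)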

Usage of the policy container:

from bbaug import policies

# select policy v3 set
aug_policy = policies.policies_v3()
 
# instantiate the policy container with the selected policy set
policy_container = policies.PolicyContainer(aug_policy)

# select a random policy from the policy set
random_policy = policy_container.select_random_policy() 

# Apply the augmentation. Returns the augmented image and bounding boxes.
# Image is a numpy array of the image
# Bounding boxes is a list of lists of bounding boxes in pixels (int),
# e.g. [[x_min, y_min, x_max, y_max], [x_min, y_min, x_max, y_max]]
# Labels are the class labels for the bounding boxes as an iterable of ints e.g. [1,0]
img_aug, bbs_aug = policy_container.apply_augmentation(random_policy, image, bounding_boxes, labels)
# img_aug: numpy array of the augmented image
# bbs_aug: numpy array of augmented bounding boxes in the format: [[label, x_min, y_min, x_max, y_max], ...]
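
A minimal sketch of how the container could be wired into a PyTorch Dataset is shown below. The AugmentedDetectionDataset class and its data handling are hypothetical; only the select_random_policy and apply_augmentation calls follow the API described above:

import numpy as np
from torch.utils.data import Dataset

class AugmentedDetectionDataset(Dataset):  # hypothetical example class
    def __init__(self, images, boxes, labels, policy_container):
        # images: list of numpy arrays
        # boxes: list of per-image [[x_min, y_min, x_max, y_max], ...] lists (pixels)
        # labels: list of per-image class label lists
        # policy_container: an instantiated bbaug PolicyContainer
        self.images = images
        self.boxes = boxes
        self.labels = labels
        self.policy_container = policy_container

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        image = self.images[idx]
        boxes = self.boxes[idx]
        labels = self.labels[idx]

        # Pick a random sub-policy and apply it to this image and its boxes
        random_policy = self.policy_container.select_random_policy()
        img_aug, bbs_aug = self.policy_container.apply_augmentation(
            random_policy, image, boxes, labels
        )

        # bbs_aug rows are [label, x_min, y_min, x_max, y_max]; fall back to the
        # original sample if every box was removed by the augmentation
        bbs_aug = np.array(bbs_aug)
        if bbs_aug.size == 0:
            return image, np.array(boxes), np.array(labels)
        return img_aug, bbs_aug[:, 1:], bbs_aug[:, 0]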

Policy Implementation

The policies implemented in bbaug are shown below. Each column represents a different run of the given sub-policy; because each augmentation in the sub-policy has its own probability, the results vary between runs.

Version 0

These are the policies used in the paper.

[Example augmentation images for the Version 0 policies]

Version 1

[Example augmentation images for the Version 1 policies]

Version 2

[Example augmentation images for the Version 2 policies]

Version 3

[Example augmentation images for the Version 3 policies]

bbaug's People

Contributors

harpalsahota, sadransh


bbaug's Issues

Ensure Repeatability

Is there a way to set the seed so that I can recreate how the policies were selected? I would like to have some type of record of augmentations applied so I can verify my training steps in the future when I have more data.
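
A possible workaround (a hedged sketch, assuming policy selection and the underlying imgaug augmenters draw from the standard Python, NumPy and imgaug random states) is to seed all of them before training and log each selected sub-policy:

import random
import numpy as np
import imgaug

# Assumption: bbaug's policy selection and imgaug's augmenters use these global random states
SEED = 42
random.seed(SEED)
np.random.seed(SEED)
imgaug.seed(SEED)

# The selected sub-policy can also be logged per sample as a training record
# (policy_container is an instantiated PolicyContainer, as in the Usage section above)
random_policy = policy_container.select_random_policy()
print(random_policy)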

TypeError: __init__() got multiple values for argument 'label'

Code reproduced:

import glob
import os

import cv2
import numpy as np
from PIL import Image


class ExampleDataset:
    def __init__(self, root, policy_container=None, is_train=True):
        self.root = root
        self.data_type = "train" if is_train else "val"
        self.policy_container = policy_container
        # main
        # self.imgs = list(sorted(glob.glob(f'{root}/images/{self.data_type}/*.jpg')))
        # self.boxes = list(sorted(glob.glob(f'{root}/labels/{self.data_type}/*.txt')))
        # test
        self.imgs = list(sorted(glob.glob(f'{root}/test_data/*.jpg')))
        self.boxes = list(sorted(glob.glob(f'{root}/test_data/*txt')))
        self.out_dir = f'{root}/test_data_out/'
        if not os.path.exists(self.out_dir):
            os.mkdir(self.out_dir)

    def __len__(self):
        return len(self.imgs)
        
    # def __getitem__(self, idx):
    #     img = np.array(Image.open(self.imgs[idx]))
    #     boxes_path = self.boxes[idx]
    #     height, width, _ = img.shape
    #     # For convenience I’ve hard coded the label and co-ordinates as label, x_min, y_min, x_max, y_max
    #     # for each bounding box in the image. For your own model you will need to load
    #     # in the coordinates and do the appropriate transformations.
    #     boxes = []
    #     labels = []
    #     with open(boxes_path, 'r') as in_box:
    #         for line in in_box:
    #             if line:
    #                 line = line.split()
    #                 xywh = list(map(int, map(float, line[1:])))
    #                 xyxy = self.convert_xyxy(xywh, width=width, height=height)
    #                 boxes.append(xyxy)
    #                 labels.append(int(line[0]))
        
    #     if self.policy_container:

    #         # Select a random sub-policy from the policy list
    #         random_policy = self.policy_container.select_random_policy()
    #         print(random_policy)

    #         # Apply this augmentation to the image, returns the augmented image and bounding boxes
    #         # The boxes must be at a pixel level. e.g. x_min, y_min, x_max, y_max with pixel values
    #         img_aug, bbs_aug = self.policy_container.apply_augmentation(
    #             random_policy,
    #             img,
    #             boxes,
    #             labels,
    #         )
    #         labels = np.array(labels)
    #         boxes = np.hstack((np.vstack(labels), np.array(boxes))) # Add the labels to the boxes
    #         bbs_aug= np.array(bbs_aug)
            
    #         # Only return the augmented image and bounded boxes if there are
    #         # boxes present after the image augmentation
    #         if bbs_aug.size > 0:
    #             return img, boxes, img_aug, bbs_aug
    #         else:
    #             return img, boxes, [], np.array([])
    #     return img, boxes

    def run(self, num_random=10):
        for idx in range(len(self.imgs)):
            img = np.array(Image.open(self.imgs[idx]))
            boxes_path = self.boxes[idx]
            height, width, _ = img.shape
            # For convenience I’ve hard coded the label and co-ordinates as label, x_min, y_min, x_max, y_max
            # for each bounding box in the image. For your own model you will need to load
            # in the coordinates and do the appropriate transformations.
            boxes = []
            labels = []
            with open(boxes_path, 'r') as in_box:
                for line in in_box:
                    if line:
                        line = line.split()
                        xywh = list(map(int, map(float, line[1:])))
                        xyxy = self.convert_xyxy(xywh, width=width, height=height)
                        boxes.append(xyxy)
                        labels.append(int(line[0]))
            
            if self.policy_container:
                # run $num_random times
                print("Processing: " + self.imgs[idx])
                i = 0
                for i in range(num_random):

                    # Select a random sub-policy from the policy list
                    random_policy = self.policy_container.select_random_policy()

                    # Apply this augmentation to the image, returns the augmented image and bounding boxes
                    # The boxes must be at a pixel level. e.g. x_min, y_min, x_max, y_max with pixel values
                    img_aug, bbs_aug = self.policy_container.apply_augmentation(
                        random_policy,
                        img,
                        boxes,
                        labels
                    )
                    labels = np.array(labels)
                    boxes = np.hstack((np.vstack(labels), np.array(boxes))) # Add the labels to the boxes
                    bbs_aug= np.array(bbs_aug)
                    
                    # Only return the augmented image and bounded boxes if there are
                    # boxes present after the image augmentation
                    if bbs_aug.size > 0:
                        print("Step: " + str(i))
                        print(random_policy)
                        # img, boxes, img_aug, bbs_aug
                        # to write
                        cv2.imwrite(str(i)+"_bbaug_"+self.imgs[idx], img_aug)
                        with open(str(i)+"_bbaug_"+self.boxes[idx], "w") as fw:
                            fw.writelines(bbs_aug)
                        i += 1
    
    def convert_xyxy(self, xywh, width, height):
        x, w = xywh[0] * width, xywh[2] * width
        y, h = xywh[1] * height, xywh[3] * height
        x1 = x - w / 2
        x2 = x + w / 2
        y1 = y - h / 2
        y2 = y + h / 2

        return list(map(int, [x1, x2, y1, y2]))
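
A likely cause (a hedged reading of the snippet above, not a confirmed diagnosis): inside the loop, boxes is overwritten with a label-prefixed array, so on the next iteration apply_augmentation receives five-value rows and BoundingBox(*bb, label=label) is given label twice. Keeping the original boxes untouched avoids this:

# Store the label-prefixed array under a new name so the original
# [x_min, y_min, x_max, y_max] boxes are reused on the next iteration
labels_arr = np.array(labels)
boxes_with_labels = np.hstack((np.vstack(labels_arr), np.array(boxes)))
bbs_aug = np.array(bbs_aug)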

Help: Generate annotation boxes for Augmented images

@harpalsahota thanks for this great repo.

I am working on an object detection project using the Detectron2 framework. I would like to try out the augmentations suggested by Google Brain. I was wondering if there is a way to generate annotation boxes for the augmented images in the format below.

Detectron2 expects labels in the following format:

{'file_name': 'train_images/85fbb8dffe30d2b1.jpg',    # image path
 'height': 768,                                        # resulting image height
 'width': 1024,                                        # resulting image width
 'annotations': [{'bbox': [2.0, 0.0, 1022.0, 766.0],   # annotation boxes
   'bbox_mode': 0,
   'category_id': 24}]}                                # class label

Could you help me with this?
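
One possible approach (a sketch, not an official bbaug or Detectron2 utility; file_name is assumed to be known for each image) is to convert the bbs_aug output, whose rows are [label, x_min, y_min, x_max, y_max], into the dictionary format above:

import numpy as np

def to_detectron2_record(file_name, img_aug, bbs_aug):
    # img_aug: augmented image as a numpy array of shape (height, width, channels)
    # bbs_aug: numpy array of rows [label, x_min, y_min, x_max, y_max]
    height, width = img_aug.shape[:2]
    annotations = []
    for label, x_min, y_min, x_max, y_max in np.array(bbs_aug):
        annotations.append({
            'bbox': [float(x_min), float(y_min), float(x_max), float(y_max)],
            'bbox_mode': 0,  # BoxMode.XYXY_ABS in Detectron2
            'category_id': int(label),
        })
    return {
        'file_name': file_name,
        'height': int(height),
        'width': int(width),
        'annotations': annotations,
    }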

Apply augmentations on entire batch

Looking at the documentation, it seems that I would apply the augmentation one image at a time. Is there a way to apply this to an entire batch?
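
There does not appear to be a documented batch API, so one workaround is a per-image loop (a minimal sketch; batch_images, batch_boxes and batch_labels are hypothetical iterables of per-image data, and policy_container is an instantiated PolicyContainer as in the Usage section above):

augmented_batch = []
for image, boxes, labels in zip(batch_images, batch_boxes, batch_labels):
    # A different random sub-policy is drawn for every image in the batch
    random_policy = policy_container.select_random_policy()
    img_aug, bbs_aug = policy_container.apply_augmentation(
        random_policy, image, boxes, labels
    )
    augmented_batch.append((img_aug, bbs_aug))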

integration with yolov5

I am new to computer vision and I am seeking to integrate the policies with yolov5. Any help will be appreciated.

Is it possible to apply to an image with a single bounding box?

Hello,

Thanks for the cool project.

My images/ground truth have a single bounding box instead of two bounding boxes. Is it possible to modify the code so that the augmentation works for images with a single bounding box? When I applied your code, I got the following error.


TypeError Traceback (most recent call last)
in
17 # e.g. [[x_min, y_min, x_max, y_max], [x_min, y_min, x_max, y_max]]
18 # Labels are the class labels for the bounding boxes as an iterable of ints e.g. [1,0]
---> 19 img_aug, bbs_aug = policy_container.apply_augmentation(random_policy, image, bounding_boxes, labels)
20 # image_aug: numpy array of the augmented image
21 # bbs_aug: numpy array of augmneted bounding boxes in format: [[label, x_min, y_min, x_man, y_max],...]

/opt/conda/lib/python3.6/site-packages/bbaug/policies/policies.py in apply_augmentation(self, policy, image, bounding_boxes, labels)
480 [
481 BoundingBox(*bb, label=label)
--> 482 for bb, label in zip(bounding_boxes, labels)
483 ],
484 image.shape

/opt/conda/lib/python3.6/site-packages/bbaug/policies/policies.py in <listcomp>(.0)
480 [
481 BoundingBox(*bb, label=label)
--> 482 for bb, label in zip(bounding_boxes, labels)
483 ],
484 image.shape

TypeError: init() missing 1 required positional argument: 'y2'
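
A likely cause of this error is the box format: apply_augmentation expects a list of boxes, each with four pixel values, and a single box must still be wrapped in an outer list (a minimal sketch with placeholder coordinates):

# A single box must still be a list of lists
bounding_boxes = [[10, 20, 150, 200]]  # [[x_min, y_min, x_max, y_max]]
labels = [0]                           # one class label per box
img_aug, bbs_aug = policy_container.apply_augmentation(
    random_policy, image, bounding_boxes, labels
)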

GPU augmentations

Cool project.

Do any of the policies run the augmentation process on the GPU? Maybe something like NVIDIA DALI? (Obviously not exactly...)
