MS COCO and YOLOX-Nano Experiment

Generative augmentation with Kopikat: object detection accuracy boost on 5K images or less

Overview

We created Kopikat, a tool for generative data augmentation. It helps improve neural network quality on datasets with fewer than 5,000 images, a typical case in real-life applications. To test it, we ran an experiment with YOLOX-Nano on a 5% subset of the MS COCO dataset. We got +1.06 mAP, a 14.6% relative improvement, out of the box and without any changes to the network architecture. We believe that Kopikat can help many projects achieve better results with limited datasets.

Generative Data Augmentation

Data augmentation is applied during neural network training to make the data more diverse and, thus, the resulting model more robust. Standard approaches flip, rotate, and crop the input images, change their contrast and brightness, add random noise, and make other changes that preserve the original data annotation while making the data more random.
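To make "preserving the original data annotation" concrete, here is a minimal sketch of one standard augmentation, a horizontal flip, applied to an image and its bounding box together. It assumes COCO-style boxes in [x, y, width, height] format and a toy image stored as a list of pixel rows; the function names are ours and purely illustrative.

```python
def hflip_image(image):
    """Flip an image stored as a list of pixel rows (left becomes right)."""
    return [row[::-1] for row in image]


def hflip_bbox(bbox, image_width):
    """Mirror a COCO-style [x, y, width, height] box across the vertical axis."""
    x, y, w, h = bbox
    # The box keeps its size; only its left edge moves.
    return [image_width - x - w, y, w, h]


# Toy 2x4 "image" and a box covering its leftmost column.
image = [[0, 1, 2, 3],
         [4, 5, 6, 7]]
bbox = [0, 0, 1, 2]

flipped_image = hflip_image(image)
flipped_bbox = hflip_bbox(bbox, image_width=4)
print(flipped_image, flipped_bbox)
```

The pixels that were under the box before the flip are still under the transformed box afterwards, which is what keeps the label valid.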

Data augmentations help fight overfitting and are used in virtually every modern approach to computer vision. We enhanced the standard approaches to data augmentation using https://kopikat.co, our generative data augmentation product. Kopikat generates new copies of the original images while preserving the original data annotation. It significantly surpasses previous data augmentation methods in the variety of the generated data and the quality of the resulting models. With Kopikat, we aim to enable real-life products where collecting a big dataset is hard.

Real-life datasets typically contain 5,000 images or fewer, significantly less than research-grade datasets. We hope that Kopikat helps many teams achieve better results with limited data.

Testing how Kopikat helps improve metrics on small datasets

Our goal was to test how Kopikat helps smaller datasets, with fewer than 5K images, as this is the typical dataset size in industrial applications.

Our hypothesis was that Kopikat should help diversify smaller datasets and improve the accuracy of the models trained on them, while it might not be as useful for big datasets with 100K+ samples. To test how it helps in real-life use cases, we used YOLOX-Nano, a lighter and faster version of YOLOX. We ran the experiments with vanilla YOLOX-Nano and did not change its code.

We ran three experiments:

5% of COCO, or 5,914 images;
25% of COCO, or 29,571 images (we used the split from https://github.com/giddyyupp/coco-minitrain to keep the same class distribution as in full COCO);
100% of COCO, or 118,286 images.

In each experiment, we generated one Kopikat copy of every original MS COCO image.
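The subset sizes above follow from simple arithmetic on the full training set. The sketch below reproduces them and, assuming the augmented run trains on the originals plus one generated copy of each (so twice as many images), shows the resulting training set sizes. Rounding down is our assumption; it happens to match the numbers quoted above.

```python
FULL_COCO_TRAIN = 118_286  # size of the full COCO training set quoted above


def subset_size(percent):
    """Number of original images in a subset, rounded down."""
    return FULL_COCO_TRAIN * percent // 100


def augmented_size(percent, copies_per_image=1):
    """Originals plus generated copies (assumes they are simply concatenated)."""
    originals = subset_size(percent)
    return originals + originals * copies_per_image


for percent in (5, 25, 100):
    print(f"{percent:>3}% of COCO: {subset_size(percent):,} originals, "
          f"{augmented_size(percent):,} images with one generated copy each")
```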

Figure 2: Validation metrics obtained during the experiment with the 25% subset of the COCO dataset.

The purple curve is the baseline training (original images from COCO); the blue curve is the training on the original images plus the images augmented with Kopikat. The vertical axis is mean Average Precision (mAP) averaged over IoU thresholds from 0.50 to 0.95, a commonly used metric for evaluating object detection models. We observed a consistent metric improvement in every epoch of the training: there was never a point when the augmented dataset was worse than the original one.
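For readers less familiar with the metric, here is a minimal sketch of its two building blocks: the IoU between a predicted and a ground-truth box, and the ten thresholds (0.50 to 0.95 in steps of 0.05) over which COCO-style AP is averaged. Boxes are assumed to be (x_min, y_min, x_max, y_max) tuples. This is not the full mAP computation, which also ranks predictions by confidence and builds per-class precision-recall curves.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# The ten IoU thresholds: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]

prediction = (0, 0, 10, 10)
ground_truth = (2, 0, 12, 10)

overlap = iou(prediction, ground_truth)
# A prediction counts as a match only at thresholds it reaches.
matched_at = [t for t in IOU_THRESHOLDS if overlap >= t]
print(f"IoU = {overlap:.3f}, counts as a match at thresholds {matched_at}")
```

The same prediction can be a hit at a loose threshold and a miss at a strict one, so averaging over all ten thresholds rewards tighter localization.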

Figure 3. The results of the experiments on 5%, 25%, and 100% COCO with Kopikat augmentations.

We observed an improvement in most of the experiments, but the improvement for smaller datasets was more significant than for 100% COCO. For the 5% split, we got a boost of 1.1 AP points, or 14.6%, right out of the box. For the 25% split, we got a boost of 1.6 AP points, or 9.1%, and for the full COCO, 0.42 AP points, or 1.73%.

Our intuition is that smaller datasets lack the diversity needed to fully represent the use case and fight overfitting, while in big datasets this issue is not as important. In real industrial applications, the typical dataset size is 5,000 images or fewer, so we believe Kopikat can be helpful for many projects.
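The absolute and relative gains are linked through the baseline score: relative gain = absolute gain / baseline. As a back-of-the-envelope check, the sketch below inverts that relation to estimate the baseline each pair of numbers implies. These baselines are our inference from rounded figures, not values reported in the experiment, so treat them as approximate. For the 5% split we use the +1.06 figure from the overview.

```python
# (absolute gain in AP points, relative gain in percent), as quoted in the text.
REPORTED_GAINS = {
    "5% COCO": (1.06, 14.6),
    "25% COCO": (1.6, 9.1),
    "100% COCO": (0.42, 1.73),
}


def implied_baseline(abs_gain, rel_gain_percent):
    """Baseline AP implied by an absolute gain and its relative size."""
    return abs_gain / (rel_gain_percent / 100.0)


implied = {
    name: implied_baseline(abs_gain, rel_gain)
    for name, (abs_gain, rel_gain) in REPORTED_GAINS.items()
}

for name, baseline in implied.items():
    abs_gain = REPORTED_GAINS[name][0]
    print(f"{name}: baseline ~{baseline:.1f} AP -> ~{baseline + abs_gain:.1f} AP")
```

The pattern matches the intuition above: the lower the baseline, the larger the share of it that a gain of roughly one AP point represents.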

How does Kopikat work?


Kopikat Augmentation Examples

Here are some examples of how Kopikat augments the data. We use images from the COCO dataset and show their original annotations on both the source and the generated images. For every image, the first row is the original and the second row is its Kopikat augmentation.
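Because the generated images reuse the original annotations, adding them to a COCO-format dataset mostly comes down to bookkeeping. The sketch below shows one way this could look: every image gets a second entry pointing at its generated copy, and its annotations are duplicated under the new image id. It does not call Kopikat; it assumes the generated files already exist, and the "_gen" file-name suffix and the id scheme are hypothetical choices for illustration.

```python
import copy
import os


def add_generated_copies(coco, suffix="_gen"):
    """Return a new COCO-style dict with one generated copy per image.

    Annotations are duplicated unchanged, only their ids are updated.
    The file-name suffix is a hypothetical naming convention.
    """
    out = copy.deepcopy(coco)
    next_image_id = max((img["id"] for img in coco["images"]), default=0) + 1
    next_ann_id = max((ann["id"] for ann in coco["annotations"]), default=0) + 1

    anns_by_image = {}
    for ann in coco["annotations"]:
        anns_by_image.setdefault(ann["image_id"], []).append(ann)

    for img in coco["images"]:
        stem, ext = os.path.splitext(img["file_name"])
        out["images"].append(
            dict(img, id=next_image_id, file_name=f"{stem}{suffix}{ext}")
        )
        for ann in anns_by_image.get(img["id"], []):
            new_ann = copy.deepcopy(ann)
            new_ann["id"] = next_ann_id
            new_ann["image_id"] = next_image_id
            out["annotations"].append(new_ann)
            next_ann_id += 1
        next_image_id += 1
    return out


# Toy dataset: one image with two annotated objects.
coco = {
    "images": [{"id": 1, "file_name": "000001.jpg", "width": 640, "height": 480}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1, "bbox": [10, 20, 100, 200]},
        {"id": 11, "image_id": 1, "category_id": 3, "bbox": [300, 50, 80, 60]},
    ],
}
augmented = add_generated_copies(coco)
print(len(augmented["images"]), len(augmented["annotations"]))
```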

Kopikat is helpful in:

1. Object Detection:
Improving the accuracy of models such as YOLOX-Nano enables more effective real-time object recognition. This is useful in products such as retail traffic analysis, security systems, and autonomous vehicles, and in many other domains.

2. Neural Network Training with Limited Data:
Kopikat diversifies limited datasets and makes it possible to use them in industrial applications. Small datasets are very common in real-life projects, and we hope to help build products even in these cases.

3. Transfer Learning:
Models trained on augmented data can be used for transfer learning to other tasks or datasets. This can significantly reduce the time and resources needed to train neural networks for new tasks from scratch.

Conclusion

Kopikat introduces a completely new approach that we call Generative Data Augmentation. Our experiments show its effectiveness for computer vision models.

Kopikat lets users enlarge and diversify datasets and is especially helpful for datasets with up to 5,000 images, which are typical for real-life AI projects. Generative data augmentation opens new opportunities for researchers and developers, providing tools to enhance model training. We plan to develop our product further, making it more effective and versatile.

We believe that the importance of data will only grow in the upcoming years, and we are ready to support this growth by offering innovative solutions.

And best of all, you can try it yourself!
