MixUp and CutMix
MixUp: blends two images together, pixel by pixel
CutMix: cuts a rectangular patch out of one image and fills it with pixels from another image
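MixUp combines the full information of both images, so it is more likely to introduce unnatural-looking content
CutMix combines only part of each image's information, which can make training more efficient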
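To make the difference concrete, here is a minimal hand-written sketch of both ideas on two toy images. It is not the torchvision implementation: the image size, the fixed mixing ratio lam and the fixed patch position are all assumptions chosen for readability (in practice the ratio is sampled from a Beta distribution and the patch is placed at random).

```python
import torch

# Two hypothetical images (C, H, W) and their one-hot labels over 3 classes.
img_a, img_b = torch.zeros(3, 8, 8), torch.ones(3, 8, 8)
label_a = torch.tensor([1.0, 0.0, 0.0])
label_b = torch.tensor([0.0, 1.0, 0.0])

lam = 0.7  # mixing ratio, fixed here for clarity

# MixUp: every pixel is a weighted blend of both images,
# and the labels are blended with the same weights.
mixup_img = lam * img_a + (1 - lam) * img_b
mixup_label = lam * label_a + (1 - lam) * label_b

# CutMix: paste a rectangular patch of img_b into a copy of img_a.
top, left, h, w = 2, 2, 4, 4  # patch position and size, fixed for clarity
cutmix_img = img_a.clone()
cutmix_img[:, top:top + h, left:left + w] = img_b[:, top:top + h, left:left + w]

# The label weight follows the fraction of pixels kept from img_a.
keep = 1 - (h * w) / (8 * 8)
cutmix_label = keep * label_a + (1 - keep) * label_b
```

In the MixUp result no pixel belongs to either original image, while in the CutMix result every pixel is an untouched pixel from one of the two images.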
In PyTorch
Define the number of classes
A typical image classification pipeline
Add MixUp and CutMix
After DataLoader
DataLoader has already batched the images and labels for us, and this is exactly what these transforms expect as input
Tensor shapes
Before CutMix/MixUp: images.shape = torch.Size([4, 3, 224, 224]), labels.shape = torch.Size([4])
After CutMix/MixUp: images.shape = torch.Size([4, 3, 224, 224]), labels.shape = torch.Size([4, 100])
The labels are transformed from shape (batch_size) into shape (batch_size, num_classes)
The transformed labels can still be passed as-is to a loss function like torch.nn.functional.cross_entropy().
For cross_entropy(), the target can have shape (N, C), where C is the number of classes and N is the batch size.
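The following sketch illustrates why (N, C) soft labels work with cross_entropy(), assuming a PyTorch version that accepts class probabilities as targets. The logits, the class indices and the ratio lam are made-up values; the soft labels are built by hand here rather than by CutMix/MixUp.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

N, C = 4, 100
logits = torch.randn(N, C)  # hypothetical model outputs

# Hard labels of the two samples mixed into each row, and the mixing ratio.
idx_a = torch.tensor([3, 1, 4, 1])
idx_b = torch.tensor([5, 9, 2, 6])
lam = 0.7

# Soft labels of shape (N, C): each row sums to 1.
soft_labels = (
    lam * F.one_hot(idx_a, num_classes=C).float()
    + (1 - lam) * F.one_hot(idx_b, num_classes=C).float()
)

# Targets of shape (N, C) are interpreted as class probabilities.
loss = F.cross_entropy(logits, soft_labels)

# Same value as weighting the two hard-label losses by lam and (1 - lam).
mixed = lam * F.cross_entropy(logits, idx_a) + (1 - lam) * F.cross_entropy(logits, idx_b)
```

So no change to the loss function is needed when switching from integer labels to the mixed labels.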
As part of the collation function
Adding cutmix_or_mixup directly after the DataLoader is the simplest way, but it does not take advantage of the DataLoader's multi-processing. To do that, we can pass the transforms as part of the collation function.
images.shape = torch.Size([4, 3, 224, 224]), labels.shape = torch.Size([4, 100])
With non-standard input format
The typical format is (images, labels)
MixUp and CutMix will magically work by default with most common sample structures: tuples whose second element is a tensor label, or dicts with a "label" or "labels" key.
If the samples have a different structure, CutMix and MixUp can still be used by passing a callable to the labels_getter parameter.