Like other deep neural models, pattern recognition models require a large amount of training data, so we applied data augmentation. Since CT scan images generally do not appear rotated or vertically flipped, we used two transformations: random resized crop and random horizontal flip. The first randomly selects a multiplier (from 0.08 to 1), applied to the original size to set the size of the cropping window, and an aspect ratio (from 0.75 to 1.33); the resulting crop is then resized. Random horizontal flip mirrors an image horizontally with a probability of 0.5.
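The two transformations can be illustrated with a minimal sketch in plain Python. It is not the implementation used in this work. The function names are hypothetical, and the sketch assumes that the multiplier scales the image area and that the aspect ratio is sampled log-uniformly, as common implementations such as torchvision's `RandomResizedCrop` do. The final resize of the crop is omitted.

```python
import math
import random


def sample_crop_window(width, height, scale=(0.08, 1.0), ratio=(0.75, 4 / 3),
                       rng=None, max_tries=10):
    """Return (left, top, crop_width, crop_height) for a random crop window.

    Assumption: `scale` is a fraction of the image area, not of the side length.
    """
    rng = rng or random.Random()
    area = width * height
    for _ in range(max_tries):
        target_area = area * rng.uniform(*scale)
        # Sample the aspect ratio log-uniformly so that ratios r and 1/r
        # are equally likely (assumption, see lead-in).
        aspect = math.exp(rng.uniform(math.log(ratio[0]), math.log(ratio[1])))
        crop_w = int(round(math.sqrt(target_area * aspect)))
        crop_h = int(round(math.sqrt(target_area / aspect)))
        if 0 < crop_w <= width and 0 < crop_h <= height:
            left = rng.randint(0, width - crop_w)
            top = rng.randint(0, height - crop_h)
            return left, top, crop_w, crop_h
    # Simplified fallback: central square crop when no sampled window fits.
    side = min(width, height)
    return (width - side) // 2, (height - side) // 2, side, side


def random_horizontal_flip(image, p=0.5, rng=None):
    """Mirror an image (a list of pixel rows) left to right with probability p."""
    rng = rng or random.Random()
    if rng.random() < p:
        return [row[::-1] for row in image]
    return image


if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_crop_window(256, 256, rng=rng))
    print(random_horizontal_flip([[1, 2, 3], [4, 5, 6]], rng=rng))
```

The sampled window is always contained in the image, so the crop can be cut out directly and then resized to the network input size.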

Besides data augmentation, it is important to preprocess images to specific parameters, as pretrained models are optimized for data normalized to a specific range. For this reason, images are first resized to 256 × 256 pixels and later resized to 224 × 224 by the random resized crop transformation. Finally, images are normalized to the range 0 to 1 by dividing their pixel channel values by 255 (8-bit RGB images are represented by pixel values ranging from 0 to 255 in each of their 3 channels). To preserve the original quality of the validation and test sets, their images are only resized and standardized, with no augmentation involved.
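The normalization step amounts to a single division per channel value. The sketch below is illustrative only: the function name `normalize_uint8` is hypothetical, and the image is assumed to be available as a flat sequence of 8-bit channel values.

```python
def normalize_uint8(channel_values):
    """Map 8-bit channel values (0 to 255) to floats in the range 0 to 1."""
    values = list(channel_values)
    if any(not 0 <= v <= 255 for v in values):
        raise ValueError("expected 8-bit channel values in the range 0 to 255")
    return [v / 255.0 for v in values]


if __name__ == "__main__":
    # One RGB pixel stored as three 8-bit channel values (R, G, B).
    rgb_pixel = bytes([0, 128, 255])
    print(normalize_uint8(rgb_pixel))
```

Black (0) maps to 0.0 and full intensity (255) maps to 1.0, so every channel value falls in the range the text describes.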

The selection of the data augmentation methods presented is based on their proven effectiveness and ease of interpretation [11]. Moreover, we do not expect images to be presented with, for example, a 90° rotation or sharp variations in brightness or contrast, so transformations of that kind could increase the amount of irrelevant augmented data. Finally, the transformation parameters are those commonly used for pretrained models such as Inception v3.