Contenido principal

augmentedImageSource

(To be removed) Generate batches of augmented image data

augmentedImageSource will be removed in a future release. Create an augmented image datastore using the augmentedImageDatastore function instead. For more information, see Version History.

Description

auimds = augmentedImageSource(outputSize,imds) creates an augmented image datastore, auimds, for classification problems using images from image datastore imds, with output image size outputSize.

auimds = augmentedImageSource(outputSize,X,Y) creates an augmented image datastore for classification and regression problems. The array X contains the predictor variables and the array Y contains the categorical labels or numeric responses.

auimds = augmentedImageSource(outputSize,tbl) creates an augmented image datastore for classification and regression problems. The table, tbl, contains predictors and responses.

auimds = augmentedImageSource(outputSize,tbl,responseNames) creates an augmented image datastore for classification and regression problems. The table, tbl, contains predictors and responses. The responseNames argument specifies the response variable in tbl.

auimds = augmentedImageSource(___,Name,Value) creates an augmented image datastore, using name-value pairs to configure the image preprocessing done by the augmented image datastore. You can specify multiple name-value pairs.

example

Examples

collapse all

Preprocess images using random rotation so that the trained convolutional neural network has rotational invariance. This example uses the augmentedImageSource function to create an augmented image datastore object. For an example of the recommended workflow that uses the augmentedImageDatastore function to create an augmented image datastore object, see Train Network with Augmented Images.

Load the sample data, which consists of synthetic images of handwritten numbers.

[XTrain,YTrain] = digitTrain4DArrayData;

digitTrain4DArrayData loads the digit training set as 4-D array data. XTrain is a 28-by-28-by-1-by-5000 array, where:

  • 28 is the height and width of the images.

  • 1 is the number of channels

  • 5000 is the number of synthetic images of handwritten digits.

YTrain is a categorical vector containing the labels for each observation.

Create an image augmenter that rotates images during training. This image augmenter rotates each image by a random angle.

imageAugmenter = imageDataAugmenter('RandRotation',[-180 180])
imageAugmenter = 
  imageDataAugmenter with properties:

           FillValue: 0
     RandXReflection: 0
     RandYReflection: 0
        RandRotation: [-180 180]
           RandScale: [1 1]
          RandXScale: [1 1]
          RandYScale: [1 1]
          RandXShear: [0 0]
          RandYShear: [0 0]
    RandXTranslation: [0 0]
    RandYTranslation: [0 0]

Use the augmentedImageSource function to create an augmented image datastore. Specify the size of augmented images, the training data, and the image augmenter.

imageSize = [28 28 1];
auimds = augmentedImageSource(imageSize,XTrain,YTrain,'DataAugmentation',imageAugmenter)
auimds = 
  augmentedImageDatastore with properties:

         NumObservations: 5000
           MiniBatchSize: 128
        DataAugmentation: [1x1 imageDataAugmenter]
      ColorPreprocessing: 'none'
              OutputSize: [28 28]
          OutputSizeMode: 'resize'
    DispatchInBackground: 0

Specify the convolutional neural network architecture.

layers = [
    imageInputLayer([28 28 1])
    
    convolution2dLayer(3,16,'Padding',1)
    batchNormalizationLayer
    reluLayer
    
    maxPooling2dLayer(2,'Stride',2)
       
    convolution2dLayer(3,32,'Padding',1)
    batchNormalizationLayer
    reluLayer
    
    maxPooling2dLayer(2,'Stride',2)
       
    convolution2dLayer(3,64,'Padding',1)
    batchNormalizationLayer
    reluLayer
        
    fullyConnectedLayer(10)
    softmaxLayer
    classificationLayer];

Set the training options for stochastic gradient descent with momentum.

opts = trainingOptions('sgdm', ...
    'MaxEpochs',10, ...
    'Shuffle','every-epoch', ...
    'InitialLearnRate',1e-3);

Train the network.

net = trainNetwork(auimds,layers,opts);
Training on single CPU.
Initializing image normalization.
|========================================================================================|
|  Epoch  |  Iteration  |  Time Elapsed  |  Mini-batch  |  Mini-batch  |  Base Learning  |
|         |             |   (hh:mm:ss)   |   Accuracy   |     Loss     |      Rate       |
|========================================================================================|
|       1 |           1 |       00:00:01 |        7.81% |       2.4151 |          0.0010 |
|       2 |          50 |       00:00:23 |       52.34% |       1.4930 |          0.0010 |
|       3 |         100 |       00:00:44 |       74.22% |       1.0148 |          0.0010 |
|       4 |         150 |       00:01:05 |       78.13% |       0.8153 |          0.0010 |
|       6 |         200 |       00:01:26 |       76.56% |       0.6903 |          0.0010 |
|       7 |         250 |       00:01:45 |       87.50% |       0.4891 |          0.0010 |
|       8 |         300 |       00:02:06 |       87.50% |       0.4874 |          0.0010 |
|       9 |         350 |       00:02:30 |       87.50% |       0.4866 |          0.0010 |
|      10 |         390 |       00:02:46 |       89.06% |       0.4021 |          0.0010 |
|========================================================================================|

Input Arguments

collapse all

Size of output images, specified as a vector of two positive integers. The first element specifies the number of rows in the output images, and the second element specifies the number of columns. This value sets the OutputSize property of the returned augmented image datastore, auimds.

Image datastore, specified as an ImageDatastore object.

Images, specified as a 4-D numeric array. The first three dimensions are the height, width, and channels, and the last dimension indexes the individual images.

Data Types: single | double | uint8 | int8 | uint16 | int16 | uint32 | int32

Responses for classification or regression, specified as one of the following:

  • For a classification problem, Y is a categorical vector containing the image labels.

  • For a regression problem, Y can be an:

    • n-by-r numeric matrix. n is the number of observations and r is the number of responses.

    • h-by-w-by-c-by-n numeric array. h-by-w-by-c is the size of a single response and n is the number of observations.

Responses must not contain NaNs.

Data Types: categorical | double

Input data, specified as a table. tbl must contain the predictors in the first column as either absolute or relative image paths or images. The type and location of the responses depend on the problem:

  • For a classification problem, the response must be a categorical variable containing labels for the images. If the name of the response variable is not specified in the call to augmentedImageSource, the responses must be in the second column. If the responses are in a different column of tbl, then you must specify the response variable name using the responseNames argument.

  • For a regression problem, the responses must be numerical values in the column or columns after the first column. The responses can be either in multiple columns as scalars or in a single column as numeric vectors or cell arrays containing numeric 3-D arrays. When you do not specify the name of the response variable or variables, augmentedImageSource accepts the remaining columns of tbl as the response variables. You can specify the response variable names using the responseNames argument.

Responses must not contain NaN values. If there are NaNs in the predictor data, they are propagated through the training, however, in most cases the training fails to converge.

Data Types: table

Names of the response variables in the input table, specified as one of the following:

  • For classification or regression tasks with a single response, responseNames must be a character vector or string scalar containing the response variable in the input table.

    For regression tasks with multiple responses, responseNames must be string array or cell array of character vectors containing the response variables in the input table.

Data Types: char | cell | string

Name-Value Arguments

collapse all

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: augmentedImageSource([28,28],myTable,'OutputSizeMode','centercrop') creates an augmented image datastore that sets the OutputSizeMode property to crop images from the center.

Preprocessing operations performed on color channels of input images, specified as the comma-separated pair consisting of 'ColorPreprocessing' and 'none', 'gray2rgb', or 'rgb2gray'. This argument sets the ColorPreprocessing property of the returned augmented image datastore, auimds. The ColorPreprocessing property ensures that all output images from the augmented image datastore have the number of color channels required by inputImageLayer.

Preprocessing applied to input images, specified as the comma-separated pair consisting of 'DataAugmentation' and an imageDataAugmenter object or 'none'. This argument sets the DataAugmentation property of the returned augmented image datastore, auimds. When DataAugmentation is 'none', no preprocessing is applied to input images.

Method used to resize output images, specified as the comma-separated pair consisting of 'OutputSizeMode' and one of the following. This argument sets the OutputSizeMode property of the returned augmented image datastore, auimds.

  • 'resize' — Scale the image to fit the output size. For more information, see imresize.

  • 'centercrop' — Take a crop from the center of the training image. The crop has the same size as the output size.

  • 'randcrop' — Take a random crop from the training image. The random crop has the same size as the output size.

Data Types: char | string

Perform augmentation in parallel, specified as the comma-separated pair consisting of 'BackgroundExecution' and false or true. This argument sets the DispatchInBackground property of the returned augmented image datastore, auimds. If 'BackgroundExecution' is true, and you have Parallel Computing Toolbox™ software installed, then the augmented image datastore auimds performs image augmentation in parallel.

Output Arguments

collapse all

Augmented image datastore, returned as an augmentedImageDatastore object.

Version History

Introduced in R2017b

expand all