Semantic Segmentation of Multispectral Images by Using Quantized U-Net on FPGA

This example uses:

Deep Learning HDL Toolbox
Deep Learning HDL Toolbox Support Package for Intel FPGA and SoC Devices
Deep Learning Toolbox
Deep Learning Toolbox Model Compression Library
MATLAB Coder Interface for Deep Learning

This example shows how to use Deep Learning HDL Toolbox™ to deploy a quantized U-Net that performs semantic segmentation on multispectral images. The example uses a pretrained U-Net network to demonstrate quantization and deployment of the quantized network. Quantization reduces the memory requirement of a deep neural network by converting the weights, biases, and activations of network layers to 8-bit scaled integer data types. To retrieve the prediction results, use MATLAB®.
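As a rough illustration of why 8-bit quantization reduces memory, the following sketch quantizes a random single-precision weight tensor to int8 with one scale factor and compares the storage sizes. The tensor shape, the symmetric max-based scale, and all variable names are assumptions made for illustration. This is not necessarily the scheme that dlquantizer applies internally.

```matlab
% Illustrative only: symmetric int8 quantization of a random weight tensor.
rng(0);                                    % reproducible random weights
w = 0.1 * randn(3, 3, 64, 64, 'single');   % hypothetical 3x3x64x64 conv weights

scale = max(abs(w(:))) / 127;              % one scale factor for the whole tensor
wq = int8(round(w / scale));               % quantize (int8 saturates at [-128, 127])
wr = single(wq) * scale;                   % dequantize back to single

bytesSingle = numel(w) * 4;                % single uses 4 bytes per element
bytesInt8   = numel(wq) * 1;               % int8 uses 1 byte per element
maxErr = max(abs(w(:) - wr(:)));           % worst-case round-trip error

fprintf('single: %d bytes, int8: %d bytes (%.0fx smaller)\n', ...
    bytesSingle, bytesInt8, bytesSingle / bytesInt8);
fprintf('max round-trip error: %.2e (half a quantization step: %.2e)\n', ...
    maxErr, scale / 2);
```

The storage drops by a factor of four, and the round-trip error stays within about half a quantization step, which is the trade-off that calibration tries to keep small.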

Deploy the quantized U-Net network by creating a dlhdl.Workflow object. Use the dlhdl.Workflow object to:

Generate a list of instructions, weights and biases by using the compile method.
Generate a programming file for the FPGA by using the deploy method.
Retrieve the network prediction results and performance by using the predict method.

The quantized network takes in a multispectral input image of size 256-by-256 that has six channels and outputs a segmentation map where each pixel corresponds to one of 18 classes. To train the network, see Semantic Segmentation of Multispectral Images Using Deep Learning (Image Processing Toolbox).

Prerequisites

Intel Arria 10 SoC Development Kit up to Revision C

Load Pretrained U-Net Network

Load the pretrained Directed Acyclic Graph (DAG) network U-Net using the downloadTrainedUnet helper function. This function is attached to the example as a supporting file.

imageDir = fullfile(tempdir,'trainedUnet');
trainedUNetURL = "https://www.mathworks.com/supportfiles/vision/data/multispectralUnet.mat";
downloadTrainedUnet(trainedUNetURL, imageDir);
load(fullfile(imageDir,'multispectralUnet.mat'));

To obtain information about the 58 layers in the DAG network, use the analyzeNetwork function.

analyzeNetwork(net)

Download Data

The pretrained network was trained on a high-resolution multispectral data set [1]. The image set was captured using a drone over Hamlin Beach State Park, NY. The data contains labeled training, validation, and test sets that have 18 object class labels. The size of the data file is approximately 3.0 GB. For calibration and testing of the network, use parts of the training data set.

Download the MAT file version of the data set by using the downloadHamlinBeachMSIData helper function. This function is attached to the example as a supporting file.

imageDir = tempdir;
url = 'https://home.cis.rit.edu/~cnspci/other/data/rit18_data.mat';
downloadHamlinBeachMSIData(url, imageDir);

Create Calibration Data

The pretrained U-Net network accepts inputs of size 256-by-256-by-6. The training data in the downloaded MAT file has a size of 7-by-9393-by-5642. Use the extractMultispectralData helper function to extract patches of size 256-by-256-by-6 and store them in MAT files for calibration. The seventh channel in the training data is a binary mask and is not used by the pretrained network for inference.

For best quantization results, the calibration data must be representative of the actual inputs that the U-Net network processes during inference. To expedite calibration, reduce the calibration data set to six images. Choose the six images so that they form a 2-by-3 grid that represents one large continuous image.

foldername = 'CalibData';
dataPath = fullfile(imageDir, 'rit18_data', 'rit18_data.mat');
im = extractMultispectralData(foldername, dataPath, 2, 3);
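The patch extraction described above can be sketched as follows. This is a hypothetical stand-in, not the attached extractMultispectralData helper, whose implementation may differ. A small synthetic channel-first array replaces the real 7-by-9393-by-5642 training data, and all variable names are assumptions.

```matlab
% Illustrative only: cut a 2-by-3 grid of 256x256x6 patches from
% channel-first data. Synthetic data stands in for the real training set.
data = rand(7, 600, 800, 'single');        % channels x height x width (synthetic)
patchSize = 256;
nRows = 2;  nCols = 3;                     % grid of patches

hwc = permute(data, [2 3 1]);              % reorder to height x width x channels
hwc = hwc(:, :, 1:6);                      % drop the seventh (mask) channel

patches = cell(nRows, nCols);
for r = 1:nRows
    for c = 1:nCols
        rows = (r-1)*patchSize + (1:patchSize);
        cols = (c-1)*patchSize + (1:patchSize);
        patches{r, c} = hwc(rows, cols, :);   % one 256x256x6 patch
    end
end

disp(size(patches{2, 3}))                  % size of the last patch in the grid
```

Because neighboring patches share edges in the source image, tiling them in grid order reproduces one continuous 512-by-768 region.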

The first three channels of the multispectral training data contain RGB information. Display a histogram-equalized version of the extracted data.

im = histeq(im(:,:,[3 2 1]));
montage({im});

Create an imageDatastore object to use for calibration. The patches are loaded from the folder 'CalibData'.

imds = imageDatastore('CalibData', FileExtensions = '.mat', ReadFcn = @matReader);
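The matReader function passed as ReadFcn is not listed in this example. A minimal sketch of what such a read function might do is shown below, assuming each MAT file stores one numeric array as its first variable. The names firstVar, firstSix, and readerSketch are hypothetical, and the real helper may behave differently.

```matlab
% Illustrative only: a possible custom read function for MAT-file patches.
% Assumes each MAT file holds one numeric array as its first variable.
firstVar     = @(s) s{1};                  % pick the first stored variable
firstSix     = @(x) x(:, :, 1:6);          % keep the six image channels
readerSketch = @(file) firstSix(firstVar(struct2cell(load(file))));

% Try the reader on a temporary file that contains a synthetic patch.
patch = rand(256, 256, 6, 'single');
tmpFile = [tempname '.mat'];
save(tmpFile, 'patch');
out = readerSketch(tmpFile);
delete(tmpFile);
disp(size(out))
```

In a project, a read function like this would more commonly live in its own function file so that the datastore can reference it by name.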

Create dlquantizer Object

Create a quantized network object by using dlquantizer. Set the target execution environment to FPGA.

dlQuantObj = dlquantizer(net, ExecutionEnvironment = 'FPGA');

Calibrate Quantized Network

Use the calibrate function to exercise the network with sample inputs and collect range information. The function collects the dynamic ranges of the weights and biases in the convolution and fully connected layers of the network and the dynamic ranges of the activations in all layers of the network. The calibrate function returns a table.

calibrate(dlQuantObj, imds)

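ans=103×5 table
           Optimized Layer Name               Network Layer Name        Learnables / Activations    MinValue    MaxValue
    __________________________________    __________________________    ________________________    ________    ________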
    'Encoder-Section-1-Conv-1_Weights'    'Encoder-Section-1-Conv-1'    "Weights"    -0.0785    0.0839
       'Encoder-Section-1-Conv-1_Bias'    'Encoder-Section-1-Conv-1'       "Bias"     0.7125    1.1249
    'Encoder-Section-1-Conv-2_Weights'    'Encoder-Section-1-Conv-2'    "Weights"    -0.2389    0.2489
       'Encoder-Section-1-Conv-2_Bias'    'Encoder-Section-1-Conv-2'       "Bias"     0.7060    1.3810
    'Encoder-Section-2-Conv-1_Weights'    'Encoder-Section-2-Conv-1'    "Weights"    -0.0483    0.0754
       'Encoder-Section-2-Conv-1_Bias'    'Encoder-Section-2-Conv-1'       "Bias"     0.9370    1.0490
    'Encoder-Section-2-Conv-2_Weights'    'Encoder-Section-2-Conv-2'    "Weights"    -0.1825    0.1911
       'Encoder-Section-2-Conv-2_Bias'    'Encoder-Section-2-Conv-2'       "Bias"     0.8574    1.0482
    'Encoder-Section-3-Conv-1_Weights'    'Encoder-Section-3-Conv-1'    "Weights"    -0.0123    0.0279
       'Encoder-Section-3-Conv-1_Bias'    'Encoder-Section-3-Conv-1'       "Bias"     0.9723    1.0495
    'Encoder-Section-3-Conv-2_Weights'    'Encoder-Section-3-Conv-2'    "Weights"    -0.1462    0.1317
       'Encoder-Section-3-Conv-2_Bias'    'Encoder-Section-3-Conv-2'       "Bias"     0.9604    1.0234
    'Encoder-Section-4-Conv-1_Weights'    'Encoder-Section-4-Conv-1'    "Weights"    -0.0066    0.0070
       'Encoder-Section-4-Conv-1_Bias'    'Encoder-Section-4-Conv-1'       "Bias"     0.9854    1.0057
      ⋮
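To make the idea of range collection concrete, the following sketch tracks a running minimum and maximum over several synthetic batches and then derives a power-of-two scale from the first row of the table. The power-of-two choice, the synthetic activations, and all variable names are assumptions for illustration, not a description of how calibrate is implemented.

```matlab
% Illustrative only: track a running min/max over calibration batches,
% then pick a power-of-two scale so that the range fits in int8.
rng(1);
runMin = inf;  runMax = -inf;
for k = 1:6                                     % six synthetic "calibration images"
    act = randn(256, 256, 'single') * k/6;      % stand-in for one layer's activations
    runMin = min(runMin, min(act(:)));
    runMax = max(runMax, max(act(:)));
end
fprintf('observed range: [%.4f, %.4f]\n', runMin, runMax);

% Apply the same idea to the first table row
% (weights of Encoder-Section-1-Conv-1).
minValue = -0.0785;  maxValue = 0.0839;
maxAbs = max(abs([minValue maxValue]));
exponent = ceil(log2(maxAbs / 127));            % smallest power of two that avoids clipping
scale = 2^exponent;
fprintf('scale = 2^%d, representable range = +/-%.4f\n', exponent, 127 * scale);
```

A narrow range, such as the weights of the deeper encoder layers, leads to a smaller scale and therefore finer resolution per integer step.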

Create Target Object

Set the synthesis tool path to point to an installed Intel® Quartus® Prime Standard Edition 20.1 executable file. You must have already installed Altera® Quartus II.

% hdlsetuptoolpath(ToolName = 'Altera Quartus II', ToolPath = 'C:\intel\20.1\quartus\bin\quartus.exe');

Create a target object that has a custom name for your target device and an interface to connect your target device to the host computer. Interface options are JTAG (default) and Ethernet.

hTarget = dlhdl.Target('Intel', Interface = 'JTAG');

Create Workflow Object

Create an object of the dlhdl.Workflow class. Specify the network and bitstream name. Specify the quantized network object dlQuantObj as the network. Make sure that the bitstream name matches the data type and the FPGA board that you are targeting. In this example, the target FPGA board is the Intel Arria 10 SoC board. The bitstream uses an int8 data type.

hW = dlhdl.Workflow(Network = dlQuantObj, Bitstream = 'arria10soc_int8', Target = hTarget);

Compile Workflow Object

To compile the U-Net network, run the compile function of the dlhdl.Workflow object.

dn = compile(hW)

### Compiling network for Deep Learning FPGA prototyping ...
### Targeting FPGA bitstream arria10soc_int8.
### The network includes the following layers:
     1   'ImageInputLayer'                        Image Input                  256×256×6 images with 'zerocenter' normalization                                   (SW Layer)
     2   'Encoder-Section-1-Conv-1'               Convolution                  64 3×3×6 convolutions with stride [1  1] and padding [1  1  1  1]                  (HW Layer)
     3   'Encoder-Section-1-ReLU-1'               ReLU                         ReLU                                                                               (HW Layer)
     4   'Encoder-Section-1-Conv-2'               Convolution                  64 3×3×64 convolutions with stride [1  1] and padding [1  1  1  1]                 (HW Layer)
     5   'Encoder-Section-1-ReLU-2'               ReLU                         ReLU                                                                               (HW Layer)
     6   'Encoder-Section-1-MaxPool'              Max Pooling                  2×2 max pooling with stride [2  2] and padding [0  0  0  0]                        (HW Layer)
     7   'Encoder-Section-2-Conv-1'               Convolution                  128 3×3×64 convolutions with stride [1  1] and padding [1  1  1  1]                (HW Layer)
     8   'Encoder-Section-2-ReLU-1'               ReLU                         ReLU                                                                               (HW Layer)
     9   'Encoder-Section-2-Conv-2'               Convolution                  128 3×3×128 convolutions with stride [1  1] and padding [1  1  1  1]               (HW Layer)
    10   'Encoder-Section-2-ReLU-2'               ReLU                         ReLU                                                                               (HW Layer)
    11   'Encoder-Section-2-MaxPool'              Max Pooling                  2×2 max pooling with stride [2  2] and padding [0  0  0  0]                        (HW Layer)
    12   'Encoder-Section-3-Conv-1'               Convolution                  256 3×3×128 convolutions with stride [1  1] and padding [1  1  1  1]               (HW Layer)
    13   'Encoder-Section-3-ReLU-1'               ReLU                         ReLU                                                                               (HW Layer)
    14   'Encoder-Section-3-Conv-2'               Convolution                  256 3×3×256 convolutions with stride [1  1] and padding [1  1  1  1]               (HW Layer)
    15   'Encoder-Section-3-ReLU-2'               ReLU                         ReLU                                                                               (HW Layer)
    16   'Encoder-Section-3-MaxPool'              Max Pooling                  2×2 max pooling with stride [2  2] and padding [0  0  0  0]                        (HW Layer)
    17   'Encoder-Section-4-Conv-1'               Convolution                  512 3×3×256 convolutions with stride [1  1] and padding [1  1  1  1]               (HW Layer)
    18   'Encoder-Section-4-ReLU-1'               ReLU                         ReLU                                                                               (HW Layer)
    19   'Encoder-Section-4-Conv-2'               Convolution                  512 3×3×512 convolutions with stride [1  1] and padding [1  1  1  1]               (HW Layer)
    20   'Encoder-Section-4-ReLU-2'               ReLU                         ReLU                                                                               (HW Layer)
    21   'Encoder-Section-4-DropOut'              Dropout                      50% dropout                                                                        (HW Layer)
    22   'Encoder-Section-4-MaxPool'              Max Pooling                  2×2 max pooling with stride [2  2] and padding [0  0  0  0]                        (HW Layer)
    23   'Mid-Conv-1'                             Convolution                  1024 3×3×512 convolutions with stride [1  1] and padding [1  1  1  1]              (HW Layer)
    24   'Mid-ReLU-1'                             ReLU                         ReLU                                                                               (HW Layer)
    25   'Mid-Conv-2'                             Convolution                  1024 3×3×1024 convolutions with stride [1  1] and padding [1  1  1  1]             (HW Layer)
    26   'Mid-ReLU-2'                             ReLU                         ReLU                                                                               (HW Layer)
    27   'Mid-DropOut'                            Dropout                      50% dropout                                                                        (HW Layer)
    28   'Decoder-Section-1-UpConv'               Transposed Convolution       512 2×2×1024 transposed convolutions with stride [2  2] and cropping [0  0  0  0]  (HW Layer)
    29   'Decoder-Section-1-UpReLU'               ReLU                         ReLU                                                                               (HW Layer)
    30   'Decoder-Section-1-DepthConcatenation'   Depth concatenation          Depth concatenation of 2 inputs                                                    (HW Layer)
    31   'Decoder-Section-1-Conv-1'               Convolution                  512 3×3×1024 convolutions with stride [1  1] and padding [1  1  1  1]              (HW Layer)
    32   'Decoder-Section-1-ReLU-1'               ReLU                         ReLU                                                                               (HW Layer)
    33   'Decoder-Section-1-Conv-2'               Convolution                  512 3×3×512 convolutions with stride [1  1] and padding [1  1  1  1]               (HW Layer)
    34   'Decoder-Section-1-ReLU-2'               ReLU                         ReLU                                                                               (HW Layer)
    35   'Decoder-Section-2-UpConv'               Transposed Convolution       256 2×2×512 transposed convolutions with stride [2  2] and cropping [0  0  0  0]   (HW Layer)
    36   'Decoder-Section-2-UpReLU'               ReLU                         ReLU                                                                               (HW Layer)
    37   'Decoder-Section-2-DepthConcatenation'   Depth concatenation          Depth concatenation of 2 inputs                                                    (HW Layer)
    38   'Decoder-Section-2-Conv-1'               Convolution                  256 3×3×512 convolutions with stride [1  1] and padding [1  1  1  1]               (HW Layer)
    39   'Decoder-Section-2-ReLU-1'               ReLU                         ReLU                                                                               (HW Layer)
    40   'Decoder-Section-2-Conv-2'               Convolution                  256 3×3×256 convolutions with stride [1  1] and padding [1  1  1  1]               (HW Layer)
    41   'Decoder-Section-2-ReLU-2'               ReLU                         ReLU                                                                               (HW Layer)
    42   'Decoder-Section-3-UpConv'               Transposed Convolution       128 2×2×256 transposed convolutions with stride [2  2] and cropping [0  0  0  0]   (HW Layer)
    43   'Decoder-Section-3-UpReLU'               ReLU                         ReLU                                                                               (HW Layer)
    44   'Decoder-Section-3-DepthConcatenation'   Depth concatenation          Depth concatenation of 2 inputs                                                    (HW Layer)
    45   'Decoder-Section-3-Conv-1'               Convolution                  128 3×3×256 convolutions with stride [1  1] and padding [1  1  1  1]               (HW Layer)
    46   'Decoder-Section-3-ReLU-1'               ReLU                         ReLU                                                                               (HW Layer)
    47   'Decoder-Section-3-Conv-2'               Convolution                  128 3×3×128 convolutions with stride [1  1] and padding [1  1  1  1]               (HW Layer)
    48   'Decoder-Section-3-ReLU-2'               ReLU                         ReLU                                                                               (HW Layer)
    49   'Decoder-Section-4-UpConv'               Transposed Convolution       64 2×2×128 transposed convolutions with stride [2  2] and cropping [0  0  0  0]    (HW Layer)
    50   'Decoder-Section-4-UpReLU'               ReLU                         ReLU                                                                               (HW Layer)
    51   'Decoder-Section-4-DepthConcatenation'   Depth concatenation          Depth concatenation of 2 inputs                                                    (HW Layer)
    52   'Decoder-Section-4-Conv-1'               Convolution                  64 3×3×128 convolutions with stride [1  1] and padding [1  1  1  1]                (HW Layer)
    53   'Decoder-Section-4-ReLU-1'               ReLU                         ReLU                                                                               (HW Layer)
    54   'Decoder-Section-4-Conv-2'               Convolution                  64 3×3×64 convolutions with stride [1  1] and padding [1  1  1  1]                 (HW Layer)
    55   'Decoder-Section-4-ReLU-2'               ReLU                         ReLU                                                                               (HW Layer)
    56   'Final-ConvolutionLayer'                 Convolution                  18 1×1×64 convolutions with stride [1  1] and padding [0  0  0  0]                 (HW Layer)
    57   'Softmax-Layer'                          Softmax                      softmax                                                                            (HW Layer)
    58   'Segmentation-Layer'                     Pixel Classification Layer   Cross-entropy loss with 'Road Markings', 'Tree', and 16 other classes              (SW Layer)
                                                                                                                                                                
### Notice: The layer 'Decoder-Section-1-UpConv' of type 'nnet.cnn.layer.TransposedConvolution2DLayer' is split into an image input layer 'Decoder-Section-1-UpConv_insertZeros' and an addition layer 'Decoder-Section-1-UpConv' for normalization on hardware.
### Notice: The layer 'Decoder-Section-2-UpConv' of type 'nnet.cnn.layer.TransposedConvolution2DLayer' is split into an image input layer 'Decoder-Section-2-UpConv_insertZeros' and an addition layer 'Decoder-Section-2-UpConv' for normalization on hardware.
### Notice: The layer 'Decoder-Section-3-UpConv' of type 'nnet.cnn.layer.TransposedConvolution2DLayer' is split into an image input layer 'Decoder-Section-3-UpConv_insertZeros' and an addition layer 'Decoder-Section-3-UpConv' for normalization on hardware.
### Notice: The layer 'Decoder-Section-4-UpConv' of type 'nnet.cnn.layer.TransposedConvolution2DLayer' is split into an image input layer 'Decoder-Section-4-UpConv_insertZeros' and an addition layer 'Decoder-Section-4-UpConv' for normalization on hardware.
### Notice: The layer 'ImageInputLayer' with type 'nnet.cnn.layer.ImageInputLayer' is implemented in software.
### Notice: The layer 'Softmax-Layer' with type 'nnet.cnn.layer.SoftmaxLayer' is implemented in software.
### Notice: The layer 'Segmentation-Layer' with type 'nnet.cnn.layer.PixelClassificationLayer' is implemented in software.
### Compiling layer group: Encoder-Section-1-Conv-1>>Encoder-Section-1-ReLU-2 ...
### Compiling layer group: Encoder-Section-1-Conv-1>>Encoder-Section-1-ReLU-2 ... complete.
### Compiling layer group: Encoder-Section-1-MaxPool>>Encoder-Section-2-ReLU-2 ...
### Compiling layer group: Encoder-Section-1-MaxPool>>Encoder-Section-2-ReLU-2 ... complete.
### Compiling layer group: Encoder-Section-2-MaxPool>>Encoder-Section-3-ReLU-2 ...
### Compiling layer group: Encoder-Section-2-MaxPool>>Encoder-Section-3-ReLU-2 ... complete.
### Compiling layer group: Encoder-Section-3-MaxPool>>Encoder-Section-4-ReLU-2 ...
### Compiling layer group: Encoder-Section-3-MaxPool>>Encoder-Section-4-ReLU-2 ... complete.
### Compiling layer group: Encoder-Section-4-MaxPool>>Mid-ReLU-2 ...
### Compiling layer group: Encoder-Section-4-MaxPool>>Mid-ReLU-2 ... complete.
### Compiling layer group: Decoder-Section-1-UpConv_insertZeros ...
### Compiling layer group: Decoder-Section-1-UpConv_insertZeros ... complete.
### Compiling layer group: Decoder-Section-1-UpConv>>Decoder-Section-1-UpReLU ...
### Compiling layer group: Decoder-Section-1-UpConv>>Decoder-Section-1-UpReLU ... complete.
### Compiling layer group: Decoder-Section-1-Conv-1>>Decoder-Section-1-ReLU-2 ...
### Compiling layer group: Decoder-Section-1-Conv-1>>Decoder-Section-1-ReLU-2 ... complete.
### Compiling layer group: Decoder-Section-2-UpConv_insertZeros ...
### Compiling layer group: Decoder-Section-2-UpConv_insertZeros ... complete.
### Compiling layer group: Decoder-Section-2-UpConv>>Decoder-Section-2-UpReLU ...
### Compiling layer group: Decoder-Section-2-UpConv>>Decoder-Section-2-UpReLU ... complete.
### Compiling layer group: Decoder-Section-2-Conv-1>>Decoder-Section-2-ReLU-2 ...
### Compiling layer group: Decoder-Section-2-Conv-1>>Decoder-Section-2-ReLU-2 ... complete.
### Compiling layer group: Decoder-Section-3-UpConv_insertZeros ...
### Compiling layer group: Decoder-Section-3-UpConv_insertZeros ... complete.
### Compiling layer group: Decoder-Section-3-UpConv>>Decoder-Section-3-UpReLU ...
### Compiling layer group: Decoder-Section-3-UpConv>>Decoder-Section-3-UpReLU ... complete.
### Compiling layer group: Decoder-Section-3-Conv-1>>Decoder-Section-3-ReLU-2 ...
### Compiling layer group: Decoder-Section-3-Conv-1>>Decoder-Section-3-ReLU-2 ... complete.
### Compiling layer group: Decoder-Section-4-UpConv_insertZeros ...
### Compiling layer group: Decoder-Section-4-UpConv_insertZeros ... complete.
### Compiling layer group: Decoder-Section-4-UpConv>>Decoder-Section-4-UpReLU ...
### Compiling layer group: Decoder-Section-4-UpConv>>Decoder-Section-4-UpReLU ... complete.
### Compiling layer group: Decoder-Section-4-Conv-1>>Final-ConvolutionLayer ...
### Compiling layer group: Decoder-Section-4-Conv-1>>Final-ConvolutionLayer ... complete.

### Allocating external memory buffers:

          offset_name          offset_address     allocated_space 
    _______________________    ______________    _________________

    "InputDataOffset"           "0x00000000"     "16.0 MB"        
    "OutputResultOffset"        "0x01000000"     "48.0 MB"        
    "SchedulerDataOffset"       "0x04000000"     "24.0 MB"        
    "SystemBufferOffset"        "0x05800000"     "28.0 MB"        
    "InstructionDataOffset"     "0x07400000"     "36.0 MB"        
    "ConvWeightDataOffset"      "0x09800000"     "540.0 MB"       
    "EndOffset"                 "0x2b400000"     "Total: 692.0 MB"

### Network compilation complete.

dn = struct with fields:
             weights: [1×1 struct]
        instructions: [1×1 struct]
           registers: [1×1 struct]
    syncInstructions: [1×1 struct]
        constantData: {}
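The external memory table printed during compilation can be checked with simple arithmetic: each offset address equals the running sum of the allocations before it. The sketch below copies the values from that table. The variable names are hypothetical.

```matlab
% Illustrative only: verify that each offset equals the running sum of the
% preceding allocations. Values are copied from the compile output.
names = ["InputDataOffset" "OutputResultOffset" "SchedulerDataOffset" ...
         "SystemBufferOffset" "InstructionDataOffset" "ConvWeightDataOffset" ...
         "EndOffset"];
offsetsHex = ["00000000" "01000000" "04000000" "05800000" ...
              "07400000" "09800000" "2b400000"];
sizesMB = [16 48 24 28 36 540];                 % allocated space per buffer, in MB

% Convert byte addresses to MB (1 MB = 2^20 bytes) as a row vector.
offsetsMB = reshape(hex2dec(offsetsHex), 1, []) / 2^20;
expectedMB = [0 cumsum(sizesMB)];               % running total of allocations

for k = 1:numel(names)
    fprintf('%-24s %6.1f MB (expected %6.1f MB)\n', ...
        names(k), offsetsMB(k), expectedMB(k));
end
```

The convolution weights account for 540 MB of the 692 MB total, so the weight buffer dominates the external memory footprint of this network.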

Program Bitstream into FPGA and Download Network Weights

To deploy the network on the Intel Arria 10 SoC hardware, run the deploy function of the dlhdl.Workflow object. This function uses the output of the compile function to program the FPGA board by using the programming file. The function also loads the network weights and biases into the device. The deploy function programs the FPGA device and displays progress messages and the time it takes to deploy the network.

deploy(hW)

### Programming FPGA Bitstream using JTAG...
### Programming the FPGA bitstream has been completed successfully.
### Loading weights to Conv Processor.
### Conv Weights loaded. Current time is 14-Dec-2021 23:40:29

Load Example Images

Extract patches for inference on FPGA by using the extractMultispectralData helper function and store them in MAT files. Create 20 patches of size 256-by-256-by-6 so that they form a 4-by-5 grid to represent a large input image.

foldername = 'TestData';
dataPath = fullfile(imageDir, 'rit18_data', 'rit18_data.mat');
extractMultispectralData(foldername, dataPath, 4, 5);

Load the extracted data into testData by using the helperConcatenateMultispectralData helper function. It concatenates inputs along the fourth dimension for multiframe prediction by using the dlhdl.Workflow object. The function is attached to the example as a supporting file.

testData = helperConcatenateMultispectralData(foldername);
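Concatenation along the fourth dimension can be sketched as follows. This is not the attached helperConcatenateMultispectralData function. Synthetic six-channel patches stand in for the patches loaded from the MAT files, and the variable names are assumptions.

```matlab
% Illustrative only: stack individual patches into one 4-D array.
nPatches = 20;
patches = cell(1, nPatches);
for k = 1:nPatches
    patches{k} = rand(256, 256, 6, 'single');   % synthetic stand-in for a loaded patch
end
batch = cat(4, patches{:});                     % height x width x channels x frames
disp(size(batch))
```

Each index along the fourth dimension is one frame, so batch(:,:,:,k) recovers the k-th patch unchanged.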

Run Prediction

Execute the predict function of the dlhdl.Workflow object and display the prediction results for testData. Because the input is concatenated along the fourth dimension, a single call to predict processes all 20 frames in multi-frame mode.

[prediction, speed] = predict(hW, testData(:,:,1:6,:), 'Profile', 'on');

### Finished writing input activations.
### Running in multi-frame mode with 20 inputs.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                  175391449                  1.16928                      20         3507877237              0.9
    Encoder-Section-1-Conv-1   1216888                  0.00811 
    Encoder-Section-1-Conv-2   2898182                  0.01932 
    Encoder-Section-1-MaxPool   5225243                  0.03483 
    Encoder-Section-2-Conv-1    689902                  0.00460 
    Encoder-Section-2-Conv-2   2604963                  0.01737 
    Encoder-Section-2-MaxPool   4862763                  0.03242 
    Encoder-Section-3-Conv-1    416523                  0.00278 
    Encoder-Section-3-Conv-2   2406534                  0.01604 
    Encoder-Section-3-MaxPool   6432961                  0.04289 
    Encoder-Section-4-Conv-1    345878                  0.00231 
    Encoder-Section-4-Conv-2   4062950                  0.02709 
    Encoder-Section-4-MaxPool   7270617                  0.04847 
    Mid-Conv-1             1298161                  0.00865 
    Mid-Conv-2            14902377                  0.09935 
    Decoder-Section-1-UpConv_insertZeros  14894578                  0.09930 
    Decoder-Section-1-UpConv   6431694                  0.04288 
    Decoder-Section-1-Conv-1   1842230                  0.01228 
    Decoder-Section-1-Conv-2   9572771                  0.06382 
    Decoder-Section-2-UpConv_insertZeros  10785828                  0.07191 
    Decoder-Section-2-UpConv   4863034                  0.03242 
    Decoder-Section-2-Conv-1   3103690                  0.02069 
    Decoder-Section-2-Conv-2  10455339                  0.06970 
    Decoder-Section-3-UpConv_insertZeros  10361041                  0.06907 
    Decoder-Section-3-UpConv   5225305                  0.03484 
    Decoder-Section-3-Conv-1   4555619                  0.03037 
    Decoder-Section-3-Conv-2  11171105                  0.07447 
    Decoder-Section-4-UpConv_insertZeros  11466232                  0.07644 
    Decoder-Section-4-UpConv   5907915                  0.03939 
    Decoder-Section-4-Conv-1   2673353                  0.01782 
    Decoder-Section-4-Conv-2   1539401                  0.01026 
    Final-ConvolutionLayer   5908123                  0.03939 
 * The clock frequency of the DL processor is: 150MHz
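The derived columns of the profiler report follow from the cycle counts and the 150 MHz clock. The sketch below reproduces them for the Network row. The numbers are copied from the report and the variable names are hypothetical.

```matlab
% Illustrative only: reproduce the derived columns of the profiler report.
clockHz = 150e6;                 % DL processor clock from the report
lastFrameCycles = 175391449;     % LastFrameLatency(cycles), Network row
totalCycles = 3507877237;        % Total Latency in cycles for all frames
numFrames = 20;

lastFrameSeconds = lastFrameCycles / clockHz;
totalSeconds = totalCycles / clockHz;
framesPerSecond = numFrames / totalSeconds;

fprintf('last frame: %.5f s\n', lastFrameSeconds);
fprintf('all frames: %.2f s, %.2f frames/s\n', totalSeconds, framesPerSecond);
```

The total latency is close to 20 times the single-frame latency, which indicates that the frames are processed one after another rather than in parallel.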

The output of hW.predict has size 256-by-256-by-18-by-20, where the outputs are concatenated along the fourth dimension. The 20 test images were created from a 1024-by-1280-by-6 section of the training data. To display the prediction results, rearrange the inputs and outputs by using the helperArrangeInput and helperArrangeOutput functions. The functions are attached to the example as supporting files.

testImage = helperArrangeInput(testData, 4, 5);
segmentedImage = helperArrangeOutput(prediction, 4, 5);
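One way to turn the per-class scores into a single label image is sketched below: take the most likely class per pixel, then tile the 20 patches into a 4-by-5 grid. This is a hypothetical stand-in for helperArrangeOutput. The row-by-row frame ordering and the variable names are assumptions, and the attached helper may order the tiles differently.

```matlab
% Illustrative only: turn per-class scores into labels and tile 4x5 patches.
nRows = 4;  nCols = 5;  p = 256;  nClasses = 18;
scores = rand(p, p, nClasses, nRows*nCols, 'single');   % synthetic prediction

[~, labels] = max(scores, [], 3);          % most likely class per pixel
labels = squeeze(labels);                  % p x p x 20 label patches

mosaic = zeros(nRows*p, nCols*p);
for k = 1:nRows*nCols
    % Assumed ordering: frames fill the grid row by row.
    r = floor((k-1) / nCols) + 1;
    c = mod(k-1, nCols) + 1;
    mosaic((r-1)*p + (1:p), (c-1)*p + (1:p)) = labels(:, :, k);
end
disp(size(mosaic))
```

The result is a 1024-by-1280 matrix of class indices from 1 to 18, which is the form that labeloverlay accepts as a label matrix.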

Display the Prediction Results

Overlay the segmented image on the histogram-equalized RGB test image and display the prediction results.

classNames = [ ...
    "RoadMarkings", "Tree", "Building", "Vehicle", "Person", ...
    "LifeguardChair", "PicnicTable", "BlackWoodPanel", ...
    "WhiteWoodPanel", "OrangeLandingPad", "Buoy", "Rocks", ...
    "LowLevelVegetation", "Grass_Lawn", "Sand_Beach", ...
    "Water_Lake", "Water_Pond", "Asphalt"];

cmap = jet(numel(classNames));
N = numel(classNames);
ticks = 1/(N*2):1/N:1;

B = labeloverlay(histeq(testImage(:,:,[3 2 1])), medfilt2(segmentedImage), Transparency = 0.4, Colormap = cmap);

figure
imshow(B);
title('Labeled Test Image')
colorbar('TickLabels', cellstr(classNames), 'Ticks', ticks, 'TickLength', 0, 'TickLabelInterpreter', 'none');
colormap(cmap)
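The tick expression used for the colorbar places one tick at the center of each of the N color bands on the interval from 0 to 1. The short check below compares it against band centers computed from the band edges. The names bandEdges and bandCenters are hypothetical.

```matlab
% Illustrative only: colorbar ticks at the center of each of N color bands.
N = 18;
ticks = 1/(N*2) : 1/N : 1;                 % same expression as in the example
bandEdges = linspace(0, 1, N + 1);         % N equal bands on [0, 1]
bandCenters = (bandEdges(1:end-1) + bandEdges(2:end)) / 2;
fprintf('%d ticks, first %.4f, last %.4f\n', numel(ticks), ticks(1), ticks(end));
fprintf('largest difference from band centers: %.1e\n', ...
    max(abs(ticks - bandCenters)));
```

Centering the ticks this way lines up each class name with the middle of its color in the colorbar.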

References

[1] Kemker, Ronald, Carl Salvaggio, and Christopher Kanan. "High-Resolution Multispectral Dataset for Semantic Segmentation." CoRR abs/1703.01918 (2017).

[2] Kemker, Ronald, Carl Salvaggio, and Christopher Kanan. "Algorithms for Semantic Segmentation of Multispectral Remote Sensing Imagery Using Deep Learning." ISPRS Journal of Photogrammetry and Remote Sensing, Deep Learning RS Data, 145 (November 1, 2018): 60-77. https://doi.org/10.1016/j.isprsjprs.2018.04.014.