Main Content

Analyze Training Data for Semantic Segmentation

To train a semantic segmentation network you need a collection of images and its corresponding collection of pixel labeled images. A pixel labeled image is an image where every pixel value represents the categorical label of that pixel.

The following code loads a small set of images and their corresponding pixel labeled images:

dataDir = fullfile(toolboxdir('vision'),'visiondata');
imDir = fullfile(dataDir,'building');
pxDir = fullfile(dataDir,'buildingPixelLabels');

Load the image data using an imageDatastore. An image datastore can efficiently represent a large collection of images because images are only read into memory when needed.

imds = imageDatastore(imDir);

Read and display the first image.

I = readimage(imds,1);
figure
imshow(I)

Figure contains an axes object. The axes object contains an object of type image.

Load the pixel label images using a pixelLabelDatastore to define the mapping between label IDs and categorical names. In the dataset used here, the labels are "sky", "grass", "building", and "sidewalk". The label IDs for these classes are 1, 2, 3, 4, respectively.

Define the class names.

classNames = ["sky" "grass" "building" "sidewalk"];

Define the label ID for each class name.

pixelLabelID = [1 2 3 4];

Create a pixelLabelDatastore.

pxds = pixelLabelDatastore(pxDir,classNames,pixelLabelID);

Read the first pixel label image.

C = readimage(pxds,1);

The output C is a categorical matrix where C(i,j) is the categorical label of pixel I(i,j).

C(5,5)
ans = categorical
     sky 

Overlay the pixel labels on the image to see how different parts of the image are labeled.

B = labeloverlay(I,C);
figure
imshow(B)

Figure contains an axes object. The axes object contains an object of type image.

The categorical output format simplifies tasks that require doing things by class names. For instance, you can create a binary mask of just the building:

buildingMask = C == 'building';

figure
imshowpair(I, buildingMask,'montage')

Figure contains an axes object. The axes object contains an object of type image.