
Instance Segmentation

Label ground truth and perform instance segmentation using pretrained AI models like SOLOv2, Mask R-CNN, and SAM, or train custom networks with transfer learning

Instance segmentation tools in Computer Vision Toolbox™ enable you to detect, classify, and segment individual objects within an image, even when multiple objects overlap. You can start by creating labeled ground truth using the Image Labeler and Video Labeler apps, which support interactive and AI-assisted annotation of object instances with polygon or rectangle ROIs. For more information, see Label Objects Using Polygons for Instance Segmentation.

The toolbox provides pretrained instance segmentation networks such as SOLOv2 and Mask R-CNN. You can use these models directly for inference or adapt them to specific applications through transfer learning. For more information, see Get Started with Instance Segmentation Using Deep Learning and Get Started with SOLOv2 for Instance Segmentation. For class-agnostic instance segmentation, the toolbox supports the Segment Anything Model (SAM) through the imsegsam function and the segmentAnythingModel object.
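
As a rough illustration of direct inference with a pretrained network, the sketch below loads SOLOv2 and segments a single image. It assumes that the SOLOv2 support package is installed, that the "resnet50-coco" pretrained configuration is available, and that the sample image visionteam.jpg is on the MATLAB path; substitute your own image if it is not.

```matlab
% Sketch: run a pretrained SOLOv2 network on one image.
% Assumption: the SOLOv2 support package for Computer Vision Toolbox is installed.
I = imread("visionteam.jpg");        % assumed sample image; replace with your own

model = solov2("resnet50-coco");     % pretrained network (COCO classes)

% Segment object instances. masks is an H-by-W-by-N logical array with one
% page per detected object; labels and scores each have N elements.
[masks,labels,scores] = segmentObjects(model,I);

% Overlay the predicted masks on the image for a quick visual check.
overlaid = insertObjectMask(I,masks);
imshow(overlaid)
title(sprintf("%d objects detected",numel(labels)))
```

The same segmentObjects call pattern applies to a maskrcnn object, so swapping the model constructor is usually enough to compare the two networks on the same image.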

To prepare training data, the toolbox offers utilities for managing and organizing data sets along with data augmentation and preprocessing. For more information, see Postprocess Exported Labels for Instance Segmentation Training.
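
One common preprocessing step is converting polygon labels into binary masks. The sketch below rasterizes two polygons with poly2mask and stacks the results into an H-by-W-by-N logical array. The polygon coordinates and image size are made up for illustration, and the stacked mask layout is an assumption about what a training function expects; check the documentation of the training function you use for the exact format.

```matlab
% Sketch: convert polygon ROIs into a stack of binary instance masks.
% The image size and polygon vertices are hypothetical example values.
imageSize = [100 120];                       % [rows cols]

% Each cell holds one polygon as an M-by-2 list of [x y] vertices.
polygons = { ...
    [20 20; 60 20; 60 50; 20 50], ...        % rectangle-shaped object
    [70 60; 110 60; 90 90]};                 % triangle-shaped object

numObjects = numel(polygons);
masks = false([imageSize numObjects]);       % one logical page per object

for k = 1:numObjects
    xy = polygons{k};
    % poly2mask takes x (columns) and y (rows) vertex coordinates plus the
    % output mask size as rows, then columns.
    masks(:,:,k) = poly2mask(xy(:,1),xy(:,2),imageSize(1),imageSize(2));
end
```

Keeping one mask page per object preserves instance identity even where objects overlap, which a single label image cannot represent.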

After you generate predictions using pretrained or custom models, you can evaluate instance segmentation performance and generate detailed insights into segmentation accuracy, object-level precision, and performance across different object sizes. These metrics help assess the quality of both mask predictions and bounding box localization. For more information, see evaluateInstanceSegmentation.
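
Instance segmentation metrics are typically built on the overlap between a predicted mask and a ground truth mask, measured as intersection over union (IoU). The sketch below computes mask IoU for two small hand-made masks. It illustrates the underlying measure only and is not the implementation of evaluateInstanceSegmentation; the mask contents and the 0.5 threshold are arbitrary example values.

```matlab
% Sketch: intersection over union (IoU) between two binary masks.
% Both masks are hypothetical 10-by-10 examples.
predicted = false(10,10);
predicted(2:6,2:6) = true;                   % 25 predicted pixels

truth = false(10,10);
truth(4:8,4:8) = true;                       % 25 ground truth pixels

intersectionArea = nnz(predicted & truth);   % pixels in both masks
unionArea = nnz(predicted | truth);          % pixels in either mask
iou = intersectionArea / unionArea;

% A prediction counts as a match when IoU meets a chosen overlap threshold.
overlapThreshold = 0.5;                      % example value
isMatch = iou >= overlapThreshold;
```

Here the masks share 9 pixels out of a 41-pixel union, so the IoU is 9/41 (about 0.22) and the prediction does not count as a match at a 0.5 threshold.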

The toolbox also supports 3-D object pose estimation using instance segmentation through the Pose Mask R-CNN framework, enabling fine-grained analysis of object orientation and structure. For more information, see Perform 6-DoF Pose Estimation for Bin Picking Using Deep Learning.

Instance segmentation using SOLOv2. Left: a segmented and labeled road scenario using a sample modified RGB image from the CamVid data set. Right: a segmented image of PVC pipe connectors.

Apps

Image Labeler - Label images for computer vision applications
Video Labeler - Label video for computer vision applications

Functions

SOLOv2

solov2 - Segment objects using SOLOv2 instance segmentation network (Since R2023b)
segmentObjects - Segment objects using SOLOv2 instance segmentation (Since R2023b)

Mask R-CNN

maskrcnn - Detect objects using Mask R-CNN instance segmentation (Since R2021b)
segmentObjects - Segment objects using Mask R-CNN instance segmentation (Since R2021b)

Segment Anything Model (SAM)

imsegsam - Perform automatic full image segmentation using Segment Anything Model 2 (SAM 2) (Since R2024b)
segmentAnythingModel - Pretrained Segment Anything Model 2 (SAM 2) for image segmentation (Since R2024a)

Load Training Data

boxLabelDatastore - Datastore for bounding box label data
groundTruth - Ground truth label data
imageDatastore - Datastore for image data
combine - Combine data from multiple datastores

Train Instance Segmentation Networks

trainSOLOV2 - Train SOLOv2 network to perform instance segmentation (Since R2023b)
trainMaskRCNN - Train Mask R-CNN network to perform instance segmentation (Since R2022a)

Augment and Preprocess Training Data

poly2mask - Convert region of interest (ROI) polygon to region mask
bwboundaries - Trace object boundaries in binary image
balanceBoxLabels - Balance bounding box labels for object detection
bboxcrop - Crop bounding boxes
bboxerase - Remove bounding boxes
bboxresize - Resize bounding boxes
bboxwarp - Apply geometric transformation to bounding boxes
bbox2points - Convert rectangle to corner points list
imwarp - Apply geometric transformation to image
imcrop - Crop image
imresize - Resize image
randomAffine2d - Create randomized 2-D affine transformation
centerCropWindow2d - Create rectangular center cropping window
randomWindow2d - Randomly select rectangular region in image

Evaluate Instance Segmentation Results

evaluateInstanceSegmentation - Evaluate instance segmentation data set against ground truth (Since R2022b)
instanceSegmentationMetrics - Instance segmentation quality metrics (Since R2022b)
metricsByArea - Evaluate instance segmentation across object mask size ranges (Since R2023b)

Visualize Results

insertObjectMask - Insert masks in image or video stream
insertObjectAnnotation - Annotate truecolor or grayscale image or video
insertShape - Insert shapes in image or video
insertText - Insert text in image or video
showShape - Display shapes on image, video, or point cloud

Pose Estimation

posemaskrcnn - Predict object pose using Pose Mask R-CNN pose estimation (Since R2024a)
predictPose - Estimate object pose using Pose Mask R-CNN deep learning network (Since R2024a)
trainPoseMaskRCNN - Train Pose Mask R-CNN network to perform pose estimation (Since R2024a)

Topics

Get Started

Create Ground Truth for Instance Segmentation

Prepare Training Data for Instance Segmentation

Featured Examples