In recent years, pictures machine learning techniques centered on deep learning have attracted attention. For example, self-driving cars have gradually become possible, but in the entire deep learning process, algorithms are required to recognize and learn images provided as raw data. In this process, semantic segmentation technology is applied. So, how does semantic segmentation annotate pictures? Let’s introduce it below.

Example of 2D semantic segmentation Top input image Bottom prediction

What is Semantic Segmentation?

Semantic segmentation is a machine learning technique that learns how to recognize the extent of objects in an image, enabling machine learning algorithms to locate the precise boundaries of objects. Semantic segmentation refers to dividing complex and irregular images into regions based on the attributes of objects, and marking the corresponding attributes to help train image recognition models. It is often used in areas such as autonomous driving, human-computer interaction, and virtual reality.

How does semantic segmentation annotate pictures?

Semantic segmentation is the process of assigning a class label (also known as a semantic class) to each pixel in an image. Therefore, semantic segmentation annotation can be compared to pixel-level image classification, where each pixel of an image is assigned a specific class label.

Semantic segmentation means dividing an image into segments. Sounds easy, right? However, for the process to run smoothly and for machine learning to be as effective as possible, semantic image segmentation is actually a multi-step process that requires various methods, models, and ML and DL techniques.

Illustration of challenges in semantic segmentation a Input Image b Ground Truth

Semantic Segmentation Annotation Image Process:

1. Draw semantic segmentation on the image;

2. Select the “Mask to polygon” option from the menu of the annotation tool;

3. Convert segmentation to polygon;

4. Edit the polygon, then use the Polygon to Mask tool to return to split mode.

The process of voice segmentation and labeling pictures will vary depending on the labeling tool. For example, some tools simply create polygons and select masks as output options, while others may require that in order to retrieve segmentation masks, frames should be annotated with mask tools rather than simple polygons.

Table of Contents