Image Processing Digital Twin

Digital Image Processing Methods Used in Robotic Vision Applications

Raw video

Visualization of the raw camera feed from a basketball competition robot

DeltaX in Estonia holds regular robotics competitions. In this competition, students build robots that must meet a set of technical requirements as well as the competition rules. The robots must be able to autonomously locate small colored rubber balls and shoot them into the opposing robot's goal hoop. They must not leave the playing field and may not shoot balls into their own goal hoop. To accomplish this, a large part of the robot's computational load goes into processing the video feed to determine the positions of the robot, the goals and the balls.

HSV Video

The most common color space for color-based computer vision applications

Image processing techniques are often applied in color spaces other than standard RGB, the most common alternative being the HSV color space. While in RGB each pixel has three values denoting the intensity of Red, Green and Blue, HSV uses Hue, Saturation and Value. Hue is an angle on the 360 degree color wheel that denotes the pure, fully saturated color of the pixel; in 8-bit images it is commonly halved to a range of 0 to 180 so that it fits in a single byte. Saturation ranges from 0 to 255 and represents how much of the color is present in the pixel as opposed to how gray it is. The last parameter, Value, represents how bright the pixel is on a scale of 0 to 255.
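The conversion described above can be sketched with Python's standard colorsys module. This is a minimal illustration, not the code used in the demo: the function name rgb_to_hsv_bytes is hypothetical, and the scaling of Hue to 0 to 179 follows the halved-degrees convention assumed for 8-bit images.

```python
import colorsys

def rgb_to_hsv_bytes(r, g, b):
    """Convert 8-bit RGB to (hue 0-179, saturation 0-255, value 0-255).

    Hypothetical helper: the byte ranges are an assumed convention.
    """
    # colorsys works on floats in the range 0.0 to 1.0
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    hue = int(round(h * 180)) % 180   # wrap so that 180 maps back to 0
    return hue, int(round(s * 255)), int(round(v * 255))

if __name__ == "__main__":
    for name, rgb in [("red", (255, 0, 0)), ("green", (0, 255, 0)),
                      ("gray", (128, 128, 128))]:
        print(name, rgb_to_hsv_bytes(*rgb))
```

Note that a gray pixel has a Saturation of 0, so its Hue carries no information; this is why thresholds on Saturation are usually combined with thresholds on Hue.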

HSV Threshold Video

Channel thresholding using the HSV color space

The HSV color space can be used to very easily and efficiently cut out and mask specific color, saturation and luminance ranges. For the competition, students often employ HSV thresholding to mask the balls, hoop and visible playing area so they can be later detected with algorithms like blob detection.

The Hue and Threshold sliders in this scene select a specific Hue value and the range around it. The Saturation and Luminosity sliders both set minimum values; pixels that fall below either minimum are discarded when the mask is generated.
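A thresholding step of this kind might look like the sketch below. It assumes an image stored as rows of (hue, saturation, value) tuples with Hue in the range 0 to 179, and it treats the Threshold slider as a half-width around the chosen Hue; both are assumptions for illustration, and the names hsv_threshold and hue_distance are hypothetical.

```python
def hue_distance(h, center, hue_max=180):
    """Shortest distance between two hues on the circular hue axis."""
    d = abs(h - center) % hue_max
    return min(d, hue_max - d)

def hsv_threshold(image, hue, hue_range, min_saturation, min_value):
    """Return a binary mask (1 = selected) for an HSV image.

    image is a list of rows, each row a list of (h, s, v) tuples.
    A pixel is selected when its hue lies within hue_range of the chosen
    hue and its saturation and value reach the given minimums.
    """
    mask = []
    for row in image:
        mask.append([
            1 if (hue_distance(h, hue) <= hue_range
                  and s >= min_saturation
                  and v >= min_value) else 0
            for (h, s, v) in row
        ])
    return mask

if __name__ == "__main__":
    frame = [[(5, 200, 200), (90, 200, 200)],
             [(175, 200, 200), (5, 10, 200)]]
    print(hsv_threshold(frame, hue=0, hue_range=10,
                        min_saturation=100, min_value=100))
```

The circular distance matters for colors such as red, whose hues sit on both sides of the point where the Hue axis wraps around.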

Morphological Operations

Operations used on a binary mask to join islands

Manually thresholding color ranges gives a rough estimate of where an object is in the camera feed, but with any motion or blurring the masked image may break up, which can cause issues in later detection steps. A simple and common way to fix isolated islands and dotted masks is to use morphological operations such as Erosion and Dilation. Dilation grows every white region in the mask, which often causes neighboring islands to join together in the final image. Erosion has the opposite effect: it shrinks every island in the mask and is primarily used to remove small, unwanted isolated islands. For the competition robots, a combination of the two is often used, sometimes iteratively, to get a smoother and more robust detection. An Erosion followed by a Dilation is called an Opening operation, and a Dilation followed by an Erosion is called a Closing operation.

The Erosion and Dilation sliders set the size of the Erosion and Dilation operation, effectively adjusting their impact on the thresholded mask.
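The four operations can be sketched in plain Python as follows. This is an illustrative version that assumes a binary mask stored as a list of rows of 0 and 1, a square structuring element of odd size, and windows that are simply clipped at the image border; the function names are hypothetical.

```python
def _morph(mask, size, combine):
    """Apply combine (min or max) over a size x size window at each pixel."""
    rows, cols = len(mask), len(mask[0])
    r = size // 2
    out = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            # Window is clipped at the border (an assumed border policy)
            window = [mask[j][i]
                      for j in range(max(0, y - r), min(rows, y + r + 1))
                      for i in range(max(0, x - r), min(cols, x + r + 1))]
            out[y][x] = combine(window)
    return out

def dilate(mask, size=3):
    """A pixel becomes white if any pixel in its window is white."""
    return _morph(mask, size, max)

def erode(mask, size=3):
    """A pixel stays white only if every pixel in its window is white."""
    return _morph(mask, size, min)

def opening(mask, size=3):
    """Erosion followed by Dilation: removes small isolated islands."""
    return dilate(erode(mask, size), size)

def closing(mask, size=3):
    """Dilation followed by Erosion: joins nearby islands, fills small gaps."""
    return erode(dilate(mask, size), size)

if __name__ == "__main__":
    gap = [[0] * 9 for _ in range(5)]
    gap[2] = [0, 0, 1, 1, 0, 1, 1, 0, 0]   # two islands, one-pixel gap
    for row in closing(gap):
        print(row)
```

In this sketch the size argument plays the role of the sliders: a larger window joins islands that are further apart or removes larger specks, at the cost of distorting the shape of the detected object.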

Convolution

Noise removal on the input feed that can smoothen the thresholding

When the areas to be detected are large but the same color also appears elsewhere in the camera feed as small dots, it may be more efficient to blur the input image using Convolution. This reduces the noise caused by small variations in pixel values and produces a more robust output for simpler detections.

For both samples, the Blur slider sets the size of the box filter that is applied using Convolution. A box filter is an N x N matrix where all of the values are 1 / (N*N).
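The box filter described above can be written out directly. The sketch below assumes a single-channel image stored as a list of rows of numbers, an odd kernel size N, and border pixels handled by repeating the nearest edge pixel; the names box_kernel and convolve2d are hypothetical.

```python
def box_kernel(n):
    """Return an n x n kernel where every value is 1 / (n * n)."""
    return [[1.0 / (n * n)] * n for _ in range(n)]

def convolve2d(image, kernel):
    """Convolve a grayscale image with a square, odd-sized kernel.

    Pixels outside the image are replaced by the nearest edge pixel
    (an assumed border policy).
    """
    rows, cols = len(image), len(image[0])
    n = len(kernel)
    r = n // 2
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            acc = 0.0
            for j in range(n):
                for i in range(n):
                    yy = min(max(y + j - r, 0), rows - 1)
                    xx = min(max(x + i - r, 0), cols - 1)
                    # The kernel is flipped, as true convolution requires
                    acc += image[yy][xx] * kernel[n - 1 - j][n - 1 - i]
            out[y][x] = acc
    return out

if __name__ == "__main__":
    speck = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]   # a single bright pixel
    for row in convolve2d(speck, box_kernel(3)):
        print([round(p, 3) for p in row])
```

Because the kernel values sum to 1, flat areas keep their brightness while a single bright speck is spread out and weakened, which is what makes the later threshold less sensitive to noise.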

Convolution

Contour and edge detection

Convolution can also be used to detect the high-variation, high-frequency areas in the image. This is often achieved with Edge filters that isolate and magnify any textural variations in the scene. This can be very useful for detecting specific shapes like the fiducial markers on either side of the goal hoop. After precise detection, the distance to the goal can be calculated, allowing the robot to adjust its throwing strength to land a score.

For both samples, the Edge slider sets the size of the edge filter and increases the width of the color transitions in the image. The top image has the original image overlaid on the raw edge-detected image, and the bottom version has the edge-detected values thresholded, creating an edge mask. The threshold can be adjusted using the slider.
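One common edge filter is the 3 x 3 Sobel operator, used here purely as an example since the text does not name the filter behind the demo. The sketch below assumes a single-channel image stored as a list of rows of numbers and repeats edge pixels at the border; the function names are hypothetical.

```python
import math

# Sobel kernels: horizontal and vertical intensity gradients
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def filter3x3(image, kernel):
    """Slide a 3 x 3 kernel over the image, repeating edge pixels."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            acc = 0.0
            for j in range(3):
                for i in range(3):
                    yy = min(max(y + j - 1, 0), rows - 1)
                    xx = min(max(x + i - 1, 0), cols - 1)
                    acc += image[yy][xx] * kernel[j][i]
            out[y][x] = acc
    return out

def edge_magnitude(image):
    """Gradient magnitude: large where pixel values change quickly."""
    gx = filter3x3(image, SOBEL_X)
    gy = filter3x3(image, SOBEL_Y)
    return [[math.hypot(a, b) for a, b in zip(row_x, row_y)]
            for row_x, row_y in zip(gx, gy)]

def edge_mask(image, threshold):
    """Binary mask (1 = edge) of pixels whose magnitude reaches threshold."""
    return [[1 if m >= threshold else 0 for m in row]
            for row in edge_magnitude(image)]

if __name__ == "__main__":
    step = [[0, 0, 0, 100, 100, 100] for _ in range(4)]  # dark to bright
    for row in edge_mask(step, threshold=200):
        print(row)
```

Here edge_magnitude corresponds to the raw edge-detected image and edge_mask to the thresholded version, with the threshold argument standing in for the slider.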