– PointWOLF

Ever since 3D technology was introduced to the world, it has created a lot of buzz. People were mesmerized by how far technology had come, and by how what was once thought possible only in fiction became reality. The general public was exposed to the concept of 3D back in the early 1900s, when 3D cinema was introduced. Many will remember gasping in awe when characters seemed to move right in front of their eyes. Today, 3D technology has evolved further and has been incorporated into a variety of fields such as video games, sound, printers, vehicles, and more. This article focuses on 3D vision.

According to Vital Vision, 3D vision refers to “recording three-dimensional information from target objects.” A target object captured in 3D space differs from a typical 2D image in that one is able to see “accurate coordinates that show the exact location of every pixel.” Together, these coordinate points make up what is referred to as a point cloud. The process of virtually representing a real-life object in 3D space is handled by artificial intelligence (AI) so that the object is portrayed as realistically as possible. However, 3D vision is limited by the scarcity of data available for training neural networks. To combat this, Korea University’s (KU) Machine Learning and Vision Lab and the University of Pittsburgh developed an augmentation method called PointWOLF. PointWOLF was introduced in a research paper titled “Point Cloud Augmentation with Weighted Local Transformations” at the International Conference on Computer Vision 2021.

Authors Kim Sihyeon and Lee Sanghyeok (Provided by Professor Hyunwoo J. Kim)

The Basics of 3D Vision

The fundamentals of 3D vision all come down to the point cloud. Although the concept may be hard to grasp at first, it becomes easier when imagining an object made up of small dots instead of lines. Point clouds are commonly created through laser scanning (for example, LiDAR), which records the amount of time it takes for pulses of light to reflect back from an object. Photogrammetry, by contrast, reconstructs points from overlapping photographs rather than from light pulses. Based on the travel time of each pulse, the scanner calculates how far away the surface is and plots points in 3D space to build up a whole structure.
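The timing idea described above reduces to simple arithmetic: a pulse travels to the surface and back, so the distance is half the round-trip time multiplied by the speed of light. The sketch below illustrates this under stated assumptions; the function names and the azimuth/elevation convention are hypothetical and are not taken from any particular scanner.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def range_from_round_trip(seconds):
    """Distance to a surface, given the round-trip time of one light pulse."""
    # The pulse travels out and back, so only half the path is the distance.
    return SPEED_OF_LIGHT * seconds / 2.0


def to_point(distance, azimuth, elevation):
    """Convert a measured distance and beam direction (radians) to x, y, z.

    Assumed convention: azimuth is measured in the horizontal plane from
    the x axis, elevation is measured upward from that plane.
    """
    horizontal = distance * math.cos(elevation)
    return (
        horizontal * math.cos(azimuth),
        horizontal * math.sin(azimuth),
        distance * math.sin(elevation),
    )
```

For example, a pulse that returns after 20 nanoseconds corresponds to a surface about 3 meters away. Repeating the measurement over many beam directions yields the set of points that forms the cloud.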

The Framework of PointWOLF

Currently, the limited amount of data for training deep neural networks poses a problem. Data is critical for training networks because having enough of it helps prevent overfitting. According to Towards Data Science, overfitting occurs when the model is not able to generalize well on new, unseen data although it has achieved a good fit on the training data. This is commonly referred to as poor generalizability, which means that the AI model excels only on the data it was trained with, not on data it has not seen before. To reduce overfitting, it is crucial to gather more training data from which the AI can learn a wider variety of patterns.

One way to do so is through data augmentation (DA), which, as described by Machine Learning Mastery, refers to “artificially creating new training data from existing training data.” With DA, more information is obtained from the original training data set by artificially transforming and synthesizing data. Although DA is frequently used to address data scarcity, it has not been explored in depth for point clouds. Conventional DAs for point clouds include “global similarity transformations such as rotation, scaling, and translation with point-wise jittering.” However, such techniques apply the same transformation to the entire object and are not flexible enough to generate diverse patterns.
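The conventional global transformations quoted above can be sketched in a few lines. This is a minimal illustration, not code from the research paper: the function name, the parameter ranges, and the choice to rotate only about the vertical axis are all assumptions made for the example.

```python
import math
import random


def augment_globally(points, rng, max_angle=math.pi, scale_range=(0.8, 1.25),
                     max_shift=0.1, jitter_std=0.01):
    """Apply one random rotation, scale, and shift to every point, plus jitter.

    All parameter names and default ranges are illustrative assumptions.
    """
    # One set of parameters is drawn for the whole cloud (a "global" transform).
    angle = rng.uniform(-max_angle, max_angle)
    scale = rng.uniform(*scale_range)
    shift = [rng.uniform(-max_shift, max_shift) for _ in range(3)]
    cos_a, sin_a = math.cos(angle), math.sin(angle)

    augmented = []
    for x, y, z in points:
        # Rotate about the z axis (assumed here to be the vertical axis).
        rotated = (cos_a * x - sin_a * y, sin_a * x + cos_a * y, z)
        # Scale and translate, then add small independent noise to each
        # coordinate (point-wise jittering).
        augmented.append(tuple(
            scale * value + offset + rng.gauss(0.0, jitter_std)
            for value, offset in zip(rotated, shift)
        ))
    return augmented
```

Because every point receives the same rotation, scale, and shift, the shape of the object itself never changes apart from the small jitter, which is the inflexibility the paragraph above points to.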

To put forward a technique that retains the structure of the original point cloud yet creates new data, the researchers developed PointWOLF, a simple point cloud augmentation with weighted local transformations. Unlike conventional DAs, PointWOLF generates realistic and smooth deformations of point clouds by combining multiple transformations. Put in simpler terms, PointWOLF creates realistic 3D samples by deforming local structures. These deformations are the combination of multiple transformations with varying weights.

To create an augmented sample, PointWOLF first selects several anchor points - shown as the red dots in the image below - and performs local transformations at those particular points. As shown in the image, T1, T2, and T3 show deformations of the original sample based on each anchor point. With all three deformations combined, the resulting augmented sample looks quite realistic with smooth joints. This is because PointWOLF applies different weights depending on the distance between each anchor point and a given point in the input. The type of local transformation applied at each anchor point, such as “changing aspect ratios, translation, and rotation,” is chosen at random.

PointWOLF Framework Illustration (Provided by KU's research paper "Point Cloud Augmentation with Weighted Local Transformations")
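The idea of anchor points, random local transformations, and distance-based weights can be illustrated with a simplified sketch. It is not the authors' implementation: picking anchors uniformly at random, using a Gaussian weight with width `sigma`, rotating only about one axis, and every name and default value below are assumptions made for the example.

```python
import math
import random


def local_transform(point, anchor, angle, scale, shift):
    """Scale, rotate, and shift one point relative to a single anchor."""
    # Work in coordinates centered on the anchor.
    x, y, z = (p - a for p, a in zip(point, anchor))
    # Per-axis scaling changes the aspect ratio.
    x, y, z = x * scale[0], y * scale[1], z * scale[2]
    # Rotation about the z axis only, to keep the sketch short.
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    x, y = cos_a * x - sin_a * y, sin_a * x + cos_a * y
    return (x + anchor[0] + shift[0],
            y + anchor[1] + shift[1],
            z + anchor[2] + shift[2])


def weighted_local_augment(points, rng, num_anchors=3, sigma=0.5,
                           max_angle=math.radians(10),
                           scale_range=(1.0, 1.5), max_shift=0.1):
    """Blend several anchor-centered transformations into one deformation."""
    # Assumption: anchors are drawn uniformly at random from the cloud.
    anchors = rng.sample(points, num_anchors)
    # Each anchor gets its own randomly chosen local transformation.
    params = [
        (rng.uniform(-max_angle, max_angle),
         [rng.uniform(*scale_range) for _ in range(3)],
         [rng.uniform(-max_shift, max_shift) for _ in range(3)])
        for _ in anchors
    ]

    augmented = []
    for point in points:
        # Anchors closer to this point get a larger weight (Gaussian falloff).
        weights = [math.exp(-math.dist(point, a) ** 2 / (2 * sigma ** 2))
                   for a in anchors]
        total = sum(weights)
        if total == 0.0:
            # Point is too far from every anchor: fall back to equal weights.
            weights = [1.0] * len(anchors)
            total = float(len(anchors))
        # Where each anchor's transformation would move this point.
        moved = [local_transform(point, a, *p) for a, p in zip(anchors, params)]
        # The final position is the weighted average of those candidates.
        augmented.append(tuple(
            sum(w * m[axis] for w, m in zip(weights, moved)) / total
            for axis in range(3)
        ))
    return augmented
```

Because the weights vary gradually with distance, neighboring points move in similar ways, which is why the blended result bends smoothly instead of tearing at the boundaries between anchors.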

As the framework above shows, many steps go into creating a non-rigid deformation. It is important to note that combining multiple transformations into one augmented sample allows new training data to be generated from the original dataset. In other words, PointWOLF offers an enhanced way to train deep neural networks that tackles the problem of data scarcity for point clouds. However, PointWOLF still comes with limitations. According to Professor Hyunwoo J. Kim (Department of Computer Science and Engineering), “PointWOLF is currently studied at an object-level, but for it to be applied at a scene-level, more research is needed on how to expand PointWOLF’s ability to recognize objects and naturally augment data.”

Professor Hyunwoo J. Kim (Provided by Professor Hyunwoo J. Kim)

The wonders of technology surprise the world day by day, and methods such as PointWOLF indicate how far the field has come. As a novel point cloud augmentation method, PointWOLF aims to surpass conventional DA techniques and ease the issue of data scarcity for point clouds. With many industries incorporating point clouds into their businesses today, such a finding is likely to bring wide-reaching benefits.

Copyright © The Granite Tower. Unauthorized reproduction and redistribution prohibited.