Harada-Osa-Kurose-Mukuta Lab.

3D reconstruction

What is 3D reconstruction?

Three-dimensional reconstruction refers to accurately reproducing, on a computer, the three-dimensional shape and appearance of an object or environment observed with an RGB camera or a depth camera.

By utilizing 3D reconstruction technology, it is possible to create 3D maps of the environment and detect surrounding people and vehicles, both of which are essential for the accurate and safe operation of autonomous vehicles. It is likewise indispensable for the recognition functions that let robots grasp packages and carry objects up and down stairs in everyday environments, and it serves as a tool for constructing virtual worlds that closely mirror the real world, such as the metaverse.

Deforming radiance fields with cages [Xu+, ECCV 2022]

Methods of 3D reconstruction can be roughly divided into active and passive methods.

In active methods, a depth image is obtained either by illuminating the object with a laser and measuring the time the light takes to return (time of flight), or by projecting a known random pattern onto the object and measuring how the pattern shifts with distance (structured light). The 3D structure of the object or environment is then reconstructed by combining multiple depth images obtained from different viewpoints.
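
As a rough illustration (not lab code), the sketch below converts a time-of-flight measurement into depth and back-projects a depth image into a point cloud, assuming a pinhole camera whose intrinsics fx, fy, cx, cy are known.

# A minimal sketch (not lab code): time-of-flight measurement -> depth image,
# then back-projection into a 3D point cloud with a pinhole camera model.
import numpy as np

C = 299_792_458.0  # speed of light [m/s]

def tof_to_depth(round_trip_time_s):
    # Depth from the time a laser pulse takes to reach the object and return.
    return 0.5 * C * round_trip_time_s

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    # Back-project an HxW depth image into an (H*W, 3) point cloud.
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

# Example: a flat wall 2 m away observed by a 640x480 sensor (assumed intrinsics).
depth = tof_to_depth(np.full((480, 640), 2 * 2.0 / C))
points = depth_to_point_cloud(depth, fx=525.0, fy=525.0, cx=319.5, cy=239.5)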

Passive methods instead capture ordinary two-dimensional images with the standard image sensors found in smartphones and other devices, and estimate the three-dimensional structure of the object or environment through image processing and recognition. Because they require only a common camera, passive methods can be applied in a wide variety of situations.
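
As a minimal illustration of the passive approach (again not lab code), the sketch below recovers depth from the disparity between a rectified stereo image pair, assuming the focal length and camera baseline are known.

# A minimal sketch of the classic passive approach: depth from the disparity
# between a rectified stereo pair, Z = f * B / d.
import numpy as np

def disparity_to_depth(disparity_px, focal_px, baseline_m):
    # Depth [m] from per-pixel disparity [px]; pixels with d <= 0 are invalid.
    d = np.where(disparity_px > 0, disparity_px, np.nan)
    return focal_px * baseline_m / d

# A point that shifts by 20 px between two cameras 10 cm apart (f = 700 px)
# lies at 700 * 0.1 / 20 = 3.5 m.
print(disparity_to_depth(np.array([20.0]), focal_px=700.0, baseline_m=0.1))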

Many image-processing and recognition techniques for 3D reconstruction have been proposed, for example exploiting the displacement of the target object between cameras at different viewpoints, or its motion across the image plane. Recently, a method called Neural Radiance Fields (NeRF) has attracted attention: NeRF is a neural network that reconstructs a three-dimensional scene from multiple two-dimensional images taken from different viewpoints, and it can accurately render the target object from novel viewpoints that were never observed.
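
The following sketch illustrates the volume-rendering step at the heart of NeRF in simplified form; radiance_field stands in for the trained neural network, and the code is an illustration rather than the original implementation.

# A minimal sketch of NeRF-style volume rendering: sample points along a
# camera ray, query a learned field for density and colour, and
# alpha-composite the samples into a pixel value.
import numpy as np

def render_ray(radiance_field, origin, direction, near=0.1, far=4.0, n_samples=64):
    t = np.linspace(near, far, n_samples)                  # sample distances
    pts = origin + t[:, None] * direction                  # 3D sample positions
    sigma, rgb = radiance_field(pts, direction)             # density, colour
    delta = np.append(np.diff(t), 1e10)                     # spacing between samples
    alpha = 1.0 - np.exp(-sigma * delta)                    # per-segment opacity
    trans = np.cumprod(np.append(1.0, 1.0 - alpha[:-1]))    # accumulated transmittance
    weights = alpha * trans
    return (weights[:, None] * rgb).sum(axis=0)              # composited pixel colour

# Toy stand-in field (uniform grey fog), just to show the call signature.
fog = lambda pts, d: (np.full(len(pts), 0.5), np.full((len(pts), 3), 0.5))
pixel = render_ray(fog, origin=np.zeros(3), direction=np.array([0.0, 0.0, 1.0]))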

Uniqueness and achievements of this laboratory

Our laboratory studies spatio-temporal reconstruction (four-dimensional reconstruction), which considers the passage of time in addition to reconstructing three-dimensional space. An object whose shape changes over time is called a non-rigid object; examples include people and animals, whose shapes change as they move their limbs, and trees and curtains, whose shapes change freely in the wind. Rigid objects such as buildings and tables, on the other hand, hardly change shape over time. Four-dimensional reconstruction of non-rigid objects is known to be a difficult task in computer vision.

In our laboratory, we have developed a novel NeRF method that automatically estimates not only the surface shape and appearance but also the skeletal and joint structures of objects such as people and animals from multiple observed images, and allows the user to freely repose the estimated objects. We are also developing a new NeRF method for freely deforming objects without a skeletal structure, such as stuffed animals.

NeRF for articulated objects [Noguchi+, CVPR 2022]
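
The sketch below illustrates the general idea behind such skeleton-aware radiance fields in simplified form (it is not the method of the papers above): query points in the posed scene are warped back to a canonical pose by blending per-bone rigid transforms, and the canonical radiance field is evaluated at the warped points. The bone transforms and skinning weights are treated as given here, whereas our methods estimate them automatically from the observed images.

# A minimal sketch (not the lab's method) of linear blend skinning used to
# warp query points from a posed space back to the canonical space of a
# radiance field. bone_transforms and skinning_weights are assumed inputs.
import numpy as np

def warp_to_canonical(points, bone_transforms, skinning_weights):
    # points: (N, 3); bone_transforms: (J, 4, 4), posed -> canonical;
    # skinning_weights: (N, J), each row summing to 1.
    homog = np.concatenate([points, np.ones((len(points), 1))], axis=1)        # (N, 4)
    blended = np.einsum('nj,jab->nab', skinning_weights, bone_transforms)      # (N, 4, 4)
    warped = np.einsum('nab,nb->na', blended, homog)                           # (N, 4)
    return warped[:, :3]

# The canonical radiance field (an MLP in practice) is then queried at the
# warped points, so a single canonical model explains the object in every pose.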

In addition, for the spatio-temporal reconstruction of non-rigid objects, we are developing a point cloud matching method that accurately finds correspondences between depth images of an object observed from different viewpoints, and a point cloud registration method that accurately estimates the transformation between depth images of non-rigid objects with different shapes.

Non-rigid object registration [Li+, NeurIPS 2022]
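
For reference, the sketch below shows the classical rigid building block of registration, matching each source point to its nearest target point and fitting a single rotation and translation with the Kabsch algorithm; it is not our method. Non-rigid registration, as addressed by Lepard and the Neural Deformation Pyramid, must instead estimate a deformation that can differ from point to point.

# A minimal sketch (not the lab's method) of one iteration of rigid point
# cloud registration: nearest-neighbour correspondences followed by a
# Kabsch/Procrustes fit of a single rotation R and translation t.
import numpy as np
from scipy.spatial import cKDTree

def rigid_registration_step(source, target):
    # source: (N, 3), target: (M, 3). Returns R (3x3), t (3,).
    matches = target[cKDTree(target).query(source)[1]]           # nearest neighbours
    src_c, tgt_c = source.mean(0), matches.mean(0)                # centroids
    H = (source - src_c).T @ (matches - tgt_c)                    # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])   # avoid reflections
    R = Vt.T @ D @ U.T
    t = tgt_c - R @ src_c
    return R, t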

Future Directions

Different methodologies currently exist depending on whether the target object is rigid or non-rigid, and we plan to research methods that enable unified 4D reconstruction regardless of the object's characteristics. We also aim to develop methods that reconstruct four-dimensional scenes accurately even from limited observations by appropriately estimating unobservable regions. Furthermore, we would like to create methods that scale to wide areas such as towns and suburbs and that, by accumulating data over long periods, can be applied to predicting the future state of the real world.

References

  1. Yang Li, Tatsuya Harada. Non-rigid Point Cloud Registration with Neural Deformation Pyramid. NeurIPS 2022.
  2. Yuki Kawana, Yusuke Mukuta, Tatsuya Harada. Unsupervised Pose-aware Part Decomposition for Man-made Articulated Objects. ECCV 2022. (oral)
  3. Atsuhiro Noguchi, Xiao Sun, Stephen Lin, Tatsuya Harada. Unsupervised Learning of Efficient Geometry-Aware Neural Articulated Representations. ECCV 2022.
  4. Tianhan Xu, Tatsuya Harada. Deforming Radiance Fields with Cages. ECCV 2022.
  5. Yang Li, Tatsuya Harada. Lepard: Learning Partial Point Cloud Matching in Rigid and Deformable Scenes. CVPR 2022. (oral)
  6. Atsuhiro Noguchi, Umar Iqbal, Jonathan Tremblay, Tatsuya Harada, Orazio Gallo. Watch It Move: Unsupervised Discovery of 3D Joints for Re-Posing of Articulated Objects. CVPR 2022.
  7. Atsuhiro Noguchi, Xiao Sun, Stephen Lin, Tatsuya Harada. Neural Articulated Radiance Field. ICCV 2021.
  8. Yuki Kawana, Yusuke Mukuta, Tatsuya Harada. Neural Star Domain as Primitive Representation. NeurIPS 2020.
  9. Yang Li, Aljaz Bozic, Tianwei Zhang, Yanli Ji, Tatsuya Harada, Matthias Niessner. Learning to Optimize Non-Rigid Tracking. CVPR 2020. (oral)
  10. Yang Li, Tianwei Zhang, Yoshihiko Nakamura, Tatsuya Harada. SplitFusion: Simultaneous Tracking and Mapping for Non-Rigid Scenes. IROS 2020.