Efforts to develop a pipeline for automated creation of urban 3D models

#Digital Twin


1. Introduction

In recent years, there has been growing expectation to utilize digital twins in solving social issues. At SoftBank's Research Institute of Advanced Technology, we are engaged in research and development from various perspectives to make digital twins practical.

One of the elements that constitute a digital twin is the reproduction of the real world in digital space. The scale of spaces reproduced in the digital realm varies. For instance, in the manufacturing sector, there are cases where entire factories are recreated as 3D models in digital space. This allows for simulations of production line layouts and workflow patterns, aiming to reduce costs [1].

At a larger scale, entire cities are modeled in 3D. Google Earth [2] is a well-known example of a global-scale 3D model. There are also initiatives to create 3D model data at city scale, such as the PLATEAU project led by Japan's Ministry of Land, Infrastructure, Transport and Tourism. The PLATEAU project uses a data format called CityGML, which includes not only 3D shape information but also attribute information. By the end of fiscal year 2024, it is planned to be implemented in approximately 250 cities in Japan [3].

Even when we talk about 3D models of cities, there are different types. One type is the TIN (Triangulated Irregular Network) model, which reproduces the city using irregular triangular polygons (upper part of Figure 1). Another type, like the one used in PLATEAU, represents each building as structured data (lower part of Figure 1). There are various differences between these data types, but one easy-to-understand distinction is this: The TIN model represents the city as a continuous 3D model without separating buildings, ground, trees, and other geographical features. In contrast, the latter type separates and structures each geographical feature individually. By structuring each geographical feature, it becomes possible to attach arbitrary semantic information to the 3D model of each building.


Figure 1 (upper part) TIN model and (lower part) structured 3D model [4]

Utilizing urban 3D models has the potential to contribute to solving various social issues. For example, many municipalities that have developed 3D models of their cities in PLATEAU are visualizing the areas affected by flood or landslide disasters in three dimensions, aiming to improve disaster prevention planning [5]. In the private sector, architectural designers and construction companies use 3D city models when considering new development plans, for example to simulate the view from a newly built building or the sunlight conditions of the surrounding environment. In urban planning, structured and semantically rich 3D models tend to broaden the scope of potential applications. This is because they make it possible to visualize before-and-after scenarios of a rebuilding project by simply replacing specific 3D models, or to assess disaster risk by extracting only the older, deteriorated buildings. We believe it will be necessary to develop structured 3D city model data widely, as the basis of a future urban digital twin platform that links the various information held by cities and makes it usable from a big data perspective.

2. Current State of Urban 3D Model Development

Let’s take a closer look at the current development status of structured urban 3D models.

As introduced in the previous section, 3D models have been created in accordance with the LOD (Level of Detail) divisions defined by the CityGML format. The LOD classification of buildings is mainly divided into LOD 0 to LOD 4 (Figure 2 A to E). LOD 0 is a two-dimensional model expressing the horizontal outline of a building, whereas LOD 1 is a box model that gives height to LOD 0. Note, however, that LOD 1 adopts the median of the measured height values, so it does not necessarily represent the exact height of the building. LOD 2 additionally reproduces the shape of the roof and allows exterior textures to be added. LOD 3 also represents elements of a building's external structure such as windows and doors, and LOD 4 can model the internal structure such as floors and pillars.
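The relationship between LOD 0 and LOD 1 described above can be illustrated with a minimal sketch. It assumes a footprint given as a counter-clockwise list of (x, y) vertices and a handful of height samples; the function name `extrude_lod1`, the input format, and the sample values are hypothetical and are not taken from CityGML or PLATEAU tooling.

```python
from statistics import median


def extrude_lod1(footprint, measured_heights, ground_z=0.0):
    """Turn an LOD 0 footprint into an LOD 1 box (illustrative only).

    footprint: counter-clockwise list of (x, y) vertices (assumed format).
    measured_heights: height samples in metres for this building.
    Returns (vertices, faces); each face is a list of vertex indices.
    """
    height = median(measured_heights)  # LOD 1 uses the median height
    n = len(footprint)
    bottom = [(x, y, ground_z) for x, y in footprint]
    top = [(x, y, ground_z + height) for x, y in footprint]
    vertices = bottom + top

    faces = [
        list(range(n - 1, -1, -1)),  # floor, reversed so it faces downward
        list(range(n, 2 * n)),       # flat roof
    ]
    for i in range(n):               # one wall quad per footprint edge
        j = (i + 1) % n
        faces.append([i, j, n + j, n + i])
    return vertices, faces


footprint = [(0, 0), (10, 0), (10, 6), (0, 6)]
heights = [7.9, 8.1, 8.0, 11.5, 8.2]  # one sample lands on a roof ridge
vertices, faces = extrude_lod1(footprint, heights)
print(f"{len(vertices)} vertices, {len(faces)} faces, box height {vertices[4][2]} m")
```

In this made-up example the ridge sample at 11.5 m does not affect the box height, which is why an LOD 1 box can be lower than the highest point of the real building.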

Models at LOD 2 and above require increasingly complex shape reproduction, and have traditionally been created primarily through manual processes. As a result, the cost of developing 3D models at LOD 2 and above has inevitably been high. In the PLATEAU project, over 200 cities have had 3D models developed, but the areas covered by LOD 2 and higher models are often limited to specific parts of the main districts, while many other areas are represented by LOD 1 models. This is likely because the utilization of urban 3D models is still at an early stage, and the cost of developing LOD 2 and above models is seen as high relative to the benefits, making it difficult to justify expanding coverage. However, as defined earlier, LOD 1 models are simple box representations without exterior textures, and they lack the immersive quality, visibility, and accuracy of LOD 2 and higher models. Consequently, the range of use cases for LOD 1 models is more limited.

To enhance the visibility and usability of urban 3D models, it is desirable to have LOD 2 and higher models developed over broader areas. However, due to cost-related challenges, this has not yet been realized. One potential solution to this issue is the development of methods to automatically create LOD 2 and higher urban 3D models with minimal human intervention and at a lower cost. We have explored such methods as a potential approach to overcoming these challenges.


Figure 2 3D model representation by LOD definition [6]

3. Development of an Automated Pipeline for Urban 3D Model Creation and Use Case

We focused particularly on the LOD 2 model and explored a pipeline capable of automating the creation of models that were traditionally made manually. There have been urban 3D products reconstructed from satellite imagery, but the positional accuracy of those models is on the order of meters due to the distance of the satellite from the ground surface. Our research and development efforts aim to create urban 3D models with higher precision and resolution, achieving accuracy on the order of several to tens of centimeters. This is accomplished by utilizing aerial survey photographs, which are taken much closer to the ground surface than satellite images and are also used in the field of public surveying. In this section, we introduce the automated creation process we have considered thus far, broadly divided into three steps.

STEP (1) Reconstruction of 3D Point Clouds from Aerial Survey Photographs

Using multiple aerial photographs taken under specific conditions, it is possible to reconstruct extensive 3D point clouds of the ground surface. Aerial photographs, as the name suggests, are images captured from the air using aircraft such as Cessna planes or helicopters, covering large areas of the ground. The technique of obtaining information about ground objects and terrain from these images is known as aerial photogrammetry.

In aerial photogrammetry, photographs are taken along a flight path with an overlap rate of approximately 80%. These overlapping images are then processed using a technique called Structure from Motion (SfM) to reconstruct the 3D shapes of objects on the ground, as well as to estimate the camera positions and orientations at the time of capture. Following this, a method known as Multi-View Stereo (MVS) is used to generate dense point clouds [7]. Through this process, large urban areas can be recreated as 3D point clouds (Figure 3 A).
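As a rough illustration of what an overlap rate of about 80% implies, the sketch below computes the distance between consecutive exposures and the approximate number of images in which each ground point appears along the flight line. The 1000 m along-track image footprint is an assumed value chosen for illustration, not a figure from our survey, and the function names are hypothetical.

```python
def exposure_spacing(footprint_length_m, overlap):
    """Distance between consecutive exposures for a given forward overlap.

    footprint_length_m: along-track ground coverage of one image (assumed).
    overlap: fraction of each image shared with the next one, in [0, 1).
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    return footprint_length_m * (1.0 - overlap)


def views_per_ground_point(overlap):
    """Approximate number of consecutive images that contain a ground point."""
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    return 1.0 / (1.0 - overlap)


spacing = exposure_spacing(1000.0, 0.8)
views = views_per_ground_point(0.8)
print(f"exposure every {spacing:.0f} m, about {views:.0f} views per ground point")
```

Seeing the same point from several positions is what allows SfM to triangulate it, which is one reason the overlap is kept high.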

It’s also worth noting that SfM/MVS can be applied using images captured by drones, making it an effective approach for reconstructing smaller, more localized areas.


STEP (2) Separation and Structuring of 3D Point Clouds


To create a structured 3D model for each building, the dense point cloud reconstructed by SfM/MVS needs to be separated into individual buildings. For instance, by using the perimeter information of buildings maintained as public data, it's possible to segment out the point clouds corresponding to the buildings (Figure 3 B).
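The segmentation by building perimeter can be sketched as a point-in-polygon test on the horizontal coordinates of each point. This is a minimal sketch, assuming footprints are simple polygons given as lists of (x, y) vertices and that footprints do not overlap; the building IDs, coordinates, and function names are made up for illustration, and a production pipeline would typically use a spatial index rather than this brute-force loop.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through y?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def segment_by_footprints(points, footprints):
    """Assign each (x, y, z) point to the first footprint that contains it.

    footprints: dict mapping a building ID to its list of (x, y) vertices.
    Points outside every footprint (ground, trees, etc.) are dropped.
    """
    segments = {building_id: [] for building_id in footprints}
    for point in points:
        for building_id, polygon in footprints.items():
            if point_in_polygon(point[0], point[1], polygon):
                segments[building_id].append(point)
                break
    return segments


footprints = {
    "bldg_A": [(0, 0), (10, 0), (10, 6), (0, 6)],
    "bldg_B": [(20, 0), (30, 0), (30, 10), (20, 10)],
}
points = [(5, 3, 8.0), (25, 5, 12.0), (15, 3, 0.2), (1, 1, 7.5)]
segments = segment_by_footprints(points, footprints)
print({building_id: len(pts) for building_id, pts in segments.items()})
```

In this toy input the point at (15, 3) lies between the two buildings and is discarded, which mirrors how ground and vegetation points fall out of the per-building point clouds.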

Next, each segmented building point cloud is reconstructed into a polygonal object that includes the roof shapes consistent with LOD 2. Most buildings are composed of a combination of geometric planes and curved surfaces that define their shape. By extracting planar information from the building point cloud and calculating the optimal 3D structure enclosed by these planes, it is possible to automatically reconstruct 3D models of buildings that represent their roof shapes in accordance with LOD 2 standards (Figure 3 C, D). Alternatively, if segmentation and structuring are bypassed, and a dense point cloud is directly converted into a polygonal mesh, a TIN model (as introduced in Section 1, upper part of Figure 1) can be created.
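One common way to extract planar information from a point cloud is RANSAC plane fitting, applied repeatedly so that each pass peels off one roof plane. The sketch below uses that technique purely as an illustration; this article does not specify which plane-extraction method the pipeline uses, and the threshold, iteration count, and synthetic gable-roof points are assumptions.

```python
import random


def plane_from_points(p, q, r):
    """Plane (a, b, c, d) with unit normal through three points, or None if collinear."""
    ux, uy, uz = q[0] - p[0], q[1] - p[1], q[2] - p[2]
    vx, vy, vz = r[0] - p[0], r[1] - p[1], r[2] - p[2]
    a = uy * vz - uz * vy            # normal = u x v
    b = uz * vx - ux * vz
    c = ux * vy - uy * vx
    norm = (a * a + b * b + c * c) ** 0.5
    if norm < 1e-12:
        return None
    a, b, c = a / norm, b / norm, c / norm
    return a, b, c, -(a * p[0] + b * p[1] + c * p[2])


def ransac_plane(points, threshold=0.05, iterations=200, seed=0):
    """Return the plane supported by the most points, and those inlier points."""
    rng = random.Random(seed)
    best_plane, best_inliers = None, []
    for _ in range(iterations):
        plane = plane_from_points(*rng.sample(points, 3))
        if plane is None:
            continue
        a, b, c, d = plane
        inliers = [p for p in points
                   if abs(a * p[0] + b * p[1] + c * p[2] + d) <= threshold]
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = plane, inliers
    return best_plane, best_inliers


# Synthetic gable roof with its ridge at x = 4 (made-up data, no noise).
slope_1 = [(x, y, 8.0 + 0.5 * x) for x in range(0, 4) for y in range(6)]
slope_2 = [(x, y, 12.0 - 0.5 * x) for x in range(5, 8) for y in range(6)]
roof_points = slope_1 + slope_2

plane_1, inliers_1 = ransac_plane(roof_points)
remaining = [p for p in roof_points if p not in inliers_1]
plane_2, inliers_2 = ransac_plane(remaining)
print(len(inliers_1), len(inliers_2))
```

The extracted planes would still need to be intersected and closed into a watertight solid to obtain an LOD 2 building; that optimization step is not shown here.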


STEP (3) Applying Building Exterior Textures to the 3D Model Using Aerial Survey Photographs


Finally, aerial photos are used again to apply textures to the surfaces of the reconstructed 3D building models (Figure 3 E). The reconstructed 3D models and the information about the camera's position, orientation, and parameters share common three-dimensional coordinate information. Therefore, by three-dimensional coordinate transformation, it's possible to extract parts of the aerial photos where the corresponding buildings are positioned and automatically apply textures to the surfaces of the 3D models.
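The coordinate transformation mentioned above can be sketched with an ideal pinhole camera model, which maps a 3D vertex of the building model to a pixel in an aerial photograph. This is a simplified illustration that ignores lens distortion; the camera pose, focal length, and principal point below are assumed values, and `project_to_image` is a hypothetical helper rather than part of our pipeline.

```python
def project_to_image(point, camera_center, rotation, focal_px, principal_point):
    """Project a world point to pixel coordinates with a pinhole model.

    rotation: 3x3 world-to-camera rotation matrix as nested lists.
    focal_px: focal length expressed in pixels.
    Returns (u, v), or None if the point is behind the camera.
    """
    # Translate into the camera-centred frame, then rotate into camera axes.
    t = [point[i] - camera_center[i] for i in range(3)]
    x_cam, y_cam, z_cam = (
        sum(rotation[row][col] * t[col] for col in range(3)) for row in range(3)
    )
    if z_cam <= 0:
        return None
    u = focal_px * x_cam / z_cam + principal_point[0]
    v = focal_px * y_cam / z_cam + principal_point[1]
    return u, v


# Assumed nadir-looking camera 1000 m above the origin:
# camera x = world x, camera y = -world y, camera z = -world z (pointing down).
R_NADIR = [[1, 0, 0], [0, -1, 0], [0, 0, -1]]
camera_center = (0.0, 0.0, 1000.0)
focal_px = 10000.0
principal_point = (5000.0, 4000.0)

below = project_to_image((0.0, 0.0, 0.0), camera_center, R_NADIR,
                         focal_px, principal_point)
east = project_to_image((100.0, 0.0, 0.0), camera_center, R_NADIR,
                        focal_px, principal_point)
above = project_to_image((0.0, 0.0, 1500.0), camera_center, R_NADIR,
                         focal_px, principal_point)
print(below, east, above)
```

Projecting every vertex of a roof or wall polygon in this way yields the image region to cut out as its texture, provided the surface is actually visible from that camera position.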


Figure 3 Schematic diagram of the automated 3D model creation pipeline

As part of our vision for utilizing urban 3D models, we are exploring the potential for enhancing radio wave simulations. By using LOD 2 urban 3D models, which offer higher geometric fidelity than LOD 1 models, it is expected that we can perform ray-tracing simulations that more accurately replicate real-world conditions (Figure 4).


Figure 4 Application example in a ray-tracing radio simulation

4. Conclusion and Future Outlook

In this article, we have introduced a research and development case study focused on methods for automatically creating urban 3D models. Specifically, our exploration of the creation of LOD 2 models has shown that it is feasible to implement an automated pipeline that significantly reduces manual processes. Given the nature of urban 3D models as digital maps, regular updates are essential. However, relying on manual labor and incurring high costs for each update is not ideal.

By advancing the automatic creation technology introduced in this article, we aim to facilitate the development and updating of urban 3D models, thereby contributing to the construction of future digital twin platforms and the expansion of use cases.

Writer: Yuta Murayama

References
