Towards More Resilient Mobile Communication

1. The Importance of Availability in Communication Systems

Today, it has become commonplace for people to own smartphones and use various applications. In the future, services essential to social life, such as electronic payments and public services, are expected to be delivered as applications more widely than ever before. Using these convenient services, however, presupposes that communication is available. Availability is therefore of utmost importance: if it is low, the value of these services is diminished.

On the other hand, telecommunication operators cannot avoid failures caused by equipment deterioration or natural disasters. No matter how many measures are taken, failures are difficult to eliminate completely. If maximizing the availability of communication services to users is the ultimate mission of an operator, then fault tolerance, as a means of enhancing availability, becomes an important metric for future mobile systems.

2. Challenges of Fault Tolerance in Mobile Systems

In the currently prevalent 5G systems, we see two main challenges in terms of fault tolerance:

(i) The core network (CN), which provides the control plane of a mobile system, is prone to congestion
(ii) The system is vulnerable to disruptions between the Radio Access Network (RAN) and the CN

Core Network (CN) is prone to congestion

As shown in Figure 1, a nationwide operator accommodates millions of User Equipment devices (UEs) using RANs, CNs, and the Transport Network (TN) that connects them.

Figure 1. Typical Deployment of a 5G System

While there are millions of UEs and tens of thousands of RANs, there are only a few CNs, so a CN operates under high load even in normal conditions. Furthermore, a CN executes each requested procedure by invoking functional blocks called Network Functions (NFs), and it must do so without introducing inconsistencies in data such as UE locations and authentication information. The invocation of these functions is therefore treated as a transaction. In other words, a typical CN continuously performs costly and complex message exchanges, similar to a bucket relay, for millions of UEs.

When a problem arises during a procedure, additional message exchanges are generated to handle it. In abnormal situations, the message exchange, which is already large in scale and inherently costly, is flooded with these additional messages. This is the problem of congestion within the CN. When the CN becomes congested, a widespread outage of the mobile system follows, reducing availability [1].
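
The amplification described above can be illustrated with a rough back-of-the-envelope model. The sketch below is not taken from any 5G specification: the function name `cn_messages`, the retry policy (a failed procedure is retried from scratch and still costs a full chain of NF messages), and every number are assumptions chosen only for illustration.

```python
def cn_messages(num_procedures, nf_hops, failure_rate, max_retries):
    """Expected NF-to-NF messages when each failed procedure is retried from scratch.

    Simplifying assumption: every attempt, successful or not, costs nf_hops messages.
    """
    # Attempt k (k = 0 is the first try) only happens if the k earlier attempts all failed.
    expected_attempts = sum(failure_rate ** k for k in range(max_retries + 1))
    return num_procedures * nf_hops * expected_attempts


# Hypothetical figures: one million procedures, ten NF invocations each.
normal = cn_messages(1_000_000, nf_hops=10, failure_rate=0.01, max_retries=3)
degraded = cn_messages(1_000_000, nf_hops=10, failure_rate=0.9, max_retries=3)

print(f"normal:        {normal:,.0f} messages")
print(f"degraded:      {degraded:,.0f} messages")
print(f"amplification: {degraded / normal:.2f}x")
```

Even in this simplified model, the message volume grows precisely when the CN is least able to absorb it, which is the congestion pattern the text describes.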

Vulnerable to disruptions between the Radio Access Network (RAN) and the CN

In Figure 2, UE#a communicates using the User Plane Function (UPF) installed at a nearby Multi-access Edge Computing (MEC) site, while UE#b communicates via a UPF installed in a central data center. The data communication path for UE#a is independent of the central data center.

Figure 2. Disruption between RAN and CN due to TN failure

Now, assume that a disruption occurs at the X mark in the figure, disconnecting RAN#A and RAN#B from the CN. UE#b obviously loses communication because its data path is cut. However, UE#a loses communication as well: once RAN#A and RAN#B are cut off from the control of the CN, their wireless connections stop. A disruption thus causes a communication failure even when it occurs at a point unrelated to the data communication path. Even if the radio itself is working properly, a RAN that cannot communicate with the CN cannot establish data communication paths, so communication within the disrupted area is impossible. This is the problem of vulnerability to disruptions between RAN and CN.
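
The dependency described above can be sketched as a small reachability check. Figure 2 is not reproduced here, so the topology below is hypothetical: the node names (`TN-edge`, `TN-core`, `UPF-MEC`, `UPF-central`), the attachment of UE#a to RAN#A and UE#b to RAN#B, and the position of the failed link are all assumptions made for illustration.

```python
from collections import deque


def reachable(links, src, dst):
    """Breadth-first search over undirected links."""
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False


def can_communicate(links, ran, upf, cn="CN"):
    """Rule from the text: a RAN needs both the CN (control) and the UPF (data)."""
    return reachable(links, ran, cn) and reachable(links, ran, upf)


# Hypothetical topology; the real layout of Figure 2 may differ.
TOPOLOGY = frozenset({
    ("RAN#A", "TN-edge"),
    ("RAN#B", "TN-edge"),
    ("TN-edge", "UPF-MEC"),
    ("TN-edge", "TN-core"),      # assumed location of the X mark
    ("TN-core", "CN"),
    ("TN-core", "UPF-central"),
})
after_failure = TOPOLOGY - {("TN-edge", "TN-core")}

for label, links in (("before", TOPOLOGY), ("after", after_failure)):
    ue_a = can_communicate(links, "RAN#A", "UPF-MEC")
    ue_b = can_communicate(links, "RAN#B", "UPF-central")
    print(f"{label}: UE#a={ue_a}, UE#b={ue_b}")
```

In this sketch, RAN#A can still reach the MEC-site UPF after the failure, yet UE#a is reported as unable to communicate because the control path to the CN is gone.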

3. Studies for Realizing a Fault-Tolerant Mobile System

An example of a fault-tolerant communication system is the Internet. The Internet is composed of many Autonomous Systems (ASs), and because of its autonomous and decentralized nature, failures within or between some ASs are unlikely to cause the Internet as a whole to malfunction. By incorporating the autonomy and decentralization observed in the Internet, we consider it possible to realize a mobile system in which communication to devices is less likely to be disrupted even in the event of partial failures.

To make such a drastic architectural change, we first need to identify the minimum requirements that a mobile system must meet. A mobile system is a communication system that provides uninterrupted voice and data communication services, based on the customer's subscription information, to a device that keeps moving. From a technical point of view, a mobile system can be described as a "communication system in which RAN and CN share information guaranteed by AAA, depending on the mobility of UE and its sessions, resulting in the emergence of communication paths applying User Plane policies." To improve fault tolerance, it is important to determine how to fulfill the three requirements contained in this description.

We have summarized the three major requirements of a mobile system as follows:

1. Authentication, Authorization, and Accounting (AAA)
2. UE/Session Mobility
3. User Plane (U-plane)

4. Enhancing Fault Tolerance with C-plane RAN Core Convergence

The problems explained so far arise from the fact that CNs exist only in a few central locations. We have therefore proposed the concept of an "Autonomous Decentralized Mobile System (ADMobile)," in which the control plane (C-plane) functionality of the CN is integrated into the RAN so that RANs accomplish mobile communication autonomously and cooperatively, much like the Internet (Figure 3). The concept of integrating CN functionality into the RAN is generally called RAN Core Convergence. For the U-plane, RAN Core Convergence is being considered for 6G [2]. ADMobile focuses on C-plane RAN Core Convergence in particular, and a RAN with integrated CN C-plane functionality is called an AD-RAN.

Figure 3. Overview of Autonomous Decentralized Mobile System

In ADMobile, all procedures other than access to the central subscriber database (Subscriber DB) are completed within a single AD-RAN. This distributes the load of accommodating millions of UEs across tens of thousands of AD-RANs. Moreover, we have realized the CN C-plane functionality as a collection of small functions instead of colossal NFs, which yields a CN C-plane with no bucket relays between NFs. For these reasons, the congestion issues seen in CNs are less likely to occur in ADMobile.
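
To give a feel for the scale of this load distribution, the arithmetic can be sketched as follows. The concrete numbers are assumptions standing in for "millions of UEs," "a few CNs," and "tens of thousands of AD-RANs." The even spread of UEs is also an assumption, and access to the central Subscriber DB is left out of the picture.

```python
def ues_per_node(total_ues, num_nodes):
    """UEs each control node serves if the load is spread evenly (rounded up)."""
    return -(-total_ues // num_nodes)  # ceiling division on integers


# Hypothetical figures chosen only to match the orders of magnitude in the text.
TOTAL_UES = 10_000_000
NUM_CNS = 5
NUM_AD_RANS = 20_000

per_cn = ues_per_node(TOTAL_UES, NUM_CNS)
per_ad_ran = ues_per_node(TOTAL_UES, NUM_AD_RANS)

print(f"UEs per CN:     {per_cn:,}")
print(f"UEs per AD-RAN: {per_ad_ran:,}")
print(f"reduction:      {per_cn // per_ad_ran:,}x")
```

Under these assumed figures, each control node handles a population several orders of magnitude smaller than a central CN would, although real deployments would see uneven load across sites.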

Figure 3 illustrates the case of a disruption at the X mark in ADMobile. Because the RAN itself provides the CN C-plane functionality, partitioning between the RAN and the CN cannot happen, and the radio outages caused by such partitioning do not occur either. Since the CN C-plane functionality is co-located with the RAN, communication remains possible within the disrupted area, and UEs may also be able to reach the Internet through nearby MEC sites. After the TN failure, the data communication path for UE#b, which originally went through the central route, autonomously converges onto a route through the MEC site. Also, when the TN recovers from the failure, congestion at the CN theoretically does not occur because the CN functionality is distributed. In this way, ADMobile overcomes both congestion at the CN and vulnerability to disruptions between RAN and CN.
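
The path re-convergence described above can be sketched as a simple selection rule: an AD-RAN uses the first UPF, in preference order, that it can still reach. This is only an illustration, not ADMobile's actual mechanism. The topology, the node names, and the functions `connected_nodes` and `select_upf` are hypothetical, and the sketch assumes the UE is already authenticated so that the central Subscriber DB is not needed.

```python
def connected_nodes(links, start):
    """Return every node reachable from start over undirected links."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for a, b in links:
            other = b if a == node else a if b == node else None
            if other is not None and other not in seen:
                seen.add(other)
                stack.append(other)
    return seen


def select_upf(links, ad_ran, preferred_upfs):
    """Pick the first UPF in preference order that the AD-RAN can still reach."""
    nodes = connected_nodes(links, ad_ran)
    for upf in preferred_upfs:
        if upf in nodes:
            return upf
    return None  # no UPF reachable: only local communication remains possible


# Hypothetical topology; control is local to each AD-RAN, so no CN node is needed.
TOPOLOGY = frozenset({
    ("AD-RAN#A", "TN-edge"),
    ("AD-RAN#B", "TN-edge"),
    ("TN-edge", "UPF-MEC"),
    ("TN-edge", "TN-core"),      # assumed location of the X mark
    ("TN-core", "UPF-central"),
})
after_failure = TOPOLOGY - {("TN-edge", "TN-core")}
UE_B_PREFERENCE = ["UPF-central", "UPF-MEC"]

print("before:", select_upf(TOPOLOGY, "AD-RAN#B", UE_B_PREFERENCE))
print("after: ", select_upf(after_failure, "AD-RAN#B", UE_B_PREFERENCE))
```

In this sketch, UE#b's traffic uses the central UPF while the TN is healthy and falls back to the MEC-site UPF once the link is cut, without any decision by a central node.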

5. Future Outlook

No matter how advanced communication features become, they cannot be effective if there is no connection. ADMobile is a mobile system that focuses on the fault tolerance that autonomy and decentralization bring to the Internet. Going forward, we plan to advance the proof of concept for ADMobile and conduct experiments simulating TN failures to evaluate its congestion suppression and partition tolerance.

ADMobile, as covered in this article, was presented at the Technical Committee on Network Systems of the Institute of Electronics, Information and Communication Engineers (IEICE), held in Okinawa on March 1, 2024.

Presentation Material (Japanese)

References
[1] SoftBank Research Institute of Advanced Technology, “Toward the Creation of a Robust and Scalable Mobile Network,” https://www.softbank.jp/corp/technology/research/story-event/003/, Feb. 2023.
[2] Nokia Bell Labs, “Communications in the 6G era,” https://www.bell-labs.com/institute/white-papers/communications-6g-era-white-paper/, Sep. 2020.

Writers: Hiroki Watanabe, Katsuhiro Horiba
