In Japan, traffic regulations stipulate that when a vehicle approaches an unsignalized crosswalk where another vehicle is stopped in front of it, the driver must come to a complete stop before proceeding. This poses a challenge for human operators who remotely monitor autonomous vehicles, as they oversee multiple vehicles simultaneously and may not always be able to intervene quickly and effectively.
SoftBank Corp. (TOKYO: 9434) is working to solve this issue and support human operators by utilizing multimodal AI, an AI system that collects and integrates multiple types of data—such as text, audio, images, and sensor information—to enable more comprehensive processing and analysis.
On November 5, 2024, SoftBank announced it had developed a multimodal AI that understands traffic (a “traffic understanding multimodal AI”) to support operators who remotely monitor autonomous driving. The traffic understanding multimodal AI aims to reduce operational costs and enhance vehicle safety, with the ultimate goal of fully unmanned operations. By running the multimodal AI on SoftBank's MEC (multi-access edge computing) and other low-latency edge AI servers, the system can understand an autonomous vehicle's situation in real time and provide reliable remote support for autonomous driving.
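To illustrate this edge-based setup, the sketch below shows how a vehicle-side client might stream forward-camera frames to a low-latency edge AI server and receive assessments in return. The endpoint URL, payload fields, and camera interface are hypothetical placeholders for illustration, not SoftBank's actual API.

```python
# Illustrative sketch only: a vehicle-side client streaming forward-camera frames
# to a low-latency edge AI server for near-real-time traffic assessment.
# The endpoint, payload format, and field names are assumptions.

import time
import requests  # third-party HTTP client

EDGE_ENDPOINT = "https://edge.example.com/v1/assess"  # hypothetical MEC-hosted endpoint

def monitor_vehicle(camera, vehicle_id: str, interval_s: float = 0.5) -> None:
    """Periodically send the latest frame and receive a traffic assessment."""
    while True:
        frame_jpeg = camera.capture_jpeg()  # placeholder camera interface
        resp = requests.post(
            EDGE_ENDPOINT,
            files={"frame": ("frame.jpg", frame_jpeg, "image/jpeg")},
            data={"vehicle_id": vehicle_id},
            timeout=1.0,  # keep the round trip short so assessments stay close to real time
        )
        resp.raise_for_status()
        assessment = resp.json()
        # Surface the model's recommendation to the remote operator's console.
        print(assessment.get("recommended_action", "no action suggested"))
        time.sleep(interval_s)
```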
In October 2024, SoftBank launched a field trial of this solution at Keio University Shonan Fujisawa Campus (SFC) in Fujisawa City, Kanagawa Prefecture, south of Tokyo. Through the trial, SoftBank researchers aim to verify whether the traffic understanding multimodal AI can provide effective remote support that keeps autonomous driving operations running smoothly, even when vehicles encounter unforeseen situations.
Multimodal AI has advanced understanding of Japan’s traffic conditions
How does the traffic understanding multimodal AI help? It takes forward-facing footage from autonomous vehicles, such as dashcam video, together with prompts about current traffic conditions, assesses complex driving situations and potential risks, and generates recommended actions for safe driving.
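The passage above describes the model's input-output flow. A minimal sketch of that flow follows, assuming a generic vision-language model interface; the `model.generate` call and the Assessment fields are illustrative placeholders, not SoftBank's implementation.

```python
# Minimal sketch (hypothetical interfaces): a vision-language model receives a
# dashcam frame plus a text prompt about the current traffic situation and
# returns a risk assessment with a recommended action.

from dataclasses import dataclass

@dataclass
class Assessment:
    situation: str           # e.g. "vehicle stopped before an unsignalized crosswalk"
    risk: str                # e.g. "pedestrian may cross from behind the stopped vehicle"
    recommended_action: str  # e.g. "come to a complete stop before proceeding"

def assess_frame(model, frame_jpeg: bytes, prompt: str) -> Assessment:
    """Query a multimodal model with one forward-facing frame and a prompt."""
    # `model.generate` stands in for whatever multimodal inference API is used.
    reply = model.generate(images=[frame_jpeg], prompt=prompt)
    return Assessment(
        situation=reply["situation"],
        risk=reply["risk"],
        recommended_action=reply["recommended_action"],
    )

# An example prompt in the spirit of the article: read the scene against
# Japanese traffic rules and suggest a safe action for the remote operator.
PROMPT = (
    "Describe the traffic situation ahead, identify potential risks under "
    "Japanese traffic regulations, and recommend a safe action."
)
```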
Furthermore, the underlying foundation model has been trained on a broad range of Japanese traffic information, including traffic manuals and regulations, as well as common driving scenarios, hard-to-foresee risk situations, and the corresponding countermeasures. This training gives the traffic understanding multimodal AI the broad knowledge needed for the safe operation of autonomous vehicles.
This remote support solution for autonomous driving is also being used in trials conducted by MONET Technologies, a joint venture between SoftBank, Toyota Motor Corporation and other Japan-based automotive manufacturers.
The traffic understanding multimodal AI's accuracy will keep improving as it continuously learns from the unpredictable driving risks and recommended actions encountered in real driving environments.
For more information, see this press release.
(Posted on November 20, 2024)
by SoftBank News Editors