SoftBank to Launch "AI Data Center GPU Cloud"
Powered by "Infrinia AI Cloud OS"
as Part of its Neocloud Business in October 2026
Enabling efficient and flexible AI workloads
across AI model development, inference, and data processing
May 25, 2026
SoftBank Corp.
SoftBank Corp. ("SoftBank") today announced that, under its new growth strategy "Activate AI for Society," it will launch "AI Data Center GPU Cloud," a cloud service powered by "Infrinia AI Cloud OS," a software stack*1 for AI data centers, as part of its neocloud*2 business in October 2026. Through this initiative, SoftBank will provide integrated AI computing infrastructure and software that can be securely used within Japan. Ahead of the launch, SoftBank will begin offering a beta version today and start using the service internally across its group companies.
"AI Data Center GPU Cloud" is a cloud service that combines SoftBank's AI computing infrastructure with "Infrinia AI Cloud OS," an AI data center software stack that provides Kubernetes as a Service (KaaS)*3 for multi-tenant environments and Inference as a Service (Inf-aaS) for Large Language Model inference via APIs.
By leveraging advanced GPU-accelerated AI computing infrastructure, including NVIDIA GB200 NVL72 deployed in SoftBank's Japan-based data centers, customers can efficiently and flexibly execute a wide range of AI workloads—from model training and inference to data processing—while ensuring secure data management and operations within Japan.
In addition, the service provides centralized and automated management of GPU resources, Kubernetes-based operations, and AI workload execution, enabling optimized processing for each workload. This reduces the effort required to set up development environments and manage compute resources, lowering operational burdens and costs while providing a stable platform that can flexibly adapt to evolving requirements.
Going forward, based on its "Telco AI Cloud" initiative*4 to build next-generation social infrastructure for the AI era by leveraging its telecommunications foundation, SoftBank aims to optimize AI processing from training to inference by integrating "AI Data Center GPU Cloud" with AI-RAN, while building a sovereign, distributed AI infrastructure that delivers low latency and high reliability.
Key Features of "AI Data Center GPU Cloud"
1. Support for a wide range of workloads from training to inference
The service provides a GPU environment that supports a broad range of AI workloads, from compute-intensive training, such as LLM development, to latency-sensitive inference. Built on advanced accelerated computing platforms, including NVIDIA GB200 NVL72, the service combines high-performance NVIDIA Blackwell GPUs, interconnected via NVIDIA NVLink, with high-performance storage, enabling efficient LLM training and complex inference processing even in multi-tenant environments.
2. Flexible operations with Kubernetes as a Service (KaaS)
By leveraging Kubernetes, the service enables centralized and automated management of large-scale container environments, significantly reducing operational complexity associated with configuration changes and scaling of development environments. Container technology also accelerates application startup and streamlines deployment and scaling, enabling faster execution of the entire process from AI model development to implementation and operation. In addition, Kubernetes-based load balancing ensures stable service delivery, while automatic recovery mechanisms in the event of failures provide high availability and service continuity.
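As a rough illustration of the kind of Kubernetes configuration that KaaS manages on the user's behalf, the sketch below builds a generic Deployment manifest in Python. It is not SoftBank's or Infrinia's API. The function name, workload name, image URL, health-check path, and port are all hypothetical, and the `nvidia.com/gpu` resource name assumes that the cluster runs the NVIDIA device plugin. Multiple replicas give a Service several pods to balance traffic across, and the liveness probe lets Kubernetes restart a container that stops responding.

```python
import json


def build_inference_deployment(name: str, image: str,
                               replicas: int = 2, gpus_per_pod: int = 1) -> dict:
    """Return a generic Kubernetes Deployment manifest (apps/v1) as a dict.

    Hypothetical helper for illustration only; not part of any SoftBank API.
    """
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            # Several replicas allow load balancing and survive a pod failure.
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            # Assumes the NVIDIA device plugin exposes GPUs
                            # under the "nvidia.com/gpu" resource name.
                            "resources": {"limits": {"nvidia.com/gpu": gpus_per_pod}},
                            # Kubernetes restarts the container if this probe
                            # keeps failing (path and port are assumptions).
                            "livenessProbe": {
                                "httpGet": {"path": "/healthz", "port": 8000},
                                "periodSeconds": 10,
                            },
                        }
                    ]
                },
            },
        },
    }


if __name__ == "__main__":
    manifest = build_inference_deployment(
        "demo-llm", "registry.example.com/demo-llm:latest"
    )
    # The JSON output can be saved to a file and applied with kubectl.
    print(json.dumps(manifest, indent=2))
```

In a managed service, manifests like this would typically be generated or templated by the platform, so users would not need to write them by hand.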
3. Model inference environment powered by Inference as a Service (Inf-aaS)
By automating the deployment and operation of model inference infrastructure on Kubernetes, the service supports the rapid development of inference APIs. This reduces the burden of infrastructure management and enables users to deploy inference environments quickly and reliably, simply by selecting their own models or other preferred AI models.
Junichi Miyakawa, President & CEO of SoftBank Corp., commented: "As AI becomes more deeply integrated into society, the source of competitiveness is expanding beyond AI itself to include the computing power and operational software that support it. Under our new growth strategy, 'Activate AI for Society,' SoftBank will provide integrated computing infrastructure and software that can be securely used within Japan as a neocloud provider. 'Infrinia AI Cloud OS' and 'AI Data Center GPU Cloud' will serve as core services in this initiative, strongly supporting customers' AI development and real-world deployment."
Charlie Boyle, vice president of DGX systems at NVIDIA, commented: "The transformation of telecommunications into an AI-native architecture requires a new foundation of AI infrastructure capable of handling the most complex sovereign AI workloads. SoftBank's deployment of the NVIDIA GB200 NVL72 and 'Infrinia AI Cloud OS' gives Japanese enterprises a high-performance, secure, and scalable platform to accelerate their industries."
About "Infrinia AI Cloud OS"
"Infrinia AI Cloud OS" is a software stack for AI data centers developed by the Infrinia team, which is responsible for building next-generation AI infrastructure architectures and systems. It enables the deployment of Kubernetes as a Service (KaaS) for multi-tenant environments and Inference as a Service (Inf-aaS) for Large Language Model inference via APIs, as part of GPU cloud services.
Compared with custom-built or in-house developed solutions, "Infrinia AI Cloud OS" is expected to reduce total cost of ownership (TCO) and operational burdens. As a result, it enables the rapid delivery of GPU cloud services that support efficient and flexible execution of AI workloads from model training to inference.
For more information on "Infrinia AI Cloud OS," please refer to the press release dated January 21, 2026, "SoftBank Corp. Announces 'Infrinia AI Cloud OS,' a Software Stack for AI Data Centers."
- [Notes]
- *1 A software stack is a set of software components and functions used together to build and operate systems and applications.
- *2 A neocloud is a group of cloud platforms optimized for large-scale AI workloads, providing high-performance GPU-centric infrastructure and AI-native services.
- *3 Kubernetes is an open-source system for automating the deployment and scaling of applications and for managing containerized applications.
- *4 For more information on the "Telco AI Cloud" initiative, please refer to the press release dated March 2, 2026, "SoftBank Corp. Announces Telco AI Cloud Vision to Build Social Infrastructure for the AI Era, Leveraging Its Telecommunications Foundation."
- SoftBank, the SoftBank name and logo are registered trademarks or trademarks of SoftBank Group Corp. in Japan and other countries.
- Other company, product and service names in this press release are registered trademarks or trademarks of the respective companies.