Artificial intelligence (AI) has the potential to surpass every transformative innovation of the past century, delivering benefits beyond our imagination in areas such as healthcare, productivity, and education. To run these complex AI workloads, the computing capacity of data centers around the world must scale exponentially. That insatiable demand for compute, however, exposes a serious challenge: data centers require enormous amounts of electricity to power this groundbreaking technology.
Today’s data centers already consume a great deal of electricity – roughly 460 terawatt-hours (TWh) each year, a figure equal to the entire electricity consumption of Germany. The rise of AI is expected to triple that figure by 2030, at which point it would exceed the total electricity consumption of India, the world’s most populous country.
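As a rough sanity check, the arithmetic behind that projection can be sketched as follows; the India figure below is an assumed ballpark for illustration, not a number from this article.

```python
# Back-of-the-envelope check of the projection above. The India figure
# is an assumed ballpark for illustration, not a number from the article.
todays_data_centers_twh = 460                     # global data centers today
projected_2030_twh = todays_data_centers_twh * 3  # "triple by 2030"

india_annual_twh = 1_300                          # assumption: India's yearly consumption

print(f"Projected 2030 data center demand: {projected_2030_twh} TWh")
print(f"Relative to India's consumption: {projected_2030_twh / india_annual_twh:.0%}")
```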
The AI models of the future will keep getting bigger and smarter, driving demand for more computing power and, with it, more electricity – a self-reinforcing cycle. Finding ways to reduce the power needs of these large data centers will be critical to achieving societal breakthroughs and delivering on the promise of AI.
In other words, AI cannot be achieved without electricity, and companies need to rethink how they approach every aspect of energy efficiency.
Reimagining the future of AI – a future powered by the Arm platform
Arm’s original products were designed for battery-powered devices and helped revolutionize the mobile phone. That energy-efficiency DNA, deeply embedded in everything Arm builds, is now prompting the industry to rethink how chips should be designed to meet the growing demands of AI.
In a typical server rack, compute chips alone can consume more than 50% of the power budget. Engineering teams are looking for ways to bring that number down, and every watt of reduction counts.
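To see why individual watts matter at rack scale, here is a minimal illustrative sketch; the rack budget, compute share, and efficiency gain are assumptions chosen for the example – the article only states that compute exceeds 50% of the budget.

```python
# Illustrative rack power budget. All constants are assumptions for the
# example; the article only says compute chips take >50% of the budget.
rack_budget_w = 15_000            # assumed per-rack power budget (watts)
compute_share = 0.55              # compute chips: just over half the budget
compute_w = rack_budget_w * compute_share

efficiency_gain = 0.10            # assumed 10% reduction in compute power draw
freed_w = compute_w * efficiency_gain

print(f"Compute draw: {compute_w:,.0f} W of a {rack_budget_w:,} W budget")
print(f"A 10% compute efficiency gain frees {freed_w:,.0f} W for more work")
```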
Because of this, the world’s largest AI cloud service providers are turning to Arm technology to reduce power consumption. Compared with other offerings in the industry, the new Arm Neoverse CPUs are the highest-performing and most power-efficient processors for cloud data centers. Neoverse gives leading cloud service providers the flexibility to customize chips for their most demanding workloads while delivering leading performance and energy efficiency. Every watt saved can be put toward more compute. That is why Amazon Web Services (AWS), Microsoft, Google, and Oracle now use Neoverse technology for general-purpose computing and CPU-based AI inference and training, and why the Neoverse platform is becoming the de facto standard in cloud data centers.
Recent announcements from across the industry include:
AWS Graviton, based on the Arm architecture: delivers 25% better AI inference performance on Amazon SageMaker, 30% better performance for web applications, 40% better performance for databases, and 60% better energy efficiency than comparable offerings.
Google Cloud Axion, based on the Arm architecture: delivers 50% better performance and 60% better energy efficiency than traditional architectures, supporting CPU-based AI inference and training as well as services such as YouTube and Google Earth.
Microsoft Azure Cobalt, based on the Arm architecture: delivers 40% higher performance, supports services such as Microsoft Teams, and pairs with Maia accelerators to drive Azure’s end-to-end AI architecture.
Oracle Cloud’s Ampere Altra Max, based on the Arm architecture: delivers 2.5 times higher performance and 2.8 times lower power consumption per rack than traditional equivalents, and is used for generative AI inference workloads such as summarization, tokenization of data for large language model training, and bulk inference use cases.
Clearly, Neoverse greatly improves the performance and energy efficiency of general-purpose computing in the cloud. Partners are also finding that Neoverse brings the same benefits to accelerated computing. Large-scale AI training requires purpose-built accelerated computing architectures, such as the NVIDIA Grace Blackwell platform (GB200), which combines NVIDIA’s Blackwell GPU architecture with the Arm-based Grace CPU. This Arm-based compute architecture enables system-level design optimization, delivering 25 times lower energy consumption and 30 times higher performance per GPU for large language model workloads compared with the NVIDIA H100 GPU. These optimizations yield disruptive gains in performance and energy savings, all made possible by the unprecedented flexibility in chip customization that Neoverse provides.
As Arm-based deployments continue to expand, these companies could save up to 15% of their total data center energy consumption. Those savings can be used to drive additional AI workloads within the same power envelope, without adding to the energy burden. Put differently, the saved energy is equivalent to running an additional 2 billion ChatGPT queries, powering a quarter of daily web search traffic, lighting 20% of American homes, or powering a country the size of Costa Rica. That is a remarkable impact on both energy consumption and environmental sustainability.
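To make the scale of such savings concrete, here is a minimal back-of-the-envelope sketch; the consumption base and household figure are assumptions chosen for illustration and do not attempt to reproduce the article’s specific equivalences.

```python
# Rough illustration of what a 15% saving could mean. None of these
# constants come from the article; they are assumptions for the example.
base_consumption_twh = 1_000        # assumed data center consumption base
savings_fraction = 0.15             # the "up to 15%" figure from the article

savings_twh = base_consumption_twh * savings_fraction
avg_home_kwh_per_year = 10_500      # assumed average annual household usage

homes_powered = savings_twh * 1e9 / avg_home_kwh_per_year   # 1 TWh = 1e9 kWh
print(f"Savings: {savings_twh:.0f} TWh/year, "
      f"enough for ~{homes_powered / 1e6:.0f} million homes")
```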
Arm CPUs are fundamentally transforming AI while benefiting the planet. The Arm architecture is the cornerstone of future AI computing.