Wallaroo.AI and Ampere Computing Collaborate to Bring Energy-Efficient, Low-Cost Machine Learning Inferencing to the Cloud

July 18, 2023

Joint Solution Helps Support Artificial Intelligence Growth with Breakthrough Performance, Reduced Infrastructure Requirements, Path to Sustainability Goals

NEW YORK – July 18, 2023 – Wallaroo.AI, the leader in scaling production machine learning (ML) from the cloud to the edge, today announced a strategic collaboration with Ampere Computing to create optimized hardware/software solutions that provide reduced energy consumption, greater efficiency, and lower cost per inference for cloud artificial intelligence (AI).

Ampere processors are inherently more energy efficient than traditional AI accelerators. Now, with an optimized low-code/no-code ML software solution and customized hardware, putting AI into production in the cloud has never been easier, more cost-effective (even when measured by cost per inference), or less energy-intensive.

“This Wallaroo.AI/Ampere solution allows enterprises to deploy easily, improve performance, increase energy efficiency, and balance their ML workloads across available compute resources much more effectively,” said Vid Jain, chief executive officer of Wallaroo.AI, “all of which is critical to meeting the huge demand for AI computing resources today while also addressing the sustainability impact of the explosion in AI.”

“Through this collaboration, Ampere and Wallaroo.AI are combining Cloud Native hardware and optimized software to make ML production within the cloud much easier and more energy-efficient,” said Jeff Wittich, Chief Product Officer at Ampere. “That means more enterprises will be able to turn AI initiatives into business value more quickly.”

Breakthrough Cloud AI Performance

One of the key advantages of the collaboration is the integration of Ampere’s built-in AI acceleration technology and Wallaroo.AI’s highly efficient Inference Server, part of the Wallaroo Enterprise Edition platform for production ML.

Benchmarks have shown as much as a 6x improvement over containerized x86 solutions on certain models, such as the open source ResNet-50 model. Tests were run using an optimized version of the Wallaroo Enterprise Edition on Arm64-based Dpsv5-series Azure virtual machines using Ampere® Altra 64-bit processors; however, the optimized solution will also be available for other cloud platforms.

Benefits of Energy-Efficient AI

Reduced Hardware Needs/Costs – With a $15.7 trillion (U.S.) potential contribution to the global economy by 2030 (PwC), demand for AI has never been higher. However, the graphics processing units (GPUs) used to train AI models are in high demand. The quantities required for AI – and especially for large ML models like ChatGPT and other large language models (LLMs) – mean they are often not a cost-effective solution for AI/ML. For many enterprises, it is a better alternative to run software like the highly optimized Wallaroo.AI inference server, which can cost-efficiently run many AI/ML workloads with similar performance using currently available, advanced CPUs.

Supporting Sustainability/ESG Goals – The MIT Technology Review states that one AI training model uses more energy in a year than 100 U.S. homes. This means facility costs (power, cooling, etc.) of running GPUs can severely impact cloud providers as well as the power grid. Many clients of cloud providers also have environmental, social, and governance (ESG) or sustainability initiatives that would be negatively impacted by large-scale adoption of AI with GPUs. Using optimized inference solutions on CPUs like the Ampere Altra Family of processors allows organizations to realize greater efficiency for inference workloads, meeting their need for AI/ML performance while simultaneously advancing their ESG goals for greater sustainability.

About Wallaroo.AI

Wallaroo.AI empowers enterprise AI teams to operationalize Machine Learning (ML) to drive positive outcomes with a software platform that enables deployment, observability, optimization and scalability of ML in the cloud, in decentralized networks, and at the edge. The unified Wallaroo.AI platform enables enterprise AI teams to easily deploy ML in seconds with no fuss or engineering overhead. They can then observe and optimize in real time from a self-service operations center. Enterprises can run at scale with 80% less infrastructure and turbocharge their Databricks and Cloud ML production workflows using familiar dev tools. Wallaroo.AI is backed by Microsoft’s venture fund, M12, and leading VCs including Boldstart, Contour Ventures, Eniac, Greycroft, and Ridgeline. Learn more at www.wallaroo.ai.

About Ampere

Ampere is a modern semiconductor company designing the future of cloud computing with the world’s first Cloud Native Processors. Built for the sustainable Cloud with the highest performance and best performance per watt, Ampere processors accelerate the delivery of all cloud computing applications. Ampere Cloud Native Processors provide industry-leading cloud performance, power efficiency and scalability. For more information visit Ampere Computing.


Wallaroo Public Relations

Matt Stubbs

