Servers & Systems: The Right Compute
BillMannel

Arm processors take their place as a credible alternative to x86 processors for HPC applications

Learn how HPE’s new Arm-based high performance computing server, the HPE Apollo 80, brings new technologies to HPC for the first time, including the Scalable Vector Extension (SVE) and direct-attached High Bandwidth Memory (HBM).

Arm processors continue to gain momentum across the high performance computing (HPC) industry, so we’re excited to announce the upcoming availability of HPE’s newest Apollo system: the Arm-based HPE Apollo 80.

Previously announced as a Cray CS500, the HPE Apollo 80 system is a Fujitsu A64FX-based platform and will be available with HPE’s leading cluster management solution (HPE Performance Cluster Manager, or HPCM), leading compiler suite (Cray Programming Environment), and HPE Pointnext support. Shipping in early August, the HPE Apollo 80 further strengthens the integration of Cray and HPE technologies and product lines.

Bringing a new HPC system to market doesn’t happen in a silo and it doesn’t happen overnight. Understanding why this system matters means understanding how the industry got here.

Modern technical computing challenges drive an insatiable appetite for HPC power. For the last two decades, that appetite has been fed by scale-out systems built on x86 processors.

Overall, x86 technology has worked well, helping propel discovery across every industry and field of research for years. These platforms have delivered maximum performance for parallelizable applications using generic industry-standard building blocks. They’ve also consistently delivered the best price/performance with a broad range of supported operating systems, applications, and cluster management tools. 

But with progress comes change. Science and research questions are only growing in complexity and they’re putting new demands on HPC solutions.

While x86 processors have served well, none were specifically designed for HPC. In fact, the microarchitectures aren’t even specifically designed for servers. Additionally, while current x86 processors have impressive floating-point capabilities, those capabilities have come at a significant cost in power efficiency.

Into these circumstances came Arm server processors. Designed for great flexibility and low power consumption, Arm technology has been on the ascent as a credible alternative to x86 solutions. Just consider the numbers. As of February 2020, Arm partners had shipped more than 160 billion Arm-based chips, averaging more than 22 billion per year over the past three years.

Bolstering the overall view of Arm processors as a viable solution for HPC, government agency-led research worldwide has increasingly focused on Arm platforms.

Licensed by Arm Holdings, Arm processor designs are available from a wide variety of vendors in a range of sizes and applications, from microcontrollers to phones to tablets to servers. Server-class processors are available from Fujitsu, Marvell (Cavium), and Ampere and have previously been offered by Calxeda, TI, and others.

Cray, now a Hewlett Packard Enterprise company, was instrumental in deploying early Arm HPC systems. The Isambard 1 supercomputer, designed by Cray in partnership with the GW4 Alliance and the Met Office (the United Kingdom’s national weather service), was the first Arm-based supercomputer in the world to go into production use. Isambard 1 contains 168 dual-processor nodes based on the Marvell ThunderX2 processor, for a total of 10,752 Armv8 cores.

HPE recently developed the HPE Apollo 70, based on the Marvell (Cavium) ThunderX2 processor, specifically for the HPC market. The HPE Apollo 70 was the building block for Sandia National Laboratories’ breakthrough Arm supercomputing cluster “Astra.” Composed of 2,592 compute nodes with a theoretical peak performance of more than 2.3 petaflops, Astra debuted at number 204 on the November 2018 Top500 supercomputing list, where it was the only Arm cluster.

The Apollo 70 also served as the building block for the Catalyst UK program, an industry and academia collaboration aimed at accelerating the UK’s adoption of supercomputing applications. The program established 64-node Arm clusters at three leading universities, together one of the largest Arm-based supercomputer deployments in the world. The Catalyst UK work continues, and a growing number of HPC codes are being ported and tuned for the Arm architecture as a result.

Our Arm story continues today with an alliance forged by Cray and Fujitsu in 2019. Building on these companies’ strong legacies in vector processing and supercomputing, HPE can now offer HPC servers based on the new A64FX Arm processor from Fujitsu. This processor is the same one that Fujitsu used in its recently announced “Fugaku” supercomputer. Ranked #1 on the June 2020 Top500 supercomputing list, Fugaku delivers over 400 petaflops on the High Performance Linpack benchmark, 2.8 times faster than the runner-up.

Specifically designed for HPC and supporting the new Arm Scalable Vector Extension (SVE), the A64FX processor represents the next generation of Arm processors. It offers greatly improved floating-point performance and high bandwidth memory (HBM) for improved memory performance.

Built on 7nm FinFET technology with more than 8.5 billion transistors, the A64FX was purpose-built for HPC servers, and its architects focused on delivering both performance and performance per watt.

Key to the floating-point performance is support for the new Armv8-A SVE architecture, which both Cray and Fujitsu worked with Arm to develop. The A64FX has 48 cores, each with two 512-bit-wide SIMD SVE units, enabling the processor to deliver over 3.1 teraflops of double-precision performance. Another key innovation is the 32 GB of direct-attached high bandwidth memory (HBM).
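To make the relationship between core count, vector width, and peak throughput concrete, here is a minimal Python sketch of the usual back-of-the-envelope peak calculation. The core count, the number of SIMD units, and the 512-bit width come from the paragraph above. The clock frequencies and the assumption that each unit retires one fused multiply-add (two flops) per lane per cycle are illustrative assumptions, not figures from this article, so the results are estimates only.

```python
def peak_dp_teraflops(cores, simd_units_per_core, simd_bits, clock_ghz):
    """Estimate theoretical double-precision peak in teraflops.

    Assumes each SIMD unit completes one fused multiply-add (2 flops)
    per 64-bit lane per cycle. That assumption is ours, not the article's.
    """
    lanes = simd_bits // 64            # 64-bit doubles that fit in one vector
    flops_per_lane_per_cycle = 2       # FMA = multiply + add (assumed)
    flops_per_cycle = cores * simd_units_per_core * lanes * flops_per_lane_per_cycle
    gigaflops = flops_per_cycle * clock_ghz   # flops/cycle * 1e9 cycles/s
    return gigaflops / 1000.0


# Clock frequencies below are hypothetical values for illustration only.
for clock_ghz in (1.8, 2.0, 2.2):
    tf = peak_dp_teraflops(cores=48, simd_units_per_core=2,
                           simd_bits=512, clock_ghz=clock_ghz)
    print(f"{clock_ghz:.1f} GHz -> {tf:.2f} TF double precision")
```

The estimate scales linearly with clock frequency, so the exact peak for a given system depends on the frequency at which its processors run.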

The 32 GB of HBM can deliver over 1 terabyte per second of memory bandwidth, improving performance for memory-intensive HPC applications.
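One common way to reason about why memory bandwidth matters is the roofline model, which caps attainable performance at the lesser of the compute peak and bandwidth multiplied by a kernel's arithmetic intensity. The sketch below applies it to the round figures quoted in this article (roughly 3.1 teraflops and 1 terabyte per second). The roofline model is our choice of illustration rather than something the article uses, and the triad kernel with its 24 bytes per iteration is a simplified assumption that ignores cache effects.

```python
def attainable_gflops(peak_gflops, bandwidth_gb_per_s, flops_per_byte):
    """Roofline estimate: performance is limited by compute or by memory traffic."""
    return min(peak_gflops, bandwidth_gb_per_s * flops_per_byte)


def machine_balance(peak_gflops, bandwidth_gb_per_s):
    """Arithmetic intensity (flops/byte) above which a kernel becomes compute-bound."""
    return peak_gflops / bandwidth_gb_per_s


# Round figures taken from the article; real systems will differ.
PEAK_GFLOPS = 3100.0      # "over 3.1 teraflops" of double precision
BANDWIDTH_GB_S = 1000.0   # "over 1 terabyte per second"

# Hypothetical kernel: triad a[i] = b[i] + s * c[i]
# 2 flops per iteration, 3 doubles (24 bytes) moved per iteration (assumed).
triad_intensity = 2 / 24

print("machine balance (flops/byte):", machine_balance(PEAK_GFLOPS, BANDWIDTH_GB_S))
print("triad estimate (gigaflops):",
      round(attainable_gflops(PEAK_GFLOPS, BANDWIDTH_GB_S, triad_intensity), 1))
```

Under these assumptions, a streaming kernel like the triad is limited by bandwidth rather than by the vector units, which is why higher memory bandwidth translates directly into higher performance for memory-intensive codes.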

Power efficiency is critical for HPC because many installations have thousands of nodes. Designed to meet aggressive power efficiency goals, the A64FX powered the system that topped the November 2019 Green500 list as the most power-efficient supercomputer in the world, delivering more than 16 gigaflops per watt.

Now meet the HPE Apollo 80.

The HPE Apollo 80 is a compact, cluster-ready solution with eight single-processor servers in a standard 19” 2U chassis. Support from HPE Performance Cluster Manager gives customers the tools they need to manage their Apollo 80 clusters with ease. The software provides system setup, hardware monitoring and management, health management, image management, software updates, and power management for systems of any scale, as well as integration with leading third-party system management tools. System administrators can spend less time managing their clusters and more time optimizing their use, maximizing return on investment and shortening time to discovery.
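As a rough sizing aid, the chassis figures above can be turned into per-rack totals. The sketch below uses the numbers from this article (eight single-processor nodes per 2U chassis, 48 cores and 32 GB of HBM per A64FX). The usable rack height is a hypothetical input, and the function and class names are ours for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChassisSpec:
    nodes: int = 8              # single-processor servers per chassis (from the article)
    height_u: int = 2           # chassis height in rack units (from the article)
    cores_per_node: int = 48    # A64FX cores per processor (from the article)
    hbm_gb_per_node: int = 32   # HBM per processor in GB (from the article)


def rack_totals(spec, usable_rack_units):
    """Return chassis, node, core, and HBM totals that fit in the usable rack space."""
    chassis = usable_rack_units // spec.height_u
    nodes = chassis * spec.nodes
    return {
        "chassis": chassis,
        "nodes": nodes,
        "cores": nodes * spec.cores_per_node,
        "hbm_gb": nodes * spec.hbm_gb_per_node,
    }


# Hypothetical example: a 42U rack with 6U set aside for switches and power.
print(rack_totals(ChassisSpec(), usable_rack_units=36))
```

Real deployments are also constrained by power and cooling per rack, which this sketch does not model.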

The Cray Programming Environment is also available for the HPE Apollo 80 solution. It is a fully integrated software suite with compilers, tools, and libraries designed to maximize programmer productivity, application scalability, and performance, a must for organizations that develop their own code. It has supported Arm for years, and Cray’s vector processing background has played a significant role both in defining the Arm SVE architecture (with Arm) and in producing an efficient compiler for the A64FX SVE implementation.

We’re already getting enthusiastic reports from customers about the Apollo 80 technology’s potential. Over in the UK, the University of Bristol will use the A64FX-based system next year. “We’ve been on a journey toward Arm-based supercomputing and the new HPE and Fujitsu Arm system will bring us closer to that reality,” said Simon McIntosh-Smith, professor of HPC at the university. “We think the A64FX system will be the exciting core of the new Isambard 2 system which is going live in the next few months.”

HPE Apollo 80 customers include Los Alamos National Laboratory, Oak Ridge National Laboratory, RIKEN Center for Computational Science, Stony Brook University, University of Bristol, Leibniz Supercomputing Centre (LRZ), and the Centre for Development of Advanced Computing (C-DAC).

The HPE Apollo 80 system gives us a look at where we’ve been and what the future of HPC holds. Vector processing support and directly connected high bandwidth memory are expected to be key components of most next-generation HPC solutions.

Overall, the HPE Apollo 80 offers customers the opportunity to build HPC clusters with the most advanced Arm processor on the planet and with support from the world’s leading HPC vendor.

Learn more about HPE Apollo 80.


Bill Mannel
VP & GM, HPC

twitter.com/Bill_Mannel
linkedin.com/in/billmannel/
hpe.com/servers

About the Author

BillMannel

As VP & GM for HPC, I lead worldwide business execution and commercial HPC focus for one of the fastest growing market segments in Hewlett Packard Enterprise’s Hybrid IT (HIT) group. The segment includes the recent Cray acquisition and the HPE Apollo portfolio.