NVIDIA’s A100 set a new record in the MLPerf benchmark last month and now it’s accessible through Amazon’s cloud.
Amazon Web Services (AWS) first launched a GPU instance 10 years ago with the NVIDIA M2050. It’s rather poetic that, a decade on, NVIDIA is now providing AWS with the hardware to power the next generation of groundbreaking innovations.
The A100 outperformed CPUs in this year’s MLPerf by up to 237x in data centre inference. A single NVIDIA DGX A100 system – with eight A100 GPUs – provides the same performance as nearly 1,000 dual-socket CPU servers on some AI applications.
“We’re at a tipping point as every industry seeks better ways to apply AI to offer new services and grow their business,” said Ian Buck, Vice President of Accelerated Computing at NVIDIA, following the benchmark results.
Businesses can access the A100 in AWS’ P4d instance. NVIDIA claims the instances reduce the time to train machine learning models by up to 3x with FP16 and up to 6x with TF32 compared to the default FP32 precision.
Each P4d instance features eight NVIDIA A100 GPUs. If even more performance is required, customers are able to access over 4,000 GPUs at a time using AWS’s Elastic Fabric Adaptor (EFA).
Dave Brown, Vice President of EC2 at AWS, said:
“The pace at which our customers have used AWS services to build, train, and deploy machine learning applications has been extraordinary. At the same time, we have heard from those customers that they want an even lower-cost way to train their massive machine learning models.
Now, with EC2 UltraClusters of P4d instances powered by NVIDIA’s latest A100 GPUs and petabit-scale networking, we’re making supercomputing-class performance available to virtually everyone, while reducing the time to train machine learning models by 3x, and lowering the cost to train by up to 60% compared to previous generation instances.”
Some of AWS’ customers across various industries have already begun exploring how the P4d instance can help their business.
Karley Yoder, VP & GM of Artificial Intelligence at GE Healthcare, commented:
“Our medical imaging devices generate massive amounts of data that need to be processed by our data scientists. With previous GPU clusters, it would take days to train complex AI models, such as Progressive GANs, for simulations and view the results.
Using the new P4d instances reduced processing time from days to hours. We saw two- to three-times greater speed on training models with various image sizes while achieving better performance with increased batch size and higher productivity with a faster model development cycle.”
For an example from a different industry, the research arm of Toyota is exploring how P4d can improve their existing work in developing self-driving vehicles and groundbreaking new robotics.
“The previous generation P3 instances helped us reduce our time to train machine learning models from days to hours,” explained Mike Garrison, Technical Lead of Infrastructure Engineering at Toyota Research Institute.
“We are looking forward to utilizing P4d instances, as the additional GPU memory and more efficient float formats will allow our machine learning team to train with more complex models at an even faster speed.”
P4d instances are currently available in the US East (N. Virginia) and US West (Oregon) regions. AWS says further availability is planned soon.
You can find out more about P4d instances and how to get started here.
Interested in hearing industry leaders discuss subjects like this? Attend the co-located 5G Expo, IoT Tech Expo, Blockchain Expo, AI & Big Data Expo, and Cyber Security & Cloud Expo World Series with upcoming events in Silicon Valley, London, and Amsterdam.