Intel’s new Xeon processors are made to deliver AI at the edge

SAN FRANCISCO – Intel Corp. on Tuesday announced the general availability of its second generation of Intel Xeon Scalable processors, a workhorse of the data centre, and shared details on the many flavours of custom processors designed to address specific challenges, including artificial intelligence.

Intel competitor Nvidia’s GPUs have become popular in recent years for the parallel processing power they offer, aiding the training of data-intensive machine learning models powering artificial intelligence. With its new Xeon processors, Intel is targeting inference-based machine learning, when a model takes an input and uses its past training to return an answer. While training is typically done in a centralized location with a powerful computing array, inference tends to be pushed out to the edge so it can deliver results to users.
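To make the distinction concrete, here is a minimal sketch in plain Python. It is not drawn from Intel or Nvidia material: the toy linear model, the function names `train` and `infer`, and the sample data are all assumptions chosen for illustration. Training loops over the data many times to fit parameters, while inference simply applies the finished parameters to a new input.

```python
# Toy illustration only: "training" fits parameters once,
# "inference" reuses them to answer new inputs.
# The model (y = w*x + b) and the data are hypothetical.

def train(samples, epochs=5000, lr=0.01):
    """Fit y = w*x + b by gradient descent (the expensive step)."""
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in samples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in samples) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def infer(model, x):
    """Apply already-trained parameters to a new input (the cheap step)."""
    w, b = model
    return w * x + b

# Training happens once, typically in a central location.
data = [(0, 1), (1, 3), (2, 5), (3, 7)]  # points on the line y = 2x + 1
model = train(data)

# Inference happens every time a user asks a question.
prediction = infer(model, 10)
print(round(prediction, 2))
```

The training call runs thousands of passes over the data, whereas each inference call is a single multiply and add, which is why the two workloads tend to run on different hardware in different places.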

Intel wants to maintain its market leadership position in delivering deep learning inference, says Lisa Spelman, GM of Xeon processors at Intel.

Think of it like when you ask Alexa a question, says Lisa Spelman, vice-president of data centre group and general manager of Xeon processors at Intel. When it hears your question and decides to tell you a weather update, that’s inference-based machine learning. The natural language processing capability that was created back in Amazon’s lab was the training component of that AI.

“If you ask Alexa for the weather and she says ‘I’m training my model,’ that’s not a very good user experience,” she says. “It’s inference that’s delivering those results. And that’s the user experience – you want something that’s in context, something you understand. It’s in your location and it’s good data about where you are.”

A diagram from Nvidia explains the difference between deep learning training and inference.

Inference is garnering more interest in the tech industry for applications like image recognition, with major players like Amazon Web Services launching initiatives to promote it. Intel partnered with Amazon last fall on the DeepRacer League, a competition that challenges developers to train small autonomous cars to complete a lap of a race course.

To make its second-generation Xeon processors even more capable of inference, Intel is integrating its Deep Learning Boost (Intel DL Boost) technology. Intel says it has worked with its industry partners to optimize their machine learning frameworks to perform well on its chips. Popular frameworks like TensorFlow, PyTorch, Caffe, MXNet, and PaddlePaddle are supported in this way.
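The article does not go into how DL Boost works, but Intel describes it publicly as instructions that speed up low-precision, 8-bit integer arithmetic. The sketch below is a rough illustration of that general idea in plain Python, not Intel's implementation: the helper names `quantize` and `int8_dot`, the symmetric scaling scheme, and the sample values are all assumptions.

```python
# Illustrative sketch of 8-bit integer inference arithmetic.
# This is NOT Intel's implementation; the helpers and values are hypothetical.

def quantize(values, scale):
    """Map floats to the signed 8-bit range [-127, 127] using one scale factor."""
    return [max(-127, min(127, round(v / scale))) for v in values]

def int8_dot(a_q, b_q):
    """Integer multiply-accumulate over two quantized vectors."""
    return sum(x * y for x, y in zip(a_q, b_q))

weights = [0.5, -0.25, 0.75, 1.0]      # hypothetical trained weights
activations = [1.0, 2.0, -1.0, 0.5]    # hypothetical input values

# Symmetric scaling: the largest magnitude maps to 127.
w_scale = max(abs(w) for w in weights) / 127
a_scale = max(abs(a) for a in activations) / 127

w_q = quantize(weights, w_scale)
a_q = quantize(activations, a_scale)

# Do the heavy arithmetic in integers, then rescale back to a float once.
approx = int8_dot(w_q, a_q) * w_scale * a_scale
exact = sum(w * a for w, a in zip(weights, activations))

print(f"exact={exact:.4f} approx={approx:.4f}")
```

The quantized result lands close to the floating-point one while the inner loop works only on small integers, which is the trade-off that makes low-precision inference attractive on general-purpose CPUs.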

New Xeon processors feature Intel Deep Learning Boost

During his keynote at Intel’s Data-centric event, Navin Shenoy, executive vice president and general manager of the data centre group, said DL Boost is available on the Xeon Platinum 8200 processor and that it could yield a 14x performance improvement on deep learning workloads. Shenoy and other Intel executives performed a series of benchmarking exercises on stage, each reinforcing one basic fact – the new generation of Xeon processor is faster than the old generation in many ways, and in some ways it’s the fastest processor available on the market.

Intel - DL Boost demo at Data Centric event

“We are seeing explosive demand around computing. An almost insatiable appetite,” Shenoy said on stage. “This is the biggest generation to generation improvement we’ve delivered in the volume part of the Xeon stack in the last five years.”

Some portion of that hunger is for inference AI. According to Spelman, Xeon processors are already powering 80 to 90 per cent of inference in the market today, and Intel wants to maintain that position. Not only is inference likely to keep pace with the demand for training AI, but it could grow larger as more devices like Amazon’s Echo deliver Alexa or any number of other AI services.

“Inference at the edge, we predict to grow much faster,” she says. “The edge will rely on training from the data centre but will also open up new inference possibilities. So it’s likely that inference as a workload – and we’re talking about silicon-sizing – will be larger than training. So that’s why it’s important to us.”

Cascade Lake in the wild

Stu Schmeets has been using Intel's new Xeon processors for deep learning inference on cardiac MRI scans since last October.

Intel’s second-generation Xeon processors, also known under the code name “Cascade Lake,” became generally available today. But some of Intel’s clients had access to the new CPUs earlier and have been putting them to use in the wild. Stu Schmeets, senior director, MRI R&D Collaborations, North America, Siemens, says his team has been using Cascade Lake since October.

The Siemens Healthineers team is developing AI that can look at cardiac MRI scans and identify significant images that a doctor should review. Cardiac MRIs are the main tool that physicians use to identify heart disease, which is responsible for about 18 million deaths a year worldwide. But examining MRI images for problems manually takes a long time. If doctors could use AI to help sort the images and identify potential problems, care could be delivered faster.

“The faster we can get to the doctor and make a decision, the better for the patient,” Schmeets says. “The goal there is really to do a real-time or near-real time analysis.”

In a demo on stage at the event, Schmeets showed that Intel’s new Xeon processor performed at 5.5x the previous generation on Siemens workloads. He says his lab team reports similar results from its own experience with the chip. Siemens’ AI model is still in the research phase, but eventually the firm wants to integrate it with MRI machines, using inference to deliver image recognition results right where the images are being taken.

“If you look to the future, MRI scanners are becoming better, stronger, and more efficient,” he says. “What we will want to do is parse even more data. Higher spatial resolution. More frames per second from a temporal resolution standpoint. That means the data input is even greater and we need to have the technology to keep pace with it.”

In addition to Xeon processors, Siemens is also making use of Intel’s distribution of OpenVINO, a developer toolkit based on convolutional neural networks. It helps distribute workloads across Intel’s hardware.

Intel paid for Brian Jackson’s travel and accommodations to attend its Data-centric event. This article was not reviewed by Intel before publication.


Brian Jackson
Editorial director of IT World Canada. Covering technology as it applies to business users. Multiple COPA award winner and now judge. Paddles a canoe as much as possible.
