Intel Launches 10nm ‘Ice Lake’ Datacenter CPU with Up to 40 Cores

By Tiffany Trader

April 6, 2021

The wait is over. Today Intel officially launched its 10nm datacenter CPU, the third-generation Intel Xeon Scalable processor, codenamed Ice Lake. With up to 40 “Sunny Cove” cores per processor, built-in acceleration and new instructions, the Ice Lake-SP platform offers a significant performance boost for AI, HPC, networking and cloud workloads, according to Intel.

In addition to increasing the core count from 28 to 40 over previous-gen Cascade Lake, Ice Lake provides eight channels of DDR4-3200 memory per socket and up to 64 lanes of PCIe Gen4 per socket, compared to six channels of DDR4-2933 and up to 48 lanes of PCIe Gen3 per socket for the previous generation.
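The memory figures above translate into theoretical peak bandwidth with simple arithmetic. The sketch below is illustrative only: it assumes a standard 64-bit (8-byte) data path per DDR4 channel, and the function name is hypothetical. Sustained bandwidth on real workloads will be lower than these peaks.

```python
# Theoretical peak DRAM bandwidth per socket.
# Assumption (not from the article): each DDR4 channel moves 8 bytes
# per transfer (64-bit data bus, ECC bits excluded).
BYTES_PER_TRANSFER = 8


def peak_bandwidth_gbs(channels: int, mega_transfers_per_s: int) -> float:
    """Return peak bandwidth in decimal GB/s for one socket."""
    return channels * mega_transfers_per_s * BYTES_PER_TRANSFER / 1000


ice_lake = peak_bandwidth_gbs(8, 3200)      # eight channels of DDR4-3200
cascade_lake = peak_bandwidth_gbs(6, 2933)  # six channels of DDR4-2933
uplift = ice_lake / cascade_lake - 1

print(f"Ice Lake:     {ice_lake:.1f} GB/s")
print(f"Cascade Lake: {cascade_lake:.1f} GB/s")
print(f"Uplift:       {uplift:.1%}")
```

Under these assumptions the peak rises from roughly 141 GB/s to about 205 GB/s per socket, an increase of around 45 percent.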

With these enhancements, along with AVX-512 for compute acceleration and DL Boost for AI acceleration, Ice Lake delivers an average 46 percent performance improvement for datacenter workloads and 53 percent higher average HPC performance, generation-over-generation, according to Intel. In early internal benchmarking*, Intel also showed Ice Lake outperforming the recently launched AMD third-generation Epyc processor, codenamed Milan, on key HPC, AI and cloud applications.

Intel’s Trish Damkroger

In an interview with HPCwire, Intel’s Vice President and General Manager of HPC Trish Damkroger highlighted the work that went into the Sunny Cove core as well as the HPC platform enhancements. “Having eight memory channels is key for memory bound workloads, and with the 40 cores along with AVX-512, the CPU shows great performance for a lot of workloads that are more compute bound,” she said. Damkroger further emphasized Intel’s Speed Select Technology (SST), which enables granular control over processor frequency, core count, and power. Speed Select was introduced on Cascade Lake, where it could configure only frequency; with Ice Lake, it adds the flexibility to dynamically adjust core count and power.

Using Intel’s Optane Persistent Memory (PMem) 200 series combined with traditional DRAM, the new Ice Lake processors support up to 6 terabytes of system memory per socket (versus 4.5 terabytes supported by Cascade Lake and Cascade Lake-Refresh). Optane PMem 200 is part of Intel’s datacenter portfolio targeting the new third-generation Xeon platform, along with Optane P5800X SSD, SSD D5-P5316 NAND, Intel Ethernet 800 series network adapters (offering up to 200GbE per PCIe 4.0 slot), and the company’s Agilex FPGAs.

In addition to moving to PCIe Gen4, which provides a 2X bandwidth increase compared with Gen3, the socket to socket interconnect rates for Ice Lake have increased nearly 7.7 percent for improved bandwidth between processors.
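The interconnect claims above can be checked with a back-of-the-envelope calculation. This is a rough sketch rather than a measurement: the per-lane rates (8 GT/s for Gen3, 16 GT/s for Gen4) and 128b/130b encoding come from the PCIe specifications, and the UPI rates of 10.4 GT/s and 11.2 GT/s are an assumption not stated in the article, chosen because they are the commonly cited figures for the two platforms.

```python
# Approximate per-direction PCIe bandwidth, before protocol overhead.
ENCODING_EFFICIENCY = 128 / 130          # 128b/130b line encoding (Gen3 and Gen4)
GT_PER_S = {"Gen3": 8.0, "Gen4": 16.0}   # per-lane transfer rates


def lane_gbs(gen: str) -> float:
    """Usable GB/s per lane, per direction."""
    return GT_PER_S[gen] * ENCODING_EFFICIENCY / 8  # 8 bits per byte


gen3_socket = 48 * lane_gbs("Gen3")  # Cascade Lake: up to 48 Gen3 lanes
gen4_socket = 64 * lane_gbs("Gen4")  # Ice Lake: up to 64 Gen4 lanes

# Assumed UPI link rates in GT/s (not given in the article).
upi_cascade_lake, upi_ice_lake = 10.4, 11.2
upi_uplift = upi_ice_lake / upi_cascade_lake - 1

print(f"Per-lane ratio:   {lane_gbs('Gen4') / lane_gbs('Gen3'):.1f}x")
print(f"Per-socket ratio: {gen4_socket / gen3_socket:.2f}x")
print(f"UPI uplift:       {upi_uplift:.1%}")
```

Because Ice Lake both doubles the per-lane rate and adds lanes, the aggregate per-socket PCIe bandwidth in this sketch grows by more than the 2X per-lane figure.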

Gen over gen, Ice Lake delivers a 20 percent IPC improvement (28-core, ISO frequency, ISO compiler) and improved per-core performance on a range of workloads, shown on the slide below (comparing the 8380 to the 8280).

The combination of AVX-512 instructions (first implemented on the now-discontinued Intel Knights Landing Phi in 2016 and on Skylake in 2017) and the eight channels of DDR4-3200 memory is proving especially valuable for boosting HPC workloads. With AVX-512 enabled, the 40-core, top-bin 8380 Platinum Xeon achieves 62 percent better performance on Linpack than with AVX2.

Compared with the previous-gen Cascade Lake, the Ice Lake 8380 Xeon achieves 38 percent higher performance on Linpack, 41 percent higher performance on HPCG, and 47 percent faster performance on Stream Triad, in Intel testing.

These improvements on industry-standard benchmarking apps are reflected on application codes used in earth system modeling, financial services, manufacturing, as well as life and material science. The slide below shows improvements on a total of 12 HPC applications, including 58 percent higher performance on the weather forecasting code WRF, 70 percent improved performance on Monte Carlo, 51 percent speed-up on OpenFoam, and 57 percent improvement on NAMD.

Damkroger said that the 57 percent improvement on NAMD — a molecular dynamics code used in life sciences — is just the start. Intel worked with the NAMD team at the University of Illinois Urbana-Champaign to further optimize performance, achieving a 2.43X gen-over-gen performance boost (143 percent). “It’s all because of AVX-512 optimizations,” said Damkroger.

Source: Intel
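The NAMD figures above mix two notations, a percent improvement (57 percent) and a speedup factor (2.43X). A minimal sketch of the conversion follows; the helper names are hypothetical, and the final ratio assumes both results were measured against the same Cascade Lake baseline, which the article does not confirm.

```python
def speedup_to_percent(factor: float) -> float:
    """Convert a speedup factor (e.g. 2.43X) to a percent improvement."""
    return (factor - 1) * 100


def percent_to_speedup(percent: float) -> float:
    """Convert a percent improvement (e.g. 57) to a speedup factor."""
    return 1 + percent / 100


baseline_gain = percent_to_speedup(57)  # out-of-the-box NAMD result
optimized_gain = 2.43                   # after the AVX-512 optimization work

# Additional factor attributable to the optimization, assuming a shared baseline.
extra_from_tuning = optimized_gain / baseline_gain

print(f"2.43X is a {speedup_to_percent(optimized_gain):.0f} percent improvement")
print(f"Extra factor from tuning: {extra_from_tuning:.2f}X")
```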

At the University of Illinois’ OneAPI Center of Excellence, researchers are working to expand NAMD to support GPU architectures through the use of OneAPI’s open standards. “We’re preparing NAMD to run more optimally on the upcoming Aurora supercomputer at Argonne National Laboratory,” said Dave Hardy, senior research programmer, University of Illinois Urbana-Champaign.

Another illustrative use case comes from financial services, a field which is beset by space and power constraints (in New York City, for example) and which uses complex in-house software. Pointing to the 70 percent speedup for Monte Carlo simulations, gen over gen, and claiming a 50 percent improvement versus the competition (i.e., AMD’s 7nm Milan CPU), Damkroger said the gains are attributable to Ice Lake’s L1 and L2 cache sizes, the eight faster memory channels, and also the AVX-512 instructions. She indicated that more optimized results will be on the way. “Honestly, we just got our Milan parts in about a week ago,” said Damkroger. “We have all the information for Rome, but we obviously are doing these comparisons to the latest competition.”

Intel has had the Milan parts in-house long enough to conduct some early competitive benchmarking*. The slide below (which Intel shared during a media pre-briefing last week) shows HPC, cloud and AI performance comparisons for the top Ice Lake part (40-core) versus the top AMD Epyc Milan part (64-core) in two-socket configurations. According to Intel’s testing, Ice Lake outperformed Milan by 18 percent on Linpack, by 27 percent on NAMD, and by 50 percent on Monte Carlo (as already stated).

Coming just three weeks after AMD’s Epyc Milan launch, the Intel Ice Lake launch pits third-gen Xeon against third-gen Epyc. “Intel’s positioning vis-a-vis AMD is certainly better than before,” said Dan Olds, chief research officer with Intersect360 Research. “I think it might be a toss up for customers; it’s going to depend on their workload and it’s going to depend on price-performance as usual, but whereas before today’s launch, it was kind of a lay-in to pick AMD, it’s not anymore. Looking at Intel’s benchmarking [gen over gen], WRF is almost 60 percent higher, Monte Carlo is 70 percent higher, Linpack is up 38 percent and HPCG is 41 percent higher — now that’s significant, HPCG is the torture test.”

“We’ll have to see what happens in the real world head to head with AMD,” Olds said, “but this puts Intel back in the game in a solid place. A 50 percent move from generation to generation is a big deal. It’s not quite Moore’s law, but it’s pretty solid.”

While AMD has gained significant ground since it reentered the datacenter arena with Epyc in 2017, Intel holds about 90 percent server market share. “Intel x86 is easily the dominant processor type in the global HPC market,” Steve Conway, senior advisor at Hyperion Research, told HPCwire. The research firm’s studies show that Intel x86 will likely remain dominant through 2024, the end of its forecast period.

“Based on announced benchmarks, Ice Lake looks like an impressive technical advance,” said Conway. “We’ll know more as results on challenging real-world applications become available. Intel also has a pricing challenge against AMD, so it will be interesting to learn how prices for comparable SKUs compare. The most important Ice Lake benefit is that it’s designed to serve both established and emerging HPC markets effectively, especially AI, cloud, enterprise and edge computing. That’s the key to future success.”

Vik Malyala, senior vice president leading field application engineering at Supermicro, told HPCwire the company’s customers were eager for PCIe Gen4 and the higher core density provided by Ice Lake. “For our customers, many workloads have been optimized for Intel architecture for the longest time. That is the reason many of our customers were willing to wait as opposed to jumping to alternate offerings,” he said.

“AMD does have a process advantage, so we should not underestimate that,” Malyala said. “But at the same time, what I’m excited about is both of them are offering good performance. And customers can actually choose a platform not because something is not available, but both are available, so they can actually try it out and see which one that fits best within their budget and fits their application requirements.”

Built-in acceleration and security

For the artificial intelligence space, Intel says Ice Lake delivers up to 56 percent more AI inference performance for image classification than the previous generation, and offers up to a 66 percent boost for image recognition. For language processing, Ice Lake delivers up to 74 percent higher performance on batch inference gen-over-gen. And on ResNet50-v1.5, the new CPU delivers 4.3 times better performance using int8 via Intel’s DL Boost feature compared with using FP32.

Nash Palaniswamy

“The convergence of AI and HPC is becoming a reality, and customers are thrilled that the 3rd Gen Intel Xeon Scalable processor enables a dynamic reconfigurable datacenter that supports diverse applications,” shared Intel’s Nash Palaniswamy, in an email exchange with HPCwire. “Our latest 3rd Gen Xeon Scalable processor is a powerhouse for AI workloads and delivers up to 25x performance on image recognition using our 40 core CPU compared to our competitor’s 64 core part,” said Palaniswamy, vice president and general manager of AI, HPC, datacenter accelerators solutions and sales at Intel.

The third-generation Xeon Scalable processors also add new security features, including Intel Software Guard Extensions (SGX) and Intel Total Memory Encryption (TME) for built-in security, and Intel Crypto Acceleration for streamlined processing of cryptographic algorithms. Over 200 ISVs and partners have deployed Intel SGX, according to Intel. 

The SKU stack

The Ice Lake family includes 56 SKUs, grouped across 10 segments (SKU chart graphic): 13 are optimized for highest per-core scalable performance (8 to 40 cores, 140-270 watts), 10 for scalable performance (8 to 32 cores, 105-205 watts), 15 target four- and eight-socket systems (18 to 28 cores, 150-250 watts), and there are three single-socket optimized parts (24 to 36 cores, 185-225 watts). There are also SKUs optimized for cloud, networking, media and other workloads. All but four SKUs support Intel Optane PMem 200 series technology.

Reserved for liquid cooling environments, the 38-core 8368Q Platinum Xeon dials up the frequency of the standard 38-core 8368, increasing the base clock from 2.4 GHz to 2.6 GHz, all-core turbo from 3.2 GHz to 3.3 GHz, and single-core turbo from 3.4 GHz to 3.7 GHz.

At the top of the SKU mountain is the 8380 with 40 cores, a frequency of 2.3 GHz (base), 3.0 GHz (turbo) and 3.4 GHz (single-core turbo), offering 60 MB cache in a 270 watt TDP. Compared to the Cascade Lake 8280, the 8380 provides 12 additional cores and runs 65 watts hotter. The suggested customer price for the new 8380 is $8099, which is actually about 19 percent less than the list price on the 8280 ($10009).
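The price and power comparison above can be reproduced with a few lines of arithmetic. This is a sketch using only the figures quoted in the article; the 8280's TDP is derived from the stated 65-watt difference, and the per-core breakdowns are illustrative rather than figures Intel published.

```python
def pct_change(new: float, old: float) -> float:
    """Percent change from old to new (negative means a decrease)."""
    return (new - old) / old * 100


# Figures quoted in the article.
price_8380, cores_8380, tdp_8380 = 8099, 40, 270
price_8280, cores_8280 = 10009, 28
tdp_8280 = tdp_8380 - 65  # derived: the 8380 is said to run 65 W hotter

price_delta = pct_change(price_8380, price_8280)
price_per_core_8380 = price_8380 / cores_8380
price_per_core_8280 = price_8280 / cores_8280
watts_per_core_8380 = tdp_8380 / cores_8380
watts_per_core_8280 = tdp_8280 / cores_8280

print(f"List price change: {price_delta:.1f}%")
print(f"Price per core:    ${price_per_core_8380:.2f} vs ${price_per_core_8280:.2f}")
print(f"Watts per core:    {watts_per_core_8380:.2f} vs {watts_per_core_8280:.2f}")
```

On these numbers the list price per core falls substantially, and the TDP per core also drops slightly even though the socket as a whole draws more power.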

Intel has not said publicly if it plans to release a multi-chip module (MCM) version of Ice Lake, as a follow-on to the 56-core Cascade Lake-AP part. Intel could conceivably deliver an 80-core ICL-AP but given the 270 watt power envelope of the 8380, that may not be feasible from a thermal standpoint.

Supermicro’s Malyala approves of Intel’s approach to segment-specific SKUs and the near-complete support for Optane PMem across the stack. “It was a pretty big headache for a lot of people with Cascade Lake and Cascade Lake Refresh in terms of them trying to figure out how to bring all these features and which ones to enable. It’s a lot cleaner now with Ice Lake,” he said. “There’s a virtualization SKU, a networking SKU, single-socket, long lifecycle. Presenting it this way helps customers to pick and choose because now the product portfolio has exploded, right? So how do people know which one to pick? That is addressed to some extent with the segment-specific SKUs, which also helps us, Supermicro, to validate these in our products.”

In a pre-briefing held last week, Intel said the Ice Lake ramp, which commenced in the final quarter of last year, is going well. The company has shipped more than 200,000 units in the first quarter of 2021 and reports broad industry adoption across all market segments with more than 250 design wins within 50 unique OEM and ODM partners, noting over 20 publicly announced HPC adopters.

Prominent HPC customers who have received shipments so far include LRZ and Max Planck (in Germany), Cineca (in Italy), the Korea Meteorological Administration (KMA), as well as the National Institute of Advanced Industrial Science and Technology (AIST), the University of Tokyo and Osaka University (in Japan).

The third-generation Xeon products are available now through a number of OEMs, ODMs, cloud providers and channel partners. Launch partners Cisco, Dell, Gigabyte, HPE, Lenovo, Supermicro and Tyan (among others) are introducing new or refreshed servers based on the new Intel CPUs, and Oracle has announced compute instances backed by the new Xeons in limited preview with general availability on April 28, 2021. More announcements will be made in the days and weeks to come.

* Benchmarking details at https://edc.intel.com/content/www/us/en/products/performance/benchmarks/intel-xeon-scalable-processors/
