Comments on: Stacking Up Intel Gaudi Against Nvidia GPUs For AI https://www.nextplatform.com/2024/06/13/stacking-up-intel-gaudi-against-nvidia-gpus-for-ai/ In-depth coverage of high-end computing at large enterprises, supercomputing centers, hyperscale data centers, and public clouds. Tue, 25 Jun 2024 18:07:40 +0000 hourly 1 https://wordpress.org/?v=6.7.1 By: EP https://www.nextplatform.com/2024/06/13/stacking-up-intel-gaudi-against-nvidia-gpus-for-ai/#comment-225711 Mon, 17 Jun 2024 18:55:27 +0000 https://www.nextplatform.com/?p=144285#comment-225711 Great business move by Intel: start a price war with the undisputed market leader, which has tons of free cash, while you are financially constrained.
What can go wrong?
And all of this before considering the Nvidia software moat.
The same product in a different wrapping will not take Intel far.

]]>
By: Timothy Prickett Morgan https://www.nextplatform.com/2024/06/13/stacking-up-intel-gaudi-against-nvidia-gpus-for-ai/#comment-225685 Mon, 17 Jun 2024 02:37:06 +0000 https://www.nextplatform.com/?p=144285#comment-225685 In reply to JayN.

Yup. Took the cost of the network adapters back out. Thanks. Some days…

]]>
By: JayN https://www.nextplatform.com/2024/06/13/stacking-up-intel-gaudi-against-nvidia-gpus-for-ai/#comment-225634 Sat, 15 Jun 2024 20:13:18 +0000 https://www.nextplatform.com/?p=144285#comment-225634 “Now, if you build a system and add in those expensive CPUs, main memory for them, network interface cards …”

The Gaudi3 system does not require installation of additional network cards or GPU links, according to SMCI:

https://youtu.be/BQVfGdkiTtc?t=120

]]>
By: Etan https://www.nextplatform.com/2024/06/13/stacking-up-intel-gaudi-against-nvidia-gpus-for-ai/#comment-225620 Sat, 15 Jun 2024 13:54:01 +0000 https://www.nextplatform.com/?p=144285#comment-225620 I think Intel has been talking about using Gaudi3 with existing data centers and locally stored data for inferencing. Thus, their math is for inferencing in existing data centers with Gaudi3 added. Thanks.

]]>
By: Amitp https://www.nextplatform.com/2024/06/13/stacking-up-intel-gaudi-against-nvidia-gpus-for-ai/#comment-225599 Sat, 15 Jun 2024 05:32:05 +0000 https://www.nextplatform.com/?p=144285#comment-225599 The peak FP16/BF16 performance of Gaudi3 is 2X higher than what you have. Per Gaudi, it is 1,835 teraflops, so that is 14,680 teraflops for eight Gaudis.

https://www.intel.com/content/www/us/en/content-details/817486/intel-gaudi-3-ai-accelerator-white-paper.html
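That aggregate figure is just the per-chip peak multiplied out across a standard eight-accelerator node; a quick sketch of the arithmetic (the per-chip number is from the white paper linked above, the node size is the usual eight-accelerator baseboard):

```python
# Aggregate peak FP16/BF16 throughput for an eight-accelerator node.
# Per-chip figure from Intel's Gaudi 3 white paper; node size assumed
# to be the standard eight-accelerator baseboard.
per_chip_tflops = 1835
chips_per_node = 8

node_peak_tflops = per_chip_tflops * chips_per_node
print(f"Node peak: {node_peak_tflops:,} TFLOPS")  # 14,680 TFLOPS
```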

]]>
By: JayN https://www.nextplatform.com/2024/06/13/stacking-up-intel-gaudi-against-nvidia-gpus-for-ai/#comment-225584 Fri, 14 Jun 2024 20:41:38 +0000 https://www.nextplatform.com/?p=144285#comment-225584 In reply to Timothy Prickett Morgan.

The LINPACK HPL benchmark is used for the Top500 ranking, so the FP64 matrix design decision apparently also held back Aurora on that overall ranking, but Aurora did take the top spot on the LINPACK mixed-precision AI benchmark.

https://www.intel.com/content/www/us/en/newsroom/news/intel-powered-aurora-supercomputer-breaks-exascale-barrier.html#gs.auil5e

“Aurora supercomputer secured the top spot in the high-performance LINPACK-mixed precision (HPL-MxP) benchmark – which best highlights the importance of AI workloads in HPC.”

]]>
By: Timothy Prickett Morgan https://www.nextplatform.com/2024/06/13/stacking-up-intel-gaudi-against-nvidia-gpus-for-ai/#comment-225583 Fri, 14 Jun 2024 19:25:59 +0000 https://www.nextplatform.com/?p=144285#comment-225583 In reply to JayN.

Jay, here is a direct quote from Rick during the HPE Argonne Aurora briefing ahead of ISC, when he was asked why Aurora takes so much more energy on LINPACK than other machines:

“So let me answer that in probably the simplest way. The underlying hardware for Frontier and Aurora is different. Frontier’s AMD MI250X has a matrix engine that executes 64-bit math, which means that for matrix calculations it can do basically twice the performance of the vector unit when it is doing 64 bits. Aurora does not have that matrix unit, and that was a deliberate design decision, because most scientific calculations can’t take advantage of 64-bit dense matrix calculations. LINPACK can, but other scientific codes can’t. So the actual real applications, not the benchmark, do quite well on Aurora; we have a number of them that are multiples of the equivalent performance on Frontier. But the cost of doing that is that you’re running the benchmark in vector mode. That’s the underlying reason. It was a deliberate design decision not to use silicon for a double precision matrix unit; we put that extra silicon into accelerating lower precision on the PVC. In BF16, for example, we have a lot more performance. So that’s the technical reason. If you buy me a beer, I’ll tell you more about it.”

This was, as you might imagine, news to me. But that is what Rick said.
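The effect Rick describes is easy to see in back-of-the-envelope form: HPL is dominated by a dense LU factorization costing roughly (2/3)N³ FLOPs, so a chip running it at vector-mode FP64 peak takes about twice as long (and burns roughly twice the energy per run) as one with a 2X-faster matrix engine. A sketch with purely hypothetical peak numbers, not the specs of any real accelerator:

```python
# Illustrative only: why a dense FP64 matrix engine matters for HPL.
# HPL factorizes an N x N matrix, costing roughly (2/3) * N**3 FLOPs.
# Peak figures below are hypothetical, not real hardware specs.
N = 10_000_000                   # hypothetical HPL problem size
flops = (2 / 3) * N**3           # dominant LU factorization cost

vector_peak = 50e15              # FP64 vector peak, FLOP/s (hypothetical)
matrix_peak = 2 * vector_peak    # per the quote: matrix engine ~2x vector

print(f"vector-mode runtime: {flops / vector_peak / 3600:.2f} h")
print(f"matrix-mode runtime: {flops / matrix_peak / 3600:.2f} h")
```

Real applications that are not dense-matrix-bound see neither the penalty nor the benefit, which is the trade-off Aurora’s designers made.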

]]>
By: JayN https://www.nextplatform.com/2024/06/13/stacking-up-intel-gaudi-against-nvidia-gpus-for-ai/#comment-225577 Fri, 14 Jun 2024 15:13:46 +0000 https://www.nextplatform.com/?p=144285#comment-225577 In reply to Timothy Prickett Morgan.

Rick Stevens was integrally involved in the specification of the PVC GPU, explicitly for a mixture of HPC and AI processing. If he had wanted 64-bit matrix processing, it would have been included.

The Gaudi3 AI processors also do not include 64-bit matrix operations.

https://cdrdv2-public.intel.com/817486/gaudi-3-ai-accelerator-white-paper.pdf

]]>
By: HuMo https://www.nextplatform.com/2024/06/13/stacking-up-intel-gaudi-against-nvidia-gpus-for-ai/#comment-225566 Fri, 14 Jun 2024 05:26:50 +0000 https://www.nextplatform.com/?p=144285#comment-225566 Must be that Habana Labs factor … on the beach, sipping rhum-cocktails, with large cigars … best way to up one’s inference perf/$ in my experience! 8^p

]]>
By: Timothy Prickett Morgan https://www.nextplatform.com/2024/06/13/stacking-up-intel-gaudi-against-nvidia-gpus-for-ai/#comment-225561 Fri, 14 Jun 2024 02:07:37 +0000 https://www.nextplatform.com/?p=144285#comment-225561 In reply to JayN.

Key bit of data missing. According to Rick Stevens, who runs Argonne, it doesn’t have 64-bit matrix math, just 64-bit on the vectors. Which is what I meant.

Ignorance is a strong word. Tired is a less strong one, which I am a lot these days for reasons I do not care to explain.

]]>