Comments on: How Long Before AI Servers Take Over The Market?
https://www.nextplatform.com/2023/10/23/how-long-before-ai-servers-take-over-the-market/
In-depth coverage of high-end computing at large enterprises, supercomputing centers, hyperscale data centers, and public clouds.

By: tufttugger https://www.nextplatform.com/2023/10/23/how-long-before-ai-servers-take-over-the-market/#comment-215837 Thu, 02 Nov 2023 20:19:53 +0000 In reply to EC.

Software talent and companies are all over the place. There’s tons of competition there for Nvidia, from cloud service providers and hyperscalers rolling their own, down to Hugging Face and all the other small AI companies. All they needed was an open framework that could be used on more than Nvidia GPUs. And as long as a hardware provider can optimize its hardware to those frameworks (OpenAI’s Triton, ROCm, etc.), the software lock-in Nvidia had with CUDA is gone, and anyone can provide software solutions on whatever hardware platforms are available. So I think margins will crash for Nvidia. How much of their datacenter margin is in hardware versus software services? They don’t break that out yet, but both are inflated, and both will deflate as other hardware becomes available and as software developers compete using it.

AMD is now extending ROCm optimization beyond CDNA to RDNA, and will eventually fold in FPGA, DPU, and CPU optimizations. That’s where AMD will have an edge: it will be able to sell its HSA hardware into many markets. It just needs to keep hammering on optimization for the open frameworks.

It could end up being a bit like the gaming GPU market, with Nvidia in the lead; even though AMD has a smaller share, it still makes very good margins. Competitors need to be able to keep up on the optimizations (like drivers for gaming) to compete. That’s my take.

By: Timothy Prickett Morgan https://www.nextplatform.com/2023/10/23/how-long-before-ai-servers-take-over-the-market/#comment-215698 Sun, 29 Oct 2023 20:25:52 +0000 https://www.nextplatform.com/?p=143141#comment-215698 In reply to Slim Albert.

I have joked that every fast food joint in every town in the modern world should be using an overclocked supercomputer as a grill. And couldn’t every home use a smaller node as a heating unit? Instead of trying not to make heat, what if we made it on purpose, as a hot water heater?

By: Slim Albert https://www.nextplatform.com/2023/10/23/how-long-before-ai-servers-take-over-the-market/#comment-215695 Sun, 29 Oct 2023 17:32:37 +0000 https://www.nextplatform.com/?p=143141#comment-215695 In reply to HuMo.

I like your ideas (HuMo and Slim Jim) … but doubt they’ll work (kinda like giving every citizen 2 books to increase literacy). The EU will probably be more successful if it focuses on enhancing public infrastructure, with micro-, mini-, and normal-sized public datacenters, accessible to individual town residents and housed in its many city halls (like computational libraries, or public transport). Those would provide 2 TF/s of oomph to each resident (for the Zettascale goal), using EPYC Zen or Sapphire Rapids CPUs, each paired with three or four MI210 accelerators (22 TF/s FP64 @ 300W), as suggested here: https://www.nextplatform.com/2023/08/15/crafting-a-dgx-alike-ai-server-out-of-amd-gpus-and-pci-switches/ . Using your EU surface area numbers, that’s about 4 mW/m^2 and thus eminently solar-powerable.

A town of 300 folks would get a micro-datacenter (a node) with 10 CPUs and 30 GPUs (about €150k, 15kW), which scales great on both Hashcat and ResNet50, and, like a GigaIO SuperNode, would allow them to simulate 1 second of Concorde flight every 33 hours! They could also house their personal webpages on the system, and do whatever genAI job they desire. It is a vision for public AI datacenters that would surely spark decentralized innovation and computational literacy, most broadly, throughout the EU (in my mind)!
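As a quick sanity check of those node figures, here is a back-of-envelope sketch: the MI210 numbers are the ones quoted above, while the ~300 W per host CPU, ~450 million EU residents, and ~4 million km^2 of surface area are rough assumptions.

```python
# Back-of-envelope check of the proposed per-town micro-datacenter.
# Assumes the MI210 figures quoted above (22 TF/s FP64 at 300 W),
# a rough 300 W per host CPU, ~450 million EU residents, and
# ~4 million km^2 (4e12 m^2) of EU surface area.

RESIDENTS_PER_TOWN = 300
CPUS_PER_NODE = 10
GPUS_PER_NODE = 30          # three MI210s per CPU
GPU_TFLOPS_FP64 = 22.0
GPU_WATTS = 300.0
CPU_WATTS = 300.0           # rough assumption for an EPYC / Sapphire Rapids host

node_tflops = GPUS_PER_NODE * GPU_TFLOPS_FP64
node_watts = GPUS_PER_NODE * GPU_WATTS + CPUS_PER_NODE * CPU_WATTS

print(f"Per-resident oomph: {node_tflops / RESIDENTS_PER_TOWN:.1f} TF/s FP64")  # ~2.2 TF/s
print(f"Node power draw:    {node_watts / 1e3:.0f} kW")                         # ~12 kW

# Scale the per-resident power up to the whole EU.
EU_POP = 450e6
EU_AREA_M2 = 4e12
eu_watts = node_watts / RESIDENTS_PER_TOWN * EU_POP
print(f"EU-wide power:      {eu_watts / 1e9:.0f} GW, "
      f"or {1e3 * eu_watts / EU_AREA_M2:.1f} mW/m^2")                           # ~18 GW, ~4.5 mW/m^2
```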

By: EC https://www.nextplatform.com/2023/10/23/how-long-before-ai-servers-take-over-the-market/#comment-215694 Sun, 29 Oct 2023 16:49:10 +0000 https://www.nextplatform.com/?p=143141#comment-215694 It looks to be a shorter time-frame than 15 years. The P100, arguably the first merchant accelerator specifically geared for machine learning (with HBM and NVLink), shipped in mid-2016. More specifically, I am not sure we can call any server workload “AI” before 2013, since the Cambrian-explosion-like event driving this resurgence in ML (AlexNet’s domination of ImageNet) came in late 2012.

With respect to declining ASPs, can 70% corporate gross margins be maintained by Nvidia indefinitely? Likely not. But I wonder if the server landscape doesn’t end up looking like the PC graphics landscape, where one supplier earns healthy margins and the others either limp along or have exited.

A lot of the competitive debate is often reduced to hardware, chip-versus-chip feeds and speeds if you will. I don’t believe it’s that simple. In this case Nvidia has created a platform (with millions of developers) with incumbency, leadership, and the flexibility to fight competitive solutions with mature products. Meanwhile they continue to innovate, dropping products in at the top of the stack, which preserves high gross-margin transactions. (MI300 appears to be around 15 months behind H100, and MI300 performance remains to be quantified.)

Unless competitors come up with a same-generation accelerator, in a close time frame, with comparable performance, I don’t see anything close to a halving of ASPs. A decline from these nosebleed levels? Sure. Intel basically maintained 60% gross margins from 2010 to 2020, showing how dominant one supplier can be. Nvidia is still just getting ramped.

By: Slim Jim https://www.nextplatform.com/2023/10/23/how-long-before-ai-servers-take-over-the-market/#comment-215656 Sat, 28 Oct 2023 15:51:57 +0000 https://www.nextplatform.com/?p=143141#comment-215656 In reply to HuMo.

If the EU had half the balls of Indiana, they’d give each citizen TWO Radeon RX7800XTs, and heed the impending rise of ZettaCthulhu! (eh-eh-eh!) All for less than half of the Netherlands’ GDP, or Nvidia’s market cap (incidentally, it would be 3x more expensive to do this FP64 ZettaFlopping with RTX4090s instead!).

The EU should elect Lisa Su as president (an MIT-trained electrical engineer, much as Angela Merkel is a physicist) and get that Zettascale show on the road! With 4 million square km of EU surface area, the needed 263 GW can easily be provided by photovoltaics (just 66 milliwatts per square meter)! (ih-ih-ih!)
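For what it’s worth, a quick sketch of that arithmetic, assuming roughly 500 million EU citizens and the RX 7800 XT figures quoted above (1.2 TF/s FP64 at 263 W per card):

```python
# Does handing each EU citizen two RX 7800 XTs reach FP64 zettascale?
# Assumes ~500 million EU citizens, the card figures quoted above
# (1.2 TF/s FP64 at 263 W each), and ~4e12 m^2 of EU surface area.

EU_CITIZENS = 500e6
CARDS_EACH = 2
CARD_TFLOPS_FP64 = 1.2
CARD_WATTS = 263.0
EU_AREA_M2 = 4e12

total_flops = EU_CITIZENS * CARDS_EACH * CARD_TFLOPS_FP64 * 1e12
total_watts = EU_CITIZENS * CARDS_EACH * CARD_WATTS

print(f"Aggregate FP64:      {total_flops / 1e21:.1f} ZF/s")                # ~1.2 ZF/s
print(f"Aggregate power:     {total_watts / 1e9:.0f} GW")                   # ~263 GW
print(f"Solar flux required: {1e3 * total_watts / EU_AREA_M2:.0f} mW/m^2")  # ~66 mW/m^2
```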

By: HuMo https://www.nextplatform.com/2023/10/23/how-long-before-ai-servers-take-over-the-market/#comment-215538 Wed, 25 Oct 2023 20:31:56 +0000 https://www.nextplatform.com/?p=143141#comment-215538 In reply to Slim Albert.

For my tax money, I’d say every EU citizen should get a free AMD RX7800XT (1.2 TF/s FP64 @ 263W) so we get this 450 EF/s goal done good (especially key with today’s news that the EU will run out of drugs this winter!)! 8^p

By: Slim Albert https://www.nextplatform.com/2023/10/23/how-long-before-ai-servers-take-over-the-market/#comment-215532 Wed, 25 Oct 2023 16:53:10 +0000 https://www.nextplatform.com/?p=143141#comment-215532 In reply to Slim Jim.

Point taken! AI servers would likely be needed for TurnItIn/”TurnItOut”, the Yin/Yang of learning/cheating one’s way through academia. TurnItIn is already server-based and would be a conversion op, but “TurnItOut” may be a new and quickly growing killer app requiring brand-new genAI servers, and one that many individual users would be willing to pay for, as it would give them more free time in the evenings and on weekends to enjoy life, play video games, post thoughtful material on social networks, and experiment broadly, along with better grades, jobs in positions of responsibility, higher wages, and improved self-esteem (ahem!).

On a side note, irrespective of whether the AI runs in servers or at the edge, if it accounts for, say, 1 TF/s per person, then that corresponds to an aggregate oomph of 340 EF/s in the US and 450 EF/s in the EU, both of which are larger than the recently stated 2025 goals of 300 EF/s for China and 80 EF/s for India. Counting TOPS (instead of TF/s), and noting that Qualcomm’s upcoming X Elite laptop NPU does 45 TOPS (INT4), suggests that those apparently astronomical 2025 goals may not actually be that outlandish … (?)
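A tiny sketch of that per-capita aggregation; the population figures (roughly 340 million for the US and 450 million for the EU) are assumptions, and the 2025 goals are the ones cited above.

```python
# Aggregate "oomph" if AI accounts for roughly 1 TF/s per person.
# Population figures (~340 M US, ~450 M EU) are rough assumptions.

TFLOPS_PER_PERSON = 1.0

populations = {"US": 340e6, "EU": 450e6}
goals_2025_efs = {"China": 300, "India": 80}   # stated 2025 goals, in EF/s

for region, pop in populations.items():
    exaflops = pop * TFLOPS_PER_PERSON * 1e12 / 1e18
    print(f"{region}: {exaflops:.0f} EF/s aggregate")   # 340 EF/s and 450 EF/s

for region, goal in goals_2025_efs.items():
    print(f"{region} 2025 goal: {goal} EF/s")
```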

By: Slim Jim https://www.nextplatform.com/2023/10/23/how-long-before-ai-servers-take-over-the-market/#comment-215515 Wed, 25 Oct 2023 01:48:47 +0000 https://www.nextplatform.com/?p=143141#comment-215515 In reply to Slim Albert.

The AI part of your killer-app will need to run in the datacenter; otherwise those servers won’t need GPUs (they’ll just be conventional servers). It seems to me that AI apps that run at the edge, or in the browser, won’t need AI servers at all.

By: HuMo https://www.nextplatform.com/2023/10/23/how-long-before-ai-servers-take-over-the-market/#comment-215508 Tue, 24 Oct 2023 21:13:00 +0000 https://www.nextplatform.com/?p=143141#comment-215508 Good to hear about IBM’s 2025 This is Spinal Tap Power11 chip — hopefully a good 10% louder than Power10! :^b

By: Slim Albert https://www.nextplatform.com/2023/10/23/how-long-before-ai-servers-take-over-the-market/#comment-215507 Tue, 24 Oct 2023 21:05:39 +0000 https://www.nextplatform.com/?p=143141#comment-215507 Contemporary AI seems most suited to behavioral analysis and targeting (French: ciblage comportemental), with the associated activities of recommendation systems, sentiment extraction, marketing analytics, targeted ad generation, mass surveillance, and stock-market prediction. Some percentage of datacenter machinery had been dedicated to such workloads before the recent “craze” in AI (e.g. at Alphabet, Amazon, Meta, Berkshire Hathaway, the USA’s NSA/DHS/CIA, France’s DGSI/DGSE, etc.). I wonder how much of the estimated 40% share of AI servers corresponds to conversion of these “classical” ops to AI (mainly useful to large entities, specific corporations, or governments), versus how much might be for “new” applications of AI, expanding into fields of activity that may be of interest to individuals and small groups of people (families, friends, neighborhoods, news outlets, …). It seems to me that if the bulk of AI servers are conversions from classical ops, then their share will probably level off due to limited market size (the same relatively fixed size as before AI). Market growth (beyond replacement/upgrade) will require some new “killer app” (KA), I think.

We may need an AI KA that 2 billion people are each willing to shell out $500 for, essentially, to make this a proper trillion-dollar market. It would be coded in something like JavAIscript, WebAIssembly, or pAIthon … ?
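The market-sizing arithmetic, as a one-liner sketch (the 2 billion users and $500 price point are the hypothetical figures above):

```python
# Hypothetical killer-app market sizing: 2 billion users at $500 apiece.
users, price = 2e9, 500
print(f"Total addressable market: ${users * price / 1e12:.1f} trillion")  # $1.0 trillion
```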
