Comments on: How Did DeepSeek Train Its AI Model On A Lot Less – And Crippled – Hardware? https://www.nextplatform.com/2025/01/27/how-did-deepseek-train-its-ai-model-on-a-lot-less-and-crippled-hardware/ In-depth coverage of high-end computing at large enterprises, supercomputing centers, hyperscale data centers, and public clouds. Wed, 05 Feb 2025 04:15:23 +0000 hourly 1 https://wordpress.org/?v=6.7.1 By: Timothy Prickett Morgan https://www.nextplatform.com/2025/01/27/how-did-deepseek-train-its-ai-model-on-a-lot-less-and-crippled-hardware/#comment-247199 Fri, 31 Jan 2025 20:38:11 +0000 https://www.nextplatform.com/?p=145225#comment-247199 In reply to Erik Klipping.

Or like comparing 3D graphics to actually driving a car...

]]>
By: Gravitycreatedlife https://www.nextplatform.com/2025/01/27/how-did-deepseek-train-its-ai-model-on-a-lot-less-and-crippled-hardware/#comment-247176 Fri, 31 Jan 2025 15:47:41 +0000 https://www.nextplatform.com/?p=145225#comment-247176 Back door to a back door, straight to the CCP.

]]>
By: Erik Klipping https://www.nextplatform.com/2025/01/27/how-did-deepseek-train-its-ai-model-on-a-lot-less-and-crippled-hardware/#comment-247083 Thu, 30 Jan 2025 21:51:28 +0000 https://www.nextplatform.com/?p=145225#comment-247083 I asked it a simple question about a legal issue in my country, and it succeeded in generating a human-like answer but failed to answer the question correctly. I understand that the potential lies in efficiency, but this is like comparing real-time 3D graphics on a C64 with 3D on modern hardware: they both look like 3D graphics, but one of them can only be used in a highly specialized use case.

]]>
By: Scott ho https://www.nextplatform.com/2025/01/27/how-did-deepseek-train-its-ai-model-on-a-lot-less-and-crippled-hardware/#comment-246931 Tue, 28 Jan 2025 23:20:45 +0000 https://www.nextplatform.com/?p=145225#comment-246931 Commentators testing bias have shown that DeepSeek can generate information relating to subjects censored by China. (They tell the LLM to use substitute characters, bypassing internal censors.) This indicates that the LLM was trained outside the Chinese firewall, which opens the possibility that it was trained outside China on higher-spec hardware.

]]>
By: John W https://www.nextplatform.com/2025/01/27/how-did-deepseek-train-its-ai-model-on-a-lot-less-and-crippled-hardware/#comment-246921 Tue, 28 Jan 2025 18:22:56 +0000 https://www.nextplatform.com/?p=145225#comment-246921 Rather like having a family with your sister, training one model on the output of another is how the insanity starts.

]]>
By: Mehdi Zoghlami https://www.nextplatform.com/2025/01/27/how-did-deepseek-train-its-ai-model-on-a-lot-less-and-crippled-hardware/#comment-246919 Tue, 28 Jan 2025 17:17:19 +0000 https://www.nextplatform.com/?p=145225#comment-246919 As I said before, more sanctions on China will only lead to more innovation from Chinese engineers. And Trump is going to Make China Great Again.

]]>
By: Paul Berry https://www.nextplatform.com/2025/01/27/how-did-deepseek-train-its-ai-model-on-a-lot-less-and-crippled-hardware/#comment-246916 Tue, 28 Jan 2025 16:36:01 +0000 https://www.nextplatform.com/?p=145225#comment-246916 Many in and out of Nvidia are claiming that this is actually a validation of the technology; that refining the methods of doing AI will make it plausible for more than three or four big companies to offer technology based on AI. I don’t know to what degree that’s true, but I really feel it’s necessary. I’m honestly not that impressed by what the AI industry has offered to date, and I doubt it will be all that useful to a lot of industries. I think we need a lot more improvement before AI can be widely useful, and I’d rather have dozens of places trying to improve the state of the art than just a handful.

]]>
By: Fernanda https://www.nextplatform.com/2025/01/27/how-did-deepseek-train-its-ai-model-on-a-lot-less-and-crippled-hardware/#comment-246912 Tue, 28 Jan 2025 15:19:56 +0000 https://www.nextplatform.com/?p=145225#comment-246912 Another banger Tim. Nice deepdive on deepseek… ha

]]>
By: Timothy Prickett Morgan https://www.nextplatform.com/2025/01/27/how-did-deepseek-train-its-ai-model-on-a-lot-less-and-crippled-hardware/#comment-246910 Tue, 28 Jan 2025 14:35:52 +0000 https://www.nextplatform.com/?p=145225#comment-246910 In reply to Sunil Verma.

Interesting scenario, Sunil. Thanks for that.

]]>
By: Sunil Verma https://www.nextplatform.com/2025/01/27/how-did-deepseek-train-its-ai-model-on-a-lot-less-and-crippled-hardware/#comment-246909 Tue, 28 Jan 2025 14:18:51 +0000 https://www.nextplatform.com/?p=145225#comment-246909 What impresses me about DeepSeek-V3 is that it has 671B parameters but activates only 37B of them for each token. Instead of trying to spread the load equally across all the experts in a Mixture-of-Experts model, as DeepSeek-V3 does, experts could be specialized to a particular domain of knowledge so that the set of parameters activated for a given query would not change rapidly. This would allow a chip like Sapphire Rapids Xeon Max to hold the 37B activated parameters in HBM while the rest of the 671B parameters sit in DIMMs. That would be an ideal inference server for a small or medium-sized business, and queries would stay behind the company’s firewall. Unlike data center GPUs, this hardware could be used for general-purpose computing when it is not needed for AI. The HBM bandwidth of Sapphire Rapids Xeon Max is only 1.23 TBytes/sec, so that needs to be fixed, but the overall architecture with both HBM and DIMMs is very cost-effective. The reason it is cost-effective is that DeepSeek-V3 has 18x more total parameters than activated parameters, so only a small fraction of the parameters need to be in costly HBM. Imagine a Xeon Diamond Rapids with 4.8 TBytes/sec of HBM3E bandwidth. That could generate about 4800B / 37B = 130 tokens/sec using DeepSeek-V3.
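The tokens/sec figure above can be checked with a quick back-of-envelope sketch. It assumes decoding is purely memory-bandwidth bound and that each generated token streams every active parameter from memory once at 1 byte per parameter (the implicit assumption in dividing 4800B by 37B; the `tokens_per_sec` helper is illustrative, not from the comment):

```python
def tokens_per_sec(bandwidth_gb_s: float, active_params_billions: float,
                   bytes_per_param: float = 1.0) -> float:
    """Rough tokens/sec if decoding is purely memory-bandwidth bound:
    each token must stream all active parameters from memory once."""
    gb_per_token = active_params_billions * bytes_per_param  # 1B params * 1 B/param = 1 GB
    return bandwidth_gb_s / gb_per_token

# Sapphire Rapids Xeon Max HBM (~1.23 TB/s) vs. a hypothetical 4.8 TB/s part,
# with DeepSeek-V3's 37B activated parameters:
print(round(tokens_per_sec(1230, 37)))   # ~33 tokens/sec
print(round(tokens_per_sec(4800, 37)))   # ~130 tokens/sec
```

Halving the bytes per parameter (or the active parameter count) would double the estimate, which is why the 37B-of-671B activation ratio matters so much for this kind of hardware.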

NVIDIA’s market cap fell by $589B on Monday. This loss in market cap is about 7x more than Intel’s current market cap ($87.5B). At NVIDIA’s new lower market cap ($2.9T), NVIDIA still has a 33x higher market cap than Intel.

Timothy Prickett Morgan wrote a good article about Xeon Max here:

nextplatform.com/2022/11/15/sapphire-rapids-xeon-sps-plus-hbm-offer-big-performance-boost

]]>