DeepSeek's $1.6B Development: Debunking the Affordability Myth

DeepSeek's new chatbot can field questions on virtually any topic. Built by a Chinese startup, it has rapidly become a major market player and even triggered a significant drop in Nvidia's stock price.

By Zoey, Mar 13, 2025

Image: ensigame.com

DeepSeek's success stems from its innovative architecture and training methods. Key technologies include:

Multi-token Prediction (MTP): Instead of predicting one token at a time, MTP trains the model to predict several upcoming tokens at once, improving accuracy and efficiency.
Mixture of Experts (MoE): This architecture splits the model into many specialized sub-networks ("experts") and routes each token to only a few of them, accelerating training and improving performance. DeepSeek V3 uses 256 experts and activates eight for each token.
Multi-head Latent Attention (MLA): MLA compresses the attention keys and values into a compact latent representation, reducing memory use while retaining the key details needed to capture nuanced meaning.
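
To make the "256 networks, eight per token" idea concrete, here is a minimal Python sketch of top-k expert routing. It is a generic illustration, not DeepSeek's actual router: the random scores, the softmax weighting, and the names route_token, NUM_EXPERTS, and TOP_K are all assumptions made for this example.

```python
import math
import random

NUM_EXPERTS = 256  # expert count quoted in the article
TOP_K = 8          # experts activated per token, as quoted in the article


def route_token(router_logits, top_k=TOP_K):
    """Return (expert_index, weight) pairs for the top_k highest-scoring experts."""
    # Indices of the top_k largest router scores.
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:top_k]
    # Softmax over the selected scores only, so the weights sum to 1.
    # (Assumption: real routers may use a different normalization.)
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


# Hypothetical router scores for one token; a real router is a learned layer.
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
selected = route_token(logits)
active_fraction = TOP_K / NUM_EXPERTS
print(f"{len(selected)} of {NUM_EXPERTS} experts active ({active_fraction:.1%})")
```

The point of the design is that only a small fraction of the experts does any work for a given token, so compute per token stays low even as the total number of experts grows.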

DeepSeek initially claimed a remarkably low training cost of $6 million for its DeepSeek V3 model, using only 2,048 GPUs.

However, SemiAnalysis reported that DeepSeek uses approximately 50,000 Nvidia Hopper GPUs (including 10,000 H800, 10,000 H100, and additional H20 units) spread across multiple data centers. This represents a total server investment of roughly $1.6 billion and operational expenses nearing $944 million.

DeepSeek, a subsidiary of the High-Flyer hedge fund, owns its data centers, which gives it control over optimization and speeds up innovation. Being self-funded adds further flexibility. DeepSeek also recruits top talent, primarily from Chinese universities, with some researchers earning over $1.3 million annually.

DeepSeek's initial $6 million training cost claim is misleading; it only covers pre-training GPU usage, excluding research, refinement, data processing, and infrastructure. The company's total AI development investment exceeds $500 million. Its lean structure, however, allows for efficient innovation compared to larger, more bureaucratic companies.
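
The gap between the headline figure and the reported infrastructure spend is easier to see with some quick arithmetic. The Python sketch below uses only the approximate figures quoted in this article; the variable names are illustrative, and treating every GPU beyond the H800 and H100 units as an H20 is an assumption based on the article's wording.

```python
# Figures as quoted in this article (USD); all are approximate.
CLAIMED_TRAINING_COST = 6_000_000    # DeepSeek's V3 pre-training claim
SERVER_INVESTMENT = 1_600_000_000    # SemiAnalysis estimate
TOTAL_GPUS = 50_000
H800_UNITS = 10_000
H100_UNITS = 10_000

# Assumption: the remainder of the fleet consists of H20 units.
h20_units = TOTAL_GPUS - H800_UNITS - H100_UNITS

# Average server investment per GPU (covers more than the chip itself).
investment_per_gpu = SERVER_INVESTMENT / TOTAL_GPUS

# Headline training claim as a share of the reported server investment.
claim_share = CLAIMED_TRAINING_COST / SERVER_INVESTMENT

print(f"Implied H20 units: {h20_units:,}")
print(f"Server investment per GPU: ${investment_per_gpu:,.0f}")
print(f"Training claim as share of server investment: {claim_share:.2%}")
```

On these numbers, the claimed training cost amounts to well under 1% of the reported server investment, which is why the two figures should not be read as measuring the same thing.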

DeepSeek's success highlights the potential of well-funded independent AI companies to compete with industry giants. While its "revolutionary budget" claims are exaggerated, its success is undeniable and rests on substantial investment, technological breakthroughs, and a strong team. Even so, its models remain cheaper to train than its competitors', and the difference is stark: DeepSeek's R1 model reportedly cost $5 million to train, compared with $100 million for OpenAI's GPT-4.