DeepSeek's $1.6B Development: Debunking the Affordability Myth

By Zoey
Mar 13, 2025

DeepSeek's new chatbot is pitched as being able to answer virtually any question. Built by the Chinese startup of the same name, it has rapidly become a major market player, and its release even triggered a significant drop in NVIDIA's stock price.

Image: DeepSeek Test (ensigame.com)

DeepSeek's success stems from its innovative architecture and training methods. Key technologies include:

  • Multi-token Prediction (MTP): Instead of predicting only the next token, the model is trained to predict several upcoming tokens at once, which boosts accuracy and efficiency.
  • Mixture of Experts (MoE): Rather than running one large dense network for every input, the model routes each token to a small subset of specialized sub-networks (experts), which accelerates training and improves performance. DeepSeek V3 uses 256 experts and activates eight of them for each token.
  • Multi-head Latent Attention (MLA): MLA compresses the attention keys and values into a compact latent representation, reducing memory use while retaining the key details needed to capture nuanced meaning.
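To make the Mixture of Experts idea concrete, here is a minimal sketch of top-k expert routing using the 256-expert, eight-active figures quoted above. It is a generic softmax-over-top-k gate, not DeepSeek's actual router; the function names, the toy "experts", and the random router scores are all hypothetical and exist only for illustration.

```python
import math
import random

NUM_EXPERTS = 256  # number of experts, per the article
TOP_K = 8          # experts activated per token, per the article


def route_token(scores, top_k=TOP_K):
    """Pick the top_k highest-scoring experts and normalise their weights.

    `scores` holds one router score per expert (hypothetical values here).
    Returns a list of (expert_index, weight) pairs whose weights sum to 1.
    """
    # Indices of the top_k largest scores.
    chosen = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Softmax over the chosen experts only (subtract the max for stability).
    peak = max(scores[i] for i in chosen)
    exps = [math.exp(scores[i] - peak) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]


def moe_output(token, experts, scores, top_k=TOP_K):
    """Weighted sum of the outputs of the selected experts only."""
    return sum(w * experts[i](token) for i, w in route_token(scores, top_k))


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy "experts": each one just scales its input by a fixed factor.
    factors = [rng.uniform(0.5, 1.5) for _ in range(NUM_EXPERTS)]
    experts = [lambda x, f=f: f * x for f in factors]
    # Hypothetical router scores for a single token.
    scores = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
    picks = route_token(scores)
    print(len(picks), "of", NUM_EXPERTS, "experts active")
    print("output:", moe_output(1.0, experts, scores))
```

The point of the sketch is that only eight of the 256 expert functions are ever called for a given token, which is why a model of this kind can carry a very large parameter count while spending far less compute per token.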

DeepSeek initially claimed a remarkably low training cost of $6 million for its powerful DeepSeek V3 model, using only 2,048 GPUs.

Image: DeepSeek V3 (ensigame.com)

However, SemiAnalysis reported that DeepSeek operates approximately 50,000 Nvidia Hopper GPUs (including 10,000 H800s, 10,000 H100s, and additional H20 units) spread across multiple data centers. That amounts to a total server investment of roughly $1.6 billion, with operational expenses nearing $944 million.

DeepSeek, a subsidiary of the High-Flyer hedge fund, owns its data centers, which gives it control over optimization and allows faster innovation. Its self-funded status adds flexibility. Furthermore, DeepSeek attracts top talent, recruited primarily from Chinese universities, with some researchers earning over $1.3 million annually.

Image: DeepSeek (ensigame.com)

DeepSeek's initial $6 million training cost claim is misleading: it covers only the GPU time used for pre-training and excludes research, refinement, data processing, and infrastructure. The company's total AI development investment exceeds $500 million. Its lean structure, however, allows for efficient innovation compared to larger, more bureaucratic companies.
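The gap between the headline figure and the hardware bill is easier to see with the arithmetic written out. The sketch below assumes the inputs given in DeepSeek's own V3 technical report, which this article does not quote: about 2.788 million H800 GPU-hours, priced at an assumed rental rate of $2 per GPU-hour. The 2,048-GPU cluster size, the 50,000-GPU fleet, and the $1.6 billion server investment come from the article. Function names are illustrative only.

```python
# Assumed inputs (from the DeepSeek-V3 technical report, not from this article).
GPU_HOURS = 2_788_000        # total H800 GPU-hours for the final training run
RATE_USD_PER_GPU_HOUR = 2.0  # assumed rental price, not a purchase cost

# Figures quoted in the article.
CLUSTER_GPUS = 2_048
FLEET_GPUS = 50_000
SERVER_CAPEX_USD = 1_600_000_000


def training_run_cost(gpu_hours, rate):
    """Rental-equivalent cost of one training run."""
    return gpu_hours * rate


def wall_clock_days(gpu_hours, num_gpus):
    """Days the run would take if all GPUs were busy the whole time."""
    return gpu_hours / num_gpus / 24


def capex_per_gpu(capex, num_gpus):
    """Average server investment per GPU, including surrounding hardware."""
    return capex / num_gpus


if __name__ == "__main__":
    run_cost = training_run_cost(GPU_HOURS, RATE_USD_PER_GPU_HOUR)
    print(f"Training run (rental-equivalent): ${run_cost:,.0f}")
    print(f"Wall-clock time on {CLUSTER_GPUS} GPUs: "
          f"{wall_clock_days(GPU_HOURS, CLUSTER_GPUS):.1f} days")
    print(f"Server investment per GPU: "
          f"${capex_per_gpu(SERVER_CAPEX_USD, FLEET_GPUS):,.0f}")
    print(f"Server investment vs. training run: "
          f"{SERVER_CAPEX_USD / run_cost:.0f}x")
```

Under these assumptions the run comes to about $5.6 million, which rounds to the $6 million headline, while the server investment works out to roughly $32,000 per GPU. The two numbers measure different things: one is the rental-equivalent price of a single run, the other is the cost of owning the fleet that makes such runs possible.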

Image: DeepSeek (ensigame.com)

DeepSeek's success highlights the potential of well-funded independent AI companies to compete with industry giants. Its "revolutionary budget" claims are exaggerated, but its success is undeniable and rests on substantial investment, technological breakthroughs, and a strong team. Even so, DeepSeek remains cheaper than its competitors, and the difference is stark: its R1 model reportedly cost $5 million to train, compared to $100 million for ChatGPT4.

Copyright 15QX.COM © 2024 — All rights reserved