
DeepSeek: A Deep Dive into the Dark Horse of the AI Arms Race

By Mateusz Wilczyński

We are a month into 2025, and it has already been an eventful start to the year. It began with continued global conflicts and rising tensions amidst the possibility of an international trade war. Then, on 20th January, the Chinese startup DeepSeek released its new large language model (LLM), DeepSeek-R1, which upended the Western competition and positioned the company as a fresh contender in the AI arms race.


But what is DeepSeek? What makes it different from the other AI companies? And why did its release send the S&P 500, Nasdaq and Nvidia tumbling?


DeepSeek – Company History


DeepSeek is a Chinese AI startup and a subsidiary of the High-Flyer hedge fund. It was founded in May 2023, after High-Flyer announced in April 2023 the launch of an artificial general intelligence lab dedicated to developing AI tools, separate from the fund's primary financial business.


With High-Flyer as its financial backer, the lab (which became the distinct company DeepSeek) developed and released its first model, DeepSeek Coder, in November 2023, followed by its DeepSeek-LLM series of models later that year. In China, the company only went mainstream with the release of DeepSeek-V2, which became the catalyst for the Chinese model price war as tech giants such as ByteDance, Tencent and Alibaba began cutting the prices of their own models to compete with V2's performance and low cost. DeepSeek's breakthrough into overseas markets came with its first free chatbot app, built on the DeepSeek-R1 model. This is the model you have heard of, and it has been the wake-up call for Wall Street and Silicon Valley.


DeepSeek R1 – A More Technical Analysis


But what is this DeepSeek-R1 model? R1 is often described as a Chinese ChatGPT 'clone', yet it stands up to the Western models on performance at a fraction of the cost (more on that later). Following its release, the app reached No. 1 on the App Store, dethroning ChatGPT, a feat no other LLM had managed. It also featured a 'thinking' property, where the model walks the user through its reasoning before giving its response, making follow-up questions easier and elaborating on the content of the answer; the feature was quickly introduced by OpenAI and the other American companies in response.


Those, however, are just the cherries on top. What makes the model exemplary is that it is 100% free to use while performing on par with the $200-per-month ChatGPT subscription. Moreover, the model was extremely cheap to train, reportedly costing DeepSeek only around 6 million USD, while the training costs of ChatGPT, its most direct competitor, are estimated at between 1.7 and 2.5 billion USD. R1 is also far cheaper to run than the competing GPT model, as shown in the graphic below, at roughly 2% of the price.


"Comparison of Models: Quality, Performance & Price Analysis" – Artificial Analysis

(A 'token' is the basic unit of text an LLM processes, roughly a word or word fragment; API usage is typically priced per million tokens.)
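To see why per-token pricing matters at scale, here is a minimal sketch of the arithmetic. The prices and token volume below are placeholder assumptions for illustration, not quoted rates; only the ~2% ratio comes from the comparison above.

```python
# Illustrative sketch of how per-token API pricing compounds at scale.
# The prices and volumes below are assumptions, not quoted rates.
def monthly_cost(tokens_per_month: float, price_per_million: float) -> float:
    """Cost in USD for a given monthly token volume."""
    return tokens_per_month / 1_000_000 * price_per_million

tokens = 500_000_000          # e.g. a product handling 500M tokens a month
premium_price = 30.0          # assumed USD per 1M tokens for a premium model
r1_style_price = premium_price * 0.02  # ~2% of that, per the comparison above

print(f"Premium model:  ${monthly_cost(tokens, premium_price):,.0f}/month")
print(f"R1-style model: ${monthly_cost(tokens, r1_style_price):,.0f}/month")
```

At that volume the gap is the difference between a rounding error and a real line item on a budget, which is the whole commercial threat R1 poses.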


What sets DeepSeek's AI apart from the majority of the competition, and possibly the biggest challenge for other companies in the arms race going forward, is the open-source nature of R1. So, what does open source mean? Software can be either open or closed source; the majority of programs are closed source, meaning you cannot see what is going on behind the scenes. Smartphone software like iOS, Office tools, Windows and macOS are all examples of closed-source software, where only the developers and engineers at those companies can view and edit the code. Open-source software, on the other hand, has its internals (in R1's case, the model weights and technical details) available to the public, who are free to inspect and modify them. A leaked internal Google document from 2023 claimed that open-source AI would outcompete both Google and OpenAI in the long term. DeepSeek's R1 may be a significant step towards that coming to fruition.


How does R1 compare to the competition?


The benchmark performance of the R1 model puts it in competition with the very best offered by American companies, although for all the praise given to DeepSeek's model, it does not outperform the Western models outright. That in itself is a remarkable achievement, as the consensus was that China, along with other competing countries, was months if not years behind in development. On the Artificial Analysis Quality Index, R1 scores 89 points, ahead of all competition except OpenAI's o1, which leads the pack with 90, and once again at only 2% of o1's running cost.


"DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning" – DeepSeek
"Comparison of Models: Quality, Performance & Price Analysis" – Artificial Analysis

Across benchmarks, R1 scores competitively with the Western models, outcompeting everyone else at coding (HumanEval) and quantitative reasoning (MATH-500), and sticking close behind ChatGPT in reasoning and knowledge (MMLU), scientific reasoning and knowledge (GPQA Diamond) and the Artificial Analysis Multilingual Index. DeepSeek's R1 is not so much a 'clone' or a cheap replica as a competing model that is squeezing the benchmarks and riling up the competition, a wake-up call for the US market.


"Comparison of Models: Quality, Performance & Price Analysis" – Artificial Analysis
"DeepSeek-R1 Upsets AI Market With Low Prices" – STATISTA

The Market Response


On 27th January, Nvidia closed down 17%, knocking $589 billion off its market cap, the largest single-day loss in value in US stock market history. For reference, the loss is equivalent to McDonald's Corp., The Walt Disney Company, Uber Technologies Inc. and Target Corp. all going to $0 overnight. At the same time, the S&P 500 and Nasdaq dropped 1.5% and 3% respectively, an overall loss of more than one trillion USD across US markets.


To demonstrate the gravity of the situation, following the launch of R1, Meta was reportedly scrambling 'war rooms' of engineers to figure out how DeepSeek's AI is beating everyone else at a fraction of the price.


"Tech stocks slump as China's DeepSeek stokes fears over AI spending" – Financial Times

This reaction stems from the nature of the AI arms race in the US, where Google, Apple, Amazon, Meta, Microsoft and OpenAI have dominated the sector globally. A competing model from another country was, until R1's release, entirely out of the question in most minds. The billions of dollars poured into developing and training AI models have been spent with the sole intent of emerging victorious from the race and reaping the rewards of the new technology. As one may notice, Nvidia does not develop any AI software yet is closely tied to the race; Nvidia is the shovel merchant amidst the gold rush. Training an AI model requires substantial processing power, and the leading company in the GPU industry is Nvidia with its H100 Tensor Core GPU, priced at around $40,000. In the race to compete, US companies have spent hundreds of billions of dollars on these chips, justifying Nvidia's rise into the top three most valuable companies in the world in 2024 amidst the AI boom.


For added context, in 2024 Microsoft, Meta, Amazon and Google purchased an estimated 450k, 350k, 196k and 169k H100 GPUs respectively, amounting to roughly 18, 12, 8 and 6.8 billion USD each. That is a total of almost $45 billion from these four companies alone in a single year.
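As a quick sanity check, those spend figures can be roughly reproduced from the unit counts and the approximate $40,000 list price mentioned above. This is only a sketch: the purchase counts are estimates, and real contract prices vary, which is why the per-company results land near, rather than exactly on, the reported numbers.

```python
# Rough sanity check: estimated 2024 H100 purchases (units) times an
# assumed ~$40,000 list price. All inputs are approximations.
H100_PRICE = 40_000  # USD, approximate list price

purchases = {          # estimated units bought in 2024
    "Microsoft": 450_000,
    "Meta": 350_000,
    "Amazon": 196_000,
    "Google": 169_000,
}

for company, units in purchases.items():
    spend = units * H100_PRICE
    print(f"{company}: {units:,} GPUs -> about ${spend / 1e9:.1f}B")

total = sum(purchases.values()) * H100_PRICE
print(f"Total: about ${total / 1e9:.1f}B")  # in the ballpark of the ~$45B cited
```

The point of the exercise is less the exact totals than the order of magnitude: four companies, one year, tens of billions of dollars on a single chip.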


The belief is that more chips translate to a more powerful and competitive AI model - as such, export restrictions have been placed on some countries, as shown in the graphic below:

"Biden to Further Limit Nvidia AI Chip Exports in Final Push" – Bloomberg

The DeepSeek model was trained using Nvidia's H800 chips, which do not fall under the export restrictions. Ironically, the H800 is essentially the same hardware as the H100, tweaked to perform worse (chiefly by capping its chip-to-chip interconnect bandwidth) so as to slip under the restrictions. As a result, the success of R1 challenges the American companies' presumed need for H100 chips and calls into question the hundreds of billions of dollars invested in them in recent years. As a side note, there are reports of Chinese companies getting their hands on H100 chips anyway, bypassing the restrictions through smuggling; Singapore in particular is suspected of serving as a trans-shipment route for chips into China, accounting for $2.7 billion of Nvidia's revenue in 2023 and making up 15% of its total. Consequently, the claim that the model was developed on H800 chips has been questioned, although the open-source details and further analysis suggest the model was most probably trained on H800 GPUs.


Ultimately, the drop in Nvidia's market cap is a result of DeepSeek's success at a fraction of the training cost: completely free to use, open source, trained on cheaper chips and significantly cheaper to run. This raises the competitive pressure on Western companies, who already operate at a loss when running LLMs, and an expectation of lower GPU requirements for training would be a direct blow to Nvidia's revenue. As for the money already invested into AI, annual industry revenue would have to reach roughly 600 billion USD to justify the investment and make a return in the long term; as most companies are operating at a loss, this poses a difficult reality check for the future of AI.


"AI's $600B Question" – SEQUOIA
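The $600 billion figure follows from a back-of-the-envelope heuristic along the lines of Sequoia's argument. The sketch below uses assumed round numbers (Nvidia's annualised data-centre revenue, the GPU share of total data-centre cost, and the gross margin end sellers of AI services would need); these are illustrative inputs, not audited figures.

```python
# Rough reconstruction of the "AI's $600B Question" arithmetic.
# All inputs are approximate assumptions, not audited figures.
nvidia_dc_run_rate = 150e9   # assumed annualised Nvidia data-centre revenue, USD
gpu_share_of_tco = 0.5       # GPUs assumed to be ~half of total data-centre cost
gross_margin = 0.5           # assumed margin sellers of AI services would need

total_dc_cost = nvidia_dc_run_rate / gpu_share_of_tco   # ~$300B of capex
required_revenue = total_dc_cost / (1 - gross_margin)   # ~$600B of revenue

print(f"Implied AI revenue needed: ${required_revenue / 1e9:.0f}B per year")
```

Each halving assumption doubles the required revenue, which is why the estimate is so sensitive: the point is not the exact number but that current AI revenues are nowhere near it.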

Criticism Surrounding DeepSeek's AI


Some consumers may be opposed to using the R1 model. It has been criticised for allegedly training on ChatGPT's output, and it was built on top of a larger model developed and trained earlier, whose associated costs are omitted from the quoted training cost, although most companies build on their previous models in the same way. Moreover, the online version of DeepSeek has been criticised for censoring some information: when asked about Tiananmen Square, the model refuses to answer, as it is not allowed to discuss the topic. It should be noted that this applies only to the online version, which runs on Chinese servers subject to Chinese regulations; when the model is installed locally (possible because it is open source), the restrictions can be removed and the AI runs uncensored.


Finally, some users fear their personal data is being leaked to the Chinese Communist Party when using R1. Data may well be collected on users, but this is no different from the data collection that accompanies other models, making it a fairly shallow criticism of the AI. As with censorship, the privacy concern can be addressed by running the software locally, thanks to the open-source design.

Ultimately, though the criticisms may hold merit, they will matter little to the average consumer choosing between a free AI and a $200-a-month subscription for the same service. DeepSeek has upended the business model of OpenAI and others, injecting genuine competition into the industry.

 

An Outlook for the Future


Competition has historically been good for the consumer. Companies scramble to lower prices and stay relevant, and the rate of innovation increases as competitors chase quality. Going forward, there may be heightened competition between companies not only on quality but also on price. Another sign of this is DeepSeek's release of its Janus image models, including Janus-Pro-7B, on 31st January, with the new models competing with, and in some benchmarks beating, the likes of Midjourney and DALL-E 3 in image generation. Once again, Eastern AI is uprooting the fundamental business model and forcing competing US companies to innovate further.


Early into his new term, Donald Trump called DeepSeek a "wake-up call" for US tech, as Western companies operate AI at a loss. Microsoft's CEO, meanwhile, invoked Jevons paradox, the observation that as a resource gets cheaper, people tend to use more of it; though this may be true, it does not mean the providers of that resource stay the same.


If open-source AI is to prevail, the business model of current AI companies will have to adapt. If it does not, OpenAI and its American competitors will have to justify their prices with quality, or consumers will turn to the cheaper and more accessible alternatives, despite any moral qualms. Competition will be good for us as consumers, but it signals a change in the AI race and the market landscape as we know it.
