DeepSeek, a Chinese startup based in Hangzhou, caused a significant stir in global equity markets last month. Its cost-effective AI reasoning model, R1, outperformed many Western competitors and triggered a sell-off exceeding $1 trillion. Now the company is accelerating the release of its successor, R2.
DeepSeek had originally planned to launch R2 in early May but now aims to release it as soon as possible, according to sources. The company hopes the new model will produce better code and reason in languages beyond English. This accelerated timeline has not been previously reported.
DeepSeek has not officially commented on the accelerated timeline.
Competitors are still assessing the impact of R1. The model was built using less powerful Nvidia chips, yet it rivaled models that U.S. tech giants developed at enormous cost.
Vijayasimha Alilughatta, COO of Zensar, believes R2 could be a pivotal moment for the AI industry. DeepSeek’s success in creating cost-effective models could spur similar efforts worldwide, challenging the dominance of a few major players.
The release of R2 is likely to concern the U.S. government, which has made AI leadership a priority. It may also further motivate Chinese authorities and companies, many of which have begun integrating DeepSeek models.
DeepSeek, founded by Liang Wenfeng, remains relatively unknown. Liang, who became a billionaire through his quantitative hedge fund High-Flyer, maintains a low profile.
Interviews with former employees and industry professionals, along with reviews of media and research, reveal a company operating more like a research lab than a traditional enterprise. It fosters a collaborative, less hierarchical environment.
Liang, born in 1985, holds engineering degrees from Zhejiang University. He is known for a flat management style, encouraging collaboration with young employees.
DeepSeek and High-Flyer offer competitive compensation. High-Flyer, a successful quant fund, bankrolls DeepSeek’s research and computing power.
DeepSeek’s success is attributed in part to that investment. The fund spent heavily on supercomputing AI clusters, including Fire-Flyer II, which uses around 10,000 Nvidia A100 chips.
Chinese regulators initially questioned High-Flyer’s extensive chip acquisition but did not intervene. The early purchases proved crucial when the U.S. banned exports of A100 chips to China in 2022.
Beijing now supports DeepSeek but has instructed it to limit media interactions.
DeepSeek’s access to a large A100 cluster has attracted top research talent. The company’s success is also due to its focus on cost-effective AI architecture, using techniques such as Mixture-of-Experts (MoE) and multi-head latent attention (MLA).
DeepSeek’s pricing is significantly lower than that of competitors such as OpenAI. This has prompted rivals to adjust their strategies by cutting prices and developing less resource-intensive models.
DeepSeek has garnered strong support from Chinese authorities. Premier Li Qiang met with Liang, and numerous Chinese entities have integrated DeepSeek’s models.
However, some Western governments have raised privacy concerns, and DeepSeek has been removed from some app stores.
The company’s reliance on advanced AI chips, which face export restrictions, remains a challenge.