Top White House advisers have voiced concern that China's DeepSeek, a rising star in the artificial intelligence (AI) sector, may have used a controversial technique known as "distillation" to accelerate its progress by piggybacking on U.S. AI models. The method, in which one AI system learns from another, has sparked fears that DeepSeek's rapid progress could undermine U.S. leadership in the AI race.
DeepSeek recently made waves in the tech industry with the release of a new AI model that rivals the capabilities of U.S. giants like OpenAI at a fraction of the cost. The China-based company also made its code freely available. Some technologists suspect that DeepSeek used U.S. models to achieve its breakthroughs. In distillation, a more established AI model is used to evaluate and improve a newer model, effectively transferring knowledge without the massive investments in time and computing power that went into the original.
While distillation is a common practice in the AI field, it violates the terms of service of several prominent U.S. models, including those developed by OpenAI. A spokesperson for OpenAI told Reuters that the company is aware of groups in China attempting to replicate U.S. AI models through distillation and is investigating whether DeepSeek improperly used its technology.
Naveen Rao, vice president of AI at Databricks, acknowledged that learning from rivals is "par for the course" in the AI industry, comparing it to automakers studying each other's engines. However, he emphasized the importance of adhering to terms of service. "We all try to be good citizens, but we're all competing at the same time," Rao said.
The allegations have drawn sharp criticism from U.S. officials. Howard Lutnick, President Donald Trump's nominee for Secretary of Commerce, accused DeepSeek of misappropriating U.S. AI technology during a Senate confirmation hearing and vowed to impose stricter export controls on AI technology. "I do not believe that DeepSeek was done all above board. That's nonsense," Lutnick said. "I'm going to be rigorous in our pursuit of restrictions."
David Sacks, the White House's AI and crypto czar, also raised concerns about DeepSeek's practices in a recent Fox News interview, highlighting the broader unease in Washington about China's use of U.S. technology to advance its own tech sector. This mirrors previous concerns in the semiconductor industry, where the U.S. has imposed restrictions on exports to China.
Despite the growing concerns, technologists warn that blocking distillation may be harder than it appears. DeepSeek's innovation lies in its ability to use a relatively small number of data samples, fewer than one million, from a larger model to significantly enhance a smaller model. With popular products like ChatGPT serving hundreds of millions of users, data transfers on that scale are difficult to detect.
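How a modest number of samples can transfer behavior from one model to another is easier to see in a toy setting. The sketch below is purely illustrative and is not DeepSeek's method: a hypothetical `teacher_prob` function stands in for a large model, a two-parameter logistic `student_prob` stands in for the smaller one, and `distill` fits the student to a few hundred of the teacher's outputs. All names, sample counts and training settings are assumptions chosen for the example.

```python
import math
import random


def teacher_prob(x):
    """Hypothetical 'teacher': probability of class 1 for input x.

    Stands in for a large model whose internals the student never sees.
    """
    return 1.0 / (1.0 + math.exp(-(3.0 * x - 1.0)))


def student_prob(x, w, b):
    """Small 'student' model: a logistic function with two parameters."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))


def distill(num_samples=200, epochs=2000, lr=1.0, seed=0):
    """Fit the student to the teacher's soft outputs on sampled inputs."""
    rng = random.Random(seed)

    # Step 1: query the teacher on a modest number of inputs.
    xs = [rng.uniform(-2.0, 2.0) for _ in range(num_samples)]
    soft_labels = [teacher_prob(x) for x in xs]

    # Step 2: train the student by gradient descent on cross-entropy
    # against the teacher's probabilities (not hard 0/1 labels).
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, t in zip(xs, soft_labels):
            p = student_prob(x, w, b)
            grad_w += (p - t) * x
            grad_b += p - t
        w -= lr * grad_w / num_samples
        b -= lr * grad_b / num_samples
    return w, b


if __name__ == "__main__":
    w, b = distill()
    for x in (-1.0, 0.0, 1.0):
        print(x, round(teacher_prob(x), 3), round(student_prob(x, w, b), 3))
```

The student here only ever sees input and output pairs, which is why the practice is hard to distinguish from ordinary use of a model. Real systems distill text-generating models with far more parameters, but the basic loop of sampling a teacher and training on its outputs is the same idea.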
Moreover, some AI models, such as Meta's Llama and French startup Mistral's offerings, are freely available for download and use in private data centers, making it difficult to monitor compliance with terms of service. Umesh Padval, managing director at Thomvest Ventures, noted, "It's impossible to stop model distillation when you have open-source models like Mistral and Llama. They are available to everybody."
Meta's terms for its Llama model require users to disclose if they are using it for distillation, and DeepSeek acknowledged using Llama for some of its models in a recent paper. However, it remains unclear whether DeepSeek violated Meta's terms of service earlier in its development process.
Some companies are taking proactive measures to prevent unauthorized use of their AI models. Groq, an AI computing company, has blocked all Chinese IP addresses from accessing its cloud services to deter Chinese firms from piggybacking on its models. However, CEO Jonathan Ross admitted that such measures are insufficient. "People can find ways to get around it," he said. "It's going to be a cat-and-mouse game."
As the U.S. government considers stricter regulations, the AI industry remains divided on how to address the issue. One source familiar with the thinking at a major AI lab suggested that stringent "know-your-customer" requirements, similar to those in the financial sector, could help. However, no concrete measures have been implemented yet.
DeepSeek has not responded to requests for comment on the allegations. Meanwhile, OpenAI has pledged to work with the U.S. government to protect its intellectual property, though it has not detailed specific countermeasures.
As the AI race intensifies, the debate over distillation and intellectual property rights underscores the challenges of maintaining technological leadership in an increasingly competitive global landscape.