
Figure: The most popular AI models and AI patent applications in different regions
Recently, a student artificial intelligence (AI) team from Stanford University in the United States was accused of plagiarizing MiniCPM, an AI model from China’s Mianbi Intelligence, drawing industry attention and heated discussion online. The Stanford team has since apologized to the Chinese team online. Experts said that many well-known large-model companies have emerged in China, and that the country’s huge base of Internet users provides rich scenario data, an important advantage for training large models. China has rapidly grown into an important driver of AI innovation. (Ta Kung Pao reporter Guo Hanlin Carlyle; intern reporter Su Yurun)
A large AI model is a machine learning model with an ultra-large number of parameters (usually more than one billion) that requires massive computing resources and can process huge amounts of data to complete complex tasks such as natural language processing and image recognition. On May 29, an AI team of three students from Stanford University released a large model called Llama3-V, claiming that it outperformed OpenAI’s GPT-4V, Google’s Gemini Ultra, and Anthropic’s Claude Opus, and that a state-of-the-art model could be trained for only $500. Soon afterward, a user revealed that the model structure and configuration files of Llama3-V were almost identical to those of a Chinese large model, MiniCPM-Llama3-V 2.5, differing only in a few simple modifications.
Exposing Silicon Valley’s “shameful culture”
MiniCPM was jointly launched by Chinese startup Mianbi Intelligence and Tsinghua University’s Natural Language Processing Laboratory in mid-May. The Tsinghua and Mianbi Intelligence teams later confirmed that the Stanford large-model project, like MiniCPM, could recognize ancient characters from the “Tsinghua Bamboo Slips” (a collection of bamboo slips from the middle and late Warring States period held by Tsinghua University): “not only were the results exactly the same, but even the errors were exactly the same.” Because this ancient-character data had never been made public, the plagiarism was ultimately confirmed.
Siddharth Sharma and Aksh Garg, two authors on the Stanford Llama3-V team, apologized on social media earlier this month and took down all Llama3-V models. Mustafa Aljadery, who is from the University of Southern California and was responsible for writing the code, has deleted his social media account.
Before the team apologized, Christopher David Manning, director of the Stanford Artificial Intelligence Laboratory, issued a statement condemning the plagiarism, saying that MiniCPM “is a very good open source work” and that “‘fake it before you make it’ is the shameful culture of Silicon Valley.” Lucas Beyer, a researcher at Google DeepMind, also commented on the matter, saying that China’s open-source large models include strong work such as MiniCPM, but that such work has not received international attention commensurate with its technical strength.
Building an open technology community environment
Founded in August 2022, Mianbi Intelligence has a core technical team that originated in the Tsinghua Natural Language Processing Laboratory and is one of the earliest teams in China to conduct large-model research. After the plagiarism incident, Li Dahai, Mianbi Intelligence’s co-founder and CEO, posted on WeChat Moments that he hoped the team’s efforts and excellent work would attract more attention and recognition, but not by being imitated or even plagiarized. He further emphasized the need to build an open, collaborative, and trusting technology community.
Liu Zhiyuan, chief scientist of Mianbi Intelligence and a tenured associate professor at Tsinghua University, also wrote on Zhihu that the incident prompted him to reflect on the “changes in the research experience over the past decade”: “Looking horizontally, we are obviously still significantly behind the world’s top work such as Sora and GPT-4o; at the same time, looking vertically, we have grown rapidly from a nobody more than a decade ago into a key driver of AI technological innovation. Facing the coming era of artificial general intelligence (AGI), we should be more confident and take an active part in it.”
Experts: China has advantages in data and applications
Although China started late in the research and development of large AI models, it is developing fast. Professor Shen Yang of the School of Artificial Intelligence at Tsinghua University said in an interview with Ta Kung Pao that OpenAI’s launch of ChatGPT in 2022 made the public feel the gap between China and other countries in AI. In the past few years there was a saying that “as soon as something is open-sourced abroad, China ‘develops its own.’” However, domestic practitioners drew courage from that sense of shame and began to catch up. Many well-known large-model companies have since emerged in China, and the gap between the two sides in this field is narrowing.
“The plagiarism incident has attracted widespread attention because ‘reverse plagiarism’ was relatively rare before. In the past, domestic AI teams often built on foreign open-source large models,” Shen Yang said. He added that China’s domestic large models are showing increasingly distinctive strengths, and that China and the United States are now at least intertwined in large-model technology, with each side’s work containing something of the other’s. “This incident may have happened because the Stanford student team hoped to secure financing as soon as possible, so they plagiarized a large model trained on Chinese data and used it as a ‘shell.’”
Shen Yang believes that China has many advantages in large-model research and development, above all its huge data resources and wide range of application scenarios. In manufacturing, for example, China has formed a large number of competitive industrial clusters, from traditional industries to the “new three” (electric vehicles, lithium batteries, and photovoltaic products); in services, China’s short-video, e-commerce, online literature, and mobile gaming sectors all rank first in the world. This provides strong support for training China’s large models, allowing China to develop more rapidly in this field and making it possible to “overtake on the curve” in the near future.