Chinese regulators have proposed restrictive rules for ChatGPT-like AI models built in the country, requiring user identification and security reviews, and prohibiting “any content that subverts state power, advocates the overthrow of the socialist system, incites splitting the country or undermines national unity.”
The rules come hot on the heels of Chinese tech companies rolling out their own general-purpose large language models, versatile AI systems that can converse in natural language and carry out a surprising number of tasks. While the reception of SenseTime, Baidu, and Alibaba’s models over the last month suggests they’re somewhat behind the likes of GPT-4, it’s clear the industry there is equally dedicated to developing these capabilities.
Unfortunately, shortly after the debut of Alibaba’s Tongyi Qianwen model, one of the country’s tech regulators, the Cyberspace Administration of China, proposed restrictions that may smother relevant innovations — and the Chinese AI industry’s ambitions along with them.
The draft rules are not available in English (I took the above quote from the Financial Times translation syndicated at Ars Technica) but can be viewed at the regulator’s website. The first part of article 4 prohibits generative AI content that subverts government power and authority or questions national unity, along with various other categories of prohibited material, such as ethnic discrimination and terrorism.
This kind of catch-all morality clause is commonplace in China, but it happens to be the kind of restriction that generative AI is uniquely incapable of complying with. Even the most carefully trained and tuned LLM can, it seems, be tricked into saying all manner of objectionable things. Whether Chinese censors decide a given model complies with the law is more or less entirely up to them, which makes the prospect of dedicating serious resources to such a project somewhat fraught.
Of course, much of Chinese industry exists under a similarly suspended dagger, and although China’s regulators are capricious, they are not foolish enough to throw away the fruits of the government’s years of propping up R&D in the country. It’s probable that, not unlike other content-limiting laws there, this will act more as a fig leaf and ironclad excuse for the government to exert influence — not a blanket prohibition.
If anything, it is the other requirements that may slow AI development there to a crawl.
The CAC draft rules require, among other things, that providers assume liability and responsibility for the training data of models, including difficult-to-measure qualities like authenticity, objectivity, and diversity; that users of the services be verified as real people; that personal information and reputation be respected, or regulators may find the provider liable; that generated content be labeled as such; and many other restrictions.
Certainly some of these requirements could be considered prudent or even critical to a responsible AI industry, but the truth is that many of them would be incredibly difficult, perhaps impossible, for today’s companies and R&D efforts to implement. OpenAI has achieved its success partly because it works in an almost complete regulatory vacuum. If the law required the company to, say, obtain permission from the rights holders of the text and media it used to train its models, it would probably still be waiting to build GPT-2.
If an AI startup or even an established company can’t confidently operate in China for fear of violating these rules at a massive scale, it may decide that its resources are better spent elsewhere. As fast-moving as this industry is, ground lost to such a setback may be very difficult to regain.
The Financial Times quoted Alibaba’s chief executive, Daniel Zhang, as saying that “10 to 20 years from now, when we look back, we will realize we were all on the same starting line.” That is almost certainly true, and from a similar perspective we may well see how regulation throttled innovation — or perhaps how it prevented a stagnating monopoly, or protected people from the well-organized mass theft of their data.
The draft rules are open for comment (by parties in China, obviously) for the next month, after which they may or may not be revised, and are slated to take effect later this year.
Source: TechCrunch