Navigating the Legal Requirements of Generative Artificial Intelligence in China

Kevin Duan and Kemeng Cai of Han Kun Law Offices in Beijing discuss the regulation of generative AI in China.

Published on 15 August 2023

China is actively regulating generative artificial intelligence (AI) services. Regulators aim to contain potential harms such as misinformation, algorithmic discrimination, and personal data and copyright infringement, while also implementing policies to create a favourable atmosphere for AI-driven innovation, scientific and technological advancement, and the research, development and application of AI in various sectors.

Governing Framework for Generative AI

Generative AI services are subject to dedicated AI and algorithm regulations, as well as generally applicable regulations on content safety, data privacy and security, and consumer protection.

In response to the popularity of generative AI services like ChatGPT and Midjourney, and their Chinese counterparts, the Cyberspace Administration of China (CAC) issued the Interim Measures for Generative Artificial Intelligence Services on 13 July 2023 (“Interim Measures”). Generative AI may also be governed by the regulations on “deep fake” services and algorithm-based information services. The latter mainly include algorithm-based services for content generation and synthesis, personalised pushing, sorting and selection, search and filtering, dispatch and decision-making, etc.

China is also preparing its first dedicated and comprehensive law on artificial intelligence, which has been listed in the State Council's 2023 legislative agenda.

Apart from these special AI regulations, generative AI may also be subject to the laws and regulations on data privacy and security, copyright, consumer protection, content filtering, etc.

Definition and Applicable Scope

The Interim Measures apply to the utilisation of generative AI technology to provide services for generating text, images, audio, video and other content to the public within the territory of the People’s Republic of China, including the provision of generative AI through application programming interfaces. Generative AI technology refers to models and related techniques that have the capability to generate content such as text, images, audio and video. Although the definition seems quite broad and may capture a wide range of AI technologies, most commentators believe the Interim Measures mainly target GPT-like services based on deep learning.
The Interim Measures clearly exclude from their scope of application the research, development and application of generative AI technologies that do not provide services to the domestic public. This greatly reduces the compliance burden at the model development stage and alleviates the compliance concerns of many enterprises when accessing generative AI services for internal application purposes, such as improving work efficiency.

Inclusive and Prudent, Categorised and Classified Regulatory Approach

The Interim Measures call for an “inclusive and prudent, categorised and classified” regulation of generative AI services. The classified approach may be inspired by the European Union's draft Artificial Intelligence Act.

The Interim Measures also emphasise co-ordination among regulators. Each ministry or department can formulate more detailed rules on generative AI based on the Interim Measures and set more stringent specifications for certain industries, application scenarios or high-risk services.

Requirements for Data Processing Activities

The Interim Measures set forth a series of requirements on the collection, annotation and optimisation of training data, which mainly include the following.

  • Service providers must use data and base models from legal sources. The development of generative AI mainly relies on scraping online publicly available data and, according to court cases, the lawfulness of data scraping mainly depends on:
    • the nature of the data: whether it involves personal data, trade secrets, copyrighted works or other specially protected data;
    • the methods of scraping: whether they break through or circumvent the technical protection measures of the target websites, and whether they violate the target websites’ terms of use or robots protocols; and
    • the impact on the target websites: whether the generative AI is used in services competing with the target websites, whether the scraped data is used for competing purposes, and whether the scraping overloads the target websites’ traffic.
  • Service providers shall not infringe others’ copyright if the training data involves copyrighted works. Model training does not fall within the 12 situations of fair use expressly listed in the Copyright Law, and it remains controversial whether model training is captured by the law’s catch-all provision on “fair use”.
  • If the training data involves personal data, service providers must obtain the data subjects’ consent or rely on another legal basis stipulated by law. However, it is unclear whether using publicly available personal data for model training can pass the “compatibility test” for secondary use, or whether fresh consent must be obtained from data subjects.
  • Service providers should take effective measures to improve the authenticity, accuracy, objectivity and diversity of training data.
  • Providers shall:
    • establish clear, specific and actionable annotation rules and conduct quality evaluations of data annotation;
    • sample and verify the accuracy of the annotation content; and
    • provide necessary training to annotation personnel, enhance their compliance awareness, and supervise and guide them to carry out annotation work in a standardised manner.

Content Moderation

Content safety is a critical concern for regulators. The Interim Measures set forth the following content moderation requirements:

  • not to generate content that endangers the party-state regime, incites separatism, undermines national unity and social stability, promotes terrorism and extremism, propagates ethnic hatred and discrimination, or contains violence, pornography, or false and harmful information;
  • based on the characteristics of the service types, take effective measures to enhance the transparency of generative AI services and improve the accuracy and reliability of generated content; and
  • establish a mechanism for reporting complaints, deal with illegal information in a timely manner, and rectify by measures such as optimising model training.

Algorithm Filing and Security Assessment

Generative AI services with “public opinion attributes or social mobilisation capabilities” are required to complete algorithm filing and security assessment with CAC. Under PRC regulations, the term may capture online forums, blogs or microblogs, social media, instant messaging services or even all types of consumer-facing information services. In practice, almost all leading generative AI service providers are required by CAC to conduct algorithm filing and security assessments.

Algorithm filing applies not only to generative AI services, but also to other algorithm information services such as algorithm-based personalised pushing, sorting and selection, search and filtering, dispatch and decision-making. In algorithm filing, the service provider needs to explain the following issues to CAC:

  • basic information of the algorithm, including the algorithm flow chart, algorithm training data, algorithm model, intervention strategies, result identification, etc;
  • basic information of the algorithm services, including its service functions, user interfaces, targeted users and algorithm usage;
  • risk assessment; and
  • risk prevention measures.
Security assessment is more comprehensive and shall cover content filtering and moderation, data privacy and data security, account management, and internal governance systems for data and content security.

Other Compliance Requirements

Generative AI service providers also have the following obligations.

  • Personal data protection: service providers shall not collect unnecessary personal data, shall not unlawfully retain user inputs or usage records that can identify users or unlawfully provide user inputs to others, and shall establish mechanisms to respond to data subjects’ rights requests.
  • Labelling generated content: service providers shall clearly label images or videos generated by the generative AI services that may cause confusion.
  • Transparency: service providers shall take effective measures to improve algorithmic transparency. In practice, it is advisable for providers to disclose the basic principles, objectives, intentions and main operational mechanisms of the generative AI algorithm.

Extraterritorial Effect and Foreign Investment Restrictions

The Interim Measures stipulate that if overseas providers offer generative AI services to domestic users without complying with the regulations, CAC will notify the relevant agencies to take technical measures to cut off access to those overseas services, giving the Interim Measures de facto extraterritorial effect. The Interim Measures do not themselves restrict foreign investment in generative AI services. However, foreign investment restrictions will apply if generative AI is used to provide services subject to such restrictions (eg, internet information services, audiovisual programme services or internet cultural operations).

Impact and Outlook

Overall, the Interim Measures take a prudent and inclusive approach and moderately relax the compliance requirements for generative AI across the research and development, model training, fine-tuning and application stages, signalling a policy orientation that encourages the development and application of new technologies. At the level of specific rules, however, the requirements on training data compliance, the security and accuracy of generated content, and transparency will require enterprises to devise creative solutions that combine technology and legal compliance, so as to alleviate regulators' security concerns and win more institutional space for industrial development.

