The Era of Generative AI: What is the Future for Intellectual Property?

Paulina Mrozik and Natalia Dulkowska of PwC Legal Poland discuss the impact of generative AI on the area of intellectual property.

Published on 15 August 2023
Paulina Komorowska-Mrozik, PwC Poland, Expert Focus contributor
Natalia Dulkowska, PwC Poland, Expert Focus contributor

In recent months, hundreds of millions of users have tested generative AI systems, which are based on machine learning and require human input (a prompt) to generate new text, graphics or audiovisual output.

The increasing use of generative AI systems, like that of any new technology, unlocks a wide spectrum of possibilities but also raises a wave of legal questions. We can already assume that one of the main technological challenges in 2023 is the “collision” of emerging generative AI systems with intellectual property law. This challenge affects businesses that develop and offer AI-based solutions, as well as creators and those who plan to use such systems and their results in their operations.

Training Data: Use of Copyright-Protected Works for Machine Learning

The models on which generative AI systems are based are usually created by analysing huge amounts of data, which are very often scraped from the web and automatically processed. Where the training data set of a given model includes subject matter protected by intellectual property rights, such as copyrighted works, the use of that protected data to train AI requires the authorisation of the right holder, unless the use falls under a statutory exception to exclusive rights.

In European law, the text and data mining (TDM) exception may be applicable under the provisions of the Copyright in the Digital Single Market (CDSM) Directive. This exception is designed, among other aims, to enable machine learning and the development of artificial intelligence through the reproduction of databases or copyrighted works, both for scientific research and for commercial purposes.

"As a general rule, the use of protected data to train AI requires the authorisation of the right holder."

However, there are two important restrictions on the exception for commercial purposes. First, the extracted data may be stored only for as long as is necessary for the purposes of text and data mining. Second, right holders can prohibit TDM, potentially blocking AI development: commercial entities can rely on the TDM exception only if the right holder has not exercised its opt-out right. This could have effects similar to those of the previous opt-in rule. The catalogue of data that can serve as learning material would be substantially limited, which could lead to the development of algorithms trained on biased data, or to the import of algorithms trained on non-verified data in more lenient jurisdictions.

Furthermore, not only has the CDSM Directive not yet been implemented in many member states, but no standards have yet been developed on how to reserve rights against TDM (ie, to prohibit it) in a machine-readable manner. This lack of legal certainty currently makes it difficult to set a clear framework for the legitimate use of AI solutions in Europe.

Outputs: the Concept of “Work” Under EU Law

The question of whether computer-generated results can be copyrighted dates back to the 1960s; the issue is not new, but the specific nature of AI technology further complicates it. When seeking answers to these non-obvious questions, it is worth looking at the pillars of copyright law.

First: the term “work” and the concept of “authorship”. None of the European directives harmonises these concepts. The basic guidelines are provided by the Berne Convention, which adopts the general requirement that works should be created within the “literary, scientific and artistic domain” (Article 2(1)), and by EU case law, particularly the requirement of “the author's own intellectual creation” (Infopaq) and “the expression of the author's free and creative choices” (Painer). Thus, copyright protection can only be granted to a work whose creation process was under the control of a human being. When a machine plays a role in the development of such a work, it is difficult to unambiguously assign authorship to a human being and to acknowledge the existence of a work. It is therefore crucial to verify whether the extent of human contribution is sufficient to grant copyright protection to a given result of the AI system.

Further prerequisites for copyright protection include the requirements of originality and of the author's own expression (Levola). The emphasis is on the creative process itself and not just on the result (Infopaq); for example, a creative, original prompt entered into an AI system could be of significance. In addition, creators making “free and creative choices” should express their personality in the outcome (Painer). With regard to computer-generated outputs, a particularly significant question is to what extent the original features of a work must stem from a human contribution, and to what extent the author may be “assisted” by external elements such as an AI solution. As we have indicated, this question is not entirely new, but the specific nature of AI solutions casts it in a new light.

AI Act Proposal: Generative AI-related Amendments

The widespread popularity of generative AI systems in Europe is affecting the legislative process of the proposed Artificial Intelligence Act. EU lawmakers are planning to require providers of generative AI solutions to “document and make publicly available a sufficiently detailed summary of the use of training data protected under copyright law”. While this provision could make it possible for copyright holders to exercise their opt-out rights, it raises a number of practical concerns. These relate to how the use of copyrighted training data should be summarised and to which methodology should be adopted to identify and assess what data constitutes copyrighted works, given the absence of a registration requirement for copyrighted works in EU member states and the lack of full harmonisation in the area of copyright law.

Although the disclosure obligation aims to enable copyright holders to act against unlawful use, the unclear wording of the provision may lead to an increased risk of unjustified claims and a lack of transparency for copyright holders. Moreover, the proposed provision is likely to affect the interests of providers of generative AI systems by imposing obligations that are difficult to fulfil and by exposing their own know-how to disclosure.
