Even though the extension of far-reaching obligations to Foundation Models and General Purpose AI, as now envisaged by the two Parliamentary Committees (Civil Liberties, Justice and Home Affairs, and Internal Market and Consumer Protection), is not yet set in stone, a clear trend is emerging as to how two technical concepts that play an important role in AI technology will be regulated.
In particular, far-reaching obligations are envisaged for Foundation Models such as GPT-3, i.e. large language models that provide the technical foundation for the coherent processing of text. Under the Parliament's proposal, it is therefore primarily the underlying technology that is to be regulated — in other words, the rails on which applications such as ChatGPT run. Or, as the compromise amendments put it in the suggested Recital 60e:
"AI systems with specific intended purpose or general purpose AI systems can be an implementation of a foundation model, which means that each foundation model can be reused in countless downstream AI or general purpose AI systems. These models hold growing importance to many downstream applications and systems."
Consequently, the Parliament's draft now legally defines Foundation Models and General Purpose AI systems in Article 3 as follows:
(1c) ‘foundation model’ means an AI model that is trained on broad data at scale, is designed for generality of output, and can be adapted to a wide range of distinctive tasks;
(1d) ‘general purpose AI system’ means an AI system that can be used in and adapted to a wide range of applications for which it was not intentionally and specifically designed;
Further, for the development and marketing of Foundation Models such as GPT-3 or the Google technology PaLM 2, which was only presented at the beginning of May 2023, extremely far-reaching and complex obligations would apply in future, and providers would be required to ensure compliance even before a foundation model is made available on the market or put into service. If the AI Act were already in force in its current version, models such as GPT-3 or PaLM 2 would probably have to be removed from the market immediately. Incidentally, this would not only apply to the Foundation Model itself, because the proposed obligations apply regardless of whether the model is provided on a standalone basis or embedded in an AI system or product.
In more detail, the new suggested Article 28b of the draft regulation requires compliance with the following:
1. Foundation Model providers must be able to demonstrate, through appropriate design, testing, and analysis, that foreseeable risks of the model to key societal assets and fundamental rights (including, amongst others, safety, the environment, and the rule of law) have been identified and mitigated using appropriate methods, not only prior to but also during development. Independent experts can be consulted in that regard. Any risks that nevertheless remain must be documented. In the case of large language models, this could apply, for example, to the potential misuse of the model for the creation of fake news.
2. Furthermore, only data sets that have been subject to data governance measures appropriate for Foundation Models may be processed and incorporated. This relates in particular to measures examining the suitability of data sources and possible biases, as well as appropriate measures to mitigate these bias issues.
3. Foundation Models would also need to be developed to ensure adequate performance, predictability and interpretability. In addition, (cyber)security must be ensured and compliance with all parameters must be documented and ensured through comprehensive testing during design and development and with the involvement of independent experts.
4. The European Parliament is also envisioning extensive commitments with regard to sustainable programming in the course of the development of Foundation Models:
Applicable standards for reducing energy consumption, resource use and waste must be applied, along with requirements for energy efficiency, as soon as the EU Commission has published the relevant standards for this purpose. These standards must also enable the measurement and recording of the application's energy and resource consumption and any other environmental impacts the use of a given system may have.
5. In addition, comprehensive technical documentation and intelligible instructions for use must be drawn up to enable downstream providers to meet their own transparency obligations.
6. Finally, there is the obligation to establish a quality management system to ensure and document compliance with all specifications.
Another important regulatory step is the planned obligation to register Foundation Models in the new EU database. The information to be provided for registration will be comprehensively specified in a separate annex to the AI Regulation (Annex VIII).
Further commitments and definition of Generative AI
Providers of Foundation Models used in "generative AI" will have to comply with further obligations. The draft defines such AI as "foundation models used in AI systems specifically intended to generate, with varying levels of autonomy, content such as complex text, images, audio, or video".
Providers of such technologies must not only comply with the general transparency obligations provided for under the AI Act, but must also train, design and develop the Foundation Model in such a way as to ensure adequate safeguards against the generation of content that violates EU law.
In addition, a sufficiently detailed summary of the use of training data protected under copyright law has to be published. This provision evidently addresses the concerns of rightsholders that the use of copyright-protected works to train generative AI models might constitute copyright infringement, although such use may also fall under the existing text and data mining exceptions under EU copyright law.
For the developers of Foundation Models and General Purpose AI, the compromise amendments of the European Parliament result overall in extremely extensive documentation obligations, although both the relevant legal definitions and the obligations themselves are rather vague in the proposed text. What is an "appropriate" data governance model governing the use of biased training data in order to counter the problem of bias? What constitutes a quality management system that meets the requirements of the regulation? How exactly must providers demonstrate, in a compliant manner, that possible risks to European fundamental rights have been examined and reduced?

These questions, and many further legal uncertainties, raise the question of the enforceability of such extensive obligations. The requirements leave so much leeway and scope for interpretation that it seems challenging to determine whether the obligations have been violated — a prerequisite for imposing potential fines on providers. Consequently, the suggested amendments to regulate Foundation Models and General Purpose AI Systems could lead to considerable legal uncertainty. Once the European Parliament has adopted this proposal, it will be up to the Trilogue with the European Commission and the Council to further discuss and negotiate whether this approach might cause more damage than benefit for the European tech industry and the power of AI innovation in the European market.