
GPAI Provider or Deployer? A comprehensive breakdown of your roles and obligations under the EU AI Act

Define your role under the EU AI Act, learn where provider duties start after fine-tuning, and publish a safe Public Training Data Summary without leaking IP. Finish with a 6-question role test and leave with a practical checklist.

Important Note: This is a practical guide, not a legal document. It is intended to frame engineering-driven conversations. Before implementing any practices from this guide, please read our Educational Content Disclaimer and always consult with your legal counsel.

The New Reality of GPAI Regulation under the EU AI Act

With the EU AI Act entering into force, vague principles turn into operational requirements — and the luxury of “we’ll figure it out later” disappears.
The clarifications on General-Purpose AI (GPAI) have set a new routine: what public minimum of transparency is required, how to meet it without harming intellectual property, and where the line lies beyond which fine-tuning makes you a provider with obligations of your own.

💡 Even if you’re a small team with no EU office, if users in the EU can access your product, that’s de facto placement on the EU market.
The “cascade” of provider obligations flows downstream: to fulfill your duties as a Deployer, you need information about the base model and clearly defined usage modes. Otherwise, your product can be scrutinized during supervision, regardless of the base model’s status.

There is also opportunity here: these new approaches set behavioral norms across the ecosystem. You can and should use the obligations of large-model developers as leverage in your supply chain and contracts.

We look at the topic through the two roles defined in the Act:

Provider

Whoever places a model on the market or substantially modifies it.

Deployer

Whoever uses third-party AI models in their systems (via API, weights, or platforms).

This perspective helps you not just comply, but win:
— lay a foundation for public transparency without exposing unnecessary IP;
— decide what to publish and what to keep in your internal evidence pack;
— draw contractual boundaries of duties and liability;
— turn requirements into a negotiation tool with providers and regulators.

📘 In this article we will

  • Unpack “grey areas” and common mistakes even seasoned teams make;
  • Define GPAI precisely and who is in scope;
  • Show the “domino effect” — how provider obligations propagate across the chain;
  • Run the 2-Minute Self-Diagnosis — find out whether you’re a Provider or a Deployer;
  • Turn obligations into advantage — how to use the new rules in contracts and processes.

4 Costly GPAI Compliance Mistakes Under the EU AI Act (and How to Avoid Them)

In our practice, we see two extremes when it comes to GPAI regulation: some teams think the EU AI Act does not apply to them, while others over-engineer compliance processes, incurring unnecessary costs and leaking IP. Both stem from misinterpreting key “grey areas” of the legislation.

Mistake #1: “Our model is too niche; this isn’t GPAI.”

Myth: “We built an AI model for a narrow task—a legal assistant or an HR bot. That’s not general-purpose AI, so GPAI requirements don’t apply to us.”
Reality: Under the EU AI Act, “general purpose” is interpreted by technical capabilities, not your marketing niche. If your “legal assistant” can answer questions, summarize documents, and generate text—that’s a heterogeneous task set. By the regulator’s definition, it’s GPAI, even if you sell only to lawyers.

Mistake #2: “We’re not developers; we just fine-tune open source.”

Myth: “We took Llama or Mistral and fine-tuned on our data. The original developer bears responsibility.”
Reality: The AI Office guidelines introduce the concept of “substantial modification.” If your fine-tuning materially changes the model’s behavior or risk profile (indicative threshold: fine-tuning compute exceeding one-third of the base model’s pretraining compute), you legally become the Provider of the modified version, with corresponding obligations within the scope of your changes. Publishing weights under an open-source license does not exempt you from these baseline duties.

Mistake #3: “We’re a Deployer. Liability sits with the API vendor.”

Myth: “We use OpenAI or Google APIs. These giants surely comply. We don’t need to do anything.”
Reality: The Act is built on the principle of chain-of-responsibility. As a Deployer of a GPAI model, you must exercise due diligence. This means requesting and retaining the vendor’s complete documentation package (Public Training Data Summary, Model Card, etc., as required by Annex XII of the AI Act). If the vendor cannot provide it, using their model becomes a significant compliance risk for your product. In an investigation, the regulator will come to you first.

Mistake #4: “To comply, we must publish everything about training.”

Myth: “Transparency requirements will kill our IP by forcing us to disclose all data sources and parameters.”
Reality: That’s an extreme. The regulator aims for balance under the principle: “Sufficient to understand; insufficient to replicate.” You must publish information at an aggregated, categorical level. All specifics, full lists, and logs remain in your internal Evidence Pack, presented only upon regulator request, often under NDA.

Correctly identifying your role is the first step to avoiding these mistakes. Not sure where your boundary of responsibility lies? Take our express test to determine your role (Provider or Deployer) in 2 minutes and get a starter obligations checklist.

How to Classify Your AI Model: Official GPAI Criteria under the EU AI Act

While the EU AI Act Articles 53–55 provide the legal foundation, your day-to-day GPAI compliance workflow is defined by three critical documents from the AI Office. Understanding them is key to correctly classifying your model and determining your obligations.

Public Training Data Summary Template

This mandatory template dictates the format and level of detail for your public disclosures on training data. It defines the very boundaries of your transparency requirements.

GPAI Code of Practice

A voluntary but regulator-endorsed standard. Adhering to it is considered an “adequate path” to demonstrate compliance with copyright, transparency, and AI-safety rules, significantly reducing regulatory uncertainty.

AI Office Guidelines on Provider Obligations

This official interpretation guide clarifies who counts as an AI Provider, what constitutes a substantial modification, and how placing on the market is determined under the EU AI Act.

Together, these documents translate abstract legal requirements into a practical operational workflow. Next, let’s look at who falls under these rules and by what criteria.

Determining Your Role under the EU AI Act: Provider vs. Deployer

⚖️ Two regulatory roles define your obligations under the EU AI Act.

GPAI Provider

Definition: An organization that develops a GPAI model and places it on the EU market (via API, publishing model weights, or integration into SaaS) or substantially modifies a third-party model.


Key Consideration: The rules are extraterritorial. Providing access to users in the EU equals placement on the market — regardless of your company’s location.

GPAI Deployer

Definition: An organization that integrates a third-party GPAI model into its own product or internal systems (e.g., building RAG systems on foundation models like OpenAI’s or using models via cloud platforms like AWS Bedrock).


Key Consideration: A Deployer is not absolved of responsibility merely because the model is “someone else’s.” In an investigation, the regulator will come to you as the supplier of the final service on the EU market, making vendor due diligence a critical obligation.
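
To make the split concrete before the formal criteria, here is a minimal first-pass triage sketch in Python; the function name and its yes/no inputs are our illustrative assumptions, not terms from the Act.

```python
# Minimal first-pass triage; names and logic are illustrative
# assumptions for this article, not the Act's legal tests.

def first_pass_role(places_model_on_eu_market: bool,
                    develops_or_substantially_modifies: bool,
                    integrates_third_party_model: bool) -> str:
    """Rough Provider/Deployer triage based on the two definitions above."""
    if places_model_on_eu_market and develops_or_substantially_modifies:
        return "Likely GPAI Provider: check the 4 official criteria below"
    if integrates_third_party_model:
        return "Likely Deployer: vendor due diligence is your core duty"
    return "Unclear: review placement-on-market and modification facts"

# Example: a team that fine-tunes an open model and exposes it via API
print(first_pass_role(True, True, False))
```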

The 4 Official Criteria for Classifying a GPAI Model

For an AI model to fall under the GPAI-specific requirements of the EU AI Act, it must meet all four of the following criteria simultaneously:

General-Purpose Capability

The model must be able to perform a broad range of heterogeneous tasks (e.g., summarization, text generation, classification). This is a test of its technical capabilities, not its marketed use case.

Generative Ability

The model must generate new content (such as text, code, or images), not merely analyze or classify existing data.

Availability to Third Parties (Placing on the Market)

The model must be made available for use outside your own company. Purely internal use for your own business purposes is not considered placement on the market.

Training Scale (Training Compute Threshold)

The computational cost (training compute) used for its training must exceed 10²³ FLOPs.

Systemic Risk Model: At a threshold of ≥10²⁵ FLOPs, the model is classified as a systemic-risk model and is subject to stricter obligations under the AI Act.
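
As a quick illustration of the compute criterion, here is a hedged sketch; the FLOP figure you pass in is your own engineering estimate, and the thresholds are the indicative values cited above.

```python
# Sketch of the training-compute criterion; thresholds are the
# indicative values cited above, and the input is your own estimate.

GPAI_THRESHOLD_FLOPS = 1e23      # indicative GPAI threshold
SYSTEMIC_THRESHOLD_FLOPS = 1e25  # systemic-risk threshold

def compute_tier(training_flops: float) -> str:
    """Map an estimated training-compute figure to the Act's tiers."""
    if training_flops >= SYSTEMIC_THRESHOLD_FLOPS:
        return "systemic-risk GPAI (strictest obligations)"
    if training_flops > GPAI_THRESHOLD_FLOPS:
        return "GPAI-scale (check the other three criteria)"
    return "below the indicative GPAI compute threshold"

print(compute_tier(3e24))  # -> GPAI-scale (check the other three criteria)
```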

Two Critical Nuances Often Overlooked

👉 When does Fine-Tuning Make You a Provider?

If you substantially adapt a third-party GPAI model, you become the Provider of that modified version. The AI Office provides an indicative threshold for “substantiality”: fine-tuning compute exceeding one-third (1/3) of the base model’s original pretraining compute.
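
A back-of-the-envelope check of that one-third rule might look like the sketch below, assuming you can estimate both compute figures:

```python
# Back-of-the-envelope check of the indicative one-third rule; both
# FLOP figures are engineering estimates, not audited numbers.

def is_substantial_modification(finetune_flops: float,
                                base_pretraining_flops: float) -> bool:
    """True if fine-tuning compute exceeds 1/3 of pretraining compute."""
    return finetune_flops > base_pretraining_flops / 3

# Example: a 2e22-FLOP fine-tune of a 5e22-FLOP base model
print(is_substantial_modification(2e22, 5e22))  # -> True (2e22 > ~1.7e22)
```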

👉 Does an Open-Source License Exempt You from GPAI Obligations?

The short answer is no. Publishing model weights under an open-source license may grant some procedural reliefs, but the baseline transparency and copyright requirements remain:

  • Preparing a Public Training Data Summary;
  • Ensuring copyright compliance (including respecting TDM opt-out signals);
  • Establishing a rights-holder complaint procedure.

For systemic-risk models (≥10²⁵ FLOPs), there are no exceptions regardless of the license.

An In-Depth Overview of AI Office Requirements

If you place a GPAI model on the EU market, the AI Office requires you to provide a public summary of its training data — officially known as the Public Training Data Summary. This document is the cornerstone of the AI Act’s transparency requirements. Its purpose is to allow downstream developers (Deployers), regulators, and the public to understand what data the model was trained on, without forcing you to reveal trade secrets.

Understanding these documentation requirements is mission-critical for two key roles in the AI value chain:

📎 GPAI Providers

Must prepare accurate, compliant, and IP-protecting documentation that fulfills Annex XII obligations and maintains the principle of proportional transparency.

🧩 Deployers (Product Teams)

Must request this document from vendors and conduct a thorough assessment to ensure their own downstream compliance before making the system publicly accessible.

Sufficient to understand; insufficient to replicate.

Below is a neutral overview of the main content requirements for a Public Training Data Summary as formulated by the AI Office.

Important Notice

Please note: This section describes what the regulator requires. A practical “how-to” implementation guide, complete with safe wording examples, is available as a downloadable template at the end of this block.

(Disclaimer: This content is for educational purposes only and does not constitute legal advice.)

What the Regulator Expects: A Section-by-Section Breakdown

Section 1: General Information

  • The model provider’s identity and contact information; if the provider is outside the EU, the authorized representative.
  • The model’s name and version; the linkage to the base model in case of modification (with a link to its summary, if possible).
  • The knowledge cut-off date (month/year); whether continuous training is used.
  • The modalities and main languages of the corpus; an indicative composition by modality (at a categorical level).
Keep in Evidence Pack: Exact hyperparameters, compute/eval logs, model architecture, and detailed data distributions/shares.
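
One way to keep the public/internal boundary explicit is to maintain the Section 1 facts as structured data. The scaffold below is hypothetical: the field names are ours, not the official template’s.

```python
# Hypothetical scaffold for the Section 1 public facts; field names
# are ours, not the official template's. Only category-level data
# belongs here; everything granular stays in the Evidence Pack.

from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Section1Public:
    provider: str                          # identity + contact (or EU rep)
    model_name: str
    model_version: str
    base_model_summary_url: Optional[str]  # link if this is a modification
    knowledge_cutoff: str                  # month/year granularity only
    continuous_training: bool
    modalities: list[str] = field(default_factory=list)
    main_languages: list[str] = field(default_factory=list)

summary = Section1Public(
    provider="ExampleAI GmbH, compliance@example.com",
    model_name="example-model",
    model_version="1.2",
    base_model_summary_url=None,
    knowledge_cutoff="2025-03",
    continuous_training=False,
    modalities=["text"],
    main_languages=["en", "de", "fr"],
)
# Hyperparameters, eval logs, and exact data shares do not belong in
# this public object; they stay in the internal Evidence Pack.
```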

Section 2: Data Sources

Publish only categories and aggregates; detailed source lists remain internal.

  • 2.1 Publicly available datasets: Briefly describe significant public corpora and modalities, with the clear caveat that “publicly available ≠ copyright-free.”
  • 2.2 Licensed/purchased data: Disclose classes of content and the legal basis (e.g., license, partnership), without revealing partner names or commercial terms.
  • 2.3 Web data (crawling): Specify the time period, types of sites (e.g., encyclopedias, tech docs), and your TDM opt-out implementation (robots.txt, X-Robots-Tag, meta; a minimal sketch follows this list). Top domain shares must be reported in aggregate form (no URLs).
  • 2.4 User data: State whether it is used and on what legal basis (e.g., opt-in, terms, consent), from which products, general privacy/PII-removal measures, and the withdrawal mechanism.
  • 2.5 Synthetic data: State whether it is used, the purpose (e.g., augmentation, rare cases, safety), the general class of generators, and a brief description of safeguards.
  • 2.6 Other channels: A short description of data acquisition methods (e.g., data partnerships, donations) and their legal basis.
Keep in Evidence Pack: Complete lists of datasets/domains/URLs, data versions and shares, contract texts, and full filtering pipelines and metrics.
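
To illustrate what a TDM opt-out implementation can look like inside a crawler, here is a minimal standard-library sketch; the crawler name is hypothetical, and the exact signal set (X-Robots-Tag tokens, the tdm-reservation header) should be verified against the protocols your policy commits to.

```python
# Minimal sketch of honoring TDM opt-out signals before ingesting a
# page. The crawler name is hypothetical; verify the exact signal set
# against the protocols your policy commits to.

from urllib import robotparser, request

USER_AGENT = "ExampleTrainingBot"  # hypothetical crawler identity

def may_ingest(url: str, robots_url: str) -> bool:
    # 1) robots.txt: the baseline crawl opt-out channel.
    rp = robotparser.RobotFileParser()
    rp.set_url(robots_url)
    rp.read()
    if not rp.can_fetch(USER_AGENT, url):
        return False

    # 2) Page-level headers: X-Robots-Tag and the TDM-Rep header.
    req = request.Request(url, method="HEAD",
                          headers={"User-Agent": USER_AGENT})
    with request.urlopen(req) as resp:
        x_robots = resp.headers.get("X-Robots-Tag", "").lower()
        tdm_reservation = resp.headers.get("tdm-reservation", "").strip()
    if "noai" in x_robots or tdm_reservation == "1":
        return False

    # A fuller pipeline would also parse <meta name="robots"> tags in
    # the HTML body, as Section 2.3 mentions.
    return True
```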

Section 3: Processing and Rights Compliance

  • How machine-readable TDM opt-out signals and other protocols are honored (respect for rights reservations).
  • How the rights-holder complaint procedure works, including contact information and reasonable response timelines (SLA).
  • High-level filtering principles (which categories are excluded) and the combined approach (automated + sample manual checks), without publishing specific thresholds or blocklists.
Keep in Evidence Pack: Specific moderation thresholds/models, red-team reports, anti-adversarial measures, and detailed logs.

Key Principles for Compliant Disclosure

  • Public Disclosure: Focus on categories, principles, and aggregates — no specific URLs, thresholds, or “recipes.”
  • Copyright Compliance: Describe your approach neutrally (e.g., “lawful access,” “honoring TDM signals”) without revealing internal operational details.
  • Versioning: Maintain a public URL for the summary near the model card, including the version number and a brief changelog (one possible shape is sketched after this list). Sync updates with your release cycles.
  • The Evidence Pack Boundary: All granular details and logs remain internal. They are provided to regulators only upon formal request, often under an NDA.
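
For the versioning principle, one lightweight convention is to publish machine-readable version metadata next to the model card; the shape below is our suggestion, not a prescribed format.

```python
# One possible shape for versioned summary metadata published next to
# the model card; the structure is our convention, not an official one.

import json

summary_meta = {
    "summary_url": "https://example.com/models/example-model/data-summary",
    "version": "1.2",          # bumped in sync with model releases
    "updated": "2025-09-01",
    "changelog": [
        "1.2: added licensed image corpora category (Section 2.2)",
        "1.1: clarified TDM opt-out description (Section 2.3)",
        "1.0: initial publication",
    ],
}
print(json.dumps(summary_meta, indent=2))
```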

From AI Act Requirements to a Working Compliance Process

Now you know what the AI Office requires. The key question is how to deliver it without leaking IP, wasting weeks of engineering time, or risking fines under the EU AI Act.

We propose three practical levels to turn compliance from a burden into your competitive advantage.

Quick Self-Diagnosis

Your Role & Obligations

Not sure about your role under the AI Act? Start with our 2-minute “Provider vs Deployer” test and get a personalized checklist of compliance actions relevant to you.

Take the Express Test

Practical Documentation Toolkit

The Template

Need a ready-made document scaffold? Download our guided Public Training Data Summary Template with safe wording examples that protect your IP and save hours of work.

Download Template

Full Process Implementation

The Mini-Course

Want to go beyond the template? Join our mini-course and learn how to build your end-to-end GPAI compliance process — from Evidence Pack setup to continuous audit integration.

Join the Mini-Course

The New Industry Benchmark: How the GPAI Code of Practice Sets the Norm for All AI Systems

Beyond the hard legal requirements of the EU AI Act, the regulator has offered the market a compass — the Code of Practice for GPAI. While formally voluntary, in substance it’s the “gold standard” for mature and responsible AI development.

For GPAI Providers: It’s a regulator-recognized “safe path” to demonstrate compliance, simplifying dialogue with authorities and lowering regulatory risk.
For All Other AI Builders: You can use these principles as an internal quality checklist for any AI system, even if it’s not a GPAI. This boosts user trust and future-proofs your business for upcoming regulation.

Think of the Code as three layers of AI product maturity.

1. Enhanced Transparency

What it is: Clear AI model documentation that explains what the model can and cannot do. This includes the Public Training Data Summary, a Model Card detailing its limitations, and clear release notes.

Benefit for everyone: This isn’t just about AI Act compliance — this is the foundation of user trust. When clients better understand your product, it reduces misuse and increases loyalty.

2. Respect for Data (Copyright & Data Governance)

What it is: A clear data sourcing policy, honoring site-level “do not text-mine” signals (TDM opt-out), and a fast rights-holder complaint procedure.

Benefit for everyone: This is digital hygiene. Embedding these data governance practices into any AI system reduces legal and reputational risk, making your product more resilient long-term.

3. Safety and Reliability (AI Safety by Design)

What it is: Established processes to discover and mitigate risks, including stress tests, red teaming, and an incident-response plan.

Benefit for everyone: This is professional engineering. Building Safety by Design into the AI product lifecycle — from an HR bot to a recommender system — directly improves reliability, prevents outages, and protects your brand’s reputation.

Who Is Adopting the Code and Why It Matters to You

The GPAI Code of Practice has been publicly supported by major providers. While it is not a legal waiver, it is a regulator-recognized “adequate path” to show compliance. For your business, it is also a powerful maturity signal to enterprise clients, showing that your AI governance processes, roles, and artifacts are in place and accessible.

GPAI (EU AI Act) Short Timeline

1 August 2024
AI Act enters into force; phased application begins.
10–24 July 2025
GPAI package published — Code of Practice, Guidelines, and the Public Training Data Summary template.
2 August 2025
GPAI obligations apply to models placed on the market on or after this date; models placed earlier enjoy a transition period.
From 2 August 2026
AI Office oversight and interventions on GPAI; full enforcement actions for non-compliance begin.
From 2 August 2027
Requirements apply to GPAI models placed on the market before 2 August 2025.

Note: Dates and wording rely on public materials from the European Commission / AI Office. Verify current status at the time of reading.

2-minute GPAI Role Check

Find out whether you are a GPAI Provider or Deployer under the EU AI Act and get a tailored starter checklist. The test focuses on key factors only and does not collect personal data.

  • Quick self-classification of your role and responsibilities;
  • Clarity on what to publish and what to keep in your Evidence Pack;
  • Instant, role-specific checklist to start today.
Start the test

Answer 6 short questions and click “Show result”. If something is ambiguous, you will get a recommendation to clarify inputs.

Q1 — Where and what is available? (single choice)

Distinguishes placing the model itself on the market (e.g., API or weights) from merely making a product built on it available.

Q2 — General-purpose & generative?

Can the model perform multiple different types of tasks (e.g., text generation, Q&A, translation, code) and generate new content?


Q3 — Training scale (GPAI threshold 10²³ FLOPs)?

Popular foundation models typically exceed this threshold.


Q4 — Origin / Modification

“Substantial” = large fine-tune (≈ ≥⅓ of pretraining compute) or a clear change of intended purpose/risk profile.



Q5 — Integration (if the model is made available inside a product rather than as a model)

Do you integrate a third-party GPAI model into your own product or internal systems?

Q6 — Control over weights/training?

Do you control architecture, weights, or the training process (more than API parameters)?


By clicking “Show result” you agree that an aggregated analytics event may be sent (no personal data).
