Practical playbook for GPAI deployers
Updated: 24 Oct 2025 • ~20 min read

LLM Vendor Due Diligence under the EU AI Act: A Hands-On Guide for GPAI Deployers

This isn’t another legal analysis. It’s a practical playbook for deployers—those accountable for a third-party model’s outcomes within their specific context. We treat compliance as an engineering problem: how to effectively request the right documents from your vendor, how to assess risks for your use-case, and which technical safeguards and contract clauses to implement.

Important Note: This is a practical guide, not a legal document. It is intended to frame engineering-driven conversations. Before implementing any practices from this guide, please read our Educational Content Disclaimer and always consult with your legal counsel.


Guide Map: From Data Collection to Implementation

If you’re on this page, you’re a GPAI Deployer (or about to become one)—you integrate third-party models and are accountable for the outcomes.

To help you quickly find what you need, this guide follows our Collect → Assess → Protect framework; each phase has its own section below.

Why This Matters Now: 3 Real-World Facts Impacting Your Product Today

Liability

Responsibility is Shared, Not Shifted

The provider is responsible for the foundational model’s compliance with Art. 53 requirements, but you—the deployer—are accountable for how it’s applied under Art. 25. Integrating a third-party model without your own due diligence doesn’t shift the risk; it concentrates it on you.

Procurement

“We’re Not in the EU” Is Obsolete

Enterprise clients and platforms are already making compliance a procurement gate. Without the right documentation, your product won’t pass vendor security review, blocking major deals.

Stability

Models Evolve Unpredictably

Updates can be frequent and “silent.” Without version pinning and a rollback strategy, a minor vendor tweak can trigger a critical incident on your side.

Step 1: Artifact Collection. Your Due-Diligence Checklist

Your first step is to collect the essential public documentation from your vendor. This is about establishing an engineering baseline for your risk assessment.

  1. Model / Capability Card

    The model’s “passport,” describing its intended purpose, limitations, known failure modes, baseline performance metrics (with dates and datasets), and version number.

    Where to look: docs / research / safety / model card / developer docs

  2. Public Training Data Summary

    An aggregated overview of the training data’s categories, provenance, and time periods (without disclosing IP), including explicit policies on customer data usage for training.

    Where to look: transparency / FAQ / policy / model card / trust center

  3. Changelog / Versioning Policy

    The history of model releases, notifications about breaking changes, support windows (EOL), and official notification channels.

    Where to look: API changelog / release notes / status / blog / developer updates

  4. Copyright & TDM Policies

    The provider’s approach to copyright in two key areas: preventive (for future data) and reactive (for current outputs).

    Where to look: legal / copyright / DMCA / responsible AI / transparency / contact

  5. Downstream Use & Safety Policies

    The “rules of the road” for you as the integrator, detailing prohibited domains, moderation requirements, Human-in-the-Loop (HITL) conditions, and incident escalation procedures.

    Where to look: acceptable use / safety / usage guidelines / developer terms

“OK / Partial / GAP” — a rapid documentation audit


Use the OK / Partial / GAP matrix for a quick audit. The goal is not to “grade the vendor” but to understand where your compensating controls are required and what you must record in your Evidence Pack.

OK / Partial / GAP audit matrix for vendor documentation under the EU AI Act
1. Model / Capability Card
  • OK: Intended purpose; limitations & known failure modes; metrics with dates and datasets; model version and last update date provided. Nothing to add; ideal case.
  • Partial (compensation): Document exists, but metrics/risks are broad; few examples.
    Compensate: Narrow the use case in model_use_note.md; add HITL for sensitive scenarios.
  • GAP (critical): Marketing-style overview with no limitations/risks; no version/dates.
    Compensate: Avoid sensitive domains; increase monitoring and rollback; record the GAP and send a follow-up request to the provider.
  • Evidence to capture: URL, access date, screenshot; vendor-stated version.

2. Public Training Data Summary
  • OK: Aggregated data categories (web/code/books, etc.); overall time period ("up to 2023"); explicit statement that no customer data is used for training. Nothing to add; ideal case.
  • Partial (compensation): Categories not disclosed ("trained on internet data"), but a clear policy states customer data is not used.
    Compensate: Publish a public summary of categories based on provider statements, without proportions.
  • GAP (critical): No info on training data and ambiguity in the ToS about client/personal data.
    Compensate: Exclude sensitive domains; request clarification; enhance logging and alerts.
  • Evidence to capture: Link to policy/transparency page, access date, short quoted excerpt.

3. Changelog / Versioning Policy
  • OK: Public changelog with dates; breaking changes/EOL policy; notification channels (RSS/API/email). Nothing to add; ideal case.
  • Partial (compensation): Changelog is irregular; notifications arrive after the fact.
    Compensate: Enable staged rollout and version pinning; raise monitoring thresholds during updates.
  • GAP (critical): Rolling updates without notification; no guarantees of version support.
    Compensate: Maintain a feature-flag/rollback plan; limit functionality; record the risk for procurement.
  • Evidence to capture: Changelog URL, date of last entry, internal reference in change_log.md.

4a. TDM Reservation Compliance
  • OK: Clear policy stating respect for machine-readable TDM signals (e.g., "we respect robots.txt"). Nothing to add; ideal case.
  • Partial (compensation): Vague policy on "responsible data sourcing" with no specific mention of TDM signals.
    Compensate: Add transparency in your public_summary.md about copyright risk awareness.
  • GAP (critical): No policy, or the provider is known to ignore TDM signals.
    Compensate: Exclude high-sensitivity domains; document the GAP; consider alternative providers for copyright-sensitive use cases.
  • Evidence to capture: Link to policy; screenshot of key sentences.

4b. Copyright Complaints Channel
  • OK: Dedicated form/email for copyright/DMCA claims, with stated response times or process. Nothing to add; ideal case.
  • Partial (compensation): A general "contact us" or support email is the only channel.
    Compensate: Create YOUR own public complaints channel; document triage in methods.md.
  • GAP (critical): No defined mechanism for copyright complaints.
    Compensate: Mandatory own complaints process; log all incidents; restrict use in high-IP-risk domains.
  • Evidence to capture: Link to form/address; screenshot of the process description.

5. Downstream Use & Safety Policies
  • OK: Prohibited applications listed; practical guidance for moderation, HITL, and escalation. Nothing to add; ideal case.
  • Partial (compensation): Prohibitions and guidance are declarative ("do no harm"), with no thresholds/examples.
    Compensate: Define your own thresholds (filters, HITL conditions) in methods.md and model_use_note.md.
  • GAP (critical): Policy absent or reduced to "comply with laws."
    Compensate: Develop downstream rules and UX disclaimers; restrict domains; educate users.
  • Evidence to capture: Link to the policy; short list of key prohibitions/requirements.

Important note: Even market leaders often land in Partial on some items — and that’s normal. Record sources (links/screenshots/dates) in the Evidence Pack and deliberately introduce compensating measures (HITL, version pinning, staged rollout, monitoring, contractual notifications).
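If you keep the matrix in a spreadsheet, the row structure matters more than the tool. Below is a minimal sketch of one register row, assuming a CSV stand-in for risk_table.xlsx; the field names and the vendor URL are illustrative placeholders, not a prescribed schema.

# Sketch: one risk-register row, using a CSV stand-in for risk_table.xlsx.
# Field names and the vendor URL are illustrative placeholders.
import csv
from datetime import date

FIELDS = ["artifact", "status", "source_url", "access_date",
          "evidence_path", "compensating_control", "priority"]

rows = [
    {
        "artifact": "Model / Capability Card",
        "status": "Partial",  # OK / Partial / GAP
        "source_url": "https://vendor.example.com/docs/model-card",
        "access_date": date.today().isoformat(),
        "evidence_path": "evidence/internal/vendor_documents/model_card.pdf",
        "compensating_control": "Narrow use case in model_use_note.md; HITL for sensitive scenarios",
        "priority": "medium",
    },
]

with open("risk_table.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)

One row per artifact (1, 2, 3, 4a, 4b, 5) keeps the register directly traceable to the matrix above.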

Phase 1: Independent Collection

Objective: To gather all readily available public information. Focus on collecting the low-hanging fruit before drafting any requests.

Check the Standard Sections

Corporate Websites
  • Model Card
    docs / research / safety
  • Data Transparency
    transparency / policy / FAQ
  • Releases/Versions
    API changelog / release notes / status
  • Usage/Safety
    acceptable use / safety / usage guidelines
  • Complaints/TDM
    legal / copyright / TDM opt-out
Open Source Hubs (Llama, Mistral, etc.)
  • GitHub
    README.md / SECURITY.md / LICENSE / Discussions
  • HuggingFace
    Model Card / Files & versions / Community

Record Findings Immediately (Not “Later”)

  • Save the permalink and access date (e.g., 2025-10-26).
  • Take a screenshot of the key section.
  • Place everything in /evidence/internal/vendor_documents/.
  • Mark the status in your risk_table.xlsx: OK / Partial / GAP.
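To make "record findings immediately" a one-command habit, a small capture script can store the page, its access date, and a hash in a single step. A sketch using only the standard library; the URL, file name, and status value are placeholders.

# Sketch: snapshot a public vendor page into the evidence folder with an
# access date and SHA-256 hash. URL, file name, and status are placeholders.
import hashlib
import json
import urllib.request
from datetime import date
from pathlib import Path

URL = "https://vendor.example.com/docs/model-card"
out_dir = Path("evidence/internal/vendor_documents")
out_dir.mkdir(parents=True, exist_ok=True)

body = urllib.request.urlopen(URL, timeout=30).read()
snapshot = out_dir / f"model_card_{date.today().isoformat()}.html"
snapshot.write_bytes(body)

# Record provenance next to the snapshot so the Evidence Pack stays auditable.
meta = {
    "url": URL,
    "access_date": date.today().isoformat(),
    "sha256": hashlib.sha256(body).hexdigest(),
    "status": "Partial",  # OK / Partial / GAP, mirrored in risk_table
}
snapshot.with_suffix(".json").write_text(json.dumps(meta, indent=2))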

Phase 2: A Formal Request (When Public Information Isn’t Enough)

The likelihood that a large provider will craft a bespoke, comprehensive reply for a small team is low. A parallel goal here is to create an audit trail: the email you send and the response you receive (or don’t) become key artifacts proving due diligence and justifying your engineering decisions and compensating controls.

Strategy and Tone

Position your request as part of standard procurement and safe deployment. Ask for public links; do not demand custom documents. Log every message and attachment in your /evidence/internal/vendor_correspondence/ folder.

  • Neutral, professional tone; avoid “you are required by law.”
  • Make it easy to reply: links are fine; no bespoke PDFs needed.
  • State you’re building a downstream transparency & risk package.

Email Structure (5 Logical Blocks)

  1. Context: Who you are, which model/version, and a brief use-case.
  2. Objective: You’re assembling a standard package for downstream transparency and risk assessment.
  3. Request: Five artifacts — Model/Capability Card; Public Training Data Summary; Changelog/Versioning; Complaints & TDM opt-out; Downstream Use & Safety Policies.
  4. Lower the Barrier: Public links are sufficient; no custom documents required.
  5. Call to Action: Desired response timeframe and your contact details.
Pro-Tip: If you’re already a customer, include your account/project IDs and your internal procurement deadline. This greatly speeds up routing and processing.

Follow-Up and Escalation

Common Mistakes to Avoid

  • Don’t cite the law as an ultimatum. Don’t write “you are obliged to provide this.” Frame your request around standard procurement and due-diligence processes.
  • Don’t ask for proprietary secrets. Don’t request specific datasets, their proportions, or model weights.
  • Don’t send your internal “shopping list.” Keep the email to a short, standard list of documents. Company-specific requirements belong to your internal processes, not to the vendor’s inbox.

Handling the Outcomes: An Action Plan

After you reach out to the provider, classify the outcome and act immediately. The goal is to preserve an audit trail, keep your risk_table.xlsx current, and implement compensating controls where needed.

OK / Partial: If you receive public links

  • Mark the status as Partial/OK in risk_table.xlsx (record sources & access dates).
  • Save a PDF or screenshot of key sections for your Evidence Pack.
Partial: If the response is vague

  • Thank them, then narrow your request to the 1–2 most critical missing items.
  • Document the outcome as Partial; accept the closest equivalents (e.g., FAQ or blog posts).
GAP: If you get no response (silence)

Mark it as a GAP in risk_table.xlsx (source: “no reply by [DATE]”), then implement compensating controls:

  • Narrow approved use-cases and/or add mandatory HITL in model_use_note.md.
  • Enable staged rollout & version pinning in methods.md.
  • Raise monitoring thresholds and alerting sensitivity.
  • Document stricter notification & rollback clauses in client agreements.
  • Log everything in /evidence/internal and add an entry to change_log.md.

What Not to Do

  • Don’t cite the law as an ultimatum. Use neutral phrasing, e.g., “a standard documentation set to support our procurement and safe deployment processes.”
  • Don’t ask giants to change their standard ToS. Focus on analyzing their documentation and strengthening your internal processes.
  • Don’t publish their documents or screenshots on your website. Use anonymized mockups for public examples and quote only publicly accessible pages.

Step 2: Documentation Assessment. A 7-Point Engineering Audit

Objective: To quickly classify risks and log them in your risk_table.xlsx with an OK / Partial / GAP status. No provider is perfect; the goal is to identify gaps and implement compensating controls on your side.

Legend: OK = sufficient to start; Partial = needs compensation; GAP = critical / must mitigate.
1. Intended Purpose vs. Your Use-Case

Look for: A clear description of the model’s intended purpose and its boundaries.

  • OK: Statements like "assists with…; not intended for…" with clear examples.
  • Partial: Purpose described too broadly (e.g., "a general-purpose assistant").
  • GAP: Stated purpose directly conflicts with your use-case.
Action
  • Define excluded domains explicitly in model_use_note.md.
  • Specify when HITL is mandatory in model_use_note.md.
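The two actions above map directly onto a routing guard in code. A minimal sketch, assuming excluded domains and HITL triggers are tracked as simple labels; the domain names are invented, the authoritative list stays in model_use_note.md, and a production system would classify requests more robustly.

# Sketch: enforce excluded domains and mandatory HITL before calling the model.
# Domain labels are illustrative; keep the authoritative list in model_use_note.md.
EXCLUDED_DOMAINS = {"medical_diagnosis", "legal_advice", "credit_scoring"}
HITL_REQUIRED = {"employment", "insurance_claims"}

def route_request(domain: str) -> str:
    if domain in EXCLUDED_DOMAINS:
        return "refuse"          # outside the approved use case
    if domain in HITL_REQUIRED:
        return "human_review"    # model may draft, but a human signs off
    return "automated"           # within intended purpose, normal flow

assert route_request("legal_advice") == "refuse"
assert route_request("employment") == "human_review"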
2. Limitations & Prohibitions

Look for: A documented list of known failure modes and explicitly prohibited domains.

  • OK: Specific limitations, conditions, and confidence levels.
  • Partial: Vague phrases like "use responsibly"; prohibitions buried in ToS.
  • GAP: Limitations section is missing.
Action
  • Map vendor limitations → features of your product.
  • Document UX fallbacks & filtering logic in methods.md.
3. Downstream Safety & Use Guidance

Look for: Practical guidance for integrators (moderation, HITL, abuse escalation).

  • OK: Clear steps, example thresholds, or UX constraint examples.
  • Partial: Declarative statements without concrete, actionable guidance.
  • GAP: Guidance is missing or contradicts the Model Card.
Action
  • Define moderation logic and thresholds in methods.md.
  • List HITL conditions (when a human must review) in methods.md.
  • Publish an incident/complaints channel and escalation process.
4. Evaluation & Metrics

Look for: Benchmarks used, dates of evaluation, and identified weak spots.

  • OK: Metrics with dates, dataset names, and commentary on degradation.
  • Partial: Benchmarks mentioned without specific scores or dates.
  • GAP: Only marketing claims ("state-of-the-art"); outdated/irrelevant tests.
Action
  • Establish your baseline performance and run regular spot-checks.
  • Log results in eval_snapshots/ for traceability.
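A lightweight way to implement "baseline plus regular spot-checks" is to write every run as a dated file in eval_snapshots/. A sketch with a stub call_model() and a tiny golden set; both are placeholders for your own harness and pinned model version.

# Sketch: run a spot-check against a small golden set and store a dated
# snapshot under eval_snapshots/. call_model() and the version are stubs.
import json
from datetime import date
from pathlib import Path

GOLDEN_SET = [
    {"prompt": "Summarise the vendor changelog in one sentence.", "must_contain": "summary"},
]

def call_model(prompt: str) -> str:
    return "stub summary"  # replace with your vendor API call (pinned version)

results = []
for case in GOLDEN_SET:
    output = call_model(case["prompt"])
    results.append({"prompt": case["prompt"],
                    "passed": case["must_contain"] in output.lower()})

snapshot = {
    "date": date.today().isoformat(),
    "model_version": "vendor-model-2025-01-15",  # pinned version under test
    "pass_rate": sum(r["passed"] for r in results) / len(results),
    "results": results,
}
Path("eval_snapshots").mkdir(exist_ok=True)
Path(f"eval_snapshots/{snapshot['date']}.json").write_text(json.dumps(snapshot, indent=2))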
5. Changelog & Versioning

Look for: Release history, breaking-change policy, EOL windows, notification channels.

  • OK: Predictable cycles, proactive notifications, advice on version pinning.
  • Partial: Changelog irregular; notifications arrive late.
  • GAP: Silent rolling updates; unclear version support.
Action
  • Enable version pinning for the model/API.
  • Roll out updates via staged deployments.
  • Document a fast rollback plan in methods.md.
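Version pinning and staged rollout reduce to a small piece of configuration plus deterministic traffic bucketing. A sketch, assuming the vendor exposes dated model identifiers; the identifiers below are invented.

# Sketch: pin the model version explicitly and roll a candidate out to a
# small share of traffic. Model identifiers are placeholders.
import hashlib

PINNED_MODEL = "vendor-model-2025-01-15"     # known-good, documented in change_log.md
CANDIDATE_MODEL = "vendor-model-2025-03-01"  # under staged rollout
ROLLOUT_PERCENT = 5                          # raise gradually while monitoring

def model_for(user_id: str) -> str:
    # Deterministic bucketing: the same user always lands in the same bucket.
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return CANDIDATE_MODEL if bucket < ROLLOUT_PERCENT else PINNED_MODEL

print(model_for("user-42"))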
6. Public Training Data Summary

Look for: Data categories, provenance, time periods; explicit “no customer data in training”.

  • OK: Categorical overview; notes on deduplication or TDM.
  • Partial: Vague "trained on internet data" but a clear customer-data policy exists.
  • GAP: No info and ToS ambiguous about customer/personal data.
Action
  • Reflect vendor statements in public_summary.md.
  • For a GAP, strengthen HITL and monitoring processes.
7. Integration Constraints & Usage Limits

Look for: Rate limits, latency expectations, content-filter behavior, pricing risks.

  • OK: Explicit limits with retry/timeout guidance; stable windows.
  • Partial: Limits exist but are undocumented; learned only by use.
  • GAP: Hidden hard filters; unpredictable performance degradation.
Action
  • Implement client-side timeouts/retries.
  • Set alerts for spikes in 429/5xx errors.
  • Prepare a feature-degradation plan for instability.
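The client-side part of the actions above fits in one function: a timeout, bounded retries with backoff, and a hook that feeds your 429/5xx alerting. A standard-library sketch with a placeholder endpoint; swap in your vendor's SDK and a real metrics client.

# Sketch: client-side timeout, bounded retries with backoff, and an error hook
# for 429/5xx alerting. The endpoint is a placeholder.
import time
import urllib.error
import urllib.request

API_URL = "https://api.vendor.example.com/v1/generate"   # placeholder endpoint

def record_error(kind) -> None:
    print("metric:", kind)   # replace with your metrics/alerting client

def call_llm(payload: bytes, retries: int = 3, timeout: float = 15.0) -> bytes:
    for attempt in range(retries):
        try:
            req = urllib.request.Request(
                API_URL, data=payload,
                headers={"Content-Type": "application/json"})
            return urllib.request.urlopen(req, timeout=timeout).read()
        except urllib.error.HTTPError as e:
            record_error(e.code)                       # feeds 429/5xx alerting
            if e.code in (429, 500, 502, 503) and attempt < retries - 1:
                time.sleep(2 ** attempt)               # exponential backoff
                continue
            raise
        except (urllib.error.URLError, TimeoutError):
            record_error("timeout_or_network")
            if attempt < retries - 1:
                time.sleep(2 ** attempt)
                continue
            raise
    raise RuntimeError("unreachable")                  # loop always returns or raises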

5 Critical Red Flags: create backlog tasks immediately

  1. No versioning policy

    Poses a risk of breaking your product without warning.

  2. Vague limitations & prohibitions

    High probability of unexpected failures in production.

  3. An empty training data summary

    Makes it difficult to pass enterprise procurement and security questionnaires.

  4. No clear complaints / TDM channel

    Leaves you vulnerable to claims from rights holders.

  5. Rollback is impossible + no stable release windows

    Means you do not control the quality of your own product.

Step 3: Protection. Building Your Evidence Pack Lite

Understanding risks is half the battle; the other half is demonstrating a managed process. Evidence Pack Lite is a simple folder structure in your repository that proves due diligence without exposing your IP.

Folder Structure (The Engineering Minimum)

/evidence/
  public/
    public_summary.md           # Public minimum (no IP/thresholds)
  internal/
    model_use_note.md           # Our use-case and boundaries
    methods.md                  # Integration, tests, and controls
    change_log.md               # Versions, dates, and impact
    risk_table.xlsx             # OK / Partial / GAP + sources/priority
    vendor_documents/           # Downloaded cards/policies (links/screenshots)
    vendor_correspondence/      # Emails/responses
    eval_snapshots/             # Test results with dates
  manifest.sha256               # Package integrity check
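The manifest.sha256 at the bottom of the tree is straightforward to generate. A minimal sketch that hashes every file under /evidence/ and writes one "hash  path" line per file, the same format sha256sum -c can verify (run from inside evidence/).

# Sketch: build manifest.sha256 over the Evidence Pack. Output uses the
# "HASH  relative/path" format that `sha256sum -c` understands.
import hashlib
from pathlib import Path

root = Path("evidence")
root.mkdir(parents=True, exist_ok=True)

lines = []
for path in sorted(root.rglob("*")):
    if path.is_file() and path.name != "manifest.sha256":
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        lines.append(f"{digest}  {path.relative_to(root)}")

(root / "manifest.sha256").write_text("\n".join(lines) + "\n")

Regenerate the manifest whenever the pack changes, and note the regeneration in change_log.md.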

Key Artifacts: What’s Inside

public_summary.md
Public • Owner: PM • Review: per release

Mini-template • Public-facing (no IP, no thresholds)

“We use a third-party LLM, [MODEL_NAME] [VERSION], for our [FEATURE_NAME] feature. The provider discloses aggregated categories of training data (e.g., web text, code); our customer data is not used for training. Model outputs may contain errors — we apply monitoring and, where necessary, human review (HITL). Model updates are implemented in stages with rollback capabilities.”
Public summary only • Keep sources & thresholds out of this file.
model_use_note.md
Internal • Owner: PM / Domain Lead • Review: when features change

1-pager • Purpose ↔ use-case mapping

  • Map the model’s intended purpose to your use-case.
  • List excluded domains.
  • Define mandatory HITL conditions.
  • Set incident escalation triggers.
methods.md
Internal • Owner: On-call / ML Eng • Review: per release

Runbook • Operational safeguards

  • Moderation / filtering logic.
  • Sanity & spot-check procedures.
  • Alerting channels & on-call.
  • Staged rollout & rollback scheme; roles/owners.
change_log.md
Internal • Owner: Release Manager • Review: per change

Operational logbook • What changed & impact

Format: DATE → VERSION/CONFIG → WHAT CHANGED → HOW VERIFIED → IMPACT → DECISION
Decision: accepted / rolled back
risk_table.xlsx
Internal • Owner: PM / Tech Lead • Review: per release

Risk dashboard • 7-point audit

  • Status: OK / Partial / GAP.
  • Source (URL/email) and access date.
  • Risk priority & compensating control implemented.

Don’t Forget: Direct Transparency Obligations (Art. 50)

This guide focuses on vendor due diligence, but deployers have an additional DIRECT obligation: if your system generates or manipulates content (text, images, audio), you must disclose to users that the content is AI-generated.

Quick implementation: UI labels (“AI-generated response”), Terms of Service updates, or API headers like X-AI-Generated: true. Document your approach in model_use_note.md.
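For the API-header variant, disclosure can be added in a single middleware. A sketch assuming a FastAPI service; the header name follows the example above and the route prefix is a placeholder, so align both with your ToS and UI labels.

# Sketch: disclose AI-generated content at the API layer, assuming a FastAPI
# service. Header name mirrors the example above; route prefix is a placeholder.
from fastapi import FastAPI

app = FastAPI()

@app.middleware("http")
async def mark_ai_generated(request, call_next):
    response = await call_next(request)
    if request.url.path.startswith("/v1/generate"):   # only AI-producing routes
        response.headers["X-AI-Generated"] = "true"
    return response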

Contracts: Managing Client Expectations & Assessing Vendor Terms

Technical processes are your foundation, but the final line of defense is in your contracts. Use the items below to frame engineering-driven conversations with legal counsel: set clear expectations with clients and realistically evaluate your vendor’s Terms of Service.

6 Key Units for Your Client Agreements

Use Notice & Limitations

Clarify third-party model, excluded domains, responsibility.

State that the feature is powered by a third-party LLM, is not intended for listed excluded domains, and that the user remains responsible for decisions based on the output.

Updates & Breakages (Notifications)

Pre-notice of model/quality changes.

Commit to notifying clients of material model or quality changes at least [N] days in advance and briefly describing the impact on functionality.

Versioning & Rollback

Pinning and safe fallback.

Reserve the right to pin a model version or perform a rollback upon metric degradation, with client notification.

Logs & Incidents (Minimum)

Forensics and reporting SLA.

Clarify that you maintain minimally necessary technical logs for investigations and will notify clients of serious incidents within ≤ [N] hours.

IP, Outputs & Retraining

Ownership & training use.

State that rights to the outputs belong to the client and that client data will not be used to train third-party models without a separate, explicit opt-in.

Termination & Disaster Recovery (DR)

Export & fallback mode.

Outline data export in an agreed format within [N] days and describe the availability of a fallback mode or phased feature degradation.

Clause Tracker (Lite): Example Approaches

A quick reference to align engineering, product, and legal on the practical shape of standard clauses. Use the table to set a baseline (“Basic”) or a softer posture (“Flexible”) depending on client profile and risk.

Use Notice
  • Basic: A specific list of prohibited domains; the user is responsible for final decisions.
  • Flexible: A general "not for mission-critical decisions" disclaimer, without a specific list.
Notifications
  • Basic: Notification [N] days in advance, plus a brief description of the impact.
  • Flexible: Commercially reasonable efforts to notify, without a fixed timeframe.
IP / Retraining
  • Basic: Outputs belong to the client; training on client data is opt-in only.
  • Flexible: Client data is not used by default, but an opt-in to improve the service is allowed.

Where These Clauses Typically Live

  • MSA/SOW + SLA: Use Notice, Updates, Versioning, Termination/DR.
  • SLA / Security Addendum: Logs/Incidents.
  • DPA (Data Processing Addendum): Prohibition of retraining without consent.

Important Disclaimer: This section is for educational purposes only. The units and wording provided are topics for discussion with your legal counsel, not ready-to-use contract clauses.

Engineering: The “3 Documents + 2 Processes” Framework

Compliance becomes real only when it’s translated into code, configuration, and repeatable processes. Below is the minimum engineering toolkit that shows you’re in control of your AI system.

3 Key Documents (Your Internal Artifacts)

model_use_note.md • 1-Pager
  • The link between the model’s intended purpose ↔ your use-case.
  • A list of excluded domains.
  • Conditions where HITL is mandatory.
  • Incident escalation triggers, process owners, and contacts.
methods.md • Runbook
  • Your safeguards & procedures: moderation/filtering logic.
  • What metrics are tracked and where; operational thresholds & alerts.
  • Your staged rollout and version pinning scheme.
  • Who decides on a rollback and how it’s documented (decision log).
change_log.md • Operational Logbook

A standardized entry format: DATE → VERSION/CONFIG → WHAT CHANGED → HOW VERIFIED → IMPACT → DECISION (accepted/rolled back) → LINKS TO EVIDENCE

2 Critical Processes (Your Response Systems)

1. Monitoring (Detecting Degradation)

What to measure (examples)
  • Proxy Quality: user ratings/flags, rate of failures/“not sure”.
  • Technical: P95 latency, error rate, % of blocked responses.
  • Business: support complaints, conversion rate of a key feature.
How to define thresholds (the process)
  1. Establish a baseline by measuring normal performance over 2–4 weeks.
  2. Assess your risk appetite (what’s worse: false alarm or missed incident?).
  3. Start with ranges: quality drop >10–20% / 7d; P95 latency or error rate +30–50% / 24h; complaints >0.3–1.0% of sessions / 24h.
  4. Document chosen thresholds & reasoning in methods.md and review monthly.
Response

When a threshold is triggered, an incident is created and assigned within ≤ 4 hours.

Guideline, not a norm. Don’t transfer internal engineering thresholds into client SLAs without legal consultation.
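Once the thresholds are written down, the check itself is small. A sketch that compares current metrics against a recorded baseline; the numbers mirror the example ranges above and are illustrations, not norms.

# Sketch: compare current metrics against a recorded baseline and emit alerts.
# Thresholds mirror the example ranges above; tune and document them in methods.md.
BASELINE = {"quality": 0.92, "p95_latency_ms": 800, "error_rate": 0.01}
THRESHOLDS = {
    "quality_drop": 0.15,       # alert if quality falls >15% vs baseline (7d window)
    "latency_increase": 0.40,   # alert if P95 rises >40% vs baseline (24h window)
    "error_increase": 0.40,     # alert if error rate rises >40% vs baseline (24h window)
}

def check(current: dict) -> list[str]:
    alerts = []
    if current["quality"] < BASELINE["quality"] * (1 - THRESHOLDS["quality_drop"]):
        alerts.append("quality degradation")
    if current["p95_latency_ms"] > BASELINE["p95_latency_ms"] * (1 + THRESHOLDS["latency_increase"]):
        alerts.append("latency regression")
    if current["error_rate"] > BASELINE["error_rate"] * (1 + THRESHOLDS["error_increase"]):
        alerts.append("error-rate spike")
    return alerts   # route non-empty results into your incident process (assignment ≤ 4 h)

print(check({"quality": 0.75, "p95_latency_ms": 1300, "error_rate": 0.02}))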

2. Rollback (Returning to Stability)

Example timeframes based on typical architecture patterns:
  • ≤ 1 hour: API-managed LLMs with version pinning.
  • 4–8 hours: Self-hosted models with feature flags.
  • ≤ 24 hours: Complex pipelines (e.g., RAG with re-indexing, multi-region).

IMPORTANT: These are EXAMPLES, not standards.

Your actual rollback targets must be based on:

  • Your system’s criticality (a core user-facing feature vs. an internal tool)
  • Your operational capacity (a 24/7 on-call team vs. business hours support)
  • Your risk tolerance (a regulated fintech app vs. a startup experiment)

A 6-hour rollback window for a non-critical feature may be perfectly acceptable. Document YOUR reasoning in methods.md—the process matters more than the number.

Trigger conditions (examples)
  • A high-severity incident.
  • Repeated regression of key metrics for 2 consecutive days.
  • A vendor-announced breaking change affecting your scenario.
Steps
  1. Feature-flag off / pin previous version.
  2. Gradually drain traffic.
  3. Perform a sanity check.
  4. Record in change_log.md and send notifications per agreements.

Action: document in methods.md your target SLA based on your architecture, rollback triggers, and roles/responsibilities. Perform a dry run in a staging environment once a quarter.
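As a complement to the quarterly dry run, the rollback steps can be encoded as one reviewable function. A sketch in which every helper is a stub standing in for your own feature-flag store, traffic management, and notification tooling.

# Sketch: rollback sequence — flag off / repin, drain traffic, sanity check,
# then log and notify. All helpers are stubs for your own tooling.
from datetime import datetime, timezone

def set_flag(name, value):              print(f"flag {name} = {value}")
def pin_model(version):                 print(f"pinned model {version}")
def drain_traffic(step_percent, wait_s): print(f"draining in {step_percent}% steps, {wait_s}s apart")
def sanity_check():                     return True   # replace with a real golden-set run
def notify_clients(reason):             print(f"client notice: {reason}")

def rollback(previous_version: str, reason: str) -> None:
    set_flag("llm_candidate_enabled", False)          # 1. feature-flag off
    pin_model(previous_version)                       #    and repin the stable version
    drain_traffic(step_percent=25, wait_s=300)        # 2. gradually drain traffic
    assert sanity_check(), "sanity check failed"      # 3. sanity check
    with open("change_log.md", "a", encoding="utf-8") as log:   # 4. record + notify
        log.write(f"{datetime.now(timezone.utc):%Y-%m-%d %H:%M} | {previous_version} | "
                  f"rollback | sanity check passed | {reason} | rolled back\n")
    notify_clients(reason)

rollback("vendor-model-2025-01-15", "quality regression for 2 consecutive days")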

Special Case: Compliance for OSS & RAG Systems

When you build on Open-Source Software (OSS) or RAG instead of a commercial API, the role of “vendor” partially shifts to you. Your Evidence Pack must cover the entire chain: data → model/weights → processes.

You are now responsible for the licensing chain, data provenance, operational safeguards, and rollback strategy. The cards below outline the engineering minimums to document and implement.

1. Data Governance (the most critical)

  • Corpus Licenses: Maintain a license ledger for each source, verifying permissions for commercial use, derivatives, and attribution. Do not mix incompatible licenses.
  • Provenance: Keep a data manifest with source URL, download date, file hashes, and index version for reproducibility.
  • TDM Opt-out vs. Technical Signals: Comply with legal TDM opt-out. Document your policy for respecting signals (e.g., robots.txt, noai) in methods.md.
  • Data Hygiene / PII: Describe and enforce rules for removing or masking PII and other sensitive fields before indexing.
data_manifest.json license_ledger.csv methods.md
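The provenance record above can be as simple as one JSON entry per source. A sketch of data_manifest.json generation; the field names, licence value, and paths are illustrative.

# Sketch: record provenance for each indexed source in data_manifest.json.
# Field names and values are illustrative; licences live in license_ledger.csv.
import hashlib
import json
from datetime import date
from pathlib import Path

def manifest_entry(source_url: str, local_file: Path, index_version: str) -> dict:
    data = local_file.read_bytes()
    return {
        "source_url": source_url,
        "download_date": date.today().isoformat(),
        "sha256": hashlib.sha256(data).hexdigest(),
        "index_version": index_version,
        "license": "CC-BY-4.0",      # cross-reference the entry in license_ledger.csv
        "pii_scrubbed": True,        # set by your data-hygiene step before indexing
    }

# Placeholder corpus file so the sketch runs end to end.
corpus = Path("corpus")
corpus.mkdir(exist_ok=True)
sample = corpus / "guide.html"
sample.write_text("<html>placeholder</html>")

entries = [manifest_entry("https://docs.example.com/guide", sample, "idx-2025-10-01")]
Path("data_manifest.json").write_text(json.dumps(entries, indent=2))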
2. Model & Licensing

  • Model/Weights License: Record the license and terms of the OSS model (e.g., Apache, MIT, RAIL, or vendor terms like Llama).
  • Your own model_use_note.md: Describe the system’s intended purpose, excluded domains, and liability boundaries.
MODEL_LICENSE model_use_note.md
3. Monitoring & Rollback (RAG specifics)

  • What to Monitor: Retrieval metrics (rate of “no answer”), index freshness, and citation quality.
  • Index Versioning: Pin the index & embeddings; maintain an index change log.
  • Rollback Plan: Prepare rollback not only for the model, but also for the index (keep the last stable version addressable).
index_change_log.md methods.md
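Index rollback is easiest when the "active" index is just a version pointer. A sketch, assuming versioned index directories and a small pointer file; the file and version names are placeholders.

# Sketch: keep the active RAG index behind a version pointer so rollback is a
# one-line change, and log every switch to index_change_log.md.
import json
from datetime import date
from pathlib import Path

POINTER = Path("indexes/active.json")

def switch_index(new_version: str, reason: str) -> None:
    previous = json.loads(POINTER.read_text())["version"] if POINTER.exists() else None
    POINTER.parent.mkdir(parents=True, exist_ok=True)
    POINTER.write_text(json.dumps({"version": new_version, "embeddings": new_version}))
    with open("index_change_log.md", "a", encoding="utf-8") as log:
        log.write(f"{date.today()} | {previous} -> {new_version} | {reason}\n")

switch_index("idx-2025-10-01", "scheduled re-index")
switch_index("idx-2025-09-01", "rollback: citation quality regression")  # last stable stays addressable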
4. Public Transparency

  • Public Summary: Describe categories of data sources (e.g., “public technical docs,” “our own blog posts”) without disclosing IP.
  • Attribution: If source licenses require it, maintain an attribution ledger and publish acknowledgments.
public_summary.md attribution_ledger.csv
5. Privacy & Security

  • Risk Assessment: Conduct a DPIA if personal data is processed.
  • Access Control: Document who can modify the index or model weights and keep a tamper-evident log of changes.
DPIA.pdf weights_access_log.csv index_change_log.md

Conclusion for OSS / RAG

Your Evidence Pack must include a license ledger, data manifest, index_change_log, your own model_use_note, and a methods.md with TDM rules and retrieval monitoring. This demonstrates end-to-end control over your system.

license_ledger.csv data_manifest.json index_change_log.md model_use_note.md methods.md

8 Key Practices (Your Internal Standard)

For each practice, we define: Owner → Cadence → Artifact.

  1. Supply Chain Transparency
     Owner: PM / Tech Lead
     Cadence: At release / Quarterly
     Artifact: /evidence/public/public_summary.md + vendor document links
  2. Incident & Complaints Channel
     Owner: Support / On-call Lead
     Cadence: Continuous (reactive)
     Artifact: Triage & escalation in /evidence/internal/methods.md
  3. Purpose Mapping
     Owner: PM / Domain Lead
     Cadence: Whenever features change
     Artifact: /evidence/internal/model_use_note.md
  4. Change Management
     Owner: Release Manager / Tech Lead
     Cadence: Every model or config change
     Artifact: /evidence/internal/change_log.md
  5. Evaluation & Monitoring
     Owner: On-call / ML Engineer
     Cadence: Scheduled (tests) / On-alert (incidents)
     Artifact: Metrics & thresholds in /evidence/internal/methods.md
  6. Public Transparency (No IP Leakage)
     Owner: PM
     Cadence: At release / Quarterly
     Artifact: public_summary.md with data categories & process descriptions only
  7. Content & Safety Guardrails
     Owner: PM / Security Lead
     Cadence: At release / When policies change
     Artifact: Downstream rules, moderation logic & UX fallbacks in methods.md
  8. Audit Trail of Correspondence & Sources
     Owner: Tech Lead
     Cadence: As events occur
     Artifact: /vendor_documents/ + /vendor_correspondence/

Your Implementation Checklist: From Zero to Audit-Ready

Tick each item as you complete it.

Step 1: Collect (The Foundation)

Step 2: Assess (The Risk Analysis)

Step 3: Protect (Implementation & Processes)

Documentation

Engineering Processes

Legal & Team Alignment

FAQ: Common Questions for GPAI Deployers

To conclude, here are answers to the most frequent questions teams face when deploying General-Purpose AI (GPAI) systems. These are practical, engineering-first explanations designed to help you pass procurement, reduce risk, and stay audit-ready.

What are a deployer’s obligations under the EU AI Act when using a GPAI API from OpenAI, Meta, or Mistral?

Your primary obligation is to exercise due diligence. Under the EU AI Act, liability is shared: the provider is responsible for the foundational model, and you—the deployer—are accountable for its application. In practice, you must:

  • Conduct vendor due diligence by collecting and assessing documentation (e.g., Model Card, Training Data Summary).
  • Map the model’s intended purpose to your specific use-case and document your risk assessment.
  • Maintain an internal Evidence Pack that records processes and decisions.
  • Implement technical compensating controls (monitoring, rollbacks, version pinning, HITL where required).
How do we pass an enterprise procurement or security questionnaire? Is there a vendor due-diligence checklist?

Yes. The 7-point audit in Step 2 of this guide is your vendor due-diligence checklist. To pass procurement, prove that you run a managed process. Your Evidence Pack should include:

  • A completed risk_table.xlsx with your OK / Partial / GAP assessment.
  • Your model_use_note.md defining the use-case and its boundaries.
  • Proof of safeguards in methods.md (e.g., version pinning, staged rollout, monitoring).
  • A concise, non-sensitive public_summary.md for non-technical stakeholders.
Is it enough that our company is outside the EU and our provider is compliant?

No. The EU AI Act has extraterritorial reach and uses a principle of shared responsibility. If your service is available to users in the EU, the act likely applies. You cannot inherit a provider’s compliance. You must perform your own risk assessment for the specific application and demonstrate control over the system.

How can we maintain public transparency without exposing our intellectual property (IP)?

The golden rule: disclose processes, not secrets. Split transparency into public vs. internal layers:

  • Public: Share categories of data (as stated by the vendor) and confirm the existence of controls (monitoring, HITL, rollback). Avoid specific metrics, prompt designs, or thresholds.
  • Internal: Keep full details—risk analysis, thresholds, test cases, prompt templates—in your Evidence Pack.

This aligns with GPAI transparency expectations while protecting your IP.

What should we do if a provider doesn’t respond or has no Model Card? What about OSS/RAG systems?

Your actions become part of your due diligence.

For commercial APIs:

  • Log everything: Record requests and follow-ups in the Evidence Pack.
  • Mark GAPs: Note the lack of response or artifacts in risk_table.xlsx.
  • Implement compensating controls: mandatory HITL for sensitive cases, strict version pinning, tighter monitoring and alerts.

For OSS / RAG systems: you effectively become the provider. Your Evidence Pack must include your own model_use_note.md, a license ledger for every source, a data manifest proving provenance, and evidence that you respect TDM opt-out signals.