LLM Vendor Due Diligence under the EU AI Act: A Hands-On Guide for GPAI Deployers
This isn’t another legal analysis. It’s a practical playbook for deployers—those accountable for a third-party model’s outcomes within their specific context. We treat compliance as an engineering problem: how to effectively request the right documents from your vendor, how to assess risks for your use-case, and which technical safeguards and contract clauses to implement.
Important Note: This is a practical guide, not a legal document. It is intended to frame engineering-driven conversations. Before implementing any practices from this guide, please read our Educational Content Disclaimer and always consult with your legal counsel.
Guide Map: From Data Collection to Implementation
If you’re on this page, you’re a GPAI Deployer (or about to become one)—you integrate third-party models and are accountable for the outcomes.
To help you quickly find what you need, this guide follows our Collect → Assess → Protect framework.
Why This Matters Now: 3 Real-World Facts Impacting Your Product Today
Responsibility is Shared, Not Shifted
The provider is responsible for the foundational model’s compliance with Art. 53 requirements, but you—the deployer—are accountable for how it’s applied under Art. 25. Integrating a third-party model without your own due diligence doesn’t shift the risk; it concentrates it on you.
“We’re Not in the EU” Is Obsolete
Enterprise clients and platforms are already making compliance a procurement gate. Without the right documentation, your product won’t pass vendor security review, blocking major deals.
Models Evolve Unpredictably
Updates can be frequent and “silent.” Without version pinning and a rollback strategy, a minor vendor tweak can trigger a critical incident on your side.
Step 1: Artifact Collection. Your Due-Diligence Checklist
Your first step is to collect the essential public documentation from your vendor. This is about establishing an engineering baseline for your risk assessment.
1. Model / Capability Card
The model’s “passport,” describing its intended purpose, limitations, known failure modes, baseline performance metrics (with dates and datasets), and version number.
Where to look: docs / research / safety / model card / developer docs

2. Public Training Data Summary
An aggregated overview of the training data’s categories, provenance, and time periods (without disclosing IP), including explicit policies on customer data usage for training.
Where to look: transparency / FAQ / policy / model card / trust center

3. Changelog / Versioning Policy
The history of model releases, notifications about breaking changes, support windows (EOL), and official notification channels.
Where to look: API changelog / release notes / status / blog / developer updates

4. Copyright & TDM Policies
The provider’s approach to copyright in two key areas: preventive (for future data) and reactive (for current outputs).
Where to look: legal / copyright / DMCA / responsible AI / transparency / contact

5. Downstream Use & Safety Policies
The “rules of the road” for you as the integrator, detailing prohibited domains, moderation requirements, Human-in-the-Loop (HITL) conditions, and incident escalation procedures.
Where to look: acceptable use / safety / usage guidelines / developer terms
“OK / Partial / GAP” — a rapid documentation audit
Use the OK / Partial / GAP matrix for a quick audit. The goal is not to “grade the vendor” but to understand where your compensating controls are required and what you must record in your Evidence Pack.
| Artifact | OK | Partial (compensation) | GAP (critical) | Evidence to capture |
|---|---|---|---|---|
| 1 Model / Capability Card | Intended purpose; limitations & known failure modes; metrics with dates and datasets; model version and last update date provided. Nothing to add; ideal case. | Document exists, but metrics/risks are broad; few examples. Compensate: Narrow the use case in model_use_note.md. | Marketing-style overview with no limitations/risks; no version/dates. Compensate: Avoid sensitive domains; increase monitoring and rollback; record the GAP and send a follow-up request to the provider. | URL, access date, screenshot; vendor-stated version. |
| 2 Public Training Data Summary | Aggregated data categories (web/code/books, etc.); overall time period (“up to 2023”); explicit statement of no customer data used for training. Nothing to add; ideal case. | Categories not disclosed (“trained on internet data”), but a clear policy states customer data is not used. Compensate: Publish a public summary of categories based on provider statements, without proportions. | No info on training data and ambiguity in the ToS about client/personal data. Compensate: Exclude sensitive domains; request clarification; enhance logging and alerts. | Link to policy/transparency page, access date, short quoted excerpt. |
| 3 Changelog / Versioning Policy | Public changelog with dates; breaking changes/EOL policy; notification channels (RSS/API/email). Nothing to add; ideal case. | Changelog is irregular; notifications arrive after the fact. Compensate: Enable staged rollout and version pinning; raise monitoring thresholds during updates. | Rolling updates without notification; no guarantees of version support. Compensate: Maintain a feature-flag/rollback plan; limit functionality; record the risk for procurement. | Changelog URL, date of last entry, internal reference in change_log.md. |
| 4a TDM Reservation Compliance | Clear policy stating respect for machine-readable TDM signals (e.g., “we respect robots.txt”). Nothing to add; ideal case. | Vague policy on “responsible data sourcing” with no specific mention of TDM signals. Compensate: Add transparency in your public_summary.md. | No policy, or the provider is known to ignore TDM signals. Compensate: Exclude high-sensitivity domains; document the GAP; consider alternative providers for copyright-sensitive use cases. | Link to policy; screenshot of key sentences. |
| 4b Copyright Complaints Channel | Dedicated form/email for copyright/DMCA claims, with stated response times or process. Nothing to add; ideal case. | A general “contact us” or support email is the only channel. Compensate: Create YOUR own public complaints channel; document triage in methods.md. | No defined mechanism for copyright complaints. Compensate: Mandatory own complaints process; log all incidents; restrict use in high-IP-risk domains. | Link to form/address; screenshot of the process description. |
| 5 Downstream Use & Safety Policies | Prohibited applications listed; practical guidance for moderation, HITL, and escalation. Nothing to add; ideal case. | Prohibitions and guidance are declarative (“do no harm”), with no thresholds/examples. Compensate: Define your own thresholds (filters, HITL conditions) in methods.md. | Policy absent or reduced to “comply with laws.” Compensate: Develop downstream rules and UX disclaimers; restrict domains; educate users. | Link to the policy; short list of key prohibitions/requirements. |
Important note: Even market leaders often land in Partial on some items — and that’s normal. Record sources (links/screenshots/dates) in the Evidence Pack and deliberately introduce compensating measures (HITL, version pinning, staged rollout, monitoring, contractual notifications).
Phase 1: Independent Collection
Objective: To gather all readily available public information. Focus on collecting the low-hanging fruit before drafting any requests.
Check the Standard Sections
- Model Card: docs / research / safety
- Data Transparency: transparency / policy / FAQ
- Releases/Versions: API changelog / release notes / status
- Usage/Safety: acceptable use / safety / usage guidelines
- Complaints/TDM: legal / copyright / TDM opt-out
- GitHub: README.md / SECURITY.md / LICENSE / Discussions
- HuggingFace: Model Card / Files & versions / Community
Use Advanced Search (If Needed)
site:vendor.com "model card" OR "capability card"
site:vendor.com "training data" summary OR transparency
site:vendor.com changelog OR "release notes"
Record Findings Immediately (Not “Later”)
- Save the permalink and access date (e.g., 2025-10-26).
- Take a screenshot of the key section.
- Place everything in /evidence/internal/vendor_documents/.
- Mark the status in your risk_table.xlsx: OK / Partial / GAP.
Phase 2: A Formal Request (When Public Information Isn’t Enough)
The likelihood that a large provider will craft a bespoke, comprehensive reply for a small team is low. A parallel goal here is to create an audit trail: the email you send and the response you receive (or don’t) become key artifacts proving due diligence and justifying your engineering decisions and compensating controls.
Strategy and Tone
Position your request as part of standard procurement and safe deployment. Ask for public links;
do not demand custom documents. Log every message and attachment in your
/evidence/internal/vendor_correspondence/ folder.
- Neutral, professional tone; avoid “you are required by law.”
- Make it easy to reply: links are fine; no bespoke PDFs needed.
- State you’re building a downstream transparency & risk package.
Email Structure (5 Logical Blocks)
- Context: Who you are, which model/version, and a brief use-case.
- Objective: You’re assembling a standard package for downstream transparency and risk assessment.
- Request: Five artifacts — Model/Capability Card; Public Training Data Summary; Changelog/Versioning; Complaints & TDM opt-out; Downstream Use & Safety Policies.
- Lower the Barrier: Public links are sufficient; no custom documents required.
- Call to Action: Desired response timeframe and your contact details.
Follow-Up and Escalation
Follow-up (after 7–10 days, brief)
Even public links to relevant pages would help us proceed. Thanks!
Alternative Channels (if silent/auto-reply)
- Trust/Security portal
- Partner / Enterprise Support
- Dedicated Solution Engineer
- Standard dev-support ticket
Save ticket IDs and screenshots for your audit trail.
Common Mistakes to Avoid
- Don’t cite the law as an ultimatum. Don’t write “you are obliged to provide this.” Frame your request around standard procurement and due-diligence processes.
- Don’t ask for proprietary secrets. Don’t request specific datasets, their proportions, or model weights.
- Don’t send your internal “shopping list.” Keep the email to a short, standard list of documents. Company-specific requirements belong in your internal processes, not in the vendor’s inbox.
Handling the Outcomes: An Action Plan
After you reach out to the provider, classify the outcome and act immediately. The goal is to preserve an audit trail, keep your risk_table.xlsx current, and implement compensating controls where needed.
If you receive public links
- Mark the status as Partial/OK in risk_table.xlsx (record sources & access dates).
- Save a PDF or screenshot of key sections for your Evidence Pack.
If the response is vague
- Thank them, then narrow your request to the 1–2 most critical missing items.
- Document the outcome as Partial; accept the closest equivalents (e.g., FAQ or blog posts).
If you get no response (silence)
Mark it as a GAP in risk_table.xlsx
(source: “no reply by [DATE]”), then implement compensating controls:
- Narrow approved use-cases and/or add mandatory HITL in model_use_note.md.
- Enable staged rollout & version pinning in methods.md.
- Raise monitoring thresholds and alerting sensitivity.
- Document stricter notification & rollback clauses in client agreements.
- Log everything in /evidence/internal and add an entry to change_log.md.
What Not to Do
- Don’t cite the law as an ultimatum. Use neutral phrasing, e.g., “a standard documentation set to support our procurement and safe deployment processes.”
- Don’t ask giants to change their standard ToS. Focus on analyzing their documentation and strengthening your internal processes.
- Don’t publish their documents or screenshots on your website. Use anonymized mockups for public examples and quote only publicly accessible pages.
Step 2: Documentation Assessment. A 7-Point Engineering Audit
Objective: To quickly classify risks and log them in your
risk_table.xlsx with an OK / Partial / GAP status.
No provider is perfect; the goal is to identify gaps and implement compensating controls on your side.
Intended Purpose vs. Your Use-Case
Look for: A clear description of the model’s intended purpose and its boundaries.
- OK: Statements like “assists with…; not intended for…” with clear examples.
- Partial: Purpose described too broadly (e.g., “a general-purpose assistant”).
- GAP: Stated purpose directly conflicts with your use-case.
- Define excluded domains explicitly in model_use_note.md.
- Specify when HITL is mandatory in model_use_note.md.
Limitations & Prohibitions
Look for: A documented list of known failure modes and explicitly prohibited domains.
- OK: Specific limitations, conditions, and confidence levels.
- Partial: Vague phrases like “use responsibly”; prohibitions buried in ToS.
- GAP: Limitations section is missing.
- Map vendor limitations → features of your product.
- Document UX fallbacks & filtering logic in
methods.md.
Downstream Safety & Use Guidance
Look for: Practical guidance for integrators (moderation, HITL, abuse escalation).
- OK: Clear steps, example thresholds, or UX constraint examples.
- Partial: Declarative statements without concrete, actionable guidance.
- GAP: Guidance is missing or contradicts the Model Card.
- Define moderation logic and thresholds in methods.md.
- List HITL conditions (when a human must review) in methods.md.
- Publish an incident/complaints channel and escalation process.
Evaluation & Metrics
Look for: Benchmarks used, dates of evaluation, and identified weak spots.
- OK: Metrics with dates, dataset names, and commentary on degradation.
- Partial: Benchmarks mentioned without specific scores or dates.
- GAP: Only marketing claims (“state-of-the-art”); outdated/irrelevant tests.
- Establish your baseline performance and run regular spot-checks.
- Log results in eval_snapshots/ for traceability.
Changelog & Versioning
Look for: Release history, breaking-change policy, EOL windows, notification channels.
- OK: Predictable cycles, proactive notifications, advice on version pinning.
- Partial: Changelog irregular; notifications arrive late.
- GAP: Silent rolling updates; unclear version support.
- Enable version pinning for the model/API.
- Roll out updates via staged deployments.
- Document a fast rollback plan in
methods.md.
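To show what version pinning plus a staged rollout can look like in code, here is a minimal sketch; the model identifiers, config fields, and canary percentage are hypothetical placeholders, not any vendor’s real API values.

```python
# Hypothetical deployment config: pin an exact model snapshot instead of a floating
# "latest" alias, and keep a known-good fallback ready for rollback.
MODEL_CONFIG = {
    "model": "vendor-model-2024-06-15",           # pinned snapshot (placeholder name)
    "fallback_model": "vendor-model-2024-03-01",  # last version that passed your eval_snapshots/ checks
    "staged_rollout_percent": 10,                 # share of traffic on the new version during canary
    "pin_verified_on": "2025-10-26",              # date recorded in change_log.md
}

def resolve_model(user_id: int, config: dict = MODEL_CONFIG) -> str:
    """Send a small share of traffic to the newly pinned version; everyone else stays on the fallback."""
    in_canary = (user_id % 100) < config["staged_rollout_percent"]
    return config["model"] if in_canary else config["fallback_model"]
```

The point is that pinned identifiers and rollout percentages live in versioned configuration, so a rollback is a config change you can reference from change_log.md.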
Public Training Data Summary
Look for: Data categories, provenance, time periods; explicit “no customer data in training”.
- OK: Categorical overview; notes on deduplication or TDM.
- Partial: Vague “trained on internet data” but a clear customer-data policy exists.
- GAP: No info and ToS ambiguous about customer/personal data.
- Reflect vendor statements in
public_summary.md. - For a GAP, strengthen HITL and monitoring processes.
Integration Constraints & Usage Limits
Look for: Rate limits, latency expectations, content-filter behavior, pricing risks.
- OK: Explicit limits with retry/timeout guidance; stable windows.
- Partial: Limits exist but are undocumented; learned only by use.
- GAP: Hidden hard filters; unpredictable performance degradation.
- Implement client-side timeouts/retries.
- Set alerts for spikes in 429/5xx errors.
- Prepare a feature-degradation plan for instability.
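As a concrete illustration of client-side timeouts, retries with backoff, and 429/5xx handling, here is a minimal Python sketch using the requests library; the endpoint URL and payload shape are placeholders for your vendor’s actual API.

```python
import time
import requests

RETRYABLE_STATUSES = {429, 500, 502, 503, 504}

def call_llm(payload: dict,
             url: str = "https://api.example-vendor.com/v1/generate",  # placeholder endpoint
             max_retries: int = 3,
             timeout_s: float = 30.0) -> dict:
    """Call a hypothetical LLM endpoint with a client-side timeout and exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            resp = requests.post(url, json=payload, timeout=timeout_s)
        except requests.exceptions.Timeout:
            if attempt == max_retries:
                raise
            time.sleep(2 ** attempt)   # back off and retry on timeout
            continue
        if resp.status_code in RETRYABLE_STATUSES and attempt < max_retries:
            time.sleep(2 ** attempt)   # back off on 429/5xx; also count these events for alerting
            continue
        resp.raise_for_status()        # non-retryable errors surface immediately
        return resp.json()
    raise RuntimeError("Exhausted retries calling the LLM endpoint")
```

Feed the retried and failed calls into your monitoring so a spike in 429/5xx errors triggers the alerts described in methods.md.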
5 Critical Red Flags: Create Backlog Tasks Immediately
- No versioning policy: poses a risk of breaking your product without warning.
- Vague limitations & prohibitions: high probability of unexpected failures in production.
- An empty training data summary: makes it difficult to pass enterprise procurement and security questionnaires.
- No clear complaints / TDM channel: leaves you vulnerable to claims from rights holders.
- Rollback is impossible + no stable release windows: means you do not control the quality of your own product.
Step 3: Protection. Building Your Evidence Pack Lite
Understanding risks is half the battle; the other half is demonstrating a managed process. Evidence Pack Lite is a simple folder structure in your repository that proves due diligence without exposing your IP.
/evidence/
  public/
    public_summary.md          # Public minimum (no IP/thresholds)
  internal/
    model_use_note.md          # Our use-case and boundaries
    methods.md                 # Integration, tests, and controls
    change_log.md              # Versions, dates, and impact
    risk_table.xlsx            # OK / Partial / GAP + sources/priority
    vendor_documents/          # Downloaded cards/policies (links/screenshots)
    vendor_correspondence/     # Emails/responses
    eval_snapshots/            # Test results with dates
  manifest.sha256              # Package integrity check
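One possible way to produce manifest.sha256, assuming the layout above (the script is an illustration, not a prescribed tool):

```python
import hashlib
from pathlib import Path

def build_manifest(root: str = "evidence", out_name: str = "manifest.sha256") -> None:
    """Hash every file in the Evidence Pack so later tampering or accidental edits are detectable."""
    root_path = Path(root)
    lines = []
    for path in sorted(root_path.rglob("*")):
        if path.is_file() and path.name != out_name:
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            lines.append(f"{digest}  {path.relative_to(root_path)}")
    (root_path / out_name).write_text("\n".join(lines) + "\n")

if __name__ == "__main__":
    build_manifest()
```

Re-run it (and commit the result) whenever you add or update evidence, so the manifest always matches the current pack.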
Key Artifacts: What’s Inside
Mini-template • Public-facing (no IP, no thresholds)
“We use a third-party LLM, [MODEL_NAME] [VERSION], for our [FEATURE_NAME] feature. The provider discloses aggregated categories of training data (e.g., web text, code); our customer data is not used for training. Model outputs may contain errors — we apply monitoring and, where necessary, human review (HITL). Model updates are implemented in stages with rollback capabilities.”
1-pager • Purpose ↔ use-case mapping
- Map the model’s intended purpose to your use-case.
- List excluded domains.
- Define mandatory HITL conditions.
- Set incident escalation triggers.
Runbook • Operational safeguards
- Moderation / filtering logic.
- Sanity & spot-check procedures.
- Alerting channels & on-call.
- Staged rollout & rollback scheme; roles/owners.
Operational logbook • What changed & impact
DATE → VERSION/CONFIG → WHAT CHANGED → HOW VERIFIED → IMPACT → DECISION (accepted / rolled back)
Risk dashboard • 7-point audit
- Status: OK / Partial / GAP.
- Source (URL/email) and access date.
- Risk priority & compensating control implemented.
Don’t Forget: Direct Transparency Obligations (Art. 50)
This guide focuses on vendor due diligence, but deployers have an additional DIRECT obligation: if your system generates or manipulates content (text, images, audio), you must disclose to users that the content is AI-generated.
Quick implementation: UI labels (“AI-generated response”), Terms of Service updates, or
API headers like X-AI-Generated: true. Document your approach in model_use_note.md.
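A minimal sketch of the header approach, assuming a FastAPI service; X-AI-Generated is the convention from the example above, not a standardized header name.

```python
from fastapi import FastAPI, Request

app = FastAPI()

@app.middleware("http")
async def add_ai_disclosure_header(request: Request, call_next):
    """Attach the AI-generated disclosure header to every response from this service."""
    response = await call_next(request)
    response.headers["X-AI-Generated"] = "true"
    return response
```

If only some endpoints are AI-backed, scope the middleware (or set the header per route) and note that decision in model_use_note.md.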
Contracts: Managing Client Expectations & Assessing Vendor Terms
Technical processes are your foundation, but the final line of defense is in your contracts. Use the items below to frame engineering-driven conversations with legal counsel: set clear expectations with clients and realistically evaluate your vendor’s Terms of Service.
6 Key Units for Your Client Agreements
Use Notice & Limitations
Clarify third-party model, excluded domains, responsibility.
State that the feature is powered by a third-party LLM, is not intended for listed excluded domains, and that the user remains responsible for decisions based on the output.
Updates & Breakages (Notifications)
Pre-notice of model/quality changes.
Commit to notifying clients of material model or quality changes at least [N] days in advance and briefly describing the impact on functionality.
Versioning & Rollback
Pinning and safe fallback.
Reserve the right to pin a model version or perform a rollback upon metric degradation, with client notification.
Logs & Incidents (Minimum)
Forensics and reporting SLA.
Clarify that you maintain minimally necessary technical logs for investigations and will notify clients of serious incidents within ≤ [N] hours.
IP, Outputs & Retraining
Ownership & training use.
State that rights to the outputs belong to the client and that client data will not be used to train third-party models without a separate, explicit opt-in.
Termination & Disaster Recovery (DR)
Export & fallback mode.
Outline data export in an agreed format within [N] days and describe the availability of a fallback mode or phased feature degradation.
Clause Tracker (Lite): Example Approaches
A quick reference to align engineering, product, and legal on the practical shape of standard clauses. Use the table to set a baseline (“Basic”) or a softer posture (“Flexible”) depending on client profile and risk.
| Unit | Basic | Flexible |
|---|---|---|
| Use Notice | A specific list of prohibited domains; user is responsible for final decisions. | A general disclaimer (“not for mission-critical decisions”) without a specific list. |
| Notifications | Notification [N] days in advance + a brief description of the impact. | “Commercially reasonable efforts” to notify, without a fixed timeframe. |
| IP / Retraining | Outputs belong to the client; training on client data is opt-in only. | “Not used by default,” but allows for an opt-in to improve the service. |
Where These Clauses Typically Live
- MSA/SOW + SLA: Use Notice, Updates, Versioning, Termination/DR.
- SLA / Security Addendum: Logs/Incidents.
- DPA (Data Processing Addendum): Prohibition of retraining without consent.
Important Disclaimer: This section is for educational purposes only. The units and wording provided are topics for discussion with your legal counsel, not ready-to-use contract clauses.
Engineering: The “3 Documents + 2 Processes” Framework
Compliance becomes real only when it’s translated into code, configuration, and repeatable processes. Below is the minimum engineering toolkit that shows you’re in control of your AI system.
3 Key Documents (Your Internal Artifacts)
model_use_note.md
1-Pager
- The link between the model’s intended purpose ↔ your use-case.
- A list of excluded domains.
- Conditions where HITL is mandatory.
- Incident escalation triggers, process owners, and contacts.
methods.md
Runbook
- Your safeguards & procedures: moderation/filtering logic.
- What metrics are tracked and where; operational thresholds & alerts.
- Your staged rollout and version pinning scheme.
- Who decides on a rollback and how it’s documented (decision log).
change_log.md
Operational Logbook
A standardized entry format: DATE → VERSION/CONFIG → WHAT CHANGED → HOW VERIFIED → IMPACT → DECISION (accepted/rolled back) → LINKS TO EVIDENCE
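A hypothetical entry in that format might look like this (the model name, dates, and figures are purely illustrative):

```
2025-10-26 → vendor-model-2024-06-15 (temperature 0.2) → provider announced a tokenizer update →
re-ran the eval_snapshots/ regression set → P95 latency +8%, answer quality within baseline →
accepted → /evidence/internal/eval_snapshots/2025-10-26/
```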
2 Critical Processes (Your Response Systems)
Monitoring (Detecting Degradation)
What to measure (examples)
- Proxy Quality: user ratings/flags, rate of failures/“not sure”.
- Technical: P95 latency, error rate, % of blocked responses.
- Business: support complaints, conversion rate of a key feature.
How to define thresholds (the process)
- Establish a baseline by measuring normal performance over 2–4 weeks.
- Assess your risk appetite (what’s worse: false alarm or missed incident?).
- Start with ranges: quality drop >10–20% / 7d; P95 / error-rate +30–50% / 24h; complaints >0.3–1.0% of sessions / 24h.
- Document chosen thresholds & reasoning in methods.md and review monthly.
Response
When a threshold is triggered, an incident is created and assigned within ≤ 4 hours.
Guideline, not a norm. Don’t transfer internal engineering thresholds into client SLAs without legal consultation.
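To make the baseline-and-threshold process concrete, here is a minimal sketch; the metric names, baseline values, and thresholds mirror the example ranges above and are starting points to tune, not norms.

```python
# Illustrative degradation check against a baseline measured over 2–4 weeks of normal traffic.
BASELINE = {"quality_score": 0.82, "p95_latency_ms": 1200, "error_rate": 0.01}        # example values
THRESHOLDS = {"quality_drop": 0.15, "latency_increase": 0.40, "error_increase": 0.40}

def degradation_alerts(current: dict, baseline: dict = BASELINE, limits: dict = THRESHOLDS) -> list:
    """Return the list of triggered alerts; an empty list means no threshold was crossed."""
    alerts = []
    if current["quality_score"] < baseline["quality_score"] * (1 - limits["quality_drop"]):
        alerts.append("quality drop beyond threshold")
    if current["p95_latency_ms"] > baseline["p95_latency_ms"] * (1 + limits["latency_increase"]):
        alerts.append("P95 latency regression")
    if current["error_rate"] > baseline["error_rate"] * (1 + limits["error_increase"]):
        alerts.append("error-rate spike")
    return alerts
```

Each triggered alert should open an incident per the response rule above, with the source metrics attached.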
Rollback (Returning to Stability)
Example timeframes based on typical architecture patterns:
IMPORTANT: These are EXAMPLES, not standards.
Your actual rollback targets must be based on:
- Your system’s criticality (a core user-facing feature vs. an internal tool)
- Your operational capacity (a 24/7 on-call team vs. business hours support)
- Your risk tolerance (a regulated fintech app vs. a startup experiment)
A 6-hour rollback window for a non-critical feature may be perfectly acceptable. Document YOUR reasoning in methods.md—the process matters more than the number.
Trigger conditions (examples)
- A high-severity incident.
- Repeated regression of key metrics for 2 consecutive days.
- A vendor-announced breaking change affecting your scenario.
Steps
- Feature-flag off / pin previous version.
- Gradually drain traffic.
- Perform a sanity check.
- Record in change_log.md and send notifications per agreements.
Action: document in methods.md your target SLA based on your architecture,
rollback triggers, and roles/responsibilities. Perform a dry run in a staging environment once a
quarter.
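As a sketch of the "feature-flag off / pin previous version" step, assuming a pinned-version config like the earlier example; flag and field names are illustrative.

```python
from typing import Optional

# Illustrative rollback switch: one config change routes traffic back to the last stable
# version, or disables the feature entirely while you investigate.
FEATURE_FLAGS = {"llm_feature_enabled": True, "rollback_active": False}

def select_model(config: dict, flags: dict = FEATURE_FLAGS) -> Optional[str]:
    """Return the model version to call, or None if the feature is switched off (degrade gracefully)."""
    if not flags["llm_feature_enabled"]:
        return None                        # feature-flag off: serve the non-LLM fallback UX
    if flags["rollback_active"]:
        return config["fallback_model"]    # pin the previous, known-good version
    return config["model"]                 # normal operation: the current pinned version
```

Record the flip (who, when, why) in change_log.md and send client notifications per your agreements.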
Special Case: Compliance for OSS & RAG Systems
When you build on Open-Source Software (OSS) or RAG instead of a commercial API, the role of “vendor” partially shifts to you. Your Evidence Pack must cover the entire chain: data → model/weights → processes.
You are now responsible for the licensing chain, data provenance, operational safeguards, and rollback strategy. The cards below outline the engineering minimums to document and implement.
Data Governance (the most critical)
- Corpus Licenses: Maintain a license ledger for each source, verifying permissions for commercial use, derivatives, and attribution. Do not mix incompatible licenses.
- Provenance: Keep a data manifest with source URL, download date, file hashes, and index version for reproducibility.
- TDM Opt-out vs. Technical Signals: Comply with legal TDM opt-out. Document your policy for respecting signals (e.g., robots.txt, noai) in methods.md.
- Data Hygiene / PII: Describe and enforce rules for removing or masking PII and other sensitive fields before indexing.
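The Provenance item above calls for a data manifest; here is a minimal sketch of one manifest record, with illustrative field names and a JSONL writer to keep the manifest diff-friendly in version control.

```python
import datetime
import hashlib
import json
from pathlib import Path

def manifest_entry(local_path: str, source_url: str, license_id: str, index_version: str) -> dict:
    """Build one provenance record: source, download date, content hash, and the index that uses it."""
    data = Path(local_path).read_bytes()
    return {
        "source_url": source_url,
        "local_path": local_path,
        "license": license_id,                          # cross-reference to the license ledger
        "downloaded_at": datetime.date.today().isoformat(),
        "sha256": hashlib.sha256(data).hexdigest(),
        "index_version": index_version,
    }

def append_to_manifest(record: dict, manifest_path: str = "data_manifest.jsonl") -> None:
    """Append one record per line; rebuild the RAG index only from files listed here."""
    with open(manifest_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
```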
Model & Licensing
- Model/Weights License: Record the license and terms of the OSS model (e.g., Apache, MIT, RAIL, or vendor terms like Llama).
- Your own
model_use_note.md: Describe the system’s intended purpose, excluded domains, and liability boundaries.
Monitoring & Rollback (RAG specifics)
- What to Monitor: Retrieval metrics (rate of “no answer”), index freshness, and citation quality.
- Index Versioning: Pin the index & embeddings; maintain an index change log.
- Rollback Plan: Prepare rollback not only for the model, but also for the index (keep the last stable version addressable).
Public Transparency
- Public Summary: Describe categories of data sources (e.g., “public technical docs,” “our own blog posts”) without disclosing IP.
- Attribution: If source licenses require it, maintain an attribution ledger and publish acknowledgments.
Privacy & Security
- Risk Assessment: Conduct a DPIA if personal data is processed.
- Access Control: Document who can modify the index or model weights and keep a tamper-evident log of changes.
Conclusion for OSS / RAG
Your Evidence Pack must include a license ledger, data manifest, index_change_log, your own model_use_note, and a methods.md with TDM rules and retrieval monitoring. This demonstrates end-to-end control over your system.
8 Key Practices (Your Internal Standard)
For each practice, we define: Owner → Cadence → Artifact.
1. Supply Chain Transparency
2. Incident & Complaints Channel
3. Purpose Mapping
4. Change Management
5. Evaluation & Monitoring
6. Public Transparency (No IP Leakage)
7. Content & Safety Guardrails
8. Audit Trail of Correspondence & Sources
Your Implementation Checklist: From Zero to Audit-Ready
Tick each item as you complete it.
Step 1: Collect (The Foundation)
Step 2: Assess (The Risk Analysis)
Step 3: Protect (Implementation & Processes)
Documentation
Engineering Processes
Legal & Team Alignment
FAQ: Common Questions for GPAI Deployers
To conclude, here are answers to the most frequent questions teams face when deploying General-Purpose AI (GPAI) systems. These are practical, engineering-first explanations designed to help you pass procurement, reduce risk, and stay audit-ready.
What are a deployer’s obligations under the EU AI Act when using a GPAI API from OpenAI, Meta, or Mistral?
Your primary obligation is to exercise due diligence. Under the EU AI Act, liability is shared: the provider is responsible for the foundational model, and you—the deployer—are accountable for its application. In practice, you must:
- Conduct vendor due diligence by collecting and assessing documentation (e.g., Model Card, Training Data Summary).
- Map the model’s intended purpose to your specific use-case and document your risk assessment.
- Maintain an internal Evidence Pack that records processes and decisions.
- Implement technical compensating controls (monitoring, rollbacks, version pinning, HITL where required).
How do we pass an enterprise procurement or security questionnaire? Is there a vendor due-diligence checklist?
Yes. The 7-point audit in Step 2 of this guide is your vendor due-diligence checklist. To pass procurement, prove that you run a managed process. Your Evidence Pack should include:
- A completed risk_table.xlsx with your OK / Partial / GAP assessment.
- Your model_use_note.md defining the use-case and its boundaries.
- Proof of safeguards in methods.md (e.g., version pinning, staged rollout, monitoring).
- A concise, non-sensitive public_summary.md for non-technical stakeholders.
Is it enough that our company is outside the EU and our provider is compliant?
No. The EU AI Act has extraterritorial reach and uses a principle of shared responsibility. If your service is available to users in the EU, the act likely applies. You cannot inherit a provider’s compliance. You must perform your own risk assessment for the specific application and demonstrate control over the system.
How can we maintain public transparency without exposing our intellectual property (IP)?
The golden rule: disclose processes, not secrets. Split transparency into public vs. internal layers:
- Public: Share categories of data (as stated by the vendor) and confirm the existence of controls (monitoring, HITL, rollback). Avoid specific metrics, prompt designs, or thresholds.
- Internal: Keep full details—risk analysis, thresholds, test cases, prompt templates—in your Evidence Pack.
This aligns with GPAI transparency expectations while protecting your IP.
What should we do if a provider doesn’t respond or has no Model Card? What about OSS/RAG systems?
Your actions become part of your due diligence.
For commercial APIs:
- Log everything: Record requests and follow-ups in the Evidence Pack.
- Mark GAPs: Note the lack of response or artifacts in risk_table.xlsx.
- Implement compensating controls: mandatory HITL for sensitive cases, strict version pinning, tighter monitoring and alerts.
For OSS / RAG systems: you effectively become the provider. Your Evidence Pack must include your own
model_use_note.md, a license ledger for every source, a data manifest proving provenance,
and evidence that you respect TDM opt-out signals.