
The European Commission published a set of non-binding guidelines, a Code of Practice, and a training-data template in July 2025 to clarify duties for general-purpose model providers.
These rules take effect August 2, 2025, with formal enforcement possible from August 2, 2026. Models already on market before August 2, 2025 have until August 2, 2027 to meet requirements.
The Artificial Intelligence Office will act as sole enforcer for this GPAI regime. That office favors a collaborative, proportionate approach for providers who engage early and in good faith.
This whitepaper maps a lifecycle approach that starts at large pre-training runs and follows through downstream development and use. It previews obligations for transparency, technical documentation, downstream sharing, copyright and training-data summaries using the Office template.
Key takeaway: adopt governance, documentation, and data transparency now. Following the Code of Practice can create a presumption of compliance and often reduces information requests and reviews.
Executive summary: What U.S. providers need to know now
This summary gives U.S. providers a clear short list of priorities to reach compliant market entry.
Immediate actions: identify placement channels (API, download, cloud, embedded) and flag any European Union touchpoints. Assess whether your model meets indicators of broad generality or generative capability, and log training compute against published thresholds.
Know when your firm becomes a provider. Integration into an EU-available product, downstream placement by partners, or fine-tuning that materially changes capabilities can trigger status and liability.
Article 53 sets four core duties: technical documentation, downstream integration information, a copyright policy aligned with local law (with opt-outs), and a public training-data summary using the Office template.
Open-source, open-weight exemptions exist but are limited. Non-monetization, public weights and architecture, and clear usage terms help qualify. Copyright and training-data summaries still apply.
Models with systemic risk—likely at very large compute scales—face extra notification and monitoring duties. Providers carry the burden of proof.
Extraterritoriality: U.S. teams may be covered if models or outputs reach the market in Europe, so align governance even without a local office. Documentation and traceability from pre-training through updates form the backbone of defensible compliance.
Quick checklist: compute accounting, capability scoping, provider-status mapping, documentation readiness, copyright/TDM safeguards, and template completion.
How the EU AI Act defines general-purpose AI models and “significant generality”
Regulators frame GPAI by focusing on whether a model shows broad, task‑agnostic competence. Classification depends on capability breadth rather than how a system is delivered or embedded.
“Models that display significant generality can competently perform a wide range of distinct tasks.”
Indicative compute and generative criteria
The Guidelines set a presumption: training compute above 10^23 FLOP plus generative outputs such as language, text‑to‑image, or text‑to‑video suggests GPAI status.
This is not a hard line. Exceeding that compute alone does not automatically qualify a model if it remains narrowly scoped.
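As a rough illustration, the indicative presumption can be expressed as a simple check. The 10^23 FLOP threshold and the generative-output condition come from the Guidelines; the function and variable names below are a hypothetical sketch, not an official classification tool, and real classification also weighs benchmark evidence and capability breadth.

```python
# Sketch of the Guidelines' indicative GPAI presumption (simplified).

GPAI_COMPUTE_THRESHOLD = 1e23  # training FLOP, per the July 2025 Guidelines

GENERATIVE_MODALITIES = {"text", "image", "video", "audio", "code"}

def presumed_gpai(training_flop: float, output_modalities: set) -> bool:
    """Return True when the indicative presumption applies.

    The presumption is rebuttable: a narrowly scoped model above the
    threshold may fall outside GPAI, and a broadly capable model below
    it may still qualify on benchmark evidence.
    """
    generative = bool(output_modalities & GENERATIVE_MODALITIES)
    return training_flop > GPAI_COMPUTE_THRESHOLD and generative

# A 2e23 FLOP text model triggers the presumption.
print(presumed_gpai(2e23, {"text"}))            # True
# A transcription-only model does not, even at high compute.
print(presumed_gpai(5e23, {"transcription"}))   # False
```

Note that the second case mirrors the edge cases discussed below: high compute alone is not decisive.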
Edge cases and practical advice
Large compute runs that yield a single narrow capability—like transcription‑only or music‑generation systems—may fall outside GPAI despite high FLOP counts.
Conversely, models below the threshold can still meet the standard if benchmark results and real‑world tests show wide-ranging capabilities.
U.S. teams should instrument compute tracking and capability mapping early, keep living documentation from experiments through pre‑training, and gather benchmark evidence to support classification.
When generality is ambiguous, engage the Office for clarity. This reduces downstream compliance risk and guides whether Article 53 duties and systemic risk rules apply.
When you become a “provider”: placement on the market, integration, and fine-tuning
Placing a model on the EU market can change legal status quickly. Availability via an API endpoint, a downloadable artifact, cloud-hosted access, or a model embedded in devices sold in Europe counts as placement.
Integration matters. Adding a model into software or products offered locally can make your firm a provider even if original development occurred elsewhere. Internal use that affects rights of people in Europe may also trigger duties.
Downstream actors and distribution
If a downstream actor ignores a clear upstream prohibition against EU distribution, they step into provider shoes and inherit obligations. Keep licensing and distribution terms unequivocal to prevent unintended coverage.
Modifications that trigger provider status
You become a provider when modifications materially change generality, capabilities, or risk. A practical rule of thumb: modification compute above one-third of the original training compute suggests significant change. If the original compute is unknown, treat modification compute above 3.3 × 10^22 FLOP (one third of the 10^23 presumption threshold) as an indicator.
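That rule of thumb can be sketched as a small decision helper. The one-third ratio and the fallback figure come from the guidance above; the function name and shape are illustrative, and compute is only one indicator among several (qualitative capability shifts also matter).

```python
# Fallback used when the original model's training compute is unknown:
# one third of the 1e23 FLOP GPAI presumption threshold.
FALLBACK_THRESHOLD = 1e23 / 3  # ~3.3e22 FLOP

def modification_indicates_provider(mod_flop: float, original_flop=None) -> bool:
    """Indicator only: material capability or risk changes can trigger
    provider status regardless of compute."""
    if original_flop is not None:
        return mod_flop > original_flop / 3
    return mod_flop > FALLBACK_THRESHOLD

# Fine-tune using 40% of known original compute: indicator fires.
print(modification_indicates_provider(4e22, original_flop=1e23))  # True
# Original compute unknown, modification below the fallback: it does not.
print(modification_indicates_provider(3e22))                      # False
```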
Practical tips: document compute used for fine‑tuning, record qualitative shifts in behavior, and maintain clear decision trees to classify original provider, downstream integrator, or modifier. Non‑EU organizations should appoint an authorized representative to handle compliance interfaces.
“Document compute and capability changes early; that record reduces later compliance friction.”
Extraterritorial reach: Why U.S.-based GPAI model providers must care
If a system or its outputs reach users inside the European Union, U.S. developers may acquire legal duties. That rule covers placing a model on an app store, serving requests from EU IP ranges, or embedding software in devices sold in the region.
Practical exposure includes API calls from European Union addresses, cloud access routed through regional hosts, distribution via local marketplaces, and devices shipped into member states.
Safe-harbor signals and clear exclusions
A safe harbor exists when developers clearly and unequivocally prohibit European Union placement and use. Downstream actors who ignore such bans are treated as providers for compliance purposes.
“Make exclusions explicit, consistent, and technically enforced to shift responsibility downstream.”
Use license clauses, API terms, technical geofencing, and visible non‑EU notices. Monitor reseller channels and user patterns to spot unauthorized placements. If internal use affects rights of natural persons in the region, assume duties may apply and assess risk. For firms that do become providers without local establishment, appoint an authorized representative and prepare for full Article 53 compliance or enforce strict exclusion controls.
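Technical geofencing can be as simple as screening client addresses against excluded-region ranges. The sketch below uses Python's standard ipaddress module; the CIDR block shown is a documentation placeholder, and a real deployment would pull current EU allocations from a maintained geolocation source rather than hard-coding them.

```python
import ipaddress

# Placeholder deny-list; in practice, source EU-allocated CIDR blocks
# from a maintained geolocation database, not a hard-coded list.
EXCLUDED_BLOCKS = [ipaddress.ip_network("192.0.2.0/24")]

def request_allowed(client_ip: str) -> bool:
    """Reject requests originating from addresses in excluded ranges."""
    addr = ipaddress.ip_address(client_ip)
    return not any(addr in block for block in EXCLUDED_BLOCKS)

print(request_allowed("198.51.100.1"))  # True: outside excluded ranges
print(request_allowed("192.0.2.5"))     # False: inside a denied block
```

Geofencing is one control among several; pair it with license terms and visible notices so the exclusion is both stated and enforced.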
EU begins enforcement steps on the AI Act affecting general-purpose AI providers
July guidance and FAQs turned high-level principles into clear expectations for model teams. The Commission’s July 18 materials clarify how Article 53 will apply in practice and point teams toward concrete compliance work.
From guidance to action: GPAI Guidelines, FAQ, and AI Office oversight
The Guidelines and FAQ are non‑binding but carry weight because the AI Office will lead oversight. That office can request information and perform model evaluations, particularly for non‑signatories to the Code of Practice.
Lifecycle-based compliance starting at the large pre-training run
Obligations attach from the start of a large pre‑training run and travel with systems through downstream development, deployment, and use.
Providers should estimate pre‑training compute per the Guidelines and update records after training. Maintain clear logs for architecture choices, data sources, curation methods, and evaluation plans before scaling runs.
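For a first-order estimate before a run, many teams use the common 6 × parameters × tokens approximation for dense transformers. That heuristic is an assumption here, not the Guidelines' mandated methodology (which also permits hardware-utilization-based accounting); treat the result as an order-of-magnitude figure for internal gating.

```python
def estimate_training_flop(n_params: float, n_tokens: float) -> float:
    """Order-of-magnitude training-compute estimate for a dense transformer.

    Uses the widely cited 6 * N * D heuristic (N parameters, D training
    tokens); this is an estimate, not an official accounting figure.
    """
    return 6.0 * n_params * n_tokens

# Example: 70B parameters on 2T tokens ~ 8.4e23 FLOP, above the
# 1e23 FLOP GPAI presumption threshold.
print(f"{estimate_training_flop(70e9, 2e12):.1e}")  # 8.4e+23
```

Record the inputs and the estimation method alongside the number so the figure can be updated with measured compute after training.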
Practical readiness: assign clear roles, set internal gates before major runs or releases, and align legal, policy, and engineering functions. Early engagement with the AI Office can help schedule reviews and reduce friction as enforcement activity scales through 2026.
Core obligations under Article 53: What model providers must deliver
Article 53 requires a package of documentation, downstream guidance, copyright rules, and a public data summary. These four deliverables form a minimum compliance kit that must be kept current across model versions.
Technical documentation
What to include: training runs, compute accounting, testing methods, benchmark results, and evaluation reports that show capabilities and limits.
Cadence: update after major training, fine‑tuning, or release. Keep change logs and versioned artifacts for traceability.
Information for downstream use
Provide clear integration notes, interface expectations, known failure modes, and safety controls. This helps downstream teams embed models responsibly.
Supply usage examples, limits, and monitoring suggestions so integrators can meet their own compliance duties.
Copyright compliance policy
Adopt a policy aligned with CDSM Directive opt‑outs and operational checks. Implement processes to detect reservation flags and respect rights holders’ choices.
Public training data summary
Publish a sufficiently detailed summary using the Office template at market placement. Include sources, top domains, languages, and filtering steps while protecting trade secrets.
“Make documentation living, versioned, and publicly reachable at placement to reduce review friction.”
Build a cross‑functional pipeline that merges engineering logs, compliance narratives, and publication workflows. Harmonize records with the Code of Practice to gain a presumption of compliance and reduce inquiries by the oversight office.
Open-source, open-weight exemption: Where it applies and where it doesn’t
A permissive, non-monetized release with public weights and architecture may qualify for a narrow exemption. To meet that bar, a license must grant broad rights to access, use, modify, and redistribute. Weights, model design, and basic usage notes should be publicly available in a central repository.
Non-monetization, open weights/architecture, and permissible safety terms
The Commission treats any paid access or indirect revenue—paid tiers, mandatory support, ad‑funded access, or required services—as monetization that negates the exemption. Microenterprise-to-microenterprise transactions may be carved out; verify eligibility before monetizing.
Proportionate, objective safety terms aimed at high‑risk misuse are allowed if they are non‑discriminatory. Avoid vague restrictions that could be read as hidden monetization or gatekeeping.
Continuing obligations: copyright policy and training data summary
Even when the exemption applies, providers must still publish a copyright compliance policy and a training data summary under the Act. Exemption from some technical documentation duties does not remove these core obligations.
“Document licensing choices, state non-monetization clearly, and keep an open archive of weights and usage notes to preserve exemption clarity.”
Consider a dual-track approach: a fully open, non‑monetized release plus a separate commercial offering that accepts full obligations. Monitor Office guidance and FAQs to track evolving interpretation of monetization and safety carve-outs.
Models with systemic risk: Designation, notification, and added duties
The systemic-risk category covers models with potential for broad harm to health, safety, public security, fundamental rights, or society at scale. Providers must plan as if high-impact capabilities will trigger special oversight.
High-impact capabilities and the 10^25 FLOP presumption
The rules create a clear presumption: training compute above 10^25 FLOP suggests systemic risk. Estimate compute before a large pre‑training run to inform notification timing and internal reviews.
Provider evidence: benchmarks, parameters, modalities, data curation
When challenging a designation, providers bear the burden of proof. Assemble benchmark results (actual and forecast), scaling analyses, architecture and parameter counts, sample training examples, modalities covered, curation and processing notes, tool use, and context length.
Monitoring and mitigation beyond mere technical safeguards
Technical mitigations alone do not block designation. Instead, they shape ongoing monitoring and mitigation duties after designation. Establish a systemic-risk dossier early and update it as training and evaluations proceed.
“Document capabilities and compute credibly; gaps invite closer scrutiny and more frequent evaluations.”
Prepare notification packages and an internal review board to respond to oversight queries. Adopt augmented safety practices—security hardening, incident triage, and corrective-action loops—and integrate legal, policy, security, and ML engineering stakeholders for continuous post‑market monitoring.
The AI Office Template for training data disclosure: Transparency without trade secret leakage
A standardized template forces consistent disclosure while protecting sensitive practices.
The Template requires a public, sufficiently detailed summary of the content used to train the model across pre‑training, alignment, and fine‑tuning. Operational inputs such as retrieval‑only data are excluded.
Scope and key fields
Web‑crawl entries must list content types, geographies, languages, collection windows, crawler identity and settings, and the top 10% of domains by volume. Smaller firms face lower thresholds.
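Ranking domains by scraped volume is a straightforward aggregation. The sketch below is an illustrative computation of a "top fraction by volume" list, not the Office's official tooling; the record format (URL, byte count) is an assumption.

```python
from collections import Counter
from urllib.parse import urlparse

def top_domains_by_volume(records, fraction=0.10):
    """Return the top `fraction` of domains ranked by scraped bytes.

    `records` is an iterable of (url, byte_count) pairs. Illustrative
    only; real template preparation would run over crawl manifests.
    """
    volume = Counter()
    for url, nbytes in records:
        volume[urlparse(url).netloc] += nbytes
    ranked = volume.most_common()
    k = max(1, int(len(ranked) * fraction))
    return ranked[:k]

records = [
    ("https://a.example/page1", 500),
    ("https://a.example/page2", 200),
    ("https://b.example/page1", 300),
]
print(top_domains_by_volume(records, fraction=0.5))  # [('a.example', 700)]
```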
Dataset categories
For publicly available datasets, supply modality descriptions and links when a modality exceeds 3% of public dataset volume. For private licensed sets, confirm license sources and modalities. For user or synthetic content, identify modality and origin service or model.
Rights and safety measures
Describe steps used to respect text/data mining reservations (robots.txt, metadata signals) and processes to detect and remove illegal material such as CSAM or violent extremist content.
“Publish a clear summary at market placement and keep data lineage records to speed future updates.”
Align internal governance and build a lineage catalog early. Run legal and privacy reviews to balance transparency with trade secret protection and meet placement deadlines.
Copyright and text/data mining: Implementing Article 53(1)(c) at scale
Providers need an operational copyright playbook to turn legal text into repeatable data workflows. Build a unified policy that governs crawl rules, dataset intake, and downstream safeguards. Keep a public summary that matches internal controls and the training data template.
State-of-the-art identification of opt-outs and robots.txt compliance
Respect machine-readable reservations such as robots.txt and metadata. Update crawlers to detect signals reliably and log each decision for auditability.
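Standard robots.txt rules can be checked with Python's built-in urllib.robotparser. Note that it handles Allow/Disallow directives only, so extended TDM reservation signals (metadata flags, emerging opt-out standards) need separate parsers; the user-agent name below is a placeholder.

```python
from urllib import robotparser

def allowed_by_robots(robots_txt: str, url: str, agent: str = "TrainBot") -> bool:
    """Apply standard robots.txt Allow/Disallow rules to a candidate URL.

    Parses robots.txt text directly; a production crawler would fetch and
    cache robots.txt per host and log every decision for auditability.
    """
    rp = robotparser.RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch(agent, url)

rules = "User-agent: *\nDisallow: /private/\n"
print(allowed_by_robots(rules, "https://example.com/public/page"))   # True
print(allowed_by_robots(rules, "https://example.com/private/page"))  # False
```

A conservative default (skip the URL) is advisable when robots.txt cannot be retrieved at all.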
Excluding piracy domains and mitigating infringing outputs downstream
Maintain deny-lists for domains flagged by courts or authorities and update them regularly. Deploy filters to reduce verbatim or near-verbatim reproduction and add usage prohibitions in terms to block infringing downstream acts.
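One crude screen for near-verbatim reproduction is word n-gram overlap between a candidate output and a source text. The sketch below is illustrative only; production systems typically use fingerprinting (e.g., MinHash) or suffix-array methods to scale this check across large corpora.

```python
def ngram_overlap(candidate: str, source: str, n: int = 8) -> float:
    """Fraction of the candidate's word n-grams found verbatim in source.

    A value near 1.0 flags likely verbatim reproduction; near 0.0
    suggests little literal copying. Threshold choice is a policy call.
    """
    def grams(text):
        words = text.split()
        return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}
    cg, sg = grams(candidate), grams(source)
    if not cg:
        return 0.0  # candidate shorter than n words: nothing to compare
    return len(cg & sg) / len(cg)
```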
Accessible points of contact and complaint-handling mechanisms
Designate a visible point of contact for rights holders and offer an electronic complaint form with SLAs and fair triage. Keep resolution logs and link them to intake records so oversight bodies can review evidence quickly.
Operational checklist: governance, detection, deny-lists, output safeguards, clear terms, contact channels, audit logs, Template alignment, and team training.
The GPAI Code of Practice: A voluntary path to presumption of compliance
A voluntary Code of Practice offers a clear framework for harmonizing documentation and safety controls across model teams. It translates high-level guidelines into shared templates, labels, and expectations for design, testing, and publication.
Core chapters and how they work
The Code covers documentation, copyright, and safety/security for systemic-risk systems. Each chapter sets norms for records, rights handling, and defensive measures so teams speak the same language when they publish artifacts.
Practical benefits and trade-offs
Signatories gain a presumption of compliance, which often means fewer information requests and lighter evaluations from oversight bodies. That saves time and reduces audit burden during early reviews.
Non‑signatories may still qualify as compliant, but they face enhanced scrutiny and must supply robust alternative proofs of conformity under the Act.
Adopt, govern, and communicate
Engage while voluntary and aim to sign by August 2, 2026 to benefit from a collaborative, proportionate posture. Conduct a gap analysis mapped to Code chapters, assign accountable owners, and build an evidence repository.
Align vendors and partners with the Code and publish clear signals to customers and rightsholders. That simple move eases integration and supports market trust for model providers.
Enforcement timeline and posture: Prepare for audits and model evaluations
Three milestone dates give providers a predictable path for documentation and remediation.
Key dates and what they mean
August 2, 2025: models placed on the market after this date must ship with technical documentation, downstream use notes, a copyright policy, and a public training data summary.
August 2, 2026: oversight activity may begin. That can include information requests and targeted model evaluations, especially for teams that have not joined the Code of Practice.
August 2, 2027: legacy systems have a two‑year window to meet requirements. Use this time for planned remediation and documentation backfill.
Proportionate, collaborative engagement
The Commission emphasizes flexible, proportionate interaction for good‑faith teams. Early outreach about in‑progress training or compliance gaps can reduce friction during reviews.
No mandatory unlearning where infeasible
There is no blanket demand to retrain or erase past work when that is impossible or disproportionate. Still, disclose exceptions and justify them in your public policy and training‑data summary.
Audit readiness and practical tips
Set a regular cadence for mock evaluations, documentation spot checks, and traceability audits. Align legal, engineering, and product early so public statements match internal records.
“A transparent, well‑maintained compliance record is the best defense in any audit or evaluation.”
Pre‑register contact points and escalation paths so you can respond quickly to information requests. Clear records, timely disclosures, and active engagement with oversight bodies strengthen your position during reviews.
Technical documentation and transparency: Building a defensible compliance record
Start documentation before a large training run to capture decisions that matter later. Early entries save time when regulators or partners request information and help teams show good‑faith processes.
Model description, data lineage, performance metrics, and oversight design
Model description should include architecture narratives, training and testing protocols, and evaluation plans. Add performance results and clear statements of intended capabilities and limits.
Data lineage needs source lists, modality breakdowns, curation steps, licensing paths, and dataset composition summaries. Keep records that tie samples back to ingestion and filtering rules.
Oversight design must describe human‑in‑the‑loop points, escalation paths, and incident handling for adverse outputs.
Traceability: log retention, post-market monitoring, and updates
Adopt version control for models and datasets and immutable logs for major changes. Define log retention policies and audit trails for rights reservations and content filters.
Set post‑market monitoring: telemetry signals, incident thresholds, and feedback channels that feed updates. Review documentation after significant fine‑tuning or capability shifts and align records with the Code and the Office Template to speed reviews.
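Immutable change logs can be approximated with a hash chain, where each entry commits to its predecessor's hash so later tampering is detectable. This is a minimal sketch of the idea under that assumption, not a production audit system; class and field names are illustrative.

```python
import hashlib
import json

class ChangeLog:
    """Append-only, hash-chained record of model and dataset changes."""

    def __init__(self):
        self.entries = []

    def append(self, event: dict) -> str:
        # Each entry's hash covers the previous hash, linking the chain.
        prev = self.entries[-1]["hash"] if self.entries else "0" * 64
        body = json.dumps({"prev": prev, "event": event}, sort_keys=True)
        digest = hashlib.sha256(body.encode()).hexdigest()
        self.entries.append({"prev": prev, "event": event, "hash": digest})
        return digest

    def verify(self) -> bool:
        # Recompute every hash; any edited entry breaks the chain.
        prev = "0" * 64
        for e in self.entries:
            body = json.dumps({"prev": prev, "event": e["event"]}, sort_keys=True)
            if e["prev"] != prev or e["hash"] != hashlib.sha256(body.encode()).hexdigest():
                return False
            prev = e["hash"]
        return True

log = ChangeLog()
log.append({"model": "v1.0", "change": "initial release"})
log.append({"model": "v1.1", "change": "safety fine-tune"})
print(log.verify())  # True
```

Anchoring periodic chain heads in a separate system (or with a timestamping service) strengthens the tamper-evidence further.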
Tip: concise, dated artifacts and consistent traceability reduce review time and build partner trust.
Safety, security, and fundamental rights: Risk-based controls for GPAI
Effective protection for powerful systems pairs technical resilience with clear human oversight and incident workflows.
Adopt a risk-based approach that links capability breadth to controls. Classify hazards by likely harm, affected groups, and deployment context. Map controls to that ranking so mitigation scales with exposure.
Robustness and adversarial testing
Run red-teaming, adversarial testing, and stress tests across modalities and tool-use setups. Include scenario-based evaluations that simulate evasion and poisoning attempts.
Cyber defenses and supply chain
Implement monitoring, anomaly detection, and artifact signing to guard against data or model poisoning. Verify dependencies, record dataset provenance, and enforce secure build pipelines.
Bias, feedback loops, and oversight
Use diverse, documented datasets and targeted pre-training or fine-tuning interventions to reduce bias. Post-deployment audits must track drift and feedback loops that can amplify harms.
Output-side safeguards should limit harmful or infringing content and align with copyright policy. Ensure human escalation paths where outputs could impact fundamental rights or public safety.
“Document controls clearly and update them after incidents to speed reviews and build trust.”
Market strategy for GPAI providers: Placement decisions, service design, and licensing
Deciding how and where to place a powerful model shapes legal exposure and market opportunity. Choose between clear exclusion from regional access or a launch built for full compliance.
Strategic options: restrict access with unequivocal prohibitions to avoid provider status, or plan for intentional market placement that meets Article 53 obligations. Open, non‑monetized releases may qualify for limited exemptions; monetized offerings must carry full compliance work.
Service and licensing choices
Keep open and commercial editions operationally separate. This prevents commingled obligations and simplifies traceability during audits. Consider distribution channels—API, cloud, or on‑device—when timing Template publication and versioned records.
Design services with downstream guidance: rate limits, safety filters, and clear integration notes so integrators can meet their own duties. Appoint an authorized representative if you lack a local establishment but will act as a market provider.
Tip: adopt the Code of Practice early to gain a presumption of compliance and reduce oversight friction.
Vet partners and resellers so access controls hold. If training spans key regulatory dates, inform the Office proactively and phase compliance to match development and deployment timelines. Finally, publish a concise customer communications plan that explains rights handling, safety measures, and update cadences to build trust in the model market.
Conclusion
Providers who act now can turn regulatory clarity into a market advantage while lowering legal risk.
Enforcement steps remain on schedule for 2025–2027, so U.S. teams should map European Union exposure and decide whether to restrict access or meet full obligations.
Key deliverables are simple: robust documentation, downstream information for integrators, a clear copyright policy, and a public training data summary.
Open‑weight, non‑monetized releases may win limited relief, while systemic‑risk systems face added duties and monitoring.
Engage the AI Office, consider the Code of Practice, and build lifecycle records from pre‑training through updates. Transparent traceability and safety engineering protect audits and unlock durable market trust.