AI Diligence for M&A: The Workstream Most Buyers Aren't Running Yet
AI diligence is now what cyber-DD was in 2018: a discrete M&A workstream that wasn't standard until the loss profile forced it. After the $1.5 billion Bartz settlement, Mobley, TRAIGA, and the EU AI Act's high-risk obligations, here is the buyer-side framework.
Why AI Diligence Is Becoming a Standalone Workstream
Modern M&A diligence runs on a standardized taxonomy. Legal, financial, tax, IP, employment, environmental, cyber, and data-privacy workstreams each have their own checklists, specialists, and reps. A dedicated AI diligence workstream is not yet on that list outside top-tier transactions. Skadden's 2026 M&A outlook treats AI assets as requiring tailored diligence, including by specialized third-party diligence firms — language that signals the discipline is emerging rather than universally adopted.
The gap matters because AI risk now intersects every other workstream simultaneously. Training-corpus exposure became a board-level concern after the Bartz v. Anthropic settlement, the largest copyright settlement in U.S. history at roughly $1.5 billion. Bias and employment-screening litigation, regulatory exposure under TRAIGA and the EU AI Act, and securities exposure tied to AI-washing claims now flow through targets that most acquirers are still pricing with pre-AI reps.
Standard tech-M&A reps were not built for this. Mayer Brown's analysis of key representations and warranties in tech M&A shows that conventional IP, no-litigation, and compliance-with-laws reps capture only fragments of AI risk. They do not price training-corpus provenance, AI bill of materials gaps, or pre-2025 vendor terms drafted before model-output indemnities became standard.
The historical analog is cybersecurity diligence. Cyber due diligence was uncommon in the mid-2010s and standard by the late 2010s, driven by a handful of headline breaches that reset buyer expectations. Promise Legal reads AI diligence as on the same trajectory, compressed. The vocabulary — AI diligence, AI BOM, training-corpus provenance, lawful-corpus warranty, the AI workstream itself — is still being defined. Promise Legal is helping define it.
What Buyers Inherit When AI Diligence Is Skipped
Skip the AI workstream and the buyer inherits five concrete exposure categories that survive change-of-control: training-corpus liability, vendor-rep gaps, algorithmic discrimination, regulatory non-compliance, and AI-washing disclosure risk. Each tracks a different doctrine, but they share a common feature — none of them resets at the closing table.
The first track is training-corpus liability. Bartz v. Anthropic resolved with a settlement structure that translates to roughly $3,000 per work across approximately 500,000 books, plus destruction of the pirated corpus within 30 days of final judgment. Asset buyers are generally not responsible for seller liabilities, but the continuity-of-enterprise exception can pull seller liability onto the buyer when the acquired business continues operating substantially intact. A target that trained on unlicensed corpora carries that exposure into the buyer's balance sheet. Reading Kadrey v. Meta as a green light is a misread — Skadden notes the ruling is record-bound and does not stand for the proposition that training is lawful in general.
The second track is the vendor-rep gap. Lawful-training-data warranties are emerging in M&A practice but are not yet universal, which means most pre-2025 vendor stacks lack the indemnity hooks a buyer needs to push corpus risk back upstream.
The third track is algorithmic discrimination. Mobley v. Workday obtained preliminary ADEA collective certification on May 16, 2025, with a prospective class spanning every applicant denied an employment recommendation through Workday's platform during the relevant period — a population that commentary describes as numbering in the “hundreds of millions.” Louis v. SafeRent reached a $2.275 million final-approval settlement in November 2024. The EEOC's Title VII guidance closes the deployer's escape hatch: algorithmic decision-making is a “selection procedure,” and a third party's assurances or representations regarding compliance do not insulate the deploying employer.
The fourth track is regulatory non-compliance. Texas's TRAIGA, the Colorado AI Act, the EU AI Act, and NYC Local Law 144 each impose distinct obligations — risk assessments, bias audits, deployer notices, transparency disclosures — that travel with the acquired system regardless of which entity originally deployed it.
The fifth track applies to public-company targets: AI-washing disclosure exposure. The SEC's first AI-washing actions hit Delphia and Global Predictions with $225,000 and $175,000 civil penalties for misleading AI-capability statements. Acquired marketing copy and investor decks become the buyer's disclosure problem.
None of these exposures reset at close. With the inheritance map clear, the diligence workstream takes shape.
Defining the AI Diligence Workstream
AI diligence resolves into three layers. The first is inventory and classification — every AI system the target develops, deploys, or depends on, sorted by risk tier and regulatory exposure. The second is provenance and provenance gaps — where training data, models, and components originated, and where the documentary chain breaks. The third is governance and documentation — the policies, records, and management-system artifacts that prove the target operates AI as a controlled function rather than a shadow workstream.
Within that architecture, a complete AI diligence file answers five concrete questions:
- What AI systems does the target develop, deploy, or rely on? The buyer should request an AI Bill of Materials. The OWASP AIBOM Project defines the expected fields — algorithms, data-collection methods, frameworks, libraries, licensing, and standards-compliance posture. SPDX 3.0 extends machine-readable SBOM tooling to cover datasets, model metadata, pipelines, and runtime, so an AI BOM can be managed with the same supply-chain discipline as software components.
- Where did the training data come from, and what is the provenance evidence? Each model in the AI BOM needs a corpus-source trail — licensed datasets, scraped sources, synthetic data, customer data, and the consents or licenses behind each. Gaps here drive the lawful-corpus warranty discussed in the representations and warranties section below.
- What contractual representations cover AI in vendor agreements, and what are the gaps? Pre-2025 vendor contracts rarely price training-corpus provenance. The buyer must map every AI vendor against its rep set and flag the silences.
- What governance documentation exists? AI policies, AI Council minutes, model cards, training records, red-team reports, and incident logs. Targets aligned to the NIST AI RMF 1.0 Govern-Map-Measure-Manage functions, or certified to ISO/IEC 42001:2023 — the only certifiable international AI management system standard — produce this evidence on request. Targets without that alignment produce excuses. Promise Legal's TRAIGA safe-harbor analysis sets out the buyer-side documentary spine that maps NIST AI RMF substantial compliance to affirmative-defense posture.
- What regulatory regimes apply, and what is the target's posture under each? TRAIGA, the Colorado AI Act, the EU AI Act, NYC Local Law 144, and sectoral regulators all attach different obligations to different AI use cases. The diligence file documents posture, not aspiration.
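The AI BOM ask above can be made concrete. The sketch below is a hypothetical illustration, not a conforming SPDX 3.0 document: field names loosely follow the OWASP AIBOM categories, and the model and dataset names are invented. The point is the diligence mechanic — an entry with no documented license or consent behind a corpus is exactly the gap the lawful-corpus warranty has to price.

```python
from dataclasses import dataclass, field

@dataclass
class DatasetRecord:
    # Provenance trail for one training dataset (hypothetical field names)
    name: str
    source: str              # e.g. "licensed", "scraped", "synthetic", "customer"
    license_or_consent: str  # the permissioned basis, or "" if undocumented

@dataclass
class AIBOMEntry:
    # One model's entry in the AI Bill of Materials
    model_name: str
    frameworks: list[str]
    datasets: list[DatasetRecord] = field(default_factory=list)

    def provenance_gaps(self) -> list[str]:
        # Datasets with no documented license or consent: diligence red flags
        return [d.name for d in self.datasets if not d.license_or_consent]

# Example: one model with a documented and an undocumented corpus
entry = AIBOMEntry(
    model_name="resume-screener-v2",
    frameworks=["pytorch"],
    datasets=[
        DatasetRecord("licensed-news", "licensed", "publisher license #123"),
        DatasetRecord("web-crawl-2021", "scraped", ""),
    ],
)
print(entry.provenance_gaps())  # → ['web-crawl-2021']
```

Run across the full inventory, a check like this turns the buyer's second question — where did the training data come from — into a flag list a deal team can negotiate against.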
Three layers, five questions, one file. The next move is converting those findings into reps.
Modern AI Representations and Warranties
A modern AI rep set has six families. First, a lawful-training-corpus warranty: every model was trained only on data obtained through binding consent, license, or another permissioned basis. Second, an AI BOM disclosure schedule that itemizes every model, dataset, weight, and upstream dependency. Third, algorithmic-harm reps: no material discriminatory outcome, no pending bias claim, no regulator inquiry into automated decision-making. Fourth, AI regulatory compliance reps covering substantive compliance plus framework alignment where the target has elected NIST AI RMF or ISO 42001. Fifth, AI vendor pass-through reps: every AI vendor has delivered an upstream lawful-corpus warranty, and any indemnity carve-outs survive change of control. Sixth, indemnity baskets and survival periods scoped separately from the general cap.
Outside counsel writing on the 2025 deal flow track this same architecture. Torys instructs buyers to require a specific representation that the target's AI model was trained only with permissioned data, meaning data obtained through legally binding consent or license. Osler maps the broader rep surface across privacy and data use, IP, AI functionality and training, internal AI governance frameworks, and compliance with applicable AI laws and regulations. The six-family structure is how Promise Legal operationalizes that surface in an actual purchase agreement.
Insurance is moving in parallel. Mayer Brown reports that RWI carriers are scrutinizing AI risks more closely, with potential policy exclusions emerging for data provenance, model performance, and other AI-specific risks. A buyer who assumed the RWI tower would absorb a training-corpus claim may find the exclusion did the underwriting instead.
That is why Osler treats separate liability caps, survival periods, and targeted indemnities for AI-specific reps as appropriate practice for deals where AI risk has been identified in diligence. Promise Legal's view goes one step further: survival for lawful-corpus reps should run longer than standard IP survival, because copyright-style training-corpus claims have long fuse times tied to discovery and per-work damages aggregation. Bartz-style exposure can accumulate across years of corpus growth before a complaint is ever filed. The data room shows what backs each of these reps.
The AI Data Room: What Buyers Should Demand
Standard data-room sections rarely cover AI directly. IP, privacy, and material contracts each capture a slice, but none surface the training-corpus provenance, AI BOM, or governance posture that drive modern AI risk. Buyers should request a discrete AI section with a defined contents list, not a scattered set of attachments filed under legacy categories.
The minimum buyer ask should include:
- A complete AI inventory with risk classification per system, identifying functions and any third-party technologies or data sources involved.
- Training-data provenance documentation, including sourcing, licensing, and the AI BOM supporting any lawful-corpus warranty.
- The target's AI policy stack and the AI Council charter and meeting minutes.
- Model cards or AI FactSheets for material systems, following the Hugging Face model card and IBM FactSheet templates that cover training, evaluation, uses, limitations, and citation.
- The vendor contract list with AI-specific representations highlighted across the six-family rep set.
- Incident logs and tabletop exercise documentation for AI-related events.
- Regulatory analyses for TRAIGA and the EU AI Act, with the target's position under each.
- Litigation and pre-litigation correspondence touching AI, including EEOC bias complaints and internal AI ethics complaints that live outside the standard material-litigation rep and must be requested explicitly.
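The model-card ask in the list above can be turned into a simple completeness check. This is a sketch: the section names loosely follow the Hugging Face model card template named in the text, not a canonical schema, and the draft card contents are invented.

```python
# Minimal model-card section list, loosely adapted from the Hugging Face
# template (an assumption for illustration, not the canonical schema).
MODEL_CARD_SECTIONS = [
    "Model Details",   # owner, version, architecture, license
    "Training Data",   # corpus sources and their provenance evidence
    "Evaluation",      # benchmarks, bias-audit results
    "Uses",            # intended and out-of-scope uses
    "Limitations",     # known failure modes and risk notes
    "Citation",        # how to cite the model
]

def missing_sections(card: dict) -> list[str]:
    # Flag data-room gaps: required sections absent or left empty
    return [s for s in MODEL_CARD_SECTIONS if not card.get(s)]

draft_card = {
    "Model Details": "resume-screener-v2, internal build",
    "Training Data": "",   # provenance undocumented -> diligence flag
    "Uses": "candidate triage",
}
print(missing_sections(draft_card))
```

A buyer-side reviewer can apply the same check to every material system in the data room; the output is the gap list that drives follow-up requests before signing.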
Sell-side teams should treat building this room before going to market as a value-creation play. When the target's AI inventory, governance documentation, and vendor-rep posture arrive pre-built, the buyer's discount-for-uncertainty narrows and the deal's risk allocation shifts toward the seller. The next section turns to what happens after close, when the AI workstream becomes an integration problem.
Post-Close: AI Integration as a Workstream
AI integration is governance integration, not systems integration. The substantive post-close moves are policy-stack consolidation, vendor contract reformation, and AI Council seat reallocation — not server cutover. Buyers who treat the first 90 days as a governance project capture diligence value; buyers who treat it as IT cleanup forfeit it.
The Day 1-90 playbook now carries deliverables distinct from legacy IT integration: AI inventory consolidation, duplicate-tooling rationalization, and high-risk vendor flagging. Each gap surfaced in diligence — shadow-AI usage, training-data IP ambiguity, ungoverned EU AI Act or GDPR exposure — gets a named owner and a closure plan. CloudEagle's 2026 guidance confirms that most acquirers otherwise discover these issues only after close, when leverage is gone.
The vendor contract reformation cycle propagates the acquirer's AI vendor template across the target's stack, folding the target's AI BOM into the acquirer's policy architecture. Promise Legal's AI Governance Playbook treats this as the conversion step that turns diligence findings into operational value.
For roll-up consolidators, the structural play is sharper. When the platform layer carries an AI-vendor template, AI Council charter, and NIST-aligned governance stack, every subsequent acquisition propagates onto that scaffold rather than starting fresh. Compliance-by-design becomes a platform asset and a multiple-expansion lever.
Documentation discipline after close converts diligence findings from sunk cost into evidence of good-faith integration — the record that survives a regulator inquiry or a downstream buyer's own diligence. In 2026, this workstream stops being optional.
Why This Workstream Becomes Standard in 2026
The cyber-DD precedent sets the clock. As Saul Ewing's analysis of cybersecurity diligence documents, cyber DD evolved from an IT-audit checkbox in the early 2010s into a compliance-framework workstream by the late 2010s. AI diligence is on the same arc, with a tighter clock — and three forcing functions converge in 2026.
First, the Texas Responsible AI Governance Act takes effect January 1, 2026, with the Texas Attorney General holding exclusive enforcement authority, a 60-day cure period, and civil penalties up to $200,000 per violation. Second, the EU AI Act's high-risk system obligations — risk management, data governance, technical documentation — apply beginning August 2, 2026. Third, the Bartz $1.5B settlement, paired with the Workday/Mobley preliminary collective certification, has transformed AI risk from theoretical to insurable — and therefore priceable in deal terms.
The compliance overlay deepens the pressure. Colorado's AI Act, delayed on August 28, 2025 by Governor Polis to a June 30, 2026 effective date, lands mid-year, and NYC Local Law 144 — already enforced by DCWP since July 5, 2023, with bias-audit requirements and per-day civil penalties — has been operationalized long enough to set buyer-side expectations.
Promise Legal's view: AI diligence becomes table stakes by Q4 2026 across mid-market deals. Below mid-market, the workstream remains optional, but it is increasingly priced into deal structure through escrow, indemnity caps, and lawful-corpus warranties. For buyers — and especially for roll-up consolidators inheriting AI BOMs across multiple targets — the work cannot wait until the first acquired-target liability surface materializes post-close.
AI diligence is faster and cheaper as a discrete workstream than as a post-close cleanup. Talk with our team about scoping the AI diligence workstream on your next deal or roll-up acquisition.