
TIPS #38: You Can’t Unbake a Cake

Shane Shook

April 28, 2026


Issue: Companies use AI models with limited validation controls, weak provenance, and insufficient forensic readiness.

Most enterprises treat AI models as reusable infrastructure. They download pretrained models from public registries, fine-tune them on internal data, and connect them to business systems. These models deliver speed and leverage but also carry inherited trust assumptions into systems that produce decisions at scale: decisions that are often impossible to reverse, difficult to explain, and increasingly likely to be the subject of regulatory and legal scrutiny.

From a security perspective, most enterprise AI models are third-party executables with training lineages, packaging risks, and often hard-to-explain behaviors. Notably, NIST classifies pretrained models, third-party datasets, and related software as part of the AI supply chain in its AI Risk Management Framework. Yet, most companies extend far less scrutiny to a downloaded AI model than they would to a new software dependency, and even less to the data, weights, and optimization choices that shaped the model.

You Can’t Unbake a Cake

Security teams have overlooked forensic quality in AI systems and created a liability. When a model begins producing harmful, biased, or manipulated output, most companies can’t reconstruct model lineage to determine which version was running, what data shaped it, or what changed during tuning. This leaves them with unmanaged risks.

Impact: Compromised or unvalidated models can embed malicious logic, leak data through inference channels, and degrade business decisions.

When provenance is weak, testing is shallow, or lineage is missing, model supply chain attacks become an operational, legal, and forensic reality. From a risk management perspective, this leaves companies exposed to a massive set of liabilities.

Regulatory holds, discovery production, and litigation reviews all require the ability to reconstruct what a system did, when, and on what basis. When companies can’t establish the chain of custody for an AI model, the evidentiary record is challengeable, potentially leading to financial and operational consequences that compound the original incident.

Upstream from these impacts are underlying model failures and compromises that are often invisible by design. Models may initially pass validation and produce outputs that seem successful until associated business outcomes start to raise red flags. The damage this creates is different from the impacts of ransomware or a broken server, manifesting in irreversible business decisions at scale.

In cases of deliberate model manipulation, the attack surface is the trust relationship organizations have failed to govern. The warning signs have been evident for nearly a decade, but forensic readiness has not adapted.

As the following case studies show, the attack surface has now expanded to include production AI infrastructure:

BadNets (2017)

In 2017, researchers at NYU published BadNets, the first research paper to define the machine learning model supply chain as a security problem. The research demonstrated that a neural network could be backdoored during training to perform normally on clean inputs while systematically misclassifying inputs containing an attacker-chosen trigger. Critically, the researchers showed that the backdoor survived transfer learning, meaning any downstream model built on top of the compromised base model would inherit its behavior.

BadNets is the prime example of the ‘can’t unbake a cake’ problem for AI models: a company that fine-tunes a backdoored AI model unknowingly embeds malicious behavior into a new model that it can’t decompose. While the research paper gained traction at the time, the controls it called for (provenance tracking, plus verification and inspection of neural networks) were not broadly adopted.

nullifAI vulnerability in Hugging Face (2025)

In February 2025, threat researchers at ReversingLabs discovered “nullifAI,” a novel technique to distribute malware on the open source AI model platform Hugging Face. The attack exploited a flaw in PickleScan, Hugging Face’s security scanner, by using intentionally broken Pickle files to evade detection. Any developer who downloaded the payload (the compromised model) to their machine would unknowingly introduce an executable supply chain threat: attacker-controlled code. Without lineage tracking, there would be no way to identify which downstream models inherited the compromise.
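To make the Pickle risk concrete, the following is a minimal sketch of static opcode inspection using Python's standard `pickletools` module. It is not how PickleScan works and is not a complete scanner; the `RISKY_OPCODES` set, the `scan_pickle` helper, and the `Payload` class are illustrative assumptions. The point it shows is that a scanner can keep the findings it gathered before a stream breaks, and treat the parse failure itself as a signal, rather than abandoning the file.

```python
import pickle
import pickletools

# Opcodes that import callables or invoke them during unpickling.
RISKY_OPCODES = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX"}


def scan_pickle(data: bytes):
    """Return (findings, parse_failed) without ever unpickling the data."""
    findings = []
    parse_failed = False
    try:
        for opcode, arg, pos in pickletools.genops(data):
            if opcode.name in RISKY_OPCODES:
                findings.append((pos, opcode.name))
    except Exception:
        # A stream that breaks midway is itself a signal; keep what was seen.
        parse_failed = True
    return findings, parse_failed


class Payload:
    """Stand-in for a malicious object; print is a harmless placeholder."""

    def __reduce__(self):
        return (print, ("code ran during unpickling",))


blob = pickle.dumps(Payload())  # serializing does not execute anything
broken = blob[:-1]              # drop the STOP opcode to mimic a broken file

findings, parse_failed = scan_pickle(broken)
print(findings, parse_failed)
```

Opcode inspection only flags the capability to run code; deciding whether a given import is legitimate still needs an allowlist or a safer serialization format.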

LiteLLM Supply Chain Attack Impacts Mercor (2026)

On March 24, 2026, threat actor group TeamPCP compromised LiteLLM, a Python package widely used by AI development teams to route calls across LLM providers. The compromised versions were available for less than five hours, but the damage was done. Mercor, a $10 billion AI startup providing training data to Anthropic, OpenAI, and Meta, confirmed it was among the thousands of affected organizations. TeamPCP has publicly stated its intention to partner with ransomware and extortion groups to pursue affected organizations at scale. The extortion group Lapsus$ has separately claimed four terabytes of stolen data from Mercor tied to the compromise, including AI training datasets, source code, and contractor records.

A third-party forensic investigation is currently underway. Impacted organizations are trying to reconstruct what the compromised LiteLLM versions touched during the five-hour window, including which training jobs relied on it, which resulting models are now in production, and whether those models carry downstream effects.
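The reconstruction question above (which jobs used a compromised dependency, and which models came out of those jobs) is a graph traversal once lineage has been recorded. The sketch below assumes lineage is stored as a simple mapping from each artifact to the artifacts built from it; every name in `LINEAGE` is hypothetical and does not describe any real organization's environment.

```python
from collections import deque

# Hypothetical lineage edges: artifact -> artifacts derived from it.
LINEAGE = {
    "pkg:compromised-release": ["train-job-17", "train-job-22"],
    "train-job-17": ["model:ranker-v3"],
    "train-job-22": ["model:summarizer-v1"],
    "model:ranker-v3": ["model:ranker-v3-finetuned"],
    "pkg:unaffected-release": ["train-job-30"],
}


def downstream_of(lineage, start):
    """Breadth-first walk returning every artifact derived from `start`."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(seen)


affected = downstream_of(LINEAGE, "pkg:compromised-release")
print(affected)
```

The traversal is trivial; the hard part is having recorded the edges before the incident, which is what the controls below are meant to ensure.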

Action: Treat AI models as high-risk executable assets and enforce provenance controls across the model lifecycle.

Software supply chain risk taught enterprises that they can’t trust what they can’t inventory. AI is teaching the next lesson: they can only investigate what they can prove. In other words, you can’t unbake a cake, but you can document the recipe.

Security teams must apply the core principles of supply chain security and cyber forensics to AI models via the Model Supply Chain Assurance (MSCA) framework:

1) Maintain an AI Bill of Materials (AI BOM)

You can only control your AI stack if you can identify where a model came from, which version is running, what format it arrived in, what data shaped it, what tuning was applied, and where it is deployed. An AI BOM, analogous to the SBOM, forms the foundation of MSCA by documenting models, dependencies, datasets, and configurations to ensure visibility and transparency.
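As a rough illustration, an AI BOM entry can be as simple as one structured record per model that answers each of the questions above. The field names and the `AIBOMEntry` class below are assumptions chosen for this sketch, not a standard schema, and all values are placeholders.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class AIBOMEntry:
    model_name: str
    version: str            # which version is running
    source: str             # where the model came from
    artifact_format: str    # what format it arrived in
    artifact_sha256: str    # digest of the exact file that was approved
    training_datasets: list = field(default_factory=list)  # what data shaped it
    tuning_steps: list = field(default_factory=list)       # what tuning was applied
    deployments: list = field(default_factory=list)        # where it is deployed

    def missing_fields(self):
        """List the fields that are still empty, i.e. questions nobody can answer yet."""
        return [name for name, value in asdict(self).items() if value in ("", [], None)]


entry = AIBOMEntry(
    model_name="example-classifier",
    version="1.2.0",
    source="internal-registry/example-classifier",
    artifact_format="safetensors",
    artifact_sha256="<digest recorded at intake>",
    training_datasets=["dataset:tickets-2024"],
    tuning_steps=["fine-tune:run-0042"],
)

print(entry.missing_fields())
print(json.dumps(asdict(entry), indent=2))
```

Keeping the record serializable (here as JSON) is what lets it be stored, searched, and diffed alongside other supply chain inventory.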

2) Validate training data provenance

Provenance validation must be treated as a distinct security control, not as an assumption bundled into model evaluation: a strong benchmark score doesn’t reveal if a model learned from trusted sources or poisoned ones. AI model provenance also needs to be preserved in a searchable manner to remain useful for auditing, lineage, and forensic purposes.
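One small, checkable piece of provenance validation is confirming that the dataset files used for training are byte-for-byte the ones that were reviewed and approved. The sketch below assumes a hypothetical manifest that maps relative file paths to SHA-256 digests recorded at approval time; `verify_manifest` and its status strings are illustrative names.

```python
import hashlib
from pathlib import Path


def sha256_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large datasets do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(root, manifest):
    """Compare files under `root` to expected digests; return path -> status."""
    results = {}
    for rel_path, expected in manifest.items():
        target = Path(root) / rel_path
        if not target.is_file():
            results[rel_path] = "missing"
        elif sha256_file(target) != expected:
            results[rel_path] = "modified"
        else:
            results[rel_path] = "ok"
    return results
```

A matching digest shows the data has not changed since approval. It does not show the original source was trustworthy, which is why the approval step itself needs its own record.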

Maisa helps companies build, manage, and govern AI with visibility, security, and enterprise-scale control. Its unified platform embeds traceability and auditability at the architectural level to enable lineage and reconstruction capabilities.

3) Perform adversarial testing before production

It’s critical to stress-test models and AI-enabled applications against real-world adversary techniques that exploit vulnerabilities, backdoors, and leakage paths.
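One simple form of backdoor testing is to measure how often a candidate trigger flips a model's prediction on otherwise clean inputs. The sketch below uses toy stand-ins: `toy_backdoored_model`, `toy_clean_model`, and the corner-patch trigger are hypothetical and only loosely modeled on the BadNets idea. Real testing has to search over many candidate triggers, since the defender does not know the attacker's choice.

```python
def stamp_trigger(image, value=1.0, size=2):
    """Return a copy of a 2D image with a small patch set in the bottom-right corner."""
    stamped = [row[:] for row in image]
    for r in range(-size, 0):
        for c in range(-size, 0):
            stamped[r][c] = value
    return stamped


def flip_rate(predict, images):
    """Fraction of inputs whose prediction changes once the trigger is stamped on."""
    flips = sum(1 for img in images if predict(img) != predict(stamp_trigger(img)))
    return flips / len(images)


def toy_backdoored_model(image):
    # Hypothetical classifier: behaves normally unless the corner patch is present.
    if all(image[r][c] == 1.0 for r in (-2, -1) for c in (-2, -1)):
        return "speed_limit"
    return "stop"


def toy_clean_model(image):
    # Hypothetical classifier with no hidden behavior.
    return "stop"


clean_images = [[[0.0] * 4 for _ in range(4)] for _ in range(5)]

print(flip_rate(toy_backdoored_model, clean_images))
print(flip_rate(toy_clean_model, clean_images))
```

A flip rate near zero for a given trigger is weak evidence at best; a high flip rate from a tiny, semantically meaningless patch is a strong reason to quarantine the model.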

Bishop Fox helps companies perform adversarial testing and exposure management for AI systems and applications, identifying weaknesses in model behavior, unsafe loading conditions, and dependency integrity failures before they are exploited.

“AI systems behave in ways that scanning tools simply can't anticipate. The vulnerabilities that matter most are the ones that emerge from how the model was trained, what it was built on, and how it behaves in real-world conditions.”

Vinnie Liu, CEO and Co-founder, Bishop Fox

4) Enforce AI runtime policies and observability

AI model governance doesn’t end at deployment. Runtime behavior is where the impacts of compromised or poisoned models show up, often in the form of behavioral drift, policy violations, and anomalous system behavior following a compromised dependency. This is particularly important in the age of agentic AI, where an agent can pass pre-deployment checks and still be manipulated at runtime through prompt injection, tool abuse, or unanticipated agent-to-agent interactions.
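A runtime policy can be pictured as a gate that every agent action passes through before it executes, with each decision written to an audit log. The following is a deliberately small sketch under stated assumptions: the `ToolCall` shape, the allowlist, and the blocked substrings are hypothetical, and it does not represent how any product named in this post works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCall:
    agent: str
    tool: str
    argument: str


# Hypothetical policy: which tools each agent may use, and argument patterns to refuse.
ALLOWED_TOOLS = {"support-agent": {"search_kb", "create_ticket"}}
BLOCKED_SUBSTRINGS = ("rm -rf", "drop table")


def evaluate(call, audit_log):
    """Decide allow/deny for one tool call and record the decision."""
    if call.tool not in ALLOWED_TOOLS.get(call.agent, set()):
        decision = "deny: tool not allowlisted"
    elif any(pattern in call.argument.lower() for pattern in BLOCKED_SUBSTRINGS):
        decision = "deny: blocked argument pattern"
    else:
        decision = "allow"
    audit_log.append((call.agent, call.tool, decision))
    return decision


audit_log = []
print(evaluate(ToolCall("support-agent", "search_kb", "reset password"), audit_log))
print(evaluate(ToolCall("support-agent", "run_shell", "ls"), audit_log))
print(evaluate(ToolCall("support-agent", "create_ticket", "please DROP TABLE users"), audit_log))
```

Substring matching is easy to evade and is shown only to keep the example short; the design point is that the decision is made and logged outside the agent itself.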

Capsule Security secures AI agents at runtime by continuously monitoring agents to detect anomalous behaviors. The Capsule platform protects every agent action, preventing risky commands, unsafe tool usage, and data exposure.

“AI agents are becoming a new class of privileged user in the enterprise, a user that can act at machine speed and does not behave like deterministic software. Trust has to be enforced at runtime so teams can keep up with agents and stay in control of what they access and execute.”

Noar Paz, CEO and Co-Founder, Capsule Security

5) Preserve evidence for reconstruction

Forensic chain of custody is essential to produce defensible records for regulators and legal discovery. Practically speaking, evidence including hashes, lineage records, evaluation histories, approval records, deployment timestamps, and model change logs must survive long enough to support a real investigation.
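One common way to make such records tamper-evident is to chain them by hash, so that altering any earlier entry invalidates every later one. The sketch below is a minimal illustration of that idea; `append_record`, `verify_chain`, and the record layout are assumptions for this example, and the event values are placeholders.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first record's predecessor


def _record_hash(prev_hash, payload):
    body = json.dumps(payload, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_record(log, payload):
    """Append a payload, binding it to the hash of the previous record."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"prev": prev_hash, "payload": payload, "hash": _record_hash(prev_hash, payload)})


def verify_chain(log):
    """Recompute every hash in order; any edited or reordered record fails."""
    prev_hash = GENESIS
    for record in log:
        if record["prev"] != prev_hash or record["hash"] != _record_hash(prev_hash, record["payload"]):
            return False
        prev_hash = record["hash"]
    return True


evidence = []
append_record(evidence, {"event": "model approved", "model": "example-classifier", "version": "1.2.0"})
append_record(evidence, {"event": "model deployed", "target": "prod-cluster-a"})
print(verify_chain(evidence))
```

A hash chain detects edits but not wholesale deletion, so in practice the latest hash would also need to be stored somewhere the operators of the log cannot rewrite.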

WitnessAI ensures the evidentiary record is preserved, providing enterprise-wide observability and controls over AI usage including governance records, policy enforcement decisions, and audit trails.

“AI workflows are maturing and starting to cross corporate LLMs, cloud models, and agents. Without unified visibility across those interactions, enterprises can't govern AI or create a defensible audit trail.”

Rick Caccia, CEO and Co-founder, WitnessAI

Featured in this edition: Bishop Fox, Capsule Security, Maisa, ReversingLabs, & WitnessAI