TIPS #31: AI Guardrails and Data Quality

Shane Shook

September 23, 2025

  • Blog Post
  • TIPS

Issue: AI has become commonplace across the enterprise, but companies often don’t have the right guardrails or data governance in place to manage its use.

AI adoption is rapidly accelerating across businesses. The number of organizations reporting AI use rose from 55% in 2023 to 78% in 2024, according to Stanford University’s Human-Centered Artificial Intelligence (HAI) institute.

While AI technology itself isn’t new in business (just ask any data scientist), accessibility across the enterprise is. Today, anyone in a company can adopt AI tools, often without the necessary expertise or guardrails. Developers are copy-pasting ChatGPT output into code, IT teams are customizing LLMs for critical internal workflows, and customer-facing business units are using AI to automate customer service and helpdesk functions.

For CISOs, CIOs, and CTOs, the question is no longer whether their organization is using AI. The question has become: can our business rely on its AI-driven tools, and are those tools being governed effectively?

The answers often depend less on the underlying AI models and more on the controls around them and who is using them. Two challenges dominate when it comes to AI in the enterprise: lay users adopting and misusing (or misinterpreting) AI tools, and enterprise teams building AI systems without disciplined data governance. Both of these scenarios introduce risk.

User AI adoption, misuse, and misinterpretation

The democratization of AI means developers, analysts, and business staff are adopting LLMs like ChatGPT and API-based AI tools without fully understanding the risks. AI models are statistical and retrieval is approximate, so generative outputs can sound authoritative even when they are inaccurate or misleading. Many users over-trust outputs, copy-paste errors into their work, and apply AI in the wrong context. Left unchecked, these risks scale across the enterprise.

“AI can auto-triage alerts, generate risk assessments, and suggest remediations, but it is only as smart as the data it sees and the way your company uses its outputs.”

Jimmy Mesta Co-founder and CTO, RAD Security

Data quality and governance issues in enterprise AI

Organizations and teams building or integrating their own AI have another critical issue to consider: data discipline and data quality. As mentioned above, AI training methods like machine learning and reinforcement learning are statistical at their core, meaning that reproducibility and auditability come from the data sources that feed AI models, not the models themselves.

Data sources and any structured processes that transform or move data (like data pipelines) are the anchors that make statistical methods yield consistent, traceable results. Any team building its own AI system needs to think carefully about how data and the AI model interact throughout the process, from training through testing and production. When inference data diverges from training and validation data, failures can be silent, an all-too-common occurrence.
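The silent-divergence problem above can be caught with a simple statistical check that compares a live feature distribution against its training baseline. The sketch below is illustrative, not a production monitor: the feature values, the three-sigma threshold, and the `drift_check` helper are all assumptions introduced here for the example.

```python
import statistics

def drift_check(train_values, live_values, z_threshold=3.0):
    """Flag a feature whose live (inference) mean has drifted far from the
    training mean, measured in training standard deviations (a z-score)."""
    mu = statistics.mean(train_values)
    sigma = statistics.stdev(train_values)
    live_mu = statistics.mean(live_values)
    z = abs(live_mu - mu) / sigma if sigma else float("inf")
    return {"train_mean": mu, "live_mean": live_mu, "z_score": z,
            "drifted": z > z_threshold}

# Hypothetical feature: training data centered near 10, live traffic near 25.
report = drift_check([9.8, 10.1, 10.0, 9.9, 10.2], [24.9, 25.3, 25.1])
```

A real deployment would track many features, use distribution-level tests rather than means alone, and route `drifted` flags into alerting, but even this minimal check turns a silent failure into a visible one.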

Without normalized features, completeness checks, and audited data pipeline and Extract, Transform, Load (ETL) processes, AI systems can produce inaccurate outputs. Vector search via approximate nearest neighbor (ANN) methods is efficient but approximate in recall; generative models add context-window truncation and hallucination. A clean pipeline cannot remove these architectural limits, but it ensures inputs are consistent, validated, and auditable.
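A completeness check of the kind described above can be as simple as validating each pipeline record against a required schema before it reaches the model. The schema fields and sample records below are hypothetical, chosen to mirror security telemetry:

```python
# Hypothetical required schema for a telemetry record.
REQUIRED_FIELDS = {"timestamp", "source", "event_type"}

def validate_batch(records):
    """Run basic completeness checks on a batch of pipeline records,
    returning the completeness ratio and the indices of bad records."""
    bad = [
        i for i, r in enumerate(records)
        if not REQUIRED_FIELDS.issubset(r)
        or any(r[f] in (None, "") for f in REQUIRED_FIELDS)
    ]
    completeness = 1 - len(bad) / len(records) if records else 0.0
    return completeness, bad

batch = [
    {"timestamp": "2025-09-23T10:00:00Z", "source": "fw-1", "event_type": "deny"},
    {"timestamp": "", "source": "fw-2", "event_type": "allow"},   # empty field
    {"timestamp": "2025-09-23T10:00:02Z", "source": "ids-1"},     # missing field
]
ratio, bad_rows = validate_batch(batch)
```

Logging `ratio` per batch gives an auditable completeness metric; quarantining `bad_rows` instead of silently dropping them preserves the lineage auditors need.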

“AI frees SOC analysts from repetitive work, but only if the data foundation is solid.”

Ahmed Achchak Co-founder and CEO, Qevlar AI

Impact: Over-relying on poorly governed AI introduces a heightened risk of breaches, operational disruptions, and regulatory violations.

Without user guardrails and data quality controls, “garbage in” becomes “garbage out.” At the intersection of AI and security, that can mean breaches, compliance failures, and operational disruptions.

The following case studies show the real-world impacts of insufficient user guardrails and poor AI data governance:

ChatGPT-Hallucinated Case Law

In 2023, two lawyers at a New York-based law firm filed briefs citing non-existent cases generated by ChatGPT, eventually facing sanctions and a $5,000 fine. This highlights the risks of users over-relying on and under-verifying AI outputs when guardrails are not present.

Software Supply Chain Compromises Spread via AI-Generated Code

Attackers have repeatedly compromised open-source repositories such as NPM and GitHub, inserting malicious code into widely used packages. AI code assistants can accelerate dependency imports, increasing the chance that compromised packages slip into production unless guardrails monitor and block unvetted code.

Developers, aided by AI code suggestions, have unknowingly embedded these compromised dependencies into production software in numerous cases. AI coding assistants may even generate hallucinated packages: references to non-existent packages whose names attackers can register to deliver malicious code. These are just some of the risks when there aren’t proper guardrails around AI-assisted coding; for a full list, including supply chain vulnerabilities and insecure output handling, review OWASP’s Top 10 for LLMs.

These incidents underscore the need for oversight tools to monitor dependency use and block unvetted code adoption.
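One such oversight control is a vetting gate that checks AI-suggested dependencies against an approved list before they can enter a build. The sketch below uses a hard-coded allowlist purely for illustration; the `VETTED` set, the package names, and the `review_dependencies` helper are all assumptions, and a real guardrail would query an internal registry or artifact proxy instead.

```python
# Hypothetical vetted-package allowlist; a real guardrail would query an
# internal registry or artifact proxy rather than a hard-coded set.
VETTED = {"requests", "numpy", "cryptography"}

def review_dependencies(proposed):
    """Split AI-suggested dependencies into approved and blocked lists,
    so unvetted (possibly hallucinated) packages never reach a build."""
    approved = [p for p in proposed if p.lower() in VETTED]
    blocked = [p for p in proposed if p.lower() not in VETTED]
    return approved, blocked

# "reqeusts-pro" stands in for a typosquatted or hallucinated package name.
approved, blocked = review_dependencies(["requests", "reqeusts-pro", "numpy"])
```

Wiring a check like this into CI means a blocked package fails the build and triggers human review, rather than silently shipping.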

Healthcare AI Diagnostic Drift

In the past few years, a number of FDA-authorized diagnostic AI tools have been recalled after diagnostic and measurement errors or delayed and lost functionality. In many cases, the devices had limited or no clinical or real-world validation; put another way, inference (real-world) data distributions diverged significantly from training data, leading to inaccuracies. This is an example of how AI can fail when there is poor data quality and governance oversight.

 

“If you don’t control what’s going into your modeling, you won’t control what’s coming out. At the scale of modern training datasets, there’s a lot of room for unpleasant surprises.”

Kathryn Shih Venture Partner, Forgepoint Capital

Action: Implement guardrails to prevent users from misusing AI and develop data quality discipline to ensure enterprise AI produces reliable, auditable, and business-aligned outcomes.

1) Guardrails to constrain risky AI use

CISOs, CIOs, and CTOs need to set policies, enforce metrics, and demand transparency to reduce risk. When it comes to AI adoption and use in the business, the essential governance question is: which safeguards do our current (or potential) vendors provide to prevent untrained staff from unintentionally introducing risk? Vendor evaluations should prioritize safety features and observability functions (e.g., auditable logging and runtime oversight), not just flashy AI capabilities.

WitnessAI delivers runtime protections and oversight, so that non-specialists can’t act on unverified AI outputs without controls.

Nudge Security gives organizations visibility into shadow AI and SaaS adoption, enabling CISOs to manage exposure before it becomes systemic.

RAD Security uses AI to monitor for cloud misconfigurations and risky user actions, providing a backstop against inadvertent errors.

2) Prioritize data quality discipline

Security and technology leaders also must enforce data quality discipline across all data sources used in AI training, validation, and production environments, including data transformation and movement technologies like data pipelines.

Qevlar AI automates SOC workflows anchored in reliable data foundations, reducing the risk of silent model failures.

Bishop Fox integrates human-led pen testing with AI tools to validate exploitability and ensure that AI outputs map to real-world risk.

When present, data pipelines must anchor enterprise AI builds through lineage, validation, and reproducibility. Databahn.ai builds deterministic, auditable pipelines that ensure telemetry and data are complete, validated, and reproducible.

Example Template: AI Governance Questions and Measurable SLOs

The following template contains example questions and measurable Service Level Objectives (SLOs) designed to help security and technology leaders evaluate potential vendors and solutions:

  • How do you validate pipeline completeness (are all logs and flows ingested without gaps)?
    • Pipeline completeness: ≥ 99% of telemetry ingested and validated.
  • Can you provide audited lineage of how raw data transforms before the model sees it?
    • Audited feature lineage: 100% of critical features tracked through ETL.
  • What recall rates are achieved on benchmarked incidents, and how are they measured?
    • Retrieval recall@10: ≥ 95% of relevant records retrieved in benchmarks.
  • How are hallucination or false context insertions tracked and reported?
    • Hallucination rate: ≤ 2% in structured security evaluations.
  • What is your mean time to detect and remediate a bad inference?
    • Time-to-correct: Median < 30 minutes from detection to remediation.
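The retrieval and completeness SLOs in the template above are straightforward to measure in code. The sketch below shows the standard recall@k calculation plus a generic threshold check; the benchmark record IDs and the `meets_slo` helper are hypothetical, and real evaluations would average recall over many benchmark queries.

```python
def recall_at_k(retrieved, relevant, k=10):
    """Fraction of the relevant records that appear in the top-k retrieved."""
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant) if relevant else 1.0

def meets_slo(value, threshold, higher_is_better=True):
    """Compare a measured metric against its SLO threshold."""
    return value >= threshold if higher_is_better else value <= threshold

# Hypothetical single benchmark query: 4 relevant records, 5 retrieved.
retrieved = ["r1", "r4", "r2", "r9", "r3"]
relevant = ["r1", "r2", "r3", "r4"]
recall = recall_at_k(retrieved, relevant, k=10)
ok = meets_slo(recall, 0.95)  # checks against the >= 95% recall@10 SLO
```

The same `meets_slo` pattern applies to the other objectives, such as completeness (>= 0.99, higher is better) and hallucination rate (<= 0.02, lower is better).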

Additional SLO and SLA Considerations

CISOs, CIOs, and CTOs need to weigh accuracy, latency, and cost whenever AI systems are deployed. In every case, the risk is not whether the system runs deterministically, but whether its outputs align with business intent under governed conditions. The following table is designed as a quick risk reference guide for technology and security leaders deploying AI.

*Note: RAG is an architectural pattern that can use SQL, keyword, or vector retrieval; the retrieval choice determines both accuracy and governance burden.