How a Banking Roll-Up Surfaced 75 Shadow AI Cases

A banking-software platform rolled up from seven companies couldn't say what AI was live or where data flowed. An audit surfaced 75 shadow AI cases and 20 unauthorized MCP paths, and consolidated tool sprawl to 12 sanctioned tools.

Shadow AI cases surfaced in audit: 75

Implementation Time: 4–8 weeks

Unmukt Raizada

Founder & CEO

TrustEvals

The Challenge

A PE-backed banking-software platform had been rolled up from seven companies into one business serving thousands of US financial institutions. Leadership had done the visible integration — one set of policies, one security stack.

But AI was running across three clouds, several engineering teams, and contractors who each held their own access, and no one could say what was live, who was using what, or where customer data was going.

On the engineering side, developers were shipping AI-generated code (Claude Code, Cursor) with no consistent review or security path. Non-engineering teams were using unauthorized MCPs (Model Context Protocol servers) and a sprawl of vendors that each solved only part of a problem, on plans that did not match actual usage. And the AI capabilities already shipping inside the product were weakly governed, with only limited eval harnesses behind them.

What They Built

The engagement was an AI audit across the PE-backed banking roll-up that anchored four workstreams: the audit itself, securing how engineering builds with AI, reviewing the AI already in the product, and the governance framework to hold it together. It all runs on the TrustEvals evals platform (Python SDK, ingest gateway, eval engine, dashboard) deployed in the client's AWS environment.

AI Audit: finding every place AI was running across the seven merged companies — sanctioned tools, shadow tools on personal accounts, and MCPs — producing one unified, audit-committee-grade inventory.

Securing how engineering builds with AI: an AI code-review framework, CI/CD pipeline security, sandboxing and isolation, automated flagging plus human review, developer security training, and a vetted tool list.

Reviewing the AI in the product: architecture and multi-cloud data-flow review, prompt-security and output-validation tests, a model-selection and monitoring audit, data-privacy checks, and an AI incident-response playbook — with real evals run against the live pipeline.

The governance framework: an ISO 42001-aligned policy suite, an AI evaluation framework with use-case-specific thresholds, model-drift monitoring, a risk-assessment methodology, a consolidation plan onto a sanctioned toolset, and board-level reporting.
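To make "use-case-specific thresholds" concrete, here is a minimal sketch of a threshold gate. It is an illustration only, not the TrustEvals SDK: the use-case names, metric names, threshold values and the `failing_metrics` helper are all hypothetical, since the case study does not publish the actual thresholds.

```python
from dataclasses import dataclass

# Hypothetical use cases and minimum pass rates (placeholders, not real values).
THRESHOLDS = {
    "customer_support_summary": {"prompt_security": 0.99, "output_validation": 0.95},
    "internal_code_assist": {"prompt_security": 0.95, "output_validation": 0.90},
}


@dataclass(frozen=True)
class EvalResult:
    use_case: str
    metric: str
    pass_rate: float  # fraction of eval cases passed, 0.0 to 1.0


def failing_metrics(results):
    """Return (use_case, metric) pairs whose pass rate is below the threshold."""
    failures = []
    for r in results:
        required = THRESHOLDS[r.use_case][r.metric]
        if r.pass_rate < required:
            failures.append((r.use_case, r.metric))
    return failures


results = [
    EvalResult("customer_support_summary", "prompt_security", 0.995),
    EvalResult("customer_support_summary", "output_validation", 0.93),
    EvalResult("internal_code_assist", "output_validation", 0.92),
]
print(failing_metrics(results))
```

The point of keying thresholds by use case is that a customer-facing feature can be held to a stricter bar than an internal assistant, while both report through the same gate.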

TrustEvals deployed its eval platform inside the client's AWS environment and ran the four workstreams in order, each building on the last.

Discovery came first: external sensors (SentinelOne, Snyk, Cisco MCP Scanner, OAuth exports, git-history scans and attestation surveys) fed a classification engine that found every place AI was running across the seven merged companies, including sanctioned tools, shadow tools on personal accounts, and MCPs, and produced one audit-committee-grade inventory.

Next, they hardened how engineering shipped AI: an AI code-review framework, CI/CD pipeline security, sandboxing, automated flagging with human review, and a vetted tool list.

Third, they reviewed the AI already in the product, running prompt-security, output-validation and drift evals against the live AWS Bedrock pipeline before bank customers relied on it.

Finally, they stood up the governance framework: an ISO 42001-aligned policy suite, use-case-specific eval thresholds, model-drift monitoring, a risk methodology, and board-level reporting.

The team scoped the work explicitly as a point-in-time audit feeding a continuous governance program. It was delivered in 4–8 weeks.
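As a rough illustration of what a drift eval can check, the sketch below compares recent eval scores against a baseline and flags a drop beyond a tolerance. This is an assumed, simplified method (a mean-shift check); the case study does not describe how drift was actually measured, and the `drifted` function, the `max_drop` parameter and the sample scores are hypothetical.

```python
from statistics import mean


def drifted(baseline_scores, recent_scores, max_drop=0.05):
    """Flag drift when the mean recent score is more than max_drop below baseline."""
    if not baseline_scores or not recent_scores:
        raise ValueError("both score lists must be non-empty")
    return mean(baseline_scores) - mean(recent_scores) > max_drop


# Illustrative eval scores per run (made-up numbers).
baseline = [0.96, 0.95, 0.97, 0.96]
stable = [0.95, 0.96, 0.94, 0.95]
degraded = [0.90, 0.88, 0.89, 0.91]

print(drifted(baseline, stable))
print(drifted(baseline, degraded))
```

A production monitor would more likely use a statistical test and a rolling window, but the shape is the same: a stored baseline, a recent sample, and an explicit tolerance that governance can review.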

Infrastructure

Client AWS environment (production AI on AWS Bedrock) • TrustEvals platform (Python SDK, ingest gateway, eval engine, dashboard) • Okta / Entra ID + AWS IAM identity • GitLab CI/CD

Integration Points

External sensors (SentinelOne Deep Visibility, Snyk, Cisco MCP Scanner, OAuth exports, git-history scan, attestation survey) feed the discovery classifier; eval engine runs against the live AWS Bedrock pipeline; dev-security hooks into GitLab CI/CD, Endor Labs and JFrog.
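The discovery step described above (many sensors feeding one classifier that produces a single inventory) can be sketched as follows. This is a hedged illustration under stated assumptions, not the TrustEvals classification engine: the `Finding` fields, the allow-lists, the tool names and the classification rules are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical allow-lists; real ones would come from the AI policy.
SANCTIONED_TOOLS = {"approved-assistant"}
APPROVED_MCPS = {"internal-docs-mcp"}


@dataclass(frozen=True)
class Finding:
    source: str   # sensor that reported it, e.g. "oauth_export"
    entity: str   # which merged company
    name: str     # tool or MCP server name
    kind: str     # "tool" or "mcp"
    account: str  # "corporate" or "personal"


def classify(f):
    """Assumed rules: unapproved MCPs and off-policy tool use are flagged."""
    if f.kind == "mcp":
        return "sanctioned" if f.name in APPROVED_MCPS else "unauthorized_mcp"
    if f.name in SANCTIONED_TOOLS and f.account == "corporate":
        return "sanctioned"
    return "shadow"


def build_inventory(findings):
    """Deduplicate findings reported by several sensors, then count by class."""
    unique = {(f.entity, f.name, f.kind, f.account): f for f in findings}
    return Counter(classify(f) for f in unique.values())


findings = [
    Finding("oauth_export", "entity-1", "approved-assistant", "tool", "corporate"),
    Finding("git_history", "entity-1", "approved-assistant", "tool", "corporate"),
    Finding("endpoint_sensor", "entity-2", "approved-assistant", "tool", "personal"),
    Finding("mcp_scanner", "entity-3", "unknown-mcp", "mcp", "corporate"),
]
inventory = build_inventory(findings)
print(dict(inventory))
```

Deduplicating before counting matters here because the same usage can show up in more than one sensor (for example, both an OAuth export and a git-history scan), and a unified inventory should count it once.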

Impact

75 Shadow AI Cases Found

The first unified count the company could present as audit evidence.

20 Unauthorized MCP Paths

Exposed across multiple entities, previously invisible.

Consolidated to 12 Tools

AI sprawl reduced to 12 sanctioned tools under the updated AI policy.

Best Fit For

Any multi-entity group — especially a PE-backed roll-up where AI has spread across merged companies, multiple clouds, contractors and a sprawl of part-solving vendors faster than a single view could keep up.

It also fits software companies whose own product ships AI to regulated customers and has to prove that AI is governed and evaluated. The conditions that make it work: several entities under one roof, more than one identity or cloud environment, real spend leaking into overlapping tools, and leadership that wants an audit-grade read of both internal AI use and the AI in the product.

Unmukt Raizada

Founder & CEO of TrustEvals. Builds AI evaluation and governance infrastructure for finance, real estate and regulated software — eval harnesses, semantic data dictionaries, and AI audits.

Industry: Financial Services

Business Function: Product & Engineering

Company Size: 251-1,000

Project Cost: $25K – $100K

Ownership: Private Equity-Backed

Organization Type: Private Company

AI Pattern: AI-Accelerated Custom Software, AI Workforce Enablement

Value Type: Risk & Compliance

AI Model: Custom / proprietary

Frequently Asked Questions

How did a mid-market financial services roll-up uncover its shadow AI use?

The experts deployed an evaluation platform inside the company's AWS environment and ran a discovery workstream that fed external sensors into a classification engine, finding every place AI was running across the seven merged companies — sanctioned tools, shadow tools on personal accounts, and MCPs. This produced the company's first unified, audit-committee-grade inventory of AI use, surfacing 75 shadow AI cases and 20 unauthorized MCP paths.

What AI tools and approach powered the AI audit?

The work combined AI-accelerated custom software with AI workforce enablement, run on a proprietary evals platform (Python SDK, ingest gateway, eval engine, dashboard) deployed on the client's AWS instance. External sensors — SentinelOne, Snyk, Cisco MCP Scanner, OAuth exports, git-history scans, and attestation surveys — fed discovery classification, while the eval engine ran prompt-security, output-validation, and drift evals against the live AWS Bedrock pipeline.

What results did the financial services roll-up achieve?

The audit found 75 shadow AI cases across the seven merged companies, exposed 20 unauthorized MCP paths, and consolidated tool sprawl down to 12 sanctioned tools under an updated AI policy.

How long did the AI audit take?

It was delivered in four to eight weeks, scoped explicitly as a point-in-time audit feeding a continuous governance program.

Who is this AI audit approach best for?

Any multi-entity group — especially a PE-backed roll-up where AI has spread across merged companies, multiple clouds, and overlapping vendors — and software companies whose own product ships AI to regulated customers and must prove it is governed and evaluated.