What the FCA Supercharged Sandbox revealed about AI agent governance in practice

For three months between October 2025 and January 2026, a small group of UK financial services firms had something most of the industry has been asking for: a regulator in the room while they built. Before launch. Before a Section 166 review. While they were still figuring out whether the thing worked.

That’s the Supercharged Sandbox. The Financial Conduct Authority’s experiment in being a sparring partner during development. Aveni was in the inaugural cohort. We took Agent Assure, our AI agent governance product, through every stage of it. What follows is a plain-English account of what the Sandbox actually is, what the FCA was looking for, and what the work revealed about governing AI agents inside a regulated firm.

One thing became obvious within the first month. Most firms scoping their agentic AI deployments are asking the wrong question first.

What the FCA Supercharged Sandbox actually is

The FCA launched the Supercharged Sandbox in June 2025 in partnership with NVIDIA. It builds on the existing Digital Sandbox infrastructure run by NayaOne, with the kit cranked up: NVIDIA’s Accelerated Computing platform, the NVIDIA AI Enterprise software suite, synthetic financial services datasets, and structured engagement with the FCA’s AI Lab throughout.

The first cohort ran from October 2025 to January 2026. The showcase event took place at Olympia London on 28 and 29 January 2026, in front of industry leaders, technology partners, and regulators. Applications for a second cohort are open until 1 June 2026, and the FCA has been explicit about what it wants to see this time: agentic AI use cases, including compliance agents, customer interaction agents, and agentic payment services.

The FCA’s AI Lab also runs a second programme: AI Live Testing. That’s the route for firms further along, ready to move from proof of concept to real-world deployment under intensified supervision. The Sandbox is where firms work out whether the idea holds together. AI Live Testing is where they find out whether it holds up.

The existing rulebook still applies. The FCA has been clear it isn’t writing a new one. Consumer Duty, SMCR, operational resilience, all of it carries over. The Sandbox is the regulator’s way of helping firms work out what compliance looks like when the technology changes shape underneath them.

The four questions the Sandbox actually tests

Strip away the framing and a regulated firm deploying an AI agent has to answer four questions before launch, then keep answering them every single day after. We set out the full version in the AI on Trial: The Burden of Proof series. The short list:

  • How do we know this agent is behaving inside our risk appetite?
  • How do we step in when it isn’t?
  • How do we evidence Consumer Duty obligations across every single interaction?
  • How do we hold the agent to the same standard we hold a human adviser to?

The Supercharged Sandbox doesn’t give you regulatory clearance. It gives you something arguably more useful: a place to find out, in private, whether your answers to those questions survive contact with a regulator who knows what they’re looking at.

What Agent Assure tested, and what came out of it

Aveni used the Sandbox to pilot the core components of Agent Assure: a first-line protection guidance agent grounded in real client use cases, and a two-tier assurance capability combining pre-deployment validation with real-time post-deployment monitoring. The whole thing is built around what we call unified assurance: one compliance standard applied across human-led and AI-led customer interactions. From the FCA’s perspective, the channel is beside the point. The outcome is what matters.

Three things came out of the work that should change how firms scope their own agentic deployments.

Sampling falls apart at agentic scale. Two decades of QA practice in regulated firms have been built on a 2-3% sample. At agentic scale, where an agent is doing the work of a thousand advisers, it doesn’t hold. The Bank of England’s February 2026 AI roundtables reached the same view: traditional model risk management validation will become unsustainable as generative AI and agentic systems proliferate. We made the same argument from the marketing side in Count III of the AI on Trial series. The Sandbox confirmed it from the inside.
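One rough way to see the sampling problem is to ask how likely a 3% sample is to miss a failure pattern entirely. The sketch below assumes each interaction is picked for review independently at a fixed rate, which is a simplification. The function name and the figures are illustrative and do not come from the Sandbox work.

```python
def miss_probability(sample_rate: float, affected: int) -> float:
    """Chance that a random sample contains none of the affected interactions.

    Assumes every interaction is sampled independently with probability
    `sample_rate` (an illustrative simplification, not a QA standard).
    """
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0 and 1")
    if affected < 0:
        raise ValueError("affected must be non-negative")
    return (1.0 - sample_rate) ** affected


# Hypothetical figures: a 3% sample and a failure touching 1, 10 or 50 interactions.
for affected in (1, 10, 50):
    p = miss_probability(0.03, affected)
    print(f"{affected:>3} affected interactions: {p:.1%} chance none are sampled")
```

Under these assumptions, a failure that touches only a handful of interactions is more likely than not to go unreviewed, and an agent handling a thousand advisers’ workload can repeat it many times before the sample catches up.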

Independent assessor models hold up better than self-marking. Asking a primary language model to assess its own outputs is a bit like asking a sixth-former to grade their own paper. In the Sandbox, Aveni piloted an approach using small language models as independent assessors of primary model behaviour, drawing on our FinLLM suite of UK finance-specific models. The assessor sits outside the agent it’s governing. That separation is what the second line of defence is going to need, eventually, in writing.
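The separation described above can be sketched in a few lines. This is a minimal illustration of the pattern only, and it makes no claim about how Agent Assure or FinLLM work internally. Every name here (`Assessment`, `governed_reply`, the toy models) is hypothetical, and the toy functions stand in for real models.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Assessment:
    compliant: bool
    reason: str


# Hypothetical interfaces: the primary model writes the reply,
# and a separate assessor judges it. Neither calls the other.
PrimaryModel = Callable[[str], str]
Assessor = Callable[[str, str], Assessment]


def governed_reply(prompt: str, primary: PrimaryModel,
                   assessor: Assessor) -> Tuple[str, Assessment]:
    """Generate a reply, then have an independent assessor judge it."""
    reply = primary(prompt)
    verdict = assessor(prompt, reply)
    return reply, verdict


def toy_primary(prompt: str) -> str:
    # Stand-in for the agent's primary language model.
    return "You should definitely buy this policy."


def toy_assessor(prompt: str, reply: str) -> Assessment:
    # Stand-in for a small assessor model; a keyword rule, for illustration only.
    if "definitely" in reply.lower():
        return Assessment(False, "reply makes an unqualified recommendation")
    return Assessment(True, "no issue found by the toy rule")


reply, verdict = governed_reply("Is this policy right for me?", toy_primary, toy_assessor)
print(verdict)
```

The design point is that the assessor is passed in as its own component, so it can be owned, versioned, and challenged separately from the agent it judges.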

Evidence packs have to be auditable. Most AI vendor pitches in 2026 still end with a demo and a slide deck. A regulated firm signing off on an AI agent deployment needs more than that: simulated interactions turned into structured risk reports that a Chief Risk Officer can file alongside an SMCR sign-off. The senior manager liability stays with the named individual, which is the argument we made in Count I of the series. The model doesn’t take the call from the FCA. The senior manager does.
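As a sketch of what “simulated interactions turned into structured risk reports” could mean in practice, the example below rolls a set of simulated results into a summary that can be filed and reviewed. The field names and risk categories are assumptions made for illustration, not a prescribed evidence pack format.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class SimulatedResult:
    scenario: str        # name of the simulated interaction
    risk_category: str   # hypothetical label, e.g. "vulnerability"
    passed: bool         # whether the agent met the expected standard


def build_risk_report(results: Iterable[SimulatedResult]) -> dict:
    """Summarise simulated interactions into a structured, reviewable report."""
    results = list(results)
    failures = [r for r in results if not r.passed]
    return {
        "total_scenarios": len(results),
        "failed_scenarios": len(failures),
        "failures_by_category": dict(Counter(r.risk_category for r in failures)),
        "failed_scenario_names": sorted(r.scenario for r in failures),
    }


# Hypothetical simulation results.
report = build_risk_report([
    SimulatedResult("customer mentions bereavement", "vulnerability", False),
    SimulatedResult("customer asks for a recommendation", "advice boundary", True),
    SimulatedResult("customer discloses debt problems", "vulnerability", False),
])
print(report)
```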

What this means if you’re scoping an agentic deployment

You don’t need to be in the Sandbox to learn from it. Three things apply whether or not your firm makes the second cohort.

First, the evidence pack is the deliverable. Most boards are still scoping their AI work around capability questions: what can the agent do, what model are we using, what’s the budget. The firms moving fastest in 2026 have flipped the order. They’re designing the evidence pack first and working backwards to the agent.

Second, audit trails have to be retrievable within minutes by a compliance officer running a specific query. The FCA applies the same standard to a poor outcome from a human adviser and a poor outcome from an AI agent. Your audit trail has to be specific enough to reconstruct what the agent did and, more importantly, why. We covered this in Count II of the series, and it was the single most-discussed practical gap in the Sandbox.
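A minimal sketch of an audit record that captures both what the agent did and why, plus the kind of specific query a compliance officer might run, could look like this. The schema and the `find_records` helper are hypothetical; a real deployment would sit on a proper store, not an in-memory list.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class AuditRecord:
    interaction_id: str
    timestamp: datetime
    agent_version: str
    action: str                # what the agent did
    rationale: str             # why it did it, recorded at the time
    sources: Tuple[str, ...]   # identifiers of the material it relied on


def find_records(records: Iterable[AuditRecord], *,
                 action: Optional[str] = None,
                 since: Optional[datetime] = None) -> List[AuditRecord]:
    """Return matching records in time order, ready for review."""
    matches = []
    for record in records:
        if action is not None and record.action != action:
            continue
        if since is not None and record.timestamp < since:
            continue
        matches.append(record)
    return sorted(matches, key=lambda r: r.timestamp)


# Hypothetical log entries.
log = [
    AuditRecord("int-001", datetime(2026, 1, 5, 9, 30), "v1.2",
                "escalated_to_human", "customer disclosed financial difficulty",
                ("vulnerability-policy",)),
    AuditRecord("int-002", datetime(2026, 1, 6, 14, 0), "v1.2",
                "provided_guidance", "question fell within guidance scope",
                ("product-factsheet",)),
]

for record in find_records(log, action="escalated_to_human"):
    print(record.interaction_id, record.rationale)
```

Recording the rationale and sources at the time of the action is what makes the “why” reconstructable later, without having to re-run the agent and hope it behaves the same way.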

Third, specialist models are the difference between the demo and the deployment. General-purpose foundation models get a firm to roughly 70% of the quality bar a UK regulator will accept. The remaining 30% (domain provenance, explainability, audit logging, financial services context) is the part that cannot be retrofitted at the deployment gate. The full case sits in Count V of the series, where we set out why financial services needs purpose-built models, with FinLLM as our answer to that gap.

What happens next

Aveni is applying to AI Live Testing for Agent Assure. The Sandbox proved the assurance approach. AI Live Testing is the next gate, and it’s the one that matters for real-world deployment.

For firms watching the second Supercharged Sandbox cohort, applications close on 1 June 2026. For firms already scoping an agentic deployment, the published Sandbox findings are the closest thing the UK currently has to a regulator’s view of what good AI agent governance looks like in practice. Worth tracking, even if your product isn’t in the room.

The governance question is the one most firms have yet to answer. The Sandbox is where you find out whether yours holds.


Read the full AI on Trial: The Burden of Proof series


Frequently Asked Questions

What is the FCA Supercharged Sandbox? The FCA Supercharged Sandbox is a programme run by the Financial Conduct Authority’s AI Lab that gives UK financial services firms a controlled environment to test AI applications. It launched in June 2025 in partnership with NVIDIA, and provides access to NVIDIA Accelerated Computing, the NVIDIA AI Enterprise software suite, synthetic datasets, and direct regulatory engagement. The first cohort ran from October 2025 to January 2026.

When does the second FCA Supercharged Sandbox cohort start? Applications for the second cohort are open until 1 June 2026, with a particular focus on agentic AI use cases including compliance agents, customer interaction agents, and agentic payment services. Selected firms participate in a three-month programme culminating in a showcase event.

Does the FCA Supercharged Sandbox give firms regulatory approval? No. Participation in the Supercharged Sandbox does not indicate FCA approval, endorsement, or authorisation of a product or service. The Sandbox is a testing environment for early-stage AI proofs of concept. Firms further along in development can apply to the separate AI Live Testing service.

What is the difference between the FCA Supercharged Sandbox and FCA AI Live Testing? The Supercharged Sandbox supports firms in the discovery and experimentation phase of their AI work. AI Live Testing supports firms ready to deploy AI in a real-world environment under intensified supervision. Together they form a pipeline from early-stage testing through to live deployment.

Who was in the inaugural FCA Supercharged Sandbox cohort? The inaugural cohort included firms working across retail finance, compliance, and wholesale markets. Aveni was selected for the cohort and used the programme to pilot Agent Assure, its AI agent governance product. Other named participants include Napier AI, which worked on financial crime applications.

What did Aveni test in the FCA Supercharged Sandbox? Aveni piloted Agent Assure, an AI agent governance product, in the inaugural Supercharged Sandbox. The work tested a first-line protection guidance agent alongside a two-tier assurance capability combining pre-deployment validation and real-time monitoring. The approach used small language models from Aveni’s FinLLM suite as independent assessors of primary model behaviour, applying one compliance standard across human-led and AI-led customer interactions.

Does the FCA expect firms to use the Supercharged Sandbox before deploying AI? No. The Supercharged Sandbox is voluntary and is one route firms can take to develop and test AI safely. The FCA’s published position is that existing regulatory frameworks, including Consumer Duty and SMCR, apply to AI deployments regardless of whether a firm has been through the Sandbox. The programme sits alongside the existing compliance framework as a support mechanism.
