
Consumer Duty Compliance: A Practical Guide for Financial Firms

February 24, 2026

Consumer Duty compliance requires firms to demonstrate clear evidence that customers receive good outcomes. This guide brings together the key topics firms need to understand, from evidence standards and FCA expectations to the operational challenges of monitoring customer interactions at scale.

Why “Reasonable Steps” Are No Longer Enough: Evidencing Consumer Duty at Scale

The FCA’s Consumer Duty requires firms to take “reasonable steps” to deliver good outcomes for customers. Most firms believe they are meeting that standard. The problem is that the FCA’s interpretation of “reasonable” has shifted significantly since the Duty came into force in July 2023.

During the initial implementation phase, many firms treated Consumer Duty compliance as a documentation exercise. They updated policies, trained staff, reviewed a sample of customer interactions, and reported to the board. That approach was sufficient to meet the first deadline. It doesn’t satisfy the FCA’s current expectations.

The regulator has made clear that evidencing Consumer Duty compliance requires data, not just processes. Firms need to demonstrate good outcomes across all customer interactions, not just the small percentage that lands in a QA sample. And they need to do this consistently, with audit trails that hold up under regulatory scrutiny.

How FCA expectations have shifted

What was enough in 2023 is not enough now.

2023 Implementation
✓ Policies updated
✓ Staff training completed
✓ Sample of interactions reviewed
✓ Board report submitted

FCA’s Current Standard
→ Data evidencing outcomes across all interactions
→ Outcome-specific monitoring for each of the four Duty areas
→ Full coverage, not a sample
→ Audit trails from issue to action

This guide covers the full scope of Consumer Duty evidence requirements. It explains what the FCA means by “reasonable steps” and how that standard has evolved, where the most common evidence gaps sit across financial services firms, and why manual sampling introduces regulatory risk that many firms underestimate. It then sets out what a credible and scalable evidence framework looks like in practice, how to evaluate technology for systematic Consumer Duty monitoring, and how to move from sample-based QA to full-coverage evidence on a phased, realistic timeline.

Whether you lead compliance, run QA operations, or sit on the board of a regulated firm, this guide provides a structured path from where most firms are today to where the FCA expects them to be.

What the FCA Means by “Reasonable Steps”

The Consumer Duty’s cross-cutting rules require firms to act in good faith, avoid foreseeable harm, and enable customers to pursue their financial objectives. Principle 12, the overarching standard, states that a firm must act to deliver good outcomes for retail customers. The FCA’s Finalised Guidance (FG22/5) sets this against the benchmark of what could “reasonably be expected of a prudent firm.”

In practice, “reasonable steps” means firms must demonstrate they have identified, monitored, and acted on evidence of customer outcomes across all four Duty outcomes: products and services, price and value, consumer understanding, and consumer support.

The FCA has been explicit about what this requires. Its published good and poor practice findings state that firms need “the right culture and governance” and must use “data to identify, monitor and confirm they are satisfied that their customers’ outcomes are consistent with the Duty.” Firms that wait for the FCA to intervene, rather than addressing issues proactively, are falling short of this standard.

The critical word is “evidence.” Board reports must include the results of monitoring, evidence of poor outcomes (including whether specific customer groups are affected), and an overview of actions taken. Process completion alone does not satisfy this test.

The FCA’s multi-firm review of insurance firms confirmed this.

The review found that firms’ approaches were “overly focused on processes being completed rather than on the outcomes delivered,” and that “few firms were able to provide clear evidence of where the monitoring of outcomes had directly led to the firm taking action.”

The FCA has made it clear that having a QA process in place is necessary but does not, on its own, constitute evidence of good outcomes. The regulator wants to see what you found, what it means, and what you did about it.

For boards and senior leadership, this creates a direct governance obligation. Consumer Duty reporting that reaches the board must contain specific, data-backed evidence, not summary statistics built on incomplete monitoring.

Explore how the Consumer Duty cross-cutting rules apply in practice and what evidence the FCA looks for →

Read a detailed breakdown of all four Consumer Duty outcomes and what the FCA expects firms to evidence →

The Evidence Gap Most Firms Have

For many firms, the evidence gap is structural. Existing QA and compliance processes were designed for a different regulatory environment, one where sampling a small percentage of interactions, recording completion rates, and responding to complaints was considered adequate.

Consumer Duty sets a higher standard. Firms need to evidence good outcomes across the entire customer base, not just the interactions they review. They need data that’s specific to each of the four Duty outcomes, with tolerances and thresholds that are clearly articulated, regularly reviewed, and linked to remedial action.
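As a rough illustration of what “tolerances linked to remedial action” could look like in practice, the sketch below compares a flagged-interaction rate for each of the four outcomes against a stated tolerance. The tolerance values, the `OutcomeResult` structure, and the sample figures are all hypothetical and are not drawn from FCA guidance; a firm would set and justify its own.

```python
from dataclasses import dataclass

# Hypothetical tolerances: the maximum acceptable share of reviewed
# interactions flagged as a poor outcome, per Duty outcome.
# These numbers are illustrative assumptions, not regulatory thresholds.
TOLERANCES = {
    "products_and_services": 0.02,
    "price_and_value": 0.02,
    "consumer_understanding": 0.05,
    "consumer_support": 0.03,
}


@dataclass
class OutcomeResult:
    outcome: str
    reviewed: int  # interactions assessed against this outcome
    flagged: int   # interactions flagged as a poor outcome

    @property
    def flagged_rate(self) -> float:
        return self.flagged / self.reviewed if self.reviewed else 0.0


def breaches(results):
    """Return the outcomes whose flagged rate exceeds the stated tolerance."""
    return [r.outcome for r in results if r.flagged_rate > TOLERANCES[r.outcome]]


# Made-up monitoring results for one reporting period.
results = [
    OutcomeResult("products_and_services", 1000, 12),
    OutcomeResult("price_and_value", 1000, 25),
    OutcomeResult("consumer_understanding", 1000, 40),
    OutcomeResult("consumer_support", 1000, 31),
]

print(breaches(results))
```

Each breached outcome would then need a documented remedial action, which is the link between monitoring and action that the evidence standard asks for.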

Learn how conduct risk monitoring is shifting from reactive reviews to proactive detection under Consumer Duty →

The most common evidence gaps across financial services firms fall into four areas:

Coverage

Most firms review between 2% and 5% of customer interactions through manual QA. That means 95% to 98% of conversations, meetings, and service interactions go unreviewed. When the FCA asks a firm to demonstrate that customers are consistently receiving good outcomes, a sample covering 2% of interactions provides limited assurance.

The coverage gap is compounded by volume. A 500-adviser network generating 25,000 client interactions per year would need approximately 37,500 hours of assessor time at 90 minutes per case to achieve full coverage manually. That’s the equivalent of roughly 18 full-time compliance staff dedicated solely to QA review. For most firms, that level of resourcing is not feasible, which means the coverage gap persists by default rather than by design.
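The arithmetic behind that estimate can be reproduced in a few lines. The 2,080-hour working year (40 hours across 52 weeks) is an assumption added here to convert hours into headcount; the guide itself does not state the figure it used.

```python
def manual_review_workload(interactions_per_year, minutes_per_case,
                           fte_hours_per_year=2080):
    """Estimate assessor hours and full-time reviewers for 100% manual coverage.

    fte_hours_per_year defaults to 40 hours x 52 weeks, which is an assumption.
    """
    hours = interactions_per_year * minutes_per_case / 60
    return hours, hours / fte_hours_per_year


hours, ftes = manual_review_workload(25_000, 90)
print(f"{hours:,.0f} assessor hours, about {ftes:.0f} full-time reviewers")
```

Changing the minutes per case or the working-year assumption shifts the headcount, but not the conclusion that full manual coverage is out of reach for most firms.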

The FCA’s position is that firms should be able to confirm outcomes across their full customer base. A firm that reviews 2% of interactions and extrapolates the results is making assumptions about the other 98%. Under Consumer Duty, assumptions are not evidence.

The coverage challenge applies to firms of every size. In its review of the first annual Consumer Duty board reports, the FCA examined 180 firms, including 55 smaller firms, some with fewer than 10 employees. It found examples of good practice at every scale. The evidence standard does not change based on firm size. What changes is how firms meet it proportionately.

Smaller firms with limited access to formal MI can draw on qualitative sources: staff observations, customer feedback, complaints data, and input from trade bodies. The FCA accepts these approaches where formal data is genuinely difficult to obtain. The requirement is that whatever evidence a firm uses is structured, documented, and leads to a reasoned conclusion about whether customers are receiving good outcomes. Informal impressions, however accurate they may be, do not satisfy the standard.

A note on governance for smaller firms
The FCA has recommended that smaller firms without dedicated compliance or audit functions consider appointing a “critical friend” with relevant sector knowledge to provide independent feedback on their Consumer Duty approach. This could be an external consultant, a trade body contact, or a senior peer with regulatory experience. The role is to stress-test conclusions, identify practical gaps, and provide the kind of impartial challenge that larger firms get from their second and third lines of defence. For smaller IFA firms and protection networks, this is a practical and proportionate way to meet the governance expectations the FCA sets out in its board reporting guidance.

Discover why 2% sampling leaves firms exposed under Consumer Duty and what full coverage looks like in practice →

Consistency

Manual QA is inherently subjective. Two assessors reviewing the same interaction will often reach different conclusions about outcome quality, risk severity, and required follow-up. Across a large adviser network, this creates variation in how outcomes are measured and reported, which undermines the reliability of the evidence.

This variation is particularly acute for firms operating across multiple offices, regions, or business lines. A QA team in one office may apply stricter criteria for consumer understanding than a team in another. When this inconsistency feeds into board reporting, the firm’s evidence base becomes unreliable at precisely the moment it needs to be defensible.

For Heads of Operations and QA Managers, consistency is also a training and resourcing challenge. Maintaining calibration across a team of assessors requires regular benchmarking sessions, shared case studies, and documented assessment criteria. Many firms run these exercises, but the inherent subjectivity of manual review limits how far calibration can go.
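One way to put a number on assessor calibration is an agreement statistic such as Cohen’s kappa, which measures how often two assessors agree beyond what chance alone would produce. Kappa is a standard statistic introduced here for illustration and is not a measure the FCA prescribes. The ratings below are invented.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two assessors rating the same cases."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("Both assessors must rate the same non-empty set of cases")
    n = len(ratings_a)
    # Share of cases where the two assessors gave the same rating.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each assessor's label frequencies.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both assessors used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical outcome ratings for the same six interactions.
assessor_1 = ["good", "good", "poor", "good", "poor", "good"]
assessor_2 = ["good", "poor", "poor", "good", "poor", "good"]
print(round(cohens_kappa(assessor_1, assessor_2), 2))
```

A value near 1 indicates strong agreement and a value near 0 indicates agreement no better than chance. Tracking it across calibration sessions shows whether benchmarking is actually narrowing the variation.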

Explore what the FCA considers adequate evidence and what quietly fails review →

Specificity

Many firms repackage existing management information (MI) for Consumer Duty reporting without asking whether it actually demonstrates outcomes. The FCA has flagged this directly, noting that firms should “not be complacent and assume that they can just repackage existing data.” The regulator wants firms to think seriously about what information they need to understand their customers’ outcomes and the issues they may face.

The specificity gap shows up most clearly in board packs. A board report that shows “95% of reviewed interactions rated satisfactory” tells leadership very little about whether customers are receiving good outcomes across each of the four Duty areas. It does not reveal whether consumer understanding was tested, whether vulnerability was identified and acted on, whether price and value was assessed against the target market, or whether products remained suitable over time.

Compliance Officers face the challenge of translating existing MI into outcome-specific reporting without the underlying data to support it. When the data was never collected with Consumer Duty outcomes in mind, retrofitting it to meet the standard produces reports that look complete but lack substance.

Use our Consumer Duty compliance checklist to assess whether your monitoring covers all four outcome areas →

Timeliness

Manual sampling typically runs weeks or months behind live interactions. By the time an issue surfaces in a QA review, the harm may already have occurred. The opportunity for early intervention has passed.

The timeliness gap has direct consequences for firms in sectors where customer interactions carry high conduct risk. Protection insurance conversations, equity release advice, and debt management calls all involve scenarios where delayed identification of a suitability concern or missed vulnerability indicator can result in measurable customer harm. For these firms, the gap between when a problem occurs and when it is identified represents both regulatory risk and potential redress exposure.

The FCA expects firms to act proactively. Evidence collected months after the fact demonstrates what went wrong; it does not demonstrate oversight.
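The timeliness gap can be measured directly as the number of days between an interaction and its review. The sketch below uses invented dates and a hypothetical `detection_lags` helper; the point is only that detection lag is a simple metric to track and report.

```python
from datetime import date
from statistics import median

# Hypothetical (interaction_date, review_date) pairs from a manual QA log.
reviews = [
    (date(2025, 1, 6), date(2025, 2, 17)),
    (date(2025, 1, 9), date(2025, 3, 3)),
    (date(2025, 1, 14), date(2025, 1, 28)),
]


def detection_lags(pairs):
    """Days elapsed between each interaction and its QA review."""
    return [(reviewed - occurred).days for occurred, reviewed in pairs]


lags = detection_lags(reviews)
print(f"median lag: {median(lags)} days, worst case: {max(lags)} days")
```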

Learn how Consumer Duty changes conduct risk monitoring and why most firms are behind →

Why Manual Sampling Creates Regulatory Risk

Sampling has been the foundation of QA in financial services for decades. A firm selects a percentage of customer interactions (typically 2% to 5%), reviews them against a set of criteria, and reports the findings. On paper, this looks rigorous. In practice, it introduces risks that are difficult to defend under Consumer Duty.

Statistical risk. A 2% sample gives a firm visibility of 2% of what is happening. The remaining 98% is invisible. If a conduct issue, vulnerability indicator, or suitability concern sits within that 98%, the firm has no evidence it existed, no record that it was identified, and no audit trail showing what action was taken. Under Consumer Duty, the FCA expects firms to confirm that customers are receiving good outcomes consistently. A sample that misses the vast majority of interactions makes that confirmation unreliable.

Selection bias. Firms typically sample either randomly or based on basic criteria such as interaction type or adviser. Neither method reliably captures the highest-risk interactions. A complaint that was never formally raised, an affordability concern mentioned mid-conversation, or a vulnerable customer who did not self-identify will not appear in a criteria-based sample, and may not surface in a random one either.

Response time. By the time a manual QA process identifies a pattern (for example, a specific adviser consistently failing to confirm understanding, or a product being recommended without adequate disclosure), multiple customers may already have experienced poor outcomes. The FCA expects firms to act proactively. Retrospective sampling, by its nature, limits a firm’s ability to intervene early.

For QA Managers, these risks are often well understood but difficult to resolve within existing resource constraints. The issue is not awareness; it is capacity. Manual QA teams are typically stretched thin, and increasing the sample rate from 2% to even 10% would require a significant uplift in headcount without fundamentally changing the structural limitations of the approach.
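The statistical risk described above can be made concrete with a short calculation. Assuming a simple random sample drawn without replacement (an idealisation, since real QA sampling is rarely purely random), the function below gives the probability that a sample contains none of the interactions affected by an issue. The function name and scenario figures are illustrative.

```python
def probability_sample_misses_all(total, sample_size, affected):
    """Probability that a simple random sample contains none of the affected cases.

    Assumes sampling without replacement from `total` interactions,
    of which `affected` exhibit the issue.
    """
    if not 0 <= sample_size <= total or not 0 <= affected <= total:
        raise ValueError("sample_size and affected must lie between 0 and total")
    p = 1.0
    for i in range(affected):
        unsampled_left = total - sample_size - i
        if unsampled_left <= 0:
            return 0.0  # more affected cases than unsampled slots
        # Chance that the next affected case also falls outside the sample.
        p *= unsampled_left / (total - i)
    return p


# 25,000 interactions, a 2% sample (500), and an issue affecting 10 of them.
p_miss = probability_sample_misses_all(25_000, 500, 10)
print(f"chance the sample misses every affected interaction: {p_miss:.0%}")
```

Under these assumptions the sample misses all ten affected interactions roughly four times out of five. Raising the sample rate lowers that probability, but only full coverage removes it.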

See how AI-powered monitoring achieves 100% interaction coverage while reducing manual QA time →

Book a demo to see how Aveni Detect replaces sample-based QA with full interaction coverage →

Technology Approaches to Systematic Evidence

Delivering 100% coverage through manual QA would require an impractical number of reviewers. As the coverage example above shows, a 500-adviser network generating 25,000 client interactions per year would need roughly 37,500 hours of assessor time, the equivalent of about 18 full-time compliance staff dedicated solely to QA review.

This is where technology becomes essential. AI-powered monitoring platforms analyse every customer interaction automatically, flagging conduct risk, vulnerability indicators, suitability concerns, and Consumer Duty outcome markers at a fraction of the time and cost of manual review.

Several capabilities matter when evaluating technology for Consumer Duty evidence.

Interaction analysis across channels. The platform should handle voice calls, meeting recordings, written correspondence, and advice documentation. Consumer Duty applies across all customer touchpoints, and evidence needs to reflect that.

Real-time and retrospective monitoring. Retrospective analysis reviews completed interactions and generates evidence for reporting and audit. Real-time monitoring flags issues during or immediately after an interaction, enabling faster intervention. The most effective approaches combine both.

Outcome-specific assessment. The technology should assess interactions against Consumer Duty’s four outcomes directly, with configurable criteria that reflect the firm’s specific products, services, and target market. Generic sentiment analysis does not meet this standard.

Evidence capture and export. Every flagged interaction should generate a traceable evidence record linking the original conversation to the identified risk, the outcome assessment, and any subsequent action. This creates the audit trail the FCA expects when it examines a firm’s Consumer Duty evidence.

Integration with existing workflows. The technology should feed into existing QA processes, compliance dashboards, and board reporting frameworks. Evidence that sits in a standalone system, disconnected from the firm’s governance structure, adds complexity without improving oversight.
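To make the evidence capture capability above more tangible, here is a minimal sketch of what a traceable evidence record might contain. The `EvidenceRecord` class and its field names are hypothetical and do not describe any particular platform’s data model.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class EvidenceRecord:
    """Hypothetical record linking an interaction to a risk, an assessment and actions."""
    interaction_id: str   # reference to the original conversation
    outcome: str          # which of the four Duty outcomes is concerned
    risk_flag: str        # what was identified
    assessment: str       # the reviewer's or system's conclusion
    actions: List[str] = field(default_factory=list)
    closed_at: Optional[datetime] = None

    def record_action(self, description: str) -> None:
        self.actions.append(description)

    def has_complete_trail(self) -> bool:
        """True once at least one action is logged and the record is closed."""
        return bool(self.actions) and self.closed_at is not None


record = EvidenceRecord(
    interaction_id="call-0001",
    outcome="consumer_understanding",
    risk_flag="understanding not confirmed before product recommendation",
    assessment="poor outcome risk",
)
record.record_action("Adviser coaching session scheduled")
record.closed_at = datetime(2025, 3, 1, 9, 30)
print(record.has_complete_trail())
```

The design choice worth noting is that the record is not considered complete until an action is logged and the issue is closed, which mirrors the “issue to action” audit trail the guide describes.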

Aveni Detect · Published Results

What full-coverage monitoring delivers

100% interaction coverage (up from 2–3% manual sampling)
83% QA time reduction (per assessment)
75% faster vulnerability detection (identified and acted on sooner)

Aveni published case study data. Figures based on documented client outcomes.

As a reference point, firms using Aveni Detect have achieved 100% interaction coverage (up from typical 2–3% manual sampling), with QA assessment time reduced by 83% and vulnerability detection completed 75% faster. These results are documented in Aveni’s published case studies.

See how to evaluate Consumer Duty technology options and build a credible business case for your firm →

Implementation Roadmap

From sample-based QA to full coverage

Weeks 1–4: Evidence Gap Analysis. Map evidence against all four outcomes. Identify gaps.
Weeks 4–8: Technology Selection. Evaluate platforms against coverage and assessment needs.
Weeks 8–14: Rollout & Adoption. Deploy in stages. Start with one team, then expand.
Ongoing: Continuous Improvement. Quarterly reviews. Refine tolerances over time.

Most firms achieve full Consumer Duty coverage within 12 weeks of deployment.

Phase 1: Evidence Gap Analysis (Weeks 1 to 4)

Start by mapping your current evidence against the FCA’s four Consumer Duty outcomes. For each outcome, document what data you currently collect, how much of your customer base it covers, and where the gaps sit. Pay particular attention to coverage (what percentage of interactions are reviewed), consistency (how standardised your assessment criteria are), and timeliness (how quickly issues surface after they occur).

This analysis gives you a clear picture of where your evidence is strong and where regulatory risk exists.

Phase 2: Technology Selection (Weeks 4 to 8)

With your evidence gaps documented, evaluate technology options against your specific requirements. Key selection criteria include: channel coverage (voice, written, face-to-face), integration with your existing CRM and compliance platforms, the platform’s ability to assess against Consumer Duty outcomes specifically, and evidence export capabilities for board reporting and regulatory review.

Consider running a proof-of-concept with a small subset of interactions to validate accuracy and assess how the technology handles your firm’s specific language, products, and customer profiles.

Phase 3: Rollout and Adoption (Weeks 8 to 14)

Deploy in stages. Start with a single team, product line, or office to establish baseline metrics and refine assessment criteria before expanding. Incorporate feedback from QA assessors, compliance leads, and frontline staff as you scale. Training should focus on how the technology supports existing QA workflows (not replaces them), and how the evidence it generates feeds directly into Consumer Duty board reporting.

Follow a detailed 90-day implementation roadmap for Consumer Duty compliance technology →

Phase 4: Continuous Improvement (Ongoing)

Consumer Duty compliance is not a one-time project. The FCA expects firms to refine their monitoring, update their tolerances, and act on what the evidence tells them over time. Build a review cycle that assesses the effectiveness of your evidence framework quarterly, incorporates regulatory developments, and tracks improvement in customer outcomes.

Book a demo to see how firms achieve full Consumer Duty coverage within 12 weeks →

Where to Start

The FCA’s expectations around Consumer Duty evidence will continue to increase. The regulator has signalled that multi-firm reviews, sector-specific scrutiny, and enhanced board reporting requirements are all part of its 2025–2026 agenda. Firms that depend on sampling and manual processes face growing regulatory risk with each review cycle.

Building a scalable evidence framework takes planning, the right technology, and a phased approach. But the starting point is straightforward: understand where your current evidence falls short, and begin closing the gaps.

Three Next Steps for Your Firm

1. Assess your evidence gaps. Map your current monitoring against the FCA’s four Consumer Duty outcomes and identify where coverage, consistency, and timeliness fall short.

2. Explore technology-enabled monitoring. See how AI-powered platforms deliver 100% interaction coverage and outcome-specific evidence tailored to your firm’s needs. Evaluate your options with our Consumer Duty technology assessment and business case guide →

3. Build your business case. Understand the cost, timeline, and return on investment for moving from sample-based QA to systematic evidence at scale. See what a realistic 90-day implementation looks like for firms moving to full-coverage monitoring →


Frequently Asked Questions

What does Consumer Duty compliance require firms to evidence?

Firms must demonstrate good outcomes across all four FCA Consumer Duty outcomes: products and services, price and value, consumer understanding, and consumer support. Evidence must be data-backed, outcome-specific, and supported by a clear audit trail. The FCA has stated that process completion alone does not satisfy this standard.

What are the four Consumer Duty outcomes?

The four Consumer Duty outcomes are products and services, price and value, consumer understanding, and consumer support. Firms must monitor and evidence good outcomes across all four areas, not just report on process compliance. Board reports must include specific data tied to each outcome, along with evidence of any remedial action taken.

What counts as a ‘reasonable step’ under Consumer Duty?

The FCA benchmarks “reasonable steps” against what could reasonably be expected of a prudent firm. In practice, this means firms must identify, monitor, and act on evidence of customer outcomes across their full customer base. Waiting for the FCA to intervene before addressing issues falls short of this standard.

Why is 2% QA sampling a problem under Consumer Duty?

A 2% sample leaves 98% of customer interactions unreviewed, which means conduct issues, vulnerability indicators, and suitability concerns in that 98% leave no evidence trail. The FCA expects firms to confirm good outcomes consistently across all customers, not extrapolate from a small sample. Manual sampling also introduces selection bias and runs weeks or months behind live interactions, limiting early intervention.

What evidence does the FCA expect in Consumer Duty board reports?

The FCA expects board reports to include the results of outcome monitoring, evidence of poor outcomes (including whether specific customer groups are affected), and an overview of actions taken in response. In its review of 180 firms’ first annual Consumer Duty board reports, the FCA found that boards were approving reports without documented evidence of challenge. Process completion and summary statistics built on incomplete monitoring do not satisfy the standard.

How can firms achieve full Consumer Duty interaction coverage?

AI-powered monitoring platforms can analyse every customer interaction automatically, flagging conduct risk, vulnerability indicators, suitability concerns, and Consumer Duty outcome markers across all channels. Firms using Aveni Detect have achieved 100% interaction coverage, with QA assessment time reduced by 83% and vulnerability detection completed 75% faster, based on Aveni’s published case study data. A phased implementation approach, from evidence gap analysis through to full rollout, can achieve full coverage within 12 weeks.

What is the biggest evidence gap in Consumer Duty compliance?

Coverage is the most widespread gap. Most financial services firms review between 2% and 5% of customer interactions through manual QA, leaving the vast majority of conversations, meetings, and service interactions unreviewed. The FCA’s position is that assumptions about the unreviewed majority are not evidence.

What should firms do first to improve their Consumer Duty evidence?

Start by mapping current monitoring against the FCA’s four Consumer Duty outcomes and identifying where coverage, consistency, and timeliness fall short. This evidence gap analysis gives a clear picture of regulatory risk before any technology or process changes are made. From there, firms can evaluate technology-enabled monitoring and build a business case for moving from sample-based QA to systematic evidence at scale.