From Good to Great: How Aveni’s AI Outclasses GPT-4 in Identifying Vulnerabilities


Aveni’s fine-tuned RoBERTa language model has been knocking it out of the park when it comes to detecting vulnerabilities in call transcripts, even beating the latest GPT-4. Over the past two years, we’ve put our model through rigorous tests and it’s come out on top every time. Fine-tuned on our own vulnerability detection data, RoBERTa outshines GPT-3 and GPT-3.5, and even leaves GPT-4 trailing behind. It’s not just more effective, it’s also more reliable and cost-efficient. Here’s why:


Specialisation vs. Generalisation:

  • RoBERTa: Our model is specifically fine-tuned on a large dataset of call transcripts annotated for vulnerability detection, making it highly specialised for this task. It excels at identifying nuanced and context-specific signs of vulnerability within conversations.
  • GPT Models: While GPT models are versatile and capable of handling a wide range of tasks, they are designed as general-purpose models. This generality means they may not perform as well on specialised tasks like vulnerability detection without extensive fine-tuning, which is costly and complex.

Handling Noisy Data:

  • RoBERTa: The training data for our RoBERTa model includes a lot of noisy, transcribed conversations. Our fine-tuning process allows the model to learn to navigate these inconsistencies effectively.
  • GPT Models: These models, particularly in their out-of-the-box state, are less effective at handling noisy data. Even when optimised with prompt engineering and few-shot learning techniques, they struggle with transcription errors and the specific context of our data.

Few-shot learning involves giving an AI a few examples of how to do something new and then asking it to do the same task on its own. For example, you might provide three different examples of vulnerabilities and explain, “These are all vulnerabilities to do with health.” If you then show it a new example, it can use those few examples to judge whether it’s a health-related vulnerability or not.
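To make the idea concrete, here is a minimal sketch of how a few-shot prompt could be put together. The function name, the prompt wording and the example statements are all made up for illustration; they are not Aveni’s prompts or data, and the sketch stops short of sending the prompt to any model.

```python
def build_few_shot_prompt(examples, new_statement):
    """Assemble a few-shot prompt from (statement, label) pairs.

    Hypothetical helper: the wording and layout are assumptions,
    not a prompt format taken from the article.
    """
    lines = [
        "Decide whether each customer statement shows a health-related vulnerability.",
        "",
    ]
    # Each labelled example shows the model what a correct answer looks like.
    for statement, label in examples:
        lines.append(f"Statement: {statement}")
        lines.append(f"Health vulnerability: {label}")
        lines.append("")
    # The new, unlabelled statement goes last so the model completes the answer.
    lines.append(f"Statement: {new_statement}")
    lines.append("Health vulnerability:")
    return "\n".join(lines)


# Invented examples, for illustration only.
examples = [
    ("I've just come out of hospital after my operation.", "yes"),
    ("My memory isn't what it was since the stroke.", "yes"),
    ("I'd like to change my direct debit date.", "no"),
]

prompt = build_few_shot_prompt(
    examples, "I've been off work with depression for months."
)
print(prompt)
```

The resulting text would be sent to a general-purpose model, which is expected to continue it with “yes” or “no” for the final statement.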

Cost Efficiency:

  • RoBERTa: Once fine-tuned, the RoBERTa model is cheaper to run and deploy. It doesn’t incur the high per-call costs associated with GPT API usage.
  • GPT Models: The API costs for using GPT models, especially for extensive tasks like vulnerability detection across numerous call transcripts, can be prohibitively expensive. Each call processed through GPT incurs a cost which can add up quickly.
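As a rough illustration of how per-call costs add up, the sketch below multiplies call volume by transcript length and a per-token price. Every number in it is a placeholder chosen to keep the arithmetic simple; none of them is a real price, a real transcript length or an Aveni figure.

```python
def estimate_api_cost(num_calls, tokens_per_call, price_per_1k_tokens):
    """Estimate total spend when every transcript goes to a pay-per-token API.

    Hypothetical helper: real pricing usually separates input and output
    tokens, which this sketch ignores.
    """
    total_tokens = num_calls * tokens_per_call
    return total_tokens / 1000 * price_per_1k_tokens


# Placeholder values, not real prices or volumes.
monthly_cost = estimate_api_cost(
    num_calls=10_000,          # assumed number of calls per month
    tokens_per_call=6_000,     # assumed tokens in one call transcript
    price_per_1k_tokens=0.01,  # assumed price per 1,000 tokens
)
print(f"Estimated monthly API cost: {monthly_cost:,.2f}")
```

Because the cost scales linearly with the number of calls, doubling call volume doubles the bill, whereas a self-hosted fine-tuned model has running costs that depend on infrastructure, not on a per-call fee.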


The Value of Effective Vulnerability Detection in AI Solutions

Effective vulnerability detection is particularly crucial in financial services, where identifying and responding to customer vulnerabilities can significantly impact customer satisfaction and compliance with regulations. Aveni’s fine-tuned RoBERTa model offers several advantages:

Accuracy and Reliability: By achieving higher F1 scores than GPT models, our RoBERTa model detects vulnerabilities more accurately, with fewer false positives and false negatives.

Cost Savings: Deploying a fine-tuned model like RoBERTa is more cost-effective in the long run, avoiding the high recurring costs of API calls to more generic models like GPT-3, GPT-3.5, or GPT-4.

An F1 score helps us see how good a model is at (in this case) spotting vulnerabilities.

Precision checks how many of the people the model says need help really do need help.

Recall checks how many of the people who needed help the model actually found.

The F1 score combines both of these checks into one number (the harmonic mean of precision and recall) to show how well the system is doing overall. The closer the number is to 1, the better the system is at correctly finding people who display characteristics of vulnerability.
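For readers who want to see the calculation, here is a minimal sketch using the standard definitions of precision, recall and F1. The function name and the counts are invented for illustration and are not results from Aveni’s evaluation.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Compute precision, recall and F1 from raw counts."""
    flagged = true_positives + false_positives       # everyone the model flagged
    actually_vulnerable = true_positives + false_negatives  # everyone who needed help

    # Guard against division by zero when nothing was flagged or nothing was there to find.
    precision = true_positives / flagged if flagged else 0.0
    recall = true_positives / actually_vulnerable if actually_vulnerable else 0.0

    if precision + recall == 0:
        return precision, recall, 0.0

    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Invented counts: 80 correct flags, 20 false alarms, 10 missed cases.
precision, recall, f1 = precision_recall_f1(
    true_positives=80, false_positives=20, false_negatives=10
)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Because F1 is a harmonic mean, it is pulled towards the lower of the two numbers, so a model cannot score well by being strong on precision alone or recall alone.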

Explainability and Transparency: At Aveni, we have complete ownership of the training data used to fine-tune RoBERTa, and we are fully aware of the data employed during its initial pre-training. This comprehensive understanding of both the pre-training and fine-tuning processes allows us to use techniques that enhance the explainability of the model’s decisions. Unlike GPT models, which often function as black boxes, our approach ensures clarity and transparency in how our model operates.


Control Over the Model: Our in-depth knowledge of the pre-training and fine-tuning processes grants Aveni full control over the model. This control enables us to make precise modifications and continuous improvements, ensuring the model remains effective and up-to-date.


Compliance and Ethical Considerations: Accurate vulnerability detection isn’t just about ticking boxes; it’s essential for staying on the right side of regulations like Consumer Duty. Plus, it ensures customer interactions are handled ethically, which builds trust and protects reputation.


To sum up, our fine-tuned RoBERTa model is the clear winner for spotting vulnerabilities in call transcripts. It’s more accurate, dependable, and budget-friendly than the more generic GPT models. This highlights just how crucial it is to fine-tune models for specific tasks. By tailoring our approach, we’ve created a powerful tool that’s not only smart but also efficient, making it the best choice for this important job, especially in light of Consumer Duty guidance from the FCA.
