Company News

From Good to Great: How Aveni’s AI Outclasses GPT-4 in Identifying Vulnerabilities

May 23, 2024

Aveni’s fine-tuned RoBERTa language model has been knocking it out of the park when it comes to detecting vulnerabilities in call transcripts, even beating GPT-4. Over the past two years, we’ve put our model through rigorous tests against GPT-3, GPT-3.5 and GPT-4, and it has come out on top every time. Fine-tuned on our own vulnerability detection data, it’s not just more effective, it’s also more reliable and cost-efficient. Here’s why:

Specialisation vs. Generalisation:

RoBERTa: Our model is specifically fine-tuned on a large dataset of call transcripts annotated for vulnerability detection, making it highly specialised for this task. It excels at identifying nuanced and context-specific signs of vulnerability within conversations.
GPT Models: While GPT models are versatile and capable of handling a wide range of tasks, they are designed as general-purpose models. This generality means they may not perform as well on specialised tasks like vulnerability detection without extensive fine-tuning, which is costly and complex.

Handling Noisy Data:

RoBERTa: The training data for our RoBERTa model includes a lot of noisy transcribed conversations, transcription errors and all. Fine-tuning on this data allows the model to learn to handle these inconsistencies effectively.
GPT Models: These models, particularly in their out-of-the-box state, are less effective at handling noisy data. Even when optimised with prompt engineering and few-shot learning techniques, they struggle with transcription errors and the specific context of our data.

Few-shot learning involves giving an AI a few examples of how to do something new and then asking it to do the same task on its own. For example, you might provide it with three different examples of vulnerabilities and explain, “These are all vulnerabilities to do with health.” Now, if you show it a new example, it’ll be able to tell whether it’s a health-related vulnerability or not based on those few examples.
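To make the idea concrete, here is a minimal sketch of how a few-shot prompt along those lines could be put together. The function name, the prompt wording and the example snippets are all invented for illustration; this is not the prompt used in our tests, and the resulting text would still need to be sent to a model of your choice.

```python
def build_few_shot_prompt(examples, new_text):
    """Assemble a plain-text few-shot prompt (hypothetical wording)."""
    lines = ["These are all vulnerabilities to do with health:"]
    # Number each labelled example so the model sees a clear pattern.
    for i, example in enumerate(examples, start=1):
        lines.append(f"{i}. {example}")
    lines.append("")
    lines.append("Is the following also a health-related vulnerability? Answer yes or no.")
    # The new snippet is left unanswered for the model to complete.
    lines.append(f"Snippet: {new_text}")
    lines.append("Answer:")
    return "\n".join(lines)


# Invented transcript snippets, for illustration only.
EXAMPLES = [
    "I've just come out of hospital and can't get to the branch.",
    "My memory isn't what it was since the diagnosis.",
    "I've been off work with my back for months.",
]

prompt = build_few_shot_prompt(EXAMPLES, "I can't keep up with the payments since my operation.")
print(prompt)
```

The three examples play the role of the “few shots”; the model only ever sees them as text in the prompt, which is why this approach is cheaper to set up than fine-tuning but, in our experience, less robust on noisy transcripts.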

Cost Efficiency:

RoBERTa: Once fine-tuned, the RoBERTa model is cheaper to run and deploy. It doesn’t require the high per-call costs associated with GPT API usage.
GPT Models: The API costs for using GPT models, especially for extensive tasks like vulnerability detection across numerous call transcripts, can be prohibitively expensive. Each call processed through GPT incurs a cost which can add up quickly.
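As a rough illustration of how per-call costs add up, the sketch below multiplies call volume by tokens per call and a price per 1,000 tokens. Every figure in it is a hypothetical placeholder, not a real price, a real call volume or a measurement from our tests, and the function name is our own.

```python
def estimate_api_cost(num_calls, tokens_per_call, price_per_1k_tokens):
    """Estimate total API spend, assuming a flat price per 1,000 tokens."""
    total_tokens = num_calls * tokens_per_call
    return total_tokens / 1000 * price_per_1k_tokens


# All figures below are hypothetical placeholders.
monthly_cost = estimate_api_cost(
    num_calls=10_000,           # transcripts processed per month (assumed)
    tokens_per_call=5_000,      # tokens in one transcript plus prompt (assumed)
    price_per_1k_tokens=0.01,   # price per 1,000 tokens (assumed)
)
print(f"Estimated monthly API cost: {monthly_cost:.2f}")
```

Because the cost scales linearly with both volume and transcript length, doubling either one doubles the bill, whereas a self-hosted fine-tuned model has largely fixed running costs.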

The Value of Effective Vulnerability Detection in AI Solutions

Effective vulnerability detection is particularly crucial in financial services, where identifying and responding to customer vulnerabilities can significantly impact customer satisfaction and compliance with regulations. Aveni’s fine-tuned RoBERTa model offers several advantages:

Accuracy and Reliability: By achieving higher F1 scores than GPT models, our RoBERTa model detects vulnerabilities more accurately, reducing both false positives and false negatives.

Cost Savings: Deploying a fine-tuned model like RoBERTa is more cost-effective in the long run, avoiding the high recurring costs of API calls to more generic models like GPT-3, GPT-3.5, or GPT-4.

An F1 score helps us see how good a model is at (in this case) spotting vulnerabilities.

Precision checks how many of the people the model says need help really do need help.

Recall checks how many people who needed help it actually found, out of all the people who needed help.

The F1 score combines both of these checks into one number to show how well the system is doing overall. The closer the number is to 1, the better the system is at correctly finding people who display characteristics of vulnerability.
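For readers who like to see the arithmetic, here is a small sketch of how precision, recall and F1 can be computed from a set of labels. The labels are a toy example made up for illustration, not results from our evaluation, and the function name is our own.

```python
def precision_recall_f1(true_labels, predicted_labels):
    """Compute precision, recall and F1 for binary labels (1 = vulnerable)."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t and p)        # correctly flagged
    fp = sum(1 for t, p in pairs if not t and p)    # flagged but not vulnerable
    fn = sum(1 for t, p in pairs if t and not p)    # vulnerable but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy labels for eight calls, invented for illustration.
true_labels = [1, 1, 1, 1, 0, 0, 0, 0]
predicted_labels = [1, 1, 1, 0, 1, 0, 0, 0]

precision, recall, f1 = precision_recall_f1(true_labels, predicted_labels)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this toy case the model flags four calls, three of them correctly, and misses one vulnerable customer, so precision, recall and F1 all come out at 0.75.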

Explainability and Transparency: At Aveni, we have complete ownership of the training data used to fine-tune RoBERTa, and we are fully aware of the data employed during its initial pre-training. This comprehensive understanding of both the pre-training and fine-tuning processes allows us to use techniques that enhance the explainability of the model’s decisions. Unlike GPT models, which often function as black boxes, our approach ensures clarity and transparency in how our model operates.

Control Over the Model: Our in-depth knowledge of the pre-training and fine-tuning processes grants Aveni full control over the model. This control enables us to make precise modifications and continuous improvements, ensuring the model remains effective and up-to-date.

Compliance and Ethical Considerations: Accurate vulnerability detection isn’t just about ticking boxes; it’s essential for staying on the right side of regulations like Consumer Duty. Plus, it ensures customer interactions are handled ethically, which builds trust and protects reputation.

To sum up, our fine-tuned RoBERTa model is the clear winner for spotting vulnerabilities in call transcripts. It’s more accurate, dependable, and budget-friendly than the more generic GPT models, which highlights just how crucial it is to fine-tune models for specific tasks. By tailoring our approach, we’ve created a powerful tool that’s not only smart but also efficient, making it the best choice for this important job, especially in light of Consumer Duty guidance from the FCA.