How low-resource Natural Language Processing is making Speech Analytics accessible to industry

Written by Lexi Birch

Natural Language Processing (NLP) is a subset of artificial intelligence that enables machines to understand, process and analyse natural language in the way that humans do. The machine analyses the data, interprets it, measures sentiment and draws the intended inference from it. The data used for Natural Language Processing (and other forms of machine learning) may be labelled. Labelled data is data with predefined tags that provide information the machine can learn from. Learning from such data is called supervised learning. A simple example of labelled data is customer bio data with labels indicating that the strings containing an ‘@’ symbol are email addresses, the two-digit numbers are ages, the images are passport photos, and so on. With unlabelled data, there are no such tags, and the machine has to categorise or cluster data attributes with similar patterns. This process is known as unsupervised learning.


Natural Language Processing has achieved remarkable progress in the past decade on the basis of neural models. Using large amounts of labelled data can help achieve state-of-the-art performance for tasks such as sentiment detection, Named Entity Recognition (NER), Natural Language Inference (NLI) or question-answering. For these tasks, the labels or tags would be, for example, the sentiment of a review, or the people, places or organisations mentioned in the text. However, the dependence on labelled data prevents NLP models from being applied in low-resource settings, because labelling large amounts of textual data often requires considerable time, money and expertise.


We’re going to take a look at recent advances in NLP, which allow deep learning models to learn from very few examples. This is crucial for speech analytics where labelled examples are often in very short supply. 


Pre-train and Fine-tune


In the last few years, a new paradigm, pre-train and fine-tune, has emerged, which allows us to leverage large amounts of unlabelled data for NLP. The premise is that it may be better to first learn to model the language itself. Once we have a model that understands the language, we can share this knowledge across the many different tasks we care about by fine-tuning the model on small amounts of labelled data. Language modelling is a machine learning task where the model needs to learn how to predict a missing word given the context of the rest of the sentence. This is a generic task with abundant naturally occurring data, so it can be used to pre-train such a generic model.
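To make the "predict a missing word" task concrete, here is a minimal sketch that uses simple counts rather than a neural network. It is not how BERT or any real language model works internally; the function names, the `[MASK]` token and the three-sentence corpus are our own illustrative assumptions. The point is only that the training signal comes from raw, unlabelled text.

```python
from collections import Counter, defaultdict


def train_masked_lm(sentences):
    """Count which word appears between each (left, right) neighbour pair."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for i in range(1, len(tokens) - 1):
            counts[(tokens[i - 1], tokens[i + 1])][tokens[i]] += 1
    return counts


def predict_masked(counts, sentence, mask="[MASK]"):
    """Return the most frequent filler seen in the same context, or None."""
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    i = tokens.index(mask.lower())
    fillers = counts.get((tokens[i - 1], tokens[i + 1]))
    if not fillers:
        return None  # this context never occurred in the training text
    return fillers.most_common(1)[0][0]


# Unlabelled text is all we need: the "label" is the hidden word itself.
corpus = [
    "the customer was happy with the service",
    "the customer was happy with the call",
    "the customer was unhappy with the delay",
]
model = train_masked_lm(corpus)
prediction = predict_masked(model, "the agent was [MASK] with the outcome")
```

A neural language model replaces the lookup table with learned representations, which is what lets it generalise to contexts it has never seen.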


Arguably, the model that kick-started this trend was the Bidirectional Encoder Representations from Transformers (BERT) model. BERT is a transformer-based machine learning technique for pre-training developed by Google. The model takes input sentences where some words are masked out, and the task is to predict the masked words. What really set BERT apart was the ease of fine-tuning: it is designed so that it can be adapted easily to lots of different tasks. You can download BERT pre-trained on a large English corpus like the BooksCorpus, add a task-specific “head” onto BERT to create a new architecture for your task, and then fine-tune it on your labelled data. This approach has led to huge improvements over the previous state of the art, providing a convenient off-the-shelf solution to standard problems.


Let’s look at some other ways of using pre-trained language models:




Recently, researchers realised that an alternative paradigm would be to make the final task look more like language modelling. This would mean that a fine-tuning step is no longer needed. It would also mean that we’re potentially able to perform new downstream tasks with little or no labelled data. This paradigm is called pre-train and prompt.



The first pre-train and prompt paper, which showed the potential of this approach, was published in 2020 by Google (Raffel et al. 2020). They suggested a unified approach to transfer learning in Natural Language Processing with the goal of setting a new state of the art in the field. To this end, they treated every NLP problem as a “text-to-text” problem. Such a framework allows using the same model, objective, training procedure, and decoding process for different tasks, including summarisation, sentiment analysis, question answering, and machine translation. The researchers call their model a Text-to-Text Transfer Transformer (T5) and train it on a large corpus of web-scraped data to get state-of-the-art results on several NLP tasks. All NLP tasks are made text-to-text by selecting appropriate prompts, so that the pre-trained language model itself can be used to predict the desired output, sometimes even without any additional task-specific training. This allows few-shot (learning from only a few examples of labelled data) and even zero-shot (generalising to a new task with no examples of labelled data) behaviour.
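As a rough illustration of the text-to-text idea, the sketch below casts examples from different tasks as plain (input text, output text) pairs. The prefixes follow the style of those reported in the T5 paper, but the dictionary keys, the helper name `to_text_to_text` and the example sentences are our own assumptions, and no model is involved here.

```python
# Task prefixes in the style used by T5; the dictionary keys are our own naming.
TASK_PREFIXES = {
    "translation": "translate English to German: ",
    "summarisation": "summarize: ",
    "sentiment": "sst2 sentence: ",
}


def to_text_to_text(task, source, target):
    """Cast one labelled example as an (input text, output text) pair."""
    return TASK_PREFIXES[task] + source, target


# Very different tasks now share one format, so one model can serve them all.
pairs = [
    to_text_to_text("translation", "That is good.", "Das ist gut."),
    to_text_to_text("sentiment", "The adviser was very helpful.", "positive"),
]
```

Because both the input and the output are just strings, the same model, loss and decoding procedure can be reused for every task.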




In this example, a prompting function turns the input text into a sentence containing a slot, Z, that the language model needs to fill. In this case, we would expect Z to be a word expressing positive sentiment. This allows us to use the language model directly for a specific task, here sentiment detection.
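A prompting function of this kind can be sketched in a few lines. Everything here is an illustrative assumption: the template wording, the `[X]` and `[Z]` placeholders, the answer-to-label mapping, and in particular `toy_score`, which is a hypothetical stand-in for the score a real language model would assign to the filled-in sentence.

```python
def prompting_function(x, template="[X] Overall, it was a [Z] call."):
    """Insert the input text into the template, leaving the answer slot [Z] open."""
    return template.replace("[X]", x)


# Answer words the model may place in the slot, mapped to task labels.
ANSWER_TO_LABEL = {
    "great": "positive",
    "good": "positive",
    "terrible": "negative",
    "bad": "negative",
}


def classify(x, score_filled_prompt):
    """Pick the answer word whose filled prompt receives the highest score."""
    prompt = prompting_function(x)
    best = max(
        ANSWER_TO_LABEL,
        key=lambda z: score_filled_prompt(prompt.replace("[Z]", z)),
    )
    return ANSWER_TO_LABEL[best]


def toy_score(text):
    """Hypothetical stand-in for a language model: rewards consistent sentiment."""
    positive = {"helpful", "great", "good"}
    negative = {"rude", "terrible", "bad"}
    words = set(text.lower().replace(".", " ").replace(",", " ").split())
    return abs(len(words & positive) - len(words & negative))


label = classify("The agent was helpful.", toy_score)
```

In practice `score_filled_prompt` would be replaced by a call to a pre-trained language model, and no task-specific training would be needed for this zero-shot use.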


There are many different possible tasks that language models can perform: 



Here we can see examples of different prompts for different tasks. T5 was applied to several benchmarks and surpassed previous state-of-the-art results across many individual Natural Language Processing tasks. T5 sparked great interest in prompting, and since then various improvements and challenges have been identified.


Finding good prompts is difficult, and recent work has focussed on finding them automatically. Another active research question is how and when to train a model with prompts. 


Working with large language models is also challenging. The size of large language models has been increasing 10x every year for the last few years. This road leads to diminishing returns, higher costs, more complexity, and new risks. Downsizing efforts are therefore also underway in the Natural Language Processing community, using transfer learning techniques such as knowledge distillation, in which a smaller student model is trained to learn from the original model. This student model can then be used for more efficient inference, e.g. DistilBERT (Sanh et al. 2019).
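The core of knowledge distillation can be sketched as a loss that pulls the student's output distribution towards the teacher's. The sketch below shows only this soft-target term, with the commonly used temperature softening; it is not the full DistilBERT training objective, and the function names and default temperature are our own assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn logits into probabilities; a higher temperature gives a softer output."""
    scaled = [z / temperature for z in logits]
    largest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - largest) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


teacher = [1.0, 2.0, 3.0]
untrained_student = [0.0, 0.0, 0.0]
improved_student = [1.0, 2.0, 2.5]
loss_before = distillation_loss(untrained_student, teacher)
loss_after = distillation_loss(improved_student, teacher)
```

The loss is zero when the student reproduces the teacher's distribution exactly and shrinks as the student's logits move closer to the teacher's, which is the signal used to train the smaller model.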


Data Augmentation


Data augmentation is a set of techniques to artificially increase the amount of data by generating new copies from existing data. This includes making small changes to data or using deep learning models to generate new data points. Data augmentation techniques make machine learning models more robust by creating variations that the model may see in the real world. It is widely used in image processing, but augmenting textual data is more difficult due to the complexity and structure of language. Common methods for data augmentation in Natural Language Processing are:


Token level augmentation 


Takes existing data and creates new examples by adding variety at the word level. Common augmentations would be synonym replacement, word insertion, word swap and word deletion. 
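The four word-level operations can be sketched as follows. The synonym lexicon is a hypothetical toy dictionary (a real system might draw on a thesaurus or word embeddings), and the function names and deletion probability are our own assumptions.

```python
import random

# Hypothetical toy synonym lexicon, for illustration only.
SYNONYMS = {"happy": ["pleased", "satisfied"], "call": ["conversation"]}


def synonym_replacement(tokens, rng):
    """Replace one word that has a known synonym."""
    out = tokens[:]
    candidates = [i for i, t in enumerate(tokens) if t in SYNONYMS]
    if candidates:
        i = rng.choice(candidates)
        out[i] = rng.choice(SYNONYMS[tokens[i]])
    return out


def random_insertion(tokens, rng):
    """Insert a synonym of one of the sentence's words at a random position."""
    out = tokens[:]
    candidates = [t for t in tokens if t in SYNONYMS]
    if candidates:
        word = rng.choice(candidates)
        out.insert(rng.randrange(len(out) + 1), rng.choice(SYNONYMS[word]))
    return out


def random_swap(tokens, rng):
    """Swap the words at two randomly chosen positions."""
    out = tokens[:]
    if len(out) >= 2:
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(tokens, rng, p=0.2):
    """Drop each word with probability p, always keeping at least one word."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() > p]
    return kept or [rng.choice(tokens)]


rng = random.Random(0)  # fixed seed so the augmentations are reproducible
tokens = "the customer was happy with the call".split()
augmented = [
    synonym_replacement(tokens, rng),
    random_insertion(tokens, rng),
    random_swap(tokens, rng),
    random_deletion(tokens, rng),
]
```

Each augmented sentence keeps the label of the original, so a handful of labelled examples can be turned into many slightly different ones. Aggressive settings can change the meaning of a sentence, so the amount of noise usually needs tuning.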


Sentence level augmentation 


Takes existing data and creates new examples by replacing whole sentences. A popular method here is back translation, where, for example, an English sentence is translated into German and then translated back into English. Another method is applying paraphrasing models to the original texts. Of particular interest to us, in the context of prompting, is that we can use a large pretrained language model to generate new examples from prompts built from existing instances. GPT3Mix (Yoo et al. 2021) is a prompt-based approach that doesn’t require fine-tuning: a prompt is constructed using a few sample sentences from the task-specific training data as well as a description of the data. A large pretrained language model (GPT-3) then generates new sentences influenced by the sample sentences.



Here we show an example taken from their paper on automatically generating training data for the sentiment detection task. The authors report a substantial improvement over baselines such as back translation. 
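The prompt construction step can be sketched as below. This is loosely modelled on the idea described above and is not the exact template from the GPT3Mix paper; the wording, the function names and the example reviews are our own assumptions, and the call to the language model itself is left out.

```python
import re


def build_augmentation_prompt(examples, text_type="customer review",
                              label_type="sentiment"):
    """Build a prompt from a data description plus a few labelled samples."""
    labels = sorted({label for _, label in examples})
    header = (
        f"Each item in the following list contains a {text_type} and the "
        f"respective {label_type}. {label_type.capitalize()} is one of: "
        + ", ".join(labels) + "."
    )
    lines = [
        f"{text_type.capitalize()}: {text} ({label_type.capitalize()}: {label})"
        for text, label in examples
    ]
    # Leave the last item open so the language model writes a new example.
    lines.append(f"{text_type.capitalize()}:")
    return header + "\n" + "\n".join(lines)


def parse_generated(completion, label_type="sentiment"):
    """Split a generated 'text (Sentiment: label)' line into (text, label)."""
    match = re.match(
        rf"\s*(.+?)\s*\({re.escape(label_type)}: (\w+)\)",
        completion,
        flags=re.IGNORECASE,
    )
    return (match.group(1), match.group(2).lower()) if match else None


examples = [
    ("The adviser explained everything clearly.", "positive"),
    ("I was kept on hold for an hour.", "negative"),
]
prompt = build_augmentation_prompt(examples)
```

The text that the language model produces after the final open item can be passed to `parse_generated` and, if it parses, added to the training set as a synthetic example.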


Aveni’s perspective


At Aveni Labs, we’re experimenting with and leveraging these approaches to produce models that can be trained using very little labelled data. We use prompting to create more labelled data, and use data augmentation to expand our labelled dataset. Our expert understanding of these methods means that we can deliver production-ready models with far less data than was previously possible for machine learning solutions. Instead of needing 1,000s of training examples, we can make classifiers that work with only 100s, or in some cases even 10s, of real training examples. With this approach, we’re able to build superior models that need less human supervision, have excellent transcription accuracy, and offer greater functionality, for example, accurately identifying vulnerable customers.


We work at the forefront of Artificial Intelligence and Natural Language Processing. Our world-class NLP engineers have employed these techniques and approaches to build our product, Aveni Detect, which lets you analyse 100% of customer interactions to power business improvement. Learn more here

