How low-resource Natural Language Processing is making Speech Analytics accessible to industry

Written on
by Lexi Birch

Natural Language Processing (NLP) is a subset of artificial intelligence that enables machines to understand, process and analyse natural language in the way that humans do. The machine analyses the data, interprets it, measures sentiment and draws the intended inference from it. The data used for Natural Language Processing (and other forms of machine learning) may be labelled. Labelled data is data with predefined tags that provide information the machine can learn from. Learning from such data is called supervised learning. A simple example of labelled data is customer bio data with labels indicating that the strings containing an ‘@’ symbol are email addresses, the two-digit numbers are ages, the images are passport photos, and so on. With unlabelled data, there are no such tags, and the machine has to categorise or cluster the data by finding attributes with similar patterns. This process is known as unsupervised learning.

 

Natural Language Processing has achieved remarkable progress in the past decade on the basis of neural models. Using large amounts of labelled data can help achieve state-of-the-art performance for tasks such as sentiment detection, Named Entity Recognition (NER), Natural Language Inference (NLI) or question-answering. For these tasks, the labels or tags would be, for example, the sentiment of a review, or the people, places or organisations mentioned in the text. However, the dependence on labelled data prevents NLP models from being applied to low-resource settings, because of the time, money, and expertise that are often required to label large amounts of textual data.

 

We’re going to take a look at recent advances in NLP, which allow deep learning models to learn from very few examples. This is crucial for speech analytics where labelled examples are often in very short supply. 

 

Pre-train and Fine-tune

 

In the last few years, a new paradigm – pre-train and fine-tune – has emerged, which allows us to leverage large amounts of unlabelled data for NLP. The premise is that perhaps it’s better to first learn to model the language itself, then once we have a model that understands the language, we can share this knowledge with the many different tasks we care about by fine-tuning it on small amounts of labelled data. Language modelling is a machine learning task where the model needs to learn how to predict a missing word given the context of the rest of the sentence. This is a generic task with abundant naturally occurring data and can be used to pre-train such a generic model.

 

Arguably, the model that kick-started this trend was the Bidirectional Encoder Representations from Transformers (BERT) model. BERT is a transformer-based machine learning technique for pre-training developed by Google. The model takes input sentences where some words are masked out, and the task is to predict the masked words. What really set BERT apart was the ease of fine-tuning: it is designed so that this is easy to do for lots of different tasks. You can download BERT pre-trained on a large English corpus like the BooksCorpus, add a task-specific “head” onto BERT to create a new architecture for your task, and then fine-tune it on your labelled data. This approach has led to huge improvements over the previous state of the art, providing a nice off-the-shelf solution to standard problems.
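To make the masked-word task concrete, here is a minimal sketch of how training examples for it could be built from raw text. It is a simplification under stated assumptions: the function name `mask_tokens`, the whitespace tokenisation and the example sentence are ours, and every selected word is simply replaced by `[MASK]` (BERT's actual recipe also sometimes swaps in a random word or leaves the word unchanged).

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, mask_prob=0.15, seed=None):
    """Hide a random subset of tokens.

    Returns (masked_tokens, targets), where targets maps each masked
    position to the original word the model would have to predict.
    Simplified: every selected token becomes [MASK].
    """
    rng = random.Random(seed)
    masked, targets = [], {}
    for position, token in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(MASK)
            targets[position] = token
        else:
            masked.append(token)
    return masked, targets

# Illustrative sentence; no labels are needed, only raw text.
sentence = "the customer called to update their home address".split()
masked, targets = mask_tokens(sentence, mask_prob=0.3, seed=0)
print(" ".join(masked))
print(targets)
```

The point of the sketch is that the "labels" (the hidden words) come for free from the text itself, which is why this kind of pre-training can use unlabelled data.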

 

Let’s look at some other ways of using pre-trained language models:

 

Prompting

 

Recently, researchers realised that an alternative paradigm would be to make the final task look more like language modelling. This would mean a fine-tuning step isn’t needed. It would also mean that we’re potentially able to perform new downstream tasks with little or no labelled data. This paradigm is called pre-train and prompt.

 

 

The first pre-train and prompt paper, which showed the potential of this approach, was published in 2020 by Google (Raffel et al. 2020). They suggested a unified approach to transfer learning in Natural Language Processing with the goal of setting a new state-of-the-art in the field. To this end, they treated all NLP problems as a “text-to-text” problem. Such a framework allows using the same model, objective, training procedure, and decoding process for different tasks, including summarisation, sentiment analysis, question answering, and machine translation. The researchers call their model a Text-to-Text Transfer Transformer (T5) and train it on a large corpus of web-scraped data to get state-of-the-art results on several NLP tasks. The way to make all NLP tasks text-to-text is by selecting the appropriate prompts. With the right prompt, the pre-trained language model (LM) itself can be used to predict the desired output, sometimes even without any additional task-specific training. This allows few-shot (learning from only a few examples of labelled data) and even zero-shot (generalising to a new task with no examples of labelled data) behaviour.

 

 

 

In this example, a prompting function turns the input into a sentence containing a slot, Z, that the language model needs to fill in; in this case, we would expect Z to be a word expressing positive sentiment. This allows us to use the language model directly for a specific task, here sentiment detection.
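A prompting function of this kind can be sketched in a few lines. Everything named here is an assumption for illustration: the template wording, the label words, and in particular `toy_score`, which is only a stand-in for asking a real language model how likely a sentence is.

```python
# Hypothetical template: {x} is the input text, {z} is the slot to fill.
TEMPLATE = "{x} Overall, it was a {z} experience."

# Hypothetical mapping from candidate slot words to task labels.
LABEL_WORDS = {
    "great": "positive",
    "good": "positive",
    "terrible": "negative",
    "bad": "negative",
}

def prompt(x, slot="[Z]"):
    """Wrap the input text in the template, leaving the slot open."""
    return TEMPLATE.format(x=x, z=slot)

def classify(x, score_fn):
    """Fill the slot with each candidate word and keep the best-scoring one.

    score_fn(text) -> number, higher meaning "more plausible text".
    In practice this would be a language model's likelihood.
    """
    best_word = max(LABEL_WORDS, key=lambda w: score_fn(TEMPLATE.format(x=x, z=w)))
    return LABEL_WORDS[best_word]

def toy_score(text):
    """Stand-in scorer: rewards text whose sentiment cue words agree."""
    positive = {"helpful", "great", "good", "love"}
    negative = {"rude", "terrible", "bad", "hate"}
    words = {w.strip(".,!").lower() for w in text.split()}
    return abs(len(words & positive) - len(words & negative))

print(prompt("The adviser was really helpful."))
print(classify("The adviser was really helpful.", toy_score))
print(classify("The adviser was rude.", toy_score))
```

No classifier is trained here: the task is rephrased so that picking the most plausible word for the slot is the classification.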

 

There are many different possible tasks that language models can perform: 

 

 

Here we can see examples of different prompts for different tasks. T5 was applied to several benchmarks and surpassed previous state-of-the-art results across many different individual Natural Language Processing tasks. T5 generated great interest in prompting, and since then various improvements and challenges have been identified.
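The text-to-text idea can be illustrated with a small sketch in which every task, whatever its type, is reduced to an input string and a target string. The helper `make_example` and the example sentences are ours, and the prefix strings are illustrative rather than a definitive list of the prefixes used to train T5.

```python
def make_example(task_prefix, source_text, target_text):
    """Cast any task as 'text in, text out' by prepending a task prefix."""
    return {"input": f"{task_prefix}: {source_text}", "target": target_text}

# Three different tasks, one shared format (all values are plain strings).
examples = [
    make_example("translate English to German", "That is good.", "Das ist gut."),
    make_example("summarize", "<long article text>", "<short summary>"),
    make_example("sentiment", "The agent was very helpful.", "positive"),
]

for example in examples:
    print(example["input"], "->", example["target"])
```

Because inputs and outputs are always text, a single model and a single training objective can cover all of these tasks.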

 

Finding good prompts is difficult, and recent work has focussed on finding them automatically. Another active research question is how and when to train a model with prompts. 

 

Working with large language models is also challenging. Large language model size has been increasing 10x every year for the last few years. This road leads to diminishing returns, higher costs, more complexity, and new risks. Downsizing efforts are also underway in the Natural Language Processing community, using transfer learning techniques such as knowledge distillation, which trains a smaller student model to learn from the original (teacher) model. The student model can then be used for more efficient inference, e.g. DistilBERT (Sanh et al. 2019).
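As a rough illustration of how a student can "learn from" a teacher, the sketch below computes a soft-target loss of the kind commonly used in knowledge distillation: the student is penalised when its output distribution differs from the teacher's. This is a generic sketch, not necessarily the exact objective used for DistilBERT; the function names, the temperature value and the logits are assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature softens them."""
    scaled = [value / temperature for value in logits]
    largest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(value - largest) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's."""
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    return sum(p * math.log(p / q) for p, q in zip(teacher, student) if p > 0)

teacher_logits = [1.0, 2.0, 3.0]  # illustrative scores for three classes
print(distillation_loss([1.0, 2.0, 3.0], teacher_logits))  # student matches teacher
print(distillation_loss([3.0, 2.0, 1.0], teacher_logits))  # student disagrees
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge, so minimising it pushes the smaller model towards the larger model's behaviour.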

 

Data Augmentation

 

Data augmentation is a set of techniques to artificially increase the amount of data by generating new, modified copies of existing data. This includes making small changes to data or using deep learning models to generate new data points. Data augmentation techniques make machine learning models more robust by creating variations that the model may see in the real world. It is widely used in image processing, but augmenting textual data is more difficult due to the complexity and structure of language. Common methods for data augmentation in Natural Language Processing are:

 

Token level augmentation 

 

Takes existing data and creates new examples by adding variety at the word level. Common augmentations would be synonym replacement, word insertion, word swap and word deletion. 
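The four word-level operations can be sketched as below. This is a toy version under stated assumptions: the `SYNONYMS` lexicon is a tiny hand-made dictionary (a real system would draw on a thesaurus or word embeddings), and the function names and example sentence are ours.

```python
import random

# Hypothetical toy lexicon, for illustration only.
SYNONYMS = {"quick": ["fast", "rapid"], "happy": ["pleased", "glad"]}

def synonym_replace(tokens, rng):
    """Replace every word that has an entry in the lexicon with a synonym."""
    return [rng.choice(SYNONYMS[t]) if t in SYNONYMS else t for t in tokens]

def random_insert(tokens, rng):
    """Insert a synonym of a randomly chosen known word at a random position."""
    out = list(tokens)
    candidates = [t for t in tokens if t in SYNONYMS]
    if candidates:
        new_word = rng.choice(SYNONYMS[rng.choice(candidates)])
        out.insert(rng.randrange(len(out) + 1), new_word)
    return out

def random_swap(tokens, rng):
    """Swap the positions of two randomly chosen words."""
    out = list(tokens)
    if len(out) >= 2:
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out

def random_delete(tokens, rng, p=0.1):
    """Drop each word with probability p, always keeping at least one word."""
    kept = [t for t in tokens if rng.random() >= p]
    if not kept and tokens:
        kept = [rng.choice(tokens)]
    return kept

rng = random.Random(0)
tokens = "the customer was happy with the quick reply".split()
for augment in (synonym_replace, random_insert, random_swap, random_delete):
    print(augment.__name__, "->", " ".join(augment(tokens, rng)))
```

Each call yields a slightly different variant of the original sentence that keeps the same label, which is what makes the new examples usable as extra training data.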

 

Sentence level augmentation 

 

Takes existing data and creates new examples by replacing whole sentences. A popular method here is back translation, where, for example, an English sentence is translated into German and then translated back into English. Another method is applying paraphrasing models to the original texts. Of particular interest to us, in the context of prompting, is that we can use a large pretrained language model to generate new examples from prompts built from existing instances. GPT3Mix (Yoo et al. 2021) is a prompt-based approach that doesn’t require fine-tuning: a prompt is constructed using a few sample sentences from the task-specific training data as well as the data description. Then a large pretrained language model (GPT3) generates new sentences influenced by the sample sentences.

 

 

Here we show an example taken from their paper on automatically generating training data for the sentiment detection task. The authors report a substantial improvement over baselines such as back translation. 
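To give a feel for how a prompt of this kind could be assembled, here is a minimal sketch. It is not the exact template from the GPT3Mix paper: the wording, the function name `build_augmentation_prompt` and the sample reviews are hypothetical, and the step of sending the prompt to a language model is left out.

```python
def build_augmentation_prompt(description, examples):
    """Combine a task description and a few labelled samples into one prompt.

    The prompt ends with an open 'Text:' line so that a language model
    would continue it with a new example and its label.
    """
    lines = [description, ""]
    for text, label in examples:
        lines.append(f"Text: {text} (Sentiment: {label})")
    lines.append("Text:")
    return "\n".join(lines)

# Hypothetical samples drawn from a small labelled training set.
samples = [
    ("The adviser explained everything clearly.", "positive"),
    ("I waited forty minutes and nobody answered.", "negative"),
]
prompt_text = build_augmentation_prompt(
    "Each item below is a customer comment with its sentiment.", samples
)
print(prompt_text)
```

Whatever the model writes after the final `Text:` can then be parsed into a new sentence and label and added to the training set.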

 

Aveni’s perspective

 

At Aveni Labs, we’re experimenting with and leveraging these approaches to produce models that can be trained using very little labelled data. We use prompting to create more labelled data, and use data augmentation to expand our labelled dataset. Our expert understanding of these methods means that we can deliver production-ready models with far less data than was previously possible for machine learning solutions. Instead of needing thousands of training examples, we can make classifiers that work with only hundreds, or in some cases even tens, of real training examples. With this approach, we’re able to build superior models that need less human supervision, have excellent transcription accuracy, and offer greater functionality, for example accurately identifying vulnerable customers.

 

We work at the forefront of Artificial Intelligence and Natural Language Processing. Our world-class NLP engineers have employed these techniques and approaches to build our product, Aveni Detect, which lets you analyse 100% of customer interactions to power business improvement.

 
