Dr Lexi Birch is one of the leading experts in natural language processing (NLP), a Senior Research Fellow at Edinburgh University’s School of Informatics and Head of Aveni Labs. She explains how the untapped power of NLP holds extraordinary transformative potential for the financial industry if applied in the right way.

We’ll begin with the basics: what is NLP? It’s a field of artificial intelligence that creates computational models which interact with human speech and writing. It’s an interdisciplinary field of study that draws from domains as diverse as computer science, machine learning, psychology, and linguistics.

What opportunities does NLP offer?

NLP has experienced a huge leap forward in the last four years as a result of advances in deep learning. Deep learning is a subset of machine learning where artificial neural networks, models inspired by the human brain, learn from large amounts of data. Deep learning models, which until recently were research projects, are becoming productised. For example, speech recognition transcribes your Skype conversations; machine translation lets you read emails, posts or websites written in over 100 languages; dialogue systems sit on our kitchen countertops in the form of an Echo, Google Home or even your mobile device. The potential applications of these technologies are still relatively untapped, especially for specialised industries. We need people who understand the financial industry, its gaps, inefficiencies and pain points, to look at the capabilities of natural language processing and to imagine what use it could be to their business.

How is NLP useful to finance?

The finance industry generates a large volume of unstructured data in the form of news, financial reports, tweets, recorded conversations and email communications. This data would be an immensely valuable resource if it were more accessible.

Being able to understand these sources, extract information and produce automated summaries and reports creates opportunities for greatly improving areas like market prediction, risk monitoring, financial reporting, and customer analysis. Another broad area of the financial industry which can be transformed by NLP is customer interaction: conversational analytics can make interactions more efficient to process and help ensure a uniformly high quality of service.

What can NLP do?

Detecting Material Information

The most studied application of NLP in finance has been to monitor news and financial reports to predict stock market prices. Events such as further outbreaks of coronavirus, flooding, or company losses can be extracted from news articles or financial reports.

Monitoring other kinds of data sources for critical information has been less well explored. For example, customer conversations can be monitored for risks. Banks and financial institutions monitor customer conversations to make sure that customers are sold the right product, that their rights are protected and that any complaints are correctly handled.

Extracting Information

Tools based on information extraction are useful for trying to classify, reduce and summarise huge amounts of financial information. This often means extracting named entities such as “Elon Musk”, “Tesla” or “Australia” and the relationships between them, and then updating knowledge bases or CRMs with this information.
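As a minimal sketch of this idea, the Python below looks up known names in a small hand-written list (a gazetteer) and records which entities appear together in a sentence. The gazetteer, the entity types, the example sentence and the dictionary standing in for a knowledge base are all assumptions made for illustration; production systems typically rely on trained entity and relation extraction models rather than exact string matching.

```python
import re
from itertools import combinations

# Hypothetical gazetteer: known entity names mapped to an assumed type.
GAZETTEER = {
    "Elon Musk": "PERSON",
    "Tesla": "ORGANISATION",
    "Australia": "LOCATION",
}


def extract_entities(sentence):
    """Return (name, type) pairs for gazetteer entries found in the sentence,
    ordered by where they first appear."""
    found = []
    for name, entity_type in GAZETTEER.items():
        match = re.search(r"\b" + re.escape(name) + r"\b", sentence)
        if match:
            found.append((match.start(), name, entity_type))
    return [(name, entity_type) for _, name, entity_type in sorted(found)]


def update_knowledge_base(kb, sentence):
    """Record every pair of co-occurring entities as a candidate relationship.

    Co-occurrence is only a crude stand-in for real relation extraction.
    """
    entities = extract_entities(sentence)
    for (first, _), (second, _) in combinations(entities, 2):
        kb.setdefault(first, set()).add(second)
        kb.setdefault(second, set()).add(first)
    return entities


# The knowledge base here is just a dict: entity name -> related entity names.
kb = {}
# Invented example sentence, not taken from a real news source.
entities = update_knowledge_base(kb, "Elon Musk said Tesla will expand in Australia.")
```

After running this, `entities` holds the three names with their assumed types, and `kb["Tesla"]` contains both "Elon Musk" and "Australia". A CRM update would follow the same pattern, with a database write in place of the dictionary.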

Understanding Sentiment

Another common application of NLP in finance is measuring document sentiment or tone. Sentiment analysis is often applied to customer materials such as reviews and survey responses, where sentiment can be positive, negative or neutral. Sentiment can be predicted by using the occurrence of words such as “weak”, “cuts” or “unhappy”, but more complex examples are harder to get right. For example, “Markets bounced back from weak growth in the last quarter” expresses both positive and negative sentiment with relation to the entity “markets”.
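To make the word-occurrence approach concrete, here is a rough sketch in Python. The two word lists are tiny and hypothetical (only “weak”, “cuts” and “unhappy” come from the text above), and the scoring rule, a simple count of positive against negative words, is an assumption rather than a recommended method.

```python
import re

# Hypothetical word lists for illustration; real lexicons are far larger.
NEGATIVE_WORDS = {"weak", "cuts", "unhappy"}
POSITIVE_WORDS = {"strong", "gains", "bounced", "happy"}


def lexicon_sentiment(text):
    """Count positive and negative words and return (label, positives, negatives)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    positives = sum(token in POSITIVE_WORDS for token in tokens)
    negatives = sum(token in NEGATIVE_WORDS for token in tokens)
    if positives > negatives:
        label = "positive"
    elif negatives > positives:
        label = "negative"
    else:
        label = "neutral"
    return label, positives, negatives


simple = lexicon_sentiment("Customers are unhappy about the cuts")
mixed = lexicon_sentiment("Markets bounced back from weak growth in the last quarter")
```

With these lists, the first sentence is labelled negative, while the markets sentence contains one positive and one negative word and comes out as neutral. The counter has no way of knowing that the sentence describes a recovery, which is exactly the kind of case that simple word matching gets wrong.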

Modelling Topics

An important task in NLP is predicting topic or topic structure. Labelling documents with topics like “utilities”, “banking” or “automobile” can help to filter relevant information. Topic structure, or segmenting documents and conversations into sections, is often essential for other downstream tasks like information extraction. If you are in the authentication part of a conversation, then you will know that an address refers to the customer address, but if it occurs elsewhere, it could be referring to a branch of the bank.
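Both ideas can be sketched in a few lines of Python. The keyword sets, the function names and the segment labels below are hypothetical, and keyword matching is used only to keep the example self-contained; in practice topics and segments would usually be predicted by trained models.

```python
import re

# Hypothetical keyword sets for the three example topics.
TOPIC_KEYWORDS = {
    "utilities": {"electricity", "water", "gas", "grid"},
    "banking": {"bank", "loan", "mortgage", "deposit"},
    "automobile": {"car", "vehicle", "automaker", "dealership"},
}


def label_topics(text):
    """Return, in alphabetical order, every topic whose keywords appear in the text."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(topic for topic, words in TOPIC_KEYWORDS.items() if tokens & words)


def interpret_address(segment_label):
    """Use the conversation segment to decide what an address mention refers to."""
    if segment_label == "authentication":
        return "customer address"
    # Outside authentication the mention stays ambiguous, e.g. it may be a branch.
    return "unresolved (possibly a branch address)"


banking_only = label_topics("The bank raised mortgage rates")
two_topics = label_topics("The automaker secured a loan from its bank")
```

Here `banking_only` contains just "banking" and `two_topics` contains both "automobile" and "banking", so a filter can keep or drop documents by label. `interpret_address("authentication")` returns "customer address", while any other segment label leaves the mention unresolved for a later step to handle.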

Multilingual Capabilities

In an increasingly globalised world, the ability to apply NLP methods cross-lingually is becoming more important. Market-moving information is very likely to come from other languages such as Chinese or Russian. To leverage text from different languages, you can translate documents into English and then apply English natural language processing tools. Leveraging labelled English data to train tools that work directly in other languages is an active area of research.

Understanding Speech

Working directly with audio recordings instead of written text is challenging, but also presents some exciting opportunities. Processing speech is more costly than text and can introduce transcription errors. However, it can also be used to monitor events in real time. Many global public companies hold earnings conference calls, which are the timeliest source of financial results. Financial institutions could monitor their customer calls in real time to prevent risks from occurring, rather than detecting them after the fact. Voice also gives us access to people’s emotional state and their tone, which can give us a deeper understanding of the speaker’s intentions.


Natural language generation applications have traditionally produced natural language descriptions of structured data or database entries. Over the last couple of years, models which generate from written documents have improved enormously and can be used to generate summaries or reviews. Generation is also an important part of dialogue systems, formulating questions or responses to the user’s utterances.


There are a number of challenges to successfully deploying NLP technology.  These include:


Audio and text data can be very large and thus require significant investments in storage and computation. This can be costly and difficult to manage.

Natural Language

The nature of human language is that it is diverse, noisy and ambiguous. Language written by a journalist is very different to speech between an advisor and her client. Tweets and emails can contain misspellings and emoticons. But perhaps the most challenging aspect is the subtlety and ambiguity of language itself, which depends critically on the context and the intentions of the speaker and their audience. All these phenomena make processing language very challenging.

The Talent War

Perhaps the biggest challenge is talent.  To make the best use of unstructured data, experts from the fields of computational linguistics, machine learning and computer science need to be hired.  There is huge demand for people with these skills.

At Aveni, we believe NLP is an extremely exciting research area in finance due to the number of compelling unsolved problems it can tackle. When you combine this potential with the fact that deep learning has recently opened up a new frontier for NLP, delivering new levels of performance across a wide range of applications, the case for pursuing NLP in finance is strong.