Highlights from top NLP conference: Empirical Methods in Natural Language Processing (EMNLP) 2020

Written on
by Barry Haddow

Last month Aveni Labs, represented by Lexi Birch and Barry Haddow, presented their work at one of the top conferences in the field of natural language processing (NLP).  From silent speech to Transformers, we get the lowdown on some of the newest, mind-bending innovations in NLP from Barry:

 

The conference, EMNLP (Empirical Methods in Natural Language Processing), was originally meant to take place under the Caribbean sunshine in the Dominican Republic. But due to COVID-19 restrictions, the conference was fully online, with grey, autumnal Edinburgh as the alternative backdrop. At least it saved on the air travel and meant we could still help with the school run.

 

So what happens at an event like EMNLP? Well, the conference is organised around papers, each of which has a slot for presentation, with questions and answers. We co-authored three of the 750 or so papers, all of which were chosen by a four-month process of peer review from a collection of over 3,000 initial submissions. EMNLP covers all aspects of automating the processing of human language: some papers push the state of the art in performance, others deepen our understanding of why some methods work (or not), whilst others introduce new tasks or techniques. So what does a typical paper look like? Let’s survey the top-rated papers to get an idea.

 

The best paper award itself went to two researchers from California for “Digital Voicing of Silent Speech”. This caught my interest, because it addressed a problem that I had never thought of before. So what’s it about? Let’s suppose that someone is “speaking”, but not making any sound, just mouthing their words. Can we develop a system to predict the sounds that they would be making? Since we have to stick electrodes on their face, it’s no use for surveillance, but could be applied for silent communication. The solution involved some clever data collection and modelling techniques, and the samples in the presentation sounded surprisingly good.  

 

The first of the honourable mentions is on chatbots, a much more familiar topic. It’s not about how to make a better chatbot though, but how to tell which is the best chatbot. But isn’t that simple? You just get a bunch of people to chat to the chatbots, then tell you which one they liked best. Well yes, but the authors of this paper came up with a better idea. You get people to follow bot-bot, human-bot and human-human conversations, and see how long it takes them to “Spot the bot …”. The longer it takes, the better the bot. It turns out that this is a more reliable and efficient way of judging bots.

 

The next honourable mention (“If Beam Search is the Answer, What was the Question?”) is much less accessible to those outside the field. It’s an important contribution though, as “beam search” is used a lot behind the scenes. For example, in machine translation (i.e. every time you run Google Translate), beam search is the method for searching through the myriad of possible translations: rather than considering every possibility, it builds translations word by word and keeps only a handful of the most promising partial translations at each step. This paper starts with some observations about the surprising effectiveness of beam search and provides an explanation … which will ultimately help us come up with better methods.
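For the curious, here is a minimal sketch of beam search in Python. It is not taken from the paper or from any real translation system: the `next_token_probs` table is a made-up toy "model", and the names `beam_search`, `beam_size` and `END` are chosen purely for illustration. It keeps the `beam_size` best partial sentences at each step and extends each of them by one word.

```python
import math

END = "</s>"  # hypothetical end-of-sentence marker


def next_token_probs(prefix):
    """Toy stand-in for a real model: next-word probabilities for a prefix."""
    table = {
        (): {"the": 0.6, "a": 0.4},
        ("the",): {"cat": 0.4, "dog": 0.35, "bird": 0.25},
        ("a",): {"cat": 0.9, "dog": 0.1},
    }
    # Any prefix not listed above simply ends the sentence.
    return table.get(prefix, {END: 1.0})


def beam_search(beam_size, max_len=5):
    """Return (tokens, log_probability) of the best finished hypothesis found."""
    beams = [((), 0.0)]  # unfinished hypotheses: (tokens so far, log probability)
    finished = []
    for _ in range(max_len):
        # Extend every hypothesis in the beam by every possible next word.
        candidates = []
        for tokens, score in beams:
            for token, prob in next_token_probs(tokens).items():
                candidates.append((tokens + (token,), score + math.log(prob)))
        # Keep only the beam_size highest-scoring candidates.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_size]:
            if tokens[-1] == END:
                finished.append((tokens, score))
            else:
                beams.append((tokens, score))
        if not beams:
            break
    # Fall back to unfinished hypotheses if max_len was reached first.
    return max(finished or beams, key=lambda c: c[1])


greedy_tokens, _ = beam_search(beam_size=1)
wider_tokens, _ = beam_search(beam_size=2)
print(" ".join(greedy_tokens))
print(" ".join(wider_tokens))
```

With a beam of one (plain greedy search) the toy model commits to "the" straight away and ends up with a sentence of probability 0.24. With a beam of two it also keeps "a" alive and finds "a cat", which has probability 0.36. Real systems do the same thing over vocabularies of tens of thousands of words.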

 

EMNLP also includes “demo” papers, which describe an interesting piece of software. The award for the best demo paper went to “Transformers: State-of-the-Art Natural Language Processing”, from the US start-up Hugging Face, for their impressive open-source toolkit. Transformers are a useful but complex way of encoding a sentence as a collection of numbers, and since they appeared in 2017 they have been applied to all sorts of NLP tasks (machine translation, search, natural language understanding, etc.). The Hugging Face toolkit makes it straightforward for NLP engineers to use transformers in their own applications.

 

The last of the honourable mentions went to a data set paper, “GLUCOSE: GeneraLized and COntextualized Story Explanations”. Data and annotation of data are incredibly important in modern NLP. Data is the coal that feeds the machine, so it’s good to see data set creation being valued. The focus of GLUCOSE is commonsense reasoning: people’s ability to build a mental model of a story scenario from a series of events. This is not so easy for machines, so the makers of GLUCOSE use crowd-sourcing to build a huge set of these scenarios for the machines to learn from.

 

These were only the top-rated papers from EMNLP, but they illustrate the variety of paper types at the conference: from software releases to evaluation techniques, and from new problems to a deeper analysis of existing solutions. And of course, everything is driven by data.

 

This insight was given by Barry Haddow, head of NLP at Aveni. To find out more about Barry and what he does, visit this blog post 

Find out the latest news and insights from Aveni.  Follow us on LinkedIn and Twitter
