Highlights from top NLP conference: Empirical Methods in Natural Language Processing (EMNLP) 2020


Last month Aveni Labs, represented by Lexi Birch and Barry Haddow, presented its work at one of the top conferences in the field of natural language processing (NLP). From silent speech to Transformers, we get the lowdown on some of the newest, mind-bending innovations in NLP from Barry:


The conference, EMNLP (Empirical Methods in Natural Language Processing), was originally meant to take place under the Caribbean sunshine in the Dominican Republic. But due to COVID-19 restrictions, the conference was fully online, with grey, autumnal Edinburgh as the alternative backdrop. At least it saved on the air travel and meant we could still help with the school run.


So what happens at an event like EMNLP? Well, the conference is organised around papers, each of which has a slot for presentation, with questions and answers. We co-authored three of the 750 or so papers, all of which were chosen by a four-month process of peer review from a collection of over 3,000 initial submissions. EMNLP covers all aspects of automating the processing of human language: some papers push the state of the art in performance, others deepen our understanding of why some methods work (or not), whilst others introduce new tasks or techniques. So what does a typical paper look like? Let’s survey the top-rated papers to get an idea.


The best paper award itself went to two researchers from California for “Digital Voicing of Silent Speech”. This caught my interest, because it addressed a problem that I had never thought of before. So what’s it about? Let’s suppose that someone is “speaking”, but not making any sound, just mouthing their words. Can we develop a system to predict the sounds that they would be making? Since we have to stick electrodes on their face, it’s no use for surveillance, but it could be used for silent communication. The solution involved some clever data collection and modelling techniques, and the samples in the presentation sounded surprisingly good.


The first of the honourable mentions is on chatbots, a much more familiar topic. It’s not about how to make a better chatbot though, but how to tell which is the best chatbot. But isn’t that simple? You just get a bunch of people to chat to the chatbots, then tell you which one they liked best. Well yes, but the authors of this paper came up with a better idea. You get people to follow bot-bot, human-bot and human-human conversations, and see how long it takes them to “Spot the bot …”. The longer it takes, the better the bot. It turns out that this is a more reliable and efficient way of judging bots.
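To make the “longer is better” idea concrete, here is a minimal sketch in Python. It is not the paper’s actual method, just a simplified stand-in that ranks bots by the average number of turns before someone spots them. The bot names and the numbers are made up purely for illustration.

```python
from statistics import mean

# Made-up illustrative data (an assumption, not results from the paper):
# for each hypothetical bot, the number of conversation turns a reader
# followed before deciding "this is a bot".
turns_until_spotted = {
    "bot_a": [2, 3, 2, 4],
    "bot_b": [5, 6, 4, 7],
    "bot_c": [3, 3, 5, 4],
}


def rank_bots(observations):
    """Rank bots from best to worst by mean turns survived before being spotted."""
    averages = {bot: mean(turns) for bot, turns in observations.items()}
    # A higher average means the bot passed as human for longer.
    return sorted(averages, key=averages.get, reverse=True)


ranking = rank_bots(turns_until_spotted)
print(ranking)
```

In this toy data, the bot that survives longest on average comes out on top of the ranking.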


The next honourable mention (“If Beam Search is the Answer, What was the Question?”) is much less accessible to those outside the field. It’s an important contribution though, as “beam search” is used a lot behind the scenes. For example, in machine translation (i.e. every time you run Google Translate), beam search is the method for searching through the myriad of possible translations. This paper starts with some observations about the surprising effectiveness of beam search and provides an explanation … which will ultimately help us come up with better methods.
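For readers who have not met beam search before, here is a minimal sketch in Python. The idea is that at each step you keep only the few best partial sentences (the “beam”) rather than every possibility. The tiny `TOY_MODEL` table of next-word probabilities is entirely hypothetical; a real translation system would get these probabilities from a neural network.

```python
import math

# Hypothetical toy "model": probability of the next word given the previous one.
TOY_MODEL = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.55, "dog": 0.45},
    "a": {"cat": 0.9, "dog": 0.1},
    "cat": {"</s>": 1.0},
    "dog": {"</s>": 1.0},
}


def beam_search(model, beam_size, max_len=10):
    """Return finished (log-probability, words) hypotheses, best first."""
    beams = [(0.0, ["<s>"])]  # each beam entry is (log-probability, words so far)
    finished = []
    for _ in range(max_len):
        # Extend every hypothesis in the beam by every possible next word.
        candidates = []
        for score, words in beams:
            for word, prob in model[words[-1]].items():
                candidates.append((score + math.log(prob), words + [word]))
        # Keep only the `beam_size` highest-scoring candidates.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for cand in candidates[:beam_size]:
            if cand[1][-1] == "</s>":
                finished.append(cand)  # sentence is complete
            else:
                beams.append(cand)
        if not beams:
            break
    return sorted(finished, key=lambda c: c[0], reverse=True)


greedy = beam_search(TOY_MODEL, beam_size=1)[0]
wider = beam_search(TOY_MODEL, beam_size=2)[0]
print(greedy[1], math.exp(greedy[0]))
print(wider[1], math.exp(wider[0]))
```

With a beam of one, the search commits to “the” because it looks best at the first step, and ends up with a less probable sentence overall. With a beam of two, it keeps “a” around long enough to discover that “a cat” is the more probable sentence.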


EMNLP also includes “demo” papers, which describe an interesting piece of software. The award for the best demo paper went to “Transformers: State-of-the-Art Natural Language Processing”, from the US start-up Hugging Face, for its impressive open-source toolkit. Transformers are a useful but complex way of encoding a sentence as a collection of numbers, and since they appeared in 2017 they have been applied to all sorts of NLP tasks (machine translation, search, natural language understanding etc.). The Hugging Face toolkit makes it straightforward for NLP engineers to use transformers in their own applications.


The last of the honourable mentions went to a data set paper, “GLUCOSE: GeneraLized and COntextualized Story Explanations”. Data and its annotation are incredibly important in modern NLP (data is the coal that feeds the machine), so it’s good to see data set creation being valued. The focus of GLUCOSE is commonsense reasoning: people’s ability to build a mental model of a story scenario from a series of events. This is not so easy for machines, so the makers of GLUCOSE use crowd-sourcing to build a huge set of these scenarios for the machines to learn from.


These were the top-rated papers from EMNLP, and they illustrate the variety of paper types at the conference. They range from software releases to evaluation techniques, and from new problems to deeper analyses of existing solutions. And of course, everything is driven by data.


This insight was given by Barry Haddow, head of NLP at Aveni.

