Last month Aveni Labs, represented by Lexi Birch and Barry Haddow, presented their work at one of the top conferences in the field of natural language processing (NLP). From silent speech to Transformers, we get the lowdown on some of the newest, mind-bending innovations in NLP from Barry:
The conference, EMNLP (Empirical Methods in Natural Language Processing), was originally meant to take place under the Caribbean sunshine in the Dominican Republic. But due to COVID-19 restrictions, the conference was held fully online, with grey, autumnal Edinburgh as the alternative backdrop. At least it saved on the air travel and meant we could still help with the school run.
So what happens at an event like EMNLP? Well, the conference is organised around papers, each of which has a slot for presentation, with questions and answers. We co-authored three of the 750 or so papers, all of which were chosen through a four-month process of peer review from a collection of over 3,000 initial submissions. EMNLP covers all aspects of automating the processing of human language: some papers push the state of the art in performance, others deepen our understanding of why some methods work (or don’t), whilst others introduce new tasks or techniques. So what does a typical paper look like? Let’s survey the top-rated papers to get an idea.
The best paper award itself went to two researchers from California for “Digital Voicing of Silent Speech”. This caught my interest, because it addressed a problem that I had never thought of before. So what’s it about? Let’s suppose that someone is “speaking”, but not making any sound, just mouthing their words. Can we develop a system to predict the sounds that they would be making? Since we have to stick electrodes on their face, it’s no use for surveillance, but it could be applied to silent communication. The solution involved some clever data collection and modelling techniques, and the samples in the presentation sounded surprisingly good.
The first of the honourable mentions is on chatbots, a much more familiar topic. It’s not about how to make a better chatbot though, but how to tell which is the best chatbot. But isn’t that simple? You just get a bunch of people to chat to the chatbots, then tell you which one they liked best. Well yes, but the authors of this paper came up with a better idea. You get people to follow bot-bot, human-bot and human-human conversations, and see how long it takes them to “Spot the bot …”. The longer it takes, the better the bot. It turns out that this is a more reliable and efficient way of judging bots.
The next honourable mention (“If Beam Search is the Answer, What was the Question?”) is much less accessible to those outside the field. It’s an important contribution though, as “beam search” is used a lot behind the scenes. For example, in machine translation (e.g. every time you run Google Translate), beam search is the method for searching through the myriad possible translations. This paper starts with some observations about the surprising effectiveness of beam search and provides an explanation … which will ultimately help us come up with better methods.
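For readers who want a feel for what beam search actually does, here is a minimal sketch in Python. It is not the method from the paper or from any real translation system: the "model" is a made-up lookup table (`TOY_MODEL`), and the names `beam_search`, `toy_next_token_probs` and the `</s>` end marker are all assumptions chosen for illustration. The idea is simply to keep the `beam_size` most probable partial sentences at each step, rather than every possibility.

```python
import math
from typing import Callable, Dict, List, Tuple

Hypothesis = Tuple[float, List[str]]  # (log-probability, tokens so far)

EOS = "</s>"  # hypothetical end-of-sentence marker

# Hypothetical toy "model": probability of the next word given the words so far.
TOY_MODEL: Dict[Tuple[str, ...], Dict[str, float]] = {
    (): {"the": 0.6, "a": 0.4},
    ("the",): {"cat": 0.55, "dog": 0.45},
    ("a",): {"cat": 0.9, "dog": 0.1},
}


def toy_next_token_probs(tokens: List[str]) -> Dict[str, float]:
    # Any prefix not in the table simply ends the sentence.
    return TOY_MODEL.get(tuple(tokens), {EOS: 1.0})


def beam_search(
    next_token_probs: Callable[[List[str]], Dict[str, float]],
    beam_size: int,
    max_len: int,
) -> List[Hypothesis]:
    beam: List[Hypothesis] = [(0.0, [])]  # start with one empty hypothesis
    for _ in range(max_len):
        candidates: List[Hypothesis] = []
        for score, tokens in beam:
            if tokens and tokens[-1] == EOS:
                candidates.append((score, tokens))  # finished: carry over unchanged
                continue
            for token, prob in next_token_probs(tokens).items():
                # Add log-probabilities instead of multiplying probabilities.
                candidates.append((score + math.log(prob), tokens + [token]))
        # Prune: keep only the beam_size highest-scoring hypotheses.
        candidates.sort(key=lambda h: h[0], reverse=True)
        beam = candidates[:beam_size]
        if all(tokens and tokens[-1] == EOS for _, tokens in beam):
            break
    return beam


greedy = beam_search(toy_next_token_probs, beam_size=1, max_len=5)
wide = beam_search(toy_next_token_probs, beam_size=2, max_len=5)
print("beam size 1:", greedy[0][1], round(math.exp(greedy[0][0]), 2))
print("beam size 2:", wide[0][1], round(math.exp(wide[0][0]), 2))
```

In this toy table, a beam of size 1 commits to "the" because it looks best at the first step, while a beam of size 2 keeps "a" alive long enough to find the more probable sentence overall. That trade-off between how much you keep and how much you search is the kind of behaviour the paper sets out to explain.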
EMNLP also includes “demo” papers, which describe an interesting piece of software. The award for the best demo paper went to “Transformers: State-of-the-Art Natural Language Processing”, from the US start-up Hugging Face, for their impressive open-source toolkit. Transformers are a useful but complex way of encoding a sentence as a collection of numbers, and since they appeared in 2017 they have been applied to all sorts of NLP tasks (machine translation, search, natural language understanding, etc.). The Hugging Face toolkit makes it straightforward for NLP engineers to use transformers in their own applications.
The last of the honourable mentions went to a data set paper, “GLUCOSE: GeneraLized and COntextualized Story Explanations”. Data and the annotation of data are incredibly important in modern NLP. Data is the coal that feeds the machine, so it’s good to see data set creation being valued. The focus of GLUCOSE is commonsense reasoning: people’s ability to build a mental model of a story scenario from a series of events. This is not so easy for machines, so the makers of GLUCOSE use crowd-sourcing to build a huge set of these scenarios for the machines to learn from.
These were only the top-rated papers from EMNLP, but they illustrate the variety of paper types at the conference: from software releases to evaluation techniques, and from new problems to a deeper analysis of existing solutions. And of course, everything is driven by data.
This insight was given by Barry Haddow, head of NLP at Aveni. To find out more about Barry and what he does, visit this blog post