Academic Papers

Cache & Distil: Optimising API Calls to Large Language Models

The paper focuses on optimising API calls to Large Language Models (LLMs) through the concept of neural caching. This involves training a smaller student model to handle user requests, thereby reducing the frequency of expensive API calls.
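The neural caching idea described above can be sketched as a simple serving loop. This is a minimal illustration rather than the authors' implementation: the function name `neural_cache_loop`, the callables `student_predict`, `llm_label` and `should_call_llm`, and the fixed `budget` of LLM calls are all assumptions made for the example.

```python
from typing import Callable, List, Tuple


def neural_cache_loop(
    requests: List[str],
    student_predict: Callable[[str], str],   # hypothetical: cheap student model
    llm_label: Callable[[str], str],         # hypothetical: expensive LLM API call
    should_call_llm: Callable[[str], bool],  # hypothetical: query allocation policy
    budget: int,                             # assumed cap on the number of LLM calls
) -> Tuple[List[str], List[Tuple[str, str]], int]:
    """Answer each request with the student unless the policy asks for the LLM."""
    answers: List[str] = []
    distillation_data: List[Tuple[str, str]] = []
    calls = 0
    for request in requests:
        if calls < budget and should_call_llm(request):
            label = llm_label(request)
            calls += 1
            # LLM answers are kept so the student can later be retrained on them.
            distillation_data.append((request, label))
        else:
            label = student_predict(request)
        answers.append(label)
    return answers, distillation_data, calls
```

In this sketch the collected `distillation_data` is what a periodic retraining step would consume; how and when the student is retrained is left out.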


The study examines how active learning (AL) strategies, such as Margin Sampling and Query by Committee, can be used to allocate LLM queries and improve the performance of the student model.
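To make the two strategies more concrete, here is a rough sketch of how each one might score a request. It is illustrative only: the helper names, the threshold value, and the use of vote entropy as the committee disagreement measure are assumptions, and the paper may define these quantities differently.

```python
import math
from collections import Counter
from typing import List, Sequence


def margin(probs: Sequence[float]) -> float:
    """Gap between the two most likely classes (needs at least two classes)."""
    top, second = sorted(probs, reverse=True)[:2]
    return top - second


def vote_entropy(votes: List[str]) -> float:
    """Entropy of the labels predicted by a committee of student models."""
    total = len(votes)
    counts = Counter(votes)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def margin_policy(probs: Sequence[float], threshold: float = 0.2) -> bool:
    """Ask the LLM when the student is unsure, i.e. the margin is small.

    The threshold of 0.2 is an arbitrary value chosen for illustration.
    """
    return margin(probs) < threshold
```

A small margin or a high vote entropy both signal that the student is uncertain, which is when spending an LLM call is most likely to be worthwhile.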


The experimental setup involves four classification tasks: ISEAR, RT-Polarity, FEVER, and Openbook. These tasks range from emotion annotation to fact-checking and question-answering. The datasets are split into online and test portions, with classes uniformly distributed. The paper also discusses the annotation process by the LLM to simulate the online setup.


The findings reveal the benefits of AL-based policies in improving the student model’s performance. Margin Sampling and Query by Committee consistently outperform baselines, indicating the robustness of the student model to noise introduced by the LLM. The study suggests that smart LLM query allocation and online knowledge distillation can play a crucial role in optimising API calls to LLMs.


The paper concludes that there is potential for smart LLM query allocation in continuously distilling LLMs into student models. Leveraging AL strategies in the online setup can lead to significant improvements in performance and efficiency. The authors include Alexandra Birch, Head of Aveni Labs.


