Is bias a problem in machine learning? 

Written by Lexi Birch

In all decision-making processes, whether human or machine, bias can result in unfair outcomes. We discuss why this can be a problem for machine learning, and what we can do about it.  

Machine Learning: Machine learning (ML) is very good at capturing signals and correlations in data. Deep learning, coupled with substantial data and lots of computation, lets us build systems that exploit those correlations. This behaviour allows us to model complex problems, but it also means that these models are vulnerable to bias. 

Bias: Bias is defined as results that are systematically prejudiced. If we look at why ML models might be biased, we see that the datasets used to train and test them contain biases. Bias is introduced when a dataset reflects the prejudices of the humans who produced it. Datasets can be further biased by what data happens to be available, or simply by the dataset creator’s frame of reference. For example, it is very common for a dataset to over-represent a particular demographic: someone developing a speech recognition system might collect predominantly recordings of male voices. Our job as machine learning practitioners is to separate the genuine signals in the data from the bias that is discriminatory or unfair. 

Fairness: ML models will contain biases, but the real question is: does the product treat people fairly? Models are trained to maximise accuracy over a training set, and the model with the highest overall accuracy might not have the best performance for particular subsets of the data. Improving accuracy for a subpopulation, such as facial recognition for black women, might mean slightly lower accuracy for a more frequent group, such as white men. Fairness means gauging whether the models achieve the desired tradeoff between overall accuracy and performance for all subpopulations. 

Discrimination: With machine learning systems becoming ubiquitous in automated decision making, it is crucial that we make these systems sensitive to the kind of bias that results in discrimination, especially discrimination on illegal grounds. Machine learning is already being used to make or assist decisions in recruiting (screening job applicants), banking (credit ratings and loan approvals), the judiciary (recidivism risk assessments), welfare (benefit eligibility) and journalism (news recommender systems). Given the scale and impact of these industries, we must take measures to prevent unfair discrimination. 

 

Example 

These experiments were run in 2017 by Rachael Tatman, a researcher at the University of Washington. She took a set of common words, collected recordings from speakers who reported their gender and place of origin, and compared the performance of Google’s speech recognition software across these groups. The results show robust differences in accuracy across both gender and dialect, with lower accuracy for 1) women and 2) speakers from Scotland. This finding shows the need for sociolinguistically stratified validation of ML systems. Before we can fix the problem, we need to be able to quantify it.  

 

[Figure: Automatic Speech Recognition (ASR) word error rate by gender and dialect. Source: Rachael Tatman, Gender and Dialect Bias in YouTube’s Automatic Captions (2017). Note that a lower word error rate indicates better performance.]
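
To make the idea of stratified validation concrete, here is a minimal sketch of how word error rate could be computed separately for each demographic group. This is not Tatman’s code: the recordings, the labels and the use of the open-source jiwer package are assumptions for illustration only.

```python
# A minimal sketch of sociolinguistically stratified ASR evaluation (illustrative only).
# Assumes the jiwer package (pip install jiwer) and a toy list of results with group labels.
from collections import defaultdict
import jiwer

# Hypothetical evaluation data: reference transcript, ASR output, and speaker metadata.
results = [
    {"reference": "the quick brown fox", "hypothesis": "the quick brown fox", "gender": "male",   "dialect": "US"},
    {"reference": "loch ness is deep",   "hypothesis": "lock nest is deep",   "gender": "female", "dialect": "Scotland"},
    # ... more recordings ...
]

def wer_by_group(results, key):
    """Compute word error rate separately for each value of a metadata key."""
    groups = defaultdict(lambda: ([], []))
    for r in results:
        refs, hyps = groups[r[key]]
        refs.append(r["reference"])
        hyps.append(r["hypothesis"])
    return {group: jiwer.wer(refs, hyps) for group, (refs, hyps) in groups.items()}

print(wer_by_group(results, "gender"))   # e.g. {'male': 0.0, 'female': 0.5}
print(wer_by_group(results, "dialect"))  # per-dialect error rates
```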

Dealing with Bias in ML 

Because of fears that algorithms will further entrench and propagate human biases, there have been significant efforts by the Artificial Intelligence (AI) community to avoid and correct discriminatory bias in algorithms, while also making them more transparent.  

There has been an explosion of academic interest in methods for developing fair algorithms. However, few of these methods have been implemented in the production machine learning systems used by governments or private companies, and there is little transparency about how ML decisions are made and how fair they really are. I describe three general approaches that have been deployed: 

 

Data transformation:

The fairness-enhancing approaches that have achieved the most practical success seem to be efforts to improve performance by adding training data, especially for underrepresented groups. IBM’s facial gender classification system was performing poorly for dark-skinned people, and for dark-skinned women in particular. IBM responded with a system trained on more representative data, which reportedly reduced the error rate on dark-skinned women tenfold. 
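
As a rough illustration of the data-centred approach, the sketch below rebalances a training set by oversampling an underrepresented group before fitting a model. The column names (`group`, `label`), the file and the use of pandas and scikit-learn are assumptions for the example, not a description of IBM’s actual pipeline.

```python
# A minimal sketch of rebalancing training data for underrepresented groups (illustrative only).
import pandas as pd
from sklearn.linear_model import LogisticRegression

train = pd.read_csv("train.csv")  # hypothetical file with numeric feature columns plus 'group' and 'label'

# Oversample each group so all groups appear as often as the largest one.
target_size = train["group"].value_counts().max()
balanced = pd.concat(
    [g.sample(target_size, replace=True, random_state=0) for _, g in train.groupby("group")],
    ignore_index=True,
)

features = balanced.drop(columns=["group", "label"])
model = LogisticRegression(max_iter=1000).fit(features, balanced["label"])
```

In practice, collecting genuinely new, representative data, as IBM did, is typically more effective than simply duplicating the examples you already have.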

 

Algorithm manipulation:

Other methods, of intense interest in academia, manipulate the learning algorithm itself: a penalty term is added during training which encourages the model to reduce discrimination. These methods can be hard to implement in practice, especially when protected characteristics such as gender, age and race overlap. 
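
As an illustration only (this is not a specific published method), the sketch below adds a simple demographic-parity style penalty to a standard classification loss in PyTorch: the extra term pushes the average predicted score for two groups closer together, and a weight `lam` controls the accuracy/fairness tradeoff. The model shape and data are assumptions for the example.

```python
# A toy sketch of an in-training fairness penalty (illustrative, not a specific published method).
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 1))  # assumes 20 input features
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()
lam = 0.5  # weight of the fairness penalty

def training_step(x, y, group):
    """x: features, y: binary labels, group: binary protected attribute (0/1)."""
    logits = model(x).squeeze(1)
    loss = bce(logits, y.float())

    # Demographic-parity style penalty: gap in mean predicted probability between the two groups.
    # In practice, guard against batches where one group is absent (the mean of an empty tensor is NaN).
    probs = torch.sigmoid(logits)
    gap = probs[group == 1].mean() - probs[group == 0].mean()
    loss = loss + lam * gap.abs()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```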

 

Outcome manipulation:

Here the output of the model is adjusted directly to compensate for known biases, for example by setting different decision thresholds for different groups so that outcomes are better balanced. 
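
A minimal sketch of this kind of post-processing is shown below: the model’s scores are left unchanged, but each group gets its own decision threshold. The threshold values are placeholders; in practice they would be chosen on a validation set to satisfy whatever fairness criterion has been agreed on.

```python
# A minimal sketch of output (post-processing) adjustment with group-specific thresholds.
# The threshold values are placeholders chosen for illustration only.
import numpy as np

def predict_with_group_thresholds(scores, groups, thresholds):
    """scores: model probabilities, groups: group label per example,
    thresholds: dict mapping group label -> decision threshold."""
    cutoffs = np.array([thresholds[g] for g in groups])
    return (np.asarray(scores) >= cutoffs).astype(int)

scores = [0.62, 0.48, 0.55, 0.71]
groups = ["a", "b", "b", "a"]
decisions = predict_with_group_thresholds(scores, groups, {"a": 0.6, "b": 0.5})
print(decisions)  # [1 0 1 1]
```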

 

Dealing with Bias at Aveni 

At Aveni we are putting significant effort into understanding, tracking and mitigating bias. 

 

One factor that reduces our exposure to bias is that many of our models are geared towards assisting humans and augmenting their ability to make decisions quickly and with maximum information. This Human+ approach means that the decisions our models make are less likely to affect the fairness of the system outcome. Where models make predictions, we link the decision directly to the evidence we had for making it. This makes the model as transparent as possible and further helps to mitigate problems with bias.  

 

The other key piece in making sure that bias in our systems does not lead to unfair outcomes is rigorous testing. We regularly review our results to verify accuracy, precision and recall on different subgroups, and use these reviews to continually improve our models. We also produce annual bias reports, which can be used for auditing and reporting purposes.  

 

Where we have access to large enough datasets, we deploy fairness toolkits to investigate and mitigate bias in our datasets and in our models. We continuously evaluate the latest tools and research to deliver the most accurate and fair models possible.  
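
This post does not name a specific toolkit, but as one example of what such tools offer, the open-source Fairlearn library can report a metric per subgroup and summarise the gap between groups in a few lines (the data below is hypothetical):

```python
# Example of a fairness toolkit in use: Fairlearn's MetricFrame (hypothetical data).
from fairlearn.metrics import MetricFrame
from sklearn.metrics import accuracy_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
gender = ["f", "f", "f", "f", "m", "m", "m", "m"]

frame = MetricFrame(
    metrics=accuracy_score,
    y_true=y_true,
    y_pred=y_pred,
    sensitive_features=gender,
)
print(frame.overall)       # accuracy over everyone
print(frame.by_group)      # accuracy per group
print(frame.difference())  # largest gap between groups, a single number to track over time
```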

 

To learn more about how our cutting-edge techniques can deliver the outcomes you want, visit: work with us  

 

You can also find us on LinkedIn and Twitter. 
