Fairness in artificial intelligence models: how to mitigate discrimination in the presence of multiple sensitive attributes?

Suppose we have a machine learning model, 𝑓, that predicts an insurance premium, Y, from data that includes a sensitive attribute, such as gender. Discrimination may occur due to statistical bias (past injustices or sample imbalance), a correlation between the sensitive attribute and some explanatory variable, or intentional bias.

To avoid this bias, legislation (such as the EU AI Act, 2024) limits or even prohibits the use of certain sensitive attributes in artificial intelligence models. However, simply removing these attributes is not always the solution that yields the best level of fairness or the best model performance. There are preprocessing approaches (which modify the input data), in-processing approaches (which add a fairness penalty to the training objective), and postprocessing approaches (which modify the univariate distribution of the predictions to build an intermediate distribution, as done in Sequential Fairness).

There have been several postprocessing approaches to mitigate these effects when a model has a single sensitive attribute (SSA). But what can we do if there are multiple sensitive attributes (MSA)? One possible approach is to consider the intersection of the distributions created by each combination of the sensitive attributes. For example, if the sensitive attributes are gender (female and male) and ethnicity (black and white), the SSA approach would be applied to the four resulting subgroups: black women, white women, black men, and white men.
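To see how quickly this grows, here is a tiny illustrative sketch (the attribute names and the extra age-band levels are hypothetical, chosen only for the example): each additional sensitive attribute multiplies the number of subgroup distributions that must be estimated.

```python
# Illustrative sketch: enumerating the intersectional subgroups that the
# combination-based approach must handle.
from itertools import product

sensitive_attributes = {
    "gender": ["female", "male"],
    "ethnicity": ["black", "white"],
}

subgroups = list(product(*sensitive_attributes.values()))
print(subgroups)       # [('female', 'black'), ('female', 'white'), ('male', 'black'), ('male', 'white')]
print(len(subgroups))  # 4 subgroups

# Adding one more attribute, e.g. a hypothetical age band with 5 levels,
# multiplies this to 20, and the previously estimated distributions must be redone.
sensitive_attributes["age_band"] = ["<25", "25-40", "40-55", "55-70", ">70"]
print(len(list(product(*sensitive_attributes.values()))))  # 20
```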

This becomes computationally expensive as the number of sensitive attributes grows. Moreover, when a new sensitive attribute is added, the previous work is lost, because new distributions must be estimated for the new combinations. Another approach, and the focus of this blog, is Sequential Fairness. In short, this approach modifies the model's predictions to be fair with respect to the first sensitive attribute, then modifies these new predictions to be fair with respect to the second attribute (while remaining fair with respect to the first), and so on. The benefits of this approach are that it is commutative (the order in which the attributes are treated does not matter), new sensitive attributes are easy to add, and it makes interpretation easier.

The idea is to find a representative distribution that lies between the conditional distributions of the predictions given the sensitive attributes. This is achieved with the Wasserstein barycenter, which minimizes the total cost of transporting one distribution onto another via optimal transport. The Wasserstein barycenter extends the idea of Strong Demographic Parity, which requires that a model's predictions be independent of the sensitive attributes, to the multi-attribute setting.
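As a concrete illustration of both ideas (the attribute-by-attribute repair and the barycenter construction), here is a minimal, self-contained NumPy sketch of the one-dimensional case. It is not the EquiPy implementation; the function names and toy data are assumptions made for the example. It relies on the standard result that, for distributions on the real line, the Wasserstein-2 barycenter's quantile function is the weighted average of the group-conditional quantile functions.

```python
import numpy as np

def barycenter_repair(y_pred, group, grid_size=1000):
    """Map each prediction to the 1-D Wasserstein-2 barycenter of the
    group-conditional prediction distributions (quantile averaging)."""
    y_pred = np.asarray(y_pred, dtype=float)
    group = np.asarray(group)
    levels = np.linspace(0.0, 1.0, grid_size)
    labels, counts = np.unique(group, return_counts=True)
    weights = counts / counts.sum()
    # Barycenter quantile function = weighted average of the group quantile functions
    bary_quantiles = sum(w * np.quantile(y_pred[group == g], levels)
                         for g, w in zip(labels, weights))
    y_fair = np.empty_like(y_pred)
    for g in labels:
        mask = group == g
        y_g = y_pred[mask]
        # Empirical quantile level (rank) of each prediction within its own group
        ranks = np.searchsorted(np.sort(y_g), y_g, side="right") / y_g.size
        # Push the prediction to the barycenter quantile at that level
        y_fair[mask] = np.interp(ranks, levels, bary_quantiles)
    return y_fair

def sequential_fairness(y_pred, sensitive):
    """Apply the single-attribute repair sequentially, one sensitive attribute at a time."""
    y_current = np.asarray(y_pred, dtype=float)
    for values in sensitive.values():
        y_current = barycenter_repair(y_current, values)
    return y_current

# Toy example with two sensitive attributes: after the repair, the group-conditional
# distributions of y_fair (approximately) coincide for both attributes.
rng = np.random.default_rng(0)
n = 2_000
sensitive = {
    "gender": rng.choice(["F", "M"], n),
    "ethnicity": rng.choice(["B", "W"], n),
}
y_pred = rng.normal(100, 20, n) + 8 * (sensitive["gender"] == "M") + 5 * (sensitive["ethnicity"] == "W")
y_fair = sequential_fairness(y_pred, sensitive)
```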

It is important to note that methods for reducing unfairness in predictive models always come at a cost in performance. However, by using the Wasserstein barycenter, this approach ensures that performance metrics such as accuracy and MSE are degraded as little as possible.
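One simple way to quantify both sides of this trade-off is to measure unfairness as the Wasserstein distance between the group-conditional prediction distributions and performance as the MSE against the observed values. The sketch below does this on synthetic data; the crude mean-alignment step is only a placeholder standing in for repaired predictions, not the barycenter method itself.

```python
import numpy as np
from scipy.stats import wasserstein_distance
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
n = 1_000
group = rng.integers(0, 2, size=n)                    # one binary sensitive attribute
y_true = rng.normal(100, 20, size=n) + 10 * group     # observed premiums reflect a group gap (synthetic)
y_pred = y_true + rng.normal(0, 5, size=n)            # accurate but unfair predictions

def unfairness(y, g):
    # Distance between the two group-conditional prediction distributions
    return wasserstein_distance(y[g == 0], y[g == 1])

# Placeholder "repaired" predictions: align each group's mean with the overall mean
y_fair = y_pred.copy()
for g in (0, 1):
    y_fair[group == g] += y_pred.mean() - y_pred[group == g].mean()

for name, y in [("original", y_pred), ("repaired", y_fair)]:
    print(f"{name}: unfairness={unfairness(y, group):.2f}, MSE={mean_squared_error(y_true, y):.2f}")
```

In this synthetic setup, the repaired predictions are much less unfair but have a higher MSE, which is exactly the trade-off the barycenter-based approach tries to keep as small as possible.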

EquiPy is a Python package that implements Sequential Fairness for continuous prediction models with multiple sensitive attributes. It uses the Wasserstein barycenter to minimize the impact on model performance while mitigating the bias and discrimination that sensitive attributes may introduce into the predictions.
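A minimal usage sketch is shown below. The class and method names (MultiWasserstein with fit/transform, sensitive attributes passed as a DataFrame) reflect my reading of the package and should be checked against the current EquiPy documentation; the synthetic data is purely illustrative.

```python
import numpy as np
import pandas as pd
from equipy.fairness import MultiWasserstein  # assumed import path; see the EquiPy docs

rng = np.random.default_rng(42)
n = 500

# Model predictions and sensitive attributes on a calibration set
sensitive_calib = pd.DataFrame({
    "gender": rng.choice(["female", "male"], n),
    "ethnicity": rng.choice(["black", "white"], n),
})
y_calib = rng.normal(100, 20, n)

# Model predictions and sensitive attributes on a test set
sensitive_test = pd.DataFrame({
    "gender": rng.choice(["female", "male"], n),
    "ethnicity": rng.choice(["black", "white"], n),
})
y_test = rng.normal(100, 20, n)

# Fit the sequential multi-attribute Wasserstein repair on the calibration data,
# then transform the test predictions so they are fair w.r.t. both attributes.
calibrator = MultiWasserstein()
calibrator.fit(y_calib, sensitive_calib)
y_fair = calibrator.transform(y_test, sensitive_test)
```

After the transform, the distribution of y_fair should be (approximately) the same across the groups defined by each sensitive attribute, which is the Strong Demographic Parity requirement described above.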

* This blog post is based on the presentation given at the Quantil seminar on August 8, 2024, by Agathe Fernandes Machado, titled "EquiPy: A Python package for Sequential Fairness using Optimal Transport with Applications in Insurance", where she shared insights from the research conducted by her and her team at the Université du Québec à Montréal (UQAM) to develop a Python package that implements sequential fairness to mitigate unfairness in the presence of multiple sensitive attributes.
