Tutorial on Countering Bias in Personalized Rankings
From Data Engineering to Algorithm Development

to be held as part of the 37th IEEE International Conference on Data Engineering (ICDE2021)

April 19 - 22, 2021 - ONLINE

Introduction

This tutorial provides the ICDE community with recent advances on the assessment and mitigation of data and algorithmic bias in personalized rankings. We first introduce conceptual foundations by surveying the state of the art and describing real-world examples of how bias can impact ranking algorithms from several perspectives (e.g., ethics and the system's objectives). Biases can arise in different forms and circumstances, and those leading to unfairness are just one type among the multitude of biases affecting our data engineering processes (e.g., popularity biases, cognitive biases).

After presenting a broad taxonomy of biases, the tutorial continues with a systematic presentation of techniques to uncover, assess, and reduce each type of bias along the personalized ranking design process, with a primary focus on the role of data engineering at each step of the pipeline. The hands-on parts provide attendees with implementations of bias mitigation algorithms, along with processes and guidelines on how data is organized and manipulated by these algorithms, leveraging open-source tools and public datasets; in this part, attendees engage in the design of bias countermeasures and in articulating their impacts on stakeholders. The tutorial finally analyzes open issues and future directions in this vibrant and rapidly evolving research area.

Target Audience

This tutorial is accessible to researchers, industry technologists, and practitioners. For people not familiar with rankings, the tutorial covers the necessary background material. No prior knowledge of biases is assumed. Basic knowledge of Python programming and of common libraries, such as Pandas and NumPy, is preferred but not strictly necessary. As the outline shows, bias is a highly interdisciplinary topic, touching on several dimensions beyond algorithms. Hence, our tutorial is of interest to an interdisciplinary audience with different backgrounds, beyond the information retrieval community. It covers fundamental notions of bias and fairness that can also be of interest to those working on data engineering in other areas (e.g., machine learning, security, social networks).

Our tutorial is tailored to the ICDE community, focusing on the data engineering processes that must be shaped to characterize and mitigate biases. Through this tutorial, ICDE attendees will understand key aspects of bias in personalized rankings, trace biases to the underlying systems, experiment with mitigation techniques and articulate their impacts on stakeholders, and identify challenges and opportunities.

Outline

Due to the ongoing worldwide COVID-19 situation, the Bias @ ICDE 2021 tutorial will take place online.

Timing Content
65 mins Session I: Foundations
Recommendation Principles
  • Recommendation principles. To introduce the problems associated with algorithmic bias, we will present the recommendation task as the generation of the most effective personalized ranking for a user, as in modern recommender systems.
  • Multi-sided recommendation aspects. Recommender systems have an impact on multiple actors, namely consumers, providers, and system owners. We will present these actors and the phases of the recommendation process where they play a role (design, algorithm, and evaluation).
Algorithmic Bias Foundations
  • Motivating examples. We will present real-world examples where bias can impact recommendation, considering domains such as music, education, social platforms, and recruiting.
  • Perspectives impacted by bias. Bias has an impact on several perspectives such as the economy, law, society, security, technology, and psychology.
  • Ethical aspects influenced by bias. Bias can have an impact at the ethical level and lead to issues such as recommendation of inappropriate content, lack of privacy, violation of autonomy and identity, introduction of opacity, lack of fairness, or the compromising of users' social relationships.
  • Objectives influenced by bias. We will present recommendation objectives influenced by bias (utility, coverage, diversity, novelty, visibility, exposure) and provide examples of related work.
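As a taste of the visibility and exposure objectives listed above, the following minimal sketch (synthetic data; group boundaries and variable names are purely illustrative, not part of the tutorial material) measures how much attention an item group receives in top-k recommendation lists, with exposure discounting lower ranks logarithmically:

```python
import numpy as np

# Synthetic example: visibility and exposure of an item group in top-k lists.
rng = np.random.default_rng(0)
n_users, n_items, k = 100, 50, 10

# Each row is the ranked top-k item list recommended to one user.
top_k = rng.integers(0, n_items, size=(n_users, k))
group = list(range(30, 50))  # hypothetical item group of interest

in_group = np.isin(top_k, group)

# Visibility: share of recommendation slots occupied by the group.
visibility = in_group.mean()

# Exposure: the same share, but discounting lower ranks logarithmically,
# since positions further down the list attract less user attention.
discount = 1.0 / np.log2(np.arange(2, k + 2))
exposure = (in_group * discount).sum() / (n_users * discount.sum())

print(f"visibility={visibility:.3f}, exposure={exposure:.3f}")
```

With a uniform recommender, both quantities roughly match the group's share of the catalog; a gap between them signals that ranking positions are unevenly distributed.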
Bias through the Pipeline
  • Recommendation pipeline. We will provide an initial overview of the recommendation pipeline, to characterize how bias can exist at several stages, namely, data acquisition and storage, data preparation, model training, model prediction, model evaluation, and recommendation delivery.
  • Types of bias associated with the pipeline. We explore the types of bias that can emerge at different stages of the pipeline, i.e., those associated with users, platforms, data collection, data preparation, model exploitation, and model evaluation.
Bias Mitigation Design
  • Bias-aware process pipeline. Intervention strategies to mitigate algorithmic bias require an analysis of where and how bias might affect the system. We present a pipeline to support mitigation design.
  • Techniques for bias treatment. We will present the three main classes of mitigation techniques (pre-, in-, and post-processing), along with examples of solutions proposed for recommender systems.
  • Real-world applications. We will present examples of real-world platforms and of their approaches to deal with bias.
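To make the post-processing class of techniques concrete, here is a minimal sketch of a re-ranker that caps the number of "popular" items per top-k list; the function name, cap, and data are illustrative assumptions, not a method from the tutorial:

```python
import numpy as np

# Post-processing sketch: re-rank one user's candidates so that at most
# `max_pop` of the top-k slots go to popular items.
def rerank_with_popularity_cap(scores, popular_mask, k=10, max_pop=5):
    order = np.argsort(-scores)  # candidates by descending relevance
    picked, deferred, n_pop = [], [], 0
    for item in order:
        if popular_mask[item] and n_pop >= max_pop:
            deferred.append(item)  # hold popular items over the cap
            continue
        picked.append(item)
        n_pop += int(popular_mask[item])
        if len(picked) == k:
            break
    # Backfill with deferred popular items if the long tail ran out.
    picked.extend(deferred[: k - len(picked)])
    return np.array(picked[:k])

scores = np.array([0.9, 0.8, 0.7, 0.6, 0.5, 0.4])
popular = np.array([True, True, True, False, False, False])
print(rerank_with_popularity_cap(scores, popular, k=4, max_pop=2))
# → [0 1 3 4]: the third popular item is displaced by long-tail items
```

Pre- and in-processing variants act earlier, on the training data or the loss, instead of on the final ranking.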
20 mins Session II: Hands-on
Hands on Recommender Systems
  • Data preparation starting from public datasets (i.e., the COCO and MovieLens datasets).
  • Model definition (e.g., user/item embeddings, layer stacking) and training (e.g., epochs, loss, optimizer).
  • User-item relevance matrix computation from a pre-trained model (e.g., model load, predictions).
  • Model evaluation oriented to utility (e.g., NDCG, beyond-accuracy metrics).
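The prediction and evaluation steps above can be sketched as follows, using synthetic embeddings in place of a trained model (shapes, the relevance density, and all names are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(42)
n_users, n_items, dim, k = 20, 100, 16, 10

# Stand-ins for embeddings loaded from a pre-trained model.
user_emb = rng.normal(size=(n_users, dim))
item_emb = rng.normal(size=(n_items, dim))

# User-item relevance matrix (model prediction step), then top-k lists.
relevance = user_emb @ item_emb.T          # shape (n_users, n_items)
top_k = np.argsort(-relevance, axis=1)[:, :k]

# Binary held-out ground truth: each item relevant with probability 0.05.
truth = rng.random((n_users, n_items)) < 0.05

def ndcg_at_k(top_k, truth, k):
    # Logarithmic position discount, averaged over users.
    discount = 1.0 / np.log2(np.arange(2, k + 2))
    gains = (np.take_along_axis(truth, top_k, axis=1) * discount).sum(axis=1)
    ideal = (np.sort(truth, axis=1)[:, ::-1][:, :k] * discount).sum(axis=1)
    return np.divide(gains, ideal, out=np.zeros_like(ideal),
                     where=ideal > 0).mean()

print(f"NDCG@{k} = {ndcg_at_k(top_k, truth, k):.3f}")
```

With random embeddings the NDCG hovers near chance level; the hands-on session replaces these stand-ins with a model actually trained on the datasets.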
Hands on Item Popularity Bias
  • Definition and characterization of item popularity biases in interactions and recommendations.
  • Application of mitigation techniques based on pre-, in-, and post-processing.
  • Comparison of mitigation techniques based on bias and recommendation utility trade-offs.
  • Comparison of mitigation techniques on beyond-utility metrics (e.g., coverage, diversity, novelty).
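A minimal sketch of the characterization step above: compare how concentrated item frequencies are in the interaction log versus in the recommendations, summarized by a Gini index (synthetic Zipf-like data; distributions and names are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(7)
n_items = 200

# Long-tailed item draws standing in for logged interactions and for
# items served by a hypothetical recommender.
interactions = rng.zipf(1.5, size=5000) % n_items
recommended = rng.zipf(1.3, size=5000) % n_items

def gini(counts):
    # Gini index over per-item frequencies: 0 = uniform, ~1 = concentrated.
    c = np.sort(counts.astype(float))
    n = len(c)
    return ((2 * np.arange(1, n + 1) - n - 1) * c).sum() / (n * c.sum())

inter_counts = np.bincount(interactions, minlength=n_items)
rec_counts = np.bincount(recommended, minlength=n_items)
print(f"Gini(interactions)={gini(inter_counts):.3f}")
print(f"Gini(recommendations)={gini(rec_counts):.3f}")
```

A recommendation Gini higher than the interaction Gini indicates popularity amplification, which the pre-, in-, and post-processing mitigations above aim to reduce.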
05 mins Concluding Remarks

Presenters

Ludovico Boratto

EURECAT - Centre Tecnològic de Catalunya (Spain)

Ludovico Boratto is a Senior Research Scientist at EURECAT. His research focuses on recommender systems and their impact on stakeholders, and has been published in top-tier conferences and journals. He is editor of the book "Group Recommender Systems: An Introduction" (Springer), an editorial board member of the "Information Processing & Management" journal (Elsevier), and guest editor of several special issues. He is regularly a PC member of the main Data Mining conferences. In 2012, he received a Ph.D. from the University of Cagliari, where he was a research assistant until May 2016.



Mirko Marras

École Polytechnique Fédérale de Lausanne, EPFL (Switzerland)

Mirko Marras is a Postdoctoral Researcher at the École Polytechnique Fédérale de Lausanne (EPFL). His research focuses on data mining and machine learning for recommender systems, with attention to bias issues, mainly in online education settings. He has authored papers in top-tier journals, such as Pattern Recognition Letters and Computers in Human Behavior. He has given talks and demos at international conferences and workshops, e.g., TheWebConf2018, ECIR2019, and INTERSPEECH2019. He is a PC member of major conferences, e.g., ACL, AIED, EDM, ECML-PKDD, EMNLP, ITiCSE, ICALT, and UMAP. He co-chaired the BIAS workshop at ECIR 2020 and 2021 and gave tutorials on bias in recommender systems at UMAP2020 and ICDM2020. In 2020, he received a Doctoral Degree from the University of Cagliari.

Contacts

Please reach out to us at ludovico.boratto@acm.org and mirko.marras@epfl.ch.