TIBET: Identifying and Evaluating Biases in Text-to-Image Generative Models

Aditya Chinchure*, Pushkar Shukla*, Gaurav Bhatt, Kiri Salij, Kartik Hosanagar, Leonid Sigal, Matthew Turk
1University of British Columbia & Vector Institute 2Toyota Technological Institute at Chicago 3Carleton College 4The Wharton School, University of Pennsylvania
*Equal contribution
[Figure: TIBET applied to an example prompt]


Analyse biases for any prompt, with any black-box TTI model, using TIBET!

Our approach computes CAS scores, which measure how strongly the images generated for the initial prompt are associated with each counterfactual prompt, and MAD scores, which quantify the degree of bias in those images. Qualitative metrics like Top-K Concepts and Axis-Aligned Top-K Concepts offer post-hoc model explanations. Additionally, our approach enables comparisons with counterfactual explanations.
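For concreteness, the following is a minimal sketch of how these two scores could be computed. It assumes CAS is the cosine similarity between the concept-frequency vectors of two image sets, and MAD is the mean absolute deviation of the per-counterfactual CAS scores for one bias axis; this matches the description above but is not necessarily the paper's exact formulation, and all names and toy values below are illustrative.

from collections import Counter
from math import sqrt

def cas(initial_concepts, cf_concepts):
    """Concept Association Score (sketch): cosine similarity between
    the concept-frequency vectors of the initial image set and one
    counterfactual image set. Inputs are lists of extracted concepts."""
    a, b = Counter(initial_concepts), Counter(cf_concepts)
    dot = sum(a[c] * b[c] for c in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def mad(cas_scores):
    """Mean Absolute Deviation of the CAS scores across the
    counterfactuals of one bias axis; larger values indicate
    stronger bias in the initial-prompt images."""
    mean = sum(cas_scores) / len(cas_scores)
    return sum(abs(s - mean) for s in cas_scores) / len(cas_scores)

# Toy example (made-up concepts): initial prompt vs. one counterfactual.
initial = ["man", "suit", "office", "man", "tie"]
counterfactual = ["woman", "suit", "office", "blazer"]
print(cas(initial, counterfactual))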

Abstract

Text-to-Image (TTI) generative models have shown great progress in the past few years in terms of their ability to generate complex and high-quality imagery. At the same time, these models have been shown to suffer from harmful biases, including exaggerated societal biases (e.g., gender, ethnicity), as well as incidental correlations that limit such a model's ability to generate more diverse imagery. In this paper, we propose a general approach to study and quantify a broad spectrum of biases, for any TTI model and for any prompt, using counterfactual reasoning. Unlike other works that evaluate generated images on a predefined set of bias axes, our approach automatically identifies potential biases that might be relevant to the given prompt, and measures those biases. In addition, we complement quantitative scores with post-hoc explanations in terms of semantic concepts in the images generated. We show that our method is uniquely capable of explaining complex multi-dimensional biases through semantic concepts, as well as the intersectionality between different biases for any given prompt. We perform extensive user studies to illustrate that the results of our method and analysis are consistent with human judgements.

How it works

[Figure: overview of the TIBET pipeline]


Given an input prompt, we query an LLM (GPT-3) to identify axes of bias (Step 1) and generate counterfactual prompts for each axis (Step 2). Here, we show a sample of three counterfactual prompts for the physical-appearance bias and two for the ableism bias. Next, we use a black-box TTI model (Stable Diffusion) to generate images for the initial prompt as well as for each counterfactual along every axis of bias (Step 3). In this example, we leverage VQA-based concept extraction to obtain a list of concepts and their frequencies for each set of images, and compare the concepts of the initial set with the concepts of each counterfactual set to obtain CAS scores (Step 4). Finally, we compute MAD, a measure of how strong the bias is in the images generated from the initial prompt (Step 5).
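In code, the five steps amount to a simple loop over bias axes and counterfactuals. The sketch below is illustrative only: identify_axes, make_counterfactuals, render, and extract_concepts are hypothetical stand-ins for the LLM, TTI, and VQA components, and cas and mad are scoring functions such as those sketched earlier on this page.

def tibet_scores(prompt, identify_axes, make_counterfactuals,
                 render, extract_concepts, cas, mad, n_images=48):
    """Illustrative outline of the TIBET loop. Every callable argument
    is a hypothetical stand-in: identify_axes and make_counterfactuals
    wrap the LLM (Steps 1-2), render wraps the black-box TTI model
    (Step 3), extract_concepts wraps VQA-based concept extraction, and
    cas/mad score the results (Steps 4-5). The default image count per
    prompt is an assumption, not the paper's setting."""
    # Concepts extracted from images generated for the initial prompt.
    initial = extract_concepts(render(prompt, n_images))
    results = {}
    for axis in identify_axes(prompt):                         # Step 1
        cf_prompts = make_counterfactuals(prompt, axis)        # Step 2
        scores = [cas(initial, extract_concepts(render(cf, n_images)))
                  for cf in cf_prompts]                        # Steps 3-4
        results[axis] = {"CAS": scores, "MAD": mad(scores)}    # Step 5
    return results

In practice, the three wrapped components would query GPT-3, Stable Diffusion, and a VQA model respectively, generating multiple images per prompt so that concept frequencies are stable.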

Poster

Example

[Figure: example of TIBET in use]

BibTeX

@misc{chinchure2023tibet,
  title={TIBET: Identifying and Evaluating Biases in Text-to-Image Generative Models},
  author={Aditya Chinchure and Pushkar Shukla and Gaurav Bhatt and Kiri Salij and Kartik Hosanagar and Leonid Sigal and Matthew Turk},
  year={2023},
  eprint={2312.01261},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2312.01261},
}