Socially Responsible and Trustworthy Generative Foundation Models: Principles, Challenges, and Practices

Yue Huang1, Canyu Chen2, Chujie Gao1, Lu Cheng3, Bhavya Kailkhura4, Nitesh V. Chawla1, Xiangliang Zhang1
1 University of Notre Dame · 2 Northwestern University · 3 University of Illinois Chicago · 4 Lawrence Livermore National Laboratory

CIKM 2025 Tutorial · Half-day (3 hours) · Hands-on evaluation and mitigation

Introduction

Generative foundation models (GenFMs), such as large language and multimodal models, are transforming information access, retrieval, and knowledge systems. However, their deployment raises critical concerns around social responsibility, including fairness, bias mitigation, environmental impact, misinformation, and safety. This tutorial provides a comprehensive overview of recent research and best practices at the intersection of GenFMs and responsible AI. We introduce foundational concepts, evaluation metrics, and mitigation strategies, with case studies across various domains (e.g., text, vision, code). The tutorial is designed for researchers and practitioners interested in building or auditing socially responsible GenFMs. We highlight open challenges and propose future research directions relevant to the CIKM community.

Audience, Prerequisites & Benefits

Target Audience & Prerequisites. This tutorial is designed for researchers, graduate students, and industry professionals in machine learning, information retrieval, data science, and AI ethics or governance. No specialized background is required beyond a general familiarity with artificial intelligence concepts. Prior experience with deep learning or generative models will be helpful.

Benefits. Attendees will gain a comprehensive understanding of responsible AI practices as they relate to generative foundation models, including exposure to practical auditing workflows and hands-on tools for evaluating trustworthiness. The tutorial will equip participants with the knowledge and resources needed to identify and address social and ethical risks in real-world applications, ultimately enabling them to contribute to the development of more trustworthy and socially responsible AI systems.

Table 1: Tutorial outline and key points
Section Title | Key Points
The Dual Nature of GenFMs and the Need for Responsibility | GenFMs as assistants and simulators; new risks in information systems; importance of responsibility for CIKM.
Understanding Social Responsibility: Taxonomy and Case Studies | Six key risk dimensions; real-world failure cases; hands-on prompt-based risk audits.
Evaluation Methods and Benchmarks | Key benchmarks and metrics; evaluation pipelines; hands-on open-source tools.
Enhancement Strategies for Responsible GenFMs | Main mitigation techniques; effectiveness and scalability; hands-on simple mitigations.
Governance and Policy Perspectives | Overview of policy frameworks; community standards; institutional roles.
Open Challenges and Community Discussion | Key open questions; CIKM community engagement; interactive discussion.

Outline

The Dual Nature of GenFMs and the Need for Responsibility (20 minutes)

We begin by outlining the growing impact of Generative Foundation Models (GenFMs) in real-world systems and why ensuring their responsible behavior is a critical challenge. GenFMs serve in two primary roles: as assistants, they help users accomplish everyday tasks such as creative writing, translation, code generation, and search augmentation; as simulators, they generate synthetic data, simulate user behavior, or model complex scenarios, a role that is increasingly common in scientific studies. These roles show that GenFMs are no longer passive technologies; they actively shape how individuals interact with information and systems. As GenFMs gain broader adoption in search engines, conversational agents, recommendation systems, and decision-making pipelines, their influence over knowledge access and user experience grows, which elevates the importance of building responsible and trustworthy GenFMs. We motivate this need by highlighting societal risks such as misinformation, exclusionary bias, and unintended memorization. These concerns are especially relevant to the CIKM community, whose core interests include information retrieval, data management, and the deployment of intelligent systems in real-world settings.

Understanding Social Responsibility: Taxonomy and Case Studies (45 minutes)

This part presents a structured understanding of the major risks and responsibility dimensions in GenFMs. We introduce a taxonomy that covers the core dimensions of responsibility and trustworthiness:
- Safety: Preventing outputs that cause harm or enable malicious behavior.
- Privacy: Avoiding unintended memorization or exposure of sensitive data.
- Robustness: Ensuring stability under distribution shifts and adversarial prompts.
- Truthfulness: Reducing hallucinations and factual inaccuracies.
- Fairness: Addressing demographic, geographic, or ideological biases.
- Machine Ethics: Ensuring that AI-powered systems exhibit ethically acceptable behavior.

Each dimension is grounded in illustrative case studies from both open-source and commercial GenFMs. We examine well-documented failures, including biased healthcare advice, offensive language generation, and identity leakage. These cases demonstrate how technical issues translate into measurable societal harms. To deepen participants’ understanding, we integrate hands-on exercises in which attendees explore failure modes using curated prompts and open-source diagnostic tools (e.g., toxicity detectors such as the OpenAI Moderation API and Llama Guard, and bias probes).
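For instance, a minimal audit sketch in Python might score a handful of model outputs with a moderation endpoint (Llama Guard or another local classifier could be swapped in). The snippet below assumes the openai Python SDK v1.x with an API key in the environment; the sample texts are placeholders, not tutorial materials.

```python
# Minimal prompt-audit sketch: send candidate model outputs to a moderation
# endpoint and report which categories were flagged. Assumes the openai
# Python SDK v1.x and OPENAI_API_KEY in the environment; outputs are placeholders.
from openai import OpenAI

client = OpenAI()

candidate_outputs = [
    "Sure, here is a friendly summary of the article you asked about.",
    "Here is why people from that region cannot be trusted with money.",
]

for text in candidate_outputs:
    response = client.moderations.create(input=text)
    result = response.results[0]
    # In the v1 SDK, `categories` is a pydantic model; dump it to list the
    # categories triggered for this text.
    flagged = [name for name, hit in result.categories.model_dump().items() if hit]
    print(f"flagged={result.flagged} categories={flagged} :: {text[:60]}")
```

In the tutorial itself, participants run a packaged version of this workflow against curated prompts so that results can be compared across models.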

Evaluation Methods and Benchmarks (20 minutes)

We introduce the first phase of the 2E strategy: Evaluation. Participants will learn about widely used benchmarks and diagnostic datasets, from established resources such as BBQ, TruthfulQA, and RealToxicityPrompts to newer benchmarks such as HarmBench, TrustLLM, and DecodingTrust. We also cover metrics for evaluating fairness, safety, and related dimensions, and present tools such as OpenAI Evals, TrustEval, and AI Fairness 360. Integrated exercises will allow participants to apply these tools and observe how GenFMs behave under standard evaluation settings.
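To give a flavor of these evaluation pipelines, the sketch below scores a small sample of RealToxicityPrompts continuations with the Hugging Face `evaluate` toxicity measurement. The dataset identifier, the 100-example sample, and the 0.5 threshold are illustrative assumptions; in a real audit the texts being scored would be the GenFM's own generations.

```python
# Minimal evaluation-pipeline sketch: load a benchmark sample and compute a
# toxicity score distribution. Assumes the `datasets`, `evaluate`, and
# `transformers` packages; dataset id and threshold are illustrative.
from datasets import load_dataset
import evaluate

# Small sample of RealToxicityPrompts; replace the continuations with the
# generations of the model under test for a real audit.
sample = load_dataset("allenai/real-toxicity-prompts", split="train[:100]")
texts = [row["continuation"]["text"] for row in sample]

# The `toxicity` measurement wraps an off-the-shelf hate-speech classifier.
toxicity = evaluate.load("toxicity", module_type="measurement")
scores = toxicity.compute(predictions=texts)["toxicity"]

mean_tox = sum(scores) / len(scores)
flag_rate = sum(s > 0.5 for s in scores) / len(scores)
print(f"mean toxicity = {mean_tox:.3f}, fraction above 0.5 = {flag_rate:.2%}")
```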

Enhancement Strategies for Responsible GenFMs (20 minutes)

This section focuses on the second phase of the 2E strategy: Enhancement. We present strategies to mitigate model risks through interventions at different stages: data-level filtering, prompt-level steering, model fine-tuning (e.g., RLHF), and post-processing (e.g., detoxification, Retrieval-Augmented Generation (RAG)). Participants will experiment with basic mitigation techniques using provided code notebooks and re-evaluate model behavior after intervention.
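To illustrate the simplest of these interventions, the sketch below compares a chat model's output with and without a safety-steering system prompt. The model name, the steering text, and the audit prompt are illustrative placeholders rather than the tutorial's exact notebook materials; it assumes the openai Python SDK v1.x.

```python
# Minimal prompt-level mitigation sketch: query a chat model with and without
# a safety-steering system prompt, then compare the outputs. Assumes the
# openai Python SDK v1.x; model name and prompts are placeholders.
from openai import OpenAI

client = OpenAI()

SAFETY_SYSTEM_PROMPT = (
    "You are a careful assistant. Refuse requests that could enable harm, "
    "avoid stereotypes, and state uncertainty rather than guessing facts."
)

def generate(user_prompt: str, steer: bool = False) -> str:
    """Return the model's reply, optionally prepending the steering prompt."""
    messages = []
    if steer:
        messages.append({"role": "system", "content": SAFETY_SYSTEM_PROMPT})
    messages.append({"role": "user", "content": user_prompt})
    response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    return response.choices[0].message.content

audit_prompt = "Write a short biography of a nurse and a short biography of an engineer."
for steer in (False, True):
    print(f"--- steer={steer} ---")
    print(generate(audit_prompt, steer=steer))
```

Re-running the evaluation tools from the previous section on the steered and unsteered outputs makes the effect of the intervention measurable rather than anecdotal.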

Governance and Policy Perspectives (20 minutes)

Beyond technical fixes, this module explores institutional, regulatory, and industrial mechanisms for ensuring GenFM responsibility. We present recent developments such as the EU AI Act and the NIST AI RMF, as well as efforts from the open-source community (e.g., OpenRAIL, BigScience governance). In addition, we discuss major enterprise-led initiatives, such as OpenAI’s system cards and model usage guidelines, Google’s Responsible AI Principles, Meta’s Llama Responsible Use Guide, and Microsoft’s Responsible AI Standard. These industry frameworks set standards for risk assessment, transparency, and responsible deployment at scale.

Open Challenges and Community Discussion (25 minutes)

In closing, we highlight three pressing open challenges for GenFM trustworthiness. First, as models and their applications continuously evolve, evaluation and mitigation strategies must be adaptive and dynamic to remain effective without sacrificing utility or user experience. Second, alignment techniques can have dual effects—improving safety in some contexts while inadvertently introducing new vulnerabilities or biases—underscoring the need for more nuanced approaches. Third, addressing advanced AI risks requires a combination of technical innovation, interdisciplinary collaboration, and forward-looking governance frameworks to proactively manage emerging threats.

Presenters

Yue Huang

Ph.D. Student, Computer Science and Engineering, University of Notre Dame

Yue Huang is a Ph.D. student in Computer Science and Engineering at the University of Notre Dame. He earned his B.S. in Computer Science from Sichuan University. His research investigates the trustworthiness and social responsibility of foundation models. Yue has published at premier venues including NeurIPS, ICLR, ICML, ACL, EMNLP, NAACL, CVPR, and IJCAI. His work has been highlighted by the U.S. Department of Homeland Security and recognized with the Microsoft Accelerating Foundation Models Research Award, the KAUST AI Rising Star Award (2025), an Industry Mentor recognition for an NSF POSE Award, and best paper awards at DIGBUG@ICML'25 and SciSocLLM@KDD'25. He has delivered invited talks on “Trustworthiness in Large Language Models” and “Socially Responsible Generative Foundation Models” at UIUC, USC, UVA, IBM Research, and other institutions.

Canyu Chen

Ph.D. Student, Northwestern University

Canyu Chen is a Ph.D. student at Northwestern University. He focuses on truthful, safe, and responsible large language models, with applications in social computing and healthcare. He started and leads the initiative "LLMs Meet Misinformation" (https://llm-misinformation.github.io), which aims to combat misinformation in the age of LLMs. He has published in top-tier conferences including ICLR, NeurIPS, EMNLP, EACL, and WWW. He has won multiple awards, including the Sigma Xi Student Research Award (2024), the Didactic Paper Award at the ICBINB workshop at NeurIPS 2023, and the Spotlight Research Award at the AGI Leap Summit 2024. He is a co-organizer of the Workshop on Reasoning and Planning for Large Language Models at ICLR 2025.

Chujie Gao

Incoming Ph.D. Student, Computer Science and Engineering, University of Notre Dame

Chujie Gao is an incoming Ph.D. student in Computer Science and Engineering at the University of Notre Dame. Her research focuses on the trustworthiness and reliability of generative foundation models, with interests spanning the helpful, honest, and harmless (HHH) principles, evaluation frameworks, and applications to downstream tasks. Her work has been published at conferences such as NeurIPS, ICML, ICLR, NAACL, and CIKM.

Lu Cheng

Assistant Professor, Computer Science, University of Illinois Chicago

Lu Cheng is an assistant professor in Computer Science at the University of Illinois Chicago. Her research interests are responsible and reliable AI, causal machine learning, and AI for social good. She is the recipient of the PAKDD Best Paper Award, the Google Research Scholar Award, the Amazon Research Award, the Cisco Research Faculty Award, AAAI New Faculty Highlights, the 2022 INNS Doctoral Dissertation Award (runner-up), the 2021 ASU Engineering Dean's Dissertation Award, the SDM Best Poster Award, the IBM Ph.D. Social Good Fellowship, and the Visa Research Scholarship, among others. She has co-authored two books: "Causal Inference and Machine Learning" (in Chinese) and "Socially Responsible AI: Theories and Practices".

Bhavya Kailkhura

Staff Scientist, Lawrence Livermore National Laboratory

Bhavya Kailkhura is a Staff Scientist and a council member of the Data Science Institute (DSI) at LLNL. He leads efforts on AI safety, efficiency, and their applications to science and national security. His work has earned several awards, including the All-University Doctoral Prize (Syracuse University, 2017), the LLNL Early and Mid Career Recognition Program Award (2024), and best paper awards at ICLR SRML, AAAI CoLoRAI, and other venues. He is an IEEE Senior Member and has served as Associate Editor for ACM JATS (2023) and Frontiers in Big Data and AI (2021). He has held roles such as panelist, program chair, and organizer for workshops and conferences including ICASSP, AAAI, and GlobalSIP.

Nitesh Chawla

Frank M. Freimann Professor, University of Notre Dame

Nitesh Chawla is the Frank M. Freimann Professor of Computer Science and Engineering at the University of Notre Dame and the Founding Director of the Lucy Family Institute for Data and Society. He is an expert in artificial intelligence, data science, and network science. He is the recipient of the 2015 IEEE CIS Outstanding Early Career Award, the IBM Watson Faculty Award, the IBM Big Data and Analytics Faculty Award, and the 1st Source Bank Technology Commercialization Award. He has also been recognized with the Rodney F. Ganey Award and the Michiana 40 Under 40 honor. He is a Fellow of both ACM and IEEE.

Xiangliang Zhang

Leonard C. Bettex Collegiate Professor, University of Notre Dame

Xiangliang Zhang is the Leonard C. Bettex Collegiate Professor in the Department of Computer Science and Engineering at the University of Notre Dame. Her main research interests are in machine learning and data mining. She has published more than 270 refereed papers in leading international conferences and journals. She serves as an associate editor of IEEE Transactions on Dependable and Secure Computing, Information Sciences, and the International Journal of Intelligent Systems, and regularly serves as area chair or on the (senior) program committee of IJCAI, SIGKDD, NeurIPS, AAAI, ICML, and WSDM.