Academic Paper

Sociotechnical Safety Evaluation of Generative AI Systems

Document Type

Working Paper

Author

Weidinger, Laura; Rauh, Maribeth; Marchal, Nahema; Manzini, Arianna; Hendricks, Lisa Anne; Mateos-Garcia, Juan; Bergman, Stevie; Kay, Jackie; Griffin, Conor; Bariach, Ben; Gabriel, Iason; Rieser, Verena; Isaac, William

Source

Subject

Computer Science - Artificial Intelligence
Computer Science - Computation and Language
Computer Science - Computers and Society

Language

Abstract

Generative AI systems produce a range of risks. To ensure the safety of generative AI systems, these risks must be evaluated. In this paper, we make two main contributions toward establishing such evaluations. First, we propose a three-layered framework that takes a structured, sociotechnical approach to evaluating these risks. This framework encompasses capability evaluations, which are the main current approach to safety evaluation. It then reaches further by building on system safety principles, particularly the insight that context determines whether a given capability may cause harm. To account for relevant context, our framework adds human interaction and systemic impacts as additional layers of evaluation. Second, we survey the current state of safety evaluation of generative AI systems and create a repository of existing evaluations. Three salient evaluation gaps emerge from this analysis. We propose ways forward to closing these gaps, outlining practical steps as well as roles and responsibilities for different actors. Sociotechnical safety evaluation is a tractable approach to the robust and comprehensive safety evaluation of generative AI systems.
Comment: main paper p.1-29, 5 figures, 2 tables

Online Access

Open Access (arXiv)
