Academic Paper

Diffusion Explainer: Visual Explanation for Text-to-image Stable Diffusion

Document Type

Working Paper

Author

Lee, Seongmin; Hoover, Benjamin; Strobelt, Hendrik; Wang, Zijie J.; Peng, ShengYun; Wright, Austin; Li, Kevin; Park, Haekyu; Yang, Haoyang; Chau, Duen Horng

Source

Subject

Computer Science - Computation and Language
Computer Science - Artificial Intelligence
Computer Science - Human-Computer Interaction
Computer Science - Machine Learning

Language

Abstract

Diffusion-based generative models' impressive ability to create convincing images has garnered global attention. However, their complex structures and operations often pose challenges for non-experts to grasp. We present Diffusion Explainer, the first interactive visualization tool that explains how Stable Diffusion transforms text prompts into images. Diffusion Explainer tightly integrates a visual overview of Stable Diffusion's complex structure with explanations of the underlying operations. By comparing image generation of prompt variants, users can discover the impact of keyword changes on image generation. A 56-participant user study demonstrates that Diffusion Explainer offers substantial learning benefits to non-experts. Our tool has been used by over 10,300 users from 124 countries at https://poloclub.github.io/diffusion-explainer/.
Comment: 5 pages, 7 figures

Online Access

Open Access (arXiv)
