Academic Paper

Parsing and Summarizing Infographics with Synthetically Trained Icon Detection
Document Type
Conference
Source
2021 IEEE 14th Pacific Visualization Symposium (PacificVis), pp. 31-40, Apr. 2021
Subject
Computing and Processing
Training
Visualization
Semantics
Layout
Detectors
Object detection
Tools
Human-centered computing
Visualization systems and tools
Visualization toolkits
Computing methodologies
Artificial intelligence
Computer vision
Computer vision problems
Object detection
Language
ISSN
2165-8773
Abstract
Widely used in news, business, and educational media, infographics are handcrafted to effectively communicate messages about complex and often abstract topics, including 'ways to conserve the environment' and 'coronavirus prevention'. The computational understanding of infographics required for future applications like automatic captioning, summarization, search, and question-answering will depend on being able to parse the visual and textual elements contained within. However, being composed of stylistically and semantically diverse visual and textual elements, infographics pose challenges for current AI systems. While automatic text extraction works reasonably well on infographics, standard object detection algorithms fail to identify the stand-alone visual elements in infographics that we refer to as 'icons'. In this paper, we propose a novel approach to train an object detector using synthetically generated data, and show that it succeeds at generalizing to detecting icons within in-the-wild infographics. We further pair our icon detection approach with an icon classifier and a state-of-the-art text detector to demonstrate three demo applications: topic prediction, multi-modal summarization, and multi-modal search. Parsing the visual and textual elements within infographics provides us with the first steps towards automatic infographic understanding.