Academic Paper

Connecting NeRFs, Images, and Text

Document Type

Working Paper

Author

Ballerini, Francesco; Ramirez, Pierluigi Zama; Mirabella, Roberto; Salti, Samuele; Di Stefano, Luigi

Source

Subject

Computer Science - Computer Vision and Pattern Recognition

Language

Abstract

Neural Radiance Fields (NeRFs) have emerged as a standard framework for representing 3D scenes and objects, introducing a novel data type for information exchange and storage. Concurrently, significant progress has been made in multimodal representation learning for text and image data. This paper explores a novel research direction that aims to connect the NeRF modality with other modalities, similar to established methodologies for images and text. To this end, we propose a simple framework that exploits pre-trained models for NeRF representations alongside multimodal models for text and image processing. Our framework learns a bidirectional mapping between NeRF embeddings and those obtained from corresponding images and text. This mapping unlocks several novel and useful applications, including NeRF zero-shot classification and NeRF retrieval from images or text.
Comment: Accepted at CVPRW-INRV 2024
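
The zero-shot classification and retrieval uses named in the abstract can be illustrated with a minimal sketch. It assumes that a NeRF embedding has already been mapped into the same space as the text or image embeddings, and that matching is done by cosine similarity. The function names, the fixed linear map `W` (a stand-in for the learned mapping, not the authors' model), and the toy vectors are all hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def apply_linear_map(weights, x):
    """Hypothetical stand-in for the learned NeRF-to-text/image mapping: y = W x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def zero_shot_classify(nerf_embedding, weights, label_embeddings):
    """Pick the label whose embedding is most similar to the mapped NeRF embedding."""
    mapped = apply_linear_map(weights, nerf_embedding)
    scores = {label: cosine(mapped, emb) for label, emb in label_embeddings.items()}
    return max(scores, key=scores.get), scores


def retrieve(query_embedding, gallery):
    """Rank gallery item names by similarity to a query embedding, best first."""
    return sorted(gallery, key=lambda name: cosine(query_embedding, gallery[name]),
                  reverse=True)


# Toy data (assumed): a 3-d "NeRF" embedding mapped into a 2-d shared space.
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 1.0]]
label_embeddings = {"chair": [1.0, 0.0], "table": [0.0, 1.0]}
nerf_embedding = [0.2, 0.9, 0.7]

predicted, scores = zero_shot_classify(nerf_embedding, W, label_embeddings)
ranking = retrieve([0.9, 0.1], {"nerf_a": [1.0, 0.0], "nerf_b": [0.0, 1.0]})
```

In this sketch the same similarity routine serves both uses: classification compares one mapped NeRF embedding against label embeddings, while retrieval ranks a gallery of embeddings against an image or text query.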

Online Access

Open Access (arXiv)
