Academic Paper

An architecture for XML information retrieval in a peer-to-peer environment
Document Type
Conference
Source
Proceedings of the ACM first Ph.D. workshop in CIKM, pp. 17-24
Subject
XML information retrieval
XML-retrieval
architecture
content-based XML retrieval
distributed search
information retrieval
peer-to-peer
search engine
Language
English
Abstract
XML has become a widely accepted standard for modelling, storing, and exchanging structured documents. Exploiting document structure can notably improve retrieval performance for XML documents. A growing number of these documents are stored in Peer-to-Peer networks, which are promising self-organizing infrastructures. Documents are distributed over the Peer-to-Peer network either by being stored locally on individual peers or by being assigned to collections such as Digital Libraries. Current search methods for XML documents in Peer-to-Peer networks do not use Information Retrieval techniques for vague queries and relevance detection. Our work aims to develop a search engine for XML documents in which Information Retrieval methods are enhanced by structural information. Documents and the global index are distributed over a Peer-to-Peer network, forming a virtually unlimited storage space. In this paper, a conceptual architecture for XML Information Retrieval in Peer-to-Peer networks is proposed. Based on this general architecture, a component-structured architecture for a concrete search engine is presented, which uses an extension of the Vector Space Model to compute relevance for dynamic XML documents.
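
The abstract does not say how the Vector Space Model is extended, so the following is only an illustrative sketch of one way structural information could influence a Vector Space Model score. It assumes that terms are weighted by the element they occur in and that relevance is the cosine similarity between query and document vectors. The element names, the weights in ELEMENT_WEIGHTS, and the function names are hypothetical and are not taken from the paper.

```python
import math
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical per-element weights; the abstract does not specify any.
ELEMENT_WEIGHTS = {"title": 2.0, "abstract": 1.5}
DEFAULT_WEIGHT = 1.0


def weighted_term_vector(xml_text):
    """Build a term vector in which each term occurrence is scaled by the
    weight of its enclosing element (assumed weighting scheme).

    Only text directly inside an element is counted; text that follows a
    child element (its "tail") is ignored to keep the sketch short.
    """
    root = ET.fromstring(xml_text)
    vector = Counter()
    for element in root.iter():
        weight = ELEMENT_WEIGHTS.get(element.tag, DEFAULT_WEIGHT)
        for term in (element.text or "").lower().split():
            vector[term] += weight
    return vector


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, documents):
    """Return (score, name) pairs sorted from most to least relevant."""
    query_vector = Counter(query.lower().split())
    scored = [
        (cosine_similarity(query_vector, weighted_term_vector(xml_text)), name)
        for name, xml_text in documents.items()
    ]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    docs = {
        "a": "<doc><title>xml retrieval</title><body>peer networks</body></doc>",
        "b": "<doc><title>peer networks</title><body>xml retrieval</body></doc>",
    }
    for score, name in rank("xml retrieval", docs):
        print(name, round(score, 3))
```

Under these assumed weights, a document whose title contains the query terms scores higher than one that mentions them only in the body. A distributed variant would also need the global index to be partitioned across peers, which this single-process sketch does not attempt.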