Academic Paper

Extracting templates from Web pages
Document Type
Conference
Source
2013 International Conference on Green Computing, Communication and Conservation of Energy (ICGCE), pp. 788-791, Dec. 2013
Subject
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Power, Energy and Industry Applications
Signal Processing and Analysis
Decision support systems
Erbium
Handheld computers
Document Object Model
Minimum Description Length
Template Extraction
VIPS
Language
Abstract
In today's world, the World Wide Web is the most popular information provider. A website is a collection of web pages, and web pages usually include information for users. Websites are designed with common templates and content. Templates let users access content easily through consistent structures, even though the templates are not explicitly announced. Current template extraction techniques degrade the performance of web applications such as search engines because of irrelevant terms in templates. Hence, we present a new method for detecting and extracting templates from web pages automatically by identifying the relevant information.