Academic Paper

Historical corpus annotation
Document Type
Chapter
Author
Source
Quantitative Historical Linguistics : A Corpus Framework, 2017, ill.
Subject
annotation
historical corpora
metadata
XML
spelling normalization
morphological
syntactic
semantic
treebank
annotation scheme
LatinISE
Historical and Diachronic Linguistics
Language
English
Abstract
Chapter 4 explains the concept and process of annotation for historical corpora from theoretical, practical, and technical points of view, and discusses the challenges presented by historical texts. We introduce basic terminology for XML technologies and corpus metadata, describe the different levels of linguistic annotation, from spelling normalization to morphological, syntactic, and semantic analysis, and briefly present the state of the art for historical corpora and treebanks. We cover annotation schemes and standards and illustrate the main concepts in corpus annotation with an example from LatinISE, a large annotated Latin corpus.
