Academic Article

Design & Research of Legal Affairs Information Service Platform Based on UIMA and Semantics
Document Type
Article
Text
Source
International Journal of Future Generation Communication and Networking, 03/30/2016, Vol. 9, Issue 3, p. 1-14
Subject
Legal affairs
UIMA
Big data
Chinese word segmentation
Language
English (ENG)
ISSN
2233-7857
Abstract
The law is the most powerful weapon for safeguarding national stability and ensuring that all undertakings flourish, as well as an instrument for protecting the rights and interests of the public, so the just and accurate application of legal provisions is of crucial importance. As the legal industry becomes more informatized, more and more legal services are delivered in digital form, and the resulting business data contains a large amount of unstructured information. Rapidly extracting valuable content from this massive unstructured data and putting it to use is a problem the legal industry must solve in information management. Based on an analysis of the problems in existing informatization systems for laws and regulations, this paper proposes a legal affairs information service platform based on cloud computing, UIMA, semantics, big data and Chinese word segmentation. It also proposes a four-layer technical framework built on four designs: a method for integrating and managing heterogeneous, multi-source unstructured data; a method for analyzing and processing unstructured data; a semantics-based method for retrieving unstructured information; and a method for constructing and maintaining the ontology library. The paper describes in detail how big data and cloud computing are combined and applied in the platform through the design of UFS (a distributed file system), MapReduce (a batch processing technology) and BigTable (a distributed database). Data acquisition and extended data analysis are carried out with the extensible UIMA framework, and Lucene indexing technology is used to index the data content and the analysis results in sequence.
For information retrieval, the concept of ontology is introduced into the traditional search model, and a new search model based on domain ontology is proposed. IKAnalyzer 3.x is adopted for Chinese word segmentation. With this information service platform, legal affairs enterprises can effectively integrate structured and unstructured information resources and implement storage, analysis, retrieval and decision-making applications for business data content.
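
The abstract does not specify how the ontology-based search model works. As a rough illustration only, the sketch below shows one common way a domain ontology can extend keyword search: query terms are expanded with related concepts before an inverted index is consulted. The ontology entries, sample documents and function names are all hypothetical, and a plain Python dictionary stands in for Lucene and IKAnalyzer; this is not the paper's implementation.

```python
from collections import defaultdict

# Hypothetical toy ontology: each legal concept maps to related terms.
ONTOLOGY = {
    "contract": {"agreement", "lease"},
    "tort": {"negligence", "liability"},
}

# Hypothetical sample documents, keyed by document id.
DOCUMENTS = {
    1: "the lease was terminated early",
    2: "negligence claims against the employer",
    3: "a contract dispute over payment",
}


def build_index(docs):
    """Map each whitespace-separated token to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def expand_query(terms, ontology):
    """Return the original query terms plus any ontology-related terms."""
    expanded = set()
    for term in terms:
        expanded.add(term)
        expanded |= ontology.get(term, set())
    return expanded


def search(query, index, ontology):
    """Return sorted ids of documents matching any expanded query term."""
    terms = expand_query(query.lower().split(), ontology)
    hits = set()
    for term in terms:
        hits |= index.get(term, set())
    return sorted(hits)


index = build_index(DOCUMENTS)
print(search("contract", index, ONTOLOGY))
```

In this toy setup, a plain keyword lookup for "contract" would match only document 3, while the expanded query also reaches document 1 through the related term "lease".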