Academic Paper

Iterative Self-Supervised Learning for Legal Similar Case Retrieval

Document Type

Periodical

Author

Source

IEEE Access. 12:17231-17241, 2024

Subject

Aerospace
Bioengineering
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Engineered Materials, Dielectrics and Plasmas
Engineering Profession
Fields, Waves and Electromagnetics
General Topics for Engineers
Geoscience
Nuclear Engineering
Photonics and Electrooptics
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Transportation
Law
Iterative methods
Training
Self-supervised learning
Computational modeling
Task analysis
Data models
Information systems
Legal information retrieval
similar case retrieval
iterative training
self-supervised learning

Language

ISSN

2169-3536

Abstract

Legal artificial intelligence (AI) has drawn attention for its precision and efficiency, especially in similar case retrieval, where identifying the cases pertinent to a given query is essential. Unlike traditional text retrieval, this task depends on high-quality annotated datasets for effective model training. Long queries and candidate documents, together with varied interpretations of similarity, add to its difficulty. This study introduces a training approach that combines dense and sparse retrieval methods. A sparse retrieval model first extracts unlabeled data from a large collection of legal cases. A dense retrieval model then screens this data and merges it with labeled data to create pseudo-labeled data, and training repeats iteratively until convergence. On the Chinese law retrieval task dataset, the approach achieves a 3.66% precision enhancement and a 3.62% improvement in mean average precision (MAP). However, the dataset’s imbalance across different charges poses a challenge and may affect retrieval performance for long-tailed legal cases. Nonetheless, these results indicate faster and more efficient retrieval of similar cases for legal professionals, and they provide high-quality references for people without legal expertise.
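
The iterative loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the token-overlap retriever stands in for a sparse model such as BM25, the token-weight "model" stands in for a trained dense retriever, and the names `sparse_retrieve`, `train_dense`, `dense_score`, `iterative_self_training`, the parameters `k`, `threshold`, `max_rounds`, and the stopping rule (the pseudo-labeled set stops changing) are all assumptions made for illustration.

```python
from collections import Counter


def tokens(text):
    return text.lower().split()


def sparse_retrieve(query, corpus, k):
    """Toy sparse retriever: rank by raw token overlap (stand-in for e.g. BM25)."""
    q = Counter(tokens(query))
    scored = [(sum((q & Counter(tokens(d))).values()), d) for d in corpus]
    scored.sort(key=lambda s: -s[0])  # stable sort keeps corpus order on ties
    return [d for s, d in scored[:k] if s > 0]


def train_dense(pairs):
    """Toy stand-in for dense-model training (assumption): weight each token
    by how many (query, similar case) pairs share it."""
    weights = Counter()
    for q, d in pairs:
        weights.update(set(tokens(q)) & set(tokens(d)))
    return weights


def dense_score(model, query, doc):
    """Fraction of the learned weight mass carried by tokens shared by the pair."""
    shared = set(tokens(query)) & set(tokens(doc))
    total = sum(model.values()) or 1
    return sum(model[t] for t in shared) / total


def iterative_self_training(labeled, queries, corpus, k=3, threshold=0.5, max_rounds=10):
    """Alternate between pseudo-labeling and retraining until the
    pseudo-labeled set stops changing (hypothetical convergence test)."""
    pseudo = set()
    model = train_dense(labeled)
    for _ in range(max_rounds):
        new_pseudo = set()
        for q in queries:
            # Step 1: the sparse model proposes unlabeled candidates.
            for d in sparse_retrieve(q, corpus, k):
                # Step 2: the current dense model screens the candidates.
                if dense_score(model, q, d) >= threshold:
                    new_pseudo.add((q, d))
        if new_pseudo == pseudo:
            break
        pseudo = new_pseudo
        # Step 3: retrain on labeled plus pseudo-labeled pairs.
        model = train_dense(list(labeled) + sorted(pseudo))
    return model, pseudo


labeled = [("theft of a vehicle at night", "defendant committed theft of a vehicle")]
queries = ["theft of a bicycle"]
corpus = [
    "defendant committed theft of a bicycle",
    "contract dispute over rent",
    "assault in a bar",
]
model, pseudo = iterative_self_training(labeled, queries, corpus)
print(pseudo)
```

In this toy run the sparse step proposes two candidates for the query, and only the bicycle-theft case passes the dense screening threshold, so it is the only pair added as pseudo-labeled data before the loop stops.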
