Academic Paper

Towards a (Semi-)Automatic Urban Planning Rule Identification in the French Language
Document Type
Conference
Source
2023 IEEE 10th International Conference on Data Science and Advanced Analytics (DSAA), pp. 1-10, Oct. 2023
Subject
Communication, Networking and Broadcast Technologies
Computing and Processing
Signal Processing and Analysis
Satellites
Urban planning
Time series analysis
Natural languages
Data science
Satellite images
Labeling
Natural Language Processing
supervised learning
data augmentation
Language
Abstract
One of the objectives of the Hérelles project is to find new mechanisms that facilitate the labeling (or semantization) of clusters from time series of satellite images. One proposed solution is to associate textual elements of interest with the satellite data. The first step in this process is the automatic extraction of information, in the form of rules, from urban planning documents written in French. To address this challenge, we propose a method based on the multi-label classification of textual segments. It includes a special format for representing segments, in which each segment has a title and a subtitle. In addition, we propose a cascade approach to deal with the hierarchy of class labels. Finally, we develop several text augmentation techniques for French texts, which improve the prediction results. We demonstrate experimentally that the resulting framework classifies each type of segment with more than 90% accuracy.