Academic Paper

Mining the Highway-Rail Grade Crossing Crash Data: A Text Mining Approach
Document Type
Conference
Source
2019 18th IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 1063-1068, Dec. 2019
Subject
Computing and Processing
Engineering Profession
Robotics and Control Systems
Signal Processing and Analysis
Computer crashes
Vehicle crash testing
Text mining
Safety
Automobiles
Road transportation
Feature extraction
Railroad safety, Highway-rail grade crossing, Text mining, Similarity, Random Forest, Logistic Regression
Language
Abstract
Railroad traffic safety is a major worldwide concern. Since 2000, there have been 48,083 crashes at highway-rail grade crossings in the US, resulting in 6,103 fatalities, 18,851 injuries, and $302,065,336 in vehicle damage. The Federal Railroad Administration (FRA) seeks to improve safety and consolidate grade crossings in risky areas. Toward the goal of better highway-rail crossing safety, this paper explores the reasons behind highway-rail crossing crashes by text mining the narrative information in the crash data. The similarity of the extracted crash reasons across the 50 states and Washington, DC is also calculated to create a comprehensive document for Departments of Transportation (DOTs) and railroad agencies. Important features are extracted from the FRA crash data, along with two new features (geographical region and weight of the narrative crash data), to classify train-vehicle crashes as "train struck car" or "car struck train". The weight of the narrative field was calculated using text mining techniques. Random Forest and Logistic Regression machine learning algorithms were applied to build prediction models for the classification task. Experimental results indicate an overall accuracy of 0.86 and 0.74 for the proposed Random Forest and Logistic Regression models, respectively.
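The abstract does not say which similarity measure or term weighting the authors used. As a rough illustration of how crash narratives from different states could be compared, the sketch below assumes TF-IDF term weights and cosine similarity, implemented with the Python standard library only. The narratives, the state keys, and the helper names (`tokenize`, `tfidf_vectors`, `cosine`) are hypothetical and made up for this example; they are not FRA records and not the paper's method.

```python
import math
from collections import Counter

# Hypothetical toy narratives keyed by state; these are NOT real FRA records.
narratives = {
    "TX": "driver drove around lowered gates and was struck by train",
    "OK": "driver drove around gates and vehicle was struck by train",
    "OH": "vehicle skidded on ice and struck the side of the train",
}

def tokenize(text):
    """Lowercase and split on whitespace (an assumed, minimal tokenizer)."""
    return text.lower().split()

def tfidf_vectors(docs):
    """Return {name: {term: weight}} using term frequency times a smoothed IDF."""
    tokens = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(tokens)
    # Document frequency: number of documents each term appears in.
    df = Counter(term for toks in tokens.values() for term in set(toks))
    vectors = {}
    for name, toks in tokens.items():
        tf = Counter(toks)
        vectors[name] = {
            term: (count / len(toks))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        }
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

vectors = tfidf_vectors(narratives)
sim_tx_ok = cosine(vectors["TX"], vectors["OK"])
sim_tx_oh = cosine(vectors["TX"], vectors["OH"])
print(f"TX-OK: {sim_tx_ok:.3f}  TX-OH: {sim_tx_oh:.3f}")
```

In this toy data the two gate-violation narratives share most of their terms, so their similarity comes out higher than that of either against the skid narrative. A real analysis would aggregate many narratives per state and would likely need stop-word removal and stemming before weighting.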