Academic article

From data to insight: Exploring contaminants in different food groups with literature mining and machine learning techniques
Document Type
article
Source
Current Research in Food Science, Vol 7, 100557 (2023)
Subject
Food-chain contaminants
Food toxicants
Literature review
Contaminant data
Exposome
Unavoidable exposure
Nutrition. Foods and food supply
TX341-641
Food processing and manufacture
TP368-456
Language
English
ISSN
2665-9271
Abstract
Food remains a major source of human exposure to chemical contaminants that are unintentionally present in commodities worldwide, despite strict regulation. The scientific literature is a valuable source of quantification data on these contaminants in various foods, but summarizing the information manually is not practicable. In this review, literature mining and machine learning techniques were applied to 72 foods to obtain relevant information on 96 contaminants, including heavy metals, polychlorinated biphenyls, dioxins, furans, polycyclic aromatic hydrocarbons (PAHs), pesticides, mycotoxins, and heterocyclic aromatic amines (HAAs). The 11,723 data points collected from 254 papers published over the last two decades were then used to identify patterns of contaminant distribution. By contaminant category, metals were the most studied globally, followed by PAHs, mycotoxins, pesticides, and HAAs. By geographical region, the distribution was uneven: Europe and Asia had the highest number of studies, followed by North and South America, Africa, and Oceania. By food group, metals were found in all 12 groups and PAHs in seven. Mycotoxins were found in six groups, and pesticides in all groups except meat, eggs, and vegetable oils. HAAs appeared in only three food groups, with fish and seafood reporting the highest levels. Median contaminant concentrations varied across food groups, with citrinin having the highest median value. The information gathered can be used to explore diverse datasets, establish connections between them, and identify patterns, with the aim of building a comprehensive view of food contamination.