Academic Paper

Robust statistical tools for identifying multiple stellar populations in globular clusters in the presence of measurement errors. A case study: NGC 2808
Document Type
Working Paper
Source
A&A 658, A141 (2022)
Subject
Astrophysics - Astrophysics of Galaxies
Astrophysics - Solar and Stellar Astrophysics
Language
Abstract
The presence of multiple stellar populations (MP), defined by patterns in the stellar element abundances, is nowadays considered a distinctive feature of globular clusters. However, while data availability and quality have improved in recent decades, the same is not always true of the techniques adopted for their analysis, which raises problems of objectivity of the claims and of reproducibility. Using NGC 2808 as a test case, we show the use of well-established statistical clustering methods. We focus the analysis on the red giant branch (RGB) phase, for which two data sets are available from recent literature, one from low-resolution and one from high-resolution spectroscopy. We adopt both hierarchical clustering and partition methods, and we explicitly address the usually neglected problem of measurement errors. The results of the clustering algorithms were subjected to silhouette width analysis to compare the performance of splits into different numbers of MP. For both data sets the results are at odds with those reported in the literature: two MP are detected in each, while the literature reports five and four MP from high- and low-resolution spectroscopy, respectively. The silhouette analysis suggests that the population sub-structure is reliable for the high-resolution spectroscopy data, while the actual existence of MP is questionable for the low-resolution spectroscopy data. The discrepancy with literature claims can be explained by the difference in the methods adopted for MP characterisation. By means of Monte Carlo simulations and multimodality statistical tests, we show that the often adopted approach of inspecting the histogram of the differences in some key elements is prone to multiple false-positive findings. The adoption of statistically grounded methods, which use all the available information to subset the data and explicitly address the problem of data uncertainty, is of paramount importance for producing more robust and reproducible research.
Comment: Matches the A&A accepted version. Acknowledgments fixed.
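The silhouette width comparison mentioned in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the data are synthetic two-dimensional "abundances", the group labels are assigned by hand rather than by a clustering algorithm, the four-group split is a deliberately artificial over-split, and the Gaussian perturbation with `sigma_err` is only one simple way to mimic measurement errors. The function `silhouette_width` implements the standard average silhouette width with Euclidean distances.

```python
import math
import random


def silhouette_width(points, labels):
    """Average silhouette width of a labelling, using Euclidean distance."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("at least two groups are required")

    widths = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            widths.append(0.0)  # usual convention for singleton groups
            continue
        # a: mean distance to the other members of the same group
        a = sum(math.dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other group
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        scale = max(a, b)
        widths.append((b - a) / scale if scale > 0 else 0.0)
    return sum(widths) / len(widths)


# Synthetic data (assumption): two groups of 40 stars in a 2D abundance plane.
rng = random.Random(42)
centres = [(0.0, 0.4), (0.5, -0.2)]
stars, two_groups = [], []
for group, (cx, cy) in enumerate(centres):
    for _ in range(40):
        stars.append((rng.gauss(cx, 0.08), rng.gauss(cy, 0.08)))
        two_groups.append(group)

# Artificial over-split (assumption): cut each group at its centre along the first axis.
four_groups = [
    2 * g + (1 if x > centres[g][0] else 0)
    for (x, _), g in zip(stars, two_groups)
]

sil_two = silhouette_width(stars, two_groups)
sil_four = silhouette_width(stars, four_groups)

# Monte Carlo perturbation (assumption): add Gaussian noise of size sigma_err
# to every star and average the silhouette width over the replicates.
sigma_err = 0.05
replicates = []
for _ in range(100):
    noisy = [(x + rng.gauss(0, sigma_err), y + rng.gauss(0, sigma_err)) for x, y in stars]
    replicates.append(silhouette_width(noisy, two_groups))
sil_two_noisy = sum(replicates) / len(replicates)

print(f"two groups:            {sil_two:.3f}")
print(f"four groups:           {sil_four:.3f}")
print(f"two groups, perturbed: {sil_two_noisy:.3f}")
```

A larger average silhouette width indicates a split whose groups are more compact and better separated, which is the sense in which splits into different numbers of groups can be compared. In a real analysis the labels would come from the clustering algorithms themselves and the perturbation size from the quoted measurement uncertainties.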