Academic Paper

Constructing Temporal Networks of OSS Programming Language Ecosystems
Document Type
Conference
Source
2023 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), pp. 663-667, Mar 2023
Subject
Computing and Processing
Computer languages
Codes
Social networking (online)
Ecosystems
Pipelines
Collaboration
Reproducibility of results
ISSN
2640-7574
Abstract
One of the primary factors that encourage developers to contribute to open source software (OSS) projects is the collaborative nature of OSS development. However, the collaborative structure of these communities largely remains unclear, partly due to the enormous scale of data to be gathered, processed, and analyzed. In this work, we utilize the World of Code dataset, which contains commit activity data for millions of OSS projects, to build collaboration networks for ten popular programming language ecosystems, containing in total over 290M commits across over 18M projects. We build a collaboration graph representation for each language ecosystem, having authors and projects as nodes, which enables various forms of social network analysis on the scale of language ecosystems. Moreover, we capture the information on the ecosystems’ evolution by slicing each network into 30 historical snapshots. Additionally, we calculate multiple collaboration metrics that characterise the ecosystems’ states. We make the resulting dataset publicly available, including the constructed graphs and the pipeline enabling the analysis of more ecosystems.
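
The graph representation described in the abstract (authors and projects as nodes, sliced into historical snapshots) might be sketched roughly as follows. This is a minimal illustration, not the paper's pipeline. The `(author, project, timestamp)` record layout, the function names, the equal-width time cutoffs, and the choice of cumulative snapshots are all assumptions made for this example; the abstract does not specify how the 30 snapshots are delimited.

```python
from collections import defaultdict


def build_bipartite(commits):
    """Map each author to the set of projects they committed to.

    `commits` is an iterable of (author, project, timestamp) tuples;
    this record layout is hypothetical, not the World of Code schema.
    """
    graph = defaultdict(set)
    for author, project, _timestamp in commits:
        graph[author].add(project)
    return dict(graph)


def build_snapshots(commits, n):
    """Return n cumulative author-project graphs (n >= 1).

    Assumption: cutoffs are equally spaced over the observed time range,
    and snapshot k contains every commit up to cutoff k.
    """
    commits = list(commits)
    times = [ts for _, _, ts in commits]
    start, end = min(times), max(times)
    step = (end - start) / n
    snapshots = []
    for k in range(1, n + 1):
        # Use the exact end time for the last cutoff to avoid float drift.
        cutoff = end if k == n else start + k * step
        snapshots.append(build_bipartite(c for c in commits if c[2] <= cutoff))
    return snapshots


def coauthor_pairs(graph):
    """Project the bipartite graph onto authors.

    Two authors are linked if they share at least one project.
    """
    authors_by_project = defaultdict(set)
    for author, projects in graph.items():
        for project in projects:
            authors_by_project[project].add(author)
    pairs = set()
    for authors in authors_by_project.values():
        ordered = sorted(authors)
        for i, first in enumerate(ordered):
            for second in ordered[i + 1:]:
                pairs.add((first, second))
    return pairs
```

Calling `build_snapshots(commits, 30)` would yield one graph per snapshot, and a simple metric such as `len(coauthor_pairs(snapshot))` could then be tracked over time. At the scale reported in the abstract, an in-memory approach like this would likely need to be replaced by something streaming or distributed.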