Paper published in a book (Scientific congresses and symposiums)
A Dataset of Bot and Human Activities in GitHub
Chidambaram, Natarajan; Decan, Alexandre; Mens, Tom
2023. In Proceedings of the 20th International Conference on Mining Software Repositories (MSR 2023)
Peer reviewed
 

Files


Full Text
paper_rendered.pdf
Author preprint (186.44 kB), Creative Commons License - Attribution

Details



Keywords :
software development; bot activity; dataset; GitHub event stream; empirical analysis
Abstract :
Software repositories hosted on GitHub frequently use development bots to automate repetitive, effort-intensive and error-prone tasks. To understand and study how these bots are used, state-of-the-art bot identification tools have been developed to detect bots based on their comments in commits, issues and pull requests. Given that bots can be involved in many other activity types, there is a need to consider more of the activities they carry out in the software repositories they are involved in. We therefore propose a curated dataset of such activities carried out by bots and humans involved in GitHub repositories. The dataset was constructed by identifying 24 high-level activity types that could be extracted from 15 lower-level event types, which were queried from GitHub's event stream API for all considered bots and humans. The proposed dataset contains around 834K activities performed by 408 bots and 655 humans involved in GitHub repositories, during an observation period ranging from 25 November 2022 to 9 March 2023. By enabling analysis of the activity patterns of bots and humans, this dataset could lead to better bot identification tools and empirical studies on how bots play a role in collaborative software development.
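The abstract's idea of deriving high-level activities from lower-level events can be illustrated with a minimal Python sketch. It assumes events shaped like the records returned by GitHub's events API (a `type`, an `actor` with a `login`, and a `payload`). The function names and the activity labels are hypothetical and chosen for illustration; they are not the 24 activity types defined in the paper, and the mapping covers only a few event types.

```python
from collections import Counter

# Hypothetical labels for illustration only; NOT the paper's 24 activity types.
ISSUE_ACTIONS = {
    "opened": "Opening issue",
    "closed": "Closing issue",
    "reopened": "Reopening issue",
}
CREATE_REF_TYPES = {
    "repository": "Creating repository",
    "branch": "Creating branch",
    "tag": "Creating tag",
}


def to_activity(event):
    """Map one low-level event (a dict) to a high-level activity label, or None."""
    event_type = event.get("type")
    payload = event.get("payload", {})
    if event_type == "IssuesEvent":
        # One event type yields several activities, depending on the action.
        return ISSUE_ACTIONS.get(payload.get("action"))
    if event_type == "CreateEvent":
        # The created reference type decides the activity.
        return CREATE_REF_TYPES.get(payload.get("ref_type"))
    if event_type == "PushEvent":
        return "Pushing commits"
    return None  # event types outside this sketch are ignored


def count_activities(events):
    """Count (contributor login, activity label) pairs over a list of events."""
    counts = Counter()
    for event in events:
        activity = to_activity(event)
        if activity is not None:
            counts[(event["actor"]["login"], activity)] += 1
    return counts
```

Given a list of such event dicts, `count_activities(events)` returns a `Counter` keyed by contributor and activity, which is one simple way to compare the activity patterns of bots and humans.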
Disciplines :
Computer science
Author, co-author :
Chidambaram, Natarajan  ;  Université de Mons - UMONS
Decan, Alexandre  ;  Université de Mons - UMONS > Faculté des Sciences > Service de Génie Logiciel
Mens, Tom  ;  Université de Mons - UMONS > Faculté des Sciences > Service de Génie Logiciel
Language :
English
Title :
A Dataset of Bot and Human Activities in GitHub
Publication date :
15 May 2023
Event name :
International Conference on Mining Software Repositories (MSR)
Event date :
15-16 May 2023
Audience :
International
Main work title :
Proceedings of the 20th International Conference on Mining Software Repositories (MSR 2023)
Publisher :
IEEE
Peer reviewed :
Peer reviewed
Research unit :
S852 - Génie Logiciel
Research institute :
Infortech
R300 - Institut de Recherche en Technologies de l'Information et Sciences de l'Informatique
R150 - Institut de Recherche sur les Systèmes Complexes
Funders :
DigitalWallonia4.AI research project ARIAC
F.R.S.-FNRS - Fonds de la Recherche Scientifique [BE]
Funding number :
2010235; F.4515.23; O.0157.18F-RG43; T.0149.22
Funding text :
This work is supported by DigitalWallonia4.AI research project ARIAC (grant number 2010235), as well as by the Fonds de la Recherche Scientifique – FNRS under grant numbers F.4515.23, O.0157.18F-RG43 and T.0149.22.
Available on ORBi UMONS :
since 16 March 2023

Statistics


Number of views
27 (15 by UMONS)
Number of downloads
5 (5 by UMONS)

Scopus citations®
2
Scopus citations® without self-citations
1
OpenCitations
0
