TY - GEN
T1 - Coding together at scale: GitHub as a collaborative social network
AU - Lima, Antonio
AU - Rossi, Luca
AU - Musolesi, Mirco
N1 - Publisher Copyright:
Copyright © 2014, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
PY - 2014/6
Y1 - 2014/6
N2 - GitHub is the most popular repository for open source code (Finley 2011). It has more than 3.5 million users, as the company declared in April 2013, and more than 10 million repositories, as of December 2013. It has a publicly accessible API and, since March 2012, it also publishes a stream of all the events occurring on public projects. Interactions among GitHub users are of a complex nature and take place in different forms. Developers create and fork repositories, push code, approve code pushed by others, bookmark their favorite projects and follow other developers to keep track of their activities. In this paper we present a characterization of GitHub, as both a social network and a collaborative platform. To the best of our knowledge, this is the first quantitative study about the interactions happening on GitHub. We analyze the logs from the service over 18 months (between March 11, 2012 and September 11, 2013), describing 183.54 million events and we obtain information about 2.19 million users and 5.68 million repositories, both growing linearly in time. We show that the distributions of the number of contributors per project, watchers per project and followers per user show a power-law-like shape. We analyze social ties and repository-mediated collaboration patterns, and we observe a remarkably low level of reciprocity of the social connections. We also measure the activity of each user in terms of authored events and we observe that very active users do not necessarily have a large number of followers. Finally, we provide a geographic characterization of the centers of activity and we investigate how distance influences collaboration.
AB - GitHub is the most popular repository for open source code (Finley 2011). It has more than 3.5 million users, as the company declared in April 2013, and more than 10 million repositories, as of December 2013. It has a publicly accessible API and, since March 2012, it also publishes a stream of all the events occurring on public projects. Interactions among GitHub users are of a complex nature and take place in different forms. Developers create and fork repositories, push code, approve code pushed by others, bookmark their favorite projects and follow other developers to keep track of their activities. In this paper we present a characterization of GitHub, as both a social network and a collaborative platform. To the best of our knowledge, this is the first quantitative study about the interactions happening on GitHub. We analyze the logs from the service over 18 months (between March 11, 2012 and September 11, 2013), describing 183.54 million events and we obtain information about 2.19 million users and 5.68 million repositories, both growing linearly in time. We show that the distributions of the number of contributors per project, watchers per project and followers per user show a power-law-like shape. We analyze social ties and repository-mediated collaboration patterns, and we observe a remarkably low level of reciprocity of the social connections. We also measure the activity of each user in terms of authored events and we observe that very active users do not necessarily have a large number of followers. Finally, we provide a geographic characterization of the centers of activity and we investigate how distance influences collaboration.
UR - http://www.scopus.com/inward/record.url?scp=84909956490&partnerID=8YFLogxK
M3 - Conference article published in proceeding or book
AN - SCOPUS:84909956490
T3 - Proceedings of the 8th International Conference on Weblogs and Social Media, ICWSM 2014
SP - 295
EP - 304
BT - Proceedings of the 8th International Conference on Weblogs and Social Media, ICWSM 2014
PB - The AAAI Press
T2 - 8th International Conference on Weblogs and Social Media, ICWSM 2014
Y2 - 1 June 2014 through 4 June 2014
ER -