opensource.microsoft.com, but then there’s an internal side, too, where Microsoft employees manage repos and teams. “We’ve open sourced that and other companies are picking it up and using that internally for themselves, so it’s a bi-directional thing,” McAffer explained.
GitHub is a very rich environment, where lots of interactions are possible. Microsoft, like a lot of companies, was finding it difficult to keep track of everything going on and understanding what was happening with its repos.
“We ended up engaging with the GHTorrent project. We did quite a bit of work with them, and we actually are now sponsoring the GHTorrent project so we pay for all their Azure resources, everything you can see that at GHTorrent.org,” he said.
GHTorrent helps Microsoft understand what’s going on at GitHub but also internally in terms of its own projects. Even so, there are some things that GHTorrent was not set up to do, including work with private repositories and some of the more detailed data concerning teams where admin permissions are required.
The company ended up creating another system called GHCrawler, which it also open source. This tool tracks everything on GitHub down to the commit level, team, and permissions changes. That data is then used in metrics and tracking analysis to discover insights such as how many pull requests are coming in, how fast they are getting action, and how long they take to close or merge. “It gives us a way of tracking our presence,” said McAffer.
Consumption of open source is an entirely different matter, and a different process, at Microsoft. The company uses open source in myriad ways, and the need to track them all and to manage the legal security aspects is enormous.
“We’ve done a massive amount of work to simplify the processes and the policies around open source use, to really understand the key attributes of being a responsible consumer of open source, how to do it right, and to make sure we adhere to the licenses,” McAffer said.
“To that end, we’ve written a lot of tools internally to discover, track, and monitor what’s going on there and report on the use of open source,” he continued. Those tools also tend to be somewhat proprietary as they are deeply integrated with Microsoft’s engineering systems.
“We’ve been trying to see ways we can tease that out and make more of that available to the open source community more broadly, but it’s been a bit hard because it is very specific in many ways to our business policy or our engineering system, which isn’t going to be the same as anybody else’s,” McAffer said.
The Microsoft open source journey has been an interesting one over the years and in the true open source spirit, we will continue to share what we learned with everyone in that process, from tools to code.
We would like to thank Carol Smith, Senior Open Source Program Manager at Microsoft and Jeff McAffer, Director of Open Source Programs Office at Microsoft for being interviewed. We would also like to thank Pam Baker for performing the interview.These resources were created in partnership with the TODO Group: the professional open source program networking group at The Linux Foundation. A special thank you to Pam Baker for writing assistance and the open source program managers who contributed their time and knowledge to making these comprehensive guides. Participating companies include Autodesk, Comcast, Dropbox, Facebook, Google, Intel, Microsoft, Netflix, Oath (Yahoo + AOL), Red Hat, Salesforce, Samsung and VMware. To learn more, visit: todogroup.org.