Sonarlizer Xplorer: a tool to mine Github projects and identify technical debt items using SonarQube

Date

2022-06-27

Citation of Original Publication

D. Pina, A. Goldman and C. Seaman, "Sonarlizer Xplorer: a tool to mine Github projects and identify technical debt items using SonarQube," 2022 IEEE/ACM International Conference on Technical Debt (TechDebt), 2022, pp. 71-75, doi: 10.1145/3524843.3528098.

Rights

© 2022 IEEE.  Personal use of this material is permitted.  Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Abstract

The advancement of artificial intelligence and the implementation of machine learning capabilities in programming languages such as Python, along with cloud services, allow researchers to apply methods to cluster and predict behaviors and patterns in software engineering data. On the other hand, these methods need a large amount of data in order to work with high accuracy in different contexts. This paper introduces Sonarlizer Xplorer: a tool that captures a large number of technical debt items and code metrics from public GitHub projects. Sonarlizer Xplorer is composed of two sub-tools. The first is Github Xplorer, responsible for mining public GitHub repositories from an initial project. The second is Sonarlizer, responsible for taking projects and analyzing them using SonarQube. We used the tool over four months, collecting technical debt items and code metrics on almost 46,000 public Java projects. In addition, we mined over 57 million repositories and 4 million users.
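
The idea of mining repositories outward from an initial project can be illustrated with a small sketch. The abstract does not say how Github Xplorer traverses GitHub, so the breadth-first walk below (repository, then its contributors, then their repositories) is an assumption, and the names crawl_repositories, contributors_of, repos_of and max_repos are hypothetical. The two lookup functions are passed in as parameters so that the sketch runs on in-memory data and makes no claim about any real API.

```python
from collections import deque
from typing import Callable, Iterable, List, Set, Tuple


def crawl_repositories(
    seed_repo: str,
    contributors_of: Callable[[str], Iterable[str]],
    repos_of: Callable[[str], Iterable[str]],
    max_repos: int,
) -> Tuple[List[str], Set[str]]:
    """Breadth-first walk from a seed repository (assumed strategy).

    Returns the repositories in discovery order and the set of users seen.
    """
    seen_repos: Set[str] = {seed_repo}
    seen_users: Set[str] = set()
    order: List[str] = [seed_repo]
    queue = deque([seed_repo])

    while queue and len(order) < max_repos:
        repo = queue.popleft()
        for user in contributors_of(repo):
            if user in seen_users:
                continue  # this user's repositories were already expanded
            seen_users.add(user)
            for other in repos_of(user):
                # Stop recording new repositories once the cap is reached.
                if other not in seen_repos and len(order) < max_repos:
                    seen_repos.add(other)
                    order.append(other)
                    queue.append(other)
    return order, seen_users
```

In such a design, each discovered repository could then be handed to a separate analysis step, which matches the split between the two sub-tools described above; the cap on discovered repositories is only there to keep the sketch bounded.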