On the Concerns of Developers When Using GitHub Copilot

Written by textmodels | Published 2024/03/04
TL;DR: An empirical study of software developers' experiences sheds light on the practical challenges of using GitHub Copilot. Through analysis of issues, causes, and solutions encountered during real-world usage, this study provides valuable insights into the effectiveness and limitations of the AI code generation tool.

Authors:

(1) Xiyu Zhou, School of Computer Science, Wuhan University, Wuhan, China;

(2) Peng Liang, School of Computer Science, Wuhan University, Wuhan, China;

(3) Zengyang Li, School of Computer Science, Central China Normal University, Wuhan, China;

(4) Aakash Ahmad, School of Computing and Communications, Lancaster University Leipzig, Leipzig, Germany;

(5) Mojtaba Shahin, School of Computing Technologies, RMIT University, Melbourne, Australia;

(6) Muhammad Waseem, Faculty of Information Technology, University of Jyväskylä, Jyväskylä, Finland.

Abstract

With the recent advancement of Artificial Intelligence (AI) and the emergence of Large Language Models (LLMs), AI-based code generation tools have achieved significant progress and become a practical solution for software development. GitHub Copilot, referred to as an AI pair programmer, utilizes machine learning models that are trained on a large corpus of code snippets to generate code suggestions or auto-complete code using natural language processing. Despite its popularity, there is little empirical evidence on the actual experiences of software developers who work with Copilot. To this end, we conducted an empirical study to understand the issues and challenges that developers face when using Copilot in practice, as well as their underlying causes and potential solutions. We collected data from 476 GitHub issues, 706 GitHub discussions, and 184 Stack Overflow posts, and identified the issues, the causes that trigger them, and the solutions that resolve them when using Copilot. Our results reveal that (1) Usage Issue and Compatibility Issue are the most common problems faced by Copilot users, (2) Copilot Internal Issue, Network Connection Issue, and Editor/IDE Compatibility Issue are identified as the most frequent causes, and (3) Bug Fixed by Copilot, Modify Configuration/Setting, and Use Suitable Version are the predominant solutions. Based on the results, we delve into the main challenges users encounter when adopting Copilot in practical development, the possible impact of Copilot on the coding process, aspects in which Copilot can be further enhanced, and potential new features desired by Copilot users.

Index Terms—GitHub Copilot, GitHub Issue, GitHub Discussions, Cause, Solution

I. INTRODUCTION

In software development, developers strive for automation and intelligence so that most code can be generated automatically with minimal human coding effort. Numerous studies and software products have been dedicated to improving the efficiency of developers through the development of systems that can recommend and generate code [1] [2]. Large Language Models (LLMs) are a type of natural language processing technique based on deep learning that can automatically learn the grammar, semantics, and pragmatics of language and generate a wide variety of content. Recently, with the rapid development of LLMs, AI code generation tools trained on large amounts of code snippets are increasingly in the spotlight (e.g., AI-augmented development in Gartner Trends 2024 [3]), making it possible for programmers to generate code automatically with minimal human effort [4].

On June 29, 2021, GitHub, the world’s largest community of developers, and OpenAI, a leading research organization in artificial intelligence, jointly announced the launch of a new product named GitHub Copilot [5]. This innovative tool is powered by OpenAI’s Codex, a large-scale neural network model that is trained on a massive dataset of source code and natural language text. The goal of GitHub Copilot is to provide advanced code autocompletion and generation capabilities to developers, effectively acting as an “AI pair programmer” that can assist with coding tasks in real-time. Copilot has been designed to work seamlessly with a wide range of Integrated Development Environments (IDEs) and text editors, including VSCode, Visual Studio, Neovim, and JetBrains. By collecting contextual information like function names and comments, Copilot is able to generate code snippets in a variety of programming languages, which can improve developers’ productivity and help them complete coding tasks more efficiently.

Since its release, Copilot has gained significant attention and discussion within the developer community. Many developers have praised its effectiveness, while others have expressed concerns about the potential impact on code security and intellectual property. Some prior research investigated the quality of the code generated by Copilot [6] [7] [8], while others examined its performance in practical software development [9] [10] [11]. However, little is known about the issues, causes, and solutions encountered during the practical use of Copilot. Considering that GitHub Copilot is currently a widely used and highly representative AI code generation tool, it is of great significance to study the various obstacles that users face when using it, which can help us evaluate the usefulness of Copilot in different aspects, as well as explore the ways in which such AI code generation tools interact with software developers.

To this end, we conducted a thorough analysis of the issues faced by software developers when coding with GitHub Copilot, as well as their causes and solutions, by collecting data from GitHub Issues, GitHub Discussions, and Stack Overflow (SO) posts. We sought to gain a comprehensive understanding of the issues of using Copilot by systematically analyzing the collected data, which would help us evaluate the effectiveness and limitations of Copilot in practical settings. We developed several dedicated Web crawlers that use the official API and obtained the issues, discussions, and posts related to GitHub Copilot. We then conducted manual data labelling, and extracted relevant data items. In the last step, the Constant Comparison method [12] was employed to analyze the extracted data and answer the research questions defined.
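The collection step described above can be illustrated with a minimal sketch. It is not the authors' crawler: the paper does not state which queries, repositories, or tags were used, so the keyword, the `github-copilot` tag, and the helper names below are all hypothetical. The sketch assumes the public GitHub REST search endpoint and the Stack Exchange `search/advanced` endpoint, covers only two of the three sources (GitHub Discussions are not shown), and leaves the actual HTTP call to a caller-supplied function.

```python
from urllib.parse import urlencode

# Assumed public endpoints; the paper only says "the official API".
GITHUB_SEARCH = "https://api.github.com/search/issues"
STACK_EXCHANGE_SEARCH = "https://api.stackexchange.com/2.3/search/advanced"


def github_issue_search_url(keyword, page=1, per_page=100):
    """Build a search URL for GitHub issues mentioning `keyword` (hypothetical query)."""
    params = {"q": f"{keyword} is:issue", "per_page": per_page, "page": page}
    return f"{GITHUB_SEARCH}?{urlencode(params)}"


def stack_overflow_search_url(tag, page=1, pagesize=100):
    """Build a search URL for Stack Overflow posts carrying `tag` (hypothetical tag)."""
    params = {"site": "stackoverflow", "tagged": tag, "page": page, "pagesize": pagesize}
    return f"{STACK_EXCHANGE_SEARCH}?{urlencode(params)}"


def collect_all(build_url, fetch_page, max_pages=10):
    """Page through results until an empty page or `max_pages` is reached.

    `fetch_page` is supplied by the caller: it takes a URL and returns a list
    of items, so the HTTP client (and any authentication) stays outside this sketch.
    """
    items = []
    for page in range(1, max_pages + 1):
        batch = fetch_page(build_url(page))
        if not batch:
            break
        items.extend(batch)
    return items
```

In use, `collect_all(lambda p: github_issue_search_url("copilot", page=p), fetch_page)` would gather candidate issues for the manual labelling step, with `fetch_page` wrapping whichever HTTP client is preferred.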

Our findings show that: (1) Usage Issue and Compatibility Issue are the most common problems faced by developers, (2) Copilot Internal Issue, Network Connection Issue, and Editor/IDE Compatibility Issue are identified as the most frequent causes, and (3) Bug Fixed by Copilot, Modify Configuration/Setting, and Use Suitable Version are the predominant solutions.

The contributions of this work are as follows: (1) we identified the issues of using Copilot in software development practice with a curated dataset [13] and provided a two-level taxonomy for these issues; (2) we identified the causes and solutions of these issues and developed a taxonomy for these causes and solutions; and (3) we provided the mapping relationship between the identified issues and solutions.

The rest of this paper is structured as follows: Section II presents the Research Questions (RQs) and the research design employed in this study. Section III provides the results and analysis of our study, which are further discussed in Section IV. Section V outlines the potential threats to validity. Section VI reviews the related work, and Section VII concludes this work along with the directions for future research.

This paper is available on arXiv under a CC 4.0 license.


Published by HackerNoon on 2024/03/04