Study Reveals Common Issues Faced by GitHub Copilot Users

Written by textmodels | Published 2024/03/04
Tech Story Tags: github-copilot | ai-code | ai-code-generation | can-ai-code | ai-applications | software-development | github-copilot-user-experience

TL;DR: A study investigated user challenges with GitHub Copilot, revealing common issues such as usage obstacles, compatibility concerns, and code suggestion quality. Causes range from internal system issues to network connectivity problems, while solutions include bug fixes, configuration adjustments, and version updates. Overall, the findings shed light on improving the user experience of GitHub Copilot.

This paper is available on arXiv under the CC 4.0 license.

Authors:

(1) Xiyu Zhou, School of Computer Science, Wuhan University, Wuhan, China;

(2) Peng Liang, School of Computer Science, Wuhan University, Wuhan, China;

(3) Zengyang Li, School of Computer Science, Central China Normal University, Wuhan, China;

(4) Aakash Ahmad, School of Computing and Communications, Lancaster University Leipzig, Leipzig, Germany;

(5) Mojtaba Shahin, School of Computing Technologies, RMIT University, Melbourne, Australia;

(6) Muhammad Waseem, Faculty of Information Technology, University of Jyväskylä, Jyväskylä, Finland.

III. RESULTS AND ANALYSIS

In this section, we report the study results for the three RQs and analyze the key findings. Section III-A presents the types of issues, while Sections III-B and III-C present the types of causes and solutions for the corresponding issues, respectively. The results for issue types are organized into two levels: categories (e.g., Suggestion Content Issue) and types (e.g., LESS EFFECTIVE SUGGESTION). The results for causes and solutions are organized only as types (e.g., Use Suitable Version). Note that only causes that were proven to lead to the issues, and solutions that can resolve the issues, are extracted and provided in the results. Therefore, not all issues have corresponding causes and solutions. We provide examples with the “#” symbol, which indicates the “GitHub Issue ID”, “GitHub Discussion ID”, or “SO Post ID” in the provided dataset [13].

A. Type of Issues (RQ1)

Fig. 2 presents the taxonomy of the issues extracted from our data. It can be observed that Usage Issues (56.9%) account for the majority of issues faced by Copilot users. Besides, a substantial number of users have raised Feature Requests (15.3%) based on their user experience and requirements. There is also a portion of users who have encountered Compatibility Issues (15.3%) when using Copilot in different environments, while smaller percentages were identified as Suggestion Content Issues (4.9%), User Experience Issues (4.2%), and Copyright and Policy Issues (3.4%).
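As a quick sanity check on the distribution quoted above, the six category shares can be tallied and ranked. The following is a minimal Python sketch that uses only the percentages given in this section; the names `ISSUE_CATEGORY_SHARE`, `total_share`, and `ranked` are illustrative and do not come from the paper.

```python
# Category shares (percent) as quoted in the text for Fig. 2.
ISSUE_CATEGORY_SHARE = {
    "Usage Issue": 56.9,
    "Feature Request": 15.3,
    "Compatibility Issue": 15.3,
    "Suggestion Content Issue": 4.9,
    "User Experience Issue": 4.2,
    "Copyright and Policy Issue": 3.4,
}


def total_share(shares):
    """Sum the shares, rounded to one decimal to avoid float noise."""
    return round(sum(shares.values()), 1)


def ranked(shares):
    """Return category names ordered from most to least frequent."""
    return sorted(shares, key=shares.get, reverse=True)


if __name__ == "__main__":
    print("Total share:", total_share(ISSUE_CATEGORY_SHARE))
    print("Most frequent category:", ranked(ISSUE_CATEGORY_SHARE)[0])
```

The shares add up to 100%, which indicates that each issue was assigned to exactly one category.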

1) Usage Issue (56.9%): Usage Issue refers to a category of obstacles encountered by users when attempting to use some of the fundamental functions of Copilot (e.g., installation failure). This category is further divided into six types, which are elaborated below.

• FUNCTIONALITY USAGE ISSUE refers to the abnormality of various code generation-related features provided by Copilot. In addition to providing code suggestions, Copilot offers various interactive features to better engage with users, such as “previous/next suggestion”, “view all suggestions”, and configuration of shortcut keys to accept suggestions. Users may encounter exceptions when using these features. For example, a user reported that “Copilot no longer suggesting on PyCharm” (Discussion #11199).

• SETUP/OPERATION ISSUE refers to errors or malfunctions that occur during the initialization or operation of Copilot, and often involve runtime exceptions. These issues can prevent Copilot from running correctly, or cause it to crash unexpectedly, such as when a user encounters “copilot start error in VSCode” (Discussion #30996).

• AUTHENTICATION FAILURE refers to issues related to user login and authentication difficulties when using Copilot. Copilot requires users to log in to their GitHub account before using the service. Only users with access permissions (including paid subscriptions, student identity verification, etc.) can use Copilot’s code generation service. During the authentication process, users may encounter various problems that leave them unable to use Copilot. For example, a user mentioned in the discussion forum that “I cannot log in after the upgrade” (Discussion #18132).

• ACCESSING FAILURE refers to the situation where users fail to access Copilot’s server, which often involves errors related to server connections. A user may encounter an error message like “GitHub Copilot could not connect to server” (Discussion #11801).

• INSTALLATION ISSUE refers to the problems encountered during the installation process of Copilot, including installation errors, inability to find installation methods, and other related issues. For instance, some users may encounter issues such as “Errors when installing Copilot” (Discussion #17250).

• VERSION CONTROL ISSUE refers to problems that users encounter when adjusting the version of Copilot or its runtime environment (e.g., IDE), including the inability to upgrade the Copilot version or abnormal issues like continuing to prompt for upgrades even after upgrading. For example, a user said “copilot plugin fails to update” when using it in IntelliJ IDEA (Discussion #17298).

Analysis: Copilot is a relatively new AI code product, and we identified Usage Issues at various stages of user interaction with it. Users also tend to report these problems and look for assistance, which made Usage Issue the most prevalent category of issues. FUNCTIONALITY USAGE ISSUE (233), SETUP/OPERATION ISSUE (201), and AUTHENTICATION FAILURE (199) are the top three types. We attribute the higher frequency of the first two types to deficiencies in Copilot’s feature design and stability, which are also influenced by users’ environments and operations. The third type is primarily associated with specific details that arise when Copilot requires users to log in using their GitHub accounts.

2) Feature Request (15.3%): Feature Request refers to the features that users request to add or improve based on their experience and actual needs when using Copilot. These feature requests not only help improve the user experience of Copilot but also contribute to the exploration of how AI code generation tools can better interact with developers. This category is further divided into four types, as shown below.

• FUNCTION REQUEST refers to the requests for developing new functions in Copilot, which typically arise from users’ genuine needs and difficulties encountered while utilizing the tool. For example, a user has suggested that the addition of a “Code Explanations Feature” could enhance the usefulness of Copilot (Discussion #7509).

• INTEGRATION REQUEST refers to a type of request for Copilot to be available on certain platforms or to be integrated with other plugins. This is mainly due to some users’ desire to use Copilot in specific environments. For instance, a user called for “Support for Intellij 2022.2 EAP family” (Discussion #17045). The requests for integration also reflect the popularity of Copilot among developers to some extent.

• UI REQUEST refers to the requests made by users for changes to the user interface (UI) of Copilot, which may involve modifying the appearance of the Copilot icon, adjusting usage prompts, and other related aspects. These requests are generally aimed at improving the visual effects and user experience of Copilot. For example, a user may request the addition of a “status indicator” (Issue #163) to provide information about the current working status of Copilot.

• PROFESSIONAL COPILOT VERSION refers to the requests from some users for a professional version of Copilot. These users are typically developers from internal teams of certain companies, who hope to receive more professional and reliable code generation services in their actual work. Specifically, they have higher requirements for the reliability and security of Copilot’s code, as well as team certification and other aspects.

Analysis: For FUNCTION REQUEST (123), we observed that users commonly express a desire for greater flexibility in configuring Copilot to align more closely with their development habits. For instance, common requests include the ability to accept Copilot’s suggestions word by word and to specify where Copilot should automatically operate in terms of file types or code development scopes. More innovative demands involve the need for Copilot to provide suggestions according to the whole project, as well as features like code explanations and chat functionality [15], which have already launched as a technical preview in Copilot X. INTEGRATION REQUEST (75) reflects the wish of developers to use Copilot in their familiar environments. This places higher demands on the Copilot team, as we have identified a significant number of Compatibility Issues.

3) Compatibility Issue (15.3%): This category covers the issues that arise from mismatches between Copilot and its runtime environment. Copilot operates as a plugin in various IDEs and text editors (e.g., VSCode and IntelliJ IDEA), and the complexity of the environments and interference from other plugins can result in an increased number of compatibility issues. These issues are further classified into four types, which are elaborated below.

• EDITOR/IDE COMPATIBILITY ISSUE refers to issues arising from mismatches between Copilot and its IDE or editor. These issues typically manifest as Copilot being unable to operate properly in a specific IDE or editor.

• PLUG-IN COMPATIBILITY ISSUE refers to a type of matching issue that arises when Copilot and other plugins are active and working together in the same environment. Such issues can cause partial or complete malfunctions of Copilot and other plugins and are usually identified through troubleshooting methods such as disabling Copilot or other plugins. For instance, one user reported “a Keyboard shortcut conflict with Emmet” (Issue #47) that prevented him from receiving code suggestions generated by Copilot.

• FRAMEWORK COMPATIBILITY ISSUE refers to a type of compatibility problem between Copilot and the framework it operates on. One common example is the compatibility issue between Copilot.vim [16], an official version of Copilot designed specifically for Vim, and Node.js.

• KEYBOARD COMPATIBILITY ISSUE refers to the situation where Copilot’s functionality cannot be used with some uncommon keyboard layouts. For example, a user with a German keyboard layout could not use most of Copilot’s code generation-related features (Discussion #7094).
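The troubleshooting approach mentioned for PLUG-IN COMPATIBILITY ISSUE, disabling plugins to see which one interferes, can be sketched as a simple one-at-a-time search. This is a minimal illustration, not a procedure described in the paper: `find_conflicting_plugins` and the `copilot_works_with` callback are hypothetical names, and in practice the check is usually performed by hand in the editor rather than programmatically.

```python
from typing import Callable, Iterable, List


def find_conflicting_plugins(
    plugins: Iterable[str],
    copilot_works_with: Callable[[List[str]], bool],
) -> List[str]:
    """Disable one plugin at a time and report those whose removal fixes Copilot.

    `copilot_works_with` is a hypothetical stand-in for manually checking
    whether Copilot behaves correctly with the given plugins enabled.
    """
    enabled = list(plugins)
    if copilot_works_with(enabled):
        return []  # Copilot already works; nothing to isolate.
    suspects = []
    for candidate in enabled:
        remaining = [p for p in enabled if p != candidate]
        if copilot_works_with(remaining):
            suspects.append(candidate)
    return suspects


if __name__ == "__main__":
    # Hypothetical environment: Copilot fails whenever "Emmet" is enabled.
    def fake_check(active: List[str]) -> bool:
        return "Emmet" not in active

    print(find_conflicting_plugins(["Prettier", "Emmet", "GitLens"], fake_check))
```

This sketch only finds conflicts caused by a single plugin; if two plugins must both be disabled before Copilot recovers, it returns an empty list and a broader search would be needed.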

Analysis: Compatibility Issue arises from the complex operational environments in which users utilize Copilot, as well as the compatibility robustness of Copilot itself. In the case of EDITOR/IDE COMPATIBILITY ISSUE (132), VSCode, the platform officially recommended for Copilot usage, has garnered a higher number of reported compatibility issues. We have also found similar problems in other widely used IDEs, such as Visual Studio, IDEA, and PyCharm. The appearance of PLUG-IN COMPATIBILITY ISSUE (72) is less predictable, with typical problems involving conflicts with other code completion tools.

4) Suggestion Content Issue (4.9%): This category of issues refers to the problems related to the content of the code generated by Copilot. The generation of code suggestions is the core feature of AI code generation tools like Copilot, and the quality of the suggestions directly determines whether users will adopt them. Therefore, the content of the generated code is naturally an area of concern for users, researchers, and the Copilot team. These issues are further divided into seven specific situations, which are detailed below.

• LOW QUALITY SUGGESTION refers to situations where Copilot is unable to comprehend the context sufficiently to generate useful code. Such code suggestions may not have any syntactical errors, but due to their poor quality, they are unlikely to be adopted by users. For instance, Copilot once generated an empty method containing only a return statement without meeting the requirements specified in the user’s code (Discussion #6631).

• NONSENSICAL SUGGESTION refers to the code suggestions provided by Copilot that are completely irrelevant to the user’s needs or produce strange output. Such suggestions are considered almost unusable and provide little heuristic assistance to the user. For example, a user received an inaccessible fake URL generated by Copilot (Discussion #14212).

• SUGGESTION WITH BUGS refers to the situation where Copilot is able to generate relevant code based on the context, but the suggested code contains some bugs. This can result in the program being able to run, but not in the way that the developer intended, or in some cases, causing errors or crashes. For example, a user reported that Copilot suggested using “setState(!state)” instead of “setState(true)” (Issue #43), which caused a logical bug in the code.

• INCOMPREHENSIBLE SUGGESTION refers to the situation where Copilot provides code suggestions, but due to the complexity of the code logic or a lack of experience, users find it challenging to comprehend the suggested code and require additional sources to verify its correctness. For example, a user said “My Github Copilot just autocompleted it for me, then I scoured the internet trying to find information pertaining to it but couldn’t” (SO #73075410).

• SUGGESTION WITH INVALID SYNTAX refers to the situation where the suggestions generated by Copilot may contain syntax errors that prevent the program from running properly. One example is when the suggested code is missing a closing bracket, causing the editor to display a syntax error (Discussion #38941).

• LESS EFFECTIVE SUGGESTION refers to the code suggestions generated by Copilot that are functionally correct and meet the user’s requirements, but may suffer from suboptimal execution efficiency or convoluted logic, potentially impacting the overall quality of the code.

• INSECURE SUGGESTION refers to the code suggestions generated by Copilot that introduce security vulnerabilities. For example, a user indicated that the code suggestion for him lacked accountability for the sizes being read (Discussion #6636).
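The SUGGESTION WITH BUGS example above (“setState(!state)” instead of “setState(true)”) shows how a suggestion can run without errors yet behave differently from what the developer intended. The reported case involves a JavaScript-style state setter; the following is a Python analogue written only for illustration. The class `PanelState` and the handler names are hypothetical, and the sketch assumes the handler was meant to always leave the state set to true.

```python
class PanelState:
    """Tiny stand-in for a UI state holder with a setter."""

    def __init__(self) -> None:
        self.is_open = False

    def set_state(self, value: bool) -> None:
        self.is_open = value


def on_open_buggy(panel: PanelState) -> None:
    # Analogue of setState(!state): toggles the flag instead of setting it.
    panel.set_state(not panel.is_open)


def on_open_intended(panel: PanelState) -> None:
    # Analogue of setState(true): always ends in the open state.
    panel.set_state(True)


def run_twice(handler) -> bool:
    """Fire the handler twice on a fresh panel and return the final state."""
    panel = PanelState()
    handler(panel)
    handler(panel)
    return panel.is_open


if __name__ == "__main__":
    print("buggy handler, final state:", run_twice(on_open_buggy))
    print("intended handler, final state:", run_twice(on_open_intended))
```

Both handlers give the same result on the first call, which is why this kind of logical bug can pass a quick manual check and only surface when the event fires a second time.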

Analysis: The quality of the code suggestions is a pivotal factor in determining Copilot’s capability for practical code development. We identified a relatively small number of Suggestion Content Issues, possibly indicating that users are less inclined to report issues related to suggested code than usage-related problems. Among these issues, LOW QUALITY SUGGESTION, NONSENSICAL SUGGESTION, and SUGGESTION WITH BUGS are the three most frequently reported types, while INSECURE SUGGESTION and LESS EFFECTIVE SUGGESTION are less prevalent. This result shows that the quality of generated code is a major concern for users, while security and effectiveness are lower priorities.

5) User Experience Issue (4.2%): This category covers user feedback on their experience of using Copilot. Compared with Usage Issue, Copilot generally runs and functions as intended, but the user experience is suboptimal. User experience issues can emerge due to specific usage scenarios or be prevalent across various situations, providing insights into areas where Copilot could be improved. User Experience Issue can be further classified into four types, which are detailed below.

• POOR FUNCTIONALITY EXPERIENCE refers to a type of user experience issue where the usage of Copilot’s core code generation-related functionalities is unsatisfactory. These issues can often hinder the coordination between users and Copilot, and even decrease the efficiency of actual development work. For instance, a user complained that the automatically generated suggestions provided by Copilot were highly distracting, forcing him to manually trigger the code generation functionality (Discussion #13007).

• POOR SUBSCRIPTION EXPERIENCE refers to the obstacles that users encounter during the process of subscribing to Copilot’s services. Copilot offers several subscription methods (e.g., student verification, paid subscription), leading to some inconvenience for users during the subscription process. For example, one user was unsure what to do next after setting up billing (Discussion #19119).

• POOR PERFORMANCE refers to the performance issues that occur when Copilot is running, which usually have a direct impact on the user experience. These issues include high CPU usage, long response times, and overly frequent server access.

• POOR AUTHENTICATION EXPERIENCE refers to the inconvenience that users encounter when authenticating their identities before using Copilot. The most common situation is that Copilot frequently prompts users to log in again, which can be a significant source of frustration.

Analysis: User Experience Issues provide valuable insights into the direction for improving Copilot. Among the POOR FUNCTIONALITY EXPERIENCE issues (25), the most commonly reported issues involve Copilot’s inline suggestions causing disruptions to the user’s coding process (5) and the inconvenience of being unable to accept certain portions of the suggested code (2). These concerns align with some of the demands mentioned by users in Feature Requests, e.g., setting when Copilot can generate code and the length of suggested code.

6) Copyright and Policy Issue (3.4%): Copilot is trained on a large corpus of open source code and generates code suggestions based on the users’ code context. The way in which Copilot operates raises concerns regarding potential copyright and policy issues, as expressed by some users. These issues are divided into three types, as shown below.

• CODE COPYRIGHT ISSUE refers to the concerns raised by some code authors regarding the unauthorized use of their open-source code by Copilot for model training. GitHub is currently one of the most popular web-based code hosting platforms, and since the release of Copilot, there have been suspicions among some code authors that their code hosted on GitHub has been used for training without proper consideration of their license.

• CODE TELEMETRY ISSUE refers to the concerns expressed by users regarding Copilot’s approach of collecting their code to generate suggestions, which may potentially result in the leakage of confidential code. Some users may also simply be unwilling to have their own code, as well as the code generated by Copilot for them, collected for other purposes.

• VIOLATION OF MARKETPLACE POLICY is a specific case where a user reported that Copilot was able to be published on the VSCode marketplace despite using proposed APIs, while other plugins were prohibited. The user suspected that this behavior may be in violation of the Marketplace Policy (Issue #3).

Analysis: The emergence of Copyright and Policy Issues reveals the users’ concerns about the way Copilot works. Copilot is trained on multi-language open-source code and also needs to collect users’ code context during its operation to generate suggestions. These two facts have led people to pay more attention to copyright and intellectual property issues when using Copilot, especially in in-house development.

B. Type of Causes (RQ2)

1) Results: We identified a total of 337 causes, which were collected from 24.1% of all the issues, and were categorized into 13 types as presented in Table II. The result indicates that the most frequent causes of issues are Copilot Internal Issue (21.4%) and Network Connection Issue (15.4%), with Editor/IDE Compatibility Issue (11.1%) and Unsupported Platform (9.2%) also commonly reported. The specific instances, occurrence count, and the proportion of each type of cause are presented in Table II. Due to the space limit, we analyze the top five most frequent causes. It is worth noting that certain types of issues can potentially be the causes of other issues.

2) Analysis: Copilot Internal Issue, which can lead to various types of usage problems, is the most common type of cause. As Copilot is a closed-source project, its internal details are not publicly known to users. Therefore, we classify upstream issues related to Copilot as Copilot Internal Issue, encompassing language model, functional design, and server-side issues, which are caused by internal factors. Typically, the identification of Copilot Internal Issue relies on user feedback regarding abnormal usage experiences, which the Copilot team needs to investigate further to pinpoint the specific underlying causes. Additionally, the occurrence of Copilot Internal Issue often results in a cluster of users reporting similar issues within a certain time period. For instance, a bad server-side deployment may cause a group of users to encounter AUTHENTICATION FAILURE.

Network Connection Issue is a common type of cause that can lead to authentication failures, runtime exceptions, access failures, and so on. Most network-related problems are attributed to the user’s network environment. A common situation is that users access Copilot through an HTTP proxy or VPN, which can cause SSL interference and prevent them from using the service. However, the good news is that Copilot now supports access through an HTTP proxy, thus addressing such concerns [17].
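When a proxy is suspected as the source of a connection problem, a first diagnostic step is to check whether proxy settings are present in the environment at all. The sketch below is a minimal Python illustration under stated assumptions: it looks only at the conventional `HTTP_PROXY` and `HTTPS_PROXY` environment variables (in both upper and lower case), and whether a particular editor or Copilot plugin actually reads these variables is an assumption not confirmed by the source. The function name `detect_proxy_settings` is hypothetical.

```python
import os
from typing import Dict, Mapping, Optional

# Conventional proxy variable names; tools differ in which ones they honor.
PROXY_VARS = ("HTTPS_PROXY", "https_proxy", "HTTP_PROXY", "http_proxy")


def detect_proxy_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return the proxy-related environment variables that are set and non-empty."""
    source = os.environ if env is None else env
    return {name: source[name] for name in PROXY_VARS if source.get(name)}


if __name__ == "__main__":
    found = detect_proxy_settings()
    if found:
        print("Proxy variables detected:", found)
    else:
        print("No proxy variables set in this environment.")
```

Passing an explicit mapping instead of relying on `os.environ` keeps the check easy to test; an empty result only rules out environment-level proxies, not editor-specific proxy settings or a VPN.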

Editor/IDE Compatibility Issue can lead to various usage issues, primarily abnormal functionality usage and operational problems in Copilot.

Unsupported Platform refers to the reason why some users are unable to use Copilot effectively in certain IDEs or text editors, which mainly leads to issues related to usage and compatibility. Since Copilot is not open source, many platforms are unable to integrate it immediately upon its release, which has resulted in some users encountering various obstacles while attempting to use it. Therefore, we recommend that users try to use the IDEs that Copilot officially supports, as this will provide them with a more convenient and stable code generation service, as well as access to a mature discussion community that can help solve most common issues.

Improper Configuration/Setting is one of the major causes of functionality usage and compatibility issues. For instance, after installing Copilot, its default configuration and settings may cause it to malfunction in a particular IDE or conflict with other plugins. The majority of these problems can be addressed through configuration adjustments.

C. Type of Solutions (RQ3)

1) Results: We identified a total of 497 solutions, which were used to address 35.5% of all the issues, and were categorized into 11 types as shown in Table III. The result reveals that most of the usage bugs were addressed by official fixes (Bug Fixed by Copilot, 27.0%) after user feedback, and when users tried to solve issues themselves, Modify Configuration/Setting (22.1%), Use Suitable Version (17.1%), and Reinstall/Restart/Reauthorize Copilot (12.3%) were commonly used as effective solutions. The specific instances, occurrence count, and the proportion of each type of solution are presented in Table III. Due to space constraints, we will focus on interpreting the results of the top 5 solutions, as well as some important analysis. It should be noted that Others (4.2%) is a collection of dedicated solutions that are usually specific to particular environments and issues.

The mapping between issue types and solution types, together with its distribution, is shown in Table IV, using abbreviations to represent each type of solution. For example, “BFC” represents Bug Fixed by Copilot. The mapping shows that the majority of solutions are targeted towards Usage Issues and Compatibility Issues. The main solutions for Feature Requests are waiting for official feature implementation (FIC) or achieving similar effects through configuration or setting modifications (MCS). User Experience Issues are mostly improved by the Copilot team (BFC). Additionally, using the appropriate version of Copilot and the editor/IDE (USV) can lead to a better experience. Solving Suggestion Content Issues is relatively challenging, as only a few can be addressed through modifying the input way (MIW), while the majority lack effective solutions. Copyright and Policy Issues also have relatively limited solutions. The main solution is to control the collection of users’ code by adjusting the settings of Copilot.
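The abbreviations used in this paragraph can be collected into a small lookup table, which makes the described mapping easier to scan. The sketch below is illustrative only: it covers just the abbreviations and category-to-solution pairings mentioned in the prose, not the full Table IV; the expansion of “MIW” is inferred from the phrase “modifying the input way”; and the names `SOLUTION_ABBREVIATIONS`, `MAIN_SOLUTIONS`, and `describe` are hypothetical.

```python
# Abbreviations mentioned in the text for Table IV. The remaining solution
# types are omitted because their abbreviations are not given here.
SOLUTION_ABBREVIATIONS = {
    "BFC": "Bug Fixed by Copilot",
    "FIC": "Feature Implemented by Copilot",
    "MCS": "Modify Configuration/Setting",
    "USV": "Use Suitable Version",
    "MIW": "Modify the Input Way",  # expansion inferred from the prose
}

# Main solutions per issue category, as described in the paragraph above.
MAIN_SOLUTIONS = {
    "Feature Request": ["FIC", "MCS"],
    "User Experience Issue": ["BFC", "USV"],
    "Suggestion Content Issue": ["MIW"],
}


def describe(category):
    """Expand the abbreviations listed for a category into full solution names."""
    return [SOLUTION_ABBREVIATIONS[abbr] for abbr in MAIN_SOLUTIONS.get(category, [])]


if __name__ == "__main__":
    for name in MAIN_SOLUTIONS:
        print(name, "->", describe(name))
```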

2) Analysis: Bug Fixed by Copilot is the primary solution for addressing various issues, particularly certain usage and compatibility issues. This is reasonable since Copilot Internal Issues are the most frequent causes, indicating that many problems of Copilot, as a new tool, cannot be resolved by users’ own efforts. Due to the closed-source nature of Copilot, users can only provide feedback and wait for a response and solution from the Copilot team.

Modify Configuration/Setting is a common solution for resolving issues related to improper configurations or settings, and allows users to address FUNCTIONALITY USAGE ISSUES, PLUG-IN COMPATIBILITY ISSUES, and AUTHENTICATION FAILURES themselves. Additionally, for some feature requests, users can achieve the functionality they want by simply making some configuration changes, such as changing the keyboard shortcut binding for accepting code suggestions. However, we found that due to the complexity of Copilot’s running environments, it is difficult to provide a recommended configuration that is suitable for all situations. Therefore, each case must be analyzed individually.

Use Suitable Version provides an effective way to address SETUP/OPERATION ISSUES, FUNCTIONALITY USAGE ISSUES, INSTALLATION ISSUES, and EDITOR/IDE COMPATIBILITY ISSUES. Copilot has iterated rapidly through many versions based on user feedback and development plans. Meanwhile, some IDEs have also released new versions to be compatible with it. However, some older versions may be more stable than the latest one, which may contain bugs or compatibility issues. Thus, using the appropriate version is a highly effective solution for users.
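The idea that the most suitable version is not always the latest one can be expressed as a small selection rule: take the newest version that is not known to cause problems in the user’s environment. The following Python sketch is a hypothetical illustration of that rule; the version strings, the `known_bad` set, and the function names are invented for the example, and real plugin version schemes may include suffixes that this simple parser does not handle.

```python
from typing import Iterable, Optional, Set, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '1.10.0' into (1, 10, 0) for numeric comparison."""
    return tuple(int(part) for part in text.split("."))


def newest_suitable(available: Iterable[str], known_bad: Set[str]) -> Optional[str]:
    """Pick the highest version that is not on the known-bad list, if any."""
    candidates = [v for v in available if v not in known_bad]
    if not candidates:
        return None
    return max(candidates, key=parse_version)


if __name__ == "__main__":
    # Hypothetical data: the latest release is reported as unstable.
    versions = ["1.9.0", "1.10.0", "1.11.0"]
    print(newest_suitable(versions, known_bad={"1.11.0"}))
```

Comparing parsed tuples rather than raw strings matters here, because a plain string comparison would rank "1.9.0" above "1.10.0".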

Reinstall/Restart/Reauthorize Copilot is another simple way for users to resolve AUTHENTICATION FAILURES, SETUP/OPERATION ISSUES, and FUNCTIONALITY USAGE ISSUES on their own. Its principle is to reset the current state of Copilot, clearing any previous errors and returning any changed settings to their initial state.

Feature Implemented by Copilot is an official action that mainly addresses user Feature Requests. The development speed of Copilot’s new features is relatively fast, and the Copilot team is currently experimenting with some of the latest features in Copilot X [18].


Published by HackerNoon on 2024/03/04