A framework for adopting Gemini Code Assist and measuring its impact
Software development teams are under constant pressure to deliver at an ever-increasing pace. As sponsors of the DORA research, we recently took a look at the adoption and impact of artificial intelligence on the software development lifecycle. The research shows that more than 75% of developers are relying on AI in their daily responsibilities.
With Gemini Code Assist, developers aim to boost their efficiency and code quality. But what’s the process to effectively adopt AI-assisted coding? How do you measure the impact of these tools on your team’s performance?
In this article, we’ll provide a practical framework for adopting AI-assisted code creation, and for evaluating the effectiveness of AI assistance in your software development workflow.
This post outlines a four-step framework to adopt AI code assistants like Gemini Code Assist on your software development team: Adoption, trust, acceleration, and impact.
- Adoption: Ensure your developers are actively using the tool by tracking measures like daily active use and code suggestions.
- Trust: Gauge developers’ confidence in the AI’s output by monitoring code suggestion acceptance rates and lines of code accepted.
- Acceleration: Look for improvements in development speed and software quality through existing productivity metrics like DORA measures, story points, or ticket closures.
- Impact: Connect these improvements to your overall business goals by assessing changes in key performance indicators like revenue, market share, or time to market.
The AI-assistance journey
Committing to code AI-assistance involves change management, a process defined by a transition from one position (development before AI-assistance) to another (development after). Put simply, it’s a journey.
The four phases of AI-assisted productivity improvement
The journey can be understood in four progressive phases.
- Adoption: Evaluation and proof-of-concept activities to identify how and where code AI-assistance may contribute to developer outcomes.
- Trust: Establishment of confidence in AI-assistance’s output.
- Acceleration: Assessment of AI-assistance’s ability to improve team development speed with existing productivity metrics like DORA measures, story points, or ticket closures.
- Impact: Confirmation of improvement to key business performance indicators such as revenue, market share, or time to market.
In addition to what each phase does, it’s important to understand how the involved parties contribute. Technology and business leaders frequently initiate evaluations (Adoption) and affirm ongoing success (Impact). Between these two activities, individual developers explore coding AI-assistance, familiarize themselves with its capabilities, and establish regular use (Trust). During this time it is important to give developers ample room to learn and experiment with ways to utilize AI. As teams of developers explore and develop trust together, the group iterates through feedback to further optimize team productivity (Acceleration).
The four phases feed forward and cannot be skipped
One common mistake organizations make is believing that using code AI-assistance (Adoption) will yield immediate business results (Impact). In other words, they expect to jump straight from Adoption to Impact, skipping Trust and Acceleration.
If an organization does not meaningfully Adopt AI assistance tools, and has minimal Trust in its suggestions, it’s unrealistic to expect an Acceleration in team productivity, much less substantive business Impact.
Confirming Adoption and Trust benefits from 6 to 8 weeks of time
Another misconception is that adopting code AI-assistance will have an impact overnight. Every organization is different, but we’ve found that at least 6 to 8 weeks, or four two-week sprints, is needed before an impact on your organization’s productivity (Acceleration) can be observed. Because each phase feeds into the next, it takes time for the effects of adopting AI assistance to propagate forward. This awareness is particularly important when conducting an evaluation, which we discuss later in this article.
Effect and measures that can be used with each phase
While the four phases of the AI assistance journey are conceptual, they can and should be measured to affirm progress and impact. Below we describe what measures may be used when and why.
- Adoption. Daily activity (developer use of AI-assistance), code suggestion volume (AI code recommendations), and chat exposure volume (AI chat requests) are early signals of whether developers are taking advantage of AI-assistance. In this phase, use these measures to confirm consistent and growing daily developer engagement. As Adoption grows, you can shift focus to establishing Trust.
- Trust. Are adopting developers accepting AI-assistance? Code suggestions accepted (AI code recommendations accepted), acceptance rate (the number of code suggestions accepted divided by the number of code suggestions offered), and lines of code accepted can be used to assess trust (see the sketch after this list). Low acceptance rates for code suggestions and lines of code should prompt you to investigate why trust may be low. You can also build further understanding from developer interviews and surveys (sample survey questions).
- Acceleration. You may already have developer productivity (Acceleration) measures in place, including established DORA software delivery measures. Alternatively, you may want to evaluate acceleration through completed story points or ticket closures per time period, among other measures. Once you’ve established Adoption and Trust, monitoring Acceleration measures for improvement can both confirm the productivity benefits of AI assistance and provide line of sight to business impact outcomes and measures.
- Impact. This final phase is expressed in business key performance indicators. Specific impact measures differ among organizations, and should be monitored by organization leaders to evaluate the business yield of AI-assistance. Impact measures may include revenue, market share, reduced time to product improvements, and other business health criteria. An observed improvement in Acceleration should be expected to contribute positively to one or more Impact measures as well.
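To make the Adoption and Trust measures concrete, here is a minimal sketch that computes them from per-developer activity counts. The data structure and field names are hypothetical stand-ins for whatever your logging or reporting pipeline exports; only the arithmetic (for example, acceptance rate as suggestions accepted divided by suggestions offered) reflects the definitions above.

```python
from dataclasses import dataclass

@dataclass
class DeveloperActivity:
    """Hypothetical per-developer counts exported from your AI-assistance logs."""
    developer: str
    language: str
    suggestions_offered: int   # Adoption: code suggestions shown
    suggestions_accepted: int  # Trust: code suggestions accepted
    lines_accepted: int        # Trust: AI-generated lines of code accepted
    chat_exposures: int        # Adoption: AI chat requests

def acceptance_rate(activity: DeveloperActivity) -> float:
    """Acceptance rate = suggestions accepted / suggestions offered."""
    if activity.suggestions_offered == 0:
        return 0.0
    return activity.suggestions_accepted / activity.suggestions_offered

# Example: a weekly roll-up for two developers (illustrative numbers only).
week = [
    DeveloperActivity("dev-a", "python", suggestions_offered=120,
                      suggestions_accepted=78, lines_accepted=410, chat_exposures=35),
    DeveloperActivity("dev-b", "go", suggestions_offered=95,
                      suggestions_accepted=31, lines_accepted=140, chat_exposures=12),
]

for a in week:
    print(f"{a.developer} ({a.language}): "
          f"acceptance rate {acceptance_rate(a):.0%}, "
          f"{a.lines_accepted} lines accepted, "
          f"{a.chat_exposures} chat exposures")
```

A sustained low acceptance rate for a particular developer or language, alongside healthy suggestion and chat volumes, is a signal to investigate Trust rather than Adoption.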
It is important to recognize that AI-assistance measures (those in the Adoption and Trust phases) are not development productivity measures (those in the Acceleration phase). To illustrate why, consider the following: would a high code suggestion acceptance rate, or a large volume of AI-assisted lines of code accepted, that coincided with worsening DORA measures or average ticket closures still be considered a development productivity improvement? Most would agree it would not, which is why the distinction matters. AI-assistance metrics measure Adoption of and Trust in AI-assistance, development productivity metrics express Acceleration, and Impact measures reveal the final business effect.
With a journey, phases, and corresponding measures defined, you can collectively use these elements as a guiding framework to progress and affirm the impact of code AI-assistance.
Measuring impact with Gemini Code Assist
Gemini Code Assist supports Adoption and Trust measurement through Gemini for Cloud Logging logs, currently in Preview. Through these logs, active use, code suggestions, code suggestions accepted, acceptance rate, chat exposures, and lines of code accepted are made visible. The logs also capture discrete activity on a per-user, programming-language, IDE-client, and activity-time basis, providing deep insights not always available with aggregate AI-assistance measures. These insights can be used to assess organization journey performance and to answer specific questions like “How many AI-assisted lines of code did we accept last week, by programming language? By developer?”
Gemini Code Assist logs provide discrete activity insights, including code exposures and acceptances by programming language, user, and time
While Gemini Code Assist logs provide discrete details per activity, we also provide a sample dashboard built on Log Analytics to assist with reviewing aggregate Adoption and Trust measures.
Gemini Code Assist measures dashboard sample using Log Analytics.
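As a starting point for working with these logs programmatically, the following sketch uses the Cloud Logging Python client to pull a week of entries and tally them per developer. The log filter string and the payload field names are assumptions made for illustration; consult the Gemini for Cloud logging documentation for the exact log names and payload schema in your project.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

from google.cloud import logging as cloud_logging

PROJECT_ID = "my-project"  # replace with your project ID

# Hypothetical filter: the exact Gemini for Cloud log name and payload schema
# are documented separately; adjust this filter before running.
start = (datetime.now(timezone.utc) - timedelta(days=7)).isoformat()
LOG_FILTER = f'logName:"cloudaicompanion" AND timestamp>="{start}"'

client = cloud_logging.Client(project=PROJECT_ID)
entries_per_user = Counter()

for entry in client.list_entries(filter_=LOG_FILTER, page_size=1000):
    payload = entry.payload if isinstance(entry.payload, dict) else {}
    # "user" is a hypothetical payload field; map it to the real schema.
    entries_per_user[payload.get("user", "unknown")] += 1

for user, count in entries_per_user.most_common():
    print(f"{user}: {count} AI-assistance log entries in the last 7 days")
```

The same entries can be exported to Log Analytics and aggregated with SQL, which is the approach the sample dashboard above takes.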
In addition to the above, Cloud Monitoring metrics are provided to monitor active use across Gemini for Cloud, in Preview, including Gemini Code Assist.
The four phases of AI-assisted impact evaluation
Before making a commitment to code AI-assistance, many organizations choose to first conduct an evaluation. Like the AI-assistance adoption journey, an evaluation journey may be a phased process. Here again each phase would feed into the next and involve specific parties.
- Success criteria. Before starting an evaluation, define and baseline the evaluation success criteria. Two audiences need to be considered when defining success criteria, the development team and business decision makers, and both should agree on the definition of success. Success criteria could include improvements to Acceleration measures such as DORA measures, story point velocity, or tickets resolved. This step is often skipped, yet it is the most important one: organizations that fail to collectively agree on success criteria before initiating their evaluation experience difficulty when trying to retroactively assess AI-assistance’s impact.
- Participants. While there are multiple approaches to consider, the most common involve either selecting a single team of developers and evaluating across successive project efforts (the first project with AI-assistance and the following without), or comparing two teams’ performance in an A/B cohort (one team using AI-assistance while the other does not). Whatever you choose, discuss and agree upfront on who will participate and why. Prioritize an “apples to apples” comparison. For example, an A/B cohort in which your A team is significantly more experienced and is also using AI-assistance, compared against a less experienced B team not using AI-assistance, may lead to a lopsided evaluation that makes assessing impact difficult at best and unreliable at worst.
- Measure. The AI-assistance journey can guide how you progress and monitor your evaluation. Quantitative and qualitative measures, alongside the success criteria, can be regularly reviewed to ensure the evaluation progresses to a point where the impact of AI-assistance can be assessed.
- Commit. If you have agreed on your success criteria and have been measuring and facilitating progress along the way, you will be able to confirm or reject a code AI-assistance commitment based on whether the agreed success criteria are met (a minimal sketch of this comparison follows this list).
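As a sketch of the Measure and Commit steps, the snippet below compares a baseline period against the evaluation period for a few Acceleration measures and checks them against agreed success criteria. The measures, thresholds, and numbers are placeholders; substitute whichever DORA, story point, or ticket measures your teams actually track and the improvement levels you agreed on.

```python
# Baseline vs. evaluation-period Acceleration measures (placeholder values).
baseline = {"deploys_per_week": 4.0, "lead_time_days": 5.2, "tickets_closed_per_sprint": 38}
evaluation = {"deploys_per_week": 5.0, "lead_time_days": 4.3, "tickets_closed_per_sprint": 44}

# Agreed success criteria: minimum relative improvement per measure.
# "lower" means a decrease is an improvement (e.g., lead time).
criteria = {
    "deploys_per_week": ("higher", 0.10),           # at least +10%
    "lead_time_days": ("lower", 0.10),              # at least -10%
    "tickets_closed_per_sprint": ("higher", 0.10),  # at least +10%
}

def relative_change(before: float, after: float) -> float:
    """Relative change from the baseline value to the evaluation value."""
    return (after - before) / before

results = {}
for measure, (direction, threshold) in criteria.items():
    change = relative_change(baseline[measure], evaluation[measure])
    improved = change >= threshold if direction == "higher" else -change >= threshold
    results[measure] = improved
    print(f"{measure}: {change:+.0%} ({'meets' if improved else 'misses'} criterion)")

commit = all(results.values())
print("Commit to AI-assistance" if commit else "Revisit evaluation before committing")
```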
Levels of evaluation investment
The intensity of a code AI-assistance evaluation can vary. A Minimal evaluation may involve a handful of developers, qualitative surveys, and the monitoring of Acceleration measures only. A more typical approach in our experience, and the one most organizations opt for, is a Moderate investment that looks at both quantitative and qualitative measures and involves either a single team performing successive development with and without code AI-assistance, or two teams doing the same in an A/B cohort. An Involved evaluation, listed for completeness, utilizes formal research, lab studies, and analysis; in practice, we have found few organizations opt for this level of investment.
Whatever intensity model is pursued, we want to reiterate the importance of defining success criteria up front and gathering baseline data for comparison.
An evaluation may conclude when you observe improvement in Acceleration measures
When defining commitment success criteria, consider targeting what is sufficient to confirm AI-assistance’s impact. Often, improvements to Acceleration-phase productivity measures provide line of sight to improvement in Impact-phase measures (business key performance indicators). Conversely, choosing commitment success criteria that require significant time or effort to conclude may reduce, or even remove, the opportunity to confirm AI-assistance’s impact in a timely way.
Get started today
Ready to get started with AI assistance for software development? For more information about Gemini Code Assist, visit our product page, documentation, and quickstart.
Additional resources:
- An AI-assistance commitment decision may find further support in the AI-assistance impact insights of the 2024 DORA report and in Google Research foundations.
- DORA’s software delivery performance metrics are good indicators for the Acceleration and Impact phases of your AI assistance adoption journey. Teams that are not yet tracking these metrics can use the DORA Quick Check to capture their current software delivery performance.
- DORA’s research also shows many capabilities lead to improved software delivery performance. Measurements for these capabilities can inform the Adoption and Trust phases of your adoption journey and can serve as leading indicators for later journey phases.
- Surveys (reference samples) can qualitatively assist with the adoption of AI and help target specific areas you’d like to improve with the use of code assistants, such as software delivery performance, documentation quality, code review time, flow, or user centricity.