How to measure AI adoption in a product team
A story of pivoting from flawed, process-heavy metrics to a simple, artifact-based system designed to improve quality and drive AI impact.
“So, what’s our AI adoption rate?”
It’s a familiar moment for many product leaders right now. You’re hearing the constant noise from conferences and LinkedIn posts, stories of “fully AI automated teams” and magical productivity gains. Inevitably, a C-level executive, hearing this same noise, turns to you and asks, “So, what’s our AI adoption rate?”
You’re stuck. The only answers you have feel like “vanity metrics,” things that are easy to count but don’t tell you if your team is building better products. Does “prompts per week” or “tool logins” really capture impact?
Is AI actually helping your team build better products?
I was in this exact position, responsible for driving internal AI adoption for our team of product managers and designers. I did what any AI-first employee would: I asked the LLMs themselves for help. Their suggestions were useless: a generic list of “time-saved” reports and “satisfaction surveys” that felt untrackable and completely missed the point for our creative work.
It became clear this was a problem that needed a human product-thinking toolkit. So, I had to build my own framework from scratch.
My goal here is simple: to share the hands-on, internal framework I’m experimenting with to move a team of PMs and designers from just using AI to creating real impact with it.
This isn’t a perfect, finished “playbook.” It’s a first iteration, a “V1” draft that I’ll probably have to rework soon. But it’s an experiment that’s finally getting us traction, and it all started by scrapping that “obsolete” first draft. I’m sharing my journey, hoping it can inspire you or at least give you a few ideas to try.
I’ll walk you through my process as a story in three acts:
Act I: The false start. We’ll look at my initial, “obsolete” metrics plan, a classic, process-driven approach, and I’ll explain exactly why I scrapped it.
Act II: The pivot. I’ll introduce the new, initiative-based framework I’m testing, which treats AI adoption like a product, complete with its own “spec” and success metric.
Act III: The flywheel. Finally, I’ll share the most important part: the human-centric support system (think shadowing and power users) that’s required to make any framework actually work.
After the story, I’ll dive into key reflections, a set of actionable lessons for product leaders, and the further reading that helped shape my thinking.
Ready to measure what really matters?
Act I: The false start
Like any product, my first draft was built on good intentions and deeply flawed assumptions.
My first formal proposal for measuring adoption seemed logical. I wanted to measure the AI boost, so I broke my plan down into three key areas:
Improving processes: How could AI help us do our current jobs faster and better?
Innovating in the product: How could we use AI to solve user problems in new ways?
Sharing knowledge: How could we ensure we all learn and grow together?
My plan involved what I thought were clever tracking mechanisms: adding a specific Confluence label for “ai-assisted” docs, asking designers to use a Figma tag for AI-inspired designs, creating a new AI experimentation log for PMs to fill out, and even launching an AI Idea Box for new features.
As I reviewed the draft, I had that sinking feeling. I realized this framework was a dead end.
First, it was all process. It was creating more work. I was asking an already busy team of PMs and designers to do more tagging, more labeling, and fill out more logs. I was about to incentivize activity, not impact.
Second, it was built on a foundation of vanity metrics and subjective data.
Tracking “ai-assisted” labels doesn’t tell you whether the resulting doc was any good.
“Perceived AI Impact” surveys are highly subjective. I’d seen this fail firsthand. I talked to a developer at a conference who was convinced he was an AI power user, only to discover he was still just copy-pasting code from an LLM tool. My definition of a power user was full IDE integration. His “I’m using it well” survey response would have been completely misleading.
Finally, it obsessed over measuring time saved, which I now believe is a distraction from the more important goal: improving quality. I knew from my own work that I wasn’t saving time; I was re-investing it. The 8 hours I used to spend on a slide deck became 1 hour, but I immediately used those saved 7 hours to build a prototype and write a spec.
This V1 draft was trying to measure the wrong thing, in the wrong way. I had to scrap it and start over.
Act II: The pivot
I didn’t just need to iterate; I needed to pivot. My V1 failed because it was built on a flawed foundation. I was trying to measure activity instead of impact.
My thinking finally clicked when I stopped asking “What can I measure?” and started asking a much better question: “What kind of conversations do I want to have with the team?”
As the person responsible for driving adoption, this was everything. I realized that the metric I set wouldn’t just be a number on a dashboard; it would define my 1-on-1s for the next six months.
I had a choice. Do I want to be the metric-enforcer, asking:
How many minutes did you save on that meeting summary? How many Confluence pages did you create with Gemini this week?
Or, do I want to be a strategic partner and brainstorm:
How can we use AI to create an amazing new deliverable for your projects?
Frankly, that second conversation is more fun and impactful for everyone. It’s the one that builds trust and actually helps people see the potential of these tools, rather than just forcing them to track their time. I decided to build my entire framework around enabling that second conversation.
This led me to what I call the professor’s homework analogy. I took the perspective of a professor assigning homework. The goal isn’t just to get the task done. The homework is to create deliverables so valuable that they can be shared and benefit everyone on the team. These artifacts should boost the individual project while also contributing to the team’s collective knowledge.
This single, simple indicator became my new north star:
The % of initiatives that have four specific, AI-built artifacts.
I defined this homework as four core artifacts:
The external brain (NotebookLM): A centralized source of truth for all initiative research. PMs and Designers can upload all their user interviews, research docs, and market data into one place.
Value: This becomes a strategic partner for insights. Instead of manually re-reading 50 pages, you can ask it direct questions like, “What patterns are emerging from our last 10 user interviews?” or “Summarize our competitor’s strategy for this segment.” It helps you analyze data at a scale you can’t manually, uncovering “deep insights” that lead to better product decisions.
The personal assistant (Gem): A custom-built Gem or GPT designed to solve one unique, project-specific task, like a “custom Gem that generates use cases based on a specific persona and template”.
Value: This frees up mental energy for high-impact work. It automates the “tedious but necessary” parts of our jobs. Imagine a Gem that “transforms messy meeting transcripts into formal meeting minutes” or one that “converts user interview meeting transcripts into customer insight templates”. By eliminating this “busy work”, it frees us to focus on the “complex, strategic, and creative problems that only we can solve”.
The AI sketchpad (prototype): An AI-generated prototype used to quickly visualize and test a core idea, especially abstract concepts.
Value: This is a creative accelerant. It allows PMs and Designers to explore dozens of design concepts and generate prototypes from simple prompts. Instead of getting stuck on the first idea, we can explore more possibilities to find the best solution. This leads to more creative, novel, and validated solutions.
The AI co-pilot (Deep Research): A key deliverable (like a competitive analysis or research summary) created in deep partnership with an AI tool like Gemini.
Value: This elevates our strategic thinking. The AI can accelerate analysis and quickly map the competitive landscape, doing the heavy lifting of gathering and summarizing. This allows the PM or Designer to focus their expertise on synthesis, uncovering deep insights, and pinpointing clear strategic opportunities.
Note about the AI tools: I provided a “starter pack” of recommended, IT-approved tools that were perfect for each artifact’s use case. But I was clear that these were just recommendations. The real goal was the artifact concept: the outcome, not the specific tool. As long as the tool was compliant, they were free to use whatever worked best for them.
To make this simple, we didn’t create a new dashboard. We integrated this metric directly into our existing workflow. We added a field in the project tracking tool where team members can simply log a link to each artifact as they complete it for their initiative.
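To make that single number concrete, here’s a minimal sketch of how the adoption rate could be computed from an export of the tracking tool. The field names and toy data below are my own placeholders, not our actual schema; the point is just that the metric reduces to counting initiatives with all four links.

```python
# Minimal sketch, assuming each initiative is exported as a dict with one
# field per artifact link. Field names ("notebooklm", "gem", "prototype",
# "deep_research") are hypothetical placeholders for your tool's schema.

REQUIRED_ARTIFACTS = ("notebooklm", "gem", "prototype", "deep_research")

def adoption_rate(initiatives):
    """Return the % of initiatives with a link logged for all four artifacts."""
    if not initiatives:
        return 0.0
    complete = sum(
        1 for initiative in initiatives
        if all(initiative.get(artifact) for artifact in REQUIRED_ARTIFACTS)
    )
    return 100 * complete / len(initiatives)

# Toy example: one of two initiatives has all four artifacts -> 50%.
initiatives = [
    {"name": "Checkout revamp", "notebooklm": "https://example.com/nb",
     "gem": "https://example.com/gem", "prototype": "https://example.com/proto",
     "deep_research": "https://example.com/research"},
    {"name": "Pricing page", "notebooklm": "https://example.com/nb2",
     "gem": None, "prototype": None, "deep_research": None},
]
print(f"AI adoption: {adoption_rate(initiatives):.0f}%")
```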
This pivot meant explicitly de-prioritizing the old metrics. We decided that measuring time saved is a distraction. And we decided against surveying self-reported confidence because, as I’d seen, it’s highly subjective and the quality of the work is the only true and objective indicator of our capabilities.
Recap: As product managers, we love a clear user story. Here is the spec I gave to the team, which is the heart of this new framework:
As a Product Manager or Designer, When I am working on a major project, I want to create these four specific AI artifacts (NotebookLM, Gem, Prototype, Deep Research) and link them in our project tracking tool, So that I can improve the quality of my own work, contribute to our team’s collective knowledge, and help us measure our real AI adoption.
Act III: The flywheel
A new metric is just a compass. It can’t provide the engine. I quickly realized that simply announcing this new framework wasn’t enough. The “Why” was clear, the “What” was defined, but I was missing the most critical part: the “How”.
If I’m being honest, a metric is just a conversation starter. The real work is in the support system you build around it. My goal was to create a human-centric flywheel to get people unstuck quickly and build trust.
This support system has three parts:
Empathy through shadowing
Discussions are critical, but you can’t really know what’s wrong until you look over someone’s shoulder. I made it my job to shadow team members. This was humbling and incredibly revealing.
I saw people struggling in ways I had forgotten, because I was already a power user. I watched a colleague stare at a blank Gemini prompt, type “Hello”, get a generic response, and close the tab in frustration. I saw others who couldn’t even find the right link to access the tool. These aren’t problems you can solve from a metrics dashboard. You have to be there, and in 30 minutes you can help someone make the shift they need.
Scaling support with power users
I couldn’t be the only one doing this. The next step was to find volunteers passionate about AI who could help me spread these ideas. I deputized a team of dedicated power users to act as AI helpers. They help with shadowing and are the first line of support, ensuring that no one stays stuck for long.
The 30-minute promise and proof
Finally, I had to make the metric an offer of help, not a threat. I did this in two ways:
A “prove it” repository: I created a central repository in Confluence where everyone could share the artifacts they built. This provided social proof that it was possible and gave others concrete examples to learn from.
The 30-minute promise: I shared a link for anyone to book a 30-minute 1-on-1 session with me. The promise was simple: “If you spend 30 minutes with me, you’ll be able to build these artifacts.” This turned the entire dynamic from “you must do this” to “let me help you do this.”
This flywheel of shadowing, scaling with power users, and providing clear examples alongside a promise of help is what actually makes the metric work. The metric just points us in the right direction; the human support system provides the fuel.
Reflections on this framework
This framework is far from perfect. It’s an experiment, and I’m very aware that it’s a first draft that will need to change. For one, the logging is still just declarative. A team member can link an artifact, but I still have to follow up to ensure it’s a high-quality deliverable.
But what this framework has done is give us a baseline. It helps me know where we are and what to focus on. More importantly, this entire journey has clarified our thinking on what’s required for an AI transformation.
A baseline that prepares the team for what’s next
This framework may seem ambitious, but I defined the artifacts with the future in mind. You want to prepare your team for the next steps. For example, creating a Gem is a valuable stepping stone. It trains the team to think in terms of building more complex, automated AI solutions. These artifacts aren’t just homework; they are training for the next wave of AI tools and practices.
An alternative: Measuring AI’s impact on existing metrics
In hindsight, the best way to measure AI adoption might be to use metrics that already make sense for the job. For developers, this could mean looking at their DORA metrics and observing how AI impacts them. If you already know how to measure the efficiency of product managers and designers before AI, it will be far easier to measure AI’s impact. This is a powerful alternative to creating a whole new framework from scratch.
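If you go that route, the measurement itself is simple: compare the numbers the team already trusts before and after the rollout. Here’s a hedged sketch using DORA-style figures; all values are invented for illustration, and you would interpret direction per metric (more deployments is good, a higher change-failure rate is not).

```python
# Hedged sketch of the "existing metrics" alternative: compare numbers the
# team already trusts before and after the AI rollout. All values below are
# made-up illustrations, not real benchmarks.

baseline = {"deploys_per_week": 4.0, "lead_time_days": 6.0, "change_failure_rate": 0.18}
with_ai = {"deploys_per_week": 5.5, "lead_time_days": 4.5, "change_failure_rate": 0.20}

for metric, before in baseline.items():
    after = with_ai[metric]
    delta_pct = (after - before) / before * 100
    print(f"{metric}: {before} -> {after} ({delta_pct:+.0f}%)")
```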
A warning: Watching for system-level bottlenecks
I also know that we must maintain an overall view at the company level. Improving one part of the process can just move the bottleneck somewhere else. To be concrete: if designers are creating more mockups, faster and with higher quality thanks to AI, it doesn’t mean product managers will be able to turn all of that into detailed specifications unless they are also using AI. Then the bottleneck just moves to software development, and so on. We can start with proxy metrics for a specific team, but we must observe the overall process to ensure everyone is getting value.
The non-negotiable: Management sponsorship
It’s also obvious that if you don’t have management as a sponsor, it won’t work. Sponsors have to use the tools themselves, show that it’s a group practice, and champion it in every meeting. This sponsorship is also critical for getting legal to approve tools more rapidly. Employees have to use compliant tools, but legal review takes time, so you need that management push.
Sponsors are also the best ones to explain why we are doing this. Employees might be confused about the goals, wondering if we want to save time or improve quality. You need sponsors to communicate this “why” and to create incentive systems and rewards to make sure people are going in the right direction.
The horizon: From AI-first to AI-native
This brings me to the ultimate goal. This framework is designed to make people AI-first: to make them consider AI as the first step in everything they do. This is a debatable approach, and not the ideal end state, but at this stage you have to force people to build this understanding and discernment.
The ideal situation is to be AI-native: this is when you use AI not just as a first step in old processes, but when the processes themselves are redesigned around AI’s potential. Here’s the difference:
An AI-first PM will use AI to build a presentation to convince a committee to build a feature.
An AI-native PM will just build the feature and present the results for info, without asking for permission.
That’s the next phase, and this framework is a first step toward it.
Actionable lessons for product leaders
This journey has been one of my most challenging as a PM, but it’s taught me a few core lessons. If you’re tasked with driving AI adoption, I hope my experience can help you find your own footing.
Design your metrics around your conversations: This is the most important lesson. Don’t start with a spreadsheet. Start by asking yourself: What discussion do I want to have with my team? Do you want to ask about minutes saved, or do you want to brainstorm new, high-impact deliverables? Build the simplest possible metric that forces that positive conversation.
Start with one, simple, trackable metric. My V1 framework was a complex mess of different, hard-to-track ideas. My V2 pivoted to focusing on one and only one north star metric to keep things simple: the % of initiatives with the four artifacts. This gave everyone a clear, unambiguous target. Critically, I made it easy to track by building it into our project tracking tool workflow. Don’t add friction; integrate.
Treat AI adoption as a product: Your metrics framework is a product. It needs a clear spec (like the 4 artifacts), a simple UI (logging it in our project tracking tool), clear documentation (training and examples), and a human support system (the power users).
Get executive sponsorship: This will not work without management sponsors. This is non-negotiable. You need them for two things:
The “why”: Sponsors must communicate the reason we’re doing this. Is it to save time? No. It’s to improve quality. That message must come from the top to eliminate confusion and build trust.
The “how”: Sponsors are your enablers. You need them to help legal and IT progress more rapidly to approve new AI tools, which is often the biggest bottleneck. They can also create the incentive systems and rewards that motivate the entire team.
Build a human support system: Metrics are just a compass; your people are the engine. You must build a support system. This includes finding passionate power users, creating a central place for examples, and, most importantly, shadowing your team. You will be shocked to find the real blockers aren’t what you think: sometimes, people just don’t know where to click.
Look for existing metrics: While I built a new framework for the team, a good place to start is often with metrics you already track. If your developer teams use DORA metrics, for example, how does AI impact those? Measuring AI’s influence on metrics your team already understands and values can be a much easier way to prove its impact.
Further reading
This framework was the result of a lot of trial, error, and research. It’s an experiment, and like any good product, it’s built on the work of others. As I mentioned, this was one of the hardest PM problems I’ve faced, and LLMs couldn’t just give me the answer.
I did, however, find a few articles that helped lead me to this first-step framework. I’m sharing them here in case they can help you on your own journey.
From Memo to Movement (First Round Review, featuring Shopify): This article was an inspiration for the importance of management sponsorship. It details how Shopify got its legal team to default to “yes” on new tools and even encouraged unlimited spend on AI to signal its importance, reinforcing that adoption must be a top-down priority.
How tech companies measure the impact of AI on software development (The Pragmatic Engineer): This piece was a validation. It confirms that lack of clear metrics is the number one challenge for most leaders. Its core idea, to stop inventing new metrics and instead measure AI’s impact on existing core metrics like DORA, directly supports the approach I believe is a smart alternative to my own.
25 proven tactics to accelerate AI adoption at your company (Peter Yang, in Lenny’s Newsletter): This article provided a great tactical playbook. It gave concrete examples for two of my most important lessons: “cutting the red tape” (like giving teams a budget) and “turning enthusiasts into teachers” (the “power users” concept).
How are you tackling AI adoption? I’m especially curious if you’ve also found “time saved” to be a flawed metric for creative work. Please share your own lessons, frameworks, and failures in the comments!

