For most of my career, estimating software work was already a bit of a guessing game.

We all know the feeling. A “quick change” turns into a deep dive, and a “complex feature” sometimes just… works out in an afternoon. Scrum was never really about being precise. It was about dealing with the fact that software is uncertain by nature.

Then AI showed up.

And it didn’t remove uncertainty. It just changed its shape.

A while ago I was building a .NET feature with API integrations, validation, unit tests, and some Azure configuration. In my head I still had the old reflex. “This is a two- or three-day job.”

Out of curiosity I opened GitHub Copilot, described what I needed, and watched it start generating code almost immediately. Controllers, services, validation, tests, even documentation. Not perfect code, but definitely not nothing.

I remember thinking, “This would have taken me most of the week.”

Instead, I wasn’t writing everything from scratch anymore. I was reviewing, adjusting, and trying to understand what the AI had actually produced.

So the task didn’t become a 15-minute job.

But it also wasn’t a three-day job anymore. It became something else entirely.

That’s where Scrum starts to feel different.

Because what exactly are we estimating now?

Traditionally, we assume bigger tasks take longer and smaller tasks take less time. That mental model breaks down quickly when AI can generate entire chunks of a solution in minutes. Azure Bicep templates, REST APIs, database scripts, tests, all of it can appear faster than you can properly read it.

Sometimes it looks great at first glance. Sometimes it hides subtle bugs or architectural problems you only discover later when you dig in.

And that’s the strange part. You often don’t know which one you’re dealing with until you’re already inside the code.

So the real work shifts. Writing becomes cheap. Understanding becomes expensive.

Velocity used to be one of those Scrum metrics teams quietly relied on. If last sprint was 40 points, next sprint would be somewhere in that range. It was never perfect, but it was useful.

AI breaks that rhythm.

One developer ships five stories because AI handled most of the implementation. Another ships two because they spent their time reviewing, correcting, and untangling AI-generated complexity. Who delivered more value? It’s no longer obvious.

Velocity starts to measure something messy. A mix of experience, judgment, tooling, and how risky the generated code happened to be.

The real bottleneck has shifted away from typing code.

It’s now understanding.

Understanding what the business actually needs. Understanding the architecture. Understanding whether what was generated makes sense in production. AI can produce thousands of lines of code, but it cannot tell you whether those lines align with security rules, compliance requirements, or long-term design choices.

That responsibility still sits with us.

I’ve also noticed this during sprint planning. A question that didn’t exist a few years ago now appears almost immediately. “Can AI help with this?”

That single question changes the whole discussion. Something that looks difficult might become trivial. Something that looks simple might turn into a risk because it needs careful validation of generated output.

The uncertainty hasn’t gone away. It has just moved from writing code to reviewing it.

Even the Definition of Done feels different now. When you are no longer writing every line yourself, you don’t naturally carry the same context into the solution.

So checking quality becomes more important than ever. Not just “does it work”, but also “does it make sense”.

Security. Architecture. Business rules. Tests that are not just generated, but actually trusted.

Not because AI should be distrusted, but because it should be treated like any other contributor who didn’t sit in your design meetings.

Fast, helpful, but still needing review.

And in a strange way, this brings Agile back to its core idea.

We are forced to focus more on outcomes again. Because effort is no longer a reliable signal. Customers don’t care how many story points were completed or how much code was written. They care if the problem is solved. They care if the product works better than it did before.

That becomes the real measure again.

People sometimes say Scrum will become obsolete because of AI. I don’t think that’s true.

Scrum was never about predicting effort perfectly. It was about helping teams navigate uncertainty.

AI didn’t remove uncertainty. It just changed what it looks like.

And the teams that adapt won’t be the ones writing the most code anymore.

They will be the ones making the best decisions about what code should exist in the first place.

Looking at it that way, maybe story points aren’t becoming meaningless because Scrum is failing.

Maybe they are becoming less useful because writing code is no longer the hardest part of building software.