I recently ran an experiment inspired by a thought-provoking post from Scrum.org about AI’s potential to generate high-quality code. As someone who’s always curious about how new tools can augment our development processes—without replacing the critical thinking we bring to the table—I wanted to push the idea further.
In this experiment, I focused on how Behavior-Driven Development (BDD) can naturally enable Test-Driven Design/Development (TDD), and how AI can assist at every stage: from writing user stories to developing and refining the code.
The Setup
For the coding exercise, I used the classic Prime Factors Kata. My goal was simple: collaborate with AI as I would with a new developer on the team. Could AI understand the problem? Could it write good user stories? Would it help me design reliable tests and clean, maintainable code?
What I Did
- User Stories & Acceptance Criteria: I started by asking ChatGPT to generate user stories and acceptance criteria, leaning on my usual style—purpose-driven and cinematic, with clear Given/When/Then scenarios.
- Writing Tests: Using Cursor, I implemented the tests first, following a BDD style. This not only tested the AI’s output but also let me scrutinize how well the AI supported my thinking process.
- API Design: AI helped me draft the initial design of the `PrimeFactors` class and its methods. We iterated together to clarify the API and ensure good naming and structure.
- Standards & Refactoring: I had AI draft coding standards in markdown format, and then used those standards to refactor both tests and production code, again working collaboratively with AI to uphold quality.
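
For readers who haven't met the kata, here is a minimal sketch of what a test-first, Given/When/Then style can look like for a `PrimeFactors` class. It is not the code from the video: the choice of Python, the static method name `of`, the error handling, and the scenarios themselves are all assumptions made for illustration.

```python
from typing import List


class PrimeFactors:
    """Illustrative sketch only; the method name `of` is a hypothetical API."""

    @staticmethod
    def of(number: int) -> List[int]:
        """Return the prime factors of `number` in ascending order."""
        if number < 1:
            # Assumed behavior: reject zero and negative input.
            raise ValueError("number must be a positive integer")
        factors: List[int] = []
        divisor = 2
        while number > 1:
            # Divide out each divisor completely before moving on, so only
            # primes ever end up in the result.
            while number % divisor == 0:
                factors.append(divisor)
                number //= divisor
            divisor += 1
        return factors


def test_given_one_when_factored_then_there_are_no_factors():
    # Given
    number = 1
    # When
    factors = PrimeFactors.of(number)
    # Then
    assert factors == []


def test_given_a_prime_when_factored_then_the_only_factor_is_itself():
    # Given
    number = 7
    # When
    factors = PrimeFactors.of(number)
    # Then
    assert factors == [7]


def test_given_a_composite_when_factored_then_factors_repeat_in_ascending_order():
    # Given
    number = 12
    # When
    factors = PrimeFactors.of(number)
    # Then
    assert factors == [2, 2, 3]
```

Each test name reads as a small scenario, which is one way to let acceptance criteria written as Given/When/Then drive the unit tests directly. In a test-first flow the tests would be written before the class body; they are shown together here only for brevity.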
The Outcome
This experiment reinforced something I’ve been thinking about a lot: AI can be a fantastic assistant, but it shines brightest when paired with human judgment. By treating AI like a junior team member—someone you coach and collaborate with—you can get solid results while keeping your standards high.
👉 Check out the full video here: Watch the Experiment on YouTube
Related Reading
For deeper dives into the techniques I used and the mindset I brought to this experiment, you might enjoy:
- Is TDD something you do sometimes, or all the time?
- The Purpose of a User Story
- Are Your User Stories Cinematic?
- Given-When-Then: Past, Present, and Future
- Test Style: AAA or GWT?
- Differences Between TDD and BDD
- About Commenting Code
Also check out the AI things we do at Improving!
I’d love to hear your thoughts: Have you tried pairing with AI yet? What has worked (or not) for you?




