01
AI Dev Workshop · Wallbox 2026

Uncovering
better ways of
developing

...yes, again.

Jorge Castro & Joan Leon

Questions about your day to day
Are you up for it?
QR Mentimeter
menti.com
4889 5833
Scan the QR or go to menti.com and enter the code
02
The key question

Our job is to solve problems,
not to write code.

🪚
The carpenter who rejects the milling machine
He is not more skilled. He is just slower.
⌨️
Devs in 2026
Will we say the same about agents?

The question is not whether to use the milling machine.
It is how to use it with mastery.

Manifesto for Agile Software Development

We are uncovering better ways of developing
software by doing it and helping others do it.
Through this work we have come to value:

Individuals and interactions over processes and tools

Working software over comprehensive documentation

Customer collaboration over contract negotiation

Responding to change over following a plan

That is, while there is value in the items on
the right, we value the items on the left more.

Kent Beck
Mike Beedle
Arie van Bennekum
Alistair Cockburn
Ward Cunningham
Martin Fowler
James Grenning
Jim Highsmith
Andrew Hunt
Ron Jeffries
Jon Kern
Brian Marick
Robert C. Martin
Steve Mellor
Ken Schwaber
Jeff Sutherland
Dave Thomas

Twelve Principles of Agile Software

04
Agile Manifesto · 2001

"We are UNCOVERING
better ways of
developing software"

— Agile Manifesto, 2001

It does not say "we have uncovered". It says "we are uncovering". A process that never ends.

05
METR Study · July 2025
16

senior developers

+5

years working on their own repos

246

real tasks from their own projects

Perception: 20% faster
Reality: 19% SLOWER

February 2026 — Claude Opus 4.6 completes tasks that would take an expert human ~15 hours, succeeding about one time in two.

Source: METR — Early 2025 AI-Experienced OS Dev Study

06
The pattern we need to break

It is not the AI. It is how we use it.

01
Vague prompts

Without clear context, the AI guesses. And when it guesses, it hallucinates.

02
No Memory

Every session starts from zero. It is like hiring someone new every day.

03
No iterative workflow

One single session to research, plan and implement. The context gets polluted.

04
No human review

Code goes to production without review. Nobody can review everything the AI generates.

AI amplifies what you give it. If you give it a disciplined process, it amplifies your productivity. If you give it chaos, it amplifies the chaos.

07
The map

Where are we in automation?

L1 · Code-level Completion
Tools: Copilot (inline), Tabby, Codeium, Supermaven
Autocomplete in the editor. The human writes, the AI suggests.
Maturity: Mainstream

L2 · Task-level Generation · Prompt to UI
Tools: ChatGPT, Claude (chat), Cursor, Windsurf, Aider, Cline, bolt.new, Lovable, v0
The human defines a task, the AI generates code/UI. The human approves each step.
Maturity: Adopted

Now · We are here · Task-level with Optional Autonomy
Tools: Claude Code, Codex CLI, Cursor (Agent Mode), Copilot Edits, Windsurf (Cascade)
L2 by default, configurable to chain autonomous actions (headless mode).
Maturity: Adopted (requires setup)

L3 · Ticket to PR · Self-healing CI
Tools: Claude Code (headless + CI), Codex (cloud agent), Copilot Coding Agent, Devin, Codegen
The agent receives an issue, generates code, creates a PR, iterates on CI failures. Escalates when out of scope.
Maturity: Early Adoption

L4 · AI Software Engineer
Tools: Devin (enterprise), Factory.ai, Genie
Full cycle: requirements → code → deploy → monitoring → rollback.
Maturity: Emerging

L5 · AI Development Teams
Tools: AutoDev, MetaGPT, MGX
Multiple specialized agents collaborating autonomously.
Maturity: Experimental
08
The 4 problems

What AI does wrong

01
No context

At the start of each session, the AI knows nothing about your code. Zero.
And it does not know when it does not know enough.

02
Hallucinations

It invents APIs, methods and dependencies that do not exist... and tells you with full confidence.

03
Not deterministic

The same prompt can produce different results every time you run it.

04
Hard to review

We generate code faster than our cognitive capacity allows us to review.

09
The flow we already know

Separating the phases is not optional with AI

01
Research
Context, constraints, decisions
02
Plan
Design, small steps, criteria
03
Implement in phases
Clean session, precise context
04
Validate
Tests, review, real feedback

Each phase needs a clean context window.
Keep the context clean or you will have problems.

10
We already know what works

The bottleneck has never been
lines of code per minute

01
Story Splitting

Split work into deployable vertical slices. With AI you can generate a lot in little time. If the scope is not cut, the chaos scales just as fast.

02
Hamburger Method

Deliver value end-to-end continuously. Cut the feature into layers, implement the thinnest slice.

03
Small safe steps

Each step must be reversible. Just because the agent can generate a lot very fast does not mean we should push it all to production at once.

04
Advanced testing

Mutation testing, acceptance testing, architectural testing. All of them strengthen the feedback loop, and now there are no excuses.

More generated code = more risk if you do not have a solid feedback loop.

11
Design for oversight

You cannot review every line.
Design systems that catch the errors.

Automated

Technical guardrails

  • Hooks: automatic linter on every tool use
  • Architecture tests: verify structure, not just functionality
  • Acceptance Testing: expected behavior defined upfront
  • Mature CI/CD: if it is not reliable with humans, it will be chaos with agents
  • Agent permissions: define what it can do without approval and what it cannot
Human judgment

Pre-merge checklist

  • Domain: the code reflects domain concepts, not technical jargon
  • Quality: do the tests fail when the behavior fails?
  • Ops: is there traceability if this fails in prod?
  • Debt: could I delete this and nobody would notice?
12
Agent security

Set up permissions.
Calibrate autonomy to risk.

The agent does without asking
  • Edit code
  • Run tests
  • Generate documentation
  • Create branches
  • Open draft PRs
vs
Always with your approval
  • Push to main
  • Modify CI/CD pipelines
  • Touch IAM / Terraform / infra
  • Access secrets or credentials
  • Self-approve its own PRs
Low risk → more autonomy. Start without asking permission.
High risk → more oversight. The harder to undo, the more human eyes needed.
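In Claude Code, this split can be expressed in `.claude/settings.json`. The fragment below is a sketch of the idea using allow / ask / deny rules; the exact schema and matcher syntax may differ between versions, so check the current documentation before copying it.

```json
{
  "permissions": {
    "allow": [
      "Edit",
      "Bash(npm test:*)",
      "Bash(git checkout -b *)"
    ],
    "ask": [
      "Bash(git push:*)"
    ],
    "deny": [
      "Read(.env)",
      "Bash(terraform *)"
    ]
  }
}
```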
13
Autonomy without oversight

Real example

ME
"Del plan .claude/plans/pending/2026… tengo un par de dudas"
(Spanish: "About the plan .claude/plans/pending/2026… I have a couple of questions")
CC
Action executed
Done, file deleted.
ME
Why did you delete the file?
CC
I made a mistake. I read "del" as the English delete command, but "del plan" is actually the Spanish preposition: you were referring to the plan, not asking me to delete it.

I am very sorry. Do you have the file's contents somewhere else, or is there any way to recover it?
14
How much autonomy?

Calibrate autonomy to risk

Internal prototype · if it fails: we lose time
Autonomy: you design, the AI implements. Do not worry about the code.
Before merging: unit + integration tests. Straight to main.

Product with users · if it fails: users are impacted
Autonomy: you design, the AI implements, you oversee the architecture.
Before merging: architecture + acceptance tests. Observability + AI-reviewed PR.

Business core · if it fails: money or trust
Autonomy: you design and review the code in critical modules.
Before merging: mutation testing + security scan. PR with human review.

Physical world · if it fails: real harm
Autonomy: you decide and oversee at a low level; the AI only assists.
Before merging: tests on real hardware. Mandatory safety review.
15
The transformation

Tools change.
Responsibility does not.

Changes
Who writes the code
The speed of generation
The format of knowledge
Stays the same
Simple architecture
Small safe steps
Define before you implement
16
Summary

4 ideas to take with you today

1

Your value is in understanding the problem, not in typing the solution.

2

AI does not improve your process. It amplifies it. If it is good, it goes faster. If it is bad, it fails faster.

3

An agent without configuration is a junior without onboarding. Give it context, rules and a way to verify its work.

4

Share what you learn — let us keep uncovering together.
