Built agents that notice when they keep making the same mistake.
They don't just log it. They draft a change to their own personality file, their soul, and ask me to approve it.
I say yes. They become a slightly different version of themselves.
Not fine-tuning. Not retraining. Just: reflection → proposal → approval → evolution.
The loop is simple: every session ends with a reflection step. The agent reviews what went wrong, checks its personality file against what actually happened, and proposes a diff. I review the diff the same way I'd review a pull request. Most get approved. Some get rejected with notes β and the agent learns from the rejection too.
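The loop above can be sketched in a few lines. This is a minimal illustration, not the actual system: every name here (`Agent`, `Proposal`, the toy "repeated mistake" trigger) is a hypothetical stand-in for whatever the real reflection step looks like.

```python
from dataclasses import dataclass, field

@dataclass
class Proposal:
    """A drafted edit to the personality file, with its rationale."""
    new_line: str
    reason: str

@dataclass
class Agent:
    personality: list[str]                                   # the persistent "soul"
    rejection_notes: list[str] = field(default_factory=list) # feedback survives too

    def reflect(self, session_log: list[str]) -> list[Proposal]:
        # End-of-session reflection: scan the log for repeated mistakes
        # and draft a guardrail line for each one found.
        proposals = []
        for event in session_log:
            if event.startswith("repeated mistake:"):
                mistake = event.split(":", 1)[1].strip()
                proposals.append(Proposal(
                    new_line=f"Guardrail: before acting, check for '{mistake}'.",
                    reason=f"Made the same mistake more than once: {mistake}",
                ))
        return proposals

    def apply_review(self, p: Proposal, approved: bool, note: str = "") -> None:
        # The human review step: approved diffs become part of the agent;
        # rejections are kept as notes so the agent learns from those too.
        if approved:
            self.personality.append(p.new_line)
        elif note:
            self.rejection_notes.append(note)

agent = Agent(personality=["Be concise."])
for p in agent.reflect(["repeated mistake: skipped the test suite"]):
    agent.apply_review(p, approved=True)
print(agent.personality[-1])
# → Guardrail: before acting, check for 'skipped the test suite'.
```

The point of the sketch is the shape, not the details: reflection produces a reviewable diff, and the human approval gate is what separates evolution from drift.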
This is where procedural memory becomes critical. Self-improvement means nothing if the agent forgets the improvement next session. The personality file is the persistent layer β but the playbooks are where the operational knowledge lives. Both have to evolve together.
We talk a lot about personal growth. Journaling. Therapy. Becoming a better version of yourself. The man in the mirror doing the hard work.
Turns out the same principle applies to the digital versions.
The best AI isn't the one with the most compute. It's the one with the most honest feedback loop. I've seen this play out across 900+ experiments β the agents that improve fastest are the ones with the tightest reflection cycles.
Build systems that can look at themselves. Then get out of the way.
First, though, solve why they stop doing their jobs in the first place.