Blog

Thoughts and technical writings

Why the scaffold around an AI model matters more than the model itself, and five principles for getting the most out of agentic coding tools.

How I built Snow, an MCP server that gives AI agents persistent, contextual memory through a local SQLite database with FTS5 search and typed metadata schemas.

How I configured OpenCode with a primary coding agent and specialized subagents for research and debugging, with tailored temperature settings and tool access.

The tension between safety and capability in agent design, and how to build constraints that prevent bad outcomes without rendering agents useless.

How I discovered 40% of my token budget was going to waste, and the strategies I use to optimize costs without sacrificing quality.

Hard-won lessons about observability, cost control, human-agent interaction, and why simple agents outperform clever ones.

Patterns that work for coordinating multiple agents: hub-and-spoke coordination, shared state management, and graceful degradation.

Why accuracy alone tells you nothing useful about agent performance, and the multi-dimensional evaluation framework I use instead.

Designing effective collaboration patterns between humans and agents, from escalation handoffs to approval gates and input requests.

What to do when your agent burns through API quotas at 2 AM, and a systematic approach to debugging autonomous systems.

The difference between post-hoc explanations and embedded reasoning, and why explainability is crucial for trusting autonomous systems.

Why traditional monitoring approaches fail for AI agents, and the three critical gaps: decision visibility, cost attribution, and quality signals.

How fragile tool integrations can bring down your entire agent, and the defensive patterns that make systems resilient to external API changes.

Why most agent logging is a firehose of noise, and how decision-tree logging cut my debugging time from hours to minutes.

Lessons learned from migrating monolithic ML applications to a microservices architecture, and the impact on performance and maintainability.

How to conduct effective code reviews when working with data scientists, product managers, and other non-engineering stakeholders.

A practical guide to taking machine learning models from Jupyter notebooks to production-ready systems that scale.

© 2026 Matt Emmons