Log

A messy index of what I’m noticing: work, cities, meals, music, films, links, clips, and half-formed thoughts.

2 entries · #agents

Jun 7 · thought

Agents need workflow state

The best AI products don't feel like chatbots. They feel like someone quietly cleaned up the workflow graph behind the scenes.

Jun 5 · link

Agent evals and fake precision

Saving this because it explains why agent evals so often produce fake precision: clean numbers on a benchmark that doesn't resemble the real workflow.

Anthropic

Log · Arjun Aggarwal