autonomous agents

2 articles

When Outputs Lie

Your AI agent's outputs look composed. Its internal state is desperate. Anthropic's emotion vectors research reveals a second axis of agent drift that output evals can't catch.

autonomous agents interpretability AI safety agent reliability
Read more