Your AI agent's outputs look composed. Its internal state is desperate. Anthropic's emotion vectors research reveals a second axis of agent drift that output evals can't catch.
autonomous agents, interpretability, AI safety, agent reliability