iris-eval/mcp-server
MCP-native agent evaluation and observability server
See what your AI agents are actually doing. Iris is an open-source MCP server that logs every trace, evaluates output quality, and tracks costs across all your agents. Any MCP-compatible agent discovers and uses it automatically — no SDK, no code changes.
Your agents are running in production. Traditional monitoring sees 200 OK and moves on. It has no idea the agent just:
Iris sees all of it.
| | |
|---|---|
| Trace Logging | Hierarchical span trees with per-tool-call latency, token usage, and cost in USD. Stored in SQLite, queryable instantly. |
| Output Evaluation | 12 built-in rules across 4 categories: completeness, relevance, safety, cost. PII detection, prompt injection patterns, hallucination markers. Add custom rules with Zod schemas. |
| Cost Visibility | Aggregate cost across all agents over any time window. Set budget thresholds. Get flagged when agents overspend. |
| Web Dashboard | Real-time dark-mode UI with trace visualization, eval result
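
To make the Cost Visibility row concrete, here is a minimal sketch of windowed cost aggregation with a budget threshold. It is an illustration only, not Iris's implementation or API: the `TraceRecord` shape, the `costByAgent` and `overBudget` helpers, and the sample agent names are all hypothetical.

```typescript
// Hypothetical trace record. Field names are assumptions, not Iris's schema.
interface TraceRecord {
  agent: string;
  timestampMs: number;
  costUsd: number;
}

// Sum cost per agent for traces whose timestamp falls in [startMs, endMs).
function costByAgent(
  traces: TraceRecord[],
  startMs: number,
  endMs: number
): Map<string, number> {
  const totals = new Map<string, number>();
  for (const t of traces) {
    if (t.timestampMs < startMs || t.timestampMs >= endMs) continue;
    totals.set(t.agent, (totals.get(t.agent) ?? 0) + t.costUsd);
  }
  return totals;
}

// Return, in sorted order, the agents whose windowed cost exceeds the budget.
function overBudget(totals: Map<string, number>, budgetUsd: number): string[] {
  return Array.from(totals.entries())
    .filter(([, cost]) => cost > budgetUsd)
    .map(([agent]) => agent)
    .sort();
}

// Made-up sample data for illustration.
const traces: TraceRecord[] = [
  { agent: "support-bot", timestampMs: 1000, costUsd: 0.25 },
  { agent: "support-bot", timestampMs: 2000, costUsd: 0.5 },
  { agent: "research-bot", timestampMs: 2500, costUsd: 0.125 },
  { agent: "research-bot", timestampMs: 9000, costUsd: 4.0 }, // outside the window
];

const totals = costByAgent(traces, 0, 5000);
const flagged = overBudget(totals, 0.5);
console.log(flagged);
```

With this sample data, only `support-bot` exceeds the 0.50 USD budget inside the window; the 4.00 USD `research-bot` trace is excluded because its timestamp falls outside it.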