mcp-evals

MCP Tool

mclenhard/mcp-evals

A Node.js package and GitHub Action for evaluating MCP (Model Context Protocol) tool implementations using LLM-based scoring. This helps ensure your MCP server's tools are working correctly and performing well.

Install

$ npx loaditout add mclenhard/mcp-evals

Platform-specific configuration:

.claude/settings.json

{
  "mcpServers": {
    "mcp-evals": {
      "command": "npx",
      "args": [
        "-y",
        "mcp-evals"
      ]
    }
  }
}

Add the mcp-evals entry above to the mcpServers object in .claude/settings.json, keeping any servers already listed there.
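If you would rather script this step than edit the file by hand, the merge can be sketched in TypeScript as below. This is a minimal illustration, not part of mcp-evals or any CLI: the `addMcpServer` helper, the `Settings` and `McpServerEntry` types, and the pre-existing `theme` and `other` settings are all hypothetical names made up for the example.

```typescript
// Shape of one entry under "mcpServers", matching the config shown above.
interface McpServerEntry {
  command: string;
  args: string[];
}

// Hypothetical settings shape: an optional mcpServers map plus any other keys.
interface Settings {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown;
}

// Return a copy of `settings` with the server entry added. Existing keys and
// previously registered servers are kept; the input object is not mutated.
function addMcpServer(settings: Settings, name: string, entry: McpServerEntry): Settings {
  return {
    ...settings,
    mcpServers: { ...(settings.mcpServers ?? {}), [name]: entry },
  };
}

// Example input (made up): settings that already contain another server.
const existing: Settings = {
  theme: "dark",
  mcpServers: { other: { command: "node", args: ["other.js"] } },
};

const updated = addMcpServer(existing, "mcp-evals", {
  command: "npx",
  args: ["-y", "mcp-evals"],
});

console.log(JSON.stringify(updated, null, 2));
```

Reading and writing the actual file is left out on purpose, so the merge logic can be checked on its own before it touches .claude/settings.json.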

About

MCP Evals

A Node.js package and GitHub Action for evaluating MCP (Model Context Protocol) tool implementations using LLM-based scoring, with built-in observability support. It helps ensure that your MCP server's tools work correctly, perform well, and are fully observable through integrated monitoring and metrics.

Installation

As a Node.js Package

npm install mcp-evals

As a GitHub Action

Add the following to your workflow file:

name: Run MCP Evaluations
on:
  pull_request:
    types: [opened, synchronize, reopened]
jobs:
  evaluate:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write
    steps:
      - uses: actions/checkout@v4
      
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
          
      - name: Install dependencies
        run: npm install
        
      - name: Run MCP Evaluations
        uses: mclenhard/mcp-evals@v1.0.9
        with:
          evals_path: 'src/evals/evals.ts'    # Can also use .yaml files
          server_path: 'src/index.ts'
          openai_api_key: ${{ secrets.OPENAI_API_KEY }}
          model: 'gpt-4'  # Optional, defaults to gpt-4

Usage -- Evals

1. Create Your Evaluation File

You can create evaluation configurations in either TypeScript or YAML format.

Option A: TypeScript Configuration

Create a file (e.g., evals.ts) that exports your evaluation configuration:

import { openai } from "@ai-sdk/openai";
import { grade, EvalConfig, EvalFunction } from "mcp-evals";

const weatherEval: EvalFunction = {
    name: 'Weather Tool Evaluation',
    description: 'Evaluates the accuracy and completeness of weather information retrieval',
    run: async () => {
      const result = await grade(openai("gpt-4"), "What is the weather in New York?");
      return JSON.parse(result);
    }
};
const config: EvalConfig = {
    model: openai("gp
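The `run` function above passes the string returned by `grade` straight to `JSON.parse`, which throws a bare `SyntaxError` if the model's reply is not valid JSON. A more defensive parse might look like the sketch below. It assumes only that `grade` resolves to a string, as the example implies. `parseGradeResult` is a hypothetical helper and not part of mcp-evals, and the `score` key in the sample input is made up.

```typescript
// Parse a grader reply, failing with a readable error when the reply is not
// a JSON object. Hypothetical helper; not part of the mcp-evals package.
function parseGradeResult(raw: string): Record<string, unknown> {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    throw new Error(`Grader reply is not valid JSON (${reason})`);
  }
  // JSON.parse also accepts arrays, numbers, strings and null; reject those.
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("Grader reply is valid JSON but not an object");
  }
  return parsed as Record<string, unknown>;
}

// Sample input with a made-up key, standing in for a real grader reply.
const sampleReply = '{"score": 4}';
console.log(parseGradeResult(sampleReply));
```

Inside an eval, the last line of `run` would then read `return parseGradeResult(result);` in place of `return JSON.parse(result);`.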
