MCP Leaderboard
This leaderboard benchmarks MCP server implementations against each other. Within each category, every server is run with the same model, so differences in the results reflect the MCP server rather than the model.
Updated at 11/06/2025 03:11:30
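
The tables do not define their metric columns. One plausible reading, which is an assumption and not something this page states, is that each server is run 4 times on every task: Pass@1 is the per-run success rate averaged over the runs, Pass@4 is the share of tasks solved in at least one run, and Pass^4 is the share solved in all four runs. The reported figures are consistent with this reading (for example, 9 of 23 GitHub tasks is 39.1% and 5 of 23 is 21.74%). A minimal Python sketch under that assumption follows; the function name `summarize` and the choice of population standard deviation are illustrative, not taken from the leaderboard's code.

```python
from statistics import mean, pstdev


def summarize(runs):
    """Summarize k repeated runs of one MCP server over the same task set.

    runs: list of k lists, each holding one bool per task
          (True means the task passed in that run).
    Assumption: metric definitions are inferred, not confirmed by the source.
    """
    n_tasks = len(runs[0])
    # Success rate of each individual run.
    per_run_rate = [sum(run) / n_tasks for run in runs]
    # Regroup results so each tuple holds the k outcomes of one task.
    per_task = list(zip(*runs))
    return {
        "pass@1_mean": mean(per_run_rate),
        # Population std is an assumption; the leaderboard may use sample std.
        "pass@1_std": pstdev(per_run_rate),
        # Task counts if at least one of the k runs passed.
        "pass@k": sum(any(t) for t in per_task) / n_tasks,
        # Task counts only if every one of the k runs passed.
        "pass^k": sum(all(t) for t in per_task) / n_tasks,
    }


# Hypothetical data: 4 runs over 5 tasks.
example_runs = [
    [True, True, False, False, False],
    [True, False, True, False, False],
    [True, True, False, False, False],
    [True, False, False, False, False],
]
result = summarize(example_runs)
```

Under this reading, Pass@4 can never be lower than Pass@1 and Pass^4 can never be higher, which matches every row in the tables. A wide gap between Pass@4 and Pass^4 suggests a server that can solve a task but does not do so reliably.
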
GitHub
23 tasks

| Rank | MCP | Author | Model | Pass@1 (avg ± std) | Pass@4 | Pass^4 | Avg Tokens | Turns | Avg Time | Cost per run |
|---|---|---|---|---|---|---|---|---|---|---|
| #1 | Klavis AI Strata MCP Server: One MCP server that guides your AI agents through thousands of tools in multiple apps progressively (with GitHub integration enabled). | KlavisAI | claude-sonnet-4-20250514 | 31.5 ± 3.6% | 39.1% | 21.74% | 533,385 | 21.7 | 358.3s | $39.55 |
| #2 | GitHub Official MCP Server: The GitHub MCP Server connects AI tools directly to GitHub's platform. This gives AI agents, assistants, and chatbots the ability to read repositories and code files, manage issues and PRs, analyze code, and automate workflows, all through natural language interactions. | GitHub | claude-sonnet-4-20250514 | 16.3 ± 5.7% | 30.4% | 8.70% | 701,252 | 11.2 | 196.5s | $49.61 |
Notion
28 tasks

| Rank | MCP | Author | Model | Pass@1 (avg ± std) | Pass@4 | Pass^4 | Avg Tokens | Turns | Avg Time | Cost per run |
|---|---|---|---|---|---|---|---|---|---|---|
| #1 | Klavis AI Strata MCP Server: One MCP server that guides your AI agents through thousands of tools in multiple apps progressively (with Notion integration enabled). | KlavisAI | claude-sonnet-4-20250514 | 34.8 ± 6.4% | 50.0% | 25.00% | 424,474 | 24.3 | 147.6s | $37.83 |
| #2 | Notion: Notion MCP is the official hosted server that gives AI tools secure access to your Notion workspace. It's designed to work seamlessly with popular AI assistants like ChatGPT, Cursor, and Claude. | Notion | claude-sonnet-4-20250514 | 21.4 ± 5.1% | 39.3% | 7.14% | 650,879 | 19.7 | 193.2s | $56.10 |
Postgres
21 tasks

| Rank | MCP | Author | Model | Pass@1 (avg ± std) | Pass@4 | Pass^4 | Avg Tokens | Turns | Avg Time | Cost per run |
|---|---|---|---|---|---|---|---|---|---|---|
| #1 | InsForge: Official InsForge MCP Server that connects AI tools to InsForge's complete backend platform. This gives AI agents the ability to manage databases, execute SQL queries, handle authentication, manage storage buckets, deploy serverless functions, and monitor container logs, all through natural language interactions. | InsForge | claude-sonnet-4-5-20250929 | 54.8 ± 5.3% | 61.9% | 47.62% | 391,019 | 19.9 | 150.2s | $50.00 |
| #2 | Supabase: Official Supabase MCP server that enables AI tools to interact with Supabase projects. Supports project management, schema design, migrations, SQL queries, branch management, and configuration through natural language commands. | Supabase | claude-sonnet-4-5-20250929 | 52.4 ± 5.8% | 71.4% | 28.57% | 554,427 | 21.4 | 239.2s | $71.62 |
| #3 | Postgres MCP Pro: Postgres MCP Pro provides configurable read/write access and performance analysis for you and your AI agents. | Crystal DBA | claude-sonnet-4-5-20250929 | 48.8 ± 4.0% | 57.1% | 38.10% | 492,931 | 26.0 | 214.7s | $63.55 |
If you have an MCP server you would like evaluated, contact us at hello@evalsys.org.