MCPMark: Stress-Testing Comprehensive MCP Use
MCP servers are shaping the future of software. MCPMark is a comprehensive stress-testing benchmark: a collection of diverse, verifiable tasks designed to evaluate model and agent capabilities in real-world MCP use.
Model Ranking
Average task resolution success rate for top and select models on MCPMark's dataset of 127 tasks.