Blog

Latest updates and announcements from the MCPMark team.

Published on Wednesday, September 10 2025
NEWS

MCP benchmark Leaderboard 2025-09-10 Update

Explore the latest MCPMark leaderboard update featuring top MCP benchmark models like Qwen-3-Max, Grok-Code-Fast-1, and Kimi-K2-0905. Discover their tool-use capabilities, success rates, and cost efficiency for real-world MCP applications.

Published on Tuesday, August 26 2025
NEWS

Introducing MCPMark: a comprehensive and challenging MCP Benchmark

Introducing MCPMark, a comprehensive MCP benchmark to stress-test AI models on MCP tasks. It features 127 expert-crafted samples across diverse environments such as Notion, GitHub, and Postgres. Explore detailed leaderboards, cost analysis, and rigorous task design for real-world AI evaluation.

Published on Tuesday, August 19 2025
NEWS

Introducing EVAL SYS

This MCP benchmark is an EVAL SYS initiative, a collaboration between LobeHub and NUS TRAIL.