Find Math Paper
L3
Filesystem / Papers
Search through academic papers to identify and locate mathematics-related content that satisfies specific mathematical criteria and research requirements.
Created by Xiangyan Liu
2025-08-12
Pattern Analysis, Data Extraction
Task State
Task Initial State Files
papers/
├── 1707.06347.html
├── 2105.04165.html
├── 2201.11903.html
├── 2303.08774.html
├── 2306.08640.html
├── 2310.02255.html
├── 2310.08446.html
├── 2312.00849.html
├── 2312.07533.html
├── 2312.11805.html
├── 2402.00253.html
├── 2402.03300.html
├── 2403.05530.html
├── 2404.13046.html
├── 2404.14367.html
├── 2404.14396.html
├── 2405.09818.html
├── 2405.13911.html
├── 2405.16473.html
├── 2405.16640.html
├── 2406.08478.html
├── 2406.16852.html
├── 2406.17294.html
├── 2407.01284.html
├── 2407.01509.html
├── 2407.21783.html
├── 2408.03326.html
├── 2408.12528.html
├── 2409.19256.html
├── 2410.05993.html
├── 2410.06166.html
├── 2410.10563.html
├── 2410.13848.html
├── 2410.17885.html
├── 2410.21276.html
├── 2411.07975.html
├── 2411.10442.html
├── 2411.11930.html
├── 2411.14432.html
├── 2412.05271.html
├── 2412.08443.html
├── 2412.10302.html
├── 2412.15115.html
├── 2412.16720.html
├── 2412.17256.html
├── 2412.18319.html
├── 2412.20631.html
├── 2501.04686.html
├── 2501.06186.html
├── 2501.12599.html
├── 2501.12948.html
├── 2501.17811.html
├── 2502.01456.html
├── 2502.09621.html
├── 2502.10391.html
├── 2502.13923.html
├── 2503.01785.html
├── 2503.06520.html
├── 2503.06749.html
├── 2503.07065.html
├── 2503.07365.html
├── 2503.07536.html
├── 2503.10291.html
├── 2503.10615.html
├── 2503.12937.html
├── 2503.13939.html
├── 2503.14476.html
├── 2503.17352.html
├── 2503.18892.html
├── 2503.19786.html
├── 2503.20783.html
├── 2503.21620.html
├── 2503.21776.html
├── 2503.22679.html
├── 2504.02587.html
├── 2504.05599.html
├── 2504.07491.html
├── 2504.07934.html
├── 2504.07954.html
├── 2504.11455.html
├── 2504.14945.html
├── 2504.16656.html
├── 2505.00703.html
└── arxiv_2025.bib
Instruction
Please use FileSystem tools to finish the following task:
You are given a directory containing multiple paper files. Please help me find a math-related benchmark paper. I don’t remember its name, but I remember it not only checks whether the answer is correct, but also analyzes whether the model suffers from insufficient knowledge, lacks generalization ability, or relies on rote memorization. After finding this paper, rename its corresponding HTML file to answer.html.
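One way to approach the search is to rank the HTML files by how many phrases from the description they contain, then inspect the top candidates by hand before renaming anything. The sketch below is only an illustration under stated assumptions: the keyword list is a guess drawn from the task wording, the helper names (`strip_tags`, `score_paper`, `rank_papers`, `rename_to_answer`) are hypothetical, and a high score does not by itself identify the correct paper.

```python
import re
from pathlib import Path

# Assumed keywords, taken from the task wording; adjust after a first pass.
KEYWORDS = (
    "insufficient knowledge",
    "generalization",
    "rote memorization",
    "benchmark",
    "math",
)


def strip_tags(html: str) -> str:
    """Crudely remove HTML tags, collapse whitespace, and lowercase the text."""
    text = re.sub(r"<[^>]+>", " ", html)
    return re.sub(r"\s+", " ", text).lower()


def score_paper(text: str, keywords=KEYWORDS) -> int:
    """Count how many distinct keywords occur in the (lowercased) text."""
    return sum(1 for kw in keywords if kw in text)


def rank_papers(directory, keywords=KEYWORDS):
    """Return (score, filename) pairs, highest score first, ties by filename."""
    results = []
    for path in sorted(Path(directory).glob("*.html")):
        text = strip_tags(path.read_text(encoding="utf-8", errors="ignore"))
        results.append((score_paper(text, keywords), path.name))
    return sorted(results, key=lambda item: (-item[0], item[1]))


def rename_to_answer(directory, filename):
    """Rename the chosen file to answer.html inside the same directory."""
    src = Path(directory) / filename
    dst = src.with_name("answer.html")
    src.rename(dst)
    return dst
```

A typical use would be to call `rank_papers("papers")`, read the abstracts of the few highest-scoring files to confirm which one matches the description, and only then call `rename_to_answer("papers", chosen_name)`. Keeping the rename as a separate step avoids renaming the wrong file on the strength of a keyword match alone.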