Researchers from MIT’s Computer Science and Artificial Intelligence Laboratory, King Abdullah University of Science and Technology, and HUMAIN have built MathNet, the largest curated dataset of Olympiad-level math problems ever assembled. Spanning 47 countries, 17 languages, and 143 competitions over four decades, it contains more than 30,000 high-quality problems and solutions—five times the size of the next-largest dataset. The project will be presented at the International Conference on Learning Representations in Brazil this month.
From fragmented archives to a unified resource
For years, International Mathematical Olympiad (IMO) delegations shared problem booklets that were rarely preserved or digitized. Most existing datasets rely on informal solutions scraped from community forums, which often lack depth and standardized formatting. MathNet changes that by consolidating official national competition booklets, each containing expert-written, peer-reviewed solutions that sometimes span multiple pages. This approach provides both AI researchers and students with a centralized, searchable repository of rigorous mathematical problems from diverse traditions.
The team spent years tracking down 1,595 PDF volumes totaling over 25,000 pages, many obtained from physical scans and personal archives. A key contributor was Navid Safaei, a longtime IMO community figure who had manually collected and digitized booklets since 2006. His personal archive became a critical foundation for the dataset.
A benchmark for AI—and a training ground for students
MathNet serves a dual purpose: it is a benchmark for evaluating AI models on mathematical reasoning and a practical tool for students preparing for competitions. While recent AI milestones suggest models can solve Olympiad problems at high levels, MathNet reveals significant gaps. When tested on 6,400 problems from the dataset, even the top-performing model, GPT-5, achieved only 69.3 percent accuracy—failing nearly one in three problems. Performance dropped even further on problems that included figures, highlighting visual reasoning as a persistent weakness.
The diversity of MathNet also exposes linguistic limitations in current AI systems. Open-source models scored 0 percent on Mongolian-language problems, despite performing well in English and Chinese. This underscores how training data skewed toward major languages can create blind spots in model capabilities.
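Gaps like the Mongolian-language result are easy to surface once each evaluation record is tagged with its language. A minimal sketch of that aggregation, using a hypothetical record format (the article does not describe MathNet's actual evaluation schema):

```python
from collections import defaultdict

# Hypothetical evaluation records: (language, model_answered_correctly).
# Illustrative only; not MathNet's real data or schema.
results = [
    ("English", True), ("English", True), ("English", False),
    ("Chinese", True), ("Chinese", False),
    ("Mongolian", False), ("Mongolian", False),
]

def accuracy_by_language(records):
    """Map each language to the fraction of its problems answered correctly."""
    tally = defaultdict(lambda: [0, 0])  # language -> [correct, total]
    for lang, correct in records:
        tally[lang][0] += int(correct)
        tally[lang][1] += 1
    return {lang: c / t for lang, (c, t) in tally.items()}

print(accuracy_by_language(results))
```

With real evaluation logs in place of the toy records, the same three lines of tallying expose exactly the kind of blind spot described above, such as a 0 percent score on one language alongside strong scores on others.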
For students, MathNet offers a lifeline. Many competitors, especially in regions without structured training programs, previously relied on fragmented and inconsistent resources. Shaden Alshammari, the lead author and a former IMO participant, recalls the isolation many students faced. “I remember students who had to prepare entirely on their own because no one in their country was guiding them,” she says. “We hope this dataset gives them a centralized place with high-quality problems and solutions to learn from.”
Standardized rigor and global mathematical traditions
Unlike community-sourced datasets, MathNet draws exclusively from official competition booklets, ensuring that every solution is expert-vetted and follows a structured approach. The dataset includes both text- and image-based problems across a wide range of mathematical disciplines, such as combinatorics, number theory, and geometry, and organizes them by topic and competition, enabling targeted training and research.
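Topic and competition metadata make that kind of targeted selection straightforward. A sketch under assumed field names (`topic`, `competition`, `has_figure`), which may differ from the released dataset's actual schema:

```python
# Hypothetical problem records; the field names are assumptions for
# illustration, not MathNet's published format.
problems = [
    {"id": 1, "topic": "geometry", "competition": "IMO", "has_figure": True},
    {"id": 2, "topic": "number theory", "competition": "IMO", "has_figure": False},
    {"id": 3, "topic": "combinatorics", "competition": "Balkan MO", "has_figure": False},
]

def select(problems, topic=None, competition=None):
    """Return problems matching the given topic and/or competition filters."""
    return [
        p for p in problems
        if (topic is None or p["topic"] == topic)
        and (competition is None or p["competition"] == competition)
    ]

print([p["id"] for p in select(problems, competition="IMO")])  # → [1, 2]
```

A student could use the same filter to assemble, say, a geometry-only practice set, while a researcher could hold out image-based problems to probe the visual-reasoning weakness the evaluation revealed.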
To validate the dataset, the team recruited more than 30 human evaluators from countries including Armenia, Russia, Ukraine, Vietnam, and Poland, who together verified thousands of solutions for consistency and accuracy. Sultan Albarakati, a co-author and current IMO board member, is working to share the dataset directly with the IMO Foundation.
Tanish Patil, deputy leader of Switzerland’s IMO team, sees MathNet as a transformative resource. “It provides standardized formatting, verified solutions, and essential metadata that other archives lack,” he says. “It will be fascinating to see how this dataset is used to improve reasoning models and whether it can help address the challenge of determining whether a problem is truly original.”
Looking ahead: a more inclusive future for math AI
MathNet’s release marks a turning point in how AI systems learn and apply mathematical reasoning. By incorporating problems from underrepresented languages and regions, it challenges models to move beyond narrow linguistic and cultural biases. As researchers continue to refine these systems, datasets like MathNet will be instrumental in bridging gaps—not just in performance, but in accessibility and global participation.
The team plans to expand the dataset further, adding more problems, languages, and competitions. Their goal is to create a living archive that reflects the full spectrum of mathematical thought worldwide, ensuring that both AI and human learners have the tools they need to succeed.