Thirty of the world’s brainiest mathematicians secretly convened in Berkeley, California, to devise elaborate maths problems and square off with ChatGPT’s o4-mini. Ahead of the meeting, they communicated only on Signal, because emailed problems might be discovered and scanned by the large language model. They were able to find 10 problems that stymied the bot, but the results still stunned participants. One open question in number theory, described as a “good PhD-level problem”, took the bot 10 minutes to solve: it first pulled up related literature, then tried solving a simpler version, before presenting the correct solution and cheekily writing, “no citation necessary because the mystery number was computed by me!” At the same time, mathematicians worry that the model’s overconfidence could lead people to trust it too much.