New artificial intelligence research from Apple suggests that AI reasoning models may not be "thinking" so well after all.

According to a paper published just days before Apple's WWDC event, large reasoning models (LRMs) — such as OpenAI's o1 and o3, DeepSeek R1, Claude 3.7 Sonnet Thinking, and Google Gemini Flash Thinking — collapse completely when faced with increasingly complex problems. The paper comes from the same researchers who identified other reasoning flaws in LLMs last year.

https://mashable.com/article/apple-research-ai-reasoning-models-collapse-logic-puzzles

https://ml-site.cdn-apple.com/papers/the-illusion-of-thinking.pdf