Mark Erdmann's Highlights on Greg Kamradt's post on X (via Glasp):

"How do SOTA LLMs do on ARC Prize?

We wanted to see how gpt-4o, claude sonnet, and gemini did on public tasks

So we made a baseline template with @LangChainAI that tests them all

Scores:
* Claude Sonnet: 21%
* gpt-4o: 9%
* gemini 1.5: 8%

https://t.co/6wXW8E3vOE"