Results for "GPT-4"

7 results found

A New Open Source Dataset Aims to Solve AI's Math Reasoning Gap

Researchers at MIT and Columbia University released ATLAS, a dataset of 320,000 autoformalized mathematical statements for training AI reasoning systems.

May 29, 2026 · 3 min read

AI / Machine Learning

Study Finds Politeness in AI Prompts Can Impact Model Accuracy

Research reveals that prompt tone significantly influences LLM accuracy. Polite prompts may boost performance while impolite ones degrade it.

May 27, 2026 · 2 min read

AI / Machine Learning

AI Benchmark Prompt for GeoGuessr Fails After Model Update

A well-known prompt used to test AI geography skills no longer works on the o3 model, sparking debate about benchmark reliability and model drift.

May 21, 2026 · 2 min read

AI / Machine Learning

Antigravity 2.0 Dominates First OpenSCAD 3D LLM Benchmark

Antigravity 2.0 tops the OpenSCAD Architectural 3D LLM Benchmark, demonstrating superior ability to generate valid 3D models from natural language prompts.

May 22, 2026 · 3 min read

AI / Machine Learning

AI Bots Fool Nearly Half of Participants in New Online Test

Surfshark's experiment reveals 47% of people can't tell AI bots from humans online. The test challenges users to identify bots in simulated social interactions.

May 25, 2026 · 2 min read

AI / Machine Learning

AI IQ site ignites debate by scoring large language models on the bell curve

A startup called AI IQ is assigning IQ scores to over 50 AI models. The project draws praise for clarity and criticism for oversimplifying machine intelligence.

May 20, 2026 · 2 min read

AI / Machine Learning

SpaceX Acquires xAI, Declares AI Its Core Business Ahead of IPO

SpaceX's IPO filing reveals AI as its primary market, projecting a $26.5 trillion opportunity. The company positioned Grok against OpenAI and Anthropic.

May 21, 2026 · 2 min read