Results for "AI proof verification"
192 results found

A Mathematician Verified an OpenAI Proof. Here's What He Found
Mathematician Will Sawin reviewed a proof from an OpenAI model that claimed to disprove a famous conjecture. His findings reveal both the promise and limits of AI in mathematics.

A New Open Source Dataset Aims to Solve AI's Math Reasoning Gap
Researchers at MIT and Columbia University released ATLAS, a dataset of 320,000 autoformalized mathematical statements for training AI reasoning systems.

Why Some Experts Compare AI Chatbots to Religious Belief Systems
A growing number of researchers argue people treat large language models with faith-like trust, raising concerns about blind reliance on AI.

Tampering Threats Emerge for Encrypted AI Reasoning Systems
Privacy-preserving AI models that process encrypted data may be vulnerable to undetectable manipulation, researchers warn. The finding challenges assumptions about security in confidential computing.

OpenClaw AI Agent Steps Into the Physical World With a Robot Body
An AI coding agent named OpenClaw has been given a physical robot body, demonstrating how AI models can simplify robot building and deployment.

Anthropic's New 'Dreaming' System Lets AI Agents Learn From Their Own Mistakes
Anthropic unveils 'dreaming,' a self-improvement system for AI agents, plus new tools for outcomes and multi-agent orchestration. Early adopters report dramatic gains in task completion.

Salesforce Faces Growing Questions Over AI Product Readiness
Salesforce's aggressive marketing of its Agentforce AI platform is drawing skepticism as customers question whether the technology is ready for real-world use.

Apple to Pay iPhone Owners $250 Million Over Missing AI Features
Apple will pay $250 million to settle a class-action lawsuit over delayed AI features. Eligible iPhone owners can claim part of the settlement. The payout addresses claims Apple misled users about Siri and other AI capabilities.

AI Benchmark Prompt for GeoGuessr Fails After Model Update
A well-known prompt used to test AI geography skills no longer works on the o3 model, sparking debate about benchmark reliability and model drift.

Google's Gemini 3.5 Flash Reshapes Enterprise AI Cost Equation
Google claims its new Gemini 3.5 Flash model can save enterprises over $1 billion annually by delivering near-frontier performance at triple the speed and half the cost.

AI IQ site ignites debate by scoring large language models on the bell curve
A startup called AI IQ is assigning IQ scores to over 50 AI models. The project draws praise for clarity and criticism for oversimplifying machine intelligence.

Anthropic Surpasses OpenAI in Corporate AI Adoption for First Time
Anthropic's Claude overtakes OpenAI's ChatGPT in business AI adoption. But escalating costs and competition threaten its lead.

Why Autonomous AI Fails Without a Body-Like Feedback System
AI systems that rely on pure autonomy often fail. A new framework compares AI to the human body, arguing that feedback loops build trust.

Salesforce Turns Slackbot Into a Full AI Agent for the Enterprise
Salesforce rebuilt Slackbot from a simple notification tool into an AI agent that searches data, drafts documents and takes actions, intensifying workplace AI competition.

AI demand forces a fundamental shift in enterprise data center strategy
Rising AI workloads are pushing companies to rethink infrastructure, moving from general-purpose servers to specialized GPU clusters and liquid-cooled data centers.

AI Coding Benchmarks Overlook Long-Term Code Health Risks
Current AI coding benchmarks measure one-shot performance but ignore quality erosion from repeated edits. This oversight could lead to unmaintainable codebases at scale.

AI Outpaces Human Patching, Making Vulnerability Windows Obsolete
AI-powered bug detection finds vulnerabilities faster than humans can patch them. The industry is shifting from reactive patching to building resilient software from the start.

Anthropic Nears First Profit as AI Race Intensifies
Anthropic is set to report its first profitable quarter since its founding in 2021, marking a milestone in the competitive AI landscape.

AI-Generated Content Floods Social Media Platforms
AI-generated content is flooding social media platforms, challenging moderation systems and raising concerns about online authenticity.

iOS 27 Siri update brings agentic AI capabilities through accessibility features
Apple's iOS 27 introduces advanced AI voice controls that make Siri more intuitive and proactive, hinting at future agentic AI powers.