Results for "AI proof verification"

192 results found

A Mathematician Verified an OpenAI Proof. Here's What He Found

Mathematician Will Sawin reviewed a proof from an OpenAI model that claimed to disprove a famous conjecture. His findings reveal both the promise and limits of AI in mathematics.

Jun 1, 2026 · 3 min read

AI / Machine Learning

A New Open Source Dataset Aims to Solve AI's Math Reasoning Gap

Researchers at MIT and Columbia University released ATLAS, a dataset of 320,000 autoformalized mathematical statements for training AI reasoning systems.

May 29, 2026 · 3 min read

AI / Machine Learning

Why Some Experts Compare AI Chatbots to Religious Belief Systems

A growing number of researchers argue people treat large language models with faith-like trust, raising concerns about blind reliance on AI.

Jun 1, 2026 · 3 min read

CyberSecurity

Tampering Threats Emerge for Encrypted AI Reasoning Systems

Privacy-preserving AI models that process encrypted data may be vulnerable to undetectable manipulation, researchers warn. The finding challenges assumptions about security in confidential computing.

Jun 2, 2026 · 2 min read

AI / Machine Learning

OpenClaw AI Agent Steps Into the Physical World With a Robot Body

An AI coding agent named OpenClaw has been given a physical robot body, demonstrating how AI models can simplify robot building and deployment.

May 20, 2026 · 2 min read

AI / Machine Learning

Anthropic's New 'Dreaming' System Lets AI Agents Learn From Their Own Mistakes

Anthropic unveils 'dreaming,' a self-improvement system for AI agents, plus new tools for outcomes and multi-agent orchestration. Early adopters report dramatic gains in task completion.

May 20, 2026 · 3 min read

Big Tech

Salesforce Faces Growing Questions Over AI Product Readiness

Salesforce's aggressive marketing of its Agentforce AI platform is drawing skepticism as customers question whether the technology is ready for real-world use.

May 25, 2026 · 3 min read

Big Tech

Apple to Pay iPhone Owners $250 Million Over Missing AI Features

Apple will pay $250 million to settle a class-action lawsuit over delayed AI features. Eligible iPhone owners can claim part of the settlement. The payout addresses claims Apple misled users about Siri and other AI capabilities.

May 22, 2026 · 3 min read

AI / Machine Learning

AI Benchmark Prompt for GeoGuessr Fails After Model Update

A well-known prompt used to test AI geography skills no longer works on the o3 model, prompting debate about benchmark reliability and model drift.

May 21, 2026 · 2 min read

AI / Machine Learning

Google's Gemini 3.5 Flash Reshapes Enterprise AI Cost Equation

Google claims its new Gemini 3.5 Flash model can save enterprises over $1 billion annually by delivering near-frontier performance at triple the speed and half the cost.

May 20, 2026 · 2 min read

AI / Machine Learning

AI IQ site ignites debate by scoring large language models on the bell curve

A startup called AI IQ is assigning IQ scores to over 50 AI models. The project draws praise for clarity and criticism for oversimplifying machine intelligence.

May 20, 2026 · 2 min read

AI / Machine Learning

Anthropic Surpasses OpenAI in Corporate AI Adoption for First Time

Anthropic's Claude overtakes OpenAI's ChatGPT in business AI adoption. But escalating costs and competition threaten its lead.

May 20, 2026 · 2 min read

AI / Machine Learning

Why Autonomous AI Fails Without a Body-Like Feedback System

AI systems that rely on pure autonomy often fail. A new framework compares AI to the human body, arguing that feedback loops build trust.

May 19, 2026 · 2 min read

AI / Machine Learning

Salesforce Turns Slackbot Into a Full AI Agent for the Enterprise

Salesforce rebuilt Slackbot from a simple notification tool into an AI agent that searches data, drafts documents and takes actions, intensifying workplace AI competition.

May 19, 2026 · 2 min read

Big Tech

AI demand forces a fundamental shift in enterprise data center strategy

Rising AI workloads are pushing companies to rethink infrastructure, moving from general-purpose servers to specialized GPU clusters and liquid-cooled data centers.

May 21, 2026 · 3 min read

AI / Machine Learning

AI Coding Benchmarks Overlook Long-Term Code Health Risks

Current AI coding benchmarks measure one-shot performance but ignore quality erosion from repeated edits. This oversight could lead to unmaintainable codebases at scale.

May 21, 2026 · 3 min read

AI / Machine Learning

AI Outpaces Human Patching, Making Vulnerability Windows Obsolete

AI-powered bug detection finds vulnerabilities faster than humans can patch. The industry shifts from reactive patching to building resilient software from the start.

May 21, 2026 · 3 min read

AI / Machine Learning

Anthropic Nears First Profit as AI Race Intensifies

Anthropic is set to report its first profitable quarter since its founding in 2021, marking a milestone in the competitive AI landscape.

May 21, 2026 · 2 min read

Tech Policy & Regulation

AI-Generated Content Floods Social Media Platforms

AI-generated content is flooding social media platforms, challenging moderation systems and raising concerns about online authenticity.

May 21, 2026 · 3 min read

AI / Machine Learning

iOS 27 Siri update brings agentic AI capabilities through accessibility features

Apple's iOS 27 introduces advanced AI voice controls that make Siri more intuitive and proactive, hinting at future agentic AI powers.

May 21, 2026 · 3 min read