Results for "AI reliability"

154 results found

Why Companies Are Quietly Bringing Back Workers After AI Replacements

After replacing staff with AI, many firms are now rehiring humans to fix errors and ensure safe, reliable operations. Human oversight is proving essential.

May 29, 2026 · 3 min read

AI / Machine Learning

AI Benchmark Prompt for GeoGuessr Fails After Model Update

A well-known prompt used to test AI geography skills no longer works on the O3 model, sparking debate about benchmark reliability and model drift.

May 21, 2026 · 2 min read

Big Tech

Google AI search now pulls expert advice from Reddit

Google's AI-powered search results will now include Reddit posts as expert sources. The change aims to improve answer quality but raises questions about content reliability.

May 29, 2026 · 2 min read

AI / Machine Learning

AI coding boom creates production chaos, Resolve AI launches multi-agent fix

Resolve AI expands its platform with multi-agent investigation to tackle production failures caused by rapid AI code generation. The system uses coordinated agents that verify each other's findings.

May 21, 2026 · 3 min read

AI / Machine Learning

Starbucks Drops Faulty AI Inventory System That Failed to Count

Starbucks scrapped an AI inventory tool after it repeatedly miscounted stock. The system’s failure highlights challenges in retail automation.

May 22, 2026 · 2 min read

AI / Machine Learning

Google's AI Still Struggles to Spell Its Own Name

Google's latest AI models continue to fail at basic spelling, even for the company's own name. The issue highlights deeper limitations in how large language models process text.

May 28, 2026 · 2 min read

AI / Machine Learning

Amazon Claims Major Advance in Data Center Speed for AI Workloads

Amazon says its new networking technology dramatically accelerates data flow in its cloud data centers, solving a key bottleneck for AI training and other intensive workloads.

May 28, 2026 · 3 min read

Big Tech

Orbital AI Data Centers Face Months-Long Outage Risks, Experts Warn

Hyperscalers eye space-based AI compute, but experts flag severe operational risks including months-long outages due to physical access limits and radiation.

May 30, 2026 · 3 min read

AI / Machine Learning

Google's Gemini Leaks Its Own System Prompt in User Chat

A user discovered that Google's Gemini AI revealed its internal system prompt during a conversation, raising questions about AI transparency and safety.

May 21, 2026 · 1 min read

AI / Machine Learning

Grok's Government Adoption Lags, Undermining xAI's Growth Story

Grok appears in only 3 of 400+ government AI use cases per Reuters. The low adoption undercuts xAI's growth story tied to a potential massive SpaceX IPO.

May 22, 2026 · 2 min read

AI / Machine Learning

Antigravity 2.0 Dominates First OpenSCAD 3D LLM Benchmark

Antigravity 2.0 tops the OpenSCAD Architectural 3D LLM Benchmark, demonstrating superior ability to generate valid 3D models from natural language prompts.

May 22, 2026 · 3 min read

Tech Policy & Regulation

Tech Lobbying Weakens Climate Rules for Data Centers

Tech companies lobbied to kill stricter clean energy rules for gas-powered data centers, weakening climate pledges.

May 25, 2026 · 2 min read

AI / Machine Learning

AI Coding Benchmarks Overlook Long-Term Code Health Risks

Current AI coding benchmarks measure one-shot performance but ignore quality erosion from repeated edits. This oversight could lead to unmaintainable codebases at scale.

May 21, 2026 · 3 min read

AI / Machine Learning

OpenClaw AI Agent Steps Into the Physical World With a Robot Body

An AI coding agent named OpenClaw has been given a physical robot body, demonstrating how AI models can simplify robot building and deployment.

May 20, 2026 · 2 min read

AI / Machine Learning

Google's Gemini 3.5 Flash Reshapes Enterprise AI Cost Equation

Google claims its new Gemini 3.5 Flash model can save enterprises over $1 billion annually by delivering near-frontier performance at triple the speed and half the cost.

May 20, 2026 · 2 min read

AI / Machine Learning

AI IQ site ignites debate by scoring large language models on the bell curve

A startup called AI IQ is assigning IQ scores to over 50 AI models. The project draws praise for clarity and criticism for oversimplifying machine intelligence.

May 20, 2026 · 2 min read

AI / Machine Learning

Anthropic Surpasses OpenAI in Corporate AI Adoption for First Time

Anthropic's Claude overtakes OpenAI's ChatGPT in business AI adoption. But escalating costs and competition threaten its lead.

May 20, 2026 · 2 min read

AI / Machine Learning

Why Autonomous AI Fails Without a Body-Like Feedback System

AI systems that rely on pure autonomy often fail. A new framework compares AI to the human body, arguing that feedback loops build trust.

May 19, 2026 · 2 min read

AI / Machine Learning

Salesforce Turns Slackbot Into a Full AI Agent for the Enterprise

Salesforce rebuilt Slackbot from a simple notification tool into an AI agent that searches data, drafts documents and takes actions, intensifying workplace AI competition.

May 19, 2026 · 2 min read

Big Tech

AI demand forces a fundamental shift in enterprise data center strategy

Rising AI workloads are pushing companies to rethink infrastructure, moving from general-purpose servers to specialized GPU clusters and liquid-cooled data centers.

May 21, 2026 · 3 min read