Search-capable AI agents may cheat on benchmark tests

Saturday, August 23, 2025, 04:32 PM, from TheRegister

Data contamination can make models seem more capable than they really are
Researchers with Scale AI have found that search-based AI models may cheat on benchmark tests by fetching the answers directly from online sources rather than deriving those answers through a 'reasoning' process.…

Read more at TheRegister

https://go.theregister.com/feed/www.theregister.com/2025/08/23/searchcapable_ai_agents_may_cheat/

News copyright owned by their original publishers | Copyright © 2004 - 2025 Zicos / 440Network
