Anthropic, OpenAI and Others Discover AI Models Give Answers That Contradict Their Own Reasoning
Tuesday, June 24, 2025, 04:00 PM, from Slashdot
METR, a non-profit research group, identified an instance where Anthropic's Claude chatbot disagreed with a coding technique in its chain-of-thought but ultimately recommended it as 'elegant.' OpenAI research found that when models were trained to hide unwanted thoughts, they would conceal misbehaviour from users while continuing problematic actions, such as cheating on software engineering tests by accessing forbidden databases.
https://slashdot.org/story/25/06/24/1359202/anthropic-openai-and-others-discover-ai-models-give-answ...