
Study Accuses LM Arena of Helping Top AI Labs Game Its Benchmark

Thursday, May 1, 2025, 03:00 PM, from Slashdot
An anonymous reader shares a report: A new paper from AI lab Cohere, Stanford, MIT, and Ai2 accuses LM Arena, the organization behind the popular crowdsourced AI benchmark Chatbot Arena, of helping a select group of AI companies achieve better leaderboard scores at the expense of rivals.

According to the authors, LM Arena allowed a few industry-leading AI companies, including Meta, OpenAI, Google, and Amazon, to privately test several variants of their AI models and then withhold the scores of the lowest performers. This made it easier for those companies to claim a top spot on the platform's leaderboard, while the same opportunity was not afforded to every firm, the authors say.
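The statistical effect behind this kind of gaming can be illustrated with a small simulation. This is not LM Arena's actual scoring system; it is a hypothetical sketch in which benchmark scores are modeled as a lab's true skill plus random noise, and a lab that privately tests many variants reports only its best result:

```python
import random

def reported_score(true_skill: float, n_private_variants: int,
                   noise: float = 5.0, seed: int = 0) -> float:
    """Simulate private testing: score n variants of the same model
    (true skill plus random variation) and report only the best one."""
    rng = random.Random(seed)
    scores = [rng.gauss(true_skill, noise) for _ in range(n_private_variants)]
    return max(scores)

# Two labs with identical underlying skill (an Elo-like rating of 1200):
# one submits a single model, the other privately tests 20 variants
# and publishes only its best score.
single = reported_score(1200, n_private_variants=1, seed=42)
best_of_20 = reported_score(1200, n_private_variants=20, seed=42)
print(f"single submission:        {single:.1f}")
print(f"best of 20 private tests: {best_of_20:.1f}")
```

Because the maximum of many noisy draws is systematically higher than a single draw, the lab allowed more private tests looks stronger on the leaderboard even when the underlying models are no better.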

'Only a handful of [companies] were told that this private testing was available, and the amount of private testing that some [companies] received is just so much more than others,' said Cohere's VP of AI research and co-author of the study, Sara Hooker, in an interview with TechCrunch. 'This is gamification.' Further reading: Meta Got Caught Gaming AI Benchmarks.

Read more of this story at Slashdot.
https://slashdot.org/story/25/05/01/0525208/study-accuses-lm-arena-of-helping-top-ai-labs-game-its-b...

News copyright owned by their original publishers | Copyright © 2004 - 2025 Zicos / 440Network