Is OpenAI's Video-Generating Tool 'Sora' Scraping Unauthorized YouTube Clips?

Saturday September 20, 2025. 04:34 PM , from Slashdot
'OpenAI's video generation tool, Sora, can create high-definition clips of just about anything you could ask for...' reports the Washington Post.
'But OpenAI has not specified which videos it grabbed to make Sora, saying only that it combined "publicly available and licensed data"...'

With ChatGPT, OpenAI helped popularize the now-standard industry practice of building more capable AI tools by scraping vast quantities of text from the web without consent. With Sora, launched in December, OpenAI staff said they built a pioneering video generator by taking a similar approach. They developed ways to feed the system more online video — in more varied formats — including vertical videos and longer, higher-resolution clips... To explore what content OpenAI may have used, The Washington Post used Sora to create hundreds of videos that show it can closely mimic movies, TV shows and other content...

In dozens of tests, The Post found that Sora can create clips that closely resemble Netflix shows such as 'Wednesday'; popular video games like 'Minecraft'; and beloved cartoon characters, as well as the animated logos for Warner Bros., DreamWorks and other Hollywood studios, movies and TV shows. The publicly available version of Sora can generate only 20-second clips, without audio. In most cases, the look-alike scenes were made by typing basic requests like 'universal studios intro.' The results also showed that Sora can create AI videos with the logos or watermarks that broadcasters and tech companies use to brand their video content, including those for the National Basketball Association, Chinese-owned social app TikTok and Amazon-owned streaming platform Twitch...

Sora's ability to re-create specific imagery and brands suggests a version of the originals appeared in the tool's training data, AI researchers said. 'The model is mimicking the training data. There's no magic,' said Joanna Materzynska, a PhD researcher at Massachusetts Institute of Technology who has studied datasets used in AI. An AI tool's ability to reproduce proprietary content doesn't necessarily indicate that the original material was copied or obtained from its creators or owners. Content of all kinds is uploaded to video and social platforms, often without the consent of the copyright holder... Materzynska co-authored a study last year that found more than 70 percent of public video datasets commonly used in AI research contained content scraped from YouTube.
Netflix and Twitch said they have no content partnership with OpenAI for training, according to the article, which adds that OpenAI 'has yet to face a copyright suit over the data used for Sora.'
Two key quotes from the article:

'Unauthorized scraping of YouTube content continues to be a violation of our Terms of Service.' — YouTube spokesperson Jack Malon
'We train on publicly available data consistent with fair use and use industry-leading safeguards to avoid replicating the material they learn from.' — OpenAI spokesperson Kayla Wood

Read more of this story at Slashdot.
https://news.slashdot.org/story/25/09/20/0120220/is-openais-video-generating-tool-sora-scraping-unau...