Meta Researchers Build an AI That Learns Equally Well From Visual, Written or Spoken Materials

Saturday, January 22, 2022, 01:02 AM, from Slashdot/Apple

An anonymous reader quotes a report from TechCrunch: Meta (AKA Facebook) researchers are working on an AI that can learn capably on its own, whether from spoken, written or visual materials. The traditional way of training an AI model to correctly interpret something is to give it lots and lots (like millions) of labeled examples: a picture of a cat with the cat part labeled, a conversation with the speakers and words transcribed, etc. But that approach has fallen out of vogue as researchers found that it was no longer feasible to manually create databases of the sizes needed to train next-gen AIs. Who wants to label 50 million cat pictures? Okay, a few people probably -- but who wants to label 50 million pictures of common fruits and vegetables?

Currently some of the most promising AI systems are what are called self-supervised: models that can work from large quantities of unlabeled data, like books or video of people interacting, and build their own structured understanding of the rules of the system. For instance, by reading a thousand books a model will learn the relative positions of words and ideas about grammatical structure without anyone telling it what objects or articles or commas are -- it gets there by drawing inferences from lots of examples. This feels intuitively more like how people learn, which is part of why researchers like it. But the models still tend to be single-modal, and all the work you do to set up a self-supervised learning system for speech recognition won't apply at all to image analysis -- they're simply too different. That's where Facebook/Meta's latest research, the catchily named data2vec, comes in.
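
To make the self-supervised idea concrete, here is a minimal Python sketch. It is not data2vec and not anything from Meta's code; it assumes a toy corpus and a deliberately crude "model" (a table of counts), and the function names make_masked_examples, train and predict are hypothetical. The point is only that the training targets come from hiding part of the raw text, so nobody has to label anything.

```python
from collections import Counter, defaultdict


def make_masked_examples(tokens):
    """Turn unlabeled tokens into ((left, right), hidden_word) pairs.

    The "label" is simply the word we hid, so no human annotation is needed.
    """
    examples = []
    for i in range(1, len(tokens) - 1):
        context = (tokens[i - 1], tokens[i + 1])
        examples.append((context, tokens[i]))
    return examples


def train(examples):
    """Count which hidden word appeared in each (left, right) context."""
    model = defaultdict(Counter)
    for context, target in examples:
        model[context][target] += 1
    return model


def predict(model, left, right):
    """Return the most frequent word seen between left and right, or None."""
    candidates = model.get((left, right))
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


# Toy unlabeled "book" (an assumption for illustration only).
corpus = "the cat sat on the mat . the dog sat on the rug .".split()
model = train(make_masked_examples(corpus))

print(predict(model, "cat", "on"))   # fills the blank in "cat ___ on"
print(predict(model, "sat", "the"))  # fills the blank in "sat ___ the"
```

Real systems replace the count table with a neural network and work on far larger corpora, but the recipe of hiding part of the input and predicting it from the rest is the same basic trick.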

The idea for data2vec was to build an AI framework that would learn in a more abstract way, meaning that starting from scratch, you could give it books to read or images to scan or speech to sound out, and after a bit of training it would learn any of those things. It's a bit like starting with a single seed, but depending on what plant food you give it, it grows into a daffodil, pansy or tulip. Testing data2vec after letting it train on various data corpora showed that it was competitive with and even outperformed similarly sized dedicated models for each modality. (That is to say, if the models are all limited to being 100 megabytes, data2vec did better -- specialized models would probably still outperform it as they grow.)
