Luddy School of Informatics, Computing, and Engineering’s Zoran Tiganj, assistant professor of Computer Science, and Justin Wood, associate professor of Informatics, along with Indiana University Distinguished Professor Linda Smith, co-authored a paper detailing first-of-its-kind research linking infant development and the training of machine learning models.
The paper, “Curriculum Learning With Infant Egocentric Videos,” addressed a key question: how do you compare the learning abilities of human infants and machines? It was presented at the prestigious Neural Information Processing Systems (NeurIPS) conference in New Orleans.
Other co-authors were Saber Sheybani, an IU postdoctoral fellow, and IU graduate student Himanshu Hansaria.
Tiganj said researchers used egocentric vision data recorded by Smith's group in the Department of Psychological and Brain Sciences to train machine learning systems.
“We found that feeding data in the developmental order had a profound impact on learning in the artificial system,” Tiganj said. “This is consistent with the idea that the gradual development of mobility in human infants reflects evolutionary adaptation so that infants receive visual data optimal for rapid development. This also informs learning in machines and tells us that training artificial systems with data of gradually increasing temporal complexity leads to improved performance.”
The paper, which was accepted as a NeurIPS “spotlight,” detailed research showing that generic learning algorithms, with no hard-coded knowledge about objects or space, can learn to solve complex real-world vision tasks when the models are trained through the eyes of infants.
The order of the training data matters: machines learn best when they receive training data in the same order as infants do, and slow visuals are important for kick-starting visual learning. This contradicts previous research suggesting that the order of AI training data doesn’t matter.
Researchers introduced the first computer vision system that had been trained in a “developmentally correct” order. That’s an important step in reverse engineering the learning algorithms in human brains.
The paper’s abstract said that infants possess a remarkable ability to rapidly learn and process visual inputs. As an infant's mobility increases, so do the variety and dynamics of their visual inputs.
The co-authors asked if this change in the properties of the visual inputs is beneficial or even critical for the proper development of the visual system. They used video recordings from infants wearing head-mounted cameras to train a variety of self-supervised learning models. They separated the infant data by age group and evaluated the importance of training with a curriculum aligned with developmental order.
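As a rough illustration of the curriculum idea described above, the sketch below orders synthetic "frame" data by infant age group and trains a simple encoder with a slowness-style self-supervised objective, starting with the youngest group. The age labels, encoder architecture, loss, and data are illustrative assumptions for this sketch, not the models or recordings used in the paper.

```python
# Minimal sketch of age-ordered curriculum training with a slowness-style
# self-supervised objective. The dataset is synthetic; shapes, labels, and
# the loss are assumptions for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

def make_fake_frame_pairs(n_pairs: int, dim: int = 64) -> torch.Tensor:
    """Stand-in for pairs of consecutive egocentric video frames."""
    base = torch.randn(n_pairs, dim)
    # The second frame is a slightly perturbed copy of the first
    # (mimicking temporal continuity in slow, simple early visual input).
    return torch.stack([base, base + 0.05 * torch.randn(n_pairs, dim)], dim=1)

# Hypothetical age groups, ordered from youngest to oldest.
age_groups = [
    ("youngest", make_fake_frame_pairs(256)),
    ("middle", make_fake_frame_pairs(256)),
    ("oldest", make_fake_frame_pairs(256)),
]

encoder = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 32))
optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-3)

# Curriculum: train on the youngest group first, then progressively older ones.
for age_label, pairs in age_groups:
    for epoch in range(5):
        emb_a = encoder(pairs[:, 0])
        emb_b = encoder(pairs[:, 1])
        # Slowness objective: embeddings of consecutive frames should agree.
        # (Real self-supervised methods add mechanisms to prevent collapse.)
        loss = 1 - F.cosine_similarity(emb_a, emb_b).mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    print(f"finished stage: {age_label}, loss = {loss.item():.4f}")
```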
The co-authors found that initiating learning with the data from the youngest age group provided the strongest learning signal and led to the best learning outcomes in terms of downstream task performance. They showed that the benefits of the data from the youngest age group are due to the slowness and simplicity of the visual experience.
This provides strong evidence of the importance of the properties of early infant experience and of developmental progression in training, and it could help reverse engineer the learning mechanisms in newborn brains using image-computable models from artificial intelligence.
Smith gave one of the keynote talks at the conference.
NeurIPS is an interdisciplinary conference that brings together researchers in machine learning, neuroscience, statistics, optimization, computer vision, natural language processing, life sciences, natural sciences, social sciences and other adjacent fields.