Paul Coen felt energized. The second-year Computer Science Ph.D. student was presenting his work at the Midwest Computer Vision Workshop at the Luddy Hall atrium, standing among nearly 100 other posters showcasing research that may lead to AI-powered breakthroughs in medical imaging, manufacturing, education, self-driving vehicles and beyond.
“Vision has fascinated me since I was a child because of how complex it can be,” Coen said. “Computer vision is about mimicking human sight, but it’s also more than that. It’s how can we build sensors that are better than human vision, that can see spectra that we can’t?”
The Midwest Computer Vision Workshop, held September 16-17 at the Luddy School of Informatics, Computing, and Engineering, marked the revival of a workshop series that used to rotate annually among Midwest universities before the pandemic. Nearly 200 faculty and students from 15 Midwest universities attended, more than twice as many as attended the previous Midwest workshop, held in 2018.
“I’m very excited to restart the workshop after many years of hiatus, and to host it here at IU,” said David Crandall, Luddy Professor of Computer Science and director of the Luddy Artificial Intelligence Center and the IU Computer Vision Lab, who organized the workshop. “The turnout and enthusiasm have been amazing.”
Coen’s poster presented “Fast TAM,” a real-time algorithm for tracking how objects move in videos, which was developed with recently graduated master’s student Michael Siler. Other posters included “Common Sense Reasoning for Deepfake Detection,” “Facial Affective Behavior Analysis with Instruction Tuning,” “Full-Body Cardiovascular Sensing with Remote Photoplethysmography,” and “Iris Recognition for Infants.”
The poster session followed the previous day’s talk sessions at Franklin Hall that featured leading scientists on artificial intelligence and computer vision, including the Luddy School’s Zoran Tiganj, assistant professor of Computer Science, and Samantha Wood, assistant professor of Informatics.
Crandall said the workshop had two main goals: to help build a community for computer vision in the Midwest, and to give students and faculty an opportunity to present their work in a setting less overwhelming and expensive than the large international conferences.
“Many people don’t think of the Midwest as a hotspot for AI or computer vision research,” he said, “but it really is. Midwest universities do some of the best and most interesting computer vision research in the world. We want to highlight and build on that.”
Hosting the workshop at IU provided a benefit for Luddy students and faculty, another of Crandall’s priorities.
“I wanted to host the workshop here both so our students could learn more about computer vision from all these amazing visitors, and so that the visitors could learn more about IU and Bloomington,” he said. “That’s why we had the poster session in the Luddy Hall atrium, so that it would be easy for our students to stop by on their way to class.”
To encourage visitors to explore campus and to think beyond technology, the workshop organized informal tours of the Musical Arts Center in the Jacobs School of Music, the new “Blurring the Lines” exhibit on AI-artist collaboration in the Eskenazi School of Art, Architecture and Design, and the Calabi-Yau Sculpture outside Luddy Hall, the last led by the sculpture’s designer, Emeritus Professor of Computer Science Andrew Hanson.
The workshop was supported in part by the IU Institute for Advanced Study and the Luddy Artificial Intelligence Center.
During Monday’s talk sessions at Franklin Hall, 19 presentations were divided into four sessions: Recognition and Representations, Multimodal Computer Vision, Emerging Applications and Directions, and Generative Models.
Besides Tiganj and Wood, speakers came from the University of Notre Dame, the University of Michigan, the University of Illinois Urbana-Champaign, The Ohio State University, Michigan State University, the University of Chicago, the University of Illinois Chicago, Purdue University, the University of Wisconsin-Madison, Wayne State University and Toyota Technological Institute at Chicago.
Tiganj said his talk, titled “Cognitive science-inspired approaches for training and evaluating computer vision models,” centered on research using data from cognitive science to train and evaluate computer vision models.
“The data included infant egocentric videos, image recognition data from two-year-old toddlers and perceptual judgment data from adult humans,” he said. “Overall, this line of research helps us understand the importance of the properties of the data for efficient learning and generalization. It also provides benchmarks to evaluate the advancement of AI models and their alignment with human perception.”
Tiganj said the research was an interdisciplinary collaboration with Justin Wood, Luddy associate professor of Informatics; Linda Smith and Robert Nosofsky, distinguished professors in the Department of Psychological and Brain Sciences; Ph.D. graduate and current postdoc Saber Sheybani; and current Ph.D. students Billy Dickson and Sahaj Maini Singh.
Samantha Wood’s talk was titled “Reverse Engineering the Origins of Vision.” She said it dealt with one of the great unsolved mysteries of science: the origins of knowledge. The talk addressed the ongoing debate between nativism (knowledge is present at birth) and empiricism (knowledge comes from sensory experiences and environment interactions), asking what mechanisms exist in the brain at birth and how the brain learns from its experiences.
“My research (through the Building a Mind Lab) takes an integrative approach at the intersection of artificial and biological intelligence, using AI models to better understand how newborn brains process visual information and learn,” Wood said.
Wood said the lab takes a parallel controlled-rearing approach. Researchers perform controlled-rearing experiments on newborn chicks and "newborn" vision models to ensure that the animals and models receive the same training data and test trials.
“By using this comparative approach,” she said, “we can see how closely AI models approximate biological intelligence by assessing whether newborn animals and newborn models show the same patterns of learning successes and failures.”
She said that vision transformers – which are generic models with little prior information about the world – showed learning patterns similar to newborn chicks.
“That suggests that knowledge emerges through experience and interaction with the environment, aligning more with an empiricist perspective,” she said.
Other topics included “Towards Open World Visual Understanding” by Michigan State’s Yu Kong, “On the Shoulders of Pre-trained Giants: Versatile Visual and Multimodal Learning with Less Supervision” by the University of Illinois’s Yuxiong Wang and “Generative AI for 3D Content” by the University of Michigan’s Jun Gao.
“We had great talks from some of the best faculty in the world, who happen to also be in the Midwest,” Crandall said. “Even at the international computer vision conferences, it’s not often that you get to hear so many great talks from so many great universities all at once.”
Coen praised the chance to hear computer vision experts discuss so many cutting-edge topics. He said he also benefitted from the networking opportunities, and hoped it would spur more Midwest collaboration.
“It was really fun,” Coen said. “I’m happy the workshop has started back up.”