David Doermann presents From YOLOv1 to YOLOv12: The Evolution of Real-Time Object Detection

On 2025-10-27 11:00:00 at G205, Karlovo náměstí 13, Praha 2
The YOLO (You Only Look Once) family of models has revolutionized real-time
object detection by unifying classification and localization into a single,
efficient neural network pass. This lecture traces the historical trajectory of
YOLO from its inception in 2016 to the cutting-edge YOLOv12, developed by
researchers at the University at Buffalo. We begin with YOLOv1’s pioneering
architecture and examine successive improvements through YOLOv8, including
innovations in backbone design, multi-scale prediction, and deployment
flexibility.

The focus then shifts to YOLOv12, which introduces a hybrid architecture
combining convolutional efficiency with transformer-inspired attention
mechanisms. Key advancements include FlashAttention for accelerated inference,
Residual Efficient Layer Aggregation Networks (R-ELAN) for stable training, and
expanded support for tasks such as segmentation, pose estimation, and oriented
bounding boxes. YOLOv12 sets new benchmarks in speed and accuracy across a range
of applications, from medical imaging to surveillance.

This lecture not only highlights the technical evolution of YOLO but also
underscores the growing role of academic institutions in shaping the future of
computer vision. Attendees will gain a comprehensive understanding of YOLO’s
development, its practical impact, and the architectural trends driving
next-generation vision models.


Short Bio
Dr. David Doermann is a professor of Empire Innovation and the Chair of the
Department of Computer Science and Engineering at the University at Buffalo
(UB). He was the inaugural director of the Institute for Artificial Intelligence
and Data Science (IAD) from 2018 to 2023. Before coming to UB, he was a program
manager at the Information Innovation Office of the Defense Advanced Research
Projects Agency (DARPA). At DARPA, he developed, selected, and oversaw research
and transition funding in computer vision, human language technologies, voice
analytics, and media forensics. From 1993 to 2018, David was a research faculty
member at the University of Maryland, College Park. In his role at the Institute
for Advanced Computer Studies, he served as Director of the Laboratory for
Language and Media Processing and as an adjunct member of the graduate faculty
for the Department of Computer Science and the Department of Electrical
Engineering. He and his group of researchers focused on many innovative topics
related to analyzing and processing document images and video, including triage,
visual indexing and retrieval, enhancement, and recognition of visual media's
textual and structural components. His recent research has focused on advanced
AI techniques for computer vision, medical image analysis, federated learning,
neural architecture search, binary neural networks, and detecting false
information and misinformation in multimedia content. David has over 300 publications in
conferences and journals, is a fellow of the IEEE and IAPR, has received numerous awards,
including an honorary doctorate from the University of Oulu, Finland, and is a
founding Editor-in-Chief of the International Journal on Document Analysis and
Recognition.

In his work at UB, Dr. Doermann is a passionate advocate for integrating
artificial intelligence into our rapidly evolving world. He primarily focuses on
empowering individuals and organizations with the knowledge and skills they need
to thrive in the age of AI. He firmly believes that AI has the potential to
revolutionize industries, enhance human capabilities, and address some of the
most pressing global challenges. Dr. Doermann was nominated for the State of New
York Governor's Commission on Artificial Intelligence, Robotics, and Automation,
representing the SUNY system. He serves on the SUNY STrategic Research
InVEstment (STRIVE) Task Force on Artificial Intelligence and the UB-led Task
Force on Generative AI in Teaching and Learning. He has organized several
workshops and conferences at UB centered on the challenges of recent AI. He is a
member of the DARPA ISAT study group focused on future uses of AI for the
government.
Responsible person: Petr Pošík