We look forward to your attendance.
Date and time: Monday, April 13, 2026, 1:30 PM
Venue: Room 610, 정운오IT교양관
Title: Scaling Video Reasoning for the Real World
Abstract:
Video understanding has advanced rapidly, yet current models still struggle with the complexity of real-world scenarios. Unlike curated benchmarks, real-world video is often long, noisy, and incomplete, making reliable reasoning significantly more challenging. In this talk, I will discuss how video understanding systems can scale across both training and inference by better identifying salient information, leveraging diverse multimodal signals, and retaining past experience through an advanced memory system. I will further highlight a shift toward more active forms of reasoning, where models move beyond passive observation and adaptively allocate computation at inference time to support long-horizon reasoning.