DO: JvF25/3-303 | BN: b-it/1.047

TU Dortmund University, Room 3-303, Lamarr-Institut, Joseph-von-Fraunhofer-Str. 25, 44227 Dortmund
University of Bonn, Room 1.047, Institute for Informatics, Friedrich-Hirzebruch-Allee 8
Description

Anticipation by Prof. Jürgen Gall

This lecture gives a brief introduction to anticipating future actions from videos.

Vision-Language-Action Models for Cognitive Robots by Prof. Sven Behnke

This lecture introduces Vision-Language-Action (VLA) models as a unifying framework for cognitive robots capable of grounding perception, language understanding, and physical interaction. We examine how modern VLA architectures integrate multimodal representations to interpret visual scenes, follow natural-language instructions, and generate executable action plans in real time. Key topics include multimodal transformers, affordance grounding, task decomposition, action policy learning, and bridging high-level semantic reasoning with low-level robot control. Through examples from state-of-the-art research and robot demonstrations, students will gain insight into how VLA models enable adaptive, generalizable, and human-aligned robotic cognition.

   
Organised by

Vanessa Faber & Brendan Balcerak Jackson