Francesco Mannella presents Topological Alignment of Action-Perception Domains: Goal Formation from Sensorimotor Contingencies

On 2026-05-12 11:00:00 at E112, Karlovo náměstí 13, Praha 2
The brain builds internal representations to support goal-directed behavior.
Dominant theories rely on generative forward models to solve the problem of
finding the optimal policy. We propose an alternative approach in which the
problem of control is cast as a problem of representation learning via the
acquisition of anticipatory action maps. Rather than estimating hidden states
and correcting predictions, the brain constructs an internal topological space
in which each node already represents both a multimodal sensory state and a
motor plan for reaching that state. The challenge thus shifts from online
estimation to the offline construction of a representational geometry that
directly affords goal specification and action selection. This reframing
highlights several key sub-problems: how sensory representations are acquired,
how motor plans become intrinsic to those representations rather than computed
over them, and how such sensorimotor nodes can later be retrieved and deployed
as goals. To address these sub-problems, we propose a learning framework that
treats policies as an input domain with the same status as sensory inputs.
Topological maps are built over each sensory modality and over the policy space,
with learning constrained so that all maps align with respect to action-related
events. The resulting structure is a topologically aligned space in which each
point encodes a sensorimotor contingency and can serve directly as a goal
representation. Alongside this computational framework, we offer a
neuroscientific account that links our proposal to a broader hypothesis
concerning the functional integration of fronto-parietal cortical networks,
suggesting how the mechanisms we describe may be reflected in the coordinated
activity of frontal and parietal regions during goal-directed behavior. Finally,
we present two implementations of this approach, each set in a distinct,
abstract simulated environment: a manipulation domain, in which a simplified
2D agent learns to interact with objects, and a visual attention domain, in
which a retina learns to saccade across a 2D scene containing different shapes.
Responsible person: Petr Pošík