Michael Bloesch presents Learning a compact optimisable representation for dense SLAM

On 2018-11-14 16:00:00 at E112, Karlovo náměstí 13, Praha 2

The representation of geometry in real-time 3D perception systems continues to
be a critical research issue. Dense maps capture complete surface models and can
be augmented with semantic labels, but their high dimensionality makes them
computationally costly to store and process, and unsuitable for rigorous
probabilistic inference. Sparse feature-based representations avoid these
problems, but capture only partial scene information and are mainly useful for
localisation.
During the talk, I will discuss how concepts from traditional SLAM can be
combined with learned elements in order to derive a compact and dense
representation of scene geometry. In particular, I will present an approach,
inspired by both learned depth-from-image prediction and auto-encoders, that
compresses a depth map into a code consisting of a small number of parameters.
Conditioning the depth map on the image allows the code to represent only those
aspects of the local geometry that cannot be predicted directly from the image.
This approach
is suitable for use in a keyframe-based monocular dense SLAM system: while each
keyframe with a code can produce a depth map, the code can be optimised
efficiently and jointly with the pose variables and the codes of overlapping
keyframes to attain global consistency. I will explain how to learn
the code representation, highlight its strengths and weaknesses, and discuss
potential improvements.

External www: https://arxiv.org/abs/1804.00874