Oleksandr Shekhovtsov presents Training Binary CNNs with Sound and Simple Techniques

On 2020-05-19 11:00:00, the seminar will be held online (see below)
We focus on the problem of training CNNs with binary weights and activations,
which is challenging due to the lack of gradients and optimization over
discrete weights.

Many successful results have been achieved in the literature with an
empirical approach called straight-through estimation. The missing gradient of
binary weights and activations is replaced with the gradient of a smooth
function. Different ad-hoc suggestions appear as to which function to use:
identity, clipped identity, sigmoid, etc. Optimizing over binary weights also
causes confusion: works introduce latent real weights updated with clipping
rules or try to perform discrete optimization using different heuristics.
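
As a rough illustration of the straight-through idea described above, the
following sketch keeps a binary sign activation in the forward pass and
substitutes the derivative of a smooth surrogate in the backward pass. The
function names (ste_backward, d_identity, d_clipped_identity, d_sigmoid) are
hypothetical and chosen for this example only; this is not the speaker's code.

```python
import math

def sign(x):
    """Forward pass: binary activation, +1 for x >= 0 and -1 otherwise."""
    return 1.0 if x >= 0 else -1.0

# Surrogate derivatives used in place of d sign(x)/dx,
# which is zero almost everywhere and therefore useless for training.

def d_identity(x):
    """Derivative of the identity function."""
    return 1.0

def d_clipped_identity(x):
    """Derivative of clip(x, -1, 1): 1 inside the interval, 0 outside."""
    return 1.0 if abs(x) <= 1.0 else 0.0

def d_sigmoid(x):
    """Derivative of the logistic sigmoid s(x) = 1 / (1 + exp(-x))."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)

def ste_backward(x, upstream_grad, surrogate):
    """Backward pass: chain rule with the surrogate derivative (illustrative)."""
    return upstream_grad * surrogate(x)

if __name__ == "__main__":
    x, g = 1.5, 2.0
    print("forward:", sign(x))
    for name, d in [("identity", d_identity),
                    ("clipped identity", d_clipped_identity),
                    ("sigmoid", d_sigmoid)]:
        print(name, "->", ste_backward(x, g, d))
```

The three surrogates give different gradients for the same input, which is the
ambiguity the abstract refers to.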

We work in the setting where all binary entities are replaced with Bernoulli
random variables. The gradients are well defined for expectations and we can
optimize continuous weight probabilities. We show how to derive an accurate
gradient estimation method that is as simple as straight-through, without
ambiguities or the need for heuristics. We further show how optimizing the
weight probabilities can be elegantly handled by mirror descent, recovering
again a simple and efficient update rule, free of heuristics.
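
A toy sketch of the stochastic setting, under assumptions that are not taken
from the talk: a single weight w in {-1, +1} with P(w = +1) = p, a hypothetical
loss toy_loss, and mirror descent with the binary negative entropy as the mirror
map. With that choice the update becomes an ordinary gradient step on
logit(p). The method presented in the seminar may differ in its details.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def logit(p):
    return math.log(p / (1.0 - p))

def expected_loss(p, f):
    """E[f(w)] for w = +1 with probability p and w = -1 otherwise."""
    return p * f(1.0) + (1.0 - p) * f(-1.0)

def grad_expected_loss(f):
    """d E[f(w)] / dp; well defined although w itself is discrete."""
    return f(1.0) - f(-1.0)

def mirror_descent_step(p, grad, lr):
    """Entropic mirror descent: the gradient step is taken on logit(p)."""
    return sigmoid(logit(p) - lr * grad)

def toy_loss(w):
    """Hypothetical loss that prefers w = +1."""
    return (w - 1.0) ** 2

p = 0.5
for _ in range(10):
    p = mirror_descent_step(p, grad_expected_loss(toy_loss), lr=0.1)

if __name__ == "__main__":
    print("P(w = +1) after 10 steps:", p)
    print("expected loss:", expected_loss(p, toy_loss))
```

Because the step is taken in logit space, p stays strictly inside (0, 1) without
any clipping rule. An implementation would more likely store the logit
directly rather than converting back and forth.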

Please note that this seminar will be held online. You can attend it by using
one of the following links:
Invitation link for CTU users:
https://cw.felk.cvut.cz/brute/bbb.php?join=cw_h0I7b7qtCv
Invitation link for non-CTU users:
https://cw.felk.cvut.cz/bbb/cw_h0I7b7qtCv

The meeting opens at 10:50 to test the connection, microphones, etc.
The seminar starts at 11:00.
The meeting will be recorded.
Responsible person: Petr Pošík