Alexander Shekhovtsov presents Explainable Training of Binary Neural Networks

On 2020-12-22 11:00 at Online https://cw.felk.cvut.cz/brute/bbb.php?join=cw_Xptf5FKT1b
There is a high demand for neural networks that use low-precision computations,
or even mostly binary operations, which are much faster and need less energy.
The performance of such binary neural networks on benchmarks like the ImageNet
classification challenge steadily improves, while the number of unclear tricks
and special ingredients involved in the training procedures grows. Many of
these tricks concern how to train with binary activations and/or binary
weights by somehow (ab)using backpropagation. Can we instead derive learning
methods that would be correct in a certain sense, so that we would know what
we are doing? Towards this end, we apply the stochastic relaxation method:
each binary entity has a probability of taking a particular state, so that
optimization can be performed, and gradients computed, with respect to these
continuous probabilities. With some surprise, we have derived the popular
straight-through estimator in a particular form and some of the popular weight
update rules. This theoretically grounded approach allowed us to analyze these
estimators, understand their limitations, analytically derive useful
recommendations regarding their application in practice, obtain improved
estimators, and numerically verify their accuracy.
The conceptual message of this research is that it is possible to train binary
networks using explainable methods. These methods partially coincide with
previous empirical approaches but are now free of guessing and have known
properties and limitations, so that one can swap in more accurate methods as
needed and improve them further.
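
As a rough illustration of the stochastic relaxation idea (a toy sketch, not
the speaker's implementation), consider a single binary unit x in {-1, +1}
with P(x = +1) = sigmoid(a). The expected loss is then a smooth function of
the continuous parameter a and can be differentiated exactly. The logistic
noise model, the single-unit setup, and all function names below are
assumptions made for this example. The straight-through-style estimate shown
is one common form: sample x, then multiply the loss derivative at x by
dE[x]/da. It agrees with the exact gradient when the loss is linear in x and
is biased otherwise.

```python
import math
import random


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def expected_loss(f, a):
    """E[f(x)] for x = +1 with probability sigmoid(a), else x = -1."""
    p = sigmoid(a)
    return p * f(1.0) + (1.0 - p) * f(-1.0)


def exact_grad(f, a):
    """Exact derivative of expected_loss with respect to a."""
    p = sigmoid(a)
    return p * (1.0 - p) * (f(1.0) - f(-1.0))


def st_grad(df, a, n_samples=20000, seed=0):
    """Monte Carlo average of a straight-through-style estimate.

    Each sample draws a binary x and multiplies the loss derivative df(x)
    by dE[x]/da = 2 * p * (1 - p), treating the sampling step as if it
    were the smooth map a -> E[x].
    """
    rng = random.Random(seed)
    p = sigmoid(a)
    total = 0.0
    for _ in range(n_samples):
        x = 1.0 if rng.random() < p else -1.0
        total += df(x) * 2.0 * p * (1.0 - p)
    return total / n_samples


if __name__ == "__main__":
    a = 0.3
    # Linear loss: the two gradients agree.
    print(exact_grad(lambda x: 3.0 * x, a), st_grad(lambda x: 3.0, a))
    # Quadratic loss: the straight-through-style estimate is biased.
    print(exact_grad(lambda x: (x - 0.5) ** 2, a),
          st_grad(lambda x: 2.0 * (x - 0.5), a))
```

With many binary units the exact expectation is no longer tractable, which is
where estimators of this kind become relevant; the toy case only makes it
possible to compare an estimate against a known gradient.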

Meeting link (CTU login): https://cw.felk.cvut.cz/brute/bbb.php?join=cw_Xptf5FKT1b
Meeting link (non-CTU): https://cw.felk.cvut.cz/bbb/cw_Xptf5FKT1b


References:
[1] "Path Sample-Analytic Gradient Estimators for Stochastic Binary Networks"
https://arxiv.org/abs/2006.03143
[2] "Reintroducing Straight-Through Estimators as Principled Methods for
Stochastic Binary Networks"
https://openreview.net/forum?id=F8lXvXpZdrL

Web page of this event:
https://docs.google.com/document/d/1dwnWx31cscD4wRKJArqScfGa8zU-4rUKMrvpoxnmiwU/edit#

Responsible person: Petr Pošík