Topic: Novel activation functions for deep neural networks accounting for per-example variance
Supervisor: Oleksandr Shekhovtsov Ph.D., Doc. Dr. Boris Flach
Offered as: Master's thesis
Description: Methods such as batch normalisation (Ioffe, Szegedy, 2015) and self-normalising networks (Klambauer, Unterthiner, Mayr, Hochreiter, 2017) are used to improve the accuracy and learning speed of deep networks, especially those with a large number of parameters (> 10^6).

A probabilistic interpretation of deep networks, which we found recently, opens a different avenue for tackling the issue. This interpretation shows that standard activation functions (tanh, sigmoid) correspond to the simplest approximation when computing the probabilities for the nodes of the next layer given the probabilities for the nodes of the previous layer. Better approximations account for the variance at the level of a single example (as opposed to batch statistics) and lead to more advanced activation functions.
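
As a rough illustration of the idea, the following sketch contrasts a mean-only propagation rule with one that also uses the per-example variance. It is not the method proposed in the thesis topic, which is not specified here. It assumes independent Bernoulli inputs, a logistic unit, and the well-known probit-style approximation E[sigmoid(a)] ≈ sigmoid(m / sqrt(1 + pi*v/8)) for a Gaussian pre-activation with mean m and variance v. All function names are hypothetical.

```python
import math
import random


def sigmoid(a):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-a))


def propagate_mean_only(means, weights, bias=0.0):
    """Simplest approximation: feed the input probabilities (means) through
    the unit and ignore their variance. This is an ordinary sigmoid layer."""
    m = sum(w * p for w, p in zip(weights, means)) + bias
    return sigmoid(m)


def propagate_with_variance(means, weights, bias=0.0):
    """Variance-aware sketch (an assumption, not the thesis method):
    treat the pre-activation as Gaussian with mean m and variance v, and
    approximate E[sigmoid(a)] by sigmoid(m / sqrt(1 + pi * v / 8))."""
    m = sum(w * p for w, p in zip(weights, means)) + bias
    # Variance of a weighted sum of independent Bernoulli(p) inputs.
    v = sum(w * w * p * (1.0 - p) for w, p in zip(weights, means))
    return sigmoid(m / math.sqrt(1.0 + math.pi * v / 8.0))


def monte_carlo_reference(means, weights, bias=0.0, samples=20000, seed=0):
    """Reference value: sample binary inputs and average the unit's output."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        a = bias + sum(w for w, p in zip(weights, means) if rng.random() < p)
        total += sigmoid(a)
    return total / samples


if __name__ == "__main__":
    means = [0.5] * 20      # maximally uncertain inputs
    weights = [1.0] * 20
    bias = -8.0
    print("mean only:     ", propagate_mean_only(means, weights, bias))
    print("with variance: ", propagate_with_variance(means, weights, bias))
    print("Monte Carlo:   ", monte_carlo_reference(means, weights, bias))
```

In this toy setting the variance-aware rule should land noticeably closer to the sampled reference than the mean-only rule, because the inputs are highly uncertain. When all input probabilities are close to 0 or 1, the variance vanishes and the two rules coincide.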

The master's thesis will consider two variants of such advanced activation functions and compare their precision and convergence speed of learning with those of the aforementioned standard methods. Furthermore, the candidate is expected to contribute to a better understanding of the underlying theoretical phenomena and to the search for better approximations starting from the mentioned probabilistic interpretation of deep networks.

Prerequisites: a good command of mathematics, probability theory and the foundations of machine learning; programming skills in Python; C++ and CUDA can be helpful.

During the project, the student will work with mainstream neural-network frameworks that allow symbolic modelling and learning on GPUs (TensorFlow) and will test model improvements on state-of-the-art datasets and architectures.

Supervisors: Boris Flach and Oleksandr Shekhovtsov.
References:
[1] Sergey Ioffe, Christian Szegedy. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. arXiv:1502.03167.
[2] Günter Klambauer, Thomas Unterthiner, Andreas Mayr, Sepp Hochreiter. Self-Normalizing Neural Networks. arXiv:1706.02515.
Posted on: 02.08.2017