Detail of the student project

Topic: Deep stochastic predictors
Department: Department of Cybernetics
Supervisor: doc. Boris Flach, Dr. rer. nat. habil.
Announce as: Master's thesis, Semester project
Description: It is well known from statistical pattern recognition that stochastic predictors (classifiers) cannot be better than their deterministic counterparts. Using deterministic predictors together with standard loss functions (e.g. the 0/1 loss), on the other hand, makes the corresponding empirical loss a piecewise constant function of the network parameters, and hence unsuitable for gradient-based training of deep networks. The standard solution is to interpret the network outputs as class probabilities and to use a maximum-likelihood estimator (cross-entropy loss) instead.

This thesis proposal explores another alternative: to interpret the network outputs as the probabilities of a stochastic predictor and to minimise the corresponding averaged expected loss, which is a smooth function of the network parameters. This opens a further option: to introduce stochasticity not only in the last layer but in all hidden layers of the network.
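To make the idea concrete, here is a minimal PyTorch sketch (function name illustrative, not part of the proposal): for the 0/1 loss, the expected loss of a stochastic predictor that samples a class from the softmax output is 1 - p(correct class), which is smooth in the network parameters, unlike the 0/1 loss of the deterministic argmax predictor.

```python
import torch
import torch.nn.functional as F

def expected_01_loss(logits, targets):
    # Interpret the network outputs as class probabilities of a
    # stochastic predictor. The expected 0/1 loss under sampling
    # from this predictor is 1 - p(correct class): a smooth,
    # differentiable function of the logits.
    probs = F.softmax(logits, dim=1)                            # (N, C)
    p_correct = probs.gather(1, targets.unsqueeze(1)).squeeze(1)  # (N,)
    return (1.0 - p_correct).mean()
```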

Tasks:
1. Consider suitable stochastic neuron models (e.g. stochastic binary neurons, stochastic ReLU neurons) and suitable stochastic gradient estimators for them (e.g. the straight-through estimator; see the sketch after the task list).
2. Implement stochastic CNN predictors along with the chosen stochastic gradient estimators. Train them on datasets such as MNIST and CIFAR-10/100 with different loss functions (0/1 loss, hierarchical losses, L1/L2 loss).
3. Compare the resulting stochastic predictors with their respective baselines (i.e. their deterministic counterparts trained by maximum likelihood). Evaluation criteria: empirical loss, robustness to geometric and colour distortions, robustness to adversarial attacks, and expected calibration error (see the ECE sketch below).
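The sketch below illustrates one common variant of a stochastic binary neuron with a straight-through-style gradient estimator (task 1). All names are illustrative; the forward pass samples a {0,1} activation, and the backward pass uses the sigmoid derivative as a smooth surrogate for the non-differentiable sampling step. This is one of several straight-through variants discussed in the literature (cf. Shekhovtsov et al., 2020).

```python
import torch

class StraightThroughBernoulli(torch.autograd.Function):
    # Forward: sample a binary activation with probability sigmoid(x).
    # Backward: differentiate the expected output E[z] = sigmoid(x)
    # instead of the sample itself (a straight-through-style surrogate).
    @staticmethod
    def forward(ctx, x):
        p = torch.sigmoid(x)
        ctx.save_for_backward(p)
        return torch.bernoulli(p)

    @staticmethod
    def backward(ctx, grad_output):
        (p,) = ctx.saved_tensors
        return grad_output * p * (1.0 - p)

def stochastic_binary(x):
    return StraightThroughBernoulli.apply(x)
```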
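For the calibration criterion in task 3, the following is a minimal sketch of the expected calibration error in the sense of Guo et al. (2017), using equal-width confidence bins; the function name and bin count are illustrative choices.

```python
import torch

def expected_calibration_error(probs, targets, n_bins=15):
    # Bin predictions by their confidence (maximum class probability)
    # and compare average confidence with empirical accuracy per bin.
    conf, pred = probs.max(dim=1)
    correct = pred.eq(targets).float()
    bins = torch.linspace(0.0, 1.0, n_bins + 1)
    ece = torch.zeros(())
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            # Weight each bin's |accuracy - confidence| gap by its size.
            ece += mask.float().mean() * (correct[mask].mean() - conf[mask].mean()).abs()
    return ece.item()
```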
Bibliography:
1. A. Shekhovtsov et al. (2020), Path Sample-Analytic Gradient Estimators for Stochastic Binary Networks, Advances in Neural Information Processing Systems (NeurIPS).
2. C. Guo et al. (2017), On Calibration of Modern Neural Networks, arXiv:1706.04599.
3. D. Heaven (2019), Why deep-learning AIs are so easy to fool, Nature 574, 163-166, and references therein.
Responsible person: Petr Pošík