|Topic:||Improved software package for quantized neural networks|
|Supervisor:||Mgr. Oleksandr Shekhovtsov, Ph.D.|
|Offered as:||Master's thesis, Semester project|
|Description:||Project: To improve the speed and energy efficiency of neural networks deployed in mobile, robotics, surveillance, and similar applications, the network weights and activations can be quantized. In the 'quant' library we implement state-of-the-art and new, theoretically principled methods for quantized training and adaptation.
Contribute to the development of the 'quant' library, a PyTorch-based library for training neural networks with quantized weights and activations at low bit resolution (down to 1 bit). While quantized neural networks have the potential for fast execution on various edge devices, the library must first of all provide sufficiently fast training on GPUs. We want to achieve fast training and the best performance at any specified quantization level. In the project:
- Analyze options for implementing propagation through a network composed of elementary layers, with the propagation method determined at runtime.
- Profile the training loop to identify performance bottlenecks.
- Improve the inference speed by obtaining the test-time equivalent quantized model. Each model should have an .inference() method.
- Improve code efficiency by using computation streams, C++ extensions that implement essential computation blocks using the ATen library, and possibly a C++ CUDA extension for the most critical operations, if any are identified.
- Extend the library to support additional methods.
- Test the library on a realistic large classification problem, e.g. surveillance.
|Literature:|| Binary Neural Networks as a general-propose compute paradigm for on-device computer vision
 Path Sample-Analytic Gradient Estimators for Stochastic Binary Networks
 Reintroducing Straight-Through Estimators as Principled Methods for Stochastic Binary Networks
 Straight-Through Top-to-Bottom: A General Formalization for Binary, Quantized and Categorical Variables (draft)
 Relaxed Quantization for Discretized Neural Networks
 A Survey of Quantization Methods for Efficient Neural Network Inference