Student project details

Topic: Multimodal video similarity
Department: Department of Cybernetics
Supervisor: Ing. Jan Zahálka
Offered as: Master's thesis, Bachelor's thesis, Semester project
Description: What makes videos (dis)similar to each other? And how do we judge this similarity automatically?

Videos as a data format are used in a wide variety of domains, e.g., entertainment, news and media, security, and data analytics. There is a large demand for AI techniques that can assist in analyzing and making sense of video collections, especially large ones. To be able to search these collections efficiently, we must establish a similarity structure. In conventional data spaces, classics such as Euclidean or cosine distance might suffice, but what about videos?

Videos are a multimodal data format (visual data, audio, metadata, a temporal axis, possibly comments/annotations...) packed with information. They vary in length and exhibit very high content variance: a feature-length film is very different from an Instagram clip, and both are very different from a piece of surveillance footage. Which similarity measure do we choose, then? Which of the existing ones is the best? Or can we come up with a new, better one?

This is an open-ended project for a student interested in scientific work. The topic is broad, so you will be able to steer it, to an extent, towards what interests you.
Responsible for content: Petr Pošík