|Topic:||Multimodal video similarity|
|Supervisor:||Ing. Jan Zahálka|
|Announce as:||Master's thesis, Bachelor's thesis, Semester project|
|Description:||What makes videos (dis)similar to each other? And how do we judge this similarity automatically?
Video as a data format is used in a wide variety of domains, e.g., entertainment, news & media, security, or data analytics. There is a large demand for AI techniques that can assist in analyzing and making sense of video collections, especially large ones. To be able to search efficiently in these collections, we must establish a similarity structure. In conventional data spaces, classic measures such as Euclidean or cosine distance might suffice, but what about videos?
Videos are a multimodal data format (visual data, audio, metadata, a temporal axis, possibly comments/annotations...) packed with information. They vary in length and exhibit very high content variance (a feature-length film is very different from an Instagram clip, and both are very different from a piece of surveillance footage). Which similarity measure do we choose, then? Which of the existing ones is the best? Or can we come up with a new, better one?
This is an open-ended project for a student interested in scientific work. The topic is broad, so you will be able to steer it, to an extent, towards what interests you.