Topic:	Framework for AI System Output Evaluation by Humans
Supervisors:	Mgr. Petr Baudiš, Ing. Jan Šedivý, CSc.
Listed as:	Master's thesis, Bachelor's thesis
Description:	Artificial intelligence systems that are interactive by nature and produce free-form output, for example question answering, dialog, chatbot or figure synthesis systems, are difficult to evaluate on a large dataset, because many answers cannot be judged correct or incorrect simply by matching them against a predefined template; a human needs to enter the loop and evaluate them. The task is to survey existing solutions in this area and to build a framework for human evaluation of results. The framework should support both interactive evaluation (users ask questions and judge whether the output is correct) and batch evaluation (users judge sets of pre-generated answers to past questions), as well as aggregation, analysis, memoization and export of results. The framework should be reasonably generic, but we will apply it to the question answering domain, on the "brmson" system developed in our group.
Listed on:	18.11.2014