Topic: Framework for AI System Output Evaluation by Humans
Supervisors: Mgr. Petr Baudiš, Ing. Jan Šedivý, CSc.
Offered as: Master's thesis, Bachelor's thesis
Description: When designing artificial intelligence systems that are
interactive by nature and produce free-form output, for example Question
Answering, dialog, chat bot or figure synthesis systems, it is difficult
to evaluate the performance of the system on a large dataset, because
many answers cannot be judged correct or incorrect simply by matching
them against a predefined template; a human needs to enter the loop and
evaluate them.

The task is to survey the area for existing solutions and to build a
framework for human evaluation of results. The framework should support
both interactive evaluation (users ask questions and judge whether the
output is correct) and batch evaluation (users judge sets of
pre-generated answers to past questions), as well as aggregation,
analysis, memoization and export of results. The framework should be
reasonably generic, but we will apply it to the Question Answering
domain, on the "brmson" system developed in our group.
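
To make the requirements above more concrete, the following is a minimal sketch of how judgment storage, memoization, aggregation and export could fit together. It is only an illustration under stated assumptions: the names Judgment, JudgmentStore and pending are hypothetical, the majority-vote aggregation is one possible choice, and nothing here is taken from brmson or from any existing framework.

```python
import csv
import io
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Judgment:
    """One human verdict on one (question, answer) pair. Hypothetical type."""
    question: str
    answer: str
    evaluator: str
    correct: bool


class JudgmentStore:
    """Hypothetical in-memory store of human judgments."""

    def __init__(self):
        # (question, answer) -> {evaluator: Judgment}; a repeated vote by the
        # same evaluator overwrites the earlier one.
        self._votes = defaultdict(dict)

    def record(self, judgment):
        key = (judgment.question, judgment.answer)
        self._votes[key][judgment.evaluator] = judgment

    def lookup(self, question, answer):
        """Memoization: majority verdict for a pair, or None if unseen or tied."""
        votes = self._votes.get((question, answer))
        if not votes:
            return None
        yes = sum(j.correct for j in votes.values())
        no = len(votes) - yes
        return None if yes == no else yes > no

    def accuracy(self):
        """Aggregation: share of decided pairs whose majority verdict is correct."""
        verdicts = [self.lookup(q, a) for (q, a) in self._votes]
        decided = [v for v in verdicts if v is not None]
        return sum(decided) / len(decided) if decided else None

    def export_csv(self):
        """Export: every individual judgment as one CSV row."""
        buf = io.StringIO()
        writer = csv.writer(buf, lineterminator="\n")
        writer.writerow(["question", "answer", "evaluator", "correct"])
        for votes in self._votes.values():
            for j in votes.values():
                writer.writerow([j.question, j.answer, j.evaluator, int(j.correct)])
        return buf.getvalue()


def pending(store, pairs):
    """Batch mode: keep only the pairs that still need a human verdict."""
    return [(q, a) for q, a in pairs if store.lookup(q, a) is None]


# Illustrative data only.
store = JudgmentStore()
store.record(Judgment("What is the capital of France?", "Paris", "alice", True))
store.record(Judgment("What is the capital of France?", "Paris", "bob", True))
store.record(Judgment("What is the capital of France?", "Lyon", "alice", False))
```

In this sketch, interactive evaluation would call record right after a user judges a fresh answer, while batch evaluation would first pass the pre-generated answers through pending so that pairs already judged are not shown to evaluators again.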
Listed on: 18.11.2014