Detail of the student project

Topic: Framework for AI System Output Evaluation by Humans
Department: Katedra kybernetiky (Department of Cybernetics)
Supervisor: Mgr. Petr Baudiš, Ing. Jan Šedivý, CSc.
Announce as: DP, BP
Description: When designing artificial intelligence systems that are
interactive by nature and produce free-form output, for example
Question Answering, dialog, chatbot, or figure synthesis systems, it
is difficult to evaluate the performance of the system on a large
dataset: many answers cannot be judged correct or incorrect simply by
matching them against a predefined template, so a human needs to enter
the loop and evaluate them.

The task is to survey the area for existing solutions and to build a
framework for human evaluation of results, capable of both interactive
evaluation (users ask questions and judge whether the output is
correct) and batch evaluation (users evaluate sets of pre-generated
answers to past questions), and supporting aggregation, analysis,
memoization, and export of results. The framework should be reasonably
generic, but we will apply it to the Question Answering domain, on the
"brmson" system developed in our group.
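
To make the memoization and aggregation requirements concrete, here is a minimal sketch of what such an evaluation store could look like. All class and method names (`EvalStore`, `record`, `accuracy`, `export_csv`) are hypothetical illustrations, not part of any existing framework; the actual design is the subject of the project.

```python
import csv
import io

class EvalStore:
    """Collects human judgements of system answers and memoizes them,
    so the same (question, answer) pair is never judged twice."""

    def __init__(self):
        self._judgements = {}  # (question, answer) -> bool

    def needs_judgement(self, question, answer):
        # True if no human has judged this pair yet.
        return (question, answer) not in self._judgements

    def record(self, question, answer, correct):
        # Memoization: keep only the first judgement for a given pair.
        self._judgements.setdefault((question, answer), correct)

    def accuracy(self):
        # Simple aggregate statistic over all judged pairs.
        if not self._judgements:
            return 0.0
        return sum(self._judgements.values()) / len(self._judgements)

    def export_csv(self):
        # Export all judgements as CSV text for offline analysis.
        buf = io.StringIO()
        writer = csv.writer(buf)
        writer.writerow(["question", "answer", "correct"])
        for (q, a), ok in sorted(self._judgements.items()):
            writer.writerow([q, a, int(ok)])
        return buf.getvalue()

# Batch evaluation: pre-generated answers are presented to a judge
# only when needs_judgement() says they are unseen.
store = EvalStore()
store.record("capital of France?", "Paris", True)
store.record("capital of France?", "Lyon", False)
print(store.needs_judgement("capital of France?", "Paris"))  # False: memoized
print(store.accuracy())  # 0.5
```

In a real system the store would be backed by a database and the judgements would carry judge identity and timestamps, but the same interface covers both the interactive and the batch mode described above.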
Date: 18.11.2014
Responsible person: Petr Pošík