Topic: | Framework for AI System Output Evaluation by Humans |
---|---|
Supervisors: | Mgr. Petr Baudiš, Ing. Jan Šedivý, CSc. |
Offered as: | Master's thesis, Bachelor's thesis |
Description: | When designing artificial intelligence systems that are interactive by nature and produce free-form output (for example question answering, dialog, chat-bot, or figure synthesis systems), it is difficult to evaluate performance on a large dataset: many answers cannot be judged correct or incorrect simply by matching them against a predefined template, so a human needs to enter the loop and evaluate them. The task is to survey the area for existing solutions and to build a framework for human evaluation of results. The framework should be capable of both interactive evaluation (users ask questions and judge whether the output is correct) and batch evaluation (users evaluate sets of pre-generated answers to past questions), and should support aggregation, analysis, memoization, and export of results. It should be reasonably generic, but we will apply it to the question answering domain, on the "brmson" system developed in our group. |
Posted on: | 18.11.2014 |
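The core ideas in the description (human-in-the-loop judging, batch evaluation of pre-generated answers, memoization so identical answers are judged only once, and aggregation/export of results) can be sketched as a minimal in-memory store. This is an illustrative assumption about one possible design, not the actual framework the thesis would produce; all names (`EvalStore`, `judge`, `ask_human`, etc.) are hypothetical.

```python
from typing import Callable, Iterable, List, Tuple, Dict


class EvalStore:
    """Hypothetical sketch of a human-evaluation store.

    Memoizes human verdicts per (question, answer) pair so a
    repeated identical answer never goes back to the annotator.
    """

    def __init__(self) -> None:
        # (question, answer) -> human verdict (True = correct)
        self._memo: Dict[Tuple[str, str], bool] = {}

    def judge(self, question: str, answer: str,
              ask_human: Callable[[str, str], bool]) -> bool:
        """Interactive mode: consult a human only on unseen pairs."""
        key = (question, answer)
        if key not in self._memo:  # memoization of past verdicts
            self._memo[key] = ask_human(question, answer)
        return self._memo[key]

    def batch(self, pairs: Iterable[Tuple[str, str]],
              ask_human: Callable[[str, str], bool]) -> List[bool]:
        """Batch mode: evaluate pre-generated answers to past questions."""
        return [self.judge(q, a, ask_human) for q, a in pairs]

    def accuracy(self) -> float:
        """Simple aggregation over all collected verdicts."""
        verdicts = list(self._memo.values())
        return sum(verdicts) / len(verdicts) if verdicts else 0.0

    def export(self) -> List[dict]:
        """Export collected judgments as plain records."""
        return [{"question": q, "answer": a, "correct": v}
                for (q, a), v in self._memo.items()]
```

In a real system `ask_human` would be a web form rather than a callable, and the memo would live in a database, but the interface above captures how interactive and batch evaluation can share one judgment cache.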