
Bachelor thesis: Design of Probabilistic Models for Text Input Correction (PDF)
Author: Novák Antonín
Supervisor: Ing. Jan Šedivý CSc.
Abstract: This thesis introduces a new algorithm for a search query spelling correction system. It is based on the Learning to Rank approach and can combine a large number of diverse signals, which leads to improved accuracy. Its performance is tested against the conventional solution, the noisy channel model. The new system was developed on a set of Czech Internet search queries, but the feature vector structure and the algorithm can be easily adapted to any other human language when sufficient data is available. We describe the algorithm's details, the training set, and the other datasets that were used, and conclude by presenting the final results.
Submitted: May 2013