Crowdsourcing with Multi Dimensional Trust and Active Learning
Baras, John S.
June 01, 2016
We consider a typical crowdsourcing task that aggregates input from multiple workers as a problem in information fusion. To cope with noisy and sometimes malicious input from workers, trust is used to model workers' expertise. In a multi-domain knowledge learning task, however, a scalar-valued trust is not sufficient to reflect a worker's trustworthiness in each of the domains. To address this issue, we propose a probabilistic model that jointly infers the multi-dimensional trust of workers, the multi-domain properties of questions, and the true labels of questions. Our model is flexible and can be extended to incorporate metadata associated with questions. To show this, we further propose two extended models: one handles input tasks with real-valued features, and the other handles tasks with text features by incorporating topic models. Finally, we evaluate our models on real-world datasets and demonstrate that they outperform state-of-the-art methods. In addition, our models can effectively recover the trust vectors of workers, which can be very useful for future task assignment that adapts to workers' trust. To decrease entropy and reduce error rates more quickly, using fewer annotations from workers, we further propose strategies for selecting which questions to ask and which workers to assign them to, based on the multi-dimensional characteristics of questions and workers' trust values in those dimensions. These results can be applied to the fusion of information from multiple data sources such as sensors, human input, machine learning results, or a hybrid of them.
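To make the idea of multi-dimensional trust concrete, the following minimal Python sketch shows how a per-domain trust vector could change both label aggregation and question/worker selection. It is an illustration under strong assumptions, not the report's model: the report infers trust vectors, question domains, and true labels jointly, whereas this sketch treats the trust vectors and domain mixtures as already known, restricts labels to binary values, and assumes workers answer independently. All function names (`effective_trust`, `posterior_true`, `pick_next`) and the example numbers are hypothetical.

```python
import math

def effective_trust(trust_vec, domain_mix):
    """Assumed probability that a worker answers a question correctly:
    the worker's per-domain trust averaged under the question's domain mixture."""
    return sum(t * d for t, d in zip(trust_vec, domain_mix))

def posterior_true(answers, trust, domain_mix):
    """Posterior P(label = 1) for one binary question, assuming a uniform prior
    and workers who answer independently given the true label."""
    log_odds = 0.0
    for worker, answer in answers.items():
        p = effective_trust(trust[worker], domain_mix)
        p = min(max(p, 1e-6), 1 - 1e-6)  # keep the log-likelihood ratio finite
        llr = math.log(p / (1 - p))
        log_odds += llr if answer == 1 else -llr
    return 1 / (1 + math.exp(-log_odds))

def entropy(p):
    """Binary entropy in bits of a posterior probability p."""
    if p <= 0 or p >= 1:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def pick_next(posteriors, domain_mixes, trust):
    """Greedy heuristic (an assumption, not the report's strategy): ask about the
    most uncertain question and assign it to the worker most trusted on its domains."""
    question = max(posteriors, key=lambda q: entropy(posteriors[q]))
    worker = max(trust, key=lambda w: effective_trust(trust[w], domain_mixes[question]))
    return question, worker

# Hypothetical workers with trust over two domains.
trust = {"alice": [0.9, 0.5], "bob": [0.5, 0.9]}
# A question that lies entirely in the first domain; the two workers disagree.
p = posterior_true({"alice": 1, "bob": 0}, trust, [1.0, 0.0])
```

In this example both workers have the same average trust (0.7), so a scalar trust value would leave the disagreement unresolved. With per-domain trust, the question's domain mixture favors alice's answer, which is the motivation the abstract gives for moving beyond scalar-valued trust.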