Reflect and correct: A misclassification prediction approach to active inference

Title: Reflect and correct: A misclassification prediction approach to active inference
Publication Type: Journal Articles
Year of Publication: 2009
Authors: Bilgic M, Getoor L
Journal: ACM Transactions on Knowledge Discovery from Data (TKDD)
Volume: 3
Issue: 4
Pagination: 20:1–20:32
Date Published: 2009/12
ISSN: 1556-4681
Keywords: active inference, collective classification, information diffusion, label acquisition, viral marketing
Abstract

Information diffusion, viral marketing, graph-based semi-supervised learning, and collective classification all attempt to model and exploit the relationships among nodes in a network to improve the performance of node labeling algorithms. However, sometimes the advantage of exploiting the relationships can become a disadvantage. Simple models like label propagation and iterative classification can aggravate a misclassification by propagating mistakes in the network, while more complex models that define and optimize a global objective function, such as Markov random fields and graph mincuts, can misclassify a set of nodes jointly. This problem can be mitigated if the classification system is allowed to ask for the correct labels for a few of the nodes during inference. However, determining the optimal set of labels to acquire is intractable under relatively general assumptions, which forces us to resort to approximate and heuristic techniques. We describe three such techniques in this article. The first one is based on directly approximating the value of the objective function of label acquisition and greedily acquiring the label that provides the most improvement. The second technique is a simple technique based on the analogy we draw between viral marketing and label acquisition. Finally, we propose a method, which we refer to as reflect and correct, that can learn and predict when the classification system is likely to make mistakes and suggests acquisitions to correct those mistakes. We empirically show on a variety of synthetic and real-world datasets that the reflect and correct method significantly outperforms the other two techniques, as well as other approaches based on network structural measures such as node degree and network clustering.
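
The error-propagation problem and the greedy acquisition idea described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the majority-vote propagation rule, the chain graph, and the helper names `propagate`, `count_errors`, and `greedy_acquire` are all assumptions made for illustration. In particular, the sketch scores each candidate acquisition against the ground-truth labels, which a real system would not have; the article instead approximates the value of an acquisition.

```python
from collections import Counter

def propagate(adj, known, rounds=10):
    """Toy label propagation (illustrative assumption): every node without an
    acquired label takes the majority label among its labeled neighbors."""
    labels = dict(known)
    for _ in range(rounds):
        changed = False
        for node in sorted(adj):
            if node in known:
                continue  # acquired labels are never overwritten
            votes = Counter(labels[n] for n in adj[node] if n in labels)
            if not votes:
                continue
            # Deterministic tie-break: highest count, then smallest label.
            best = min(votes, key=lambda lab: (-votes[lab], lab))
            if labels.get(node) != best:
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels

def count_errors(adj, known, truth):
    """Number of nodes whose propagated label differs from the true label."""
    pred = propagate(adj, known)
    return sum(pred.get(n) != truth[n] for n in adj)

def greedy_acquire(adj, known, truth):
    """Pick the single node whose true label, once acquired, leaves the fewest
    errors. Uses ground truth as an oracle, purely for illustration."""
    candidates = [n for n in sorted(adj) if n not in known]
    return min(
        candidates,
        key=lambda n: count_errors(adj, {**known, n: truth[n]}, truth),
    )

# Hypothetical chain graph 0-1-2-3-4-5 whose true labels switch halfway.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
truth = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
known = {0: "A"}

errors_before = count_errors(adj, known, truth)
best_node = greedy_acquire(adj, known, truth)
errors_after = count_errors(adj, {**known, best_node: truth[best_node]}, truth)
print(errors_before, best_node, errors_after)
```

In this toy setting the single seed label spreads down the whole chain, so every node on the far side of the boundary is mislabeled, and acquiring one well-chosen label at the boundary removes those errors.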

URL: http://doi.acm.org/10.1145/1631162.1631168
DOI: 10.1145/1631162.1631168