Michele Dallachiesa, C. Aggarwal, Themis Palpanas
Jan 4, 2019
Journal of Data and Information Quality (JDIQ)
In many real-world applications that use and analyze networked data, the links in the network graph may be erroneous or derived from probabilistic techniques. In such cases, node classification is challenging, because unreliable links can distort the final classification results. If the information about link reliability is not used explicitly, classification accuracy in the underlying network may suffer. In this article, we focus on situations that require analyzing the uncertainty present in the graph structure. We study the novel problem of node classification in uncertain graphs, treating uncertainty as a first-class citizen. We propose two techniques based on a Bayes model and automatic parameter selection, and show that incorporating uncertainty directly into the classification process is beneficial. We experimentally evaluate the proposed approach on several real data sets and study the behavior of the algorithms under different conditions. The results demonstrate the effectiveness and efficiency of our approach.
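To make the problem setting concrete, the following minimal sketch shows how ignoring link reliability can flip a predicted label. It is not the Bayes-based method proposed in the article: it substitutes a simple probability-weighted neighbor vote, and the function name `classify_node`, the toy graph, and the edge probabilities are all hypothetical values chosen for illustration.

```python
from collections import defaultdict


def classify_node(node, edges, labels):
    """Predict a label for `node` by a probability-weighted neighbor vote.

    Illustrative stand-in only, NOT the article's Bayes model.
    edges:  dict mapping an undirected edge (u, v) to its existence probability.
    labels: dict mapping already-labeled nodes to their class label.
    """
    scores = defaultdict(float)
    for (u, v), prob in edges.items():
        # Find the neighbor on the other end of an edge touching `node`.
        if u == node:
            neighbor = v
        elif v == node:
            neighbor = u
        else:
            continue
        # Each labeled neighbor votes with weight equal to the edge probability.
        if neighbor in labels:
            scores[labels[neighbor]] += prob
    if not scores:
        return None  # no labeled neighbors, so no prediction
    return max(scores, key=scores.get)


# Hypothetical uncertain graph: one reliable link and two unreliable ones.
edges = {("a", "x"): 0.9, ("b", "x"): 0.2, ("c", "x"): 0.3}
labels = {"a": "red", "b": "blue", "c": "blue"}

# Using the edge probabilities as vote weights.
with_uncertainty = classify_node("x", edges, labels)

# Treating every link as certain (probability 1.0).
certain_edges = {edge: 1.0 for edge in edges}
without_uncertainty = classify_node("x", certain_edges, labels)

print(with_uncertainty, without_uncertainty)
```

In this toy graph, the single reliable link outweighs the two unreliable ones when probabilities are used, whereas a plain majority vote over all links favors the other class.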