- Oct 17 2017 cs.SI arXiv:1710.05333v1 Why is a given node in a time-evolving graph ($t$-graph) marked as an anomaly by an off-the-shelf detection algorithm? Is it because of the number of its outgoing or incoming edges, or their timings? How can we best convince a human analyst that the node is anomalous? Our work aims to provide succinct, interpretable, and simple explanations of anomalous behavior in $t$-graphs (communications, IP-IP interactions, etc.) while respecting the limited attention of human analysts. Specifically, we extract key features from such graphs and propose to output a few pair (scatter) plots from this feature space which "best" explain known anomalies. To this end, our work has four main contributions: (a) problem formulation: we introduce an "analyst-friendly" problem formulation for explaining anomalies via pair plots, (b) explanation algorithm: we propose a plot-selection objective and the LookOut algorithm to approximate it with optimality guarantees, (c) generality: our explanation algorithm is both domain- and detector-agnostic, and (d) scalability: we show that LookOut scales linearly in the number of edges of the input graph. Our experiments show that LookOut performs near-ideally in terms of maximizing the explanation objective on several real datasets, including Enron e-mail and DBLP coauthorship. Furthermore, LookOut produces fast, visually interpretable and intuitive results in explaining "ground-truth" anomalies from Enron, DBLP and LBNL (computer network) data.
- Oct 17 2017 physics.soc-ph cs.SI arXiv:1710.05265v1 The network dismantling problem asks only for a minimal vertex set whose removal breaks a graph into connected components of sub-extensive size. We argue that the efficiency of the intermediate states during the entire dismantling process should also be considered; in this paper it is measured by the general performance R. To improve the general performance of the belief-propagation decimation (BPD) algorithm, we introduce a compound algorithm (CA) that combines BPD with the node explosive percolation (NEP) algorithm. In the CA, NEP rearranges and optimizes the head of the dismantling sequence produced by BPD. The two ancestor algorithms are connected at the joint point where the general performance can be optimized. The CA dismantles a graph into small pieces as quickly as BPD while retaining the efficiency of NEP throughout the dismantling process. We find that a good joint point is where BPD has broken the original graph into subgraphs no larger than 1% of the original. We refer to the CA with this fixed joint point as the fast CA; it is in the same complexity class as the BPD algorithm. Computations on several real-world instances also show that using the fast CA to optimize the intermediate process of a dismantling algorithm is an effective approach.
- Social network analysis provides meaningful information about the behavior of network members that can be used in diverse applications such as classification and link prediction. However, network analysis is computationally expensive because features must be learned for each application. In recent years, many studies have focused on feature learning methods for social networks. Network embedding maps a network into a lower-dimensional representation space that preserves its properties, yielding a compressed representation of the input network. In this paper, we introduce a novel network embedding algorithm named "CARE" that can be used for different types of networks, including weighted, directed and complex ones. While current methods try to preserve the local neighborhood information of nodes, we utilize both local neighborhood and community information of network nodes to cover the local and global structure of social networks. CARE builds customized paths, which consist of the local and global structure of network nodes, as a basis for network embedding and uses the skip-gram model to learn the representation vectors of nodes. Stochastic gradient descent is then used to optimize our objective function and learn the final representation of nodes. Our method scales to new nodes appended to the network without information loss. Parallelized generation of customized random walks is also used to speed up CARE. We evaluate the performance of CARE on multi-label classification and link prediction tasks. Experimental results on different networks indicate that the proposed method outperforms others in both Micro-F1 and Macro-F1 measures for different sizes of training data.
- Predicting fine-grained interests of users with temporal behavior is important to personalization and information filtering applications. However, existing interest prediction methods are incapable of capturing the subtle degrees of user interest towards particular items, and the time-varying drift of an individual's attention has not yet been studied. Moreover, the prediction process can also be affected by inter-personal influence, known as behavioral mutual infectivity. Inspired by the use of temporal point processes to model event data, in this paper we present a deep prediction method based on two recurrent neural networks (RNNs) that jointly model each user's continuous browsing history and asynchronous event sequences in the context of inter-user behavioral mutual infectivity. Our model is able to predict a user's fine-grained interest in a particular item and the corresponding timestamps at which events occur. The proposed approach captures the dynamic characteristics of event sequences more flexibly by using a temporal point process to model event data and updating its intensity function in a timely manner with RNNs. Furthermore, to improve the interpretability of the model, an attention mechanism is introduced to emphasize both intra-personal and inter-personal behavioral influence over time. Experiments on real datasets demonstrate that our model outperforms state-of-the-art methods in fine-grained user interest prediction.
- Oct 17 2017 cs.SI arXiv:1710.05660v1 In this paper, we describe quantitative graph theory and argue that it is a new graph-theoretical branch of network science, albeit one with significantly different features compared to classical graph theory. The main goal of quantitative graph theory is the structural quantification of information contained in complex networks by employing a measurement approach based on numerical invariants and comparisons. Furthermore, the methods as well as the networks do not need to be deterministic but can be statistical. As such, it complements the field of classical graph theory, which is descriptive and deterministic in nature. We provide examples of how quantitative graph theory can be used for novel applications in the context of the overarching concept of network science.
- Oct 17 2017 cs.SI arXiv:1710.05386v1
- Oct 17 2017 physics.soc-ph cs.SI arXiv:1710.05272v1
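
To make the dismantling entry above (arXiv:1710.05265v1) more concrete, the following is a minimal sketch of how a dismantling sequence can be scored over its intermediate states. It does not implement BPD, NEP, or the compound algorithm. The abstract does not define the general performance R, so the code assumes a common robustness-style measure, the average relative size of the largest connected component after each removal. The function names `dismantling_curve` and `general_performance` are hypothetical.

```python
from collections import defaultdict


def largest_component_size(adj, removed):
    """Size of the largest connected component once `removed` nodes are deleted."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        size = 0
        while stack:  # iterative depth-first search over surviving nodes
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best


def dismantling_curve(edges, removal_order):
    """Relative size of the largest component after each removal in the sequence."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    n = len(adj)
    removed = set()
    curve = []
    for node in removal_order:
        removed.add(node)
        curve.append(largest_component_size(adj, removed) / n)
    return curve


def general_performance(curve):
    """ASSUMPTION: R is taken as the mean of the curve; lower means faster dismantling."""
    return sum(curve) / len(curve)


# Path graph 0-1-2-3-4: removing the middle node first splits it immediately.
path_edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
middle_first = dismantling_curve(path_edges, [2, 1])
end_first = dismantling_curve(path_edges, [0, 2])
```

Under this assumed measure, two sequences that remove the same number of nodes can still differ in quality: on the path graph, removing the middle node first gives a lower average than starting from an endpoint. That is the kind of intermediate-state difference the abstract says a minimal vertex set alone does not capture.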