Discrete Mathematics (cs.DM)

  • PDF
    Social networks and interactions in social media involve both positive and negative relationships. Signed graphs capture both types of relationships: positive edges correspond to pairs of "friends", and negative edges to pairs of "foes". The edge sign prediction problem, which aims to predict whether an interaction between a pair of nodes will be positive or negative, is an important graph mining task for which many heuristics have recently been proposed [Leskovec 2010]. We model the edge sign prediction problem as follows: for any pair of nodes, we may query whether they belong to the same cluster, but the answer to each query is corrupted with some probability $0<q<\frac{1}{2}$. Let $\delta=1-2q$ be the bias. We provide an algorithm that, for any constant bias $\delta$, recovers all signs correctly with high probability in the presence of noise using $O(\frac{n\log n}{\delta^4})$ queries. Our algorithm uses breadth-first search as its main algorithmic primitive. A byproduct of our proposed learning algorithm is the use of $s$-$t$ paths as an informative feature to predict the sign of the edge $(s,t)$. As a heuristic, we use short edge-disjoint $s$-$t$ paths as a feature for predicting edge signs in real-world signed networks. Our findings suggest that the use of paths improves classification accuracy, especially for pairs of nodes with no common neighbors.
  • PDF
    The Curveball algorithm is a variation on the well-known switch-based Markov chain approaches for uniformly sampling binary matrices with fixed row and column sums. Instead of a switch, the Curveball algorithm performs a so-called binomial trade in every iteration. Intuitively, this could lead to a better convergence rate for reaching the stationary (uniform) distribution in certain cases. Some experimental evidence for this has been given in the literature. In this note we give a spectral gap comparison between two switch-based chains and the Curveball chain. In particular, this comparison allows us to conclude that the Curveball Markov chain is rapidly mixing whenever one of the two switch chains is rapidly mixing. Our analysis directly extends to the case of sampling binary matrices with forbidden entries (under the assumption of irreducibility). This in particular captures the case of sampling simple directed graphs with given degrees. As a by-product of our analysis, we show that the switch Markov chain of the Kannan-Tetali-Vempala conjecture has only non-negative eigenvalues if the sampled binary matrices have at least three columns. This shows that the Markov chain does not have to be made lazy, which is of independent interest. We also obtain an improved bound on the smallest eigenvalue for the switch Markov chain studied by Greenhill for uniformly sampling simple directed regular graphs.
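
To make the difference between a switch and a trade in the Curveball abstract above concrete, here is a minimal Python sketch of one step of each move on a binary matrix stored as a list of sets (each set holds the column indices of the 1s in that row). The representation and the names `switch_step`, `curveball_trade`, and `margins` are illustrative assumptions, not taken from the paper, and the sketch says nothing about how the chains select rows and columns or about their mixing behavior.

```python
import random

def switch_step(rows, i, j, a, b):
    """Try one switch on rows i, j and columns a, b.

    The 2x2 submatrix is flipped only if it is a "checkerboard"
    (1s on one diagonal, 0s on the other). Returns True if flipped.
    """
    if a in rows[i] and b not in rows[i] and b in rows[j] and a not in rows[j]:
        rows[i].remove(a); rows[i].add(b)
        rows[j].remove(b); rows[j].add(a)
        return True
    if b in rows[i] and a not in rows[i] and a in rows[j] and b not in rows[j]:
        rows[i].remove(b); rows[i].add(a)
        rows[j].remove(a); rows[j].add(b)
        return True
    return False

def curveball_trade(rows, i, j, rng):
    """Perform one trade between rows i and j.

    Columns where exactly one of the two rows has a 1 are pooled,
    shuffled, and redistributed so that each row keeps its row sum.
    Columns shared by both rows are left untouched.
    """
    only_i = rows[i] - rows[j]
    only_j = rows[j] - rows[i]
    shared = rows[i] & rows[j]
    pool = sorted(only_i | only_j)   # sorted for reproducibility with a seeded rng
    rng.shuffle(pool)
    k = len(only_i)                  # row i receives as many columns as it gave up
    rows[i] = shared | set(pool[:k])
    rows[j] = shared | set(pool[k:])

def margins(rows, n_cols):
    """Return (row sums, column sums) of the matrix."""
    row_sums = [len(r) for r in rows]
    col_sums = [sum(c in r for r in rows) for c in range(n_cols)]
    return row_sums, col_sums
```

Both moves preserve all row and column sums. A single trade can change many entries of the two chosen rows at once, whereas a switch changes exactly four entries or none, which is the intuition behind the hoped-for faster convergence.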