Feb 23 2018

cs.DM arXiv:1802.08189v1

In the Directed Steiner Network problem we are given an arc-weighted digraph $G$, a set of terminals $T \subseteq V(G)$, and an (unweighted) directed request graph $R$ with $V(R)=T$. Our task is to output a subgraph $G' \subseteq G$ of minimum cost such that there is a directed path from $s$ to $t$ in $G'$ for all $st \in A(R)$. It is known that the problem can be solved in time $|V(G)|^{O(|A(R)|)}$ [Feldman & Ruhl, SIAM J. Comput. 2006] and cannot be solved in time $|V(G)|^{o(|A(R)|)}$ even if $G$ is planar, unless the Exponential Time Hypothesis (ETH) fails [Chitnis et al., SODA 2014]. However, this reduction (like the other known hardness reductions for the problem) only shows that the problem cannot be solved in time $|V(G)|^{o(|T|)}$ unless ETH fails, which leaves a significant gap in the complexity with respect to $|T|$ in the exponent. We show that Directed Steiner Network is solvable in time $f(R)\cdot |V(G)|^{O(c_g \cdot |T|)}$, where $c_g$ is a constant depending solely on the genus of $G$ and $f$ is a computable function. We complement this result by showing that, unless ETH fails, there is no $f(R)\cdot |V(G)|^{o(|T|^2/ \log |T|)}$ algorithm for the problem on general graphs, for any function $f$.
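
To make the problem statement concrete, the following is a minimal brute-force sketch for tiny instances only. It enumerates every arc subset, so it takes exponential time and is unrelated to the algorithms cited above. The function names (`reachable`, `is_feasible`, `brute_force_dsn`) and the dictionary-based input format are assumptions made for illustration.

```python
from itertools import combinations


def reachable(arcs, source):
    """Return the set of vertices reachable from `source` using `arcs`."""
    out = {}
    for u, v in arcs:
        out.setdefault(u, []).append(v)
    seen = {source}
    stack = [source]
    while stack:
        u = stack.pop()
        for v in out.get(u, []):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def is_feasible(arcs, requests):
    """Check that every request (s, t) has a directed s-to-t path in `arcs`."""
    return all(t in reachable(arcs, s) for s, t in requests)


def brute_force_dsn(weighted_arcs, requests):
    """Try every arc subset and keep the cheapest feasible one.

    weighted_arcs: dict mapping (u, v) to a nonnegative cost (hypothetical format).
    requests: list of (s, t) pairs, i.e. the arcs of the request graph R.
    Returns (cost, set_of_arcs), or None if no subset satisfies all requests.
    """
    arcs = list(weighted_arcs)
    best = None
    for r in range(len(arcs) + 1):
        for subset in combinations(arcs, r):
            if is_feasible(subset, requests):
                cost = sum(weighted_arcs[a] for a in subset)
                if best is None or cost < best[0]:
                    best = (cost, set(subset))
    return best


# Toy instance: the two-hop route a -> b -> c is cheaper than the direct arc a -> c.
weights = {("a", "b"): 1, ("b", "c"): 1, ("a", "c"): 3, ("c", "a"): 1}
solution = brute_force_dsn(weights, [("a", "c"), ("c", "a")])
```

On this toy instance the request graph asks for paths in both directions between `a` and `c`, so any feasible solution must contain the arc `("c", "a")` together with some route from `a` to `c`.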

Applied researchers often construct a network from a random sample of nodes in order to infer properties of the parent network. Two of the most widely used sampling schemes are subgraph sampling, where we sample each vertex independently with probability $p$ and observe the subgraph induced by the sampled vertices, and neighborhood sampling, where we additionally observe the edges between the sampled vertices and their neighbors. In this paper, we study the problem of estimating the number of motifs as induced subgraphs under both models from a statistical perspective. Let $h$ be any connected motif on $k$ vertices, and consider estimating $s=\mathsf{s}(h,G)$, the number of copies of $h$ in a parent graph $G$ of maximum degree $d$, with a multiplicative error of $\epsilon$. We show that: (a) for subgraph sampling, the optimal sampling ratio $p$ is $\Theta_{k}(\max\{ (s\epsilon^2)^{-\frac{1}{k}}, \; \frac{d^{k-1}}{s\epsilon^{2}} \})$, achieved by Horvitz-Thompson type estimators; (b) for neighborhood sampling, we propose a family of estimators, encompassing and outperforming the Horvitz-Thompson estimator and achieving the sampling ratio $O_{k}(\min\{ (\frac{d}{s\epsilon^2})^{\frac{1}{k-1}}, \; \sqrt{\frac{d^{k-2}}{s\epsilon^2}}\})$. The latter is shown to be optimal for all motifs with at most $4$ vertices and for cliques of all sizes. The matching minimax lower bounds are established using certain algebraic properties of subgraph counts. These results quantify how much more informative neighborhood sampling is than subgraph sampling, as empirically verified by experiments on both synthetic and real-world data. We also address the issue of adaptation to the unknown maximum degree, and study specific problems for parent graphs with additional structure, e.g., trees or planar graphs.
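
As a minimal sketch of the Horvitz-Thompson idea under subgraph sampling, consider the special case where the motif is a triangle ($k=3$). Every triangle of the parent graph appears in the sampled induced subgraph with probability $p^3$, so dividing the observed triangle count by $p^3$ gives an unbiased estimate of the true count. The function names, the adjacency-dictionary representation, and the choice of a triangle as the motif are assumptions for illustration, not the paper's implementation.

```python
import random


def count_triangles(adj):
    """Count triangles in an undirected graph given as {vertex: set_of_neighbors}.

    Vertices must be mutually comparable (e.g. integers); each triangle
    u < v < w is counted exactly once.
    """
    count = 0
    for u in adj:
        for v in adj[u]:
            if v <= u:
                continue
            for w in adj[u] & adj[v]:
                if w > v:
                    count += 1
    return count


def subgraph_sample(adj, p, rng):
    """Keep each vertex independently with probability p; return the induced subgraph."""
    kept = {v for v in adj if rng.random() < p}
    return {v: adj[v] & kept for v in kept}


def ht_triangle_estimate(adj, p, rng):
    """Horvitz-Thompson type estimate: observed triangle count divided by p**3."""
    sampled = subgraph_sample(adj, p, rng)
    return count_triangles(sampled) / p ** 3


# Usage: average the estimator over repeated samples of a complete graph on 8 vertices.
parent = {v: {u for u in range(8) if u != v} for v in range(8)}
rng = random.Random(0)
estimates = [ht_triangle_estimate(parent, 0.5, rng) for _ in range(200)]
average_estimate = sum(estimates) / len(estimates)
true_count = count_triangles(parent)
```

Individual estimates can be far from `true_count` when $p$ is small, which is the variance issue that the sampling-ratio bounds in (a) quantify.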