# Search SciRate

### results for au:Korman_A in:cs

• May 17 2017 cs.DC arXiv:1705.05704v1
We consider a parallel version of a classical Bayesian search problem. $k$ agents are looking for a treasure that is placed in one of the boxes indexed by $\mathbb{N}^+$ according to a known distribution $p$. The aim is to minimize the expected time until the first agent finds it. Searchers run in parallel where at each time step each searcher can "peek" into a box. A basic family of algorithms which are inherently robust is non-coordinating algorithms. Such algorithms act independently at each searcher, differing only by their probabilistic choices. We are interested in the price incurred by employing such algorithms when compared with the case of full coordination. We first show that there exists a non-coordinating algorithm that, knowing only the relative likelihood of boxes according to $p$, has expected running time of at most $10+4(1+\frac{1}{k})^2 T$, where $T$ is the expected running time of the best fully coordinated algorithm. This result is obtained by applying a refined version of the main algorithm suggested by Fraigniaud, Korman and Rodeh in STOC'16, which was designed for the context of linear parallel search. We then describe an optimal non-coordinating algorithm for the case where the distribution $p$ is known. The running time of this algorithm is difficult to analyse in general, but we calculate it for several examples. In the case where $p$ is uniform over a finite set of boxes, the algorithm just checks boxes uniformly at random among all non-checked boxes and is essentially $2$ times worse than the coordinating algorithm. We also show simple algorithms for Pareto distributions over $M$ boxes. That is, in the case where $p(x) \sim 1/x^b$ for $0< b < 1$, we suggest the following algorithm: at step $t$ choose uniformly from the boxes unchecked in $\{1, \ldots, \min(M, \lfloor t/\sigma\rfloor)\}$, where $\sigma = b/(b + k - 1)$.
It turns out this algorithm is asymptotically optimal, and runs about $2+b$ times worse than the case of full coordination.
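The Pareto-case rule quoted above can be sketched as a small simulation. This is only an illustration under stated assumptions: each searcher keeps its own set of checked boxes and draws independently of the others, and the names `sigma`, `search_time` and the `rng` parameter are hypothetical, not taken from the paper.

```python
import math
import random


def sigma(b, k):
    """Window parameter from the abstract: sigma = b / (b + k - 1)."""
    return b / (b + k - 1)


def search_time(treasure, M, b, k, rng):
    """Simulate k non-coordinating searchers; return the step at which
    the first of them peeks into the treasure box (1 <= treasure <= M)."""
    if not 1 <= treasure <= M:
        raise ValueError("treasure must lie in 1..M")
    s = sigma(b, k)
    checked = [set() for _ in range(k)]  # one private history per searcher
    t = 0
    while True:
        t += 1
        window = min(M, math.floor(t / s))
        for seen in checked:
            # Boxes in the current window this searcher has not opened yet.
            candidates = [x for x in range(1, window + 1) if x not in seen]
            if not candidates:
                continue
            box = rng.choice(candidates)
            seen.add(box)
            if box == treasure:
                return t


rng = random.Random(0)
print(search_time(treasure=10, M=50, b=0.5, k=3, rng=rng))
```

Because $\sigma \le 1$, the window at step $t$ always holds at least $t$ boxes, so every searcher opens a new box at each step and the simulation stops after at most $M$ steps.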
• We introduce the dependent doors problem as an abstraction for situations in which one must perform a sequence of possibly dependent decisions, without receiving feedback information on the effectiveness of previously made actions. Informally, the problem considers a set of $d$ doors that are initially closed, and the aim is to open all of them as fast as possible. To open a door, the algorithm knocks on it and it might open or not according to some probability distribution. This distribution may depend on which other doors are currently open, as well as on which other doors were open during each of the previous knocks on that door. The algorithm aims to minimize the expected time until all doors open. Crucially, it must act at any time without knowing whether or which other doors have already opened. In this work, we focus on scenarios where dependencies between doors are both positively correlated and acyclic. The fundamental distribution of a door describes the probability it opens in the best of conditions (with respect to other doors being open or closed). We show that if in two configurations of $d$ doors corresponding doors share the same fundamental distribution, then these configurations have the same optimal running time up to a universal constant, no matter what are the dependencies between doors and what are the distributions. We also identify algorithms that are optimal up to a universal constant factor. For the case in which all doors share the same fundamental distribution we additionally provide a simpler algorithm, and a formula to calculate its running time. We furthermore analyse the price of lacking feedback for several configurations governed by standard fundamental distributions. In particular, we show that the price is logarithmic in $d$ for memoryless doors, but can potentially grow to be linear in $d$ for other distributions. We then turn our attention to investigate precise bounds.
Even for the case of two doors, identifying the optimal sequence is an intriguing combinatorial question. Here, we study the case of two cascading memoryless doors. That is, the first door opens on each knock independently with probability $p_1$. The second door can only open if the first door is open, in which case it will open on each knock independently with probability $p_2$. We solve this problem almost completely by identifying algorithms that are optimal up to an additive term of 1.
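To make the two cascading memoryless doors concrete, the expected completion time of any fixed knocking sequence can be evaluated by tracking two probabilities. The sketch below is an assumption-laden illustration, not the paper's method: `expected_open_time` and `schedule` are hypothetical names, and the infinite sum $\sum_{t \ge 0} \Pr[\text{not all open after } t \text{ knocks}]$ is truncated at `horizon`, so the result is a lower estimate.

```python
def expected_open_time(schedule, p1, p2, horizon=10_000):
    """Expected number of knocks until both doors are open.

    schedule(t) returns 1 or 2: the door knocked at (0-based) knock t.
    The sequence is fixed in advance, since there is no feedback.
    """
    closed1 = 1.0  # P(door 1 still closed)
    only1 = 0.0    # P(door 1 open, door 2 still closed)
    total = 0.0
    for t in range(horizon):
        total += closed1 + only1  # P(not finished after t knocks)
        if schedule(t) == 1:
            only1 += closed1 * p1
            closed1 *= 1 - p1
        else:
            # Door 2 can open only when door 1 is already open.
            only1 *= 1 - p2
    return total


def alternate(t):
    """Knock door 1 on even knocks and door 2 on odd knocks."""
    return 1 if t % 2 == 0 else 2


print(expected_open_time(alternate, p1=0.5, p2=0.5))
```

Comparing different schedules for the same $p_1$ and $p_2$ gives a feel for why the choice of sequence matters when no feedback is available.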
• Jan 11 2017 cs.DC arXiv:1701.02555v1
We introduce the Ants Nearby Treasure Search (ANTS) problem, which models natural cooperative foraging behavior such as that performed by ants around their nest. In this problem, $k$ probabilistic agents, initially placed at a central location, collectively search for a treasure on the two-dimensional grid. The treasure is placed at a target location by an adversary and the agents' goal is to find it as fast as possible as a function of both $k$ and $D$, where $D$ is the (unknown) distance between the central location and the target. We concentrate on the case in which agents cannot communicate while searching. It is straightforward to see that the time until at least one agent finds the target is at least $\Omega(D + D^2/k)$, even for very sophisticated agents, with unrestricted memory. Our algorithmic analysis aims at establishing connections between the time complexity and the initial knowledge held by agents (e.g., regarding their total number $k$), as they commence the search. We provide a range of both upper and lower bounds for the initial knowledge required for obtaining fast running time. For example, we prove that $\log\log k + \Theta(1)$ bits of initial information are both necessary and sufficient to obtain asymptotically optimal running time, i.e., $O(D + D^2/k)$. We also prove that for every $0 < \epsilon < 1$, running in time $O(\log^{1-\epsilon} k \cdot (D + D^2/k))$ requires that agents have the capacity for storing $\Omega(\log k)$ different states as they leave the nest to start the search. To the best of our knowledge, the lower bounds presented in this paper provide the first non-trivial lower bounds on the memory complexity of probabilistic agents in the context of search problems. We view this paper as a "proof of concept" for a new type of interdisciplinary methodology. To fully demonstrate this methodology, the theoretical tradeoff presented here (or a similar one) should be combined with measurements of the time performance of searching ants.
• In this paper we solve the ancestry-labeling scheme problem which aims at assigning the shortest possible labels (bit strings) to nodes of rooted trees, so that ancestry queries between any two nodes can be answered by inspecting their assigned labels only. This problem was introduced more than twenty years ago by Kannan et al. [STOC '88], and is among the most well-studied problems in the field of informative labeling schemes. We construct an ancestry-labeling scheme for $n$-node trees with label size $\log_2 n + O(\log\log n)$ bits, thus matching the $\log_2 n + \Omega(\log\log n)$ bits lower bound given by Alstrup et al. [SODA '03]. Our scheme is based on a simplified ancestry scheme that operates extremely well on a restricted set of trees. In particular, for the set of $n$-node trees with depth at most $d$, the simplified ancestry scheme enjoys label size of $\log_2 n + 2\log_2 d + O(1)$ bits. Since the depth of most XML trees is at most some small constant, such an ancestry scheme may be of practical use. In addition, we also obtain an adjacency-labeling scheme that labels $n$-node trees of depth $d$ with labels of size $\log_2 n + 3\log_2 d + O(1)$ bits. All our schemes assign the labels in linear time, and guarantee that any query can be answered in constant time. Finally, our ancestry scheme finds applications to the construction of small universal partially ordered sets (posets). Specifically, for any fixed integer $k$, it enables the construction of a universal poset of size $\tilde{O}(n^k)$ for the family of $n$-element posets with tree-dimension at most $k$. Up to lower order terms, this bound is tight thanks to a lower bound of $n^{k-o(1)}$ due to Alon and Scheinerman [Order '88].
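For readers unfamiliar with ancestry labeling, here is a minimal sketch of the classical interval-based approach, which uses two integers (roughly $2\log_2 n$ bits) per node. It is not the $\log_2 n + O(\log\log n)$ scheme of the abstract; it only illustrates what "answering ancestry queries from labels alone" means. The function names and the `children` dictionary layout are assumptions made for this example.

```python
def interval_labels(children, root):
    """Label each node with (pre, last): its DFS preorder number and the
    largest preorder number found in its subtree."""
    pre, labels = {}, {}
    counter = 0
    stack = [(root, False)]
    while stack:
        node, finished = stack.pop()
        if finished:
            # All descendants were numbered; counter - 1 is the last one.
            labels[node] = (pre[node], counter - 1)
            continue
        pre[node] = counter
        counter += 1
        stack.append((node, True))
        for child in reversed(children.get(node, [])):
            stack.append((child, False))
    return labels


def is_ancestor(label_u, label_v):
    """True if u is an ancestor of v (a node counts as its own ancestor)."""
    return label_u[0] <= label_v[0] <= label_u[1]


tree = {"r": ["a", "b"], "a": ["c"]}
labels = interval_labels(tree, "r")
print(labels["r"], labels["c"], is_ancestor(labels["r"], labels["c"]))
```

The query inspects only the two labels, never the tree itself, which is the defining property of a labeling scheme.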
• Nov 07 2016 cs.DS arXiv:1611.01403v2
We consider a search problem on trees using unreliable guiding instructions. Specifically, an agent starts a search at the root of a tree aiming to find a treasure hidden at one of the nodes by an adversary. Each visited node holds information, called advice, regarding the most promising neighbor to continue the search. However, the memory holding this information may be unreliable. Modeling this scenario, we focus on a probabilistic setting. That is, the advice at a node is a pointer to one of its neighbors. With probability $q$ each node is faulty, independently of other nodes, in which case its advice points at an arbitrary neighbor, chosen u.a.r. Otherwise, the node is sound and necessarily points at the correct neighbor. Crucially, the advice is permanent, in the sense that querying a node several times would yield the same answer. We evaluate the agent's efficiency by two measures: The move complexity denotes the expected number of edge traversals, and the query complexity denotes the expected number of queries. Let $\Delta$ denote the maximal degree. Roughly speaking, the main message of this paper is that in order to obtain efficient search, $1/\sqrt{\Delta}$ is a threshold for the noise parameter $q$. Essentially, we prove that above the threshold, every search algorithm has query complexity (and move complexity) which is both exponential in the depth $d$ of the treasure and polynomial in the number of nodes $n$. Conversely, below the threshold, there exists an algorithm with move complexity $O(d\sqrt{\Delta})$, and an algorithm with query complexity $O(\sqrt{\Delta}\log \Delta \log^2 n)$. Moreover, for the case of regular trees, we obtain an algorithm with query complexity $O(\sqrt{\Delta}\log n\log\log n)$. The move complexity bound is tight below the threshold and the query complexity bounds are not far from the lower bound of $\Omega(\sqrt{\Delta}\log_\Delta n)$.
• In STOC'16, Fraigniaud et al. consider the problem of finding a treasure hidden in one of many boxes that are ordered by importance. That is, if a treasure is in a more important box, then one would like to find it faster. Assuming there are many searchers, the authors suggest that using an algorithm that requires no coordination between searchers can be highly beneficial. Indeed, besides saving the need for a communication and coordination mechanism, such algorithms enjoy inherent robustness. The authors proceed to solve this linear search problem in the case of countably many boxes and an adversary placed treasure, and prove that the best speed-up possible by $k$ non-coordinating searchers is precisely $\frac{k}{4}(1+1/k)^2$. In particular, this means that asymptotically, the speed-up is four times worse compared to the case of full coordination. We suggest an important variant of the problem, where the treasure is placed uniformly at random in one of a finite, large, number of boxes. We devise non-coordinating algorithms that achieve a speed-up of $6/5$ for two searchers, a speed-up of $3/2$ for three searchers, and in general, a speed-up of $k(k+1)/(3k-1)$ for any $k \geq 1$ searchers. Thus, as $k$ grows to infinity, the speed-up approaches three times worse compared to the case of full coordination. Moreover, these bounds are tight in a strong sense as no non-coordinating search algorithm for $k$ searchers can achieve better speed-ups. We also devise non-coordinating algorithms that use only logarithmic memory in the size of the search domain, and yet, asymptotically, achieve the optimal speed-up. Finally, we note that all our algorithms are extremely simple and hence applicable.
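The speed-up formulas quoted above can be checked with exact rational arithmetic. The sketch below simply evaluates $k(k+1)/(3k-1)$ for the uniform setting and $\frac{k}{4}(1+1/k)^2 = (k+1)^2/(4k)$ for the adversarial setting; the function names are chosen for this example only.

```python
from fractions import Fraction


def speedup_uniform(k):
    """Speed-up k(k+1)/(3k-1) for a uniformly placed treasure."""
    return Fraction(k * (k + 1), 3 * k - 1)


def speedup_adversarial(k):
    """Speed-up (k/4)(1+1/k)^2 = (k+1)^2/(4k) for an adversarial placement."""
    return Fraction((k + 1) ** 2, 4 * k)


for k in (1, 2, 3, 1000):
    u, a = speedup_uniform(k), speedup_adversarial(k)
    # k / speed-up is the loss factor relative to full coordination.
    print(k, u, a, float(k / u), float(k / a))
```

For $k = 2$ and $k = 3$ this reproduces $6/5$ and $3/2$, and for large $k$ the loss factors $k/\text{speed-up}$ approach 3 and 4 respectively, matching the "three times worse" and "four times worse" statements.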
• This paper considers the basic $\mathcal{PULL}$ model of communication, in which in each round, each agent extracts information from few randomly chosen agents. We seek to identify the smallest amount of information revealed in each interaction (message size) that nevertheless allows for efficient and robust computations of fundamental information dissemination tasks. We focus on the Majority Bit Dissemination problem that considers a population of $n$ agents, with a designated subset of source agents. Each source agent holds an input bit and each agent holds an output bit. The goal is to let all agents converge their output bits on the most frequent input bit of the sources (the majority bit). Note that the particular case of a single source agent corresponds to the classical problem of Broadcast. We concentrate on the severe fault-tolerant context of self-stabilization, in which a correct configuration must be reached eventually, despite all agents starting the execution with arbitrary initial states. We first design a general compiler which can essentially transform any self-stabilizing algorithm with a certain property that uses $\ell$-bits messages to one that uses only $\log \ell$-bits messages, while paying only a small penalty in the running time. By applying this compiler recursively we then obtain a self-stabilizing Clock Synchronization protocol, in which agents synchronize their clocks modulo some given integer $T$, within $\tilde O(\log n\log T)$ rounds w.h.p., and using messages that contain $3$ bits only. We then employ the new Clock Synchronization tool to obtain a self-stabilizing Majority Bit Dissemination protocol which converges in $\tilde O(\log n)$ time, w.h.p., on every initial configuration, provided that the ratio of sources supporting the minority opinion is bounded away from half. Moreover, this protocol also uses only 3 bits per interaction.
• Jan 07 2016 cs.DC arXiv:1601.01104v1
"How well connected is the network?" This is one of the most fundamental questions one would ask when facing the challenge of designing a communication network. Three major notions of connectivity have been considered in the literature, but in the context of traditional (single-layer) networks, they turn out to be equivalent. This paper introduces a model for studying the three notions of connectivity in multi-layer networks. Using this model, it is easy to demonstrate that in multi-layer networks the three notions may differ dramatically. Unfortunately, in contrast to the single-layer case, where the values of the three connectivity notions can be computed efficiently, it has been recently shown in the context of WDM networks (results that can be easily translated to our model) that the values of two of these notions of connectivity are hard to compute or even approximate in multi-layer networks. The current paper sheds some positive light on the multi-layer connectivity topic: we show that the value of the third connectivity notion can be computed in polynomial time and develop an approximation for the construction of well connected overlay networks.
• This paper demonstrates the usefulness of distributed local verification of proofs, as a tool for the design of self-stabilizing algorithms. In particular, it introduces a somewhat generalized notion of distributed local proofs, and utilizes it for improving the time complexity significantly, while maintaining space optimality. As a result, we show that optimizing the memory size carries at most a small cost in terms of time, in the context of Minimum Spanning Tree (MST). That is, we present algorithms that are both time and space efficient for both constructing an MST and for verifying it. This involves several parts that may be considered contributions in themselves. First, we generalize the notion of local proofs, trading off the time complexity for memory efficiency. This adds a dimension to the study of distributed local proofs, which has been gaining attention recently. Specifically, we design a (self-stabilizing) proof labeling scheme which is memory optimal (i.e., $O(\log n)$ bits per node), and whose time complexity is $O(\log ^2 n)$ in synchronous networks, or $O(\Delta \log ^3 n)$ time in asynchronous ones, where $\Delta$ is the maximum degree of nodes. This answers an open problem posed by Awerbuch and Varghese (FOCS 1991). We also show that $\Omega(\log n)$ time is necessary, even in synchronous networks. Another property is that if $f$ faults occurred, then, within the required detection time above, they are detected by some node in the $O(f\log n)$ locality of each of the faults. Second, we show how to enhance a known transformer that makes input/output algorithms self-stabilizing. It now takes as input an efficient construction algorithm and an efficient self-stabilizing proof labeling scheme, and produces an efficient self-stabilizing algorithm. When used for MST, the transformer produces a memory optimal self-stabilizing algorithm, whose time complexity, namely, $O(n)$, is significantly better even than that of previous algorithms.
(The time complexity of previous MST algorithms that used $\Omega(\log^2 n)$ memory bits per node was $O(n^2)$, and the time for optimal space algorithms was $O(n|E|)$.) Inherited from our proof labelling scheme, our self-stabilising MST construction algorithm also has the following two properties: (1) if faults occur after the construction ended, then they are detected by some nodes within $O(\log ^2 n)$ time in synchronous networks, or within $O(\Delta \log ^3 n)$ time in asynchronous ones, and (2) if $f$ faults occurred, then, within the required detection time above, they are detected within the $O(f\log n)$ locality of each of the faults. We also show how to improve the above two properties, at the expense of some increase in the memory.
• The issue of identifiers is crucial in distributed computing. Informally, identities are used for tackling two of the fundamental difficulties that are inherent to deterministic distributed computing, namely: (1) symmetry breaking, and (2) topological information gathering. In the context of local computation, i.e., when nodes can gather information only from nodes at bounded distances, some insight regarding the role of identities has been established. For instance, it was shown that, for large classes of construction problems, the role of the identities can be rather small. However, for the identities to play no role, some other kinds of mechanisms for breaking symmetry must be employed, such as edge-labeling or sense of direction. When it comes to local distributed decision problems, the specification of the decision task does not seem to involve symmetry breaking. Therefore, it is expected that, assuming nodes can gather sufficient information about their neighborhood, one could get rid of the identities, without employing extra mechanisms for breaking symmetry. We tackle this question in the framework of the $\mathcal{LOCAL}$ model. Let $\mathrm{LD}$ be the class of all problems that can be decided in a constant number of rounds in the $\mathcal{LOCAL}$ model. Similarly, let $\mathrm{LD}^*$ be the class of all problems that can be decided at constant cost in the anonymous variant of the $\mathcal{LOCAL}$ model, in which nodes have no identities, but each node can get access to the (anonymous) ball of radius $t$ around it, for any $t$, at a cost of $t$. It is clear that $\mathrm{LD}^*\subseteq \mathrm{LD}$. We conjecture that $\mathrm{LD}^*=\mathrm{LD}$. In this paper, we give several pieces of evidence supporting this conjecture. In particular, we show that it holds for hereditary problems, as well as when the nodes know an arbitrary upper bound on the total number of nodes.
Moreover, we prove that the conjecture holds in the context of non-deterministic local decision, where nodes are given certificates (independent of the identities, if they exist), and the decision consists in verifying these certificates. In short, we prove that $\mathrm{NLD}^*=\mathrm{NLD}$.
• This paper introduces the notion of distributed verification without preprocessing. It focuses on the Minimum-weight Spanning Tree (MST) verification problem and establishes tight upper and lower bounds for the time and message complexities of this problem. Specifically, we provide an MST verification algorithm that achieves simultaneously $O(m)$ messages and $O(\sqrt{n}+D)$ time, where $m$ is the number of edges in the given graph $G$, $n$ is the number of nodes, and $D$ is $G$'s diameter. On the other hand, we show that any MST verification algorithm must send $\Omega(m)$ messages and incur $\Omega(\sqrt{n} + D)$ time in the worst case. Our upper bound result appears to indicate that the verification of an MST may be easier than its construction, since for MST construction, both lower bounds of $\Omega(m)$ messages and $\Omega(\sqrt{n}+D)$ time hold, but at the moment there is no known distributed algorithm that constructs an MST and achieves simultaneously $O(m)$ messages and $O(\sqrt{n} + D)$ time. Specifically, the best known time-optimal algorithm (using $O(\sqrt{n} + D)$ time) requires $O(m + n^{3/2})$ messages, and the best known message-optimal algorithm (using $O(m)$ messages) requires $O(n)$ time. On the other hand, our lower bound results indicate that the verification of an MST is not significantly easier than its construction.
• Numerous sophisticated local algorithms were suggested in the literature for various fundamental problems. Notable examples are the MIS and $(\Delta+1)$-coloring algorithms by Barenboim and Elkin [6], by Kuhn [22], and by Panconesi and Srinivasan [34], as well as the $O(\Delta^2)$-coloring algorithm by Linial [28]. Unfortunately, most known local algorithms (including, in particular, the aforementioned algorithms) are non-uniform, that is, local algorithms generally use good estimations of one or more global parameters of the network, e.g., the maximum degree $\Delta$ or the number of nodes $n$. This paper provides a method for transforming a non-uniform local algorithm into a uniform one. Furthermore, the resulting algorithm enjoys the same asymptotic running time as the original non-uniform algorithm. Our method applies to a wide family of both deterministic and randomized algorithms. Specifically, it applies to almost all state of the art non-uniform algorithms for MIS and Maximal Matching, as well as to many results concerning the coloring problem. (In particular, it applies to all aforementioned algorithms.) To obtain our transformations we introduce a new distributed tool called pruning algorithms, which we believe may be of independent interest.
• We analyze parallel algorithms in the context of exhaustive search over totally ordered sets. Imagine an infinite list of "boxes", with a "treasure" hidden in one of them, where the boxes' order reflects the importance of finding the treasure in a given box. At each time step, a search protocol executed by a searcher has the ability to peek into one box, and see whether the treasure is present or not. By equally dividing the workload between them, $k$ searchers can find the treasure $k$ times faster than one searcher. However, this straightforward strategy is very sensitive to failures (e.g., crashes of processors), and overcoming this issue seems to require a large amount of communication. We therefore address the question of designing parallel search algorithms maximizing their speed-up and maintaining high levels of robustness, while minimizing the amount of resources for coordination. Based on the observation that algorithms that avoid communication are inherently robust, we analyze the best running time performance of non-coordinating algorithms. Specifically, we devise non-coordinating algorithms that achieve a speed-up of $9/8$ for two searchers, a speed-up of $4/3$ for three searchers, and in general, a speed-up of $\frac{k}{4}(1+1/k)^2$ for any $k\geq 1$ searchers. Thus, asymptotically, the speed-up is only four times worse compared to the case of full coordination, and our algorithms are surprisingly simple and hence applicable. Moreover, these bounds are tight in a strong sense as no non-coordinating search algorithm can achieve better speed-ups. Overall, we highlight that, in faulty contexts in which coordination between the searchers is technically difficult to implement, intrusive with respect to privacy, and/or costly in terms of resources, it might well be worth giving up on coordination, and simply run our non-coordinating exhaustive search algorithms.
• We consider the External Clock Synchronization problem in dynamic sensor networks. Initially, sensors obtain inaccurate estimations of an external time reference and subsequently collaborate in order to synchronize their internal clocks with the external time. For simplicity, we adopt the drift-free assumption, where internal clocks are assumed to tick at the same pace. Hence, the problem is reduced to an estimation problem, in which the sensors need to estimate the initial external time. This work is further relevant to the problem of collective approximation of environmental values by biological groups. Unlike most works on clock synchronization that assume static networks, this paper focuses on an extreme case of highly dynamic networks. Specifically, we assume a non-adaptive scheduler adversary that dictates in advance an arbitrary, yet independent, meeting pattern. Such meeting patterns fit, for example, with short-time scenarios in highly dynamic settings, where each sensor interacts with only few other arbitrary sensors. We propose an extremely simple clock synchronization algorithm that is based on weighted averages, and prove that its performance on any given independent meeting pattern is highly competitive with that of the best possible algorithm, which operates without any resource or computational restrictions, and knows the meeting pattern in advance. In particular, when all distributions involved are Gaussian, the performances of our scheme coincide with the optimal performances. Our proofs rely on an extensive use of the concept of Fisher information. We use the Cramer-Rao bound and our definition of a Fisher Channel Capacity to quantify information flows and to obtain lower bounds on collective performance. This opens the door for further rigorous quantifications of information flows within collaborative sensors.
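The abstract mentions a scheme based on weighted averages that is optimal in the Gaussian case. One standard way to see why a weighted average can be optimal for Gaussians is inverse-variance weighting, sketched below. This is an assumption made for illustration: the abstract does not spell out the weights, and `combine` is a hypothetical helper that presumes two independent, unbiased estimates of the same reference time with known variances.

```python
def combine(est_a, var_a, est_b, var_b):
    """Merge two independent unbiased estimates by inverse-variance weights.

    The precision 1/variance plays the role of Fisher information for a
    Gaussian, and precisions of independent estimates add up.
    """
    info_a, info_b = 1.0 / var_a, 1.0 / var_b
    total_info = info_a + info_b
    estimate = (info_a * est_a + info_b * est_b) / total_info
    return estimate, 1.0 / total_info


# Two sensors meet: one is fairly accurate, the other is noisy.
print(combine(est_a=0.0, var_a=1.0, est_b=10.0, var_b=9.0))
```

The merged variance is never larger than either input variance, so a sensor applying this rule at every meeting can only become more accurate under these assumptions.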
• Distributed computing models typically assume reliable communication between processors. While such assumptions often hold for engineered networks, e.g., due to underlying error correction protocols, their relevance to biological systems, wherein messages are often distorted before reaching their destination, is quite limited. In this study we take a first step towards reducing this gap by rigorously analyzing a model of communication in large anonymous populations composed of simple agents which interact through short and highly unreliable messages. We focus on the broadcast problem and the majority-consensus problem. Both are fundamental information dissemination problems in distributed computing, in which the goal of agents is to converge to some prescribed desired opinion. We initiate the study of these problems in the presence of communication noise. Our model for communication is extremely weak and follows the push gossip communication paradigm: In each round each agent that wishes to send information delivers a message to a random anonymous agent. This communication is further restricted to contain only one bit (essentially representing an opinion). Lastly, the system is assumed to be so noisy that the bit in each message sent is flipped independently with probability $1/2-\epsilon$, for some small $\epsilon >0$. Even in this severely restricted, stochastic and noisy setting we give natural protocols that solve the noisy broadcast and the noisy majority-consensus problems efficiently. Our protocols run in $O(\log n / \epsilon^2)$ rounds and use $O(n \log n / \epsilon^2)$ messages/bits in total, where $n$ is the number of agents. These bounds are asymptotically optimal and, in fact, are as fast and message efficient as if each agent would have been simultaneously informed directly by an agent that knows the prescribed desired opinion.
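To give a feel for the $1/\epsilon^2$ factor in the bounds above, consider a single agent that receives $m$ independent noisy copies of one bit, each flipped with probability $1/2-\epsilon$, and takes a majority vote. The sketch below computes the exact probability that this vote is wrong. It is a simplified illustration and not the paper's protocol; `majority_error` is a name chosen for this example.

```python
from math import comb


def majority_error(m, eps):
    """Probability that a majority vote over m noisy copies is wrong,
    when each copy is correct independently with probability 1/2 + eps."""
    if m % 2 == 0:
        raise ValueError("use an odd m to avoid ties")
    p = 0.5 + eps
    # The vote fails when at most (m - 1) / 2 copies are correct.
    return sum(
        comb(m, i) * p**i * (1 - p) ** (m - i)
        for i in range((m - 1) // 2 + 1)
    )


for m in (1, 11, 101, 1001):
    print(m, majority_error(m, eps=0.1))
```

With $\epsilon = 0.1$ a single copy is wrong 40% of the time, and the error drops quickly only once $m$ is on the order of $1/\epsilon^2$ or more.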
• Do unique node identifiers help in deciding whether a network $G$ has a prescribed property $P$? We study this question in the context of distributed local decision, where the objective is to decide whether $G \in P$ by having each node run a constant-time distributed decision algorithm. If $G \in P$, all the nodes should output yes; if $G \notin P$, at least one node should output no. A recent work (Fraigniaud et al., OPODIS 2012) studied the role of identifiers in local decision and gave several conditions under which identifiers are not needed. In this article, we answer their original question. More than that, we do so under all combinations of the following two critical variations on the underlying model of distributed computing: ($B$): the size of the identifiers is bounded by a function of the size of the input network; as opposed to ($\neg B$): the identifiers are unbounded. ($C$): the nodes run a computable algorithm; as opposed to ($\neg C$): the nodes can compute any, possibly uncomputable function. While it is easy to see that under ($\neg B, \neg C$) identifiers are not needed, we show that under all other combinations there are properties that can be decided locally if and only if identifiers are present. Our constructions use ideas from classical computability theory.
• The difference between the speed of the actions of different processes is typically considered as an obstacle that makes the achievement of cooperative goals more difficult. In this work, we aim to highlight potential benefits of such asynchrony phenomena to tasks involving symmetry breaking. Specifically, in this paper, identical (except for their speeds) mobile agents are placed at arbitrary locations on a cycle of length $n$ and use their speed difference in order to rendezvous fast. We normalize the speed of the slower agent to be 1, and fix the speed of the faster agent to be some $c>1$. (An agent does not know whether it is the slower agent or the faster one.) The straightforward distributed-race DR algorithm is the one in which both agents simply start walking until rendezvous is achieved. It is easy to show that, in the worst case, the rendezvous time of DR is $n/(c-1)$. Note that in the interesting case, where $c$ is very close to 1 this bound becomes huge. Our first result is a lower bound showing that, up to a multiplicative factor of 2, this bound is unavoidable, even in a model that allows agents to leave arbitrary marks, even assuming sense of direction, and even assuming $n$ and $c$ are known to agents. That is, we show that under such assumptions, the rendezvous time of any algorithm is at least $\frac{n}{2(c-1)}$ if $c\leq 3$ and slightly larger if $c>3$. We then construct an algorithm that precisely matches the lower bound for the case $c\leq 2$, and almost matches it when $c>2$. Moreover, our algorithm performs under weaker assumptions than those stated above, as it does not assume sense of direction, and it allows agents to leave only a single mark (a pebble) and only at the place where they start the execution. Finally, we investigate the setting in which no marks can be used at all, and show tight bounds for $c\leq 2$, and almost tight bounds for $c>2$.
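The bounds quoted above are simple enough to tabulate. The sketch below assumes, for illustration, that in the distributed-race (DR) algorithm both agents walk in the same direction, so the faster agent closes an initial gap `gap` (measured along the direction of travel) at relative speed $c-1$. The function names are hypothetical, and the lower-bound helper covers only the range $c \le 3$, for which the abstract gives the explicit expression $\frac{n}{2(c-1)}$.

```python
def dr_time(gap, c):
    """Time for the faster agent (speed c > 1) to catch the slower one
    (speed 1) when it starts `gap` behind it along the cycle."""
    if c <= 1:
        raise ValueError("the faster agent needs speed c > 1")
    return gap / (c - 1)


def lower_bound(n, c):
    """Lower bound n / (2(c - 1)) on rendezvous time, stated for c <= 3."""
    if not 1 < c <= 3:
        raise ValueError("explicit bound is only quoted for 1 < c <= 3")
    return n / (2 * (c - 1))


n = 100
for c in (1.01, 1.5, 2.0, 3.0):
    # A gap close to n is the worst case for DR, giving about n / (c - 1).
    print(c, dr_time(n, c), lower_bound(n, c))
```

The ratio between the two columns is 2 for every $c$ in this range, which is the multiplicative gap mentioned in the abstract.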
• Jul 03 2012 cs.DC cs.CC arXiv:1207.0252v1
The paper tackles the power of randomization in the context of locality by analyzing the ability to "boost" the success probability of deciding a distributed language. The main outcome of this analysis is that the distributed computing setting contrasts significantly with the sequential one as far as randomization is concerned. Indeed, we prove that in some cases, the ability to increase the success probability for deciding distributed languages is rather limited. Informally, a $(p,q)$-decider for a language $L$ is a distributed randomized algorithm which accepts instances in $L$ with probability at least $p$ and rejects instances outside of $L$ with probability at least $q$. It is known that every hereditary language that can be decided in $t$ rounds by a $(p,q)$-decider, where $p^2+q>1$, can actually be decided deterministically in $O(t)$ rounds. In one of our results we give evidence supporting the conjecture that the above statement holds for all distributed languages. This is achieved by considering the restricted case of path topologies. We then turn our attention to the range below the aforementioned threshold, namely, the case where $p^2+q\leq 1$. We define $B_k(t)$ to be the set of all languages decidable in at most $t$ rounds by a $(p,q)$-decider, where $p^{1+1/k}+q>1$. It is easy to see that every language is decidable (in zero rounds) by a $(p,q)$-decider satisfying $p+q=1$. Hence, the hierarchy $B_k$ provides a spectrum of complexity classes between determinism and complete randomization. We prove that all these classes are separated: for every integer $k\geq 1$, there exists a language $L$ satisfying $L\in B_{k+1}(0)$ but $L\notin B_k(t)$ for any $t=o(n)$. In addition, we show that $B_\infty(t)$ does not contain all languages, for any $t=o(n)$. Finally, we show that if the inputs can be restricted in certain ways, then the ability to boost the success probability becomes almost null.
• Initial knowledge regarding group size can be crucial for collective performance. We study this relation in the context of the Ants Nearby Treasure Search (ANTS) problem [FKLS], which models natural cooperative foraging behavior such as that performed by ants around their nest. In this problem, $k$ (probabilistic) agents, initially placed at some central location, collectively search for a treasure on the two-dimensional grid. The treasure is placed at a target location by an adversary and the goal is to find it as fast as possible as a function of both $k$ and $D$, where $D$ is the (unknown) distance between the central location and the target. It is easy to see that $T=\Omega(D+D^2/k)$ time units are necessary for finding the treasure. Recently, it has been established that $O(T)$ time is sufficient if the agents know their total number $k$ (or a constant approximation of it), and enough memory bits are available at their disposal [FKLS]. In this paper, we establish lower bounds on the agent memory size required for achieving certain running time performances. To the best of our knowledge, these bounds are the first non-trivial lower bounds for the memory size of probabilistic searchers. For example, for every given positive constant $\epsilon$, terminating the search by time $O(\log^{1-\epsilon} k \cdot T)$ requires agents to use $\Omega(\log\log k)$ memory bits. Such distributed computing bounds may provide a novel, strong tool for the investigation of complex biological systems.
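The $\Omega(D + D^2/k)$ bound quoted above follows from a standard counting argument. The sketch below is a reconstruction under the assumption that each agent visits at most one new grid node per time unit; it is not quoted from the paper.

```latex
% Travel term: an agent needs at least D steps to reach a node at distance D.
t \;\ge\; D.
% Coverage term: Theta(D^2) grid nodes lie within distance D of the source,
% and k agents visit at most k t nodes in t steps. If the treasure is
% uniformly random among those nodes, finding it with constant probability needs
k\,t \;=\; \Omega(D^2) \quad\Longrightarrow\quad t \;=\; \Omega\!\left(D^2/k\right).
% Combining both terms:
t \;=\; \Omega\!\left(\max\{D,\; D^2/k\}\right) \;=\; \Omega\!\left(D + D^2/k\right).
```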
• We generalize the classical cow-path problem [7, 14, 38, 39] into a question that is relevant for collective foraging in animal groups. Specifically, we consider a setting in which k identical (probabilistic) agents, initially placed at some central location, collectively search for a treasure in the two-dimensional plane. The treasure is placed at a target location by an adversary and the goal is to find it as fast as possible as a function of both k and D, where D is the distance between the central location and the target. This is biologically motivated by cooperative, central place foraging such as performed by ants around their nest. In this type of search there is a strong preference to locate nearby food sources before those that are further away. Our focus is on trying to find what can be achieved if communication is limited or altogether absent. Indeed, to avoid overlaps agents must be highly dispersed making communication difficult. Furthermore, if agents do not commence the search in synchrony then even initial communication is problematic. This holds, in particular, with respect to the question of whether the agents can communicate and conclude their total number, k. It turns out that the knowledge of k by the individual agents is crucial for performance. Indeed, it is a straightforward observation that the time required for finding the treasure is $\Omega(D + D^2/k)$, and we show in this paper that this bound can be matched if the agents have knowledge of k up to some constant approximation. We present an almost tight bound for the competitive penalty that must be paid, in the running time, if agents have no information about k. Specifically, on the negative side, we show that in such a case, there is no algorithm whose competitiveness is $O(\log k)$. On the other hand, we show that for every constant $\epsilon > 0$, there exists a rather simple uniform search algorithm which is $O(\log^{1+\epsilon} k)$-competitive.
In addition, we give a lower bound for the setting in which agents are given some estimation of k. As a special case, this lower bound implies that for any constant $\epsilon > 0$, if each agent is given a (one-sided) $k^\epsilon$-approximation to k, then the competitiveness is $\Omega(\log k)$. Informally, our results imply that the agents can potentially perform well without any knowledge of their total number k; however, to further improve, they must be given a relatively good approximation of k. Finally, we propose a uniform algorithm that is both efficient and extremely simple, suggesting its relevance for actual biological scenarios.
• We study the verification problem in distributed networks, stated as follows. Let $H$ be a subgraph of a network $G$ where each vertex of $G$ knows which edges incident on it are in $H$. We would like to verify whether $H$ has some properties, e.g., if it is a tree or if it is connected. We would like to perform this verification in a decentralized fashion via a distributed algorithm. The time complexity of verification is measured as the number of rounds of distributed communication. In this paper we initiate a systematic study of distributed verification, and give almost tight lower bounds on the running time of distributed verification algorithms for many fundamental problems such as connectivity, spanning connected subgraph, and $s$-$t$ cut verification. We then show applications of these results in deriving strong unconditional time lower bounds on the hardness of distributed approximation for many classical optimization problems including minimum spanning tree, shortest paths, and minimum cut. Many of these results are the first non-trivial lower bounds for both exact and approximate distributed computation and they resolve previous open questions. Moreover, our unconditional lower bound of approximating minimum spanning tree (MST) subsumes and improves upon the previous hardness of approximation bound of Elkin [STOC 2004] as well as the lower bound for (exact) MST computation of Peleg and Rubinovich [FOCS 1999]. Our result implies that there can be no distributed approximation algorithm for MST that is significantly faster than the current exact algorithm, for any approximation factor. Our lower bound proofs show an interesting connection between communication complexity and distributed computing which turns out to be useful in establishing the time complexity of exact and approximate distributed computation of many problems.
• Nov 10 2010 cs.DC cs.CC arXiv:1011.2152v2
A central theme in distributed network algorithms concerns understanding and coping with the issue of locality. Inspired by sequential complexity theory, we focus on a complexity theory for distributed decision problems. In the context of locality, solving a decision problem requires the processors to independently inspect their local neighborhoods and then collectively decide whether a given global input instance belongs to some specified language. This paper introduces several classes of distributed decision problems, proves separation among them and presents some complete problems. More specifically, we consider the standard LOCAL model of computation and define LD (for local decision) as the class of decision problems that can be solved in a constant number of communication rounds. We first study the intriguing question of whether randomization helps in local distributed computing, and to what extent. Specifically, we define the corresponding randomized class BPLD, and ask whether LD=BPLD. We provide a partial answer to this question by showing that in many cases, randomization does not help for deciding hereditary languages. In addition, we define the notion of local many-one reductions, and introduce the (nondeterministic) class NLD of decision problems for which there exists a certificate that can be verified in a constant number of communication rounds. We prove that there exists an NLD-complete problem. We also show that there exist problems not in NLD. On the other hand, we prove that the class NLD#n, which is NLD assuming that each processor can access an oracle that provides the number of nodes in the network, contains all (decidable) languages. For this class we provide a natural complete problem as well.
• An ancestry labeling scheme assigns labels (bit strings) to the nodes of rooted trees such that ancestry queries between any two nodes in a tree can be answered merely by looking at their corresponding labels. The quality of an ancestry labeling scheme is measured by its label size, that is, the maximal number of bits in a label of a tree node. In addition to its theoretical appeal, the design of efficient ancestry labeling schemes is motivated by applications in web search engines. For this purpose, even small improvements in the label size are important. In fact, the literature about this topic is interested in the exact label size rather than just its order of magnitude. As a result, following the proposal of a simple interval-based ancestry scheme with label size $2\log_2 n$ bits (Kannan et al., STOC '88), a considerable amount of work was devoted to improving the bound on the size of a label. The current state of the art upper bound is $\log_2 n + O(\sqrt{\log n})$ bits (Abiteboul et al., SODA '02), which is still far from the known $\log_2 n + \Omega(\log\log n)$ bits lower bound (Alstrup et al., SODA '03). In this paper we close the gap between the known lower and upper bounds, by constructing an ancestry labeling scheme with label size $\log_2 n + O(\log\log n)$ bits. In addition to the optimal label size, our scheme assigns the labels in linear time and can support any ancestry query in constant time.
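The simple interval-based scheme with $2\log_2 n$-bit labels mentioned above can be sketched as follows. This is a common textbook formulation (a DFS entry number paired with the largest entry number in the subtree), not the paper's improved scheme; the function names and the example tree are hypothetical.

```python
from typing import Dict, List, Tuple


def interval_labels(children: Dict[int, List[int]],
                    root: int) -> Dict[int, Tuple[int, int]]:
    """Label each node with (entry, last): its DFS entry number and the
    largest entry number in its subtree. Both lie in [0, n-1], so a label
    takes about 2*log2(n) bits."""
    labels: Dict[int, Tuple[int, int]] = {}
    entry: Dict[int, int] = {}
    counter = 0
    stack = [(root, False)]  # (node, subtree already explored?)
    while stack:
        node, explored = stack.pop()
        if explored:
            # All descendants have been numbered; counter - 1 is the last one.
            labels[node] = (entry[node], counter - 1)
        else:
            entry[node] = counter
            counter += 1
            stack.append((node, True))
            for child in reversed(children.get(node, [])):
                stack.append((child, False))
    return labels


def is_ancestor(label_u: Tuple[int, int], label_v: Tuple[int, int]) -> bool:
    """True if u is an ancestor of v (every node counts as its own ancestor).
    Only the two labels are inspected, never the tree."""
    return label_u[0] <= label_v[0] <= label_u[1]


# Hypothetical example tree: 0 has children 1 and 2; 1 has children 3 and 4.
tree = {0: [1, 2], 1: [3, 4]}
labels = interval_labels(tree, root=0)
```

The query is a pair of integer comparisons, which is why later work concentrates on shrinking the label from two numbers towards a single $\log_2 n$-bit number plus lower-order terms.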
• Consider the setting of randomly weighted graphs, namely, graphs whose edge weights are chosen independently according to probability distributions with finite support over the non-negative reals. Under this setting, properties of weighted graphs typically become random variables and we are interested in computing their statistical features. Unfortunately, this turns out to be computationally hard for some properties albeit the problem of computing them in the traditional setting of algorithmic graph theory is tractable. For example, there are well known efficient algorithms that compute the diameter of a given weighted graph, yet, computing the expected diameter of a given randomly weighted graph is #P-hard even if the edge weights are identically distributed. In this paper, we define a family of properties of weighted graphs and show that for each property in this family, the problem of computing the $k^{\text{th}}$ moment (and in particular, the expected value) of the corresponding random variable in a given randomly weighted graph $G$ admits a fully polynomial time randomized approximation scheme (FPRAS) for every fixed $k$. This family includes fundamental properties of weighted graphs such as the diameter of $G$, the radius of $G$ (with respect to any designated vertex) and the weight of a minimum spanning tree of $G$.
• An ancestry labeling scheme labels the nodes of any tree in such a way that ancestry queries between any two nodes in a tree can be answered just by looking at their corresponding labels. The common measure to evaluate the quality of an ancestry labeling scheme is its label size, that is, the maximal number of bits stored in a label, taken over all $n$-node trees. The design of ancestry labeling schemes finds applications in XML search engines. In the context of these applications, even small improvements in the label size are important. In fact, the literature about this topic is interested in the exact label size rather than just its order of magnitude. As a result, following the proposal of an original scheme of size $2\log n$ bits, a considerable amount of work was devoted to improving the bound on the label size. The current state of the art upper bound is $\log n + O(\sqrt{\log n})$ bits, which is still far from the known $\log n + \Omega(\log\log n)$ lower bound. Moreover, the hidden constant factor in the additive $O(\sqrt{\log n})$ term is large, which makes this term dominate the label size for typical current XML trees. In an attempt to provide good performance for real XML data, we rely on the observation that the depth of a typical XML tree is bounded from above by a small constant. Having this in mind, we present an ancestry labeling scheme of size $\log n+2\log d +O(1)$, for the family of trees with at most $n$ nodes and depth at most $d$. In addition to our main result, we prove a result that may be of independent interest concerning the existence of a linear universal graph for the family of forests with trees of bounded depth.
• We consider the Work Function Algorithm for the k-server problem. We show that if the Work Function Algorithm is c-competitive, then it is also strictly (2c)-competitive. As a consequence of [Koutsoupias and Papadimitriou, JACM 1995] this also shows that the Work Function Algorithm is strictly (4k-2)-competitive.
• Oct 02 2006 cs.DC arXiv:cs/0609163v1
We study the question of "how robust are the known lower bounds of labeling schemes when one increases the number of consulted labels". Let $f$ be a function on pairs of vertices. An $f$-labeling scheme for a family of graphs $\mathcal{F}$ labels the vertices of all graphs in $\mathcal{F}$ such that for every graph $G\in\mathcal{F}$ and every two vertices $u,v\in G$, the value $f(u,v)$ can be inferred by merely inspecting the labels of $u$ and $v$. This paper introduces a natural generalization: the notion of $f$-labeling schemes with queries, in which the value $f(u,v)$ can be inferred by inspecting not only the labels of $u$ and $v$ but possibly the labels of some additional vertices. We show that inspecting the label of a single additional vertex (one query) enables us to reduce the label size of many labeling schemes significantly.
• Let $F$ be a function on pairs of vertices. An $F$-labeling scheme is composed of a marker algorithm for labeling the vertices of a graph with short labels, coupled with a decoder algorithm allowing one to compute $F(u,v)$ of any two vertices $u$ and $v$ directly from their labels. As applications for labeling schemes concern mainly large and dynamically changing networks, it is of interest to study distributed dynamic labeling schemes. This paper investigates labeling schemes for dynamic trees and presents a general method for constructing them. Our method is based on extending an existing static tree labeling scheme to the dynamic setting. This approach fits many natural functions on trees, such as ancestry relation, routing (in both the adversary and the designer port models), nearest common ancestor, etc. Our resulting dynamic schemes incur overheads (over the static scheme) on the label size and on the communication complexity. Informally, for any function $k(n)$ and any static $F$-labeling scheme on trees, we present an $F$-labeling scheme on dynamic trees incurring multiplicative overhead factors (over the static scheme) of $O(\log_{k(n)} n)$ on the label size and $O(k(n)\log_{k(n)} n)$ on the amortized message complexity. In particular, by setting $k(n)=n^{\epsilon}$ for any $0<\epsilon<1$, we obtain dynamic labeling schemes with asymptotically optimal label sizes and sublinear amortized message complexity for all the above mentioned functions.