Source localization and spectral estimation are among the most fundamental problems in statistical and array signal processing. Methods that rely on the orthogonality of the signal and noise subspaces, such as Pisarenko's method, MUSIC, and root-MUSIC, are some of the most widely used algorithms for solving these problems. As a common feature, these methods require both a priori knowledge of the number of sources and an estimate of the noise subspace. Both requirements complicate the practical implementation of the algorithms and, when not satisfied exactly, can lead to severe errors. In this paper, we propose a new localization criterion based on the algebraic structure of the noise subspace, which, to the best of our knowledge, is described here for the first time. Using this criterion and the relationship between the source localization problem and the problem of computing the greatest common divisor (GCD), or more practically the approximate GCD, of polynomials, we propose two algorithms that adaptively learn the number of sources and estimate their locations. Simulation results show a significant improvement over root-MUSIC in challenging scenarios such as closely located sources, both in detecting the number of sources and in localizing them, over a broad and practical range of SNRs. Further, no performance loss is observed in simple scenarios.
We study the Wasserstein natural gradient in parametric statistical models with continuous sample space. Our approach is to pull back the $L^2$-Wasserstein metric tensor in probability density space to parameter space, under which the parameter space becomes a Riemannian manifold, named the Wasserstein statistical manifold. The gradient flow and natural gradient descent method in parameter space are then derived. When the parameterized densities lie in $\mathbb{R}$, we show that the induced metric tensor admits an explicit formula. Computationally, optimization problems can be accelerated by the proposed Wasserstein natural gradient descent if the objective function is the Wasserstein distance. Examples are presented to demonstrate its effectiveness in several parametric statistical models.
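For intuition, the one-dimensional case can be written out with the cumulative distribution function. The block below is a sketch, not a formula quoted from the abstract: the symbols $\rho$, $F$, $\theta$, $G_W$, $J$, and $\tau$ are notation introduced here for illustration, and the derivation assumes the density is positive and its CDF derivatives vanish at infinity.

```latex
% Assumed notation: \rho(x;\theta) is the density on \mathbb{R} and
% F(x;\theta) = \int_{-\infty}^{x} \rho(y;\theta)\,dy is its CDF.
% The potential \Phi_i solves -(\rho\,\Phi_i')' = \partial_{\theta_i}\rho,
% which integrates to \Phi_i' = -\partial_{\theta_i} F / \rho, so that
\[
  G_W(\theta)_{ij}
  = \int_{\mathbb{R}} \rho\,\Phi_i'\,\Phi_j'\,dx
  = \int_{\mathbb{R}}
    \frac{\partial_{\theta_i} F(x;\theta)\,\partial_{\theta_j} F(x;\theta)}
         {\rho(x;\theta)}\,dx .
\]
% Check on the Gaussian family, \theta = (\mu,\sigma):
% \partial_\mu F = -\rho and \partial_\sigma F = -\frac{x-\mu}{\sigma}\,\rho, hence
\[
  G_W(\mu,\sigma) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},
\]
% consistent with W_2^2 = (\mu_1-\mu_2)^2 + (\sigma_1-\sigma_2)^2.
% A natural gradient step on an objective J with step size \tau then reads
\[
  \theta_{k+1} = \theta_k - \tau\, G_W(\theta_k)^{-1} \nabla_\theta J(\theta_k).
\]
```

In the Gaussian case the metric is the identity, so the Wasserstein natural gradient step coincides with an ordinary gradient step in $(\mu,\sigma)$; for other families $G_W$ generally depends on $\theta$.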
Despite the remarkable successes of generative adversarial networks (GANs) in many applications, theoretical understanding of their performance is still limited. In this paper, we present a simple shallow GAN model fed by high-dimensional input data. The dynamics of the training process of the proposed model can be exactly analyzed in the high-dimensional limit. In particular, by using the tool of scaling limits of stochastic processes, we show that the macroscopic quantities measuring the quality of the training process converge to a deterministic process that is characterized as the unique solution of a finite-dimensional ordinary differential equation (ODE). The proposed model is simple, but its training process already exhibits several different phases that can mimic the behaviors of more realistic GAN models used in practice. Specifically, depending on the choice of the learning rates, the training process can reach a successful, a failed, or an oscillating phase. By studying the steady-state solutions of the limiting ODEs, we obtain a phase diagram that precisely characterizes the conditions under which each phase takes place. Although this work focuses on a simple GAN model, the analysis methods developed here might prove useful in the theoretical understanding of other variants of GANs with more advanced training algorithms.
This paper considers the problem of implementing large-scale gradient descent algorithms in a distributed computing setting in the presence of \emph{straggling} processors. To mitigate the effect of the stragglers, it has previously been proposed to encode the data with an erasure-correcting code and decode at the master server at the end of the computation. We instead propose to encode the second moment of the data with a low-density parity-check (LDPC) code. The iterative decoding algorithms for LDPC codes have very low computational overhead, and the number of decoding iterations can be made to adjust automatically to the number of stragglers in the system. We show that, for a random model of stragglers, the proposed moment-encoding-based gradient descent method can be viewed as a stochastic gradient descent method. This allows us to obtain convergence guarantees for the proposed solution. Furthermore, the proposed moment-encoding-based method is shown to outperform the existing schemes in a real distributed computing setup.
The celebrated Monte Carlo method estimates a quantity that is expensive to compute by random sampling. We propose adaptive Monte Carlo optimization: a general framework for discrete optimization of an expensive-to-compute function by adaptive random sampling. Applications of this framework have already appeared in machine learning but are tied to their specific contexts and developed in isolation. We take a unified view and show that the framework has broad applicability by applying it to several common machine learning problems: $k$-nearest neighbors, hierarchical clustering, and maximum mutual information feature selection. On real data we show that this framework allows us to develop algorithms that confer a gain of an order of magnitude or two over exact computation. We also characterize the performance gain theoretically under regularity assumptions on the data that we verify in real-world data. The code is available at https://github.com/govinda-kamath/combinatorial_MAB.
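To make the idea of adaptive sampling concrete, the sketch below finds the nearest neighbor of a query by treating each candidate point as a bandit arm and sampling random coordinates instead of computing every full distance. It is not the authors' implementation: it swaps in a simple successive-elimination rule with Hoeffding confidence intervals, assumes all coordinates lie in [0, 1], and the function name `adaptive_nearest_neighbor` and its parameters are hypothetical.

```python
import math
import random


def adaptive_nearest_neighbor(query, points, delta=1e-3, batch=32, seed=0):
    """Return (index of the nearest point, number of coordinate evaluations).

    Assumption: every coordinate lies in [0, 1], so each sampled squared
    difference is in [0, 1] and Hoeffding's inequality applies.
    """
    rng = random.Random(seed)
    n, d = len(points), len(query)
    sums = [0.0] * n      # running sum of sampled squared differences
    pulls = [0] * n       # number of sampled coordinates per candidate
    exact = [None] * n    # exact mean squared difference, once computed
    evals = 0
    log_term = math.log(2.0 * n * d / delta)  # union bound over arms and rounds

    def bounds(i):
        """Lower and upper confidence bounds on the mean squared difference."""
        if exact[i] is not None:
            return exact[i], exact[i]
        mean = sums[i] / pulls[i]
        width = math.sqrt(log_term / (2.0 * pulls[i]))
        return mean - width, mean + width

    active = list(range(n))
    while len(active) > 1 and any(exact[i] is None for i in active):
        for i in active:
            if exact[i] is not None:
                continue
            if pulls[i] + batch >= d:
                # Sampling is no longer cheaper than the exact computation.
                exact[i] = sum((q - p) ** 2 for q, p in zip(query, points[i])) / d
                evals += d
            else:
                for _ in range(batch):
                    j = rng.randrange(d)
                    sums[i] += (query[j] - points[i][j]) ** 2
                pulls[i] += batch
                evals += batch
        # Drop every candidate whose lower bound exceeds the best upper bound.
        best_upper = min(bounds(i)[1] for i in active)
        active = [i for i in active if bounds(i)[0] <= best_upper]

    if len(active) == 1:
        return active[0], evals
    # All survivors have exact values; pick the smallest.
    return min(active, key=lambda i: exact[i]), evals
```

Candidates that are clearly far from the query are discarded after a small number of sampled coordinates, while close contenders receive more samples or fall back to the exact distance. The savings depend on how well separated the candidates are and on how loose the confidence intervals are.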
In this paper, we propose a cost-aware cascading bandits model, a new variant of multi-armed bandits with cascading feedback, by considering the random cost of pulling arms. In each step, the learning agent chooses an ordered list of items and examines them sequentially until a certain stopping condition is satisfied. Our objective is then to maximize the expected net reward in each step, i.e., the reward obtained in each step minus the total cost incurred in examining the items, by deciding the ordered list of items as well as when to stop examination. We study both the offline and online settings, depending on whether the state and cost statistics of the items are known beforehand. For the offline setting, we show that the Unit Cost Ranking with Threshold 1 (UCR-T1) policy is optimal. For the online setting, we propose a Cost-aware Cascading Upper Confidence Bound (CC-UCB) algorithm and show that the cumulative regret scales in $O(\log T)$. We also provide a lower bound for all $\alpha$-consistent policies, which scales in $\Omega(\log T)$ and matches our upper bound. The performance of the CC-UCB algorithm is evaluated with both synthetic and real-world data.
Analytics will be a part of the upcoming smart city and Internet of Things (IoT). The focus of this work is approximate distributed signal analytics. It is envisaged that distributed IoT devices will record signals, which may be of interest to the IoT cloud. Communication of these signals from IoT devices to the IoT cloud will require (lowpass) approximations. Linear signal approximations are well known in the literature. It will be outlined that in many IoT analytics problems, it is desirable that the approximated signals (or their analytics) should always over-predict the exact signals (or their analytics). This distributed nonlinear approximation problem has not been studied before. An algorithm to perform distributed over-predictive signal analytics in the IoT cloud, based on signal approximations by IoT devices, is proposed. The fundamental tradeoff between the signal approximation bandwidth used by IoT devices and the approximation error in signal analytics at the IoT cloud is quantified for the class of differentiable signals. Simulation results are also presented.
In this paper we propose two schemes for teleportation of a sub-class of tripartite states, the first with a four-qubit cluster state and the second with two Bell pairs as entanglement channels. The sender performs a four-qubit joint measurement in the first case and two Bell measurements in the second. Appropriate unitary operations on the qubits at the receiver's end, along with an ancilla qubit, result in perfect teleportation of the tripartite state. Analysis of the quantum circuits employed in these schemes reveals that our technique achieves the desired quantum tasks with lower quantum cost, a lower gate count, and fewer classical communication bits than other similar schemes.
Massive MIMO opens up new avenues for enabling highly efficient random access (RA) by offering an abundance of spatial degrees of freedom. In this paper, we investigate grant-free RA with massive MIMO and derive analytic expressions for the success probability of grant-free RA under conjugate beamforming and zero-forcing beamforming. With the derived analytic expressions, we further shed light on the impact of system parameters on the success probability. Simulation results verify the accuracy of the analyses. It is confirmed that grant-free RA with massive MIMO is an attractive RA technique with low signaling overhead that can simultaneously accommodate a number of RA users several times larger than the number of RA channels, with close-to-one success probability. In addition, we show that when the number of antennas is sufficiently large, the number of orthogonal preambles dominates the success probability.
A general approach to $L_2$-consistent estimation of various density functionals using $k$-nearest neighbor distances is proposed, along with an analysis of convergence rates in mean squared error. The construction of the estimator is based on inverse Laplace transforms related to the target density functional, which arise naturally from the convergence of the normalized volume of a $k$-nearest neighbor ball to a Gamma distribution in the sample limit. Some instantiations of the proposed estimator rediscover existing $k$-nearest neighbor based estimators of Shannon and Renyi entropies and Kullback--Leibler and Renyi divergences, and discover new consistent estimators for many other functionals, such as Jensen--Shannon divergence and generalized entropies and divergences. A unified finite-sample analysis of the proposed estimator is presented that builds on a recent result by Gao, Oh, and Viswanath (2017) on the finite-sample behavior of the Kozachenko--Leonenko estimator of entropy.
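As a point of reference for the family of estimators discussed above, the following is a minimal sketch of the classical Kozachenko--Leonenko $k$-nearest neighbor entropy estimator, restricted to one-dimensional samples so that it needs only the standard library. It is not the general estimator proposed in the abstract; the function names `digamma_int` and `kl_entropy_1d` are hypothetical, and the samples are assumed to be distinct so that no neighbor distance is zero.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def kl_entropy_1d(samples, k=3):
    """Kozachenko-Leonenko estimate of differential entropy in nats (1-D).

    Uses H = psi(N) - psi(k) + log(c_1) + (1/N) * sum_i log(r_i), where r_i is
    the distance from sample i to its k-th nearest neighbor and c_1 = 2 is the
    length of the one-dimensional unit ball.
    """
    xs = sorted(samples)
    n = len(xs)
    if n <= k:
        raise ValueError("need more than k samples")
    log_dist_sum = 0.0
    for i, x in enumerate(xs):
        # In sorted 1-D data the k nearest neighbors lie within k positions.
        lo, hi = max(0, i - k), min(n, i + k + 1)
        dists = sorted(abs(x - xs[j]) for j in range(lo, hi) if j != i)
        log_dist_sum += math.log(dists[k - 1])
    return digamma_int(n) - digamma_int(k) + math.log(2.0) + log_dist_sum / n
```

For samples drawn from a standard Gaussian, the estimate can be compared against the closed-form entropy $\tfrac{1}{2}\log(2\pi e)$; in higher dimensions the neighbor search and the unit-ball volume would need to be generalized.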
We consider a communication channel where there is no common clock between the transmitter and the receiver. This is motivated by the recent interest in building system-on-chip radios for Internet of Things applications, which cannot rely on crystal oscillators for accurate timing. We identify two types of clock uncertainty in such systems: timing jitter, which occurs at a time scale faster than the communication duration (or equivalently blocklength); and clock drift, which occurs at a slower time scale. We study the zero-error capacity under both types of timing imperfections, and obtain optimal zero-error codes for some cases. Our results show that, as opposed to common practice, in the presence of clock drift it is highly suboptimal to try to learn and track the clock frequency at the receiver; rather, one can design codes that come close to the performance of perfectly synchronous communication systems without any clock synchronization at the receiver.
When unmanned aerial vehicles (UAVs) serve as flying base stations (BSs) with unidirectional communication, either uplink or downlink, the system is limited by the co-channel interference that takes place over line-of-sight (LoS) links. This paper considers two-way communication and takes advantage of the fact that the interference among the ground devices takes place over non-line-of-sight (NLoS) links. UAVs can be deployed at high altitudes to obtain larger coverage, while two-way communication allows the transmission direction to be configured. Using these two levers, we show how the system throughput can be maximized for a given deployment of the ground devices.