- Apr 20 2018 quant-ph arXiv:1804.06995v1Steane's seven-qubit quantum code is a natural choice for fault-tolerance experiments because it is small and just two extra qubits are enough to correct errors. However, the two-qubit error-correction technique, known as "flagged" syndrome extraction, works slowly, measuring only one syndrome at a time. This is a disadvantage in experiments with high qubit rest error rates. We extend the technique to extract multiple syndromes at once, without needing more qubits. Qubits for different syndromes can flag errors in each other. This gives equally fast and more qubit-efficient alternatives to Steane's error-correction method, and also conforms to planar geometry constraints. We further show that Steane's code and some others can be error-corrected with no extra qubits, provided there are at least two code blocks. The rough idea is that two seven-qubit codewords can be temporarily joined into a twelve-qubit code, freeing two qubits for flagged syndrome measurement.
- Apr 18 2018 quant-ph arXiv:1804.06105v1With experimental quantum computing technologies now in their infancy, the search for efficient means of testing the correctness of these quantum computations is becoming more pressing. An approach to the verification of quantum computation within the framework of interactive proofs has been fruitful for addressing this problem. Specifically, an untrusted agent (prover) alleging to perform quantum computations can have his claims verified by another agent (verifier) who only has access to classical computation and a small quantum device for preparing or measuring single qubits. However, when this quantum device is prone to errors, verification becomes challenging and often existing protocols address this by adding extra assumptions, such as requiring the noise in the device to be uncorrelated with the noise on the prover's devices. In this paper, we present a simple protocol for verifying quantum computations, in the presence of noisy devices, with no extra assumptions. This protocol is based on post hoc techniques for verification, which allow for the prover to know the desired quantum computation and its input. We also perform a simulation of the protocol, for a one-qubit computation, and find the error thresholds when using the qubit repetition code as well as the Steane code.
- Apr 23 2018 quant-ph arXiv:1804.07364v1Contextuality - the inability to assign pre-existing outcomes to all potential measurements of a quantum system - has been proposed as a resource that powers quantum computing. The measurement-based model provides a concrete manifestation of contextuality as a computational resource, as follows. If local measurements on a multi-qubit state can be used to evaluate non-linear boolean functions with only linear control processing, then this computation constitutes a proof of contextuality - the possible local measurement outcomes cannot all be pre-assigned. However, this connection is restricted to the special case when the local measured systems are qubits, which have unusual properties from the perspective of contextuality. A single qubit cannot allow for a proof of contextuality, unlike higher-dimensional systems, and multiple qubits can allow for state-independent contextuality with only Pauli observables, again unlike higher-dimensional generalisations. Here we identify precisely which non-local features of contextuality are necessary in a qudit measurement-based computation that evaluates high-degree polynomial functions with only linear control. We introduce the concept of local universality, which places a bound on the space of output functions accessible under the constraint of single-qudit measurements. Thus, the partition of a physical system into subsystems plays a crucial role for the increase in computational power. A prominent feature of our setting is that the enabling resources for qubit and qudit measurement-based computations are of the same underlying nature, avoiding the pathologies associated with qubit contextuality.
- We unify and consolidate various results about non-signall-ing games, a subclass of non-local two-player one-round games, by introducing and studying several new families of games and establishing general theorems about them, which extend a number of known facts in a variety of special cases. Among these families are \it reflexive games, which are characterised as the hardest non-signalling games that can be won using a given set of strategies. We introduce \it imitation games, in which the players display linked behaviour, and which contains as subclasses the classes of variable assignment games, binary constraint system games, synchronous games, many games based on graphs, and \it unique games. We associate a C*-algebra $C^*(\mathcal{G})$ to any imitation game $\mathcal{G}$, and show that the existence of perfect quantum commuting (resp.\ quantum, local) strategies of $\mathcal{G}$ can be characterised in terms of properties of this C*-algebra, extending known results about synchronous games. We single out a subclass of imitation games, which we call \it mirror games, and provide a characterisation of their quantum commuting strategies that has an algebraic flavour, showing in addition that their approximately quantum perfect strategies arise from amenable traces on the encoding C*-algebra. We describe the main classes of non-signalling correlations in terms of states on operator system tensor products.
- Lovász Local Lemma (LLL) is a very powerful tool in combinatorics and probability theory to show the possibility of avoiding all "bad" events under some "weakly dependent" condition. Over the last decades, the algorithmic aspect of LLL has also attracted lots of attention in theoretical computer science. A tight criterion under which the abstract version LLL holds was given by Shearer [shearer1985problem]. It turns out that Shearer's bound is generally not tight for variable version LLL (VLLL) [he2017variable]. Recently, Ambainis et al. introduced a quantum version LLL (QLLL), which was then shown to be powerful for quantum satisfiability problem. In this paper, we prove that Shearer's bound is tight for QLLL, affirming a conjecture proposed by Sattath et. al. Our result shows the tightness of Gilyén and Sattath's algorithm, and implies that the lattice gas partition function fully characterizes quantum satisfiability for almost all Hamiltonians with large enough qudits. Commuting LLL (CLLL), LLL for commuting local Hamiltonians which are widely studied in literature, is also investigated here. We prove that the tight regions of CLLL and QLLL are generally different. Thus, the efficient region of algorithms for CLLL can go beyond shearer's bound. Our proof is by first bridging CLLL and VLLL on a family of interaction bipartite graphs and then applying the tools of VLLL, e.g., the gapless/gapful results, to CLLL. We also provide a sufficient and necessary condition for deciding whether the tight regions of QLLL and CLLL are the same for a given interaction bipartite graph.
- Cheeger inequalities bound the spectral gap $\gamma$ of a space by isoperimetric properties of that space and vice versa. In this paper, I derive Cheeger-type inequalities for nonpositive matrices (aka stoquastic Hamiltonians), real matrices, and Hermitian matrices. For matrices written $H = L+W$, where $L$ is either a combinatorial or normalized graph Laplacian, each bound holds independently of any information about $W$ other than its class and the weighted Cheeger constant induced by its ground-state. I show that independently of $\lVert W \rVert$: (1) when $W$ is diagonal and $L$ has maximum degree $d_{\max}$, $2h \geq \gamma \geq \sqrt{h^2 + d_{\max}^2}-d_\max$; (2) when $W$ is real, we can route negative weighted edges along positive weighted edges such that the Cheeger constant of the resulting graph obeys an inequality similar to that above; and (3) when $W$ is Hermitian, the weighted Cheeger constant obeys $2h \geq \gamma$. The weighted Cheeger constant reduces bounds on $\gamma$ to information contained in the underlying graph and the Hamiltonian's ground-state. If efficiently computable, the constant opens up a very clear path towards adaptive quantum adiabatic algorithms, those that adjust the adiabatic path based on spectral structure. I sketch a bashful adiabatic algorithm that aborts the adiabatic process early, uses the resulting state to approximate the weighted Cheeger constant, and restarts the process using the updated information. Should this approach work, it would provide more rigorous foundations for adiabatic quantum computing without a priori knowledge of the spectral gap.
- Apr 23 2018 quant-ph arXiv:1804.07653v1We introduce a new graphical framework for designing quantum error correction codes based on classical principles. A key feature of this graphical language, over previous approaches, is that it is closely related to that of factor graphs or graphical models in classical information theory and machine learning. It enables us to formulate the description of recently-introduced `coherent parity check' quantum error correction codes entirely within the language of classical information theory. This makes our construction accessible without requiring background in quantum error correction or even quantum mechanics in general. More importantly, this allows for a collaborative interplay where one can design new quantum error correction codes derived from classical codes.
- Apr 18 2018 quant-ph arXiv:1804.05951v1We provide an alternative proof of Wallman's \citeWallman2018 bounds on the effect of gate-dependent noise on randomized benchmarking (RB). Our primary insight is that a RB sequence is a convolution amenable to Fourier space analysis, and we adopt the mathematical framework of Fourier transforms of matrix-valued functions on groups established in recent work from Gowers and Hatami \citeGH15. We show explicitly that as long as our faulty gate-set is close to some representation of the Clifford group, an RB sequence is described by the exponential decay of a process that has exactly two eigenvalues close to one and the rest close to zero, even though the bounds with respect to any particular representation of the Clifford group may not tightly describe the rate of decay. This framework reveals some very provocative properties of maximally entangled states, and likely has applications in other areas of quantum information theory.
- Current work presents a new approach to quantum color codes on compact surfaces with genus $g \geq 2$ using the identification of these surfaces with hyperbolic polygons and hyperbolic tessellations. We show that this method may give rise to color codes with a very good parameters and we present tables with several examples of these codes whose parameters had not been shown before. We also present a family of codes with minimum distance $d=4$ and the encoding rate asymptotically going to 1 while $n \rightarrow \infty$.
- Apr 23 2018 quant-ph arXiv:1804.07718v1We reduce measurement errors in a quantum computer using machine learning techniques. We exploit a simple yet versatile neural network to classify multi-qubit quantum states, which is trained using experimental data. This flexible approach allows the incorporation of any number of features of the data with minimal modifications to the underlying network architecture. We experimentally illustrate this approach in the readout of trapped-ion qubits using additional spatial and temporal features in the data. Using this neural network classifier, we efficiently treat qubit readout crosstalk, resulting in a 30\% improvement in detection error over the conventional threshold method. Our approach does not depend on the specific details of the system and can be readily generalized to other quantum computing platforms.
- Apr 20 2018 quant-ph arXiv:1804.06969v1We discuss a surprisingly simple scheme for accounting (and removal) of error in observables determined from quantum algorithms. A correction to the value of the observable is calculated by first measuring the observable with all error sources active and subsequently measuring the observable with each error source removed separately. We apply this scheme to the variational quantum eigensolver, simulating the calculation of the ground state energy of equilibrium H$_2$ and LiH in the presence of several noise sources, including amplitude damping, dephasing, thermal noise, and correlated noise. We show that this scheme provides a decrease in the needed quality of the qubits by up to two orders of magnitude. In near-term quantum computers, where full fault-tolerant error correction is too expensive, this scheme provides a route to significantly more accurate calculation
- Apr 23 2018 quant-ph arXiv:1804.07435v1Integrated quantum photonics provides a scalable platform for the generation, manipulation, and detection of optical quantum states by confining light inside miniaturized waveguide circuits. Here we show the generation, manipulation, and interferometric stage of homodyne detection of non-classical light on a single device, a key step towards a fully integrated approach to quantum information with continuous variables. We use a dynamically reconfigurable lithium niobate waveguide network to generate and characterize squeezed vacuum and two-mode entangled states, key resources for several quantum communication and computing protocols. We measure a squeezing level of -1.38+-0.04 dB and demonstrate entanglement by verifying an inseparability criterion I=0.77+-0.02<1. Our platform can implement all the processes required for optical quantum technology and its high nonlinearity and fast reconfigurability makes it ideal for the realization of quantum computation with time encoded continuous variable cluster states.
- Apr 19 2018 quant-ph arXiv:1804.06668v1Digital quantum simulations offer exciting perspectives for the study of fermionic systems such as molecules or lattice models. However, with quantum error correction still being out of reach with present-day technology, a non-vanishing error rate is inevitable. We study the influence of gate errors on simulations of the Trotterized time evolution of the quantum system with focus on the fermionic Hubbard model. Specifically, we consider the effect of stochastic over-rotations in the applied gates. Depending on the particular algorithm implemented such gate errors may lead to a time evolution that corresponds to a disordered fermionic system, or they may correspond to unphysical errors, e.g., violate particle number conservation. We substantiate our analysis by numerical simulations of model systems. In addition we establish the relation between the gate fidelity and the strength of the over-rotations in a Trotterized quantum simulation. Based on this we provide estimates for the maximum number of Trotter steps which can be performed with sufficient accuracy for a given algorithm. This in turn implies, apart from obvious limitations on the maximum time of the simulation, also limits on the system size which can be handled.
- Apr 19 2018 quant-ph arXiv:1804.06554v1We find one-shot bounds for concentration of maximally coherent states in the so called assisted scenario. In this setting, Bob is restricted to performing incoherent operations on his quantum system, however he is assisted by Alice, who holds a purification of Bob's state and can send classical data to him. We further show that in the asymptotic limit our one-shot bounds recover the previously computed rate of asymptotic assisted concentration.
- Apr 23 2018 quant-ph arXiv:1804.07426v1Quantum states of mechanical motion can be important resources for quantum information, metrology, and studies of fundamental physics. Recent demonstrations of superconducting qubits coupled to acoustic resonators have opened up the possibility of performing quantum operations on macroscopic motional modes, which can act as long-lived quantum memories or transducers. In addition, they can potentially be used to test for novel decoherence mechanisms in macroscopic objects and other modifications to standard quantum theory. Many of these applications call for the ability to create and characterize complex quantum states, putting demanding requirements on the speed of quantum operations and the coherence of the mechanical mode. In this work, we demonstrate the controlled generation of multi-phonon Fock states in a macroscopic bulk-acoustic wave resonator. We also perform Wigner tomography and state reconstruction to highlight the quantum nature of the prepared states. These demonstrations are made possible by the long coherence times of our acoustic resonator and our ability to selectively couple to individual phonon modes. Our work shows that circuit quantum acousto-dynamics (circuit QAD) enables sophisticated quantum control of macroscopic mechanical objects and opens the door to using acoustic modes as novel quantum resources.
- Using information theoretic concepts to understand and explore the inner organization of deep neural networks (DNNs) remains a big challenge. Recently, the concept of an information plane began to shed light on the analysis of multilayer perceptrons (MLPs). We provided an in-depth insight into stacked autoencoders (SAEs) using a novel matrix-based Renyi's \alpha-entropy functional, enabling for the first time the analysis of the dynamics of learning using information flow in real-world scenario involving complex network architecture and large data. Despite the great potential of these past works, there are several open questions when it comes to applying information theoretic concepts to understand convolutional neural networks (CNNs). These include for instance the accurate estimation of information quantities among multiple variables, and the many different training methodologies. By extending the novel matrix-based Renyi's \alpha-entropy functional to a multivariate scenario, this paper presents a systematic method to analyze CNNs training using information theory. Our results validate two fundamental data processing inequalities in CNNs, and also have direct impacts on previous work concerning the training and design of CNNs.
- Apr 18 2018 quant-ph arXiv:1804.05856v1In this work we study the problem of single-shot discrimination of von Neumann measurements. We associate each measurement with a measure-and-prepare channel. There are two possible approaches to this problem. The first one, which is simple, does not utilize entanglement. We focus only on discrimination of classical probability distribution, which are outputs of the channels. We find necessary and sufficient criterion for perfect discrimination in this case. A more advanced approach requires the usage and entanglement. We quantify the distance of the two measurements in terms of the diamond norm (called sometimes the completely bounded trace norm). We provide an exact expression for the optimal probability of correct distinction and relate it to the discrimination of unitary channels. We also state a necessary and sufficient condition for perfect discrimination and a semidefinite program which checks this condition. Our main result, however, is a cone program which calculates the distance of these measurements and hence provides an upper bound on the probability of their correct distinction. As a by-product the program also finds a strategy (input state) which achieves this bound. Finally, we provide a full description for the cases of Fourier matrices and mirror isometries.
- Apr 24 2018 hep-th arXiv:1804.08358v1A concept of measuring the quantum distance between two different quantum states which is called quantum information metric is presented. The holographic principle (AdS/CFT) suggests that the quantum information metric $G_{\lambda\lambda}$ between perturbed state and unperturbed state in field theory has a dual description in the classical gravity. In this work we calculate the quantum information metric of a theory which is dual to a conical defect geometry and we show that it is $n$ times the one of its covering space. We also give a holographic check for our result in the gravity side. Meanwhile, it was argued that $G_{\lambda\lambda}$ is dual to a codimension-one surface in spacetime and satisfies $G_{\lambda\lambda}=n_{d}\cdot\mbox{Vol}(\Sigma_{max})/L^{d}$. We show that the coefficient $n_d$ for conical defect should be rescaled by $n^2$ from the one for AdS. A limit case of conical defect --- the massless BTZ black hole--- is also considered. We show that the quantum information metric of the massless BTZ black hole disagrees with the one obtained by taking the vanishing temperature limit in BTZ black hole. This provides a new arena in differiating the different phases between BTZ spacetime and its massless cousin.
- The orbifold construction $A\mapsto A^G$ for a finite group $G$ is fundamental in rational conformal field theory. The construction of $Rep(A^G)$ from $Rep(A)$ on the categorical level, often called gauging, is also prominent in the study of topological phases of matter. The key step in this construction is to find a braided $G$-crossed extension of $\mathcal{C}$ compatible with a given action of $G$ on a non-degenerate braided fusion category $\mathcal{C}$. The extension theory of Etingof-Nikshych-Ostrik gives two obstructions for this problem, $o_3\in H^3(G)$ and $o_4\in H^4(G)$ for certain coefficients, the latter of which depends on a categorical lifting of the action and is notoriously difficult to compute. We show that in the case where $G\le S_n$ acts by permutations on $\mathcal{C}^{\boxtimes n}$, both of these obstructions vanish. This verifies a conjecture of Müger, and constitutes a nontrivial test of the conjecture that all modular tensor categories come from vertex operator algebras or conformal nets.
- Defined by a single axiom, finite abstract simplicial complexes belong to the simplest constructs of mathematics. We look at a a few theorems.
- In this paper, we propose a simple neural net that requires only $O(nlog_2k)$ number of qubits and $O(nk)$ quantum gates: Here, $n$ is the number of input parameters, and $k$ is the number of weights applied to these parameters in the proposed neural net. We describe the network in terms of a quantum circuit, and then draw its equivalent classical neural net which involves $O(k^n)$ nodes in the hidden layer. Then, we show that the network uses a periodic activation function of cosine values of the linear combinations of the inputs and weights. The backpropagation is described through the gradient descent, and then iris and breast cancer datasets are used for the simulations. The numerical results indicate the network can be used in machine learning problems and it may provide exponential speedup over the same structured classical neural net.
- Apr 19 2018 quant-ph arXiv:1804.06794v1Preparation uncertainty relations establish a trade-off in the statistical spread of incompatible observables. However, the Heisenberg-Robertson (or Schroedinger's) uncertainty relations are expressed in terms of the product of variances, which is null whenever the system is in an eigenstate of one of the observables. So, in this case the relation becomes trivial and in the other cases it must be expressed in terms of a state-dependent bound. Uncertainty relations based on the sum of variances do not suffer from this drawback, as the sum cannot be null if the observables are incompatible, and hence they can capture fully the concept of quantum incompatibility. General procedures to construct generic sum-uncertainty relations are not known. Here we present one such procedure, based on Lie algebraic properties of observables that produces state-independent bounds. We illustrate our result for the cases of the Weyl-Heisenberg algebra, special unitary algebras up to rank 4, and any semisimple compact algebra.
- Multi-layer neural networks are among the most powerful models in machine learning, yet the fundamental reasons for this success defy mathematical understanding. Learning a neural network requires to optimize a non-convex high-dimensional objective (risk function), a problem which is usually attacked using stochastic gradient descent (SGD). Does SGD converge to a global optimum of the risk or only to a local optimum? In the first case, does this happen because local minima are absent, or because SGD somehow avoids them? In the second, why do local minima reached by SGD have good generalization properties? In this paper we consider a simple case, namely two-layers neural networks, and prove that -in a suitable scaling limit- SGD dynamics is captured by a certain non-linear partial differential equation (PDE) that we call distributional dynamics (DD). We then consider several specific examples, and show how DD can be used to prove convergence of SGD to networks with nearlyideal generalization error. This description allows to 'average-out' some of the complexities of the landscape of neural networks, and can be used to prove a general convergence result for noisy SGD.
- We give a simple distributed algorithm for computing adjacency matrix eigenvectors for the communication graph in an asynchronous gossip model. We show how to use this algorithm to give state-of-the-art asynchronous community detection algorithms when the communication graph is drawn from the well-studied \emphstochastic block model. Our methods also apply to a natural alternative model of randomized communication, where nodes within a community communicate more frequently than nodes in different communities. Our analysis simplifies and generalizes prior work by forging a connection between asynchronous eigenvector computation and Oja's algorithm for streaming principal component analysis. We hope that our work serves as a starting point for building further connections between the analysis of stochastic iterative methods, like Oja's algorithm, and work on asynchronous and gossip-type algorithms for distributed computation.
- We explain in detail the quantum-to-classical transition for the cosmological perturbations using only the standard rules of quantum mechanics: the Schrodinger equation and Born's rule applied to a subsystem. We show that the conditioned, i.e. intrinsic, pure state of the perturbations, is driven by the interactions with a generic environment, to become increasingly localized in field space as a mode exists the horizon during inflation. With a favourable coupling to the environment, the conditioned state of the perturbations becomes highly localized in field space due to the expansion of spacetime by a factor of roughly exp(-c N), where N~50 and c is a model dependent number of order 1. Effectively the state rapidly becomes specified completely by a point in phase space and an effective, classical, stochastic process emerges described by a classical Langevin equation. The statistics of the stochastic process is described by the solution of the master equation that describes the perturbations coupled to the environment.
- Preparing and certifying bound entangled states in the laboratory is an intrinsically hard task, due to both the fact that they typically form narrow regions in the state space, and that a certificate requires a tomographic reconstruction of the density matrix. Indeed, the previous experiments that have reported the preparation of a bound entangled state relied on such tomographic reconstruction techniques. However, the reliability of these results crucially depends on the extra assumption of an unbiased reconstruction. We propose an alternative method for certifying the bound entangled character of a quantum state that leads to a rigorous claim within a desired statistical significance, while bypassing a full reconstruction of the state. The method is comprised by a search for bound entangled states that are robust for experimental verification, and a hypothesis test tailored for the detection of bound entanglement that is naturally equipped with a measure of statistical significance. We apply our method to families of states of $3\times 3$ and $4\times 4$ systems, and find that the experimental certification of bound entangled states is well within reach.
- Apr 23 2018 quant-ph arXiv:1804.07457v2A two-pass fiber-optic quantum key distribution system with phase-encoded photon states in synchronization mode has been investigated. The possibility of applying the analytical expressions for the calculation of the correct detection probability of the signal time window at synchronization has been proved. A modernized algorithm of photon pulse detection, taking into account the dead time of the single-photon avalanche photodiode was proposed. The method of engineering an optical pulse detection process during the synchronization in a quantum key distribution system has been offered.
- Apr 23 2018 quant-ph arXiv:1804.07308v1The superposition of quantum states is one of the hallmarks of quantum physics, and clear demonstrations of superposition have been achieved in a number of quantum systems. However, mechanical systems have remained a challenge, with only indirect demonstrations of mechanical state superpositions, in spite of the intellectual appeal and technical utility such a capability would bring. This is due in part to the highly linear response of most mechanical systems, making quantum operation difficult, as well as their characteristically low frequencies, making it difficult to reach the quantum ground state. In this work, we demonstrate full quantum control of the mechanical state of a macroscopic mechanical resonator. We strongly couple a surface acoustic wave resonator to a superconducting qubit, using the qubit to control and measure quantum states in the mechanical resonator. Most notably, we generate a quantum superposition of the zero and one phonon states and map this and other states using Wigner tomography. This precise, programmable quantum control is essential to a range of applications of surface acoustic waves in the quantum limit, including using surface acoustic waves to couple disparate quantum systems.
- We present a major update to QuSpin, SciPostPhys.2.1.003 -- an open-source Python package for exact diagonalization and quantum dynamics of arbitrary boson, fermion and spin many-body systems, supporting the use of various (user-defined) symmetries in one and higher dimension and (imaginary) time evolution following a user-specified driving protocol. We explain how to use the new features of QuSpin using seven detailed examples of various complexity: (i) the transverse-field Ising model and the Jordan-Wigner transformation, (ii) free particle systems: the SSH model, (iii) the many-body localized Fermi-Hubbard model, (iv) the Bose-Hubbard model in a ladder geometry, (v) nonlinear (imaginary) time evolution and the Gross-Pitaevskii equation, (vi) integrability breaking and thermalizing dynamics in the translationally-invariant 2D transverse-field Ising model, and (vii) out-of-equilibrium Bose-Fermi mixtures. This easily accessible and user-friendly package can serve various purposes, including educational and cutting-edge experimental and theoretical research. The complete package documentation is available under http://weinbe58.github.io/QuSpin/index.html.
- A residual network (or ResNet) is a standard deep neural net architecture, with state-of-the-art performance across numerous applications. The main premise of ResNets is that they allow the training of each layer to focus on fitting just the residual of the previous layer's output and the target output. Thus, we should expect that the trained network is no worse than what we can obtain if we remove the residual layers and train a shallower network instead. However, due to the non-convexity of the optimization problem, it is not at all clear that ResNets indeed achieve this behavior, rather than getting stuck at some arbitrarily poor local minimum. In this paper, we rigorously prove that arbitrarily deep, nonlinear ResNets indeed exhibit this behavior, in the sense that the optimization landscape contains no local minima with value above what can be obtained with a linear predictor (namely a 1-layer network). Notably, we show this under minimal or no assumptions on the precise network architecture, data distribution, or loss function used. We also provide a quantitative analysis of second-order stationary points for this problem, and show that with a certain tweak to the architecture, training the network with standard stochastic gradient descent achieves an objective value no worse than any fixed linear predictor.
- In this work, we characterize the outputs of individual neurons in a trained feed-forward neural network by entropy, mutual information with the class variable, and a class selectivity measure based on Kullback-Leibler divergence. By cumulatively ablating neurons in the network, we connect these information-theoretic measures to the impact their removal has on classification performance on the test set. We observe that, looking at the neural network as a whole, none of these measures is a good indicator for classification performance, thus confirming recent results by Morcos et al. However, looking at specific layers separately, both mutual information and class selectivity are positively correlated with classification performance. We thus conclude that it is ill-advised to compare these measures across layers, and that different layers may be most appropriately characterized by different measures. We then discuss pruning neurons from neural networks to reduce computational complexity of inference. Drawing from our results, we perform pruning based on information-theoretic measures on a fully connected feed-forward neural network with two hidden layers trained on MNIST dataset and compare the results to a recently proposed pruning method. We furthermore show that the common practice of re-training after pruning can partly be obviated by a surgery step called bias balancing, without incurring significant performance degradation.
- Deep probabilistic programming languages try to combine the advantages of deep learning with those of probabilistic programming languages. If successful, this would be a big step forward in machine learning and programming languages. Unfortunately, as of now, this new crop of languages is hard to use and understand. This paper addresses this problem directly by explaining deep probabilistic programming languages and indirectly by characterizing their current strengths and weaknesses.
- Deep latent variable models are powerful tools for representation learning. In this paper, we adopt the deep information bottleneck model, identify its shortcomings and propose a model that circumvents them. To this end, we apply a copula transformation which, by restoring the invariance properties of the information bottleneck method, leads to disentanglement of the features in the latent space. Building on that, we show how this transformation translates to sparsity of the latent space in the new model. We evaluate our method on artificial and real data.
- Apr 24 2018 cs.DS arXiv:1804.08178v1We design new algorithms for maximizing a monotone non-negative submodular function under various constraints, which improve the state-of-the-art in time complexity and/or performance guarantee. We first investigate the cardinality constrained submodular maximization problem that has been widely studied for about four decades. We design an $(1-\frac{1}{e}-\varepsilon)$-approximation algorithm that makes $O(n\cdot \max \{\varepsilon^{-1},\log\log k \})$ queries. To the best of our knowledge, this is the fastest known algorithm. We further answer the open problem on finding a lower bound on the number of queries. We show that, no (randomized) algorithm can achieve a ratio better than $(\frac{1}{2}+\Theta(1))$ with $o(\frac{n}{\log n})$ queries. The acceleration above is achieved by our \emphAdaptive Decreasing Threshold (ADT) algorithm. Based on ADT, we study the $p$-system and $d$ knapsack constrained maximization problem. We show that an $(1/(p+\frac{7}{4}d+1)-\varepsilon)$-approximate solution can be computed via $O(\frac{n}{\varepsilon}\log \frac{n}{\varepsilon}\max\{\log \frac{1}{\varepsilon},\log\log n\})$ queries. Note that it improves the state of the art in both time complexity and approximation ratio. We also show how to improve the ratio for a single knapsack constraint via $O(n\cdot \max \{\varepsilon^{-1},\log\log k \})$ queries. For maximizing a submodular function with curvature $\kappa$ under matroid constraint, we show an $(1-\frac{\kappa}{e}-\varepsilon)$-approximate algorithm that uses $\tilde{O}(nk)$ value oracle queries. Our ADT could be utilized to obtain faster algorithms in other problems. To prove our results, we introduce a general characterization between randomized complexity and deterministic complexity of approximation algorithms that could be used in other problems and may be interesting in its own right.
- Apr 24 2018 cs.CC arXiv:1804.08176v1We propose an algebraic approach to proving circuit lower bounds for ACC0 by defining and studying the notion of torus polynomials. We show how currently known polynomial-based approximation results for AC0 and ACC0 can be reformulated in this framework, implying that ACC0 can be approximated by low-degree torus polynomials. Furthermore, as a step towards proving ACC0 lower bounds for the majority function via our approach, we show that MAJORITY cannot be approximated by low-degree symmetric torus polynomials. We also pose several open problems related to our framework.
- We initiate the study of Boolean function analysis on high-dimensional expanders. We describe an analog of the Fourier expansion and of the Fourier levels on simplicial complexes, and generalize the FKN theorem to high-dimensional expanders. Our results demonstrate that a high-dimensional expanding complex $X$ can sometimes serve as a sparse model for the Boolean slice or hypercube, and quite possibly additional results from Boolean function analysis can be carried over to this sparse model. Therefore, this model can be viewed as a derandomization of the Boolean slice, containing $|X(k)|=O(n)$ points in comparison to $\binom{n}{k+1}$ points in the $(k+1)$-slice (which consists of all $n$-bit strings with exactly $k+1$ ones).
- Gao's quantum union bound is a generalization of the union bound from probability theory and finds a range of applications in quantum communication theory, quantum algorithms, and quantum complexity theory [Phys. Rev. A, 92(5):052331, 2015]. It is relevant when performing a sequence of binary-outcome quantum measurements on a quantum state, giving the same bound that the classical union bound would, except with a scaling factor of four. In this paper, we improve upon Gao's quantum union bound, by proving a quantum union bound that involves a tunable parameter that can be optimized. This tunable parameter plays a similar role to a parameter involved in the Hayashi-Nagaoka inequality [IEEE Trans. Inf. Theory, 49(7):1753 (2003)], used often in quantum information theory when analyzing the error probability of a square-root measurement. An advantage of the proof delivered here is that it is elementary, relying only on basic properties of projectors, the Pythagorean theorem, and the Cauchy--Schwarz inequality. As a non-trivial application of our quantum union bound, we prove that a sequential decoding strategy for classical communication over a quantum channel achieves a lower bound on the channel's second-order coding rate. This demonstrates the advantage of our quantum union bound in the non-asymptotic regime, in which a communication channel is called a finite number of times.
- Grover's algorithm provides quantum computers a quadratic advantage over classical computers for searching in an arbitrary data-set. Bitcoin mining falls into this category of a search problem. It has been previously argued that the only side-effect of quantum mining would be an increased difficulty, due to this quadratic speed-up which can be applied to Bitcoin mining. In this work we argue that a crucial argument in the analysis of Bitcoin's security breaks down when quantum mining is performed. Classically, a Bitcoin fork occurs rarely, when two miners find a block almost at the same time: only if both miners are unaware of the other's block, due to propagation time effects. The situation differs dramatically when quantum miners use Grover's algorithm. Grover's algorithm repeatedly applies a procedure called a Grover iteration. More iterations provide a quadratically higher chance of finding a block. Crucially, a miner does not have to choose how many iterations to apply in advance. Suppose Alice receives Bob's new block. To maximize her revenue, she should stop applying Grover iterations and measure her state. Her hope is that her block (rather than Bob's) would become part of the longest chain. This strong correlation between the miners' actions, and the fact that they all measure their state at the same time, may lead to more forks. This is known as a security risk for Bitcoin. We propose a mechanism which, we conjecture, prohibits this form of quantum mining, and circumvents the high rate of forks.
- We consider the problem of sampling from the Potts model on random regular graphs. It is conjectured that sampling is possible when the temperature of the model is in the uniqueness regime of the regular tree, but positive algorithmic results have been for the most part elusive. In this paper, for all integers $q\geq 3$ and $\Delta\geq 3$, we develop algorithms that produce samples within error $o(1)$ from the $q$-state Potts model on random $\Delta$-regular graphs, whenever the temperature is in uniqueness, for both the ferromagnetic and antiferromagnetic cases. The algorithm for the antiferromagnetic Potts model is based on iteratively adding the edges of the graph and resampling a bichromatic class that contains the endpoints of the newly added edge. Key to the algorithm is how to perform the resampling step efficiently since bichromatic classes may induce linear-sized components. To this end, we exploit the tree uniqueness to show that the average growth of bichromatic components is typically small, which allows us to use correlation decay algorithms for the resampling step. While the precise uniqueness threshold on the tree is not known for general values of $q$ and $\Delta$ in the antiferromagnetic case, our algorithm works throughout uniqueness regardless of its value. In the case of the ferromagnetic Potts model, we simplify the algorithm significantly by utilising the random-cluster representation of the model. In particular, we show that a percolation-type algorithm succeeds in sampling from the random-cluster model with parameters $p,q$ on random $\Delta$-regular graphs for all values of $q\geq 1$ and $p<p_c(q,\Delta)$, where $p_c(q,\Delta)$ corresponds to a uniqueness threshold for the model on the $\Delta$-regular tree. When restricted to integer values of $q$, this yields a simplified algorithm for the ferromagnetic Potts model on random $\Delta$-regular graphs.
- Online matching problems have garnered significant attention in recent years due to numerous applications in e-commerce, online advertisements, ride-sharing, etc. Many of them capture the uncertainty in the real world by including stochasticity in both the arrival process and the matching process. The Online Stochastic Matching with Timeouts problem introduced by Bansal, et al., (Algorithmica, 2012) models matching markets (e.g., E-Bay, Amazon). Buyers arrive from an independent and identically distributed (i.i.d.) known distribution on buyer profiles and can be shown a list of items one at a time. Each buyer has some probability of purchasing each item and a limit (timeout) on the number of items they can be shown. Bansal et al., (Algorithmica, 2012) gave a 0.12-competitive algorithm which was improved by Adamczyk, et al., (ESA, 2015) to 0.24. We present an online attenuation framework that uses an algorithm for offline stochastic matching as a black box. On the upper bound side, we show that this framework, combined with a black-box adapted from Bansal et al., (Algorithmica, 2012), yields an online algorithm which nearly doubles the ratio to 0.46. On the lower bound side, we show that no algorithm can achieve a ratio better than 0.632 using the standard LP for this problem. This framework has a high potential for further improvements since new algorithms for offline stochastic matching can directly improve the ratio for the online problem. Our online framework also has the potential for a variety of extensions. For example, we introduce a natural generalization: Online Stochastic Matching with Two-sided Timeouts in which both online and offline vertices have timeouts. Our framework provides the first algorithm for this problem achieving a ratio of 0.30. We once again use the algorithm of Adamczyk et al., (ESA, 2015) as a black-box and plug-it into our framework.
- Apr 24 2018 cs.DS arXiv:1804.08001v1We design new differentially private algorithms for the Euclidean k-means problem, both in the centralized model and in the local model of differential privacy. In both models, our algorithms achieve significantly improved error rates over the previous state-of-the-art. In addition, in the local model, our algorithm significantly reduces the number of needed interactions. Although the problem has been widely studied in the context of differential privacy, all of the existing constructions achieve only super constant approximation factors. We present, for the first time, efficient private algorithms for the problem with constant multiplicative error.
- A key problem in research on adversarial examples is that vulnerability to adversarial examples is usually measured by running attack algorithms. Because the attack algorithms are not optimal, the attack algorithms are prone to overestimating the size of perturbation needed to fool the target model. In other words, the attack-based methodology provides an upper-bound on the size of a perturbation that will fool the model, but security guarantees require a lower bound. CLEVER is a proposed scoring method to estimate a lower bound. Unfortunately, an estimate of a bound is not a bound. In this report, we show that gradient masking, a common problem that causes attack methodologies to provide only a very loose upper bound, causes CLEVER to overestimate the size of perturbation needed to fool the model. In other words, CLEVER does not resolve the key problem with the attack-based methodology, because it fails to provide a lower bound.
- We revisit the question of reducing online learning to approximate optimization of the offline problem. In this setting, we give two algorithms with near-optimal performance in the full information setting: they guarantee optimal regret and require only poly-logarithmically many calls to the approximation oracle per iteration. Furthermore, these algorithms apply to the more general improper learning problems. In the bandit setting, our algorithm also significantly improves the best previously known oracle complexity while maintaining the same regret.
- This work considers the question: what convergence guarantees does the stochastic subgradient method have in the absence of smoothness and convexity? We prove that the stochastic subgradient method, on any semialgebraic locally Lipschitz function, produces limit points that are all first-order stationary. More generally, our result applies to any function with a Whitney stratifiable graph. In particular, this work endows the stochastic subgradient method with rigorous convergence guarantees for a wide class of problems arising in data science---including all popular deep learning architectures.
- The main goal of this paper is to classify $\ast$-module categories for the $SO(3)_{2m}$ modular tensor category. This is done by classifying $SO(3)_{2m}$ nimrep graphs and cell systems, and in the process we also classify the $SO(3)$ modular invariants. There are module categories of type $\mathcal{A}$, $\mathcal{E}$ and their conjugates. These come with a multiplicity from the centre of $SU(2)$, but there are no orbifold (or type $\mathcal{D}$) module categories. We present a construction of a subfactor with principal graph given by the fusion rules of the fundamental generator of the $SO(3)_{2m}$ modular category. We also introduce a Frobenius algebra $A$ which is an $SO(3)$ generalisation of (higher) preprojective algebras, and derive a finite resolution of $A$ as a left $A$-module along with its Hilbert series.
- Apr 23 2018 astro-ph.CO astro-ph.GA arXiv:1804.07704v1The debate in cosmology concerning LambdaCDM and MOND depends crucially on their respective ability of modelling across scales, and dealing with some of the specific problems that arise along the way. The main upshot of this article is to present three main problems facing multi-scale modelling in contemporary cosmology. The LambdaCDM model, which is the standard and by far most successful current cosmological model, faces what I call the downscaling problem when it comes to explain some recalcitrant evidence at the scale of individual galaxies, such as the mass-discrepancy acceleration relation (MDAR) and the baryonic Tully-Fisher relation (BTF). While the fastgrowing development of computer simulations has addressed these problems, nagging worries remain about some of the epistemic limits of these computer simulations in retrieving (as opposed to explaining) the data. The so-called upscaling problem affects MOND and its ability not just to explain but even simply retrieve large-scale structure and galaxy clusters. Recent attempts at extending MOND (EMOND) have had a limited empirical success, and are still far from providing a consistent explanation for possible formation mechanisms at the large-scale structure. Finally, the in between scales problem affects proposals designed to achieve the best of both worlds at the meso-scale. This is a fascinating area from a physical and a philosophical point of view, where the main challenge is the ability to have genuine predictive novelty.
- Apr 23 2018 math.CO arXiv:1804.07673v1Confirming a conjecture of Vera T. Sós in a very strong sense, we give a complete solution to Turán's hypergraph problem for the Fano plane. That is we prove for $n\ge 8$ that among all $3$-uniform hypergraphs on $n$ vertices not containing the Fano plane there is indeed exactly one whose number of edges is maximal, namely the balanced, complete, bipartite hypergraph. Moreover, for $n=7$ there is exactly one other extremal configuration with the same number of edges: the hypergraph arising from a clique of order $7$ by removing all five edges containing a fixed pair of vertices. For sufficiently large values $n$ this was proved earlier by Füredi and Simonovits, and by Keevash and Sudakov, who utilised the stability method.
- Apr 23 2018 quant-ph cond-mat.mes-hall arXiv:1804.07644v1We propose and analyze magnetic traps and lattices for electrons in semiconductors. We provide a general theoretical framework and show that thermally stable traps can be generated by magnetically driving the particle's internal spin transition, akin to optical dipole traps for ultra-cold atoms. Next we discuss in detail periodic arrays of magnetic traps, i.e. magnetic lattices, as a platform for quantum simulation of exotic Hubbard models, with lattice parameters that can be tuned in real time. Our scheme can be readily implemented in state-of-the-art experiments, as we particularize for two specific setups, one based on a superconducting circuit and another one based on surface acoustic waves.
- Apr 23 2018 quant-ph arXiv:1804.07639v1In this paper, we establish a general theoretical framework for the description of continuous quantum measurements and the statistics of the results of such measurements. The framework concerns the measurement of an arbitrary quantum system with arbitrary number of detectors under realistic assumption of instant detector reactions and white noise sources. We attend various approaches to the problem showing their equivalence. The approaches include the full counting statistics (FCS) evolution equation a for pseudo-density matrix, the drift-diffusion equation for a density matrix in the space of integrated outputs, and discrete stochastic updates. We provide the derivation of the underlying equations from a microscopic approach based on full counting statistics method, a phenomenological approach based on Lindblad construction, and interaction with auxiliary quantum systems representing the detectors. We establish the necessary conditions on the phenomenological susceptibilities and noises that guarantee the unambiguous interpretation of the measurement results and the positivity of the density matrix. Our results can be easily extended to describe various quantum feedback schemes where the manipulation decision is based on the values of detector outputs.
- Apr 23 2018 quant-ph arXiv:1804.07591v1Geometric phases are noise-resilient, and thus provide a robust way towards high fidelity quantum manipulation. Here we experimentally demonstrate arbitrary non-adiabatic holonomic single-qubit quantum gates for both a superconducting transmon qubit and a microwave cavity in a single-loop way. In both cases, an auxiliary state is utilized, and two resonant microwave drives are simultaneously applied with well-controlled but varying amplitudes and phases for the arbitrariness of the gate. The resulting gates on the transmon qubit achieve a fidelity of 0.996 characterized by randomized benchmarking and the ones on the cavity show an averaged fidelity of 0.978 based on a full quantum process tomography. In principle, a nontrivial two-qubit holonomic gate between the qubit and the cavity can also be realized based on our presented experimental scheme. Our experiment thus paves the way towards practical non-adiabatic holonomic quantum manipulation with both qubits and cavities in a superconducting circuit.