- Feb 22 2018 quant-ph arXiv:1802.07419v1The No Low-Energy Trivial States (NLTS) conjecture of Freedman and Hastings (Quantum Information and Computation 2014), which asserts the existence of local Hamiltonians whose low energy states cannot generated by constant depth quantum circuits, identifies a fundamental obstacle to resolving the quantum PCP conjecture. Progress towards the NLTS conjecture was made by Eldar and Harrow (Foundations of Computer Science 2017), who proved a closely related theorem called No Low-Error Trivial States (NLETS). In this paper, we give a much simpler proof of the NLETS theorem, and use the same technique to establish superpolynomial circuit size lower bounds for noisy ground states of local Hamiltonians (assuming $\mathsf{QCMA} \neq \mathsf{QMA}$), resolving an open question of Eldar and Harrow. We discuss the new light our results cast on the relationship between NLTS and NLETS. Finally, our techniques imply the existence of \emphapproximate quantum low-weight check (qLWC) codes with linear rate, linear distance, and constant weight checks. These codes are similar to quantum LDPC codes except (1) each particle may participate in a large number of checks, and (2) errors only need to be corrected up to fidelity $1 - 1/\mathsf{poly}(n)$. This stands in contrast to the best-known stabilizer LDPC codes due to Freedman, Meyer, and Luo which achieve a distance of $O(\sqrt{n \log n})$. The principal technique used in our results is to leverage the Feynman-Kitaev clock construction to approximately embed a subspace of states defined by a circuit as the ground space of a local Hamiltonian.
- We derive an attainable bound on the precision of quantum state estimation for finite dimensional systems, providing a construction for the asymptotically optimal measurement. Our results hold under an assumption called local asymptotic covariance, which is weaker than unbiasedness or local unbiasedness. The derivation is based on an analysis of the limiting distribution of the estimator's deviation from the true value of the parameter, and takes advantage of quantum local asymptotic normality, a duality between sequences of identically prepared states and Gaussian states of continuous variable systems. We first prove our results for the mean square error of a special class of models, called D-invariant, and then extend the results to arbitrary models, generic cost functions, and global state estimation, where the unknown parameter is not restricted to a local neighbourhood of the true value. The extension includes a treatment of nuisance parameters, namely parameters that are not of interest to the experimenter but nevertheless affect the estimation. As an illustration of the general approach, we provide the optimal estimation strategies for the joint measurement of two qubit observables, for the estimation of qubit states in the presence of amplitude damping noise, and for noisy multiphase estimation.
- Feb 22 2018 cs.CC arXiv:1802.07425v1We study the problem of computing the $p\rightarrow q$ norm of a matrix $A \in R^{m \times n}$, defined as \[ \|A\|_p→q ~:=~ \max_x \,∈\,R^n ∖{0} \frac\|Ax\|_q\|x\|_p \]This problem generalizes the spectral norm of a matrix ($p=q=2$) and the Grothendieck problem ($p=\infty$, $q=1$), and has been widely studied in various regimes. When $p \geq q$, the problem exhibits a dichotomy: constant factor approximation algorithms are known if $2 \in [q,p]$, and the problem is hard to approximate within almost polynomial factors when $2 \notin [q,p]$. The regime when $p < q$, known as \emphhypercontractive norms, is particularly significant for various applications but much less well understood. The case with $p = 2$ and $q > 2$ was studied by [Barak et al, STOC'12] who gave sub-exponential algorithms for a promise version of the problem (which captures small-set expansion) and also proved hardness of approximation results based on the Exponential Time Hypothesis. However, no NP-hardness of approximation is known for these problems for any $p < q$. We study the hardness of approximating matrix norms in both the above cases and prove the following results: - We show that for any $1< p < q < \infty$ with $2 \notin [p,q]$, $\|A\|_{p\rightarrow q}$ is hard to approximate within $2^{O(\log^{1-\epsilon}\!n)}$ assuming $NP \not\subseteq BPTIME(2^{\log^{O(1)}\!n})$. This suggests that, similar to the case of $p \geq q$, the hypercontractive setting may be qualitatively different when $2$ does not lie between $p$ and $q$. - For all $p \geq q$ with $2 \in [q,p]$, we show $\|A\|_{p\rightarrow q}$ is hard to approximate within any factor than $1/(\gamma_{p^*} \cdot \gamma_q)$, where for any $r$, $\gamma_r$ denotes the $r^{th}$ norm of a gaussian, and $p^*$ is the dual norm of $p$.
- This paper introduces a novel measure-theoretic learning theory to analyze generalization behaviors of practical interest. The proposed learning theory has the following abilities: 1) to utilize the qualities of each learned representation on the path from raw inputs to outputs in representation learning, 2) to guarantee good generalization errors possibly with arbitrarily rich hypothesis spaces (e.g., arbitrarily large capacity and Rademacher complexity) and non-stable/non-robust learning algorithms, and 3) to clearly distinguish each individual problem instance from each other. Our generalization bounds are relative to a representation of the data, and hold true even if the representation is learned. We discuss several consequences of our results on deep learning, one-shot learning and curriculum learning. Unlike statistical learning theory, the proposed learning theory analyzes each problem instance individually via measure theory, rather than a set of problem instances via statistics. Because of the differences in the assumptions and the objectives, the proposed learning theory is meant to be complementary to previous learning theory and is not designed to compete with it.
- Feb 22 2018 quant-ph cond-mat.stat-mech arXiv:1802.07681v1We study a quantum Stirling cycle which extracts work using quantized energy levels. The work and the efficiency of the engine depend on the length of the potential well, and the Carnot efficiency is achieved in a low temperature limiting case. We show that the lack of information about the position of the particle inside the potential well can be converted into useful work without resorting to any measurement. In the low temperature limit, we calculate the amount of work extractable from distinguishable particles, fermions and bosons.
- We construct efficient, unconditional non-malleable codes that are secure against tampering functions computed by small-depth circuits. For constant-depth circuits of polynomial size (i.e. $\mathsf{AC^0}$ tampering functions), our codes have codeword length $n = k^{1+o(1)}$ for a $k$-bit message. This is an exponential improvement of the previous best construction due to Chattopadhyay and Li (STOC 2017), which had codeword length $2^{O(\sqrt{k})}$. Our construction remains efficient for circuit depths as large as $\Theta(\log(n)/\log\log(n))$ (indeed, our codeword length remains $n\leq k^{1+\epsilon})$, and extending our result beyond this would require separating $\mathsf{P}$ from $\mathsf{NC^1}$. We obtain our codes via a new efficient non-malleable reduction from small-depth tampering to split-state tampering. A novel aspect of our work is the incorporation of techniques from unconditional derandomization into the framework of non-malleable reductions. In particular, a key ingredient in our analysis is a recent pseudorandom switching lemma of Trevisan and Xue (CCC 2013), a derandomization of the influential switching lemma from circuit complexity; the randomness-efficiency of this switching lemma translates into the rate-efficiency of our codes via our non-malleable reduction.
- Feb 22 2018 cond-mat.mes-hall arXiv:1802.07627v1Visibility of singlet-triplet qubit readout is reduced to almost zero in large magnetic field gradients due to relaxation processes. Here we present a new readout technique that is robust against relaxation and allows for measurement when previously studied methods fail. This technique maps the qubit onto spin states that are immune to relaxation using a spin dependent electron tunneling process between the qubit and the lead. We probe this readout's performance as a function of magnetic field gradient and applied magnetic field, and optimize the pulse applied to the qubit through experiment and simulation.
- Bhat characterizes the family of linear maps defined on $B(\mathcal{H})$ which preserve unitary conjugation. We generalize this idea and study the maps with a similar equivariance property on finite-dimensional matrix algebras. We show that the maps with equivariance property are significant to study $k$-positivity of linear maps defined on finite-dimensional matrix algebras. Choi showed that $n$-positivity is different from $(n-1)$-positivity for the linear maps defined on $n$ by $n$ matrix algebras. In this paper, we present a parametric family of linear maps $\Phi_{\alpha, \beta,n} : M_{n}(\mathbb{C}) \rightarrow M_{n^{2}}(\mathbb{C})$ and study the properties of positivity, completely positivity, decomposability etc. We determine values of parameters $\alpha$ and $\beta$ for which the family of maps $\Phi_{\alpha, \beta,n}$ is positive for any natural number $n \geq 3$. We focus on the case of $n=3,$ that is, $\Phi_{\alpha, \beta,3}$ and study the properties of $2$-positivity, completely positivity and decomposability. In particular, we give values of parameters $\alpha$ and $\beta$ for which the family of maps $\Phi_{\alpha, \beta,3}$ is $2$-positive and not completely positive.
- Feb 22 2018 quant-ph cond-mat.stat-mech arXiv:1802.07715v1We develop a theory to describe dynamics of a nonstationary open quantum system interacting with a hybrid environment, which includes high-frequency and low-frequency noise components. One part of the system-bath interaction is treated in a perturbative manner, whereas the other part is considered exactly. This approach allows us to derive a set of master equations where the relaxation rates are expressed as convolutions of the Bloch-Redfield and Marcus formulas. Our theory enables analysis of systems that have extremely small energy gaps in the presence of a realistic environment. We apply the theory to an example of the 16-qubit quantum annealing problem with dangling qubits and show qualitative agreement with experimental results.
- We show a global existence result of weak solutions for a class of generalized Surface Quasi-Geostrophic equation in the inviscid case. We also prove the global regularity of such solutions for the equation with slightly supercritical dissipation, which turns out to correspond to a logarithmically supercritical diffusion due to the singular nature of the velocity. Our last result is the eventual regularity in the supercritical cases for such weak solutions. The main idea in the proof of the existence part is based on suitable commutator estimates along with a careful cutting into low/high frequencies and inner/outer spatial scales to pass to the limit; while the proof of both the global regularity result and the eventual regularity for the supercritical diffusion are essentially based on the use of the so-called modulus of continuity method.
- Feb 22 2018 math.AP arXiv:1802.07661v1In this paper, we provide a sufficient condition of the energy equality for the incompressible Navier-Stokes equations in bounded domains.
- We give a maximal independent set (MIS) algorithm that runs in $O(\log \log \Delta)$ rounds in the congested clique model, where $\Delta$ is the maximum degree of the input graph. This improves upon the $O(\frac{\log(\Delta) \cdot \log \log \Delta}{\sqrt{\log n}} + \log \log \Delta )$ rounds algorithm of [Ghaffari, PODC '17], where $n$ is the number of vertices of the input graph. In the first stage of our algorithm, we simulate the first $O(\frac{n}{\text{poly} \log n})$ iterations of the sequential random order Greedy algorithm for MIS in the congested clique model in $O(\log \log \Delta)$ rounds. This thins out the input graph relatively quickly: After this stage, the maximum degree of the residual graph is poly-logarithmic. In the second stage, we run the MIS algorithm of [Ghaffari, PODC '17] on the residual graph, which completes in $O(\log \log \Delta)$ rounds on graphs of poly-logarithmic degree.
- The floating structure problem describes the interaction between surface water waves and a floating body, generally a boat or a wave energy converter. As shown by Lannes in [18] the equations for the fluid motion can be reduced to a set of two evolution equations on the surface elevation and the horizontal discharge. The presence of the object is accounted for by a constraint on the discharge under the object; the pressure exerted by the fluid on this object is then the Lagrange multiplier associated to this constraint. Our goal in this paper is to prove the well-posedness of this fluid-structure interaction problem in the shallow water approximation under the assumption that the flow is axisymmetric without swirl. We write the fluid equations as a quasilinear hyperbolic mixed initial boundary value problem and the solid equation as a second order ODE coupled to the fluid equations. Finally we prove the local in time well-posedness for this coupled problem, provided some compatibility conditions on the initial data are satisfied.
- We study the homotopy category $\mathrm{hmf}(R,W)$ of matrix factorizations of non-zero elements $W\in R^\times$, where $R$ is an elementary divisor domain. When $R$ has prime elements and $W$ factors into a square-free element $W_0$ and a finite product of primes of multiplicity greater than one and which do not divide $W_0$, we show that $\mathrm{hmf}(R,W)$ is triangle-equivalent with an orthogonal sum of the triangulated categories of singularities $\mathrm{D}_{\mathrm sing}(A_n(p))$ of the local Artinian rings $A_n(p)=R/\langle p^n\rangle$, where $p$ runs over the prime divisors of $W$ of order $n\geq 2$. This result holds even when $R$ is not Noetherian. The triangulated categories $\mathrm{D}_{\mathrm sing}(A_n(p))$ are Krull-Schmidt and we describe them explicitly. We also study the cocycle category $\mathrm{zmf}(R,W)$, showing that it is additively generated by elementary matrix factorizations. Finally, we discuss a few classes of examples.
- Feb 22 2018 math.AP arXiv:1802.07607v1We study the Plateau problem with a lower dimensional obstacle in $\mathbb{R}^n$. Intuitively, in $\mathbb{R}^3$ this corresponds to a soap film (spanning a given contour) that is pushed from below by a "vertical" 2D half-space (or some smooth deformation of it). We establish almost optimal $C^{1,1/2-}$ estimates for the solutions near points on the free boundary of the contact set in dimension $n = 3$. The $C^{1,1/2-}$ estimates follow from an $\varepsilon$-regularity result for minimal surfaces with thin obstacles (valid in any dimension $n$) in the spirit of the De Giorgi's improvement of flatness. To prove it, we follow Savin's small perturbations method. A nontrivial difficulty in using Savin's approach with minimal surfaces with thin obstacles is that near a typical contact point the solution consists of two smooth surfaces that intersect transversally, and hence it is not very flat at small scales. Via a new "dichotomy approach" based on barrier arguments we are able to overcome this difficulty and prove the desired result.
- We introduce the notion of a \emphcomplex cell, a complexification of the cells/cylinders used in real tame geometry. Complex cells are equipped with a natural notion of holomorphic extension, and the hyperbolic geometry of a cell within its extension provides the class of complex cells with a rich geometric function theory absent in the real case. We use this to prove a complex analog of the cellular decomposition theorem of real tame geometry. In the algebraic case we show that the complexity of such decompositions depends polynomially on the degrees of the equations involved. Using this theory, we sharpen the Yomdin-Gromov algebraic lemma on $C^r$-smooth parametrizations of semialgebraic sets: we show that the number of $C^r$ charts can be taken to be polynomial in the smoothness order $r$ and in the complexity of the set. The algebraic lemma was initially invented in the work of Yomdin and Gromov to produce estimates for the topological entropy of $C^\infty$ maps. Combined with work of Burguet, Liao and Yang, our refined version establishes an optimal sharpening of these estimates for \emphanalytic maps, in the form of tight bounds on the tail entropy and volume growth. This settles a conjecture of Yomdin who proved the same result in dimension two in 1991. A self-contained proof of these estimates using the refined algebraic lemma is given in an appendix by Yomdin. The algebraic lemma has more recently been used in the study of rational points on algebraic and transcendental varieties. We use the theory of complex cells in these two directions. In the algebraic context we prove a sharpening of a result of Heath-Brown on interpolating rational points in algebraic varieties. In the transcendental context we prove an interpolation result for (unrestricted) logarithmic images of subanalytic sets.
- Feb 22 2018 physics.optics quant-ph arXiv:1802.07573v1Secure communication is of paramount importance in modern society. Asymmetric cryptography methods such as the widely used RSA method allow secure exchange of information between parties who have not shared secret keys. However, the existing asymmetric cryptographic schemes rely on unproven mathematical assumptions for security. Further, the digital keys used in their implementation are susceptible to copying that might remain unnoticed. Here we introduce a secure communication method that overcomes these two limitations by employing Physical Unclonable Keys (PUKs). Using optical PUKs realized in opaque scattering materials and employing off-the-shelf equipment, we transmit messages in an error-corrected way. Information is transmitted as patterned wavefronts of few-photon wavepackets which can be successfully decrypted only with the receiver's PUK. The security of PUK-Enabled Asymmetric Communication (PEAC) is not based on any stored secret but on the hardness of distinguishing between different few-photon wavefronts.
- It is proved that the classical Laplace transform is a continuous valuation which is positively GL$(n)$ covariant and logarithmic translation covariant. Conversely, these properties turn out to be sufficient to characterize this transform.
- Feb 22 2018 math.MG arXiv:1802.07561v1For $1 \leq p < \infty$, Ludwig, Haberl and Parapatits classified $L_p$ Minkowski valuations intertwining the special linear group with additional conditions such as homogeneity and continuity. In this paper,a complete classification of $L_p$ Minkowski valuations intertwining the special linear group on polytopes without any additional conditions is established for $p \geq 1$ including $p = \infty$. For $n=3$ and $p=1$, there exist valuations not mentioned before.
- Feb 22 2018 math.MG arXiv:1802.07559v1In this article, a classification of continuous, linearly intertwining, symmetric $L_p$-Blaschke ($p>1$) valuations is established as an extension of Haberl's work on Blaschke valuations. More precisely, we show that for dimensions $n\geq 3$, the only continuous, linearly intertwining, normalized symmetric $L_p$-Blaschke valuation is the normalized $L_p$-curvature image operator, while for dimension $n = 2 $, a rotated normalized $L_p$-curvature image operator is an only additional one. One of the advantages of our approach is that we deal with normalized symmetric $L_p$-Blaschke valuations, which makes it possible to handle the case $p=n$. The cases where $p \not =n$ are also discussed by studying the relations between symmetric $L_p$-Blaschke valuations and normalized ones.
- We consider some of Jonathan and Peter Borweins' contributions to the high-precision computation of $\pi$ and the elementary functions, with particular reference to their book "Pi and the AGM" (Wiley, 1987). Here "AGM" is the arithmetic-geometric mean of Gauss and Legendre. Because the AGM converges quadratically, it can be combined with fast multiplication algorithms to give fast algorithms for the $n$-bit computation of $\pi$, and more generally the elementary functions. These algorithms run in almost linear time $O(M(n)\log n)$, where $M(n)$ is the time for $n$-bit multiplication. We outline some of the results and algorithms given in Pi and the AGM, and present some related (but new) results. In particular, we improve the published error bounds for some quadratically and quartically convergent algorithms for $\pi$, such as the Gauss-Legendre algorithm. We show that an iteration of the Borwein-Borwein quartic algorithm for $\pi$ is equivalent to two iterations of the Gauss-Legendre quadratic algorithm for $\pi$, in the sense that they produce exactly the same sequence of approximations to $\pi$ if performed using exact arithmetic.
- Feb 22 2018 math.AP arXiv:1802.07542v1In this paper we prove that any $C^{1,\alpha}$ curve in $\mathbb{R}^n$, with $\alpha \in (\frac{1}{2},1]$, is the solution of the gradient flow equation for some $C^1$ convex function $f$, if and only if it is strongly self-contracted.
- Feb 22 2018 math.SP arXiv:1802.07524v1We consider eigenvalues of the Dirichlet-to-Neumann operator for Laplacian in the domain (or manifold) with edges and establish the asymptotics of the eigenvalue counting function \beginequation* \mathsfN(\lambda)= \kappa_0\lambda^d +O(\lambda^d-1)\qquad \textas\ \ \lambda\to+∞, \endequation* where $d$ is dimension of the boundary. Further, in certain cases we establish two-term asymptotics \beginequation* \mathsfN(\lambda)= \kappa_0\lambda^d+\kappa_1\lambda^d-1+o(\lambda^d-1)\qquad \textas\ \ \lambda\to+∞. \endequation* We also establish improved asymptotics for Riesz means.
- Feb 22 2018 quant-ph arXiv:1802.07521v1We propose a Global-Local optimization algorithm for quantum control that combines standard local search methodologies with evolutionary algorithms. This allows us to find faster solutions to a set of problems relating to ultracold control of Bose-Einstein condensates.
- Feb 22 2018 quant-ph arXiv:1802.07509v1We propose a quantum optimal control algorithm that performs a gradient descent in a reduced basis named GRadient Optimization Using Parametrization (GROUP). We compare this optimization algorithm to the other state-of-the-art algorithms in quantum control namely, Gradient-Ascent Pulse Engieering (GRAPE), Krotov's method and Nelder-Mead using Chopped Random Basis (CRAB). We find that GROUP converges much faster than Nelder-Mead with CRAB and achieves better results than GRAPE and Krotov's method on the control problem presented here.
- We consider the set $\mathrm{MC}_d$ of monic centered polynomials of one complex variable with degree $d \geq 2$, and study the map $\widehat{\Phi}_d:\mathrm{MC}_d\to \widetilde{\Lambda}_d \subset \mathbb{C}^d / \mathfrak{S}_d$ which maps each $f \in \mathrm{MC}_d$ to its unordered collection of fixed-point multipliers. We give an explicit formula for counting the number of elements of each fiber $\widehat{\Phi}_d^{-1}\left(\bar{\lambda}\right)$ for every $\bar{\lambda} \in \widetilde{\Lambda}_d$. This formula contains no induction process, and is a drastic improvement of our previous result which gave a rather long algorithm with some induction processes for counting the number of elements of each fiber.
- Eigenvalues of higher multiplicity of the Nonlinear Fourier Transform (NFT) are considered for information transmission over fiber optic channels. The effects of phase, time or frequency shifts on this generalized NFT are derived, as well as an expression for the signal energy. These relations are used to design transmit signals and numerical algorithms to compute the direct and inverse NFTs, and to numerically demonstrate communication using a soliton with one double eigenvalue.
- We prove an abstract compactness result for gradient flow lines of a non-local unregularized gradient flow equation on a scale Hilbert space. This is the first step towards Floer theory on scale Hilbert spaces.
- Feb 22 2018 cs.LG arXiv:1802.07427v1In the large-scale multiclass setting, assigning labels often consists of answering multiple questions to drill down through a hierarchy of classes. Here, the labor required per annotation scales with the number of questions asked. We propose active learning with partial feedback. In this setup, the learner asks the annotator if a chosen example belongs to a (possibly composite) chosen class. The answer eliminates some classes, leaving the agent with a partial label. Success requires (i) a sampling strategy to choose (example, class) pairs, and (ii) learning from partial labels. Experiments on the TinyImageNet dataset demonstrate that our most effective method achieves a 21% relative improvement in accuracy for a 200k binary question budget. Experiments on the TinyImageNet dataset demonstrate that our most effective method achieves a 26% relative improvement (8.1% absolute) in top1 classification accuracy for a 250k (or 30%) binary question budget, compared to a naive baseline. Our work may also impact traditional data annotation. For example, our best method fully annotates TinyImageNet with only 482k (with EDC though, ERC is 491) binary questions (vs 827k for naive method).
- Let V be a linear subspace of M_n(C) which contains the identity matrix and is stable under Hermitian transpose. A "quantum k-clique" for V is a rank k orthogonal projection P in M_n(C) for which dim(PVP) = k^2, and a "quantum k-anticlique" is a rank k orthogonal projection for which dim(PVP) = 1. We give upper and lower bounds both for the largest dimension of V which would ensure the existence of a quantum k-anticlique, and for the smallest dimension of V which would ensure the existence of a quantum k-clique.
- Feb 22 2018 cond-mat.mes-hall arXiv:1802.07323v1We theoretically investigate a near-quantum-limited parametric amplifier based on the non-linear dynamics of quasiparticles flowing through a superconducting-insulator-superconducting junction. Photon-assisted tunneling, resulting from the combination of dc- and ac-voltage bias, gives rise to a strong parametric interaction for the electromagnetic modes reflected by the junction coupled to a transmission line. We show phase-sensitive and phase-preserving amplification, together with single- and two-mode squeezing. For an Aluminum junction pumped at 8 GHz, we predict narrow-band phase-sensitive amplification of microwaves signals to more than 30 dB, and broadband phase-preserving amplification larger than 15 dB over a 4 GHz band. We also predict single-mode squeezing reaching -20 dB and two-mode squeezing of -14 dB over a 4 GHz band. A key feature is that the device parameters can be tuned in-situ by the applied dc- and ac-voltage biases.
- Feb 22 2018 math.CO arXiv:1802.07310v1In this note, we provide a simple derivation of expressions for the restricted partition function and its polynomial part. Our proof relies on elementary algebra on rational functions and a lemma that expresses the polynomial part as an average of the partition function.
- We establish connections between the problem of learning a two-layers neural network with good generalization error and tensor decomposition. We consider a model with input $\boldsymbol x \in \mathbb R^d$, $r$ hidden units with weights $\{\boldsymbol w_i\}_{1\le i \le r}$ and output $y\in \mathbb R$, i.e., $y=\sum_{i=1}^r \sigma(\left \langle \boldsymbol x, \boldsymbol w_i\right \rangle)$, where $\sigma$ denotes the activation function. First, we show that, if we cannot learn the weights $\{\boldsymbol w_i\}_{1\le i \le r}$ accurately, then the neural network does not generalize well. More specifically, the generalization error is close to that of a trivial predictor with access only to the norm of the input. This result holds for any activation function, and it requires that the weights are roughly isotropic and the input distribution is Gaussian, which is a typical assumption in the theoretical literature. Then, we show that the problem of learning the weights $\{\boldsymbol w_i\}_{1\le i \le r}$ is at least as hard as the problem of tensor decomposition. This result holds for any input distribution and assumes that the activation function is a polynomial whose degree is related to the order of the tensor to be decomposed. By putting everything together, we prove that learning a two-layers neural network that generalizes well is at least as hard as tensor decomposition. It has been observed that neural network models with more parameters than training samples often generalize well, even if the problem is highly underdetermined. This means that the learning algorithm does not estimate the weights accurately and yet is able to yield a good generalization error. This paper shows that such a phenomenon cannot occur when the input distribution is Gaussian and the weights are roughly isotropic. We also provide numerical evidence supporting our theoretical findings.
- Feb 22 2018 physics.comp-ph quant-ph arXiv:1802.07259v1We combine the multigrid (MG) method with state-of-the-art concepts from the variational formulation of the numerical renormalization group. The resulting MG renormalization (MGR) method is a natural generalization of the MG method for solving partial differential equations. When the solution on a grid of $N$ points is sought, our MGR method has a computational cost scaling as $\mathcal{O}(\log(N))$, as opposed to $\mathcal{O}(N)$ for the best standard MG method. Therefore MGR can exponentially speed up standard MG computations. To illustrate our method, we develop a novel algorithm for the ground state computation of the nonlinear Schrödinger equation. Our algorithm acts variationally on tensor products and updates the tensors one after another by solving a local nonlinear optimization problem. We compare several different methods for the nonlinear tensor update and find that the Newton method is the most efficient as well as precise. The combination of MGR with our nonlinear ground state algorithm produces accurate results for the nonlinear Schrödinger equation on $N = 10^{18}$ grid points in three spatial dimensions.
- Feb 22 2018 hep-ph arXiv:1802.07720v1
- Feb 22 2018 physics.app-ph arXiv:1802.07719v1
- Feb 22 2018 hep-lat arXiv:1802.07718v1
- Feb 22 2018 cond-mat.quant-gas arXiv:1802.07717v1
- Feb 22 2018 math.CO arXiv:1802.07713v1
- Feb 22 2018 q-bio.TO arXiv:1802.07338v1
- Feb 22 2018 cs.GR arXiv:1802.07710v1
- Feb 22 2018 astro-ph.CO arXiv:1802.07708v1
- Feb 22 2018 math.DS arXiv:1802.07706v1
- Feb 22 2018 math.GT arXiv:1802.07704v1
- Feb 22 2018 quant-ph cond-mat.stat-mech arXiv:1802.07703v1