Top arXiv papers

  • PDF
    We study thermal states of strongly interacting quantum spin chains and prove that they can be efficiently represented as convex combinations of matrix product states. Apart from revealing new features of the entanglement structure of Gibbs states, such as an area law for the entanglement of formation, our results provide a theoretical justification for the use of White's algorithm of minimally entangled typical thermal states. Furthermore, we shed new light on time-dependent matrix product state algorithms which yield hydrodynamical descriptions of the underlying dynamics.
  • PDF
    We extend quantum Stein's lemma in asymmetric quantum hypothesis testing to composite null and alternative hypotheses. As our main result, we show that the asymptotic error exponent for testing convex combinations of quantum states $\rho^{\otimes n}$ against convex combinations of quantum states $\sigma^{\otimes n}$ is given by a regularized quantum relative entropy distance formula. We prove that in general such a regularization is needed, but we also discuss various settings where our formula as well as extensions thereof become single-letter. This includes a novel operational interpretation of the relative entropy of coherence in terms of hypothesis testing. For our proof, we start from the composite Stein's lemma for classical probability distributions and lift the result to the non-commutative setting using only elementary properties of quantum entropy. Finally, our findings also imply an improved Markov-type lower bound on the quantum conditional mutual information in terms of the regularized quantum relative entropy, featuring an explicit and universal recovery map.
  • PDF
    We describe an efficient quantum algorithm for the quantum Schur transform. The Schur transform is an operation on a quantum computer that maps the standard computational basis to a basis composed of irreducible representations of the unitary and symmetric groups. We simplify and extend the algorithm of Bacon, Chuang, and Harrow, and provide a new practical construction as well as sharp theoretical and practical analyses. Our algorithm decomposes the Schur transform on $n$ qubits into $O(n^4 \log(n/\epsilon))$ operators in the Clifford+T fault-tolerant gate set. We extend our qubit algorithm to decompose the Schur transform on $n$ qudits of dimension $d$ into $O(d^{1+p} n^{2d+1} \log^p(dn/\epsilon))$ primitive operators from any universal gate set, for $p \approx 3.97$.
  • PDF
    Purification is a powerful technique in quantum physics whereby a mixed quantum state is extended to a pure state on a larger system. This process is not unique, and in systems composed of many degrees of freedom, one natural purification is the one with minimal entanglement. Here we study the entropy of the minimally entangled purification, called the entanglement of purification, in three model systems: an Ising spin chain, conformal field theories holographically dual to Einstein gravity, and random stabilizer tensor networks. We conjecture values for the entanglement of purification in all these models, and we support our conjectures with a variety of numerical and analytical results. We find that such minimally entangled purifications have a number of applications, from enhancing entanglement-based tensor network methods for describing mixed states to elucidating novel aspects of the emergence of geometry from entanglement in the AdS/CFT correspondence.
  • PDF
    Quantum Markov semigroups characterize the time evolution of an important class of open quantum systems. Studying convergence properties of such a semigroup, and determining concentration properties of its invariant state, have been the focus of much research. Quantum versions of functional inequalities (like the modified logarithmic Sobolev and Poincaré inequalities) and the so-called transportation cost inequalities, have proved to be essential for this purpose. Classical functional and transportation cost inequalities are seen to arise from a single geometric inequality, called the Ricci lower bound, via an inequality which interpolates between them. The latter is called the HWI-inequality, where the letters I, W and H are, respectively, acronyms for the Fisher information (arising in the modified logarithmic Sobolev inequality), the so-called Wasserstein distance (arising in the transportation cost inequality) and the relative entropy (or Boltzmann H function) arising in both. Hence, classically, all the above inequalities and the implications between them form a remarkable picture which relates elements from diverse mathematical fields, such as Riemannian geometry, information theory, optimal transport theory, Markov processes, concentration of measure, and convexity theory. Here we consider a quantum version of the Ricci lower bound introduced by Carlen and Maas, and prove that it implies a quantum HWI inequality from which the quantum functional and transportation cost inequalities follow. Our results hence establish that the unifying picture of the classical setting carries over to the quantum one.
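    For reference, the classical HWI inequality of Otto and Villani, whose quantum analogue is established here, can be stated as follows (a standard form, with $\kappa$ the Ricci lower bound): $$H(\mu\,|\,\nu) \;\le\; W_2(\mu,\nu)\sqrt{I(\mu\,|\,\nu)} \;-\; \frac{\kappa}{2}\,W_2(\mu,\nu)^2.$$ Maximizing the right-hand side over $W_2$ shows that, for $\kappa>0$, HWI implies the logarithmic Sobolev inequality $H(\mu\,|\,\nu)\le I(\mu\,|\,\nu)/(2\kappa)$, illustrating the interpolation described above.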
  • PDF
    Entanglement distribution is a prerequisite for several important quantum information processing and computing tasks, such as quantum teleportation, quantum key distribution, and distributed quantum computing. In this work, we focus on two-dimensional quantum networks based on optical quantum technologies using dual-rail photonic qubits. We lay out a quantum network architecture for entanglement distribution between distant parties, under the technological constraint that quantum repeaters equipped with quantum memories are not currently widely available. We also discuss several quantum network topologies for building a fail-safe quantum internet. We use percolation theory to provide figures of merit on the loss parameter of the optical medium for networks with bow-tie lattice and Archimedean lattice topologies. These figures of merit allow for comparisons of the robustness of different networks against intermittent failures of their nodes and against intermittent photon loss, which is an important consideration in the realization of the quantum internet.
  • PDF
    Simulating quantum contextuality with classical systems requires memory. A fundamental yet open question is what minimum memory is needed and, therefore, in what precise sense quantum systems outperform classical ones. Here we make rigorous the notion of classically simulating quantum state-independent contextuality (QSIC) in the case of a single quantum system submitted to an infinite sequence of measurements randomly chosen from a finite QSIC set. We obtain the minimum memory classical systems need to simulate arbitrary QSIC sets under the assumption that the simulation should not contain any oracular information. In particular, we show that, while classically simulating two qubits tested with the Peres-Mermin set requires $\log_2 24 \approx 4.585$ bits, simulating a single qutrit tested with the Yu-Oh set requires at least $5.740$ bits.
  • PDF
    To analyze non-Markovianity of tripartite quantum states from a resource-theoretic viewpoint, we introduce a class of quantum operations performed by three distant parties, and investigate the operational resource theory induced by it. A tripartite state is a free state if and only if it is a quantum Markov chain. We prove monotonicity of functions such as the conditional mutual information, intrinsic information, squashed entanglement, a generalization of entanglement of purification, and the relative entropy of recovery. We identify bound sets that have a clear correspondence to the monotone functions, and analyze their inclusion relations. We introduce a task of "non-Markovianity dilution", and prove that the optimal rate for the task, namely the "non-Markovianity cost", is bounded from above by the regularized entanglement of purification. We also propose a classical resource theory of non-Markovianity.
  • PDF
    The quantum autoencoder is a recent paradigm in the field of quantum machine learning, which may enable an enhanced use of resources in quantum technologies. To this end, quantum neural networks with fewer nodes in the inner than in the outer layers were considered. Here, we propose a useful connection between approximate quantum adders and quantum autoencoders. Specifically, this link allows us to employ optimized approximate quantum adders, obtained with genetic algorithms, for the implementation of quantum autoencoders for a variety of initial states. Furthermore, we can also directly optimize the quantum autoencoders via genetic algorithms. Our approach opens a different path for the design of quantum autoencoders in controllable quantum platforms.
  • PDF
    High-dimensional encoding of quantum information provides a promising method of transcending current limitations in quantum communication. One of the central challenges in the pursuit of such an approach is the certification of high-dimensional entanglement. In particular, it is desirable to do so without resorting to inefficient full state tomography. Here, we show how carefully constructed measurements in two or more bases can be used to efficiently certify high-dimensional states and their entanglement under realistic conditions. We considerably improve upon existing criteria and introduce new entanglement dimensionality witnesses which we put to the test for photons entangled in their orbital angular momentum. In our experimental setup, we are able to verify 8-dimensional entanglement for 11-dimensional subspaces, at present the highest amount certified without assumptions on the state itself.
  • PDF
    The notion of potential output purity of a completely positive map is introduced as a generalization of the regularized output purity. An upper bound is derived for this quantity, and for several classes of maps (including CQ, QC and Hadamard channels) it is shown that potential purity does not exceed the standard output purity. As an application the potential purity is used to bound the logarithmic Sobolev constant of a product of depolarizing channel semigroups.
  • PDF
    We apply advanced methods of control theory to open quantum systems and determine finite-time processes which are optimal with respect to thermodynamic performance. General properties and necessary conditions characterizing optimal drivings are derived, obtaining bang-bang type solutions corresponding to control strategies switching between adiabatic and isothermal transformations. A direct application of these results is the maximization of the work produced by a generic quantum heat engine, where we show that the maximum power is directly linked to a particular conserved quantity naturally emerging from the control problem. Finally, we apply our general approach to the specific case of a two-level system, which can be put in contact with two different baths at fixed temperatures, identifying the processes which minimize heat dissipation. Moreover, we explicitly solve the optimization problem for a cyclic two-level heat engine driven beyond the linear-response regime, determining the corresponding optimal cycle, the maximum power, and the efficiency at maximum power.
  • PDF
    Covert communication conceals the transmission of the message from an attentive adversary. Recent work on the limits of covert communication in additive white Gaussian noise (AWGN) channels has demonstrated that a covert transmitter (Alice) can reliably transmit a maximum of $\mathcal{O}\left(\sqrt{n}\right)$ bits to a covert receiver (Bob) without being detected by an adversary (Warden Willie) in $n$ channel uses. This paper focuses on the scenario where other friendly nodes distributed according to a two-dimensional Poisson point process with density $m$ are present in the environment. We propose a strategy where the friendly node closest to the adversary, without close coordination with Alice, produces artificial noise. We show that this method allows Alice to reliably and covertly send $\mathcal{O}(\min\{{n,m^{\gamma/2}\sqrt{n}}\})$ bits to Bob in $n$ channel uses, where $\gamma$ is the path-loss exponent. Moreover, we also consider a setting where there are $N_{\mathrm{w}}$ collaborating adversaries uniformly and randomly located in the environment and show that in $n$ channel uses, Alice can reliably and covertly send $\mathcal{O}\left(\min\left\{n,\frac{m^{\gamma/2} \sqrt{n}}{N_{\mathrm{w}}^{\gamma}}\right\}\right)$ bits to Bob when $\gamma >2$, and $\mathcal{O}\left(\min\left\{n,\frac{m \sqrt{n}}{N_{\mathrm{w}}^{2}\log^2 {N_{\mathrm{w}}}}\right\}\right)$ when $\gamma = 2$. Conversely, under mild restrictions on the communication strategy, we demonstrate that no higher covert throughput is possible for $\gamma>2$.
  • PDF
    We study the least squares regression problem $\min_{\Theta \in \mathcal{S}_{\odot D,R}} \|A\Theta - b\|_2$, where $\mathcal{S}_{\odot D,R}$ is the set of $\Theta$ for which $\Theta = \sum_{r=1}^{R} \theta_1^{(r)} \circ \cdots \circ \theta_D^{(r)}$ for vectors $\theta_d^{(r)} \in \mathbb{R}^{p_d}$ for all $r \in [R]$ and $d \in [D]$, and $\circ$ denotes the outer product of vectors. That is, $\Theta$ is a low-dimensional, low-rank tensor. This is motivated by the fact that the number of parameters in $\Theta$ is only $R \cdot \sum_{d=1}^D p_d$, which is significantly smaller than the $\prod_{d=1}^{D} p_d$ parameters in ordinary least squares regression. We consider the above CP decomposition model of tensors $\Theta$, as well as the Tucker decomposition. For both models we show how to apply data dimensionality reduction techniques based on sparse random projections $\Phi \in \mathbb{R}^{m \times n}$, with $m \ll n$, to reduce the problem to a much smaller problem $\min_{\Theta} \|\Phi A \Theta - \Phi b\|_2$, for which if $\Theta'$ is a near-optimum to the smaller problem, then it is also a near-optimum to the original problem. We obtain significantly smaller dimension and sparsity in $\Phi$ than is possible for ordinary least squares regression, and we also provide a number of numerical simulations supporting our theory.
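    As a hedged illustration of the generic sketch-and-solve step described above (not the paper's tensor-structured construction), the following NumPy sketch applies a CountSketch-style sparse projection $\Phi$ with one nonzero per column and then solves the reduced problem:

```python
import numpy as np

def sparse_sketch_solve(A, b, m, seed=0):
    """Sketch-and-solve least squares with a sparse embedding.

    Each of the n rows of A is hashed to one of m buckets with a random
    sign, so applying Phi costs O(nnz(A)); m must be chosen large enough
    for the desired accuracy (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    buckets = rng.integers(0, m, size=n)        # hash h: [n] -> [m]
    signs = rng.choice([-1.0, 1.0], size=n)     # random signs s
    PA = np.zeros((m, A.shape[1]))
    Pb = np.zeros(m)
    np.add.at(PA, buckets, signs[:, None] * A)  # Phi @ A
    np.add.at(Pb, buckets, signs * b)           # Phi @ b
    theta, *_ = np.linalg.lstsq(PA, Pb, rcond=None)
    return theta   # near-optimal for min ||A theta - b|| when m is large enough
```

    The paper's contribution is showing that the low-rank tensor structure of $\Theta$ permits a significantly smaller and sparser $\Phi$ than this generic dense-parameter case.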
  • PDF
    Black box variational inference (BBVI) with reparameterization gradients triggered the exploration of divergence measures other than the Kullback-Leibler (KL) divergence, such as alpha divergences. These divergences can be tuned to be more mass-covering (preventing overfitting in complex models), but are also often harder to optimize using Monte Carlo gradients. In this paper, we view BBVI with generalized divergences as a form of biased importance sampling. The choice of divergence determines a bias-variance tradeoff between the tightness of the bound (low bias) and the variance of its gradient estimators. Drawing on variational perturbation theory of statistical physics, we use these insights to construct a new variational bound which is tighter than the KL bound and more mass-covering. Compared to alpha-divergences, its reparameterization gradients have a lower variance. We show in several experiments on Gaussian Processes and Variational Autoencoders that the resulting posterior covariances are closer to the true posterior and lead to higher likelihoods on held-out data.
  • PDF
    Can a large system be fully characterized using its subsystems via inductive reasoning? Is it possible to completely reduce the behavior of a complex system to the behavior of its simplest "atoms"? In the following paper we answer these questions in the negative for a specific class of systems and measurements. We begin with a simple two-particle example, where strong correlations arise between two apparently empty boxes. This leads to new surprising effects within atomic and electromagnetic systems. A general construction based on pre- and post-selected ensembles is then suggested, where the N-body correlation can be genuinely perceived as a global property, as long as one is limited to performing a small set of measurements which we term "strictly local". We conclude that within time-symmetric quantum mechanics and under certain boundary conditions, high-order correlations can determine low-order ones, but not vice versa. Moreover, the latter seem to provide no information at all regarding the former. This supports a top-down structure in many-body quantum mechanics.
  • PDF
    Dependent rounding is a popular technique in designing approximation algorithms. It allows us to randomly round a fractional solution $x$ to an integral vector $X$ such that $E[X] = x$, all $X_i$'s are ("cylindrically") negatively correlated, and the cardinality constraint $\sum_i X_i = \sum_i x_i$ holds with probability one. One can also essentially replace the cardinality constraint by a general linear constraint $\sum_i a_i X_i = \sum_i a_i x_i$ if the $a_i$'s are non-negative. However, in certain problems, the coefficients $a_i$ may have mixed signs; furthermore, some applications require correlation inequalities that are much more general than negative correlation. In this paper, we propose a generalized dependent rounding technique, called symmetric randomized dependent rounding, which can round $x$ to an almost-integer vector so that any given linear constraint is satisfied with probability one and such that all the variables are nearly independent. We give two illustrative examples of this technique: a randomized algorithm for the knapsack center problem with new average approximation guarantees and an improved bi-criteria approximation ratio for the knapsack median problem.
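    For context, a hedged sketch of the classical cardinality-preserving dependent rounding step that the paper generalizes (the symmetric randomized variant itself is more involved):

```python
import random

def dependent_rounding(x, eps=1e-9):
    """Round fractional x in [0,1]^n to {0,1}^n, preserving sum(x)
    exactly and E[X_i] = x_i for every coordinate."""
    x = list(x)
    while True:
        frac = [i for i, v in enumerate(x) if eps < v < 1 - eps]
        if len(frac) < 2:
            break
        i, j = frac[0], frac[1]
        alpha = min(1 - x[i], x[j])     # max mass movable from j to i
        beta = min(x[i], 1 - x[j])      # max mass movable from i to j
        if random.random() < beta / (alpha + beta):
            x[i], x[j] = x[i] + alpha, x[j] - alpha   # E[change] = 0
        else:
            x[i], x[j] = x[i] - beta, x[j] + beta
    return [round(v) for v in x]
```

    Each step makes at least one coordinate integral while leaving all marginals exact, which is what yields the negative-correlation properties mentioned above.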
  • PDF
    We consider various curious features of general relativity, and relativistic field theory, in two spacetime dimensions. In particular, we discuss: the vanishing of the Einstein tensor; the failure of an initial-value formulation for vacuum spacetimes; the status of singularity theorems; the non-existence of a Newtonian limit; the status of the cosmological constant; and the character of matter fields, including perfect fluids and electromagnetic fields. We conclude with a discussion of what constrains our understanding of physics in different dimensions.
  • PDF
    We present methodology for using dynamic evaluation to improve neural sequence models. Models are adapted to recent history via a gradient descent based mechanism, allowing them to assign higher probabilities to re-occurring sequential patterns. Dynamic evaluation is demonstrated to compare favourably with existing adaptation approaches for language modelling. We apply dynamic evaluation to improve the state of the art word-level perplexities on the Penn Treebank and WikiText-2 datasets to 51.1 and 44.3 respectively, and the state of the art character-level cross-entropy on the Hutter prize dataset to 1.17 bits/character.
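    A minimal sketch of the mechanism, hedged: parameters are updated by plain gradient descent on each segment immediately after it has been scored, so later segments benefit from recent history (the `model` interface below is hypothetical; the paper uses a more refined update rule):

```python
def dynamic_evaluation(model, segments, lr=1e-4):
    """Evaluate a sequence model while adapting it to recent history."""
    total_nll, total_tokens = 0.0, 0
    for seg in segments:
        nll = model.negative_log_likelihood(seg)  # score before adapting
        total_nll += nll
        total_tokens += len(seg)
        grads = model.gradients(nll)              # gradient on the scored segment
        for p, g in zip(model.parameters, grads):
            p -= lr * g                           # adapt toward recent patterns
    return total_nll / total_tokens               # basis for adaptive perplexity
```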
  • PDF
    Ethnicity is a key demographic attribute of human beings; it plays a vital role in automatic facial recognition and has extensive real-world applications such as human-computer interaction (HCI), demographic-based classification, biometric-based recognition, and security and defense, to name a few. In this paper we present a novel approach for extracting ethnicity from facial images. The proposed method makes use of a pre-trained Convolutional Neural Network (CNN) to extract the features, and then a Support Vector Machine (SVM) with a linear kernel is used as a classifier. This technique uses translation-invariant hierarchical features learned by the network, in contrast to previous works, which use hand-crafted features such as Local Binary Patterns (LBP) or Gabor filters. Thorough experiments are presented on ten different facial databases, which strongly suggest that our approach is robust to different expressions and illumination conditions. Here we consider ethnicity classification as a three-class problem comprising Asian, African-American, and Caucasian. Average classification accuracy over all databases is 98.28%, 99.66% and 99.05% for Asian, African-American and Caucasian respectively.
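    In sketch form, the pipeline above is feature extraction with a frozen pre-trained CNN followed by a linear-kernel SVM; a hedged scikit-learn illustration, where `cnn_features` is a hypothetical stand-in for the chosen pre-trained network:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

def train_ethnicity_classifier(images, labels, cnn_features):
    """cnn_features: callable mapping an image to a fixed-length feature
    vector from a frozen, pre-trained CNN (assumed, not specified here)."""
    X = np.stack([cnn_features(img) for img in images])
    clf = make_pipeline(StandardScaler(), LinearSVC())  # linear-kernel SVM
    clf.fit(X, labels)  # three classes: Asian, African-American, Caucasian
    return clf
```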
  • PDF
    Using a representation of the discrete Hilbert transform in terms of martingales arising from Doob $h$-processes, we prove that its $l^p$-norm, $1<p<\infty$, is bounded above by the $L^p$-norm of the continuous Hilbert transform. Together with the already known lower bound, this resolves the long-standing conjecture that the norms of these operators are equal.
  • PDF
    We prove the uniqueness and finite-time existence of bounded-vorticity solutions to the 2D Euler equations having velocity growing slower than the square root of the distance from the origin, obtaining global existence for more slowly growing velocity fields. We also establish continuous dependence on initial data.
  • PDF
    We present an approach to automate the process of discovering optimization methods, with a focus on deep learning architectures. We train a Recurrent Neural Network controller to generate a string in a domain specific language that describes a mathematical update equation based on a list of primitive functions, such as the gradient, running average of the gradient, etc. The controller is trained with Reinforcement Learning to maximize the performance of a model after a few epochs. On CIFAR-10, our method discovers several update rules that are better than many commonly used optimizers, such as Adam, RMSProp, or SGD with and without Momentum on a ConvNet model. We introduce two new optimizers, named PowerSign and AddSign, which we show transfer well and improve training on a variety of different tasks and architectures, including ImageNet classification and Google's neural machine translation system.
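    For concreteness, a hedged NumPy sketch of the two discovered update rules as reported, with $m$ a running average of gradients (the defaults here are assumptions, not the tuned values):

```python
import numpy as np

def powersign(w, g, m, lr=0.01, alpha=np.e, beta=0.9):
    """PowerSign: scale the step up when sign(g) agrees with sign(m)."""
    m = beta * m + (1 - beta) * g
    w = w - lr * np.power(alpha, np.sign(g) * np.sign(m)) * g
    return w, m

def addsign(w, g, m, lr=0.01, beta=0.9):
    """AddSign: additive version of the same sign-agreement heuristic."""
    m = beta * m + (1 - beta) * g
    w = w - lr * (1.0 + np.sign(g) * np.sign(m)) * g
    return w, m
```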
  • PDF
    Let ${\mathfrak M}$ be a closed, orientable, hyperbolic 3-orbifold whose singular set is a link, and such that $\pi_1({\mathfrak M})$ contains no hyperbolic triangle group. We show that if the underlying manifold $|{\mathfrak M}|$ is irreducible, and $|\widetilde{\mathfrak M}|$ is irreducible for every two-sheeted (orbifold) covering $\widetilde{\mathfrak M}$ of ${\mathfrak M}$, and if ${\rm vol}\,{\mathfrak M}\le1.72$, then $\dim H_1({\mathfrak M};{\mathbb Z}_2)\le 15$. Furthermore, if ${\rm vol}\,{\mathfrak M}\le1.22$ then $\dim H_1({\mathfrak M};{\mathbb Z}_2)\le 11$, and if ${\rm vol}\,{\mathfrak M}\le0.61$ then $\dim H_1({\mathfrak M};{\mathbb Z}_2)\le 7$. The proof is an application of results that will be used in the sequel to this paper to obtain qualitatively similar results without the assumption of irreducibility of $|{\mathfrak M}|$ and $|\widetilde{\mathfrak M}|$.
  • PDF
    Mining suggestion-expressing sentences from a given text is a less investigated sentence classification task, and it therefore lacks hand-labeled benchmark datasets. In this work, we propose and evaluate two approaches for distant supervision in suggestion mining. The distant supervision is obtained through a large silver-standard dataset, constructed using text from wikiHow and Wikipedia. Both approaches use an LSTM-based neural network architecture to learn a classification model for suggestion mining, but vary in their method of using the silver-standard dataset. The first approach directly trains the classifier on this dataset, while the second approach only learns word embeddings from it. In the second approach, we also learn POS embeddings, which interestingly gives the best classification accuracy.
  • PDF
    In this letter we report a thorough analysis of the exciton dispersion in bulk hexagonal boron nitride. We solve the ab initio GW Bethe-Salpeter equation at finite $\mathbf{q}\parallel \Gamma K$, and we compare our results with recent high-accuracy electron energy loss data. Simulations reproduce the measured dispersion and the variation of the peak intensity. We focus on the evolution of the intensity, and we demonstrate that the excitonic peak is formed by the superposition of two groups of transitions that we call $KM$ and $MK'$ after the k-points involved in the transitions. These two groups contribute to the peak intensity with opposite signs, each damping the contributions of the other. The variations in number and amplitude of these transitions determine the changes in intensity of the peak. Our results contribute to the understanding of electronic excitations in this system along the $\Gamma K$ direction, which is the relevant direction for spectroscopic measurements. They also unveil the non-trivial relation between valley physics and excitonic dispersion in h-BN, opening the possibility to tune excitonic effects by playing with the interference between transitions. Furthermore, this study introduces analysis tools and a methodology that are completely general. They suggest a way to regroup independent-particle transitions which could permit a deeper understanding of excitonic properties in any system.
  • PDF
    Automatic urban land cover classification is a classical problem in remote sensing, and good urban land cover maps build the foundation for many tasks, such as environmental monitoring. It is a particularly challenging problem, as classes generally have high intra-class and low inter-class variance. A common technique to improve urban land cover classification performance in remote sensing is the fusing of data from different sensors with different data modalities. However, all modalities are rarely available for all test data, and this missing-data problem poses severe challenges for multi-modal learning. Inspired by recent successes in deep learning, we propose as a remedy a convolutional neural network (CNN) architecture for urban remote sensing image segmentation trained on data modalities which are not all available at test time. We train our architecture with a cost function particularly suited for imbalanced classes, as this is a frequent problem in remote sensing, especially in urban areas. We demonstrate the method using two benchmark datasets, both consisting of optical and digital surface model (DSM) images. We simulate missing data by assuming that the DSM images are missing during testing, and show that our method outperforms both CNNs trained on optical images and an ensemble of two CNNs trained only on optical images. We further evaluate the potential of our method to handle situations where only some DSM images are missing during training, and show that we can clearly exploit training-time information about the missing modality during testing.
  • PDF
    Object classification is one of the many holy grails in computer vision and as such has resulted in a very large number of algorithms being proposed already. Specifically, in recent years there has been considerable progress in this area, primarily due to the increased efficiency and accessibility of deep learning techniques. In fact, for single-label object classification (i.e., only one object present in the image) the state-of-the-art techniques employ deep neural networks and report very close to human-like performance. There are specialized applications in which single-label object-level classification will not suffice; for example, in cases where the image contains multiple intertwined objects of different labels. In this paper, we address the complex problem of multi-label pixelwise classification. We present our distinct solution based on a convolutional neural network (CNN) for performing multi-label pixelwise classification and its application to large-scale urban reconstruction. A supervised learning approach is followed for training a 13-layer CNN using both LiDAR and satellite images. An empirical study has been conducted to determine the hyperparameters which result in the optimal performance of the CNN. Scale invariance is introduced by training the network on five different scales of the input and labeled data. This results in six pixelwise classifications for each different scale. An SVM is then trained to map the six pixelwise classifications into a single-label. Lastly, we refine boundary pixel labels using graph-cuts for maximum a posteriori (MAP) estimation with Markov Random Field (MRF) priors. The resulting pixelwise classification is then used to accurately extract and reconstruct the buildings in large-scale urban areas. The proposed approach has been extensively tested and the results are reported.
  • PDF
    We define a new large $N$ limit for general $\text{O}(N)^{R}$ or $\text{U}(N)^{R}$ invariant tensor models, based on an enhanced large $N$ scaling of the coupling constants. The resulting large $N$ expansion is organized in terms of a half-integer associated with Feynman graphs that we call the index. This index has a natural interpretation in terms of the many matrix models embedded in the tensor model. Our new scaling can be shown to be optimal for a wide class of non-melonic interactions, which includes all the maximally single-trace terms. Our construction allows us to define a new large $D$ expansion of the sum over diagrams of fixed genus in matrix models with an additional $\text{O}(D)^{r}$ global symmetry. When the interaction is the complete vertex of order $R+1$, we identify in detail the leading order graphs for $R$ a prime number. This slightly surprising condition is equivalent to the complete interaction being maximally single-trace.
  • PDF
    Generative Adversarial Networks (GANs) produce systematically better quality samples when class label information is provided, i.e., in the conditional GAN setup. This is still observed for the recently proposed Wasserstein GAN formulation, which stabilized adversarial training and allows considering high-capacity network architectures such as ResNet. In this work we show how to boost conditional GANs by augmenting the available class labels. The new classes come from clustering in the representation space learned by the same GAN model. The proposed strategy is also feasible when no class information is available, i.e., in the unsupervised setup. Our generated samples reach state-of-the-art Inception scores for the CIFAR-10 and STL-10 datasets in both the supervised and unsupervised setups.
  • PDF
    Estimation of semantic similarity and relatedness between biomedical concepts has utility for many informatics applications. Automated methods fall into two categories: methods based on distributional statistics drawn from text corpora, and methods using the structure of existing knowledge resources. Methods in the former category disregard taxonomic structure, while those in the latter fail to consider semantically relevant empirical information. In this paper, we present a method that retrofits distributional context vector representations of biomedical concepts using structural information from the UMLS Metathesaurus, such that the similarity between vector representations of linked concepts is augmented. We evaluated it on the UMNSRS benchmark. Our results demonstrate that retrofitting of concept vector representations leads to better correlation with human raters for both similarity and relatedness, surpassing the best results reported to date. They also demonstrate a clear improvement in performance on this reference standard for retrofitted vector representations, as compared to those without retrofitting.
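    The retrofitting step can be sketched as the standard iterative update in the style of Faruqui et al.; this is a hedged simplification, as the paper's exact weighting over UMLS links may differ:

```python
import numpy as np

def retrofit(Q_hat, links, iters=10, alpha=1.0, beta=1.0):
    """Pull each concept vector toward its linked concepts.

    Q_hat : (n, d) array of original distributional vectors.
    links : dict mapping concept index -> list of linked concept indices
            (e.g. derived from the UMLS Metathesaurus).
    """
    Q = Q_hat.copy()
    for _ in range(iters):
        for i, nbrs in links.items():
            if not nbrs:
                continue
            # convex combination of the original vector and its neighbours
            Q[i] = (alpha * Q_hat[i] + beta * Q[nbrs].sum(axis=0)) / \
                   (alpha + beta * len(nbrs))
    return Q
```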
  • PDF
    Liver and liver tumor segmentation plays an important role in hepatocellular carcinoma diagnosis and treatment planning. Recently, fully convolutional neural networks (FCNs), both 2D and 3D, have served as the backbone in many volumetric medical image segmentation tasks. However, 2D convolutions cannot fully leverage the spatial information along the $z$-axis, while 3D convolutions suffer from high computational cost and GPU memory consumption. To address these issues, we propose a novel hybrid densely connected UNet (H-DenseUNet), which consists of a 2D DenseUNet for efficiently extracting intra-slice features and a 3D counterpart for hierarchically aggregating volumetric contexts in the spirit of the auto-context algorithm. In this way, the H-DenseUNet can harness the hybrid deep features effectively for volumetric segmentation. We extensively evaluated our method on the MICCAI 2017 Liver Tumor Segmentation (LiTS) Challenge. Our method outperformed other state-of-the-art methods on the overall score, with per-case dice scores on liver and tumor of 0.961 and 0.686, and global dice scores on liver and tumor of 0.965 and 0.829, respectively.
  • PDF
    We propose AffordanceNet, a new deep learning approach to simultaneously detect multiple objects and their affordances from RGB images. Our AffordanceNet has two branches: an object detection branch to localize and classify the object, and an affordance detection branch to assign each pixel in the object to its most probable affordance label. The proposed framework employs three key components for effectively handling the multiclass problem in the affordance mask: a sequence of deconvolutional layers, a robust resizing strategy, and a multi-task loss function. The experimental results on public datasets show that our AffordanceNet outperforms recent state-of-the-art methods by a fair margin, while its end-to-end architecture allows inference at a speed of 150ms per image. This makes AffordanceNet well suited for real-time robotic applications. Furthermore, we demonstrate the effectiveness of AffordanceNet in different testing environments and in real robotic applications. The source code and trained models will be made available.
  • PDF
    We present a benchmark suite for visual perception. The benchmark is based on more than 250K high-resolution video frames, all annotated with ground-truth data for both low-level and high-level vision tasks, including optical flow, semantic instance segmentation, object detection and tracking, object-level 3D scene layout, and visual odometry. Ground-truth data for all tasks is available for every frame. The data was collected while driving, riding, and walking a total of 184 kilometers in diverse ambient conditions in a realistic virtual world. To create the benchmark, we have developed a new approach to collecting ground-truth data from simulated worlds without access to their source code or content. We conduct statistical analyses that show that the composition of the scenes in the benchmark closely matches the composition of corresponding physical environments. The realism of the collected data is further validated via perceptual experiments. We analyze the performance of state-of-the-art methods for multiple tasks, providing reference baselines and highlighting challenges for future research. The supplementary video can be viewed at https://youtu.be/T9OybWv923Y
  • PDF
    This paper describes the Arabic MGB-3 Challenge - Arabic Speech Recognition in the Wild. Unlike last year's Arabic MGB-2 Challenge, for which the recognition task was based on more than 1,200 hours of broadcast TV news recordings from Aljazeera Arabic TV programs, MGB-3 emphasises dialectal Arabic using a multi-genre collection of Egyptian YouTube videos. Seven genres were used for the data collection: comedy, cooking, family/kids, fashion, drama, sports, and science (TEDx). A total of 16 hours of videos, split evenly across the different genres, were divided into adaptation, development and evaluation data sets. The Arabic MGB-3 Challenge comprised two tasks: A) speech transcription, evaluated on the MGB-3 test set, along with the 10-hour MGB-2 test set to report progress on the MGB-2 evaluation; B) Arabic dialect identification, introduced this year in order to distinguish between four major Arabic dialects - Egyptian, Levantine, North African, Gulf - as well as Modern Standard Arabic. Two hours of audio per dialect were released for development and a further two hours were used for evaluation. For dialect identification, both lexical features and i-vector bottleneck features were shared with participants in addition to the raw audio recordings. Overall, thirteen teams submitted ten systems to the challenge. We outline the approaches adopted in each system, and summarise the evaluation results.
  • PDF
    In recent years, the number of papers on Alzheimer's disease classification has increased dramatically, generating interesting methodological ideas on the use of machine learning and feature extraction methods. However, the practical impact is much more limited and, ultimately, one cannot tell which of these approaches is the most efficient. While over 90% of these works make use of ADNI, an objective comparison between approaches is impossible due to variations in the subjects included, image pre-processing, performance metrics and cross-validation procedures. In this paper, we propose a framework for reproducible classification experiments using multimodal MRI and PET data from ADNI. The core components are: 1) code to automatically convert the full ADNI database into BIDS format; 2) a modular architecture based on Nipype in order to easily plug in different classification and feature extraction tools; 3) feature extraction pipelines for MRI and PET data; 4) baseline classification approaches for unimodal and multimodal features. This provides a flexible framework for benchmarking different feature extraction and classification tools in a reproducible manner. We demonstrate its use on all (1519) baseline T1 MR images and all (1102) baseline FDG PET images from ADNI 1, GO and 2 with SPM-based feature extraction pipelines and three different classification techniques (linear SVM, anatomically regularized SVM and multiple kernel learning SVM). The highest accuracies achieved were: 91% for AD vs CN, 83% for MCIc vs CN, 75% for MCIc vs MCInc, 94% for AD-A$\beta$+ vs CN-A$\beta$- and 72% for MCIc-A$\beta$+ vs MCInc-A$\beta$+. The code is publicly available at https://gitlab.icm-institute.org/aramislab/AD-ML (it depends on the Clinica software platform, publicly available at http://www.clinica.run).
  • PDF
    We consider the scenario of $n$ sensor nodes observing streams of data. The nodes are connected to a central server whose task it is to compute some function over all data items observed by the nodes. In our case, there exists a total order on the data items observed by the nodes. Our goal is to compute the $k$ currently lowest observed values or a value with rank in $[(1-\varepsilon)k,(1+\varepsilon)k]$ with probability $(1-\delta)$. We propose solutions for these problems in an extension of the distributed monitoring model where the server can send broadcast messages to all nodes for unit cost. We want to minimize communication over multiple time steps where there are $m$ updates to a node's value in between queries. The result is composed of two main parts, which each may be of independent interest: (1) Protocols which answer Top-k and k-Select queries. These protocols are memoryless in the sense that they gather all information at the time of the request. (2) A dynamic data structure which tracks for every $k$ an element close to $k$. We describe how to combine the two parts to receive a protocol answering the stated queries over multiple time steps. Overall, for Top-$k$ queries we use $O(k + \log m + \log \log n)$ and for $k$-Select queries $O(\frac{1}{\varepsilon^2} \log \frac{1}{\delta} + \log m + \log^2 \log n)$ messages in expectation. These results are shown to be asymptotically tight if $m$ is not too small.
  • PDF
    This work presents the evolution of a solution for predictive maintenance to a Big Data environment. The proposed adaptation aims at predicting failures of wind turbines using a data-driven solution deployed in the cloud and composed of three main modules: (i) a predictive model generator, which generates predictive models for each monitored wind turbine by means of the Random Forest algorithm; (ii) a monitoring agent, which makes predictions every 10 minutes about failures in wind turbines during the next hour; and (iii) a dashboard where the given predictions can be visualized. To implement the solution, Apache Spark, Apache Kafka, Apache Mesos and HDFS have been used. We have thereby improved the previous work in terms of data processing speed, scalability and automation. In addition, we have provided fault-tolerant functionality with a centralized access point from which the status of all of a company's wind turbines, located all over the world, can be monitored, reducing O&M costs.
  • PDF
    We present a sorting algorithm for the case of recurrent random comparison errors. The algorithm essentially achieves simultaneously good properties of previous algorithms for sorting $n$ distinct elements in this model. In particular, it runs in $O(n^2)$ time, the maximum dislocation of the elements in the output is $O(\log n)$, while the total dislocation is $O(n)$. These guarantees are the best possible since we prove that even randomized algorithms cannot achieve $o(\log n)$ maximum dislocation with high probability, or $o(n)$ total dislocation in expectation, regardless of their running time.
  • PDF
    Light scattered from multiple surfaces can be used to retrieve information about hidden environments. However, full three-dimensional retrieval of an object hidden from view by a wall has only been achieved with scanning systems and requires intensive computational processing of the retrieved data. Here we use a non-scanning, single-photon single-pixel detector in combination with an artificial neural network: this allows us to locate the position of a hidden person and to simultaneously provide their identity, chosen from a database of people (N=3). Artificial neural networks applied to specific computational imaging problems can therefore enable novel imaging capabilities with hugely simplified hardware and processing times.
  • PDF
    The purpose of this paper is to establish an atomic decomposition for functions in the weighted mixed norm space $A^{p,q}_\omega$ induced by a radial weight $\omega$ in the unit disc admitting a two-sided doubling condition. The obtained decomposition is further applied to characterize Carleson measures for $A^{p,q}_\omega$, and bounded differentiation operators $D^{(n)}(f)=f^{(n)}$ acting from $A^{p,q}_\omega$ to $L^p_\mu$, induced by a positive Borel measure $\mu$, on the full range of parameters $0<p,q,s<\infty$.
  • PDF
    Swarm systems constitute a challenging problem for reinforcement learning (RL), as the algorithm needs to learn decentralized control policies that can cope with the limited local sensing and communication abilities of the agents. Although there have been recent advances in deep RL algorithms applied to multi-agent systems, learning communication protocols while simultaneously learning the behavior of the agents is still beyond the reach of deep RL algorithms. However, while it is often difficult to directly define the behavior of the agents, simple communication protocols can be defined more easily using prior knowledge about the given task. In this paper, we propose a number of simple communication protocols that can be exploited by deep reinforcement learning to find decentralized control policies in a multi-robot swarm environment. The protocols are based on histograms that encode the local neighborhood relations of the agents and can also transmit task-specific information, such as the shortest distance and direction to a desired target. In our framework, we use an adaptation of Trust Region Policy Optimization to learn complex collaborative tasks, such as formation building, building a communication link, and pushing an intruder. We evaluate our findings in a simulated 2D-physics environment, and compare the implications of different communication protocols.
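    One of the histogram-based protocols described above can be sketched as follows (a hedged simplification; the bin count and normalization are assumptions):

```python
import numpy as np

def neighborhood_histogram(self_pos, neighbor_pos, bins=8):
    """Fixed-size observation encoding local neighborhood relations:
    a normalized histogram over bearings to sensed neighbours."""
    if len(neighbor_pos) == 0:
        return np.zeros(bins)
    rel = np.asarray(neighbor_pos) - np.asarray(self_pos)
    bearings = np.arctan2(rel[:, 1], rel[:, 0])  # angles in [-pi, pi]
    hist, _ = np.histogram(bearings, bins=bins, range=(-np.pi, np.pi))
    return hist / len(neighbor_pos)
```

    Task-specific information, such as the distance and direction to a target, can be appended to this vector in the same spirit.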
  • PDF
    Deep learning algorithms offer a powerful means to automatically analyze the content of medical images. However, many biological samples of interest are primarily transparent to visible light and contain features that are difficult to resolve with a standard optical microscope. Here, we use a convolutional neural network (CNN) not only to classify images, but also to optimize the physical layout of the imaging device itself. We increase the classification accuracy of a microscope's recorded images by merging an optical model of image formation into the pipeline of a CNN. The resulting network simultaneously determines an ideal illumination arrangement to highlight important sample features during image acquisition, along with a set of convolutional weights to classify the detected images post-capture. We demonstrate our joint optimization technique with an experimental microscope configuration that automatically identifies malaria-infected cells with 5-10% higher accuracy than standard and alternative microscope lighting designs.
  • PDF
    In this paper, we address the problem of estimating the positions of human joints, i.e., articulated pose estimation. Recent state-of-the-art solutions model two key issues, joint detection and spatial configuration refinement, together using convolutional neural networks. Our work mainly focuses on spatial configuration refinement by statistically reducing variations of human poses, motivated by the observation that the scattered distribution of the relative locations of joints (e.g., the left wrist is distributed nearly uniformly in a circular area around the left shoulder) makes the learning of convolutional spatial models hard. We present a two-stage normalization scheme, human body normalization and limb normalization, to make the distribution of relative joint locations compact, resulting in easier learning of convolutional spatial models and more accurate pose estimation. In addition, our empirical results show that incorporating multi-scale supervision and multi-scale fusion into the joint detection network is beneficial. Experiment results demonstrate that our method consistently outperforms state-of-the-art methods on the benchmarks.
  • PDF
    In this paper, we present a general approach for automatic visual servo-control of the position and shape of a deformable object whose deformation parameters are unknown. The servo-control is achieved by online learning of a model mapping between the robotic end-effector's movement and the object's deformation measurement. The model is learned using Gaussian Process Regression (GPR) to deal with its highly nonlinear property, and once learned, the model is used to predict the required control at each time step. To overcome GPR's high computational cost when dealing with long manipulation sequences, we implement a fast online GPR by selectively removing uninformative observation data from the regression process. We validate the performance of our controller on a set of deformable object manipulation tasks and demonstrate that our method can achieve effective and accurate servo-control for general deformable objects with a wide variety of goal settings. Videos are available at https://sites.google.com/view/mso-fogpr.
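    A heavily simplified sketch of the learn-then-predict loop (hedged: the paper's fast online GPR additionally prunes uninformative observations, which is omitted here):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def choose_control(past_controls, past_deltas, y_now, y_goal, candidates):
    """Fit the end-effector-motion -> deformation-change map online, then
    pick the candidate control predicted to move the shape nearest the goal."""
    gpr = GaussianProcessRegressor(kernel=RBF(length_scale=0.1))
    gpr.fit(np.asarray(past_controls), np.asarray(past_deltas))
    predicted = gpr.predict(np.asarray(candidates))   # predicted shape change
    errors = np.linalg.norm(y_now + predicted - y_goal, axis=1)
    return candidates[int(np.argmin(errors))]
```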
  • PDF
    Unsupervised image segmentation and denoising are two fundamental tasks in image processing. Usually, graph-based models such as multicut are used for segmentation, while variational models are employed for denoising. Our approach addresses both problems at the same time. We propose a novel MILP formulation of a first-derivative Potts model, where binary variables are introduced to deal directly with the $\ell_0$ norm; as a by-product the image is denoised. To the best of our knowledge, this is the first global mathematical programming model for simultaneous segmentation and denoising. Numerical experiments on real-world images are compared with multicut approaches.
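    In sketch form, a 1D first-derivative Potts MILP can be written with binary jump variables $z_i$ and a big-$M$ linking constraint (a hedged illustration with an $\ell_1$ data term to keep the objective linear; the paper's exact formulation may differ): $$\min_{u,\,z}\ \sum_i |u_i - f_i| + \lambda \sum_i z_i \quad \text{s.t.}\quad |u_{i+1} - u_i| \le M z_i,\ \ z_i \in \{0,1\},$$ where $f$ is the noisy input; $\sum_i z_i$ counts jumps and so realizes the $\ell_0$ norm of the discrete derivative, and the absolute values are linearized in the standard MILP way.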
  • PDF
    This paper addresses the question of emotion classification. The task consists in predicting emotion labels (taken from a set of possible labels) that best describe the emotions contained in short video clips. Building on a standard framework -- describing videos by audio and visual features used by a supervised classifier to infer the labels -- this paper investigates several novel directions. First of all, improved face descriptors based on 2D and 3D Convolutional Neural Networks are proposed. Second, the paper explores several fusion methods, temporal and multimodal, including a novel hierarchical method combining features and scores. In addition, we carefully reviewed the different stages of the pipeline and designed a CNN architecture adapted to the task; this is important as the size of the training set is small compared to the difficulty of the problem, making generalization difficult. The resulting model ranked 4th at the 2017 Emotion in the Wild challenge with an accuracy of 58.8%.
  • PDF
    Visual question answering (VQA) and visual question generation (VQG) are two trending topics in computer vision which have recently been explored separately. In this work, we propose an end-to-end unified framework, the Invertible Question Answering Network (iQAN), to leverage the complementary relations between questions and answers in images by jointly training the model on VQA and VQG tasks. A corresponding parameter-sharing scheme and regularization terms are proposed as constraints to explicitly leverage the dependencies between questions and answers to guide the training process. After training, iQAN can take either a question or an answer as input and output the counterpart. Evaluated on the large-scale visual question answering datasets CLEVR and VQA2, our iQAN improves VQA accuracy over the baselines. We also show that the dual learning framework of iQAN can be generalized to other VQA architectures and consistently improves results on both the VQA and VQG tasks.
  • PDF
    Locally $L^0$-convex modules were introduced in [D. Filipovic, M. Kupper, N. Vogelpoth. Separation and duality in locally $L^0$-convex modules. J. Funct. Anal. 256(12), 3996-4029 (2009)] as the analytic basis for the study of multi-period mathematical finance. Later, the algebra of conditional sets was introduced in [S. Drapeau, A. Jamneshan, M. Karliczek, M. Kupper. The algebra of conditional sets and the concepts of conditional topology and compactness. J. Math. Anal. Appl. 437(1), 561-589 (2016)]. By means of Boolean-valued models and their transfer principle, we show that any known result on locally convex spaces has a transcription in the frame of locally $L^0$-convex modules which is also true, and that the formulation in conditional set theory of any theorem of classical set theory is also a theorem. We propose Boolean-valued analysis as an analytic framework for the study of multi-period problems in mathematical finance.
  • PDF
    We present an end-to-end imitation learning system for agile, off-road autonomous driving using only low-cost on-board sensors. By imitating an optimal controller, we train a deep neural network control policy to map raw, high-dimensional observations to continuous steering and throttle commands, the latter of which is essential for successfully driving on varied terrain at high speed. Compared with recent approaches to similar tasks, our method requires neither state estimation nor online planning to navigate the vehicle. Real-world experimental results demonstrate successful autonomous off-road driving, matching state-of-the-art performance.

Recent comments

James Wootton Sep 21 2017 05:41 UTC

What does this imply for https://scirate.com/arxiv/1608.00263? I'm guessing they still regard it as valid (it is ref [14]), but just too hard to implement for now.

Ben Criger Sep 08 2017 08:09 UTC

Oh look, there's another technique for decoding surface codes subject to X/Z correlated errors: https://scirate.com/arxiv/1709.02154

Aram Harrow Sep 06 2017 07:54 UTC

The paper only applies to conformal field theories, and such a result cannot hold for more general 1-D systems by 0705.4077 and other papers (assuming standard complexity theory conjectures).

Felix Leditzky Sep 05 2017 21:27 UTC

Thanks for the clarification, Philippe!

Philippe Faist Sep 05 2017 21:09 UTC

Hi Felix, thanks for the good question.

We've found it more convenient to consider trace-nonincreasing and $\Gamma$-sub-preserving maps (and this is justified by the fact that they can be dilated to fully trace-preserving and $\Gamma$-preserving maps on a larger system). The issue arises because

...(continued)
Felix Leditzky Sep 05 2017 19:02 UTC

What is the reason/motivation to consider trace-non-increasing maps instead of trace-preserving maps in your framework and the definition of the coherent relative entropy?

Steve Flammia Aug 30 2017 22:30 UTC

Thanks for the reference Ashley. If I understand your paper, you are still measuring stabilizers of X- and Z-type at the top layer of the code. So it might be that we can improve on the factor of 2 that you found if we tailor the stabilizers to the noise bias at the base level.

Ashley Aug 30 2017 22:09 UTC

We followed Aliferis and Preskill's approach in [https://arxiv.org/abs/1308.4776][1] and found that the fault-tolerant threshold for the surface code was increased by approximately a factor of two, from around 0.75 per cent to 1.5 per cent for a bias of 10 to 100.

[1]: https://arxiv.org/abs/1308.4776

...(continued)
Stephen Bartlett Aug 30 2017 21:55 UTC

Following on from Steve's comments, it's possible to use the bias-preserving gate set in Aliferis and Preskill directly to do the syndrome extraction, as you build up a CNOT gadget, but such a direct application of your methods would be very complicated and involve a lot of gate teleportation. If y

...(continued)
Steve Flammia Aug 30 2017 21:38 UTC

We agree that finding good syndrome extraction circuits is an important question. At the moment we do not have such circuits, though we have started to think about them. We are optimistic that this can be done in principle, but it remains to be seen if the circuits can be made sufficiently simple to

...(continued)