One of the most widely known building blocks of modern physics is Heisenberg's indeterminacy principle. Among the different statements of this fundamental property of the full quantum mechanical nature of physical reality, the uncertainty relation for energy and time has a special place. Its interpretation and its consequences have inspired continued research efforts for almost a century. In its modern formulation, the uncertainty relation is understood as setting a fundamental bound on how fast any quantum system can evolve. In this Topical Review we describe important milestones, such as the Mandelstam-Tamm and the Margolus-Levitin bounds on the quantum speed limit, and summarise recent applications in a variety of current research fields -- including quantum information theory, quantum computing, and quantum thermodynamics amongst several others. To bring order and to provide an access point into the many different notions and concepts, we have grouped the various approaches into the minimal time approach and the geometric approach, where the former relies on quantum control theory, and the latter arises from measuring the distinguishability of quantum states. Due to the volume of the literature, this Topical Review can only present a snapshot of the current state-of-the-art and can never be fully comprehensive. Therefore, we highlight but a few works hoping that our selection can serve as a representative starting point for the interested reader.
The discovery of topological states of matter has profoundly augmented our understanding of phase transitions in physical systems. Instead of local order parameters, topological phases are described by global topological invariants and are therefore robust against perturbations. A prominent example thereof is the two-dimensional integer quantum Hall effect. It is characterized by the first Chern number which manifests in the quantized Hall response induced by an external electric field. Generalizing the quantum Hall effect to four-dimensional systems leads to the appearance of a novel non-linear Hall response that is quantized as well, but described by a 4D topological invariant - the second Chern number. Here, we report on the first observation of a bulk response with intrinsic 4D topology and the measurement of the associated second Chern number. By implementing a 2D topological charge pump with ultracold bosonic atoms in an angled optical superlattice, we realize a dynamical version of the 4D integer quantum Hall effect. Using a small atom cloud as a local probe, we fully characterize the non-linear response of the system by in-situ imaging and site-resolved band mapping. Our findings pave the way to experimentally probe higher-dimensional quantum Hall systems, where new topological phases with exotic excitations are predicted.
One of the most striking features of quantum theory is the existence of entangled states, responsible for Einstein's so called "spooky action at a distance". These states emerge from the mathematical formalism of quantum theory, but to date we do not have a clear idea of the physical principles that give rise to entanglement. Why does nature have entangled states? Would any theory superseding classical theory have entangled states, or is quantum theory special? One important feature of quantum theory is that it has a classical limit, recovering classical theory through the process of decoherence. We show that any theory with a classical limit must contain entangled states, thus establishing entanglement as an inevitable feature of any theory superseding classical theory.
The aim of this contribution is to discuss relations between non-classical features, such as entanglement, incompatibility of measurements, steering and non-locality, in general probabilistic theories. We show that all these features are particular forms of entanglement, which leads to close relations between their quantifications. For this, we study the structure of the tensor products of a compact convex set with a semiclassical state space.
We consider the contextual fraction as a quantitative measure of contextuality of empirical models, i.e. tables of probabilities of measurement outcomes in an experimental scenario. It provides a general way to compare the degree of contextuality across measurement scenarios; it bears a precise relationship to violations of Bell inequalities; its value, and a witnessing inequality, can be computed using linear programming; it is monotone with respect to the "free" operations of a resource theory for contextuality; and it measures quantifiable advantages in informatic tasks, such as games and a form of measurement based quantum computing.
May 24 2017 cs.AI
Solving sequential decision making problems, such as text parsing, robotic control, and game playing, requires a combination of planning policies and generalisation of those plans. In this paper, we present Expert Iteration, a novel algorithm which decomposes the problem into separate planning and generalisation tasks. Planning new policies is performed by tree search, while a deep neural network generalises those plans. In contrast, standard Deep Reinforcement Learning algorithms rely on a neural network not only to generalise plans, but to discover them too. We show that our method substantially outperforms Policy Gradients in the board game Hex, winning 84.4% of games against it when trained for equal time.
When a two-dimensional electron gas is exposed to a perpendicular magnetic field and an in-plane electric field, its conductance becomes quantized in the transverse in-plane direction: this is known as the quantum Hall (QH) effect. This effect is a result of the nontrivial topology of the system's electronic band structure, where an integer topological invariant known as the first Chern number leads to the quantization of the Hall conductance. Interestingly, it was shown that the QH effect can be generalized mathematically to four spatial dimensions (4D), but this effect has never been realized for the obvious reason that experimental systems are bound to three spatial dimensions. In this work, we harness the high tunability and control offered by photonic waveguide arrays to experimentally realize a dynamically-generated 4D QH system using a 2D array of coupled optical waveguides. The inter-waveguide separation is constructed such that the propagation of light along the device samples over higher-dimensional momenta in the directions orthogonal to the two physical dimensions, thus realizing a 2D topological pump. As a result, the device's band structure is associated with 4D topological invariants known as second Chern numbers which support a quantized bulk Hall response with a 4D symmetry. In a finite-sized system, the 4D topological bulk response is carried by localized edges modes that cross the sample as a function of of the modulated auxiliary momenta. We directly observe this crossing through photon pumping from edge-to-edge and corner-to-corner of our system. These are equivalent to the pumping of charge across a 4D system from one 3D hypersurface to the opposite one and from one 2D hyperedge to another, and serve as first experimental realization of higher-dimensional topological physics.
Contextuality is a fundamental feature of quantum theory and is necessary for quantum computation and communication. Serious steps have therefore been taken towards a formal framework for contextuality as an operational resource. However, the most important component for a resource theory - a concrete, explicit form for the free operations of contextuality - was still missing. Here we provide such a component by introducing noncontextual wirings: a physically-motivated class of contextuality-free operations with a friendly parametrization. We characterize them completely for the general case of black-box measurement devices with arbitrarily many inputs and outputs. As applications, we show that the relative entropy of contextuality is a contextuality monotone and that maximally contextual boxes that serve as contextuality bits exist for a broad class of scenarios. Our results complete a unified resource-theoretic framework for contextuality and Bell nonlocality.
We describe a simple and effective technique, the Eigenvector Method for Umbrella Sampling (EMUS), for accurately estimating small probabilities and expectations with respect to a given target probability density. In EMUS, we apply the principle of stratified survey sampling to Markov chain Monte Carlo (MCMC) simulation: We divide the support of the target distribution into regions called strata, we use MCMC to sample (in parallel) from probability distributions supported in each of the strata, and we weight the data from each stratum to assemble estimates of general averages with respect to the target distribution. We demonstrate by theoretical results and computational examples that EMUS can be dramatically more efficient than direct Markov chain Monte Carlo when the target distribution is multimodal or when the goal is to compute tail probabilities.
Consider a set of agents in a peer-to-peer communication network, where each agent has a personal dataset and a personal learning objective. The main question addressed in this paper is: how can agents collaborate to improve upon their locally learned model without leaking sensitive information about their data? Our first contribution is to reformulate this problem so that it can be solved by a block coordinate descent algorithm. We obtain an efficient and fully decentralized protocol working in an asynchronous fashion. Our second contribution is to make our algorithm differentially private to protect against the disclosure of any information about personal datasets. We prove convergence rates and exhibit the trade-off between utility and privacy. Our experiments show that our approach dramatically outperforms previous work in the non-private case, and that under privacy constraints we significantly improve over purely local models.
May 24 2017 cs.CL
We introduce an architecture in which internal representations, learned by end-to-end optimization in a deep neural network performing a textual question-answering task, can be interpreted using basic concepts from linguistic theory. This interpretability comes at a cost of only a few percentage-point reduction in accuracy relative to the original model on which the new one is based (BiDAF ). The internal representation that is interpreted is a Tensor Product Representation: for each input word, the model selects a symbol to encode the word, and a role in which to place the symbol, and binds the two together. The selection is via soft attention. The overall interpretation is built from interpretations of the symbols, as recruited by the trained model, and interpretations of the roles as used by the model. We find support for our initial hypothesis that symbols can be interpreted as lexical-semantic word meanings, while roles can be interpreted as approximations of grammatical roles (or categories) such as subject, wh-word, determiner, etc. Through extremely detailed, fine-grained analysis, we find specific correspondences between the learned roles and parts of speech as assigned by a standard parser , and find several discrepancies in the model's favor. In this sense, the model learns significant aspects of grammar, after having been exposed solely to linguistically unannotated text, questions, and answers: no prior linguistic knowledge is given to the model. What is given is the means to represent using symbols and roles and an inductive bias favoring use of these in an approximately discrete manner.
Spatial organisation is a hallmark of all living cells, and recreating it in model systems is a necessary step in the creation of synthetic cells. It is therefore of both fundamental and practical interest to better understand the basic mechanisms underlying spatial organisation in cells. In this work, we use a continuum model of membrane and protein dynamics to study the behaviour of curvature-inducing proteins on membranes of spherical shape, such as living cells or lipid vesicles. We show that the interplay between curvature energy, entropic forces, and the geometric constraints on the membrane can result in the formation of patterns of highly-curved/protein-rich and weakly-curved/protein-poor domains on the membrane. The spontaneous formation of such patterns can be triggered either by an increase in the average density of curvature-inducing proteins, or by a relaxation of the geometric constraints on the membrane imposed by a turgor pressure or the tethering of the membrane to a cell wall or cortex. These two parameters can also be tuned to select the size and number of the protein-rich domains that arise upon pattern formation. The very general mechanism presented here could be related to protein self-organisation in many biological systems.
May 24 2017 cs.LG
Sepsis is a leading cause of mortality in intensive care units (ICUs) and costs hospitals billions annually. Treating a septic patient is highly challenging, because individual patients respond very differently to medical interventions and there is no universally agreed-upon treatment for sepsis. Understanding more about a patient's physiological state at a given time could hold the key to effective treatment policies. In this work, we propose a new approach to deduce optimal treatment policies for septic patients by using continuous state-space models and deep reinforcement learning. Learning treatment policies over continuous spaces is important, because we retain more of the patient's physiological information. Our model is able to learn clinically interpretable treatment policies, similar in important aspects to the treatment policies of physicians. Evaluating our algorithm on past ICU patient data, we find that our model could reduce patient mortality in the hospital by up to 3.6% over observed clinical policies, from a baseline mortality of 13.7%. The learned treatment policies could be used to aid intensive care clinicians in medical decision making and improve the likelihood of patient survival.
May 24 2017 cs.CV
This paper introduces a video dataset of spatio-temporally localized Atomic Visual Actions (AVA). The AVA dataset densely annotates 80 atomic visual actions in 64k movie clips with actions localized in space and time, resulting in 197k action labels with multiple labels per human occurring frequently. The main differences with existing video datasets are: (1) the definition of atomic visual actions, which avoids collecting data for each and every complex action; (2) precise spatio-temporal annotations with possibly multiple annotations for each human; (3) the use of diverse, realistic video material (movies). This departs from existing datasets for spatio-temporal action recognition, such as JHMDB and UCF datasets, which provide annotations for at most 24 composite actions, such as basketball dunk, captured in specific environments, i.e., basketball court. We implement a state-of-the-art approach for action localization. Despite this, the performance on our dataset remains low and underscores the need for developing new approaches for video understanding. The AVA dataset is the first step in this direction, and enables the measurement of performance and progress in realistic scenarios.
May 24 2017 cs.SE
This paper presents the result of our experience with the ap- plication of runtime verification, testing and static analysis techniques to several industrial projects. We discuss the eight most relevant challenges that we experienced, and the strategies that we elaborated to face them.
May 24 2017 cs.AI
No real-world reward function is perfect. Sensory errors and software bugs may result in RL agents observing higher (or lower) rewards than they should. For example, a reinforcement learning agent may prefer states where a sensory error gives it the maximum reward, but where the true reward is actually small. We formalise this problem as a generalised Markov Decision Problem called Corrupt Reward MDP. Traditional RL methods fare poorly in CRMDPs, even under strong simplifying assumptions and when trying to compensate for the possibly corrupt rewards. Two ways around the problem are investigated. First, by giving the agent richer data, such as in inverse reinforcement learning and semi-supervised reinforcement learning, reward corruption stemming from systematic sensory errors may sometimes be completely managed. Second, by using randomisation to blunt the agent's optimisation, reward corruption can be partially managed under some assumptions.
Ridesourcing platforms like Uber and Didi are getting more and more popular around the world. However, unauthorized ridesourcing activities taking advantages of the sharing economy can greatly impair the healthy development of this emerging industry. As the first step to regulate on-demand ride services and eliminate black market, we design a method to detect ridesourcing cars from a pool of cars based on their trajectories. Since licensed ridesourcing car traces are not openly available and may be completely missing in some cities due to legal issues, we turn to transferring knowledge from public transport open data, i.e, taxis and buses, to ridesourcing detection among ordinary vehicles. We propose a two-stage transfer learning framework. In Stage 1, we take taxi and bus data as input to learn a random forest (RF) classifier using trajectory features shared by taxis/buses and ridesourcing/other cars. Then, we use the RF to label all the candidate cars. In Stage 2, leveraging the subset of high confident labels from the previous stage as input, we further learn a convolutional neural network (CNN) classifier for ridesourcing detection, and iteratively refine RF and CNN, as well as the feature set, via a co-training process. Finally, we use the resulting ensemble of RF and CNN to identify the ridesourcing cars in the candidate pool. Experiments on real car, taxi and bus traces show that our transfer learning framework, with no need of a pre-labeled ridesourcing dataset, can achieve similar accuracy as the supervised learning methods.
Developments in deep generative models have allowed for tractable learning of high-dimensional data distributions. While the employed learning procedures typically assume that training data is drawn i.i.d. from the distribution of interest, it may be desirable to model distinct distributions which are observed sequentially, such as when different classes are encountered over time. Although conditional variations of deep generative models permit multiple distributions to be modeled by a single network in a disentangled fashion, they are susceptible to catastrophic forgetting when the distributions are encountered sequentially. In this paper, we adapt recent work in reducing catastrophic forgetting to the task of training generative adversarial networks on a sequence of distinct distributions, enabling continual generative modeling.
Today, the internet makes tremendous amounts of data widely available. Often, the same information is behind multiple different available data sets. This lends growing importance to latent variable models that try to learn the hidden information from the available imperfect versions. For example, social media platforms can contain an abundance of pictures of the same person or object, yet all of which are taken from different perspectives. In a simplified scenario, one may consider pictures taken from the same perspective, which are distorted by noise. This latter application allows for a rigorous mathematical treatment, which is the content of this contribution. We apply a recently developed method of dependent component analysis to image denoising when multiple distorted copies of one and the same image are available, each being corrupted by a different and unknown noise process. In a simplified scenario, we assume that the distorted image is corrupted by noise that acts independently on each pixel. We answer completely the question of how to perform optimal denoising, when at least three distorted copies are available: First we define optimality of an algorithm in the presented scenario, and then we describe an aymptotically optimal universal discrete denoising algorithm (UDDA). In the case of binary data and binary symmetric noise, we develop a simplified variant of the algorithm, dubbed BUDDA, which we prove to attain universal denoising uniformly.
May 24 2017 cs.LO
Computational cognitive modeling investigates human cognition by building detailed computational models for cognitive processes. Adaptive Control of Thought - Rational (ACT-R) is a rule-based cognitive architecture that offers a widely employed framework to build such models. There is a sound and complete embedding of ACT-R in Constraint Handling Rules (CHR). Therefore analysis techniques from CHR can be used to reason about computational properties of ACT-R models. For example, confluence is the property that a program yields the same result for the same input regardless of the rules that are applied. In ACT-R models, there are often cognitive processes that should always yield the same result while others e.g. implement strategies to solve a problem that could yield different results. In this paper, a decidable confluence criterion for ACT-R is presented. It allows to identify ACT-R rules that are not confluent. Thereby, the modeler can check if his model has the desired behavior. The sound and complete translation of ACT-R to CHR from prior work is used to come up with a suitable invariant-based confluence criterion from the CHR literature. Proper invariants for translated ACT-R models are identified and proven to be decidable. The presented method coincides with confluence of the original ACT-R models.
In this paper we consider the cluster estimation problem under the Stochastic Block Model. We show that the semidefinite programming (SDP) formulation for this problem achieves an error rate that decays exponentially in the signal-to-noise ratio. The error bound implies weak recovery in the sparse graph regime with bounded expected degrees, as well as exact recovery in the dense regime. An immediate corollary of our results yields error bounds under the Censored Block Model. Moreover, these error bounds are robust, continuing to hold under heterogeneous edge probabilities and a form of the so-called monotone attack. Significantly, this error rate is achieved by the SDP solution itself without any further pre- or post-processing, and improves upon existing polynomially-decaying error bounds proved using the Grothendieck\textquoteright s inequality. Our analysis has two key ingredients: (i) showing that the graph has a well-behaved spectrum, even in the sparse regime, after discounting an exponentially small number of edges, and (ii) an order-statistics argument that governs the final error rate. Both arguments highlight the implicit regularization effect of the SDP formulation.
We give a streamlined derivation of "teleportation identities" for discrete and continuous systems. The identities do not depend on the choice of Bell basis and so are "coordinate free". The unitaries Bob needs to apply to recover Alice's unknown state is the product of the unitaries Alice and Bob use to generate a common Bell basis. The case of qubit, qudits and continuous systems are all treated on the same footing.
Generic text embeddings are successfully used in a variety of tasks. However, they are often learnt by capturing the co-occurrence structure from pure text corpora, resulting in limitations of their ability to generalize. In this paper, we explore models that incorporate visual information into the text representation. Based on comprehensive ablation studies, we propose a conceptually simple, yet well performing architecture. It outperforms previous multimodal approaches on a set of well established benchmarks. We also improve the state-of-the-art results for image-related text datasets, using orders of magnitude less data.
Deep neural networks (DNNs) play a key role in many applications. Unsurprisingly, they also became a potential attack target of adversaries. Some studies have demonstrated DNN classifiers can be fooled by the adversarial example, which is crafted via introducing some perturbations into an original sample. Accordingly, some powerful defense techniques were proposed against adversarial examples. However, existing defense techniques require modifying the target model or depend on the prior knowledge of attack techniques to different degrees. In this paper, we propose a straightforward method for detecting adversarial image examples. It doesn't require any prior knowledge of attack techniques and can be directly deployed into unmodified off-the-shelf DNN models. Specifically, we consider the perturbation to images as a kind of noise and introduce two classical image processing techniques, scalar quantization and smoothing spatial filter, to reduce its effect. The image two-dimensional entropy is employed as a metric to implement an adaptive noise reduction for different kinds of images. As a result, the adversarial example can be effectively detected by comparing the classification results of a given sample and its denoised version. Thousands of adversarial examples against some state-of-the-art DNN models are used to evaluate the proposed method, which are crafted with different attack techniques. The experiment shows that our detection method can achieve an overall recall of 93.73% and an overall precision of 95.45% without referring to any prior knowledge of attack techniques.
May 24 2017 cs.CV
We present a powerful method to extract per-point semantic class labels from aerialphotogrammetry data. Labeling this kind of data is important for tasks such as environmental modelling, object classification and scene understanding. Unlike previous point cloud classification methods that rely exclusively on geometric features, we show that incorporating color information yields a significant increase in accuracy in detecting semantic classes. We test our classification method on three real-world photogrammetry datasets that were generated with Pix4Dmapper Pro, and with varying point densities. We show that off-the-shelf machine learning techniques coupled with our new features allow us to train highly accurate classifiers that generalize well to unseen data, processing point clouds containing 10 million points in less than 3 minutes on a desktop computer.
Recently, Roger Colbeck and Renato Renner (C&R) have claimed that '[n]o extension of quantum theory can have improved predictive power'. If correct, this is a spectacular impossibility theorem for hidden variable theories, which is more general than the theorems of Bell and Leggett. C&R's claim essentially means that in any hidden variable theory that is compatible with quantum-mechanical predictions, probabilities of measurement outcomes are independent of these hidden variables. On closer inspection, however, the generality and validity of the claim can be contested. First, it is based on an assumption called 'Freedom of Choice'. As the name suggests, this assumption involves the independence of an experimenter's choice of measurement settings. But in the way C&R define this assumption, a no-signalling condition is surreptitiously presupposed, making the assumption less innocent than it sounds. When using this definition, any hidden variable theory violating Parameter Independence, such as Bohmian Mechanics, is immediately shown to be incompatible with quantum-mechanical predictions. Also, the argument of C&R is hard to follow and their mathematical derivation contains several gaps, some of which cannot be closed in the way they suggest. We shall show that these gaps can be filled. The issue with the 'Freedom of Choice' assumption can be circumvented by explicitly assuming Parameter Independence. This makes the result less general, but better founded. We then obtain an impossibility theorem for hidden variable theories satisfying Parameter Independence only. So, while quantum mechanics itself satisfies Parameter Independence, if a variable is added that changes the outcome probabilities, however slightly, Parameter Independence must be violated.
May 24 2017 cs.AI
Explaining and reasoning about processes which underlie observed black-box phenomena enables the discovery of causal mechanisms, derivation of suitable abstract representations and the formulation of more robust predictions. We propose to learn high level functional programs in order to represent abstract models which capture the invariant structure in the observed data. We introduce the $\pi$-machine (program-induction machine) -- an architecture able to induce interpretable LISP-like programs from observed data traces. We propose an optimisation procedure for program learning based on backpropagation, gradient descent and A* search. We apply the proposed method to three problems: system identification of dynamical systems, explaining the behaviour of a DQN agent and learning by demonstration in a human-robot interaction scenario. Our experimental results show that the $\pi$-machine can efficiently induce interpretable programs from individual data traces.
Adaptive optimization methods, which perform local optimization with a metric constructed from the history of iterates, are becoming increasingly popular for training deep neural networks. Examples include AdaGrad, RMSProp, and Adam. We show that for simple overparameterized problems, adaptive methods often find drastically different solutions than gradient descent (GD) or stochastic gradient descent (SGD). We construct an illustrative binary classification problem where the data is linearly separable, GD and SGD achieve zero test error, and AdaGrad, Adam, and RMSProp attain test errors arbitrarily close to half. We additionally study the empirical generalization capability of adaptive methods on several state-of-the-art deep learning models. We observe that the solutions found by adaptive methods generalize worse (often significantly worse) than SGD, even when these solutions have better training performance. These results suggest that practitioners should reconsider the use of adaptive methods to train neural networks.
May 24 2017 cs.IR
Automated music playlist generation is a specific form of music recommendation. Generally stated, the user receives a set of song suggestions defining a coherent listening session. We hypothesize that the best way to convey such playlist coherence to new recommendations is by learning it from actual curated examples, in contrast to imposing ad hoc constraints. Examples of thoroughly curated playlists are rather scarce, especially compared to the amount of listening logs available (e.g., in music streaming services). Collaborative filtering methods can be used to capture underlying patterns in hand-curated playlists, but their performance is severely affected by the size of the data and the low number of song observations. We propose an alternative method based on a song-to-playlist classifier, which learns the underlying structure from actual playlist examples, while leveraging song features based on audio, social tags and independent listening logs. Experiments on two datasets of hand-curated playlists show competitive performance compared to collaborative filtering when enough training data is available and also more robust performance in cold-start situations. For example, both methods achieve a recall@100 of roughly 35% for songs observed 5 or more times in the training set, whereas the proposed model achieves a recall@100 of roughly 15% for songs observed 4 or less times in the training set, compared to the 3% achieved by collaborative filtering.
In a vertex-colored graph, an edge is happy if its endpoints have the same color. Similarly, a vertex is happy if all its incident edges are happy. Motivated by the computation of homophily in social networks, we consider the algorithmic aspects of the following Maximum Happy Edges (k-MHE) problem: given a partially k-colored graph G, find an extended full k-coloring of G maximizing the number of happy edges. When we want to maximize the number of happy vertices, the problem is known as Maximum Happy Vertices (k-MHV). We further study the complexity of the problems and their weighted variants. For instance, we prove that for every k >= 3, both problems are NP-complete for bipartite graphs and k-MHV remains hard for split graphs. In terms of exact algorithms, we show both problems can be solved in time O*(2^n), and give an even faster O*(1.89^n)-time algorithm when k = 3. From a parameterized perspective, we give a linear vertex kernel for Weighted k-MHE, where edges are weighted and the goal is to obtain happy edges of at least a specified total weight. Finally, we prove both problems are solvable in polynomial-time when the graph has bounded treewidth or bounded neighborhood diversity.
May 24 2017 cs.CV
We address the problem of estimating image difficulty defined as the human response time for solving a visual search task. We collect human annotations of image difficulty for the PASCAL VOC 2012 data set through a crowd-sourcing platform. We then analyze what human interpretable image properties can have an impact on visual search difficulty, and how accurate are those properties for predicting difficulty. Next, we build a regression model based on deep features learned with state of the art convolutional neural networks and show better results for predicting the ground-truth visual search difficulty scores produced by human annotators. Our model is able to correctly rank about 75% image pairs according to their difficulty score. We also show that our difficulty predictor generalizes well to new classes not seen during training. Finally, we demonstrate that our predicted difficulty scores are useful for weakly supervised object localization (8% improvement) and semi-supervised object classification (1% improvement).
Finding groups of connected individuals in large graphs with tens of thousands or more nodes has received considerable attention in academic research. In this paper, we analyze three main issues with respect to the recent influx of papers on community detection in (large) graphs, highlight the specific problems with the current research avenues, and propose a first step towards a better approach. First, in spite of the strong interest in community detection, a strong conceptual and theoretical foundation of connectedness in large graphs is missing. Yet, it is crucial to be able to determine the specific feats that we aim to analyze in large networks, to avoid a purely black-or-white view. Second, in literature commonly employed (meta)heuristic frameworks are applied for the large graph problems. Currently, it is, however, unclear whether these techniques are even viable options, and what the added value of the constituting parts is. Additionally, the manner in which different algorithms are compared is also ambiguous. Finally, no analyses of the impact of data parameters on the reported clusters is done. Nonetheless, it would be interesting to evaluate which characteristics lead to which type of communities and what their effect is on computational difficulty.
May 24 2017 cs.AI
Applying deep reinforcement learning (RL) on real systems suffers from slow data sampling. We propose an enhanced generative adversarial network (EGAN) to initialize an RL agent in order to achieve faster learning. The EGAN utilizes the relation between states and actions to enhance the quality of data samples generated by a GAN. Pre-training the agent with the EGAN shows a steeper learning curve with a 20% improvement of training time in the beginning of learning, compared to no pre-training, and an improvement compared to training with GAN by about 5% with smaller variations. For real time systems with sparse and slow data sampling the EGAN could be used to speed up the early phases of the training process.
May 24 2017 cs.LO
We introduce loose graph simulations (LGS), a new notion about labelled graphs which subsumes in an intuitive and natural way subgraph isomorphism (SGI), regular language pattern matching (RLPM) and graph simulation (GS). Being an unification of all these notions, LGS allows us to express directly also problems which are "mixed" instances of previous ones, and hence which would not fit easily in any of them. After the definition and some examples, we show that the problem of finding loose graph simulations is NP-complete, we provide formal translation of SGI, RLPM, and GS into LGSs, and we give the representation of a problem which extends both SGI and RLPM.
Even though the evolution of an isolated quantum system is unitary, the complexity of interacting many-body systems prevents the observation of recurrences of quantum states for all but the smallest systems. For large systems one can not access the full complexity of the quantum states and the requirements to observe a recurrence in experiments reduces to being close to the initial state with respect to the employed observable. Selecting an observable connected to the collective excitations in one-dimensional superfluids, we demonstrate recurrences of coherence and long range order in an interacting quantum many-body system containing thousands of particles. This opens up a new window into the dynamics of large quantum systems even after they reached a transient thermal-like state.
The dark sector of the Universe is beginning to be clarified step by step. If the dark energy is vacuum energy, then 123 orders are exactly reduced by ordinary physical processes. For many years these unexplained orders were called a crisis in physics. There was indeed a "crisis" before the introduction of the holographic principle and entropic force in physics. The vacuum energy was spent for the organization of new microstates during the entire life of the Universe, but in the initial period of its evolution the vacuum energy (78 orders) were reduced more effectively by the vacuum condensates produced by phase transitions, because the Universe lost the high symmetry during its expansion. Important problems of physics and cosmology can be solved if the quarks, leptons, and gauge bosons are composite particles. The dark matter, partially or all consisting of familon-type pseudo-Goldstone bosons with a mass of 10^-5 - 10^-3 eV, can be explained in the composite model. Three generations of elementary particles are absolutely necessary in this model. In addition, this model realizes three relativistic phase transitions in a medium of familons at different red shifts, forming a large-scale structure of dark matter that was "repeated" by baryons. We predict the detection of dark matter dynamics, the detection of familons as dark matter particles, and the development of spectroscopy for dark medium due to the probable presence of dark atoms in it. Other viewpoints on the dark components of the Universe are also discussed briefly.
May 24 2017 math.OC
For an arbitrary finite family of semi-algebraic/definable functions, we consider the corresponding constraint set and we study qualification conditions for perturbations of this set. In particular we prove that all positive diagonal perturbations, save perhaps a finite number of them, ensure that any point within the feasible set satisfies Mangasarian-Fromovitz constraint qualification. Using the Milnor-Thom theorem, we provide a bound for the number of singular perturbations when the constraints are polynomial functions. Examples show that the order of magnitude of our exponential bound is relevant. Our perturbation approach provides a simple protocol to build sequences of "regular" problems approximating an arbitrary semi-algebraic/definable problem. Applications to sequential quadratic programming methods and sum of squares relaxation are provided.
Shot boundary detection (SBD) is an important component of many video analysis tasks, such as action recognition, video indexing, summarization and editing. Previous work typically used a combination of low-level features like color histograms, in conjunction with simple models such as SVMs. Instead, we propose to learn shot detection end-to-end, from pixels to final shot boundaries. For training such a model, we rely on our insight that all shot boundaries are generated. Thus, we create a dataset with one million frames and automatically generated transitions such as cuts, dissolves and fades. In order to efficiently analyze hours of videos, we propose a Convolutional Neural Network (CNN) which is fully convolutional in time, thus allowing to use a large temporal context without the need to repeatedly processing frames. With this architecture our method obtains state-of-the-art results while running at an unprecedented speed of more than 120x real-time.
May 24 2017 cs.CV
Salient object detection has increasingly become a popular topic in cognitive and computational sciences, including computer vision and artificial intelligence research. In this paper, we propose integrating \textitsemantic priors into the salient object detection process. Our algorithm consists of three basic steps. Firstly, the explicit saliency map is obtained based on the semantic segmentation refined by the explicit saliency priors learned from the data. Next, the implicit saliency map is computed based on a trained model which maps the implicit saliency priors embedded into regional features with the saliency values. Finally, the explicit semantic map and the implicit map are adaptively fused to form a pixel-accurate saliency map which uniformly covers the objects of interest. We further evaluate the proposed framework on two challenging datasets, namely, ECSSD and HKUIS. The extensive experimental results demonstrate that our method outperforms other state-of-the-art methods.
May 24 2017 cs.AI
The human reasoning process is seldom a one-way process from an input leading to an output. Instead, it often involves a systematic deduction by ruling out other possible outcomes as a self-checking mechanism. In this paper, we describe the design of a hybrid neural network for logical learning that is similar to the human reasoning through the introduction of an auxiliary input, namely the indicators, that act as the hints to suggest logical outcomes. We generate these indicators by digging into the hidden information buried underneath the original training data for direct or indirect suggestions. We used the MNIST data to demonstrate the design and use of these indicators in a convolutional neural network. We trained a series of such hybrid neural networks with variations of the indicators. Our results show that these hybrid neural networks are very robust in generating logical outcomes with inherently higher prediction accuracy than the direct use of the original input and output in apparent models. Such improved predictability with reassured logical confidence is obtained through the exhaustion of all possible indicators to rule out all illogical outcomes, which is not available in the apparent models. Our logical learning process can effectively cope with the unknown unknowns using a full exploitation of all existing knowledge available for learning. The design and implementation of the hints, namely the indicators, become an essential part of artificial intelligence for logical learning. We also introduce an ongoing application setup for this hybrid neural network in an autonomous grasping robot, namely as_DeepClaw, aiming at learning an optimized grasping pose through logical learning.
Security, privacy, and fairness have become critical in the era of data science and machine learning. More and more we see that achieving universally secure, private, and fair systems is practically impossible. We have seen for example how generative adversarial networks can be used to learn about the expected private training data; how the exploitation of additional data can reveal private information in the original one; and how what looks like unrelated features can teach us about each other. Confronted with this challenge, in this paper we open a new line of research, where the security, privacy, and fairness is learned and used in a closed environment. The goal is to ensure that a given entity (e.g., the company or the government), trusted to infer certain information with our data, is blocked from inferring protected information from it. For example, a hospital might be allowed to produce diagnosis on the patient (the positive task), without being able to infer the gender of the subject (negative task). Similarly, a company can guarantee that internally it is not using the provided data for any undesired task, an important goal that is not contradicting the virtually impossible challenge of blocking everybody from the undesired task. We design a system that learns to succeed on the positive task while simultaneously fail at the negative one, and illustrate this with challenging cases where the positive task is actually harder than the negative one being blocked. Fairness, to the information in the negative task, is often automatically obtained as a result of this proposed approach. The particular framework and examples open the door to security, privacy, and fairness in very important closed scenarios, ranging from private data accumulation companies like social networks to law-enforcement and hospitals.
May 24 2017 cs.CV
We propose a novel framework for abnormal event detection in video that requires no training sequences. Our framework is based on unmasking, a technique previously used for authorship verification in text documents, which we adapt to our task. We iteratively train a binary classifier to distinguish between two consecutive video sequences while removing at each step the most discriminant features. Higher training accuracy rates of the intermediately obtained classifiers represent abnormal events. To the best of our knowledge, this is the first work to apply unmasking for a computer vision task. We compare our method with several state-of-the-art supervised and unsupervised methods on four benchmark data sets. The empirical results indicate that our abnormal event detection framework can achieve state-of-the-art results, while running in real-time at 20 frames per second.
We consider the question: what can be learnt by looking at and listening to a large amount of unlabelled videos? There is a valuable, but so far untapped, source of information contained in the video itself -- the correspondence between the visual and the audio streams, and we introduce a novel "Audio-Visual Correspondence" learning task that makes use of this. Training visual and audio networks from scratch, without any additional supervision other than the raw unconstrained videos themselves, is shown to successfully solve this task, and, more interestingly, result in good vision and audio representations. These features set the new state-of-the-art on two sound classification benchmarks, and perform on par with the state-of-the-art self-supervised approaches on ImageNet classification. We also demonstrate that the network is able to localize objects in both modalities, as well as perform fine-grained recognition tasks.
May 24 2017 cs.NI
In this paper, we investigate cooperative spectrum sensing (CSS) in a cognitive radio network (CRN) where multiple secondary users (SUs) cooperate in order to detect a primary user (PU) which possibly occupies multiple bands simultaneously. Deep sensing, which constitutes the first CSS framework based on a convolutional neural network (CNN), is proposed. In deep sensing, instead of the explicit mathematical modeling of CSS, the optimal strategy for combining the individual sensing results of the SUs is obtained with a CNN based on training sensing samples. Accordingly, an environment-specific CSS is found in an adaptive manner regardless of whether the individual sensing results are quantized or not. Through simulation, we show that the performance of CSS can be significantly improved by the proposed deep sensing scheme, especially in the low signal-to-noise ratio (SNR) regime, even when the number of training samples is moderate.
Long Short-Term Memory (LSTM) recurrent neural networks are renowned for being uninterpretable "black boxes". In the medical domain where LSTMs have shown promise, this is specifically concerning because it is imperative to understand the decisions made by machine learning models in such acute situations. This study employs techniques used in the Convolutional Neural Network domain to elucidate the operations that LSTMs perform on time series. The visualization techniques include input saliency by means of occlusion and derivatives, class mode visualization, and temporal outputs. Moreover, we demonstrate that LSTMs appear to extract features similar to those extracted by wavelets. It was found that deriving the inputs for saliency is a poor approximation and occlusion is a better approach. Moreover, analyzing LSTMs on different sets of data provide novel interpretations.
Multi-task learning is partly motivated by the observation that humans bring to bear what they know about related problems when solving new ones. Similarly, deep neural networks can profit from related tasks by sharing parameters with other networks. However, humans do not consciously decide to transfer knowledge between tasks (and are typically not aware of the transfer). In machine learning, it is hard to estimate if sharing will lead to improvements; especially if tasks are only loosely related. To overcome this, we introduce Sluice Networks, a general framework for multi-task learning where trainable parameters control the amount of sharing -- including which parts of the models to share. Our framework goes beyond and generalizes over previous proposals in enabling hard or soft sharing of all combinations of subspaces, layers, and skip connections. We perform experiments on three task pairs from natural language processing, and across seven different domains, using data from OntoNotes 5.0, and achieve up to 15% average error reductions over common approaches to multi-task learning. We analyze when the architecture is particularly helpful, as well as its ability to fit noise. We show that a) label entropy is predictive of gains in sluice networks, confirming findings for hard parameter sharing, and b) while sluice networks easily fit noise, they are robust across domains in practice.
May 24 2017 cs.RO
Tensegrity mechanisms have several interesting properties that make them suitable for a number of applications. Their analysis is generally challenging because the static equilibrium conditions often result in complex equations. A class of planar one-degree-of-freedom (dof) tensegrity mechanisms with three linear springs is analyzed in detail in this paper. The kinetostatic equations are derived and solved under several loading and geometric conditions. It is shown that these mechanisms exhibit up to six equilibrium configurations, of which one or two are stable. Discriminant varieties and cylindrical algebraic decomposition combined with Groebner base elimination are used to classify solutions as function of the input parameters.
Recent researches have shown that machine learning based malware detection algorithms are very vulnerable under the attacks of adversarial examples. These works mainly focused on the detection algorithms which use features with fixed dimension, while some researchers have begun to use recurrent neural networks (RNN) to detect malware based on sequential API features. This paper proposes a novel algorithm to generate sequential adversarial examples, which are used to attack a RNN based malware detection system. It is usually hard for malicious attackers to know the exact structures and weights of the victim RNN. A substitute RNN is trained to approximate the victim RNN. Then we propose a generative RNN to output sequential adversarial examples from the original sequential malware inputs. Experimental results showed that RNN based malware detection algorithms fail to detect most of the generated malicious adversarial examples, which means the proposed model is able to effectively bypass the detection algorithms.
Key to multitask learning is exploiting relationships between different tasks to improve prediction performance. If the relations are linear, regularization approaches can be used successfully. However, in practice assuming the tasks to be linearly related might be restrictive, and allowing for nonlinear structures is a challenge. In this paper, we tackle this issue by casting the problem within the framework of structured prediction. Our main contribution is a novel algorithm for learning multiple tasks which are related by a system of nonlinear equations that their joint outputs need to satisfy. We show that the algorithm is consistent and can be efficiently implemented. Experimental results show the potential of the proposed method.
What are the conditions for adiabatic quantum computation (AQC) to outperform classical computation? We consider the strong quantum speedup: scaling advantage in computational time over the best classical algorithms. Although there exist several quantum adiabatic algorithms achieving the strong quantum speedup, the essential keys to their speedups are still unclear. Here, we propose a necessary condition for the quantum speedup in AQC. This is a conjecture that a superposition of macroscopically distinct states appears during AQC if it achieves the quantum speedup. This is a natural extension of the conjecture in circuit-based quantum computation [A. Shimizu et al., J. Phys. Soc. Jpn. 82, 054801 (2013)]. To describe the statement of the conjecture, we introduce an index $p$ that quantifies a superposition of macroscopically distinct states---macroscopic entanglement---from the asymptotic behaviors of fluctuations of additive observables. We theoretically test the conjecture by investigating five quantum adiabatic algorithms. All the results show that the conjecture is correct for these algorithms. We therefore expect that a superposition of macroscopically distinct states is an appropriate indicator of entanglement crucial to the strong quantum speedup in AQC.