# Top arXiv papers

• With progress in quantum technology more sophisticated quantum annealing devices are becoming available. While they offer new possibilities for solving optimization problems, their true potential is still an open question. As the optimal design of adiabatic algorithms plays an important role in their assessment, we illustrate the aspects and challenges to consider when implementing optimization problems on quantum annealing hardware based on the example of the traveling salesman problem (TSP). We demonstrate that tunneling between local minima can be exponentially suppressed if the quantum dynamics are not carefully tailored to the problem. Furthermore we show that inequality constraints, in particular, present a major hurdle for the implementation on analog quantum annealers. We finally argue that programmable digital quantum annealers can overcome many of these obstacles and can - once large enough quantum computers exist - provide an interesting route to using quantum annealing on a large class of problems.
• Essential to the description of a quantum system are its local degrees of freedom, which enable the interpretation of subsystems and dynamics in the Hilbert space. While a choice of local tensor factorization of the Hilbert space is often implicit in the writing of a Hamiltonian or Lagrangian, the identification of local tensor factors is not intrinsic to the Hilbert space itself. Instead, the only basis-invariant data of a Hamiltonian is its spectrum, which does not manifestly determine the local structure. This ambiguity is highlighted by the existence of dualities, in which the same energy spectrum may describe two systems with very different local degrees of freedom. We argue that in fact, the energy spectrum alone almost always encodes a unique description of local degrees of freedom when such a description exists, allowing one to explicitly identify local subsystems and how they interact. In special cases, multiple dual local descriptions can be extracted from a given spectrum, but generically the local description is unique.
• We propose a general protocol for low-control refrigeration and thermometry of thermal qubits, which can be implemented using electronic spins in diamond. The refrigeration is implemented by a probe, consisting of a network of spins with two-body XXZ interactions. The protocol involves two operations: (i) free evolution of the probe; and (ii) a swap gate between one spin in the probe and the thermal qubit we wish to cool. We show that if the initial state of the probe falls within a suitable range, then the cooling protocol will always succeed, with an efficiency that depends on the rate of spin dephasing and the swap gate fidelity. Furthermore, measuring the probe after it has cooled many qubits provides an estimate of their temperature. We provide a specific example where the probe is a Heisenberg spin chain, and suggest a physical implementation using electronic spins in diamond. Here the probe is constituted of nitrogen vacancy (NV) centers, while the thermal qubits are dark spins. By using a novel pulse sequence, a chain of NV centers can be made to evolve according to a Heisenberg Hamiltonian. This proposal allows for a range of applications, such as NV-based nuclear magnetic resonance of photosensitive molecules kept in a dark spot on a sample, and it opens up new possibilities for the study of quantum thermodynamics, environment-assisted sensing, and many-body physics.
• Performing perfect/conclusive quantum state exclusion means to be able to discard with certainty at least one out of n possible quantum state preparations by performing a measurement of the resulting state. When all the preparations correspond to pure states, it is an open problem (see arXiv:1306.4683v3 and arXiv:quant-ph/0206110) whether POVMs give any additional power for this task with respect for projective measurements. This is the case even for the simple case of three states in three dimensions. In this paper, we give an analytical proof that in this simple case, considering POVMs does indeed not give any additional power with respect to projective measurements. To do so, we first make without loss of generality some assumptions about the structure of an optimal POVM. The justification of these assumptions involves arguments based on convexity, rank and symmetry properties. We show then that any pure states perfectly excluded by such a POVM meet the conditions identified in arXiv:quant-ph/0206110 for perfect exclusion by a projective measurement of three pure states in three dimensions.
• Machine Learning (ML) models are applied in a variety of tasks such as network intrusion detection or malware classification. Yet, these models are vulnerable to a class of malicious inputs known as adversarial examples. These are slightly perturbed inputs that are classified incorrectly by the ML model. The mitigation of these adversarial inputs remains an open problem. As a step towards a model-agnostic defense against adversarial examples, we show that they are not drawn from the same distribution than the original data, and can thus be detected using statistical tests. As the number of malicious points included in samples presented to the test diminishes, its detection confidence decreases. Hence, we introduce a complimentary approach to identify specific inputs that are adversarial among sets of inputs flagged by the statistical test. Specifically, we augment our ML model with an additional output, in which the model is trained to classify all adversarial inputs. We evaluate our approach on multiple adversarial example crafting methods (including the fast gradient sign and Jacobian-based saliency map methods) with several datasets. The statistical test flags sample sets containing adversarial inputs with confidence above 80%. Furthermore, our augmented model either detects adversarial examples with high accuracy (>80%) or increases the adversary's cost---the perturbation added---by more than 150%. In this way, we show that statistical properties of adversarial examples are essential to their detection.
• We investigate the generation of one-sentence Wikipedia biographies from facts derived from Wikidata slot-value pairs. We train a recurrent neural network sequence-to-sequence model with attention to select facts and generate textual summaries. Our model incorporates a novel secondary objective that helps ensure it generates sentences that contain the input facts. The model achieves a BLEU score of 41, improving significantly upon the vanilla sequence-to-sequence model and scoring roughly twice that of a simple template baseline. Human preference evaluation suggests the model is nearly as good as the Wikipedia reference. Manual analysis explores content selection, suggesting the model can trade the ability to infer knowledge against the risk of hallucinating incorrect information.
• There has been a recent explosion in the capabilities of game-playing artificial intelligence. Many classes of RL tasks, from Atari games to motor control to board games, are now solvable by fairly generic algorithms, based on deep learning, that learn to play from experience with minimal knowledge of the specific domain of interest. In this work, we will investigate the performance of these methods on Super Smash Bros. Melee (SSBM), a popular console fighting game. The SSBM environment has complex dynamics and partial observability, making it challenging for human and machine alike. The multi-player aspect poses an additional challenge, as the vast majority of recent advances in RL have focused on single-agent environments. Nonetheless, we will show that it is possible to train agents that are competitive against and even surpass human professionals, a new result for the multi-player video game setting.
• Researchers often summarize their work in the form of scientific posters. Posters provide a coherent and efficient way to convey core ideas expressed in scientific papers. Generating a good scientific poster, however, is a complex and time consuming cognitive task, since such posters need to be readable, informative, and visually aesthetic. In this paper, for the first time, we study the challenging problem of learning to generate posters from scientific papers. To this end, a data-driven framework, that utilizes graphical models, is proposed. Specifically, given content to display, the key elements of a good poster, including attributes of each panel and arrangements of graphical elements are learned and inferred from data. During the inference stage, an MAP inference framework is employed to incorporate some design principles. In order to bridge the gap between panel attributes and the composition within each panel, we also propose a recursive page splitting algorithm to generate the panel layout for a poster. To learn and validate our model, we collect and release a new benchmark dataset, called NJU-Fudan Paper-Poster dataset, which consists of scientific papers and corresponding posters with exhaustively labelled panels and attributes. Qualitative and quantitative results indicate the effectiveness of our approach.
• In many domains, a latent competition among different conventions determines which one will come to dominate. One sees such effects in the success of community jargon, of competing frames in political rhetoric, or of terminology in technical contexts. These effects have become widespread in the on-line domain, where the data offers the potential to study competition among conventions at a fine-grained level. In analyzing the dynamics of conventions over time, however, even with detailed on-line data, one encounters two significant challenges. First, as conventions evolve, the underlying substance of their meaning tends to change as well; and such substantive changes confound investigations of social effects. Second, the selection of a convention takes place through the complex interactions of individuals within a community, and contention between the users of competing conventions plays a key role in the convention's evolution. Any analysis must take place in the presence of these two issues. In this work we study a setting in which we can cleanly track the competition among conventions. Our analysis is based on the spread of low-level authoring conventions in the e-print arXiv over 24 years: by tracking the spread of macros and other author-defined conventions, we are able to study conventions that vary even as the underlying meaning remains constant. We find that the interaction among co-authors over time plays a crucial role in the selection of them; the distinction between more and less experienced members of the community, and the distinction between conventions with visible versus invisible effects, are both central to the underlying processes. Through our analysis we make predictions at the population level about the ultimate success of different synonymous conventions over time - and at the individual level about the outcome of "fights" between people over convention choices.
• We investigate the impact of general conditions of theoretical stability and cosmological viability on dynamical dark energy models. As a powerful example, we study whether minimally coupled, single field Quintessence models that are safe from ghost instabilities, can source the CPL expansion history recently shown to be mildly favored by a combination of CMB (Planck) and Weak Lensing (KiDS) data. Interestingly we find that in their most conservative form, the theoretical conditions impact the analysis in such a way that smooth single field Quintessence becomes significantly disfavored with respect to the standard $\Lambda$CDM cosmological model. This is due to the fact that these conditions cut a significant portion of the $(w_0,w_a)$ parameter space for CPL, in particular eliminating the region that would be favored by weak lensing data. Within the scenario of a smooth dynamical dark energy parametrized with CPL, weak lensing data favors a region that would require multiple fields to ensure gravitational stability.
• Feb 22 2017 cs.CV arXiv:1702.06521v1
Machine learning techniques, namely convolutional neural networks (CNN) and regression forests, have recently shown great promise in performing 6-DoF localization of monocular images. However, in most cases image-sequences, rather only single images, are readily available. To this extent, none of the proposed learning-based approaches exploit the valuable constraint of temporal smoothness, often leading to situations where the per-frame error is larger than the camera motion. In this paper we propose a recurrent model for performing 6-DoF localization of video-clips. We find that, even by considering only short sequences (20 frames), the pose estimates are smoothed and the localization error can be drastically reduced. Finally, we consider means of obtaining probabilistic pose estimates from our model. We evaluate our method on openly-available real-world autonomous driving and indoor localization datasets.
• This paper presents an estimator for semiparametric models that uses a feed-forward neural network to fit the nonparametric component. Unlike many methodologies from the machine learning literature, this approach is suitable for longitudinal/panel data. It provides unbiased estimation of the parametric component of the model, with associated confidence intervals that have near-nominal coverage rates. It is further shown that this model and estimator nests a nonparametric heterogeneous treatment effects model and estimator, which can consistently estimate individualized treatment effects conditional on covariates. Simulations demonstrate (1) efficiency, (2) that parametric estimates are unbiased, and (3) coverage properties of estimated intervals. An application section demonstrates the method by predicting county-level corn yield using daily weather data from the period 1981-2015, along with parametric time trends representing technological change. The method is shown to out-perform linear methods such as OLS and ridge/lasso, as well as random forest. The procedures described in this paper are implemented in the R package panelNNET.
• We explore design principles for general pixel-level prediction problems, from low-level edge detection to mid-level surface normal estimation to high-level semantic segmentation. Convolutional predictors, such as the fully-convolutional network (FCN), have achieved remarkable success by exploiting the spatial redundancy of neighboring pixels through convolutional processing. Though computationally efficient, we point out that such approaches are not statistically efficient during learning precisely because spatial redundancy limits the information learned from neighboring pixels. We demonstrate that stratified sampling of pixels allows one to (1) add diversity during batch updates, speeding up learning; (2) explore complex nonlinear predictors, improving accuracy; and (3) efficiently train state-of-the-art models tabula rasa (i.e., "from scratch") for diverse pixel-labeling tasks. Our single architecture produces state-of-the-art results for semantic segmentation on PASCAL-Context dataset, surface normal estimation on NYUDv2 depth dataset, and edge detection on BSDS.
• Bias is a common problem in today's media, appearing frequently in text and in visual imagery. Users on social media websites such as Twitter need better methods for identifying bias. Additionally, activists --those who are motivated to effect change related to some topic, need better methods to identify and counteract bias that is contrary to their mission. With both of these use cases in mind, in this paper we propose a novel tool called UnbiasedCrowd that supports identification of, and action on bias in visual news media. In particular, it addresses the following key challenges (1) identification of bias; (2) aggregation and presentation of evidence to users; (3) enabling activists to inform the public of bias and take action by engaging people in conversation with bots. We describe a preliminary study on the Twitter platform that explores the impressions that activists had of our tool, and how people reacted and engaged with online bots that exposed visual bias. We conclude by discussing design and implication of our findings for creating future systems to identify and counteract the effects of news bias.
• Brains need to predict how our muscles and body react to motor commands. How networks of spiking neurons can learn to reproduce these non-linear dynamics, using local, online and stable learning rules, is an important, open question. Here, we present a supervised learning scheme for the feedforward and recurrent connections in a network of heterogeneous spiking neurons. The error in the output is fed back through fixed random connections with a negative gain, causing the network to follow the desired dynamics, while an online and local rule changes the weights; hence we call the scheme FOLLOW (Feedback-based Online Local Learning Of Weights) The rule is local in the sense that weight changes depend on the presynaptic activity and the error signal projected onto the post-synaptic neuron. We provide examples of learning linear, non-linear and chaotic dynamics, as well as the dynamics of a two-link arm. Using the Lyapunov method, and under reasonable assumptions and approximations, we show that FOLLOW learning is uniformly stable, with the error going to zero asymptotically.
• In this paper, we focus on fully automatic traffic surveillance camera calibration which we use for speed measurement of passing vehicles. We improve over a recent state-of-the-art camera calibration method for traffic surveillance based on two detected vanishing points. More importantly, we propose a novel automatic scene scale inference based on matching bounding boxes of rendered 3D models of vehicles with detected bounding boxes in the image. The proposed method can be used from an arbitrary viewpoint and it has no constraints on camera placement. We evaluate our method on recent comprehensive dataset for speed measurement BrnoCompSpeed. Experiments show that our automatic camera calibration by detected two vanishing points method reduces the error by 50% compared to the previous state-of-the-art method. We also show that our scene scale inference method is much more precise (mean speed measurement error 1.10km/h) outperforming both state of the art automatic calibration method (error reduction by 86% -- mean error 7.98km/h) and manual calibration (error reduction by 19% -- mean error 1.35km/h). We also present qualitative results of automatic camera calibration method on video sequences obtained from real surveillance cameras on various places and under different lighting conditions (night, dawn, day).
• In this paper, we focus on traffic camera calibration and visual speed measurement from a single monocular camera, which is an important task of visual traffic surveillance. Existing methods addressing this problem are hard to compare due to lack of a common dataset with reliable ground truth. Therefore, it is not clear how the methods compare in various aspects and what are the factors affecting their performance. We captured a new dataset of 18 full-HD videos, each around one hour long, captured at 6 different locations. Vehicles in videos (20,865 instances in total) are annotated with precise speed measurements from optical gates using LIDAR and verified with several reference GPS tracks. We provide the videos and metadata (calibration, lengths of features in image, annotations, etc.) for future comparison and evaluation. Camera calibration is the most crucial part of the speed measurement; therefore, we provide a review of the methods and analyze a recently published method for fully automatic camera calibration and vehicle speed measurement and report the results on this dataset in detail.
• In this article, we use the Combinatorial Nullstellensatz to give new proofs of the Cauchy-Davenport, the Dias da Silva-Hamidoune and to generalize a previous addition theorem of the author. Precisely, this last result proves that for a set A $\subset$ Fp such that A $\cap$ (--A) = $\emptyset$ the cardinality of the set of subsums of at least $\alpha$ pairwise distinct elements of A is: |$\Sigma$$\alpha$(A)| $\ge$ min (p, |A|(|A| + 1)/2 -- $\alpha$($\alpha$ + 1)/2 + 1) , the only cases previously known were $\alpha$ $\in$ 0, 1. The Combinatorial Nullstellensatz is used, for the first time, in a direct and in a reverse way. The direct (and usual) way states that if some coefficient of a polynomial is non zero then there is a solution or a contradiction. The reverse way relies on the coefficient formula (equivalent to the Combinatorial Nullstellensatz). This formula gives an expression for the coefficient as a sum over any cartesian product. For these three addition theorems, some arithmetical progressions (that reach the bounds) will allow to consider cartesian products such that the coefficient formula is a sum all of whose terms are zero but exactly one. Thus we can conclude the proofs without computing the appropriate coefficients.
• The event-based model (EBM) for data-driven disease progression modeling estimates the sequence in which biomarkers for a disease become abnormal. This helps in understanding the dynamics of disease progression and facilitates early diagnosis by staging patients on a disease progression timeline. Existing EBM methods are all generative in nature. In this work we propose a novel discriminative approach to EBM, which is shown to be more accurate as well as computationally more efficient than existing state-of-the art EBM methods. The method first estimates for each subject an approximate ordering of events, by ranking the posterior probabilities of individual biomarkers being abnormal. Subsequently, the central ordering over all subjects is estimated by fitting a generalized Mallows model to these approximate subject-specific orderings based on a novel probabilistic Kendall's Tau distance. To evaluate the accuracy, we performed extensive experiments on synthetic data simulating the progression of Alzheimer's disease. Subsequently, the method was applied to the Alzheimer's Disease Neuroimaging Initiative (ADNI) data to estimate the central event ordering in the dataset. The experiments benchmark the accuracy of the new model under various conditions and compare it with existing state-of-the-art EBM methods. The results indicate that discriminative EBM could be a simple and elegant approach to disease progression modeling.
• Feb 22 2017 math.CT math.CO math.MG arXiv:1702.06388v1
We use the language of precategories to formulate a general mathematical framework for phylogenetics.
• Feb 22 2017 cs.IR cs.CV arXiv:1702.06383v1
Comparing images in order to recommend items from an image-inventory is a subject of continued interest. Added with the scalability of deep-learning architectures the once `manual' job of hand-crafting features have been largely alleviated, and images can be compared according to features generated from a deep convolutional neural network. In this paper, we compare distance metrics (and divergences) to rank features generated from a neural network, for content-based image retrieval. Specifically, after modelling individual images using approximations of mixture models or sparse covariance estimators we resort to their information-theoretic and Riemann geometric comparisons. We show that using approximations of mixture models enable us to to compute a distance measure based on the Wasserstein metric that requires less effort than computationally intensive optimal transport plans; finally, an affine invariant metric is used to compare the optimal transport metric to its Riemann geometric counterpart -- we conclude that although expensive, retrieval metric based on Wasserstein geometry are more suitable than information theoretic comparison of images. In short, we combine GPU scalability in learning deep feature vectors with computationally efficient metrics that we foresee being utilized in a commercial setting.
• Complex Event Recognition applications exhibit various types of uncertainty, ranging from incomplete and erroneous data streams to imperfect complex event patterns. We review Complex Event Recognition techniques that handle, to some extent, uncertainty. We examine techniques based on automata, probabilistic graphical models and first-order logic, which are the most common ones, and approaches based on Petri Nets and Grammars, which are less frequently used. A number of limitations are identified with respect to the employed languages, their probabilistic models and their performance, as compared to the purely deterministic cases. Based on those limitations, we highlight promising directions for future work.
• This paper proposes a branched residual network for image classification. It is known that high-level features of deep neural network are more representative than lower-level features. By sharing the low-level features, the network can allocate more memory to high-level features. The upper layers of our proposed network are branched, so that it mimics the ensemble learning. By mimicking ensemble learning with single network, we have achieved better performance on ImageNet classification task.
• Quantum discord refers to an important aspect of quantum correlations for bipartite quantum systems. In our earlier works we have shown that corresponding to every graph (combinatorial) there are quantum states whose properties are reflected in the structure of the corresponding graph. Here, we attempt to develop a graph theoretic study of quantum discord that corresponds to a necessary and sufficient condition of zero quantum discord states which says that the blocks of density matrix corresponding to a zero quantum discord state are normal and commute with each other. These blocks have a one to one correspondence with some specific subgraphs of the graph which represents the quantum state. We obtain a number of graph theoretic properties representing normality and commutativity of a set of matrices which are indeed arising from the given graph. Utilizing these properties we define graph theoretic measures for normality and commutativity that results a formulation of graph theoretic quantum discord. We identify classes of quantum states with zero discord using the said formulation.
• Limits on power dissipation have pushed CPUs to grow in parallel processing capabilities rather than clock rate, leading to the rise of "manycore" or GPU-like processors. In order to achieve the best performance, applications must be able to take full advantage of vector units across multiple cores, or some analogous arrangement on an accelerator card. Such parallel performance is becoming a critical requirement for methods to reconstruct the tracks of charged particles at the Large Hadron Collider and, in the future, at the High Luminosity LHC. This is because the steady increase in luminosity is causing an exponential growth in the overall event reconstruction time, and tracking is by far the most demanding task for both online and offline processing. Many past and present collider experiments adopted Kalman filter-based algorithms for tracking because of their robustness and their excellent physics performance, especially for solid state detectors where material interactions play a significant role. We report on the progress of our studies towards a Kalman filter track reconstruction algorithm with optimal performance on manycore architectures. The combinatorial structure of these algorithms is not immediately compatible with an efficient SIMD (or SIMT) implementation; the challenge for us is to recast the existing software so it can readily generate hundreds of shared-memory threads that exploit the underlying instruction set of modern processors. We show how the data and associated tasks can be organized in a way that is conducive to both multithreading and vectorization. We demonstrate very good performance on Intel Xeon and Xeon Phi architectures, as well as promising first results on Nvidia GPUs.
• We consider the reionization process in a cosmological model in which dark matter interacts with dark energy. Using a semi-analytical reionization model, we compute the evolution of the ionized fraction in terms of its spatial average and linear perturbations. We show that certain types of interactions between dark matter and dark energy can significantly affect the reionization history. We calculate the 21 cm signals in the interaction models, and compare the results with the predictions of the ${\rm \Lambda CDM}$ model.
• Object detection in videos has drawn increasing attention recently with the introduction of the large-scale ImageNet VID dataset. Different from object detection in static images, temporal information in videos provides vital information for object detection. To fully utilize temporal information, state-of-the-art methods are therefore based on spatiotemporal tubelets, which are essentially sequences of associated bounding boxes across time. However, the existing methods have major limitations in generating tubelets in terms of quality and efficiency. Motion-based methods are able to obtain dense tubelets, but the lengths are generally only several frames, which is not optimal to incorporate long-term temporal information. Appearance-based methods, usually involving generic object tracking, could generate long tubelets, but are usually computational expensive. In this work, we propose a framework for object detection in videos, which consists of a novel tubelet proposal network to efficiently generate spatiotemporal proposals, and a Long Short-term Memory (LSTM) network that incorporates temporal information from tubelet proposals for achieving high object detection accuracy in videos. The experiments on the large-scale ImageNet VID dataset demonstrate the effectiveness of the proposed framework for object detection in videos.
• Recommendation for e-commerce with a mix of durable and nondurable goods has characteristics that distinguish it from the well-studied media recommendation problem. The demand for items is a combined effect of form utility and time utility, i.e., a product must both be intrinsically appealing to a consumer and the time must be right for purchase. In particular for durable goods, time utility is a function of inter-purchase duration within product category because consumers are unlikely to purchase two items in the same category in close temporal succession. Moreover, purchase data, in contrast to ratings data, is implicit with non-purchases not necessarily indicating dislike. Together, these issues give rise to the positive-unlabeled demand-aware recommendation problem that we pose via joint low-rank tensor completion and product category inter-purchase duration vector estimation. We further relax this problem and propose a highly scalable alternating minimization approach with which we can solve problems with millions of users and items. We also show superior prediction accuracies on multiple real-world data sets.
• In this paper, we import tensor index notation including Einstein summation notation into programming by introducing two kinds of functions, tensor functions and scalar functions. Tensor functions are functions that contract the tensors given as an argument, and scalar functions are the others. As with ordinary functions, when a tensor function obtains a tensor as an argument, the tensor function treats the tensor as it is as a tensor. On the other hand, when a scalar function obtains a tensor as an argument, the scalar function is applied to each component of the tensor. This paper shows that, by introducing these two kinds of functions, index notation can be imported into whole programming, that means we can use index notation for arbitrary functions, without requiring annoying description to enable each function to handle tensors. This method can be applied to arbitrary programming languages.
• The Hamiltonian formulation of conformally invariant Weyl-squared higher derivative theory teaches us that conformal symmetry is expressed through particular first class constraints related to the absence of the three-metric determinant and the trace of the extrinsic curvature from the theory. Any term depending on them which is added to this theory breaks conformal invariance and turns these constraints into second class ones. Such second class constraints are missing in the standard canonical formulation of the conformally non-invariant Einstein-Hilbert theory. It is demonstrated that such constraints do appear if the theory is treated as a higher derivative one --- if the extrinsic curvature is promoted to an independent variable, the apparently missing information about conformal behavior is revealed.
• Mobile robots are increasingly being employed for performing complex tasks in dynamic environments. Reinforcement learning (RL) methods are recognized to be promising for specifying such tasks in a relatively simple manner. However, the strong dependency between the learning method and the task to learn is a well-known problem that restricts practical implementations of RL in robotics, often requiring major modifications of parameters and adding other techniques for each particular task. In this paper we present a practical core implementation of RL which enables the learning process for multiple robotic tasks with minimal per-task tuning or none. Based on value iteration methods, this implementation includes a novel approach for action selection, called Q-biased softmax regression (QBIASSR), which avoids poor performance of the learning process when the robot reaches new unexplored states. Our approach takes advantage of the structure of the state space by attending the physical variables involved (e.g., distances to obstacles, X,Y,\theta pose, etc.), thus experienced sets of states may favor the decision-making process of unexplored or rarely-explored states. This improvement has a relevant role in reducing the tuning of the algorithm for particular tasks. Experiments with real and simulated robots, performed with the software framework also introduced here, show that our implementation is effectively able to learn different robotic tasks without tuning the learning method. Results also suggest that the combination of true online SARSA(\lambda) with QBIASSR can outperform the existing RL core algorithms in low-dimensional robotic tasks.
• Feb 22 2017 cs.LG stat.ML arXiv:1702.06295v1
Initialization of parameters in deep neural networks has been shown to have a big impact on the performance of the networks (Mishkin & Matas, 2015). The initialization scheme devised by He et al, allowed convolution activations to carry a constrained mean which allowed deep networks to be trained effectively (He et al., 2015a). Orthogonal initializations and more generally orthogonal matrices in standard recurrent networks have been proved to eradicate the vanishing and exploding gradient problem (Pascanu et al., 2012). Majority of current initialization schemes do not take fully into account the intrinsic structure of the convolution operator. This paper introduces a new type of initialization built around the duality of the Fourier transform and the convolution operator. With Convolution Aware Initialization we noticed not only higher accuracy and lower loss, but faster convergence in general. We achieve new state of the art on the CIFAR10 dataset, and achieve close to state of the art on various other tasks.
• This paper presents a novel approach for video-based person re-identification using multiple Convolutional Neural Networks (CNNs). Unlike previous work, we intend to extract a compact yet discriminative appearance representation from several frames rather than the whole sequence. Specifically, given a video, the representative frames are selected based on the walking profile of consecutive frames. A multiple CNN architecture incorporated with feature pooling is proposed to learn and compile the features of the selected representative frames into a compact description about the pedestrian for identification. Experiments are conducted on benchmark datasets to demonstrate the superiority of the proposed method over existing person re-identification approaches.
• One of the major challenges of model-free visual tracking problem has been the difficulty originating from the unpredictable and drastic changes in the appearance of objects we target to track. Existing methods tackle this problem by updating the appearance model on-line in order to adapt to the changes in the appearance. Despite the success of these methods however, inaccurate and erroneous updates of the appearance model result in a tracker drift. In this paper, we introduce a novel visual tracking algorithm based on a template selection strategy constructed by deep reinforcement learning methods. The tracking algorithm utilizes this strategy to choose the best template for tracking a given frame. The template selection strategy is self-learned by utilizing a simple policy gradient method on numerous training episodes randomly generated from a tracking benchmark dataset. Our proposed reinforcement learning framework is generally applicable to other confidence map based tracking algorithms. The experiment shows that our tracking algorithm effectively decides the best template for visual tracking.
• Sound events often occur in unstructured environments where they exhibit wide variations in their frequency content and temporal structure. Convolutional neural networks (CNN) are able to extract higher level features that are invariant to local spectral and temporal variations. Recurrent neural networks (RNNs) are powerful in learning the longer term temporal context in the audio signals. CNNs and RNNs as classifiers have recently shown improved performances over established methods in various sound recognition tasks. We combine these two approaches in a Convolutional Recurrent Neural Network (CRNN) and apply it on a polyphonic sound event detection task. We compare the performance of the proposed CRNN method with CNN, RNN, and other established methods, and observe a considerable improvement for four different datasets consisting of everyday sound events.
• The most general structure (in matrix form) of a single-qubit gate is presented. Subsequently, used that to obtain a set of conditions for testing (a) whether a given 2-qubit gate is genuinely a 2-qubit gate, i.e., not decomposable into two single qubit gates and (b) whether a given single qubit gate is self-inverse? Relevance of the results reported here is discussed in the context of optimization of reversible and quantum circuits, especially for the optimization of quantum cost. A systematic procedure is developed for the identification of the non-decomposable 2-qubit gates. Such a non-decomposable 2-qubit gate along with all possible single qubit gates form a universal quantum gate library. Further, some possible applications of the present work are also discussed.
• Deep convolutional networks are well-known for their high computational and memory demands. Given limited resources, how does one design a network that balances its size, training time, and prediction accuracy? A surprisingly effective approach to trade accuracy for size and speed is to simply reduce the number of channels in each convolutional layer by a fixed fraction and retrain the network. In many cases this leads to significantly smaller networks with only minimal changes to accuracy. In this paper, we take a step further by empirically examining a strategy for deactivating connections between filters in convolutional layers in a way that allows us to harvest savings both in run-time and memory for many network architectures. More specifically, we generalize 2D convolution to use a channel-wise sparse connection structure and show that this leads to significantly better results than the baseline approach for large networks including VGG and Inception V3.
• Successful analysis of player skills in video games has important impacts on the process of enhancing player experience without undermining their continuous skill development. Moreover, player skill analysis becomes more intriguing in team-based video games because such form of study can help discover useful factors in effective team formation. In this paper, we consider the problem of skill decomposition in MOBA (MultiPlayer Online Battle Arena) games, with the goal to understand what player skill factors are essential for the outcome of a game match. To understand the construct of MOBA player skills, we utilize various skill-based predictive models to decompose player skills into interpretative parts, the impact of which are assessed in statistical terms. We apply this analysis approach on two widely known MOBAs, namely League of Legends (LoL) and Defense of the Ancients 2 (DOTA2). The finding is that base skills of in-game avatars, base skills of players, and players' champion-specific skills are three prominent skill components influencing LoL's match outcomes, while those of DOTA2 are mainly impacted by in-game avatars' base skills but not much by the other two.
• Recommendation system is a common demand in daily life and matrix completion is a widely adopted technique for this task. However, most matrix completion methods lack semantic interpretation and usually result in weak-semantic recommendations. To this end, this paper proposes a \bf Semantic \bf Analysis approach for \bf Recommendation systems \textbf(SAR), which applies a two-level hierarchical generative process that assigns semantic properties and categories for user and item. SAR learns semantic representations of users/items merely from user ratings on items, which offers a new path to recommendation by semantic matching with the learned representations. Extensive experiments demonstrate SAR outperforms other state-of-the-art baselines substantially.
• Argument component detection (ACD) is an important sub-task in argumentation mining. ACD aims at detecting and classifying different argument components in natural language texts. Historical annotations (HAs) are important features the human annotators consider when they manually perform the ACD task. However, HAs are largely ignored by existing automatic ACD techniques. Reinforcement learning (RL) has proven to be an effective method for using HAs in some natural language processing tasks. In this work, we propose a RL-based ACD technique, and evaluate its performance on two well-annotated corpora. Results suggest that, in terms of classification accuracy, HAs-augmented RL outperforms plain RL by at most 17.85%, and outperforms the state-of-the-art supervised learning algorithm by at most 11.94%.
• We obtain the first polynomial-time algorithm for exact tensor completion that improves over the bound implied by reduction to matrix completion. The algorithm recovers an unknown 3-tensor with $r$ incoherent, orthogonal components in $\mathbb R^n$ from $r\cdot \tilde O(n^{1.5})$ randomly observed entries of the tensor. This bound improves over the previous best one of $r\cdot \tilde O(n^{2})$ by reduction to exact matrix completion. Our bound also matches the best known results for the easier problem of approximate tensor completion (Barak & Moitra, 2015). Our algorithm and analysis extends seminal results for exact matrix completion (Candes & Recht, 2009) to the tensor setting via the sum-of-squares method. The main technical challenge is to show that a small number of randomly chosen monomials are enough to construct a degree-3 polynomial with a precisely planted orthogonal global optima over the sphere and that this fact can be certified within the sum-of-squares proof system.
• The linear coupling of a rotating heat bath to a quantum field is studied in the framework of the Markovian master equation for the field's non-unitary time evolution. The bath's rotation induces population inversion for the field's low-energy modes. For bosons, this leads to superradiance, an irreversible process in which some of the bath's kinetic energy is extracted by spontaneous and stimulated emission. We find the energy and entropy balance for such systems and apply our results to the theory of black hole radiation. We also comment on how this relates to classical self-oscillations such as the hydrodynamic Kelvin-Helmholtz instability.
• This paper addresses tracking of a moving target in a multi-agent network. The target follows a linear dynamics corrupted by an adversarial noise, i.e., the noise is not generated from a statistical distribution. The location of the target at each time induces a global time-varying loss function, and the global loss is a sum of local losses, each of which is associated to one agent. Agents noisy observations could be nonlinear. We formulate this problem as a distributed online optimization where agents communicate with each other to track the minimizer of the global loss. We then propose a decentralized version of the Mirror Descent algorithm and provide the non-asymptotic analysis of the problem. Using the notion of dynamic regret, we measure the performance of our algorithm versus its offline counterpart in the centralized setting. We prove that the bound on dynamic regret scales inversely in the network spectral gap, and it represents the adversarial noise causing deviation with respect to the linear dynamics. Our result subsumes a number of results in the distributed optimization literature. Finally, in a numerical experiment, we verify that our algorithm can be simply implemented for multi-agent tracking with nonlinear observations.
• This paper considers the problem of implementing a previously proposed distributed direct coupling quantum observer for a closed linear quantum system. By modifying the form of the previously proposed observer, the paper proposes a possible experimental implementation of the observer plant system using a non-degenerate parametric amplifier and a chain of optical cavities which are coupled together via optical interconnections. It is shown that the distributed observer converges to a consensus in a time averaged sense in which an output of each element of the observer estimates the specified output of the quantum plant.
• Recognizing human activities in a sequence is a challenging area of research in ubiquitous computing. Most approaches use a fixed size sliding window over consecutive samples to extract features---either handcrafted or learned features---and predict a single label for all samples in the window. Two key problems emanate from this approach: i) the samples in one window may not always share the same label. Consequently, using one label for all samples within a window inevitably lead to loss of information; ii) the testing phase is constrained by the window size selected during training while the best window size is difficult to tune in practice. We propose an efficient algorithm that can predict the label of each sample, which we call dense labeling, in a sequence of human activities of arbitrary length using a fully convolutional network. In particular, our approach overcomes the problems posed by the sliding window step. Additionally, our algorithm learns both the features and classifier automatically. We release a new daily activity dataset based on a wearable sensor with hospitalized patients. We conduct extensive experiments and demonstrate that our proposed approach is able to outperform the state-of-the-arts in terms of classification and label misalignment measures on three challenging datasets: Opportunity, Hand Gesture, and our new dataset.
• Hypothesis generation is becoming a crucial time-saving technique which allows biomedical researchers to quickly discover implicit connections between important concepts. Typically, these systems operate on domain-specific fractions of public medical data. MOLIERE, in contrast, utilizes information from over 24.5 million documents. At the heart of our approach lies a multi-modal and multi-relational network of biomedical objects extracted from several heterogeneous datasets from the National Center for Biotechnology Information (NCBI). These objects include but are not limited to scientific papers, keywords, genes, proteins, diseases, and diagnoses. We model hypotheses using Latent Dirichlet Allocation applied on abstracts found near shortest paths discovered within this network, and demonstrate the effectiveness of MOLIERE by performing hypothesis generation on historical data. Our network, implementation, and resulting data are all publicly available for the broad scientific community.
• Boolean matrix factorisation (BooMF) infers interpretable decompositions of a binary data matrix into a pair of low-rank, binary matrices: One containing meaningful patterns, the other quantifying how the observations can be expressed as a combination of these patterns. We introduce the OrMachine, a probabilistic generative model for BooMF and derive a Metropolised Gibbs sampler that facilitates very efficient parallel posterior inference. Our method outperforms all currently existing approaches for Boolean Matrix factorization and completion, as we show on simulated and real world data. This is the first method to provide full posterior inference for BooMF which is relevant in applications, e.g. for controlling false positive rates in collaborative filtering, and crucially it improves the interpretability of the inferred patterns. The proposed algorithm scales to large datasets as we demonstrate by analysing single cell gene expression data in 1.3 million mouse brain cells across 11,000 genes on commodity hardware.
• Edge bundling is an important concept, heavily used for graph visualization purposes. To enable the comparison with other established nearly-planarity models in graph drawing, we formulate a new edge-bundling model which is inspired by the recently introduced fan-planar graphs. In particular, we restrict the bundling to the endsegments of the edges. As in 1-planarity, we call our model 1-fan-bundle-planarity, as we allow at most one crossing per bundle. For the two variants where we allow either one or, more naturally, both endsegments of each edge to be part of bundles, we present edge density results and consider various recognition questions, not only for general graphs, but also for the outer and 2-layer variants. We conclude with a series of challenging questions.
• In this paper we explore city-level traffic and parking data to determine how much cruising for curbside parking contributes to overall traffic congestion. To this end, we describe a new kind of queueing network and present a data-informed model based on this new queuing network. We leverage the data-informed model in developing and validating a simulation tool. In addition, we utilize curbside parking and arterial traffic volume data to produce an estimate of the proportion of traffic searching for parking along high occupancy arterials. Somewhat surprisingly, we find that while percentage increase in travel time to through traffic vehicles depends on time of day, it does not appear to depend on high volumes of through traffic. Moreover, we show that the probability of a block-face being full is a much more viable metric for directly controlling congestion than average occupancy rate, typically used by municipalities.
• Feature extraction is a critical component of many applied data science workflows. In recent years, rapid advances in artificial intelligence and machine learning have led to an explosion of feature extraction tools and services that allow data scientists to cheaply and effectively annotate their data along a vast array of dimensions---ranging from detecting faces in images to analyzing the sentiment expressed in coherent text. Unfortunately, the proliferation of powerful feature extraction services has been mirrored by a corresponding expansion in the number of distinct interfaces to feature extraction services. In a world where nearly every new service has its own API, documentation, and/or client library, data scientists who need to combine diverse features obtained from multiple sources are often forced to write and maintain ever more elaborate feature extraction pipelines. To address this challenge, we introduce a new open-source framework for comprehensive multimodal feature extraction. Pliers is an open-source Python package that supports standardized annotation of diverse data types (video, images, audio, and text), and is expressly with both ease-of-use and extensibility in mind. Users can apply a wide range of pre-existing feature extraction tools to their data in just a few lines of Python code, and can also easily add their own custom extractors by writing modular classes. A graph-based API enables rapid development of complex feature extraction pipelines that output results in a single, standardized format. We describe the package's architecture, detail its major advantages over previous feature extraction toolboxes, and use a sample application to a large functional MRI dataset to illustrate how pliers can significantly reduce the time and effort required to construct sophisticated feature extraction workflows while increasing code clarity and maintainability.

Māris Ozols Feb 21 2017 15:35 UTC

I'm wondering if this result could have any interesting consequences for Hamiltonian complexity. The LCL problem sounds very much like a local Hamiltonian problem, with the run-time of an LCL algorithm corresponding to the range of local interactions in the Hamiltonian.

Maybe one caveat is that thi

...(continued)
Andrey Karchevsky Feb 17 2017 09:51 UTC

Dear Authors,

This is in reference of your preprint arxiv 1702.0638.

Above all I must say that I am puzzled with the level of publicity your work has got at http://www.nature.com/news/long-awaited-mathematics-proof-could-help-scan-earth-s-innards-1.21439. Is this a new way for mathematicians t

...(continued)
Karl Joulain Feb 09 2017 15:50 UTC

A **GREAT** paper. Where you learn how to extract work from the measurement of a qubit coupled to a drive. The authors build an engine with very unusual and interesting features such as efficiency of 1 (no entropy creation) arising for conditions where the power extrated is maximum! This maximum dep

...(continued)
Jānis Iraids Jan 25 2017 11:35 UTC

You are correct, that is a mistake -- it should be $\\{0,1\\}^n\rightarrow\\{0,1\\}$. Thank you for spotting it!

Christopher Thomas Chubb Jan 25 2017 02:27 UTC

In the abstract, should the domain of $f$ be $\lbrace0,1\rbrace^n$ instead of just $\lbrace0,1\rbrace$?

Robert Raussendorf Jan 24 2017 22:29 UTC

Regarding Mark's above comment on the role of the stabilizer states: Yes, all previous works on the subject have used the stabilizer states and Clifford gates as the classical backbone. This is due to the Gottesman-Knill theorem and related results. But is it a given that the free sector in quantum

...(continued)
Planat Jan 24 2017 13:09 UTC

Are you sure? Since we do not propose a conjecture, there is nothing wrong. A class of strange states underlie the pentagons in question. The motivation is to put the magic of computation in the permutation frame, one needs more work to check its relevance.

Mark Howard Jan 24 2017 09:59 UTC

It seems interesting at first sight, but after reading it the motivation is very muddled. It boils down to finding pentagons (which enable KCBS-style proofs of contextuality) within sets of projectors, some of which are stabilizer states and some of which are non-stabilizer states (called magic stat

...(continued)
Zoltán Zimborás Jan 12 2017 20:38 UTC

Here is a nice description, with additional links, about the importance of this work if it turns out to be flawless (thanks a lot to Martin Schwarz for this link): [dichotomy conjecture][1].

[1]: http://processalgebra.blogspot.com/2017/01/has-feder-vardi-dichotomy-conjecture.html

Noon van der Silk Jan 05 2017 04:51 UTC

This is a cool paper!