- Nov 22 2017 quant-ph cond-mat.str-el arXiv:1711.07500v1 A method to study strongly interacting quantum many-body systems at and away from criticality is proposed. The method is based on a MERA-like tensor network that can be efficiently and reliably contracted on a noisy quantum computer using a number of qubits that is much smaller than the system size. We prove that the outcome of the contraction is stable to noise and that the estimated energy upper bounds the ground state energy. The stability, which we numerically substantiate, follows from the positivity of operator scaling dimensions under renormalization group flow. The variational upper bound follows from a particular assignment of physical qubits to different locations of the tensor network plus the assumption that the noise model is local. We postulate a scaling law for how well the tensor network can approximate ground states of lattice regulated conformal field theories in d spatial dimensions and provide evidence for the postulate. Under this postulate, an $O(\log^{d}(1/\delta))$-qubit quantum computer can prepare a valid quantum-mechanical state with energy density $\delta$ above the ground state. In the presence of noise, $\delta = O(\epsilon \log^{d+1}(1/\epsilon))$ can be achieved, where $\epsilon$ is the noise strength.
- Large-scale quantum computation is likely to require massive quantum error correction (QEC). QEC codes and circuits are described via the stabilizer formalism, which represents stabilizer states by keeping track of the operators that preserve them. Such states are obtained by stabilizer circuits (consisting of CNOT, Hadamard and Phase gates) and can be represented compactly on conventional computers using $O(n^2)$ bits, where $n$ is the number of qubits. As an additional application, the work by Aaronson and Gottesman suggests the use of superpositions of stabilizer states to represent arbitrary quantum states. To aid in such applications and improve our understanding of stabilizer states, we characterize and count nearest-neighbor stabilizer states, quantify the distribution of angles between pairs of stabilizer states, study succinct stabilizer superpositions and stabilizer bivectors, explore the approximation of non-stabilizer states by single stabilizer states and short linear combinations of stabilizer states, develop an improved inner-product computation for stabilizer states via synthesis of compact canonical stabilizer circuits, propose an orthogonalization procedure for stabilizer states, and evaluate several of these algorithms empirically.
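As a minimal sketch of this bookkeeping, the $O(n^2)$-bit representation can be implemented as binary X/Z matrices plus sign bits, updated by the Aaronson-Gottesman rules for CNOT, Hadamard, and Phase gates (illustrative class and names; it tracks only the stabilizer generators, without the destabilizer rows needed for measurement):

```python
import numpy as np

class StabilizerTableau:
    """Track n stabilizer generators as X/Z bit matrices plus sign bits."""
    def __init__(self, n):
        self.n = n
        self.x = np.zeros((n, n), dtype=np.uint8)  # X-part of each generator
        self.z = np.eye(n, dtype=np.uint8)         # Z-part: |0...n> is stabilized by Z_i
        self.r = np.zeros(n, dtype=np.uint8)       # sign bit (0 -> +, 1 -> -)

    def h(self, q):
        # Hadamard on qubit q: swap X and Z, picking up a sign when both bits are set
        self.r ^= self.x[:, q] & self.z[:, q]
        self.x[:, q], self.z[:, q] = self.z[:, q].copy(), self.x[:, q].copy()

    def s(self, q):
        # Phase gate on qubit q: X -> Y (z ^= x), with the same sign update
        self.r ^= self.x[:, q] & self.z[:, q]
        self.z[:, q] ^= self.x[:, q]

    def cnot(self, c, t):
        # CNOT with control c, target t
        self.r ^= self.x[:, c] & self.z[:, t] & (self.x[:, t] ^ self.z[:, c] ^ 1)
        self.x[:, t] ^= self.x[:, c]
        self.z[:, c] ^= self.z[:, t]

    def generators(self):
        pauli = {(0, 0): 'I', (1, 0): 'X', (0, 1): 'Z', (1, 1): 'Y'}
        return [('-' if self.r[i] else '+') +
                ''.join(pauli[(self.x[i, q], self.z[i, q])] for q in range(self.n))
                for i in range(self.n)]
```

For example, applying H on qubit 0 and then CNOT(0,1) to $|00\rangle$ turns the generators into the Bell-state stabilizers +XX and +ZZ.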
- Quantum information theory has considerably helped in the understanding of quantum many-body systems. Since the early 2000s, various measures of quantum entanglement have been employed to characterise the features of the ground and excited states of quantum matter. Furthermore, the scaling of entanglement entropy with the size of a system has inspired modifications to numerical techniques for the simulation of many-body systems, leading to the now established area of tensor networks. However, the knowledge and the methods brought by quantum information do not end with bipartite entanglement. There are other forms of quantum correlations that emerge "for free" in the ground and thermal states of condensed matter models and that can be exploited as resources for quantum technologies. The goal of this work is to review the most recent developments on quantum correlations for quantum many-body systems, focussing on multipartite entanglement, quantum nonlocality, quantum discord, and mutual information, but also other non-classical resources like quantum coherence. Moreover, we also discuss applications of quantum metrology in quantum many-body systems.
- We explore various combinatorial problems, mostly borrowed from physics, that share the property of being continuously or discretely integrable, a feature that guarantees the existence of conservation laws that often make the problems exactly solvable. We illustrate this with random surfaces, lattice models, and structure constants in representation theory.
- One of the significant breakthroughs in quantum computation is Grover's algorithm for unsorted database search. Recently, applications of Grover's algorithm to solve global optimization problems have been demonstrated, where unknown optimum solutions are found by iteratively improving the threshold value for the selective phase shift operator in the Grover rotation. In this paper, a hybrid approach that combines continuous-time quantum walks with Grover search is proposed so that the search is accelerated with improved threshold values. By taking advantage of the quantum tunneling effect, better threshold values can be found at an early stage of the search process, so that the sharpness of the probability distribution improves. The new algorithm is compared with existing Grover search and with classical heuristic algorithms.
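The amplitude-amplification step that the threshold drives can be illustrated with a small classical statevector simulation (a sketch under simplifying assumptions: here the marked set, i.e. the items beating the current threshold, is known exactly, whereas the actual algorithm only accesses it through the oracle):

```python
import numpy as np

def grover_success_probability(n_items, marked, n_iterations):
    """Simulate Grover iterations on a statevector and return the
    probability of measuring an item in the marked set."""
    psi = np.full(n_items, 1 / np.sqrt(n_items))   # uniform superposition
    for _ in range(n_iterations):
        psi[marked] *= -1                          # oracle: flip marked amplitudes
        psi = 2 * psi.mean() - psi                 # diffusion: inversion about the mean
    return float(np.sum(psi[marked] ** 2))

# One threshold-improvement round: 1 marked item among 16,
# using the standard ~(pi/4)*sqrt(N/M) iteration count.
n, m = 16, [3]
r = int(np.pi / 4 * np.sqrt(n / len(m)))           # 3 iterations here
p = grover_success_probability(n, m, r)
```

After the right number of iterations the probability mass concentrates on the marked items (here above 96%), which is what lets a measured sample improve the threshold.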
- Nov 22 2017 cs.CV arXiv:1711.07794v1 Multi-person pose estimation (MPPE) in natural images is key to the meaningful use of visual data in many fields including movement science, security, and rehabilitation. In this paper we tackle MPPE with a bottom-up approach, starting with candidate detections of body parts from a convolutional neural network (CNN) and grouping them into people. We formulate the grouping of body part detections into people as a minimum-weight set packing (MWSP) problem where the set of potential people is the power set of body part detections. We model the quality of a hypothesis of a person, which is a set in the MWSP, by an augmented tree-structured Markov random field where variables correspond to body parts and their state spaces correspond to the power set of the detections for that part. We describe a novel algorithm that combines efficiency with provable bounds on this MWSP problem. We employ an implicit column generation strategy where the pricing problem is formulated as a dynamic program. To efficiently solve this dynamic program we exploit the problem structure, utilizing a nested Benders decomposition (NBD) exact inference strategy, which we speed up by recycling Benders rows between calls to the pricing problem. We test our approach on the MPII-Multiperson dataset, showing that our approach obtains comparable results with the state-of-the-art algorithm for joint node labeling and grouping problems, and that NBD achieves considerable speed-ups relative to a naive dynamic programming approach. Typical algorithms that solve joint node labeling and grouping problems use heuristics and thus cannot obtain proofs of optimality. Our approach, in contrast, proves that for over 99 percent of problem instances we find the globally optimal solution, and otherwise provides upper/lower bounds.
- Nov 22 2017 cs.CV arXiv:1711.07974v1 We present a method for reconstructing images viewed by observers based only on their eye movements. By exploring the relationships between gaze patterns and image stimuli, the "What Are You Looking At?" (WAYLA) system learns to synthesize photo-realistic images that are similar to the original pictures being viewed. The WAYLA approach is based on the Conditional Generative Adversarial Network (Conditional GAN) image-to-image translation technique of Isola et al. We consider two specific applications - the first, of reconstructing newspaper images from gaze heat maps, and the second, of detailed reconstruction of images containing only text. The newspaper image reconstruction process is divided into two image-to-image translation operations, the first mapping gaze heat maps into image segmentations, and the second mapping the generated segmentation into a newspaper image. We validate the performance of our approach using various evaluation metrics, along with human visual inspection. All results confirm the ability of our network to perform image generation tasks using eye tracking data.
- Nov 22 2017 cs.CV arXiv:1711.07971v1 Both convolutional and recurrent operations are building blocks that process one local neighborhood at a time. In this paper, we present non-local operations as a generic family of building blocks for capturing long-range dependencies. Inspired by the classical non-local means method in computer vision, our non-local operation computes the response at a position as a weighted sum of the features at all positions. This building block can be plugged into many computer vision architectures. On the task of video classification, even without any bells and whistles, our non-local models can compete with or outperform current competition winners on both the Kinetics and Charades datasets. In static image recognition, our non-local models improve object detection/segmentation and pose estimation on the COCO suite of tasks. Code will be made available.
- We consider the use of Deep Learning methods for modeling complex phenomena like those occurring in natural physical processes. With the large amount of data gathered on these phenomena, the data-intensive paradigm could begin to challenge more traditional approaches elaborated over the years in fields like maths or physics. However, despite considerable successes in a variety of application domains, the machine learning field is not yet ready to handle the level of complexity required by such problems. Using an example application, namely Sea Surface Temperature Prediction, we show how general background knowledge gained from physics could be used as a guideline for designing efficient Deep Learning models. In order to motivate the approach and to assess its generality, we demonstrate a formal link between the solution of a class of differential equations underlying a large family of physical phenomena and the proposed model. Experiments and comparisons with a series of baselines, including a state-of-the-art numerical approach, are then provided.
- Nov 22 2017 cs.GT arXiv:1711.07968v1 Compositional Game Theory is a new, recently introduced model of economic games based upon the computer science idea of compositionality. In it, complex and irregular games can be built up from smaller and simpler games, and the equilibria of these complex games can be defined recursively from the equilibria of their simpler subgames. This paper extends the model by providing a final coalgebra semantics for infinite games. In the course of this, we introduce a new operator on games to model the economic concept of subgame perfection.
- The recent Nobel-prize-winning detections of gravitational waves from merging black holes and the subsequent detection of the collision of two neutron stars in coincidence with electromagnetic observations have inaugurated a new era of multimessenger astrophysics. To enhance the scope of this emergent science, the use of deep convolutional neural networks was proposed for the detection and characterization of gravitational wave signals in real-time. This approach, Deep Filtering, was initially demonstrated using simulated LIGO noise. In this article, we present the extension of Deep Filtering using real noise from the first observing run of LIGO, for both detection and parameter estimation of gravitational waves from binary black hole mergers with continuous data streams from multiple LIGO detectors. We show for the first time that machine learning can detect and estimate the true parameters of a real GW event observed by LIGO. Our comparisons show that Deep Filtering is far more computationally efficient than matched-filtering, while retaining similar performance, allowing real-time processing of weak time-series signals in non-stationary non-Gaussian noise, with minimal resources, and also enables the detection of new classes of gravitational wave sources that may go unnoticed with existing detection algorithms. This framework is uniquely suited to enable coincident detection campaigns of gravitational waves and their multimessenger counterparts in real-time.
- Nov 22 2017 cs.MA arXiv:1711.07951v1 We present ongoing work on a tool that consists of two parts: (i) a raw micro-level abstract world simulator with an interface to (ii) a 3D game engine that translates raw abstract simulator data into photorealistic graphics. Part (i) implements a dedicated cellular automaton (CA) on reconfigurable hardware (FPGA) and part (ii) interfaces with a deep learning framework for training neural networks. The bottleneck of such an architecture usually lies in the fact that transferring the state of the whole CA significantly slows down the simulation. We bypass this by sending only a small subset of the general state, which we call a 'locus of visibility', akin to a torchlight in a darkened 3D space, into the simulation. The torchlight concept exists in many games, but these games generally only simulate what is in or near the locus. Our chosen architecture will enable us to simulate on a micro level outside the locus. This will give us the advantage of being able to create a larger and more fine-grained simulation which can be used to train neural networks for use in games.
- Nov 22 2017 cs.CL arXiv:1711.07950v1 Contrary to most natural language processing research, which makes use of static datasets, humans learn language interactively, grounded in an environment. In this work we propose an interactive learning procedure called Mechanical Turker Descent (MTD) and use it to train agents to execute natural language commands grounded in a fantasy text adventure game. In MTD, Turkers compete to train better agents in the short term, and collaborate by sharing their agents' skills in the long term. This results in a gamified, engaging experience for the Turkers and a better quality teaching signal for the agents compared to static datasets, as the Turkers naturally adapt the training data to the agent's abilities.
- We introduce a notion of quantum function, and develop a compositional framework for finite quantum set theory based on a 2-category of quantum sets and quantum functions. We use this framework to formulate a 2-categorical theory of quantum graphs, which captures the quantum graphs and quantum graph homomorphisms recently discovered in the study of nonlocal games and zero-error communication, and relates them to quantum automorphism groups of graphs considered in the setting of compact quantum groups. We show that the 2-categories of quantum sets and quantum graphs are semisimple and characterise existing notions of quantum permutations and quantum graph isomorphisms as dagger-dualisable 1-morphisms in these 2-categories.
- Nov 22 2017 quant-ph arXiv:1711.07943v1 A quantum state's entanglement across a bipartite cut can be quantified with entanglement entropy or, more generally, Schmidt norms. Using only Schmidt decompositions, we present a simple iterative algorithm to numerically find pure states that maximize Schmidt norms, potentially minimizing or maximizing entanglement across several bipartite cuts at the same time, possibly only among states in a specified subspace. Recognizing that convergence but not success is certain, we ask how this algorithm can help to explore topics ranging from fermionic reduced density matrices and varieties of pure quantum states to absolutely maximally entangled states and minimal output entropy of channels.
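The underlying quantities are cheap to compute; a generic sketch of the Schmidt coefficients and the resulting entanglement entropy across one bipartite cut (not the paper's maximization algorithm itself):

```python
import numpy as np

def schmidt_coefficients(psi, dim_a, dim_b):
    """Schmidt coefficients of a pure state across the A|B cut:
    the singular values of the state reshaped to a dim_a x dim_b matrix."""
    return np.linalg.svd(np.reshape(psi, (dim_a, dim_b)), compute_uv=False)

def entanglement_entropy(psi, dim_a, dim_b):
    """Von Neumann entropy (in bits) of the reduced state on subsystem A."""
    p = schmidt_coefficients(psi, dim_a, dim_b) ** 2
    p = p[p > 1e-12]                  # drop numerical zeros before taking logs
    return float(-np.sum(p * np.log2(p)))

bell = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)   # (|00> + |11>)/sqrt(2)
```

A Bell state gives entropy 1 bit across the 2x2 cut, while any product state gives 0; Schmidt norms are similarly simple functions of the same singular values.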
- Nov 22 2017 cs.CV arXiv:1711.07933v1 We present a novel method to train machine learning algorithms to estimate scene depths from a single image, by using the information provided by a camera's aperture as supervision. Prior works use a depth sensor's outputs or images of the same scene from alternate viewpoints as supervision, while our method instead uses images from the same viewpoint taken with a varying camera aperture. To enable learning algorithms to use aperture effects as supervision, we introduce two differentiable aperture rendering functions that use the input image and predicted depths to simulate the depth-of-field effects caused by real camera apertures. We train a monocular depth estimation network end-to-end to predict the scene depths that best explain these finite aperture images as defocus-blurred renderings of the input all-in-focus image.
- Nov 22 2017 hep-ex arXiv:1711.07927v1 Using the entire Belle data sample of 980 ${\rm fb}^{-1}$ of $e^+e^-$ collisions, we present the results of a study of excited $\Omega_c$ charmed baryons in the decay mode $\Xi_c^+K^-$. We confirm four of the five narrow states reported by the LHCb Collaboration: the $\Omega_c(3000)$, $\Omega_c(3050)$, $\Omega_c(3066)$, and $\Omega_c(3090)$.
- We study the problem of nonnegative rank-one approximation of a nonnegative tensor, and show that the globally optimal solution that minimizes the generalized Kullback-Leibler divergence can be efficiently obtained, i.e., it is not NP-hard. This result works for arbitrary nonnegative tensors with an arbitrary number of modes (including two, i.e., matrices). We derive a closed-form expression for the KL principal component, which is easy to compute and has an intuitive probabilistic interpretation. For generalized KL approximation with higher ranks, the problem is for the first time shown to be equivalent to multinomial latent variable modeling, and an iterative algorithm is derived that resembles the expectation-maximization algorithm. On the Iris dataset, we showcase how the derived results help us learn the model in an *unsupervised* manner, and obtain strikingly close performance to that from supervised methods.
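In the matrix case, the closed-form KL principal component coincides with the classical "independence table" built from the marginals; a hedged sketch of that stationary-point solution (function name illustrative, not the paper's notation):

```python
import numpy as np

def kl_rank_one(X):
    """Rank-one nonnegative approximation of a nonnegative matrix X that
    minimizes the generalized KL divergence D(X || u v^T).
    The minimizer is the outer product of the row and column sums,
    scaled by the grand total -- the independence model of contingency tables."""
    total = X.sum()
    u = X.sum(axis=1)           # row marginals
    v = X.sum(axis=0) / total   # normalized column marginals
    return np.outer(u, v)

X = np.outer([1.0, 2.0], [3.0, 4.0])   # an exactly rank-one input
```

Setting the gradient of $\sum_{ij} [x_{ij}\log(x_{ij}/(u_i v_j)) - x_{ij} + u_i v_j]$ to zero gives $u_i \propto \sum_j x_{ij}$ and $v_j \propto \sum_i x_{ij}$, so a rank-one input is recovered exactly and the grand total is always preserved.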
- Nov 22 2017 cs.CL arXiv:1711.07915v1 Sentiment analysis has become a very important tool for analysis of social media data. There are several methods developed for this research field, many of them working very differently from each other, covering distinct aspects of the problem and disparate strategies. Despite the large number of existing techniques, there is no single one which fits well in all cases or for all data sources. Supervised approaches may be able to adapt to specific situations, but they require manually labeled training data, which is very cumbersome and expensive to acquire, especially for a new application. In this context, we propose to combine several very popular and effective state-of-the-practice sentiment analysis methods by means of an unsupervised bootstrapped strategy for polarity classification. One of our main goals is to reduce the large variability (lack of stability) of the unsupervised methods across different domains (datasets). Our solution was thoroughly tested considering thirteen different datasets in several domains such as opinions, comments, and social media. The experimental results demonstrate that our combined method (aka, 10SENT) improves the effectiveness of the classification task, but more importantly, it solves a key problem in the field. It is consistently among the best methods in many data types, meaning that it can produce the best (or close to best) results in almost all considered contexts, without any additional costs (e.g., manual labeling). Our self-learning approach is also very independent of the base methods, which means that it is highly extensible to incorporate any new additional method that can be envisioned in the future. Finally, we also investigate a transfer learning approach for sentiment analysis as a means to gather additional (unsupervised) information for the proposed approach, and we show the potential of this technique to improve our results.
- Nov 22 2017 stat.ML arXiv:1711.07910v1 Domain generalization is the problem of assigning class labels to an unlabeled test data set, given several labeled training data sets drawn from similar distributions. This problem arises in several applications where data distributions fluctuate because of biological, technical, or other sources of variation. We develop a distribution-free, kernel-based approach that predicts a classifier from the marginal distribution of features, by leveraging the trends present in related classification tasks. This approach involves identifying an appropriate reproducing kernel Hilbert space and optimizing a regularized empirical risk over the space. We present generalization error analysis, describe universal kernels, and establish universal consistency of the proposed methodology. Experimental results on synthetic data and three real data applications demonstrate the superiority of the method with respect to a pooling strategy.
- Nov 22 2017 cs.CL arXiv:1711.07908v1 Biomedical named entity recognition (NER) is a fundamental task in text mining of medical documents and has a lot of applications. Existing approaches for NER require manual feature engineering in order to represent words and their corresponding contextual information. Deep learning based approaches have been gaining increasing attention in recent years as their weight parameters can be learned end-to-end without the need for hand-engineered features. These approaches rely on high-quality labeled data, which is expensive to obtain. To address this issue, we investigate how to use widely available unlabeled text data to improve the performance of NER models. Specifically, we train a bidirectional language model (Bi-LM) on unlabeled data and transfer its weights to an NER model with the same architecture as the Bi-LM, which results in a better parameter initialization of the NER model. We evaluate our approach on three datasets for disease NER and show that it leads to a remarkable improvement in F1 score as compared to the model with random parameter initialization. We also show that Bi-LM weight transfer leads to faster model training. In addition, our model requires fewer training examples to achieve a particular F1 score.
- Nov 22 2017 cs.CL arXiv:1711.07893v1 In this paper, we propose two strategies that can be applied to a multilingual neural machine translation system in order to better tackle zero-shot scenarios despite not having any parallel corpus. The experiments show that they are effective in terms of both performance and computing resources, especially in multilingual translation of unbalanced data in real zero-resourced conditions, where they alleviate the language bias problem.
- Nov 22 2017 cs.CV arXiv:1711.07888v1 The objective of this paper is 3D shape understanding from single and multiple images. To this end, we introduce a new deep-learning architecture and loss function, SilNet, that can handle multiple views in an order-agnostic manner. The architecture is fully convolutional, and for training we use a proxy task of silhouette prediction, rather than directly learning a mapping from 2D images to 3D shape as has been the target in most recent work. We demonstrate that with the SilNet architecture there is generalisation over the number of views -- for example, SilNet trained on 2 views can be used with 3 or 4 views at test-time; and performance improves with more views. We introduce two new synthetic datasets: a blobby object dataset useful for pre-training, and a challenging and realistic sculpture dataset; and demonstrate on these datasets that SilNet has indeed learnt 3D shape. Finally, we show that SilNet exceeds the state of the art on the ShapeNet benchmark dataset, and use SilNet to generate novel views of the sculpture dataset.
- Nov 22 2017 cs.SE arXiv:1711.07876v1 Software is a field of rapid change: the best technology today becomes obsolete in the near future. If we review the graduate attributes of any of the software engineering programs across the world, life-long learning is one of them. The social and psychological aspects of professional development are linked with rewards. In organizations where people are provided with learning opportunities and there is a culture that rewards learning, people embrace change easily. However, the software industry tends to be short-sighted and its primary focus is on current project success; it usually ignores the capacity building of the individual or team. It is hoped that our software engineering colleagues will be motivated to conduct more research into the area of software psychology so as to understand more completely the possibilities for increased effectiveness and personal fulfillment among software engineers working alone and in teams.
- The autoencoder is an artificial neural network model that learns hidden representations of unlabeled data. With a linear transfer function it is similar to principal component analysis (PCA). While both methods use weight vectors for linear transformations, the autoencoder lacks an analogue of the PCA eigenvalues that are paired with the eigenvectors, which would indicate the importance of each weight vector. We propose a novel supervised node saliency (SNS) method that ranks the hidden nodes by comparing class distributions of latent representations against a fixed reference distribution. The latent representations of a hidden node can be described using a one-dimensional histogram. We apply normalized entropy difference (NED) to measure the "interestingness" of the histograms, and establish a property of NED values for identifying a good classifying node. By applying our methods to real data sets, we demonstrate the ability of SNS to explain what the trained autoencoders have learned.
- Nov 22 2017 cond-mat.str-el arXiv:1711.07864v1 We construct model wavefunctions for the half-filled Landau level parameterized by "composite fermion occupation-number configurations" in a two-dimensional momentum space which correspond to a Fermi sea with particle-hole excitations. When these correspond to a weakly-excited Fermi sea, they have large overlap with wavefunctions obtained by exact diagonalization of lowest-Landau-level electrons interacting with a Coulomb interaction, allowing exact states to be identified with quasiparticle configurations. We then formulate a many-body version of the single-particle Berry phase for adiabatic transport of a single quasiparticle around a path in momentum space, and evaluate it using a sequence of exact eigenstates in which a single quasiparticle moves incrementally. In this formulation the standard free-particle construction in terms of the overlap between "periodic parts of successive Bloch wavefunctions" is reinterpreted as the matrix element of a "momentum boost" operator between the full Bloch states, which becomes the matrix elements of a Girvin-MacDonald-Platzman density operator in the many-body context. This allows computation of the Berry phase for transport of a single composite fermion around the Fermi surface. In addition to a phase contributed by the density operator, we find a phase of exactly $\pi$ for this process.
- Nov 22 2017 cond-mat.str-el arXiv:1711.07860v1 We review a simple mechanism for the formation of plateaux in the fractional quantum Hall effect. It arises from a map of the microscopic Hamiltonian in the thin torus limit to a lattice gas model, solved by Hubbard. The map suggests a Devil's staircase pattern, and explains the observed asymmetries in the widths. Each plateau is a new ground state of the system: a periodic Slater state in the thin torus limit. We provide the unitary operator that maps such limit states to the full, effective ground states with the same filling fraction. These Jack polynomials generalise Laughlin's ansatz, and are exact eigenstates of the Laplace-Beltrami operator. Why are Jacks sitting on the Devil's staircase? This remains an intriguing problem.
- Nov 22 2017 cs.SE arXiv:1711.07857v1 Software organizations have relied on process and technology initiatives to compete in a highly globalized world. Unfortunately, that has led to little or no success. We propose that organizations start working on people initiatives, such as inspiring egoless behavior among software developers. This paper proposes a multi-stage approach to develop egoless behavior and discusses the universality of egoless behavior by studying cohorts from three different countries, i.e., Japan, India, and Canada. The three stages in the approach are self-assessment, peer validation, and action plan development. The paper covers the first stage of self-assessment, using an instrument based on Lamont Adams' ten commandments (factors) of egoless programming; seven of the factors are general, whereas three are related to coding behavior. We found traces of universality in egoless behavior among the three cohorts: for example, there was no difference in egoless behavior between the Indian and Canadian cohorts, and both the Indian and Japanese cohorts had more difficulty behaving in an egoless manner in coding activities than in general activities.
- Nov 22 2017 cs.CV arXiv:1711.07846v1 We present a new dataset, Functional Map of the World (fMoW), which aims to inspire the development of machine learning models capable of predicting the functional purpose of buildings and land use from temporal sequences of satellite images and a rich set of metadata features. The metadata provided with each image enables reasoning about location, time, sun angles, physical sizes, and other features when making predictions about objects in the image. Our dataset consists of over 1 million images from over 200 countries. For each image, we provide at least one bounding box annotation containing one of 63 categories, including a "false detection" category. We present an analysis of the dataset along with baseline approaches that reason about metadata and temporal views. Our data, code, and pretrained models have been made publicly available.
- A major challenge in computational chemistry is the generation of novel molecular structures with desirable pharmacological and physiochemical properties. In this work, we investigate the potential use of the autoencoder, a deep learning methodology, for de novo molecular design. Various generative autoencoders were used to map molecule structures into a continuous latent space and vice versa, and their performance as structure generators was assessed. Our results show that the latent space preserves the chemical similarity principle and thus can be used for the generation of analogue structures. Furthermore, the latent space created by autoencoders was searched systematically to generate novel compounds with predicted activity against dopamine receptor type 2, and compounds similar to known active compounds not included in the training set were identified.
- Nov 22 2017 cs.LG arXiv:1711.07838v1 Learning low-dimensional representations of networks has proved effective in a variety of tasks such as node classification, link prediction and network visualization. Existing methods can effectively encode different structural properties into the representations, such as neighborhood connectivity patterns, global structural role similarities and other high-order proximities. However, beyond objectives that capture network structural properties, most of them suffer from a lack of additional constraints for enhancing the robustness of representations. In this paper, we aim to exploit the strengths of generative adversarial networks in capturing latent features, and investigate their contribution to learning stable and robust graph representations. Specifically, we propose an Adversarial Network Embedding (ANE) framework, which leverages the adversarial learning principle to regularize the representation learning. It consists of two components, i.e., a structure preserving component and an adversarial learning component. The former aims to capture network structural properties, while the latter contributes to learning robust representations by matching the posterior distribution of the latent representations to given priors. As shown by the empirical results, our method is competitive with or superior to state-of-the-art approaches on benchmark network embedding tasks.
- Nov 22 2017 cs.CV arXiv:1711.07837v1 In the era of end-to-end deep learning, many advances in computer vision are driven by large amounts of labeled data. In the optical flow setting, however, obtaining dense per-pixel ground truth for real scenes is difficult and thus such data is rare. Therefore, recent end-to-end convolutional networks for optical flow rely on synthetic datasets for supervision, but the domain mismatch between training and test scenarios continues to be a challenge. Inspired by classical energy-based optical flow methods, we design an unsupervised loss based on occlusion-aware bidirectional flow estimation and the robust census transform to circumvent the need for ground truth flow. On the KITTI benchmarks, our unsupervised approach outperforms previous unsupervised deep networks by a large margin, and is even more accurate than similar supervised methods trained on synthetic datasets alone. By optionally fine-tuning on the KITTI training data, our method achieves competitive optical flow accuracy on the KITTI 2012 and 2015 benchmarks, thus in addition enabling generic pre-training of supervised networks for datasets with limited amounts of ground truth.
- Nov 22 2017 cs.CV arXiv:1711.07835v1Discriminative correlation filter (DCF) based trackers have recently achieved excellent performance with great computational efficiency. However, DCF-based trackers suffer from boundary effects, which result in unstable performance in challenging situations exhibiting fast motion. In this paper, we propose a novel method to mitigate this side effect in DCF-based trackers. We change the search area according to the predicted target motion. When the object moves fast, a broad search area can alleviate boundary effects and preserve the chance of locating the object. When the object moves slowly, a narrow search area can suppress the influence of irrelevant background information and improve computational efficiency to attain real-time performance. This strategy can substantially mitigate boundary effects in situations exhibiting fast motion and motion blur, and it can be used in almost all DCF-based trackers. Experiments on the OTB benchmark show that the proposed framework improves performance compared with the baseline trackers.
- Nov 22 2017 cs.CV arXiv:1711.07827v1In this paper, an efficient implementation of a recognition system based on the original HMAX model of the visual cortex is proposed. Various optimizations targeted at increasing accuracy at the so-called layers S1, C1, and S2 of the HMAX model are proposed. At layer S1, unimportant information such as illumination and expression variations is eliminated from the images. Each image is then convolved with 64 separable Gabor filters in the spatial domain. At layer C1, the minimum scale values are embedded into the maximum ones using the additive embedding space. At layer S2, the prototypes are generated more efficiently using the Partitioning Around Medoids (PAM) clustering algorithm. The impact of these optimizations in terms of accuracy and computational complexity was evaluated on the Caltech101 database, and compared with the baseline performance using support vector machine (SVM) and nearest neighbor (NN) classifiers. The results show that our model improves accuracy at the S1 layer by more than 10% while also reducing computational complexity. The accuracy is slightly increased for both approximations at the C1 and S2 layers.
- Nov 22 2017 cond-mat.str-el arXiv:1711.07813v1Strongly correlated iridate pyrochlores with geometrically frustrated spins have been recognized as a potentially interesting group of oxide materials where novel topological phases may appear. A particularly attractive system is the metallic Pr$_2$Ir$_2$O$_7$, as it is known as a Fermi-node semimetal characterized by quadratic band touching at the Brillouin zone center, suggesting that the topology of its electronic states can be tuned by moderate lattice strain. In this work we report the growth of epitaxial Pr$_2$Ir$_2$O$_7$ thin films by solid-state epitaxy. We show that the strained parts of the films give rise to a spontaneous Hall effect that persists up to 50 K without spontaneous magnetization within our experimental accuracy. This indicates that macroscopic time-reversal symmetry (TRS) breaking appears at a temperature scale that is too high for the magnetism to be due to Pr 4$f$ moments, and must thus be related to magnetic order of the iridium 5$d$ electrons. The magnetotransport and Hall analysis results are consistent with the formation of a Weyl semimetal state induced by a combination of TRS breaking and cubic symmetry breaking due to lattice strain.
- Nov 22 2017 cs.CV arXiv:1711.07807v1We design a novel network architecture for learning discriminative image models that are employed to efficiently tackle the problem of grayscale and color image denoising. Based on the proposed architecture, we introduce two different variants. The first network involves convolutional layers as a core component, while the second relies instead on non-local filtering layers and is thus able to exploit the inherent non-local self-similarity property of natural images. As opposed to most existing neural networks, which require training a specific model for each considered noise level, the proposed networks can handle a wide range of noise levels and remain very robust when the noise degrading the latent image does not match the statistics of the noise used during training. The latter claim is supported by results we report on publicly available images corrupted by unknown noise, which we compare against solutions obtained by alternative state-of-the-art methods. At the same time, the introduced networks achieve excellent results under additive white Gaussian noise (AWGN), comparable to the current state-of-the-art network, while relying on a shallower architecture with an order of magnitude fewer trained parameters. These properties make the proposed networks ideal candidates to serve as sub-solvers in restoration methods that deal with general inverse imaging problems such as deblurring, demosaicking, super-resolution, etc.
- Product codes (PCs) protect a two-dimensional array of bits using short component codes. Assuming transmission over the binary symmetric channel, the decoding is commonly performed by iteratively applying bounded-distance decoding to the component codes. For this coding scheme, undetected errors in the component decoding-also known as miscorrections-significantly degrade the performance. In this paper, we propose a novel iterative decoding algorithm for PCs which can detect and avoid most miscorrections. The algorithm can also be used to decode many recently proposed classes of generalized PCs such as staircase, braided, and half-product codes. Depending on the component code parameters, our algorithm significantly outperforms the conventional iterative decoding method. As an example, for double-error-correcting Bose-Chaudhuri-Hocquenghem component codes, the net coding gain can be increased by up to 0.4 dB. Moreover, the error floor can be lowered by orders of magnitude, up to the point where the decoder performs virtually identical to a genie-aided decoder that avoids all miscorrections. We also discuss post-processing techniques that can be used to reduce the error floor even further.
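For context, conventional iterative bounded-distance decoding of a product code (the baseline whose miscorrections the proposed algorithm avoids) can be sketched with (7,4) Hamming component codes. This toy version makes no attempt at miscorrection detection; note that the single-error decoder silently miscorrects weight-2 error patterns, which is exactly the effect the paper targets:

```python
import numpy as np

# Parity-check matrix of the (7,4) Hamming code: column i is the binary
# representation of i+1, so a single-error syndrome points at the error.
H = np.array([[((i + 1) >> b) & 1 for i in range(7)] for b in range(3)])

def bdd_hamming(word):
    """Bounded-distance decoding: corrects at most one bit error."""
    s = H @ word % 2
    pos = int(s[0] + 2 * s[1] + 4 * s[2])
    if pos:
        word = word.copy()
        word[pos - 1] ^= 1  # flip the bit the syndrome points at
    return word

def decode_product(arr, iters=5):
    """Iterative row/column decoding of a 7x7 product-code array."""
    arr = arr.copy()
    for _ in range(iters):
        for i in range(7):
            arr[i, :] = bdd_hamming(arr[i, :])
        for j in range(7):
            arr[:, j] = bdd_hamming(arr[:, j])
    return arr

# The all-zero array is a valid product codeword; add three scattered errors,
# at most one per row, so row decoding alone can fix them.
r = np.zeros((7, 7), dtype=int)
for (i, j) in [(0, 0), (2, 4), (5, 6)]:
    r[i, j] = 1
print(np.all(decode_product(r) == 0))  # prints True
```

With two or more errors in a single row, the row decoder would instead "correct" toward a wrong codeword and inject a new error, illustrating why miscorrection avoidance matters at higher error rates.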
- Sentiment analysis is attracting more and more attention and has become a very hot research topic due to its potential applications in personalized recommendation, opinion mining, etc. Most existing methods are based on either textual or visual data alone and cannot achieve satisfactory results, as it is very hard to extract sufficient information from a single modality. Inspired by the observation that there is a strong semantic correlation between visual and textual data in social media, we propose an end-to-end deep fusion convolutional neural network to jointly learn textual and visual sentiment representations from training examples. The two modalities are fused in a pooling layer and fed into fully-connected layers to predict the sentiment polarity. We evaluate the proposed approach on two widely used data sets. Results show that our method achieves promising results compared with state-of-the-art methods, clearly demonstrating its competitiveness.
- In this paper we give the definition of an adapted generating set and an adapted basis for the first homology group of a compact Riemann surface of genus at least two when the surface has a conformal automorphism group of order $n$. We prove the existence of such a generating set and basis for any conformal group acting on a surface. This extends earlier results on adapted bases for automorphism groups of prime orders and other specific groups.
- Recently, there is increasing interest in and research on the interpretability of machine learning models, for example how they transform and internally represent EEG signals in Brain-Computer Interface (BCI) applications. This can help to understand the limits of a model and how it may be improved, in addition to possibly providing insight about the data itself. Schirrmeister et al. (2017) recently reported promising results for EEG decoding with deep convolutional neural networks (ConvNets) trained in an end-to-end manner and, with a causal visualization approach, showed that they learn to use spectral amplitude changes in the input. In this study, we investigate how ConvNets represent spectral features through the sequence of intermediate stages of the network. We show higher sensitivity to EEG phase features at earlier stages and higher sensitivity to EEG amplitude features at later stages. Intriguingly, we observed a specialization of individual stages of the network to the classical EEG frequency bands alpha, beta, and high gamma. Furthermore, we find first evidence that, particularly in the later layers, the network learns to detect more complex oscillatory patterns beyond spectral phase and amplitude, reminiscent of the representation of complex visual features in later layers of ConvNets in computer vision tasks. Our findings thus provide insights into how ConvNets hierarchically represent spectral EEG features in their intermediate layers and suggest that ConvNets can exploit, and might help to better understand, the compositional structure of EEG time series.
- We present a novel, reflection-aware method for 3D sound localization in indoor environments. Unlike prior approaches, which are mainly based on continuous sound signals from a stationary source, our formulation is designed to localize the position instantaneously from signals within a single frame. We consider direct sound and indirect sound signals that reach the microphones after reflecting off surfaces such as ceilings or walls. We then generate and trace direct and reflected acoustic paths using inverse acoustic ray tracing and utilize these paths with Monte Carlo localization to estimate a 3D sound source position. We have implemented our method on a robot with a cube-shaped microphone array and tested it against different settings with continuous and intermittent sound signals from a stationary or a mobile source. Across these settings, our approach localizes sound with an average distance error of 0.8 m in a room of 7 m by 7 m area with 3 m height, including mobile and non-line-of-sight sound sources. We also show that modeling indirect rays increases localization accuracy by 40% compared to using only direct acoustic rays.
- Nov 22 2017 cs.LO arXiv:1711.07786v1Many reasoning problems are based on the problem of satisfiability (SAT). While SAT itself becomes easy when restricting the structure of the formulas in a certain way, the situation is more opaque for more involved decision problems. For instance, the CardMinSat problem which asks, given a propositional formula $\phi$ and an atom $x$, whether $x$ is true in some cardinality-minimal model of $\phi$, is easy for the Horn fragment, but, as we will show in this paper, remains $\Theta_2^{\mathrm{P}}$-complete (and thus $\mathrm{NP}$-hard) for the Krom fragment (which is given by formulas in CNF where clauses have at most two literals). We will make use of this fact to study the complexity of reasoning tasks in belief revision and logic-based abduction and show that, while in some cases the restriction to Krom formulas leads to a decrease of complexity, in others it does not. We thus also consider the CardMinSat problem with respect to additional restrictions to Krom formulas towards a better understanding of the tractability frontier of such problems.
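A brute-force reference implementation of CardMinSat (feasible only for tiny instances, and purely illustrative of the problem statement) can be written in a few lines; literals are encoded DIMACS-style:

```python
from itertools import product

def card_min_sat(clauses, n_vars, x):
    """Brute-force CardMinSat: is variable x true in some
    cardinality-minimal model?  Clauses are lists of nonzero ints,
    DIMACS-style: positive = variable, negative = its negation."""
    def satisfies(assign):
        return all(any((lit > 0) == assign[abs(lit) - 1] for lit in cl)
                   for cl in clauses)
    models = [a for a in product([False, True], repeat=n_vars)
              if satisfies(a)]
    if not models:
        return False
    m = min(sum(a) for a in models)  # minimal number of true atoms
    return any(a[x - 1] for a in models if sum(a) == m)

# Krom (2-CNF) example: (x1 or x2) and (not x1 or x2).
# Its unique cardinality-minimal model sets only x2 true.
clauses = [[1, 2], [-1, 2]]
print(card_min_sat(clauses, 2, 2))  # prints True
print(card_min_sat(clauses, 2, 1))  # prints False
```

The paper's point is that even for such 2-literal clauses no polynomial algorithm is expected: the problem is $\Theta_2^{\mathrm{P}}$-complete, so this exponential enumeration is essentially unavoidable in the worst case.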
- The paper introduces the Hidden Tree Markov Network (HTN), a neuro-probabilistic hybrid fusing the representation power of generative models for trees with the incremental and discriminative learning capabilities of neural networks. We put forward a modular architecture in which multiple generative models of limited complexity are trained to learn structural feature detectors whose outputs are then combined and integrated by neural layers at a later stage. In this respect, the model is both deep, thanks to the unfolding of the generative models on the input structures, as well as wide, given the potentially large number of generative modules that can be trained in parallel. Experimental results show that the proposed approach can outperform state-of-the-art syntactic kernels as well as generative kernels built on the same probabilistic model as the HTN.
- Analyses of gamma-ray spectra, acquired through non-invasive techniques, have found applications in fields such as medicine, industry and homeland security. Constituent gamma-ray spectra of a chemical compound have been determined from its measured spectrum alone through a forward Monte Carlo simulation coupled with a least-squares method (MCLLS). The method's limitations include its linearity assumption and its oversensitivity to correlated or noisy data, which render it unfit to deal with such numerical ill-conditioning. Recently, this issue was tackled by iteratively reducing the condition number of the linear system of equations. Despite its superior results, that approach is not suitable for cases where libraries are missing from the analysis. Our work introduces a novel framework that treats spectral analysis problems through geometrical insights. Based on these insights, we propose solutions to three problems regarding a missing library: finding its photopeak, its most probable fraction, and an envelope around its spectrum. We successfully validated these on Monte Carlo-generated radionuclide gamma-ray spectra.
- Nov 22 2017 math.PR arXiv:1711.07768v1This paper concerns the long term behaviour of a growth model describing a random sequential allocation of particles on a finite cycle graph. The model can be regarded as a reinforced urn model with graph-based interactions. It is motivated by cooperative sequential adsorption, where adsorption rates at a site depend on the configuration of existing particles in the neighbourhood of that site. Our main result is that, with probability one, the growth process will eventually localise either at a single site, or at a pair of neighbouring sites.
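A toy simulation of such a growth process, with an assumed polynomial reinforcement rule (the paper studies a general class of neighbourhood-dependent adsorption rates, so the exponent `beta` here is purely illustrative), looks like:

```python
import random

def simulate(n_sites=10, steps=3000, beta=2.0, seed=0):
    """Sequential allocation on a cycle: a site's adsorption weight grows
    with the particle count in its closed neighbourhood, raised to beta.
    This reinforcement rule is an assumption for illustration only."""
    rng = random.Random(seed)
    counts = [1] * n_sites  # start with one particle at every site
    for _ in range(steps):
        weights = [(counts[(i - 1) % n_sites] + counts[i]
                    + counts[(i + 1) % n_sites]) ** beta
                   for i in range(n_sites)]
        total = sum(weights)
        u = rng.random() * total
        acc = 0.0
        for i, w in enumerate(weights):
            acc += w
            if u <= acc:
                counts[i] += 1
                break
    return counts
```

With superlinear reinforcement, runs of this kind tend to concentrate almost all later growth on one site or a pair of neighbouring sites, in the spirit of the localisation result the abstract describes.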
- Nov 22 2017 cs.CV arXiv:1711.07767v1Current top-performing object detectors depend on deep CNN backbones, such as ResNet-101 and Inception, benefiting from their powerful feature representation but suffering from high computational cost. Conversely, some detectors based on lightweight models achieve real-time processing, while their accuracy is often criticized. In this paper, we explore an alternative way to build a fast and accurate detector by strengthening lightweight features with a crafting mechanism. Inspired by the structure of Receptive Fields (RFs) in human visual systems, we propose a novel RF Block (RFB) module, which takes the relationship between the size and eccentricity of RFs into account, to enhance the discriminability and robustness of features. We further assemble the RFB module on top of SSD with a lightweight CNN model, constructing the RFB Net detector. To evaluate its effectiveness, experiments are conducted on two major benchmarks, and the results show that RFB Net reaches the accuracy of detectors based on advanced very deep backbones while keeping real-time speed. Code will be made publicly available soon.
- Nov 22 2017 stat.AP arXiv:1711.07763v1We consider the problem of conditioning a geological process-based computer simulation, which produces basin models by simulating transport and deposition of sediments, to data. Emphasising uncertainty quantification, we frame this as a Bayesian inverse problem, and propose to characterize the posterior probability distribution of the geological quantities of interest by using a variant of the ensemble Kalman filter, an estimation method which linearly and sequentially conditions realisations of the system state to data. A test case involving synthetic data is used to assess the performance of the proposed estimation method, and to compare it with similar approaches. We further apply the method to a more realistic test case, involving real well data from the Colville foreland basin, North Slope, Alaska.
- Nov 22 2017 cs.IR arXiv:1711.07762v1In this work, we address the problem of recommending jobs to university students. For this, we explore the utilization of neural item embeddings for the task of content-based recommendation, and we propose to integrate the factors of frequency and recency of interactions with job postings to combine these item embeddings. We evaluate our job recommendation system on a dataset of the Austrian student job portal Studo using prediction accuracy, diversity and an adapted novelty metric. This paper demonstrates that utilizing frequency and recency of interactions with job postings for combining item embeddings results in a robust model with respect to accuracy and diversity, which also provides the best adapted novelty results.
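A minimal sketch of combining item embeddings with frequency and recency weights might look as follows; the exponential half-life decay and all parameter names are assumptions for illustration, not the paper's exact weighting scheme:

```python
import numpy as np

def user_profile(item_vecs, counts, days_ago, half_life=7.0):
    """Combine the embeddings of job postings a user interacted with
    into a single profile vector, weighting each item by interaction
    frequency and an exponential recency decay (assumed half-life)."""
    w = np.asarray(counts, dtype=float) * \
        0.5 ** (np.asarray(days_ago, dtype=float) / half_life)
    w = w / w.sum()  # normalise weights to sum to one
    return w @ np.asarray(item_vecs)

# Two postings: one viewed often but a month ago, one viewed once today.
vecs = np.array([[1.0, 0.0],
                 [0.0, 1.0]])
profile = user_profile(vecs, counts=[5, 1], days_ago=[30, 0])
```

With a one-week half-life, the single recent interaction outweighs the five month-old ones, which is the kind of recency effect the evaluation in the paper examines.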
- Nov 22 2017 cs.LG arXiv:1711.07758v1Deep learning achieves remarkable generalization capability with an overwhelming number of model parameters. Theoretical understanding of deep learning generalization has received recent attention yet remains not fully explored. This paper attempts to provide an alternative understanding from the perspective of maximum entropy. We first derive two feature conditions under which softmax regression strictly applies the maximum entropy principle. A DNN is then regarded as approximating these feature conditions with multilayer feature learning, and is proved to be a recursive solution towards the maximum entropy principle. The connection between DNNs and maximum entropy explains well why typical designs such as shortcuts and regularization improve model generalization, and provides guidance for future model development.
- Nov 22 2017 cs.CV arXiv:1711.07752v1Detecting individual pedestrians in a crowd remains a challenging problem, since pedestrians often gather together and occlude each other in real-world scenarios. In this paper, we first explore experimentally how a state-of-the-art pedestrian detector is harmed by crowd occlusion, providing insights into the crowd occlusion problem. Then, we propose a novel bounding box regression loss specifically designed for crowd scenes, termed repulsion loss. This loss is driven by two motivations: attraction by the target, and repulsion by other surrounding objects. The repulsion term prevents the proposal from shifting to surrounding objects, thus leading to more crowd-robust localization. Our detector trained with repulsion loss outperforms all state-of-the-art methods with a significant improvement in occlusion cases.
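A toy version of such a repulsion-style loss can be sketched as follows; the paper's actual formulation uses smooth-ln and IoG terms, while this illustration substitutes a simple L1 attraction and an IoU-based penalty:

```python
import numpy as np

def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def repulsion_loss(pred, target, others, alpha=0.5):
    """Attraction term pulling the prediction toward its assigned target,
    plus a repulsion term penalising overlap with other (non-target)
    ground-truth boxes.  A simplified stand-in for the paper's loss."""
    attract = float(np.mean(np.abs(np.asarray(pred) - np.asarray(target))))
    repel = sum(iou(pred, o) for o in others)
    return attract + alpha * repel
```

A prediction that drifts from its target toward a neighbouring ground-truth box pays twice, through both terms, which is the mechanism that discourages proposals from latching onto the wrong pedestrian in a crowd.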