- Mar 29 2017 quant-ph arXiv:1703.09656v1 We quantify the usefulness of a bipartite quantum state in the ancilla-assisted channel discrimination of arbitrary quantum channels, formally defining a worst-case-scenario channel discrimination power for bipartite quantum states. We show that such a quantifier is deeply connected with the operator Schmidt decomposition of the state. We compute the channel discrimination power exactly for pure states, and provide upper and lower bounds for general mixed states. We show that highly entangled states can outperform any state that passes the realignment criterion for separability. Furthermore, while unentangled states can also be used in ancilla-assisted channel discrimination, we show that the channel discrimination power of a state is bounded by its quantum discord.
- Mar 29 2017 quant-ph arXiv:1703.09277v1 Recent theoretical and experimental studies have suggested that quantum Monte Carlo (QMC) simulation can behave similarly to quantum annealing (QA). The theoretical analysis was based on calculating transition rates between local minima, in the large spin limit using the Wentzel-Kramers-Brillouin (WKB) approximation, for highly symmetric systems of ferromagnetically coupled qubits. The rate of transition was observed to scale the same in QMC and incoherent quantum tunneling, implying that there might be no quantum advantage of QA compared to QMC other than a prefactor. Quantum annealing is believed to provide quantum advantage through large scale superposition and entanglement and not just incoherent tunneling. Even for incoherent tunneling, the scaling similarity with QMC observed above does not hold in general. Here, we compare incoherent tunneling and QMC escape using perturbation theory, which has much wider validity than the WKB approximation. We show that the two do not scale the same way when there are multiple homotopy-inequivalent paths for tunneling. We demonstrate through examples that frustration can generate an exponential number of tunneling paths, which under certain conditions can lead to an exponential advantage for incoherent tunneling over classical QMC escape. We provide analytical and numerical evidence for such an advantage and show that it holds beyond perturbation theory.
- Mar 29 2017 quant-ph arXiv:1703.09236v1 We address the dynamics of a bosonic system coupled to either a bosonic or a magnetic environment, and derive a set of sufficient conditions that allow one to describe the dynamics in terms of the effective interaction with a classical fluctuating field. We find that for short interaction times the dynamics of the open system is described by a Gaussian noise map for several different interaction models and independently of the temperature of the environment. More generally, our results indicate that quantum environments may be described by classical fields whenever global symmetries lead to the definition of environmental operators that remain well defined when increasing the size of the environment.
- Mar 29 2017 quant-ph arXiv:1703.09576v1 In the task of assisted coherence distillation via the set of operations X, where X is either local incoherent operations and classical communication (LICC), local quantum-incoherent operations and classical communication (LQICC), separable incoherent operations (SI), or separable quantum incoherent operations (SQI), two parties, namely Alice and Bob, share many copies of a bipartite joint state. The aim of the process is to generate the maximal possible coherence on the subsystem of Bob. In this paper, we investigate the assisted coherence distillation of some special mixed states, namely the states with vanished basis-dependent discord and Werner states. We show that all four sets of operations are equivalent for assisted coherence distillation whenever Alice and Bob share one of those mixed quantum states. Moreover, we prove that the assisted coherence distillation of the former can reach the upper bound, namely the QI relative entropy, while that of the latter cannot. We also present a sufficient condition under which the assistance of Alice via the set of operations X cannot help Bob improve his distillable coherence: the state shared by Alice and Bob has vanished basis-dependent discord.
- Mar 29 2017 quant-ph arXiv:1703.09568v1 Quantum samplers are believed capable of sampling efficiently from distributions that are classically hard to sample from. We consider a sampler inspired by the Ising model. It is nonadaptive and therefore experimentally amenable. Under a plausible average-case hardness conjecture, classical sampling up to additive errors from this model is known to be hard. We present a trap-based verification scheme for quantum supremacy that only requires the verifier to prepare single-qubit states. The verification is done on the same model as the original sampler, a square lattice, with only a constant factor overhead. We next revamp our verification scheme to operate in the presence of noise by emulating a fault-tolerant procedure without correcting on-line for the errors, thus keeping the model non-adaptive, but verifying supremacy fault-tolerantly. We show that classically sampling up to additive errors is likely hard in our revamped scheme. Our results are applicable to more general sampling problems such as the Instantaneous Quantum Polynomial-time (IQP) computation model. They should also assist near-term attempts at experimentally demonstrating quantum supremacy and guide long-term ones.
- I will briefly discuss three cosmological models built upon three distinct quantum gravity proposals. I will first highlight the cosmological role of a vector field in the framework of a string/brane cosmological model. I will then present the resolution of the big bang singularity and the occurrence of an early era of accelerated expansion of a geometric origin, in the framework of group field theory condensate cosmology. I will then summarise results from an extended gravitational model based on non-commutative spectral geometry, a model that offers a purely geometric explanation for the standard model of particle physics.
- Mar 29 2017 cs.CV arXiv:1703.09393v1 This paper proposes a crowd counting method. Crowd counting is difficult because of large appearance changes of a target that are caused by density and scale changes. Conventional crowd counting methods generally utilize one predictor (e.g., regression or a multi-class classifier). However, a single predictor cannot count targets with large appearance changes well. In this paper, we propose to predict the number of targets using multiple CNNs, each specialized to a specific appearance, and those CNNs are adaptively selected according to the appearance of a test image. By integrating the selected CNNs, the proposed method is robust to large appearance changes. In experiments, we confirm that the proposed method counts crowds with lower counting error than a single CNN or an integration of CNNs with fixed weights. Moreover, we confirm that each predictor is automatically specialized to a specific appearance.
- Recently, deep learning (DL) methods have been introduced very successfully into human activity recognition (HAR) scenarios in ubiquitous and wearable computing. In particular, the prospect of overcoming the need for manual feature design, combined with superior classification capabilities, renders deep neural networks very attractive for real-life HAR applications. Even though DL-based approaches now outperform the state-of-the-art in a number of recognition tasks in the field, substantial challenges remain. Most prominently, issues with real-life datasets, typically including imbalanced datasets and problematic data quality, still limit the effectiveness of activity recognition using wearables. In this paper we tackle such challenges through Ensembles of deep Long Short Term Memory (LSTM) networks. We have developed modified training procedures for LSTM networks and combine sets of diverse LSTM learners into classifier collectives. We demonstrate, both formally and empirically, that Ensembles of deep LSTM learners outperform the individual LSTM networks. Through an extensive experimental evaluation on three standard benchmarks (Opportunity, PAMAP2, Skoda) we demonstrate the excellent recognition capabilities of our approach and its potential for real-life applications of human activity recognition.
- Mar 29 2017 quant-ph arXiv:1703.09243v1 Hyperpolarisation at room temperature is one of the most important research fields for improving liquid, gas or nanoparticle tracers for Magnetic Resonance Imaging (MRI) in medical applications. In this paper we utilize nuclear magnetic resonance (NMR) to investigate the hyperpolarisation effect of negatively charged nitrogen vacancy (NV) centres on carbon-13 nuclei and their spin diffusion in a diamond single crystal close to the excited state level anti-crossing (ESLAC) around 50 mT. Whereas the electron spin of the NV centre can be easily polarized in its m = 0 ground state at room temperature just by irradiation with green light, the swap of the NV electron spin polarization to carbon-13 nuclei is a complex task. We found that the coupling between the polarized NV electron spin, the electron spin of a substitutional nitrogen impurity (P1) as well as its nitrogen-14 nucleus and the carbon-13 nuclear spin has to be considered. Here we show that through an optimization of this procedure, in about two minutes a signal to noise ratio which corresponds to a 23 hour standard measurement without hyperpolarisation and an accumulation of 460 single scans can be obtained. Furthermore we were able to identify several polarisation peaks of different sign at different magnetic fields in a region of some tens of gauss. Most of the peaks can be attributed to a coupling of the NV centres to nearby P1 centres. We present a new theoretical model in a framework of cross polarisation of a four spin dynamic model in good agreement with our experimental data. The results demonstrate the opportunities and power as well as limitations of hyperpolarisation in diamond via NV centres. We expect that the current work may have a significant impact on future applications.
- Mar 29 2017 cs.CV arXiv:1703.09695v1 Semantic segmentation has been a long-standing challenging task in computer vision. It aims at assigning a label to each image pixel and needs a significant number of pixel-level annotated data, which is often unavailable. To address this lack, in this paper, we leverage, on one hand, a massive amount of available unlabeled or weakly labeled data, and on the other hand, non-real images created through Generative Adversarial Networks. In particular, we propose a semi-supervised framework, based on Generative Adversarial Networks (GANs), which consists of a generator network to provide extra training examples to a multi-class classifier, acting as discriminator in the GAN framework, that assigns each sample a label y from the K possible classes or marks it as a fake sample (extra class). The underlying idea is that adding large fake visual data forces real samples to be close in the feature space, enabling a bottom-up clustering process, which, in turn, improves multiclass pixel classification. To ensure higher quality of generated images for GANs with consequent improved pixel classification, we extend the above framework by adding weakly annotated data, i.e., we provide class level information to the generator. We tested our approaches on several challenging benchmarking visual datasets, i.e. PASCAL, SiftFlow, Stanford and CamVid, achieving competitive performance also compared to state-of-the-art semantic segmentation methods.
- Mar 29 2017 cond-mat.mes-hall arXiv:1703.09694v1 A strained graphene monolayer is shown to operate as a highly efficient quantum heat engine delivering maximum power. The efficiency and power of the proposed device exceed those of recent proposals. The reason for these excellent characteristics is that strain enables complete valley separation in transmittance through the device, implying that increasing strain leads to a very high Seebeck coefficient as well as lower conductance. In addition, since time-reversal symmetry is unbroken in our system, the proposed strained graphene quantum heat engine can also act as a high performance refrigerator.
- The Horndeski Lagrangian brings together all possible interactions between gravity and a scalar field that yield second-order field equations in four-dimensional spacetime. As originally proposed, it only addresses phenomenology without torsion, which is a non-Riemannian feature of geometry. Since torsion can potentially affect interesting phenomena such as gravitational waves and early Universe inflation, in this paper we allow torsion to exist and propagate within the Horndeski framework. To achieve this goal, we cast the Horndeski Lagrangian in Cartan's first-order formalism, and introduce wave operators designed to act covariantly on p-form fields that carry Lorentz indices. We find that nonminimal couplings and second-order derivatives of the scalar field in the Lagrangian are indeed generic sources of torsion. Metric perturbations couple to the background torsion and new torsional modes appear. These may be detected via gravitational waves but not through Yang-Mills gauge bosons.
- In visual question answering (VQA), an algorithm must answer text-based questions about images. While multiple datasets for VQA have been created since late 2014, they all have flaws in both their content and the way algorithms are evaluated on them. As a result, evaluation scores are inflated and predominantly determined by answering easier questions, making it difficult to compare different methods. In this paper, we analyze existing VQA algorithms using a new dataset. It contains over 1.6 million questions organized into 12 different categories. We also introduce questions that are meaningless for a given image to force a VQA system to reason about image content. We propose new evaluation schemes that compensate for over-represented question-types and make it easier to study the strengths and weaknesses of algorithms. We analyze the performance of both baseline and state-of-the-art VQA models, including multi-modal compact bilinear pooling (MCB), neural module networks, and recurrent answering units. Our experiments establish how attention helps certain categories more than others, determine which models work better than others, and explain how simple models (e.g. MLP) can surpass more complex models (MCB) by simply learning to answer large, easy question categories.
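One way to see why compensating for over-represented question types matters is to compare overall accuracy with the unweighted mean of per-type accuracies. The sketch below is only an illustration under stated assumptions: the abstract does not spell out its evaluation schemes, so the arithmetic mean per type, the function names, and the toy category labels `yes/no` and `counting` are all hypothetical.

```python
from collections import defaultdict

def overall_accuracy(records):
    """Fraction of correct answers over all questions, regardless of type."""
    correct = sum(1 for _, is_correct in records if is_correct)
    return correct / len(records)

def mean_per_type_accuracy(records):
    """Unweighted mean of per-type accuracies (assumed compensation scheme)."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for question_type, is_correct in records:
        totals[question_type] += 1
        hits[question_type] += int(is_correct)
    return sum(hits[t] / totals[t] for t in totals) / len(totals)

# Hypothetical toy results: one large, easy category and one small, hard one.
records = (
    [("yes/no", True)] * 90 + [("yes/no", False)] * 10
    + [("counting", True)] * 2 + [("counting", False)] * 8
)

overall = overall_accuracy(records)          # dominated by the large category
per_type = mean_per_type_accuracy(records)   # each category counts equally
print(f"overall={overall:.3f} mean-per-type={per_type:.3f}")
```

In this toy data the overall score is driven almost entirely by the large, easy category, while the per-type mean exposes the weak performance on the small one.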
- Mar 29 2017 physics.bio-ph q-bio.CB arXiv:1703.09666v1 Multicellular chemotaxis can occur via individually chemotaxing cells that are physically coupled. Alternatively, it can emerge collectively, from cells chemotaxing differently in a group than they would individually. We find that while the speeds of these two mechanisms are roughly the same, the precision of emergent chemotaxis is higher than that of individual-based chemotaxis for one-dimensional cell chains and two-dimensional cell sheets, but not three-dimensional cell clusters. We describe the physical origins of these results, discuss their biological implications, and show how they can be tested using common experimental measures such as the chemotactic index.
- Different users can use a given Internet application in many different ways. The ability to record detailed event logs of user in-application activity allows us to discover ways in which the application is being used. This enables personalization and also leads to important insights with actionable business and product outcomes. Here we study the problem of user session categorization, where the goal is to automatically discover categories/classes of user in-session behavior using event logs, and then consistently categorize each user session into the discovered classes. We develop a three stage approach which uses clustering to discover categories of sessions, then builds classifiers to classify new sessions into the discovered categories, and finally performs daily classification in a distributed pipeline. An important innovation of our approach is selecting a set of events as long-tail features, and replacing them with a new feature that is less sensitive to product experimentation and logging changes. This allows for robust and stable identification of session types even though the underlying application is constantly changing. We deploy the approach to Pinterest and demonstrate its effectiveness. We discover insights that have consequences for product monetization, growth, and design. Our solution classifies millions of user sessions daily and leads to actionable insights.
- Mar 29 2017 stat.CO arXiv:1703.09658v1 In this article, we present an orthogonal basis expansion method for solving stochastic differential equations with a path-independent solution of the form $X_{t}=\phi(t,W_{t})$. For this purpose, we define a Hilbert space and construct an orthogonal basis for this inner product space with the aid of 2D-Hermite polynomials. Treating $X_{t}$ as an orthogonal basis expansion, we implement the method and obtain the expansion coefficients by solving a system of nonlinear integro-differential equations. The strength of such a method is that the expectation and variance of the solution are computed directly from these coefficients. Finally, numerical results demonstrate its validity and efficiency in comparison with other numerical methods.
- Mar 29 2017 math.DG arXiv:1703.09629v1 This note gives sufficient conditions (isothermic or totally nonisothermic) for an immersion of a compact surface to have no Bonnet mate.
- Mar 29 2017 cs.CV arXiv:1703.09625v1 Existing RNN-based approaches for action recognition from depth sequences require either skeleton joints or hand-crafted depth features as inputs. An end-to-end approach, mapping from raw depth maps to action classes, is non-trivial to design because: 1) a single-channel depth map lacks texture, which weakens its discriminative power; 2) the available set of depth training data is relatively small. To address these challenges, we propose to learn an RNN driven by privileged information (PI) in three steps. First, an encoder is pre-trained to learn a joint embedding of depth appearance and PI (i.e. skeleton joints). The learned embedding layers are then tuned in the learning step, aiming to optimize the network by exploiting PI in the form of a multi-task loss. However, exploiting PI as a secondary task provides little help in improving the performance of the primary task (i.e. classification) due to the gap between them. Finally, a bridging matrix is defined to connect the two tasks by discovering latent PI in the refining step. Our PI-based classification loss maintains a consistency between latent PI and predicted distribution. The latent PI and network are iteratively estimated and updated in an expectation-maximization procedure. The proposed learning process provides greater discriminative power to model subtle depth differences, while helping avoid overfitting the scarce training data. Our experiments show significant performance gains over state-of-the-art methods on three public benchmark datasets and our newly collected Blanket dataset.
- Mar 29 2017 cs.AI arXiv:1703.09620v1 Classical higher-order logic, when utilized as a meta-logic in which various other (classical and non-classical) logics can be shallowly embedded, is well suited for realising a universal logic reasoning approach. Universal logic reasoning in turn, as envisioned already by Leibniz, may support the rigorous formalisation and deep logical analysis of rational arguments within machines. A respective universal logic reasoning framework is described and a range of exemplary applications are discussed. In the future, universal logic reasoning in combination with appropriate, controlled forms of rational argumentation may serve as a communication layer between humans and intelligent machines.
- Mar 29 2017 cs.SE arXiv:1703.09613v1 When learning to use an Application Programming Interface (API), programmers need to understand the inputs and outputs (I/O) of the API functions. Current documentation tools automatically document the static information of I/O, such as parameter types and names. What is missing from these tools is dynamic information, such as I/O examples, i.e., actual valid values of inputs that produce certain outputs. In this paper, we demonstrate Docio, a prototype toolset we built to generate I/O examples. Docio logs I/O values when API functions are executed, for example in running test suites. Then, Docio puts I/O values into API documents as I/O examples. Docio has three programs: 1) funcWatch, which collects I/O values when API developers run test suites, 2) ioSelect, which selects one I/O example from a set of I/O values, and 3) ioPresent, which embeds the I/O examples into documents. In a preliminary evaluation, we used Docio to generate four hundred I/O examples for three C libraries: ffmpeg, libssh, and protobuf-c. Docio is open-source and available at: http://www3.nd.edu/~sjiang1/docio/
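The idea of logging I/O values while functions run and then turning one logged call into a documentation example can be illustrated with a small Python analogy. This is not Docio itself: Docio targets C libraries and the abstract does not describe its mechanism, so the decorator `watch_io`, the helper `present_example`, and the sample function `clamp` are all hypothetical names chosen for this sketch.

```python
import functools

def watch_io(log):
    """Return a decorator that appends each call's inputs and output to `log`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            log.append({
                "function": func.__name__,
                "args": args,
                "kwargs": kwargs,
                "output": result,
            })
            return result
        return wrapper
    return decorator

def present_example(entry):
    """Format one logged call as a one-line I/O example for documentation."""
    call_args = ", ".join(repr(a) for a in entry["args"])
    return f"{entry['function']}({call_args}) returns {entry['output']!r}"

io_log = []

@watch_io(io_log)
def clamp(value, low, high):
    """Hypothetical API function: restrict `value` to the range [low, high]."""
    return max(low, min(value, high))

# Stand-in for a test suite exercising the API.
clamp(15, 0, 10)
clamp(-3, 0, 10)

# Select the first logged call and render it as an I/O example.
example = present_example(io_log[0])
print(example)
```

The three roles in the sketch (collect, select, present) loosely mirror the three programs named in the abstract, but the correspondence is only conceptual.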
- Mar 29 2017 cs.SE arXiv:1703.09603v1 Committing to a version control system means submitting a software change to the system. Each commit can have a message to describe the submission. Several approaches have been proposed to automatically generate the content of such messages. However, the quality of the automatically generated messages falls far short of what humans write. In studying the differences between auto-generated and human-written messages, we found that 82% of the human-written messages have only one sentence, while the automatically generated messages often have multiple lines. Furthermore, we found that the commit messages often begin with a verb followed by a direct object. This finding inspired us to use a "verb+object" format in this paper to generate short commit summaries. We split the approach into two parts: verb generation and object generation. As our first try, we trained a classifier to classify a diff to a verb. We are seeking feedback from the community before we continue to work on generating direct objects for the commits.
- Mar 29 2017 quant-ph arXiv:1703.09588v1 The phase dependence of the cavity quantum dynamics in a driven equidistant three-level ladder-type system found in a quantum well structure with perpendicular transition dipoles is investigated in the good cavity limit. The pumping laser phases are directly transferred to the superposed amplitudes of the cavity-quantum-well interaction. Their phase difference may be tuned in order to obtain destructive quantum interferences. Therefore, the cavity field vanishes although the emitter continues to be pumped.
- In a model of the late-time cosmic acceleration within the framework of generalized Proca theories, there exists a de Sitter attractor preceded by the dark energy equation of state $w_{\rm DE}=-1-s$, where $s$ is a positive constant. We run the Markov-Chain-Monte-Carlo code to confront the model with the observational data of Cosmic Microwave Background (CMB), baryon acoustic oscillations, supernovae type Ia, and local measurements of the Hubble expansion rate for the background cosmological solutions and obtain the bound $s=0.254^{+0.118}_{-0.097}$ at 95% confidence level (CL). The existence of the parameter $s$, in addition to those in the $\Lambda$-Cold-Dark-Matter ($\Lambda$CDM) model, makes it possible to reduce tensions of the Hubble constant $H_0$ between the CMB and the low-redshift measurements. Including the cosmic growth data of redshift-space distortions in the galaxy power spectrum and taking into account no-ghost and stability conditions of cosmological perturbations, we find that the bound on $s$ is shifted to $s=0.16^{+0.08}_{-0.08}$ (95% CL) and hence the model with $s>0$ is still favored over the $\Lambda$CDM model. Apart from the quantities $s$, $H_0$ and today's matter density parameter $\Omega_{m0}$, the constraints on other model parameters associated with perturbations are less stringent, reflecting the fact that there are different sets of parameters that give rise to similar cosmic expansion and growth history.
- Mar 29 2017 cs.CV arXiv:1703.09554v1 Convolutional networks reach top quality in pixel-level object tracking but require a large amount of training data (1k ~ 10k) to deliver such results. We propose a new training strategy which achieves state-of-the-art results across three evaluation datasets while using 20x ~ 100x less annotated data than competing methods. Instead of using large training sets hoping to generalize across domains, we generate in-domain training data using the provided annotation on the first frame of each video to synthesize ("lucid dream") plausible future video frames. In-domain per-video training data allows us to train high quality appearance- and motion-based models, as well as tune the post-processing stage. This approach makes it possible to reach competitive results even when training from only a single annotated frame, without ImageNet pre-training. Our results indicate that using a larger training set is not automatically better, and that for the tracking task a smaller training set that is closer to the target domain is more effective. This changes the mindset regarding how many training samples and general "objectness" knowledge are required for the object tracking task.
- While humor has been historically studied from a psychological, cognitive and linguistic standpoint, its study from a computational perspective is an area yet to be explored in Computational Linguistics. There exist some previous works, but a characterization of humor that allows its automatic recognition and generation is far from being specified. In this work we build a crowdsourced corpus of labeled tweets, annotated according to their humor value, letting the annotators subjectively decide which are humorous. A humor classifier for Spanish tweets is assembled based on supervised learning, reaching a precision of 84% and a recall of 69%.
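The reported precision and recall can be combined into a single F1 score, their harmonic mean. The abstract does not report F1, so the value computed below is derived here from the two stated figures rather than quoted from the paper; the function name `f1_score` is just a local helper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Figures stated in the abstract: precision 84%, recall 69%.
f1 = f1_score(0.84, 0.69)
print(round(f1, 3))  # 0.758
```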
- Mar 29 2017 cs.CV arXiv:1703.09507v1 In recent years, the performance of face verification systems has significantly improved using deep convolutional neural networks (DCNNs). A typical pipeline for face verification includes training a deep network for subject classification with softmax loss, using the penultimate layer output as the feature descriptor, and generating a cosine similarity score given a pair of face images. The softmax loss function does not optimize the features to have higher similarity score for positive pairs and lower similarity score for negative pairs, which leads to a performance gap. In this paper, we add an L2-constraint to the feature descriptors which restricts them to lie on a hypersphere of a fixed radius. This module can be easily implemented using existing deep learning frameworks. We show that integrating this simple step in the training pipeline significantly boosts the performance of face verification. Specifically, we achieve state-of-the-art results on the challenging IJB-A dataset, achieving True Accept Rates of 0.863 and 0.910 at False Accept Rates 0.0001 and 0.001 respectively on the face verification protocol.
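Restricting feature descriptors to a hypersphere of fixed radius amounts to rescaling each vector to a chosen L2 norm. The following is a minimal pure-Python sketch of that rescaling and of the cosine similarity score mentioned in the abstract. It is not the authors' training module: the radius parameter `alpha`, the function names, and the toy 2-D vectors are assumptions for illustration, and the sketch shows only the forward computation, not how the constraint is integrated into a network.

```python
import math

def l2_constrain(features, alpha=1.0):
    """Rescale a feature vector so that its L2 norm equals `alpha` (assumed radius)."""
    norm = math.sqrt(sum(x * x for x in features))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [alpha * x / norm for x in features]

def l2_norm(vector):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in vector))

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (l2_norm(a) * l2_norm(b))

# Two toy descriptors pointing in the same direction but with different norms.
raw_a = [3.0, 4.0]
raw_b = [6.0, 8.0]

feat_a = l2_constrain(raw_a, alpha=10.0)
feat_b = l2_constrain(raw_b, alpha=10.0)

print(l2_norm(feat_a), l2_norm(feat_b))
print(cosine_similarity(feat_a, feat_b))
```

After the constraint both descriptors have the same norm, so any remaining difference between two descriptors is purely a difference in direction, which is what the cosine similarity score measures.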
- Mar 29 2017 cs.CV arXiv:1703.09474v1 Person re-identification (re-id) aims to match people across non-overlapping camera views. So far, RGB-based appearance has been widely used in most existing works. However, when people appear under extreme illumination or change clothes, RGB appearance-based re-id methods tend to fail. To overcome this problem, we propose to exploit depth information to provide more invariant body shape and skeleton information regardless of illumination and color change. More specifically, we exploit a depth voxel covariance descriptor and further propose a locally rotation invariant depth shape descriptor called Eigen-depth feature to describe pedestrian body shape. We prove that the distance between any two covariance matrices on the Riemannian manifold is equivalent to the Euclidean distance between the corresponding Eigen-depth features. Furthermore, we propose a kernelized implicit feature transfer scheme to estimate the Eigen-depth feature implicitly from an RGB image when depth information is not available. We find that combining the estimated depth features with RGB-based appearance features can sometimes help to better reduce visual ambiguities of appearance features caused by illumination and similar clothes. The effectiveness of our models was validated on publicly available depth pedestrian datasets as compared to related methods for person re-identification.
- Users like sharing personal photos with others through social media. At the same time, they might want to make automatic identification in such photos difficult or even impossible. Classic obfuscation methods such as blurring are not only unpleasant but also not as effective as one would expect. Recent studies on adversarial image perturbations (AIP) suggest that it is possible to confuse recognition systems effectively without unpleasant artifacts. However, in the presence of countermeasures against AIPs, it is unclear how effective AIP would be, in particular when the choice of countermeasure is unknown. Game theory provides tools for studying the interaction between agents with uncertainties in the strategies. We introduce a general game theoretical framework for the user-recogniser dynamics, and present a case study that involves current state-of-the-art AIP and person recognition techniques. We derive the optimal strategy for the user that assures an upper bound on the recognition rate independent of the recogniser's countermeasure.
- We describe a novel method for blind, single-image spectral super-resolution. While conventional super-resolution aims to increase the spatial resolution of an input image, our goal is to spectrally enhance the input, i.e., generate an image with the same spatial resolution, but a greatly increased number of narrow (hyperspectral) wavelength bands. Just as the spatial statistics of natural images have rich structure, which one can exploit as a prior to predict high-frequency content from a low-resolution image, the same is true in the spectral domain: the materials and lighting conditions of the observed world induce structure in the spectrum of wavelengths observed at a given pixel. Surprisingly, very little work exists that attempts to use this observation to achieve blind spectral super-resolution from single images. We start from the conjecture that, just like in the spatial domain, we can learn the statistics of natural image spectra, and with their help generate finely resolved hyperspectral images from RGB input. Technically, we follow the current best practice and implement a convolutional neural network (CNN), which is trained to carry out the end-to-end mapping from an entire RGB image to the corresponding hyperspectral image of equal size. We demonstrate spectral super-resolution both for conventional RGB images and for multi-spectral satellite data, outperforming the state-of-the-art.
- Current speech enhancement techniques operate on the spectral domain and/or exploit some higher-level feature. The majority of them tackle a limited number of noise conditions and rely on first-order statistics. To circumvent these issues, deep networks are being increasingly used, thanks to their ability to learn complex functions from large example sets. In this work, we propose the use of generative adversarial networks for speech enhancement. In contrast to current techniques, we operate at the waveform level, training the model end-to-end, and incorporate 28 speakers and 40 different noise conditions into the same model, such that model parameters are shared across them. We evaluate the proposed model using an independent, unseen test set with two speakers and 20 alternative noise conditions. The enhanced samples confirm the viability of the proposed model, and both objective and subjective evaluations confirm its effectiveness. With that, we open the exploration of generative architectures for speech enhancement, which may progressively incorporate further speech-centric design choices to improve their performance.
- Mar 29 2017 cs.CV arXiv:1703.09438v1We present a deep convolutional decoder architecture that can generate volumetric 3D outputs in a compute- and memory-efficient manner by using an octree representation. The network learns to predict both the structure of the octree, and the occupancy values of individual cells. This makes it a particularly valuable technique for generating 3D shapes. In contrast to standard decoders acting on regular voxel grids, the architecture does not have cubic complexity. This allows representing much higher resolution outputs with a limited memory budget. We demonstrate this in several application domains, including 3D convolutional autoencoders, generation of objects and whole scenes from high-level representations, and shape from a single image.
- Mar 29 2017 cs.CV arXiv:1703.09436v1The task of counting eucalyptus trees from aerial images collected by unmanned aerial vehicles (UAVs) has been frequently explored by techniques of estimation of the basal area, i.e., by determining the expected number of trees based on sampling techniques. An alternative is the use of machine learning to identify patterns that represent a tree unit, and then search for the occurrence of these patterns throughout the image. This strategy depends on a supervised image segmentation step to define the regions of interest. Thus, it is possible to automate the counting of eucalyptus trees in these images, thereby increasing the efficiency of eucalyptus forest inventory management. In this paper, we evaluated 20 different classifiers for the image segmentation task. A real sample was used to analyze the tree counting task in a practical environment. The results show that it is possible to automate this task with 0.7% counting error, in particular by using strategies based on a combination of classifiers. Moreover, we present some performance considerations about each classifier that can be useful as a basis for decision-making in future tasks.
- Multiple different approaches of generating adversarial examples have been proposed to attack deep neural networks. These approaches involve either directly computing gradients with respect to the image pixels, or directly solving an optimization on the image pixels. In this work, we present a fundamentally new method for generating adversarial examples that is fast to execute and provides exceptional diversity of output. We efficiently train feed-forward neural networks in a self-supervised manner to generate adversarial examples against a target network or set of networks. We call such a network an Adversarial Transformation Network (ATN). ATNs are trained to generate adversarial examples that minimally modify the classifier's outputs given the original input, while constraining the new classification to match an adversarial target class. We present methods to train ATNs and analyze their effectiveness targeting a variety of MNIST classifiers as well as the latest state-of-the-art ImageNet classifier Inception ResNet v2.
- Mar 29 2017 quant-ph arXiv:1703.09381v1We provide a general construction of convex roof measures of coherence. This construction is based on arbitrary coherence measures of pure states in the framework of resource theory of coherence. Convex roof measures of coherence bound from above all possible coherence measures, given specific valid quantifications of pure states.
- Mar 29 2017 cs.CV arXiv:1703.09379v1The process of using one image to guide the filtering process of another one is called Guided Image Filtering (GIF). The main challenge of GIF is the structure inconsistency between the guidance image and the target image. Besides, noise in the target image is also a challenging issue especially when it is heavy. In this paper, we propose a general framework for Robust Guided Image Filtering (RGIF), which contains a data term and a smoothness term, to solve the two issues mentioned above. The data term makes our model simultaneously denoise the target image and perform GIF which is robust against the heavy noise. The smoothness term is able to make use of the property of both the guidance image and the target image which is robust against the structure inconsistency. While the resulting model is highly non-convex, it can be solved through the proposed Iteratively Re-weighted Least Squares (IRLS) in an efficient manner. For challenging applications such as guided depth map upsampling, we further develop a data-driven parameter optimization scheme to properly determine the parameter in our model. This optimization scheme can help to preserve small structures and sharp depth edges even for a large upsampling factor (8x for example). Moreover, the specially designed structure of the data term and the smoothness term makes our model perform well in edge-preserving smoothing for single-image tasks (i.e., the guidance image is the target image itself). This paper is an extension of our previous work [1], [2].
- Mar 29 2017 cs.CV arXiv:1703.09342v1Sparse coding (SC) is an unsupervised learning scheme that has received increasing interest in recent years. However, conventional SC vectorizes the input images, which destroys the intrinsic spatial structure of the images. In this paper, we propose a novel graph regularized tensor sparse coding (GTSC) for image representation. GTSC preserves the local proximity of elementary structures in the image by adopting the newly proposed tubal-tensor representation. Simultaneously, it considers the intrinsic geometric properties by imposing graph regularization, which has been successfully applied to uncover the geometric distribution of image data. Moreover, the sparse representations returned by GTSC have better physical explanations, as the key operation (i.e., circular convolution) in the tubal-tensor model preserves the shift-invariance property. Experimental results on image clustering demonstrate the effectiveness of the proposed scheme.
- Mar 29 2017 cs.LG arXiv:1703.09327v1In Imitation Learning, a supervisor's policy is observed and the intended behavior is learned. A known problem with this approach is covariate shift, which occurs because the agent visits different states than the supervisor. Rolling out the current agent's policy, an on-policy method, allows for collecting data along a distribution similar to the updated agent's policy. However, this approach can become less effective as the demonstrations are collected in very large batch sizes, which reduces the relevance of data collected in previous iterations. In this paper, we propose to alleviate the covariate shift via the injection of artificial noise into the supervisor's policy. We prove an improved bound on the loss due to the covariate shift, and introduce an algorithm that leverages our analysis to estimate the level of $\epsilon$-greedy noise to inject. In a driving simulator domain where an agent learns an image-to-action deep network policy, our algorithm Dart achieves better performance than DAgger with 75% fewer demonstrations.
- Mar 29 2017 quant-ph cond-mat.mes-hall arXiv:1703.09317v1Sensors based on single spins can enable magnetic field detection with very high sensitivity and spatial resolution. Previous work has concentrated on sensing of a constant magnetic field or a periodic signal. Here, we instead investigate the problem of estimating a field with non-periodic variation described by a Wiener process. We propose and study, by numerical simulations, an adaptive tracking protocol based on Bayesian estimation. The tracking protocol updates the probability distribution for the magnetic field, based on measurement outcomes, and adapts the choice of sensing time and phase in real time. By taking the statistical properties of the signal into account, our protocol strongly reduces the required measurement time, reducing the error in the estimation of a time-varying signal by up to a factor of 4.
- Mar 29 2017 cs.RO arXiv:1703.09312v1To reduce data collection time for deep learning of robust robotic grasp plans, we explore training from a synthetic dataset of 6.7 million point clouds, grasps, and robust analytic grasp metrics generated from thousands of 3D models from Dex-Net 1.0 in randomized poses on a table. We use the resulting dataset, Dex-Net 2.0, to train a Grasp Quality Convolutional Neural Network (GQ-CNN) model that rapidly classifies grasps as robust from depth images and the position, angle, and height of the gripper above a table. Experiments with over 1,000 trials on an ABB YuMi comparing grasp planning methods on singulated objects suggest that a GQ-CNN trained with only synthetic data from Dex-Net 2.0 can be used to plan grasps in 0.8s with a success rate of 93% on eight known objects with adversarial geometry and is 3x faster than registering point clouds to a precomputed dataset of objects and indexing grasps. The GQ-CNN is also the highest performing method on a dataset of ten novel household objects, with zero false positives out of 29 grasps classified as robust and a 1.5x higher success rate than a point cloud registration method.
- This work studies how an AI-controlled dog-fighting agent with tunable decision-making parameters can learn to optimize performance against an intelligent adversary, as measured by a stochastic objective function evaluated on simulated combat engagements. Gaussian process Bayesian optimization (GPBO) techniques are developed to automatically learn global Gaussian Process (GP) surrogate models, which provide statistical performance predictions in both explored and unexplored areas of the parameter space. This allows a learning engine to sample full-combat simulations at parameter values that are most likely to optimize performance and also provide highly informative data points for improving future predictions. However, standard GPBO methods do not provide a reliable surrogate model for the highly volatile objective functions found in aerial combat, and thus do not reliably identify global maxima. These issues are addressed by novel Repeat Sampling (RS) and Hybrid Repeat/Multi-point Sampling (HRMS) techniques. Simulation studies show that HRMS improves the accuracy of GP surrogate models, allowing AI decision-makers to more accurately predict performance and efficiently tune parameters.
- Mar 29 2017 cs.SD arXiv:1703.09302v1In this study we present a Deep Mixture of Experts (DMoE) neural-network architecture for single microphone speech enhancement. By contrast to most speech enhancement algorithms that overlook the speech variability mainly caused by phoneme structure, our framework comprises a set of deep neural networks (DNNs), each one of which is an 'expert' in enhancing a given speech type corresponding to a phoneme. A gating DNN determines which expert is assigned to a given speech segment. A speech presence probability (SPP) is then obtained as a weighted average of the expert SPP decisions, with the weights determined by the gating DNN. A soft spectral attenuation, based on the SPP, is then applied to enhance the noisy speech signal. The experts and the gating components of the DMoE network are trained jointly. As part of the training, speech clustering into different subsets is performed in an unsupervised manner. Therefore, unlike previous methods, a phoneme-labeled database is not required for the training procedure. A series of experiments with different noise types verified the applicability of the new algorithm to the task of speech enhancement. The proposed scheme outperforms other schemes that either do not consider phoneme structure or use a simpler training methodology.
- In applications of Einstein gravity one replaces the quantum-mechanical energy-momentum tensor of sources such as the degenerate electrons in a white dwarf or the black-body photons in the microwave background by c-number matrix elements. And not only that, one ignores the zero-point fluctuations in these sources by only retaining the normal-ordered parts of those matrix elements. There is no apparent justification for this procedure, and we show that it is precisely this procedure that leads to the cosmological constant problem. We suggest that solving the problem requires that gravity be treated just as quantum-mechanically as the sources to which it couples, and show that one can then solve the cosmological constant problem if one replaces Einstein gravity by the fully quantum-mechanically consistent conformal gravity theory.
- Mar 29 2017 math.OC arXiv:1703.09280v1We present a subgradient method for minimizing non-smooth, non-Lipschitz convex optimization problems. The only structure assumed is that a strictly feasible point is known. We extend the work of Renegar [1] by taking a different perspective, leading to an algorithm which is conceptually more natural, has notably improved convergence rates, and for which the analysis is surprisingly simple. At each iteration, the algorithm takes a subgradient step and then performs a line search to move radially towards (or away from) the known feasible point. Our convergence results have striking similarities to those of traditional methods that require Lipschitz continuity. Costly orthogonal projections typical of subgradient methods are entirely avoided.
- Mar 29 2017 quant-ph arXiv:1703.09278v1Quantum key distribution using weak coherent states and homodyne detection is a promising candidate for practical quantum-cryptographic implementations due to its compatibility with existing telecom equipment and high detection efficiencies. However, despite the simplicity of the protocol itself, the security analysis of this method is rather involved compared to discrete-variable QKD. In this article we review the theoretical foundations of continuous-variable quantum key distribution (CV-QKD) with Gaussian modulation and rederive the essential relations from scratch in a pedagogical way. The aim of this paper is to be as comprehensive and self-contained as possible, so that it is readily intelligible even to readers with little prior knowledge of the subject. Although the present article is a theoretical discussion of CV-QKD, its focus lies on practical implementations, taking into account various kinds of hardware imperfections and suggesting practical methods to perform the security analysis subsequent to the key exchange. Apart from a review of well-known results, this manuscript presents a set of new original noise models which are helpful for estimating how well a given set of hardware will perform in practice.
- Real-world robots are becoming increasingly complex and commonly act in poorly understood environments where it is extremely challenging to model or learn their true dynamics. Therefore, it might be desirable to take a task-specific approach, wherein the focus is on explicitly learning the dynamics model which achieves the best control performance for the task at hand, rather than learning the true dynamics. In this work, we use Bayesian optimization in an active learning framework where a locally linear dynamics model is learned with the intent of maximizing the control performance, and used in conjunction with optimal control schemes to efficiently design a controller for a given task. This model is updated directly based on the performance observed in experiments on the physical system in an iterative manner until a desired performance is achieved. We demonstrate the efficacy of the proposed approach through simulations and real experiments on a quadrotor testbed.
- Mar 29 2017 cs.CV arXiv:1703.09245v1Recently, several discriminative learning approaches have been proposed for effective image restoration, achieving convincing trade-off between image quality and computational efficiency. However, these methods require separate training for each restoration task (e.g., denoising, deblurring, demosaicing) and problem condition (e.g., noise level of input images). This makes it time-consuming and difficult to encompass all tasks and conditions during training. In this paper, we propose a discriminative transfer learning method that incorporates formal proximal optimization and discriminative learning for general image restoration. The method requires a single-pass training and allows for reuse across various problems and conditions while achieving an efficiency comparable to previous discriminative approaches. Furthermore, after being trained, our model can be easily transferred to new likelihood terms to solve untrained tasks, or be combined with existing priors to further improve image restoration quality.
- We study a variant of the source identification game with training data in which part of the training data is corrupted by an attacker. In the addressed scenario, the defender aims at deciding whether a test sequence has been drawn according to a discrete memoryless source $X \sim P_X$, whose statistics are known to him through the observation of a training sequence generated by $X$. In order to undermine the correct decision under the alternative hypothesis that the test sequence has not been drawn from $X$, the attacker can modify a sequence produced by a source $Y \sim P_Y$ up to a certain distortion, and corrupt the training sequence either by adding some fake samples or by replacing some samples with fake ones. We derive the unique rationalizable equilibrium of the two versions of the game in the asymptotic regime and by assuming that the defender bases its decision by relying only on the first order statistics of the test and the training sequences. By mimicking Stein's lemma, we derive the best achievable performance for the defender when the first type error probability is required to tend to zero exponentially fast with an arbitrarily small, yet positive, error exponent. We then use such a result to analyze the ultimate distinguishability of any two sources as a function of the allowed distortion and the fraction of corrupted samples injected into the training sequence.
- Mar 29 2017 astro-ph.CO gr-qc arXiv:1703.09228v1What are the fundamental limitations of reconstructing the properties of dark energy, given cosmological observations in the weakly nonlinear regime in a range of redshifts, to be as precise as required? The aim of this paper is to address this question by constructing model-independent observables, whilst completely ignoring practical problems of real-world observations. Non-Gaussianities already present in the initial conditions are not directly accessible from observations, because of a perfect degeneracy with the non-Gaussianities arising from the nonlinear matter evolution in generalized dark energy models. By imposing a specific set of evolution equations that should cover a range of dark energy cosmologies, we however find a constraint equation for the linear structure growth rate $f_1$ expressed in terms of model-independent observables. Entire classes of dark energy models which do not satisfy this constraint equation could be ruled out, and for models satisfying it we could reconstruct e.g. the nonlocal bias parameters $b_1$ and $b_2$.
- Mar 29 2017 cs.DB arXiv:1703.09218v1In visual exploration and analysis of data, determining how to select and transform the data for visualization is a challenge for data-unfamiliar or inexperienced users. Our main hypothesis is that for many data sets and common analysis tasks, there are relatively few "data slices" that result in effective visualizations. By focusing human users on appropriate and suitably transformed parts of the underlying data sets, these data slices can help the users carry their task to correct completion. To verify this hypothesis, we develop a framework that permits us to capture exemplary data slices for a user task, and to explore and parse visual-exploration sequences into a format that makes them distinct and easy to compare. We develop a recommendation system, DataSlicer, that matches a "currently viewed" data slice with the most promising "next effective" data slices for the given exploration task. We report the results of controlled experiments with an implementation of the DataSlicer system, using four common analytical task types. The experiments demonstrate statistically significant improvements in accuracy and exploration speed versus users without access to our system.
- This paper presents a real-time, vibration-based damage identification technique using measured frequency response functions (FRFs) under random vibration loading. Artificial Neural Networks (ANNs) are trained to map damage fingerprints to damage characteristic parameters. Principal component analysis (PCA) is used to tackle the high dimensionality and high noise of the data, which are common for industrial structures. The present study considers cracks, rivet hole expansion, and redundant uniform mass as damage types on the structure. After being reduced in size using PCA, the frequency response function data are fed to individual neural networks to localize the damage and predict its severity. The system of ANNs is trained with both numerical and experimental model data to make it reliable and robust. The methodology is applied to a numerical model of a stiffened panel structure, where damage is confined close to the stiffener. The results show that, in all the cases considered, it is possible to localize the damage and predict its severity with very good accuracy and reliability.
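
The PCA reduction step mentioned in the last abstract above can be illustrated with a small sketch. This is not the authors' implementation: the function name `pca_reduce`, the synthetic "FRF" matrix, the choice of three components, and the use of NumPy's SVD are all assumptions made for illustration. It only shows how high-dimensional, noisy response data might be compressed to a few scores before being passed to a network.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal directions (hypothetical helper)."""
    mean = X.mean(axis=0)
    Xc = X - mean  # center each feature (frequency bin)
    # SVD of the centered data: the rows of Vt are the principal directions.
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    scores = Xc @ components.T  # reduced representation, one row per measurement
    explained = (S ** 2) / np.sum(S ** 2)  # fraction of variance per direction
    return scores, components, mean, explained[:n_components]


# Synthetic stand-in for measured FRFs (assumption): 50 measurements x 400
# frequency bins, generated from 3 latent factors plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(50, 3))
basis = rng.normal(size=(3, 400))
frfs = latent @ basis + 0.01 * rng.normal(size=(50, 400))

scores, components, mean, explained = pca_reduce(frfs, n_components=3)
print(scores.shape)
```

In a pipeline like the one the abstract describes, the `scores` array (rather than the raw 400-bin data) would be the input to the individual neural networks; how many components to keep is a design choice that depends on the noise level of the measurements.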