- We present a quantum algorithm for simulating the dynamics of Hamiltonians that are not necessarily sparse. Our algorithm is based on the assumption that the entries of the Hamiltonian are stored in a data structure that allows for the efficient preparation of states that encode the rows of the Hamiltonian. We use a linear combination of quantum walks to achieve a poly-logarithmic dependence on the precision. The time complexity of our algorithm, measured in terms of circuit depth, is $O(t\sqrt{N}\lVert H \rVert \text{polylog}(N, t\lVert H \rVert, 1/\epsilon))$, where $t$ is the evolution time, $N$ is the dimension of the system, and $\epsilon$ is the error in the final state, which we call precision. Our algorithm can be applied directly as a subroutine for implementing unitaries and for solving linear systems, achieving a $\widetilde{O}(\sqrt{N})$ dependence for both applications.
- Mar 23 2018 quant-ph cond-mat.stat-mech arXiv:1803.08220v1 The pursuit of simplicity underlies most of quantitative science. In stochastic modeling, there has been significant effort towards finding models that predict a process' future using minimal information from its past. Meanwhile, in condensed matter physics, finding efficient representations for large quantum many-body systems is a topic of critical concern -- exemplified by the development of the matrix product state (MPS) formalism. In this letter, we connect these two distinct fields. Specifically, we associate each stochastic process with a suitable quantum state of a spin-chain. We show that the optimal predictive model for the process leads directly to the MPS representation of the associated quantum state. Conversely, MPS methods offer a systematic construction of q-simulators -- the currently best known predictive quantum models for stochastic processes. We show that the memory requirements of these models directly coincide with the bipartite entanglement of an associated spin-chain, providing an analytical connection between quantum correlations in many-body physics and the complexity of stochastic modeling.
- Mar 23 2018 quant-ph arXiv:1803.08245v1 Estimation of quantum states and measurements is crucial for the implementation of quantum information protocols. The standard method for each is quantum tomography. However, quantum tomography suffers from systematic errors caused by imperfect knowledge of the system. We present a procedure to simultaneously characterize quantum states and measurements that mitigates systematic errors by use of a single high-fidelity state preparation and a limited set of high-fidelity unitary operations. Such states and operations are typical of many state-of-the-art systems. For this situation we design a set of experiments and an optimization algorithm that alternates between maximizing the likelihood with respect to the states and measurements to produce estimates of each. In some cases, the procedure does not enable unique estimation of the states. For these cases, we show how one may identify a set of density matrices compatible with the measurements and use a semi-definite program to place bounds on the state's expectation values. We demonstrate the procedure on data from a simulated experiment with two trapped ions.
- We study the finite-temperature scrambling behavior of a quantum system described by a Hamiltonian chosen from a random matrix ensemble. This effectively (0+1)-dimensional model admits an exact calculation of various ensemble-averaged out-of-time-ordered correlation functions (OTOCs) in the large-$N$ limit, where $N$ is the Hilbert space dimension. For a Hamiltonian drawn from the Gaussian unitary ensemble, we calculate the ensemble-averaged OTOC at all temperatures. In addition to an early-time quadratic growth of the averaged out-of-time-ordered commutator, we determine that the OTOC saturates to its asymptotic value as a power-law in time, with an exponent that depends both on temperature and on one of four classes of operators appearing in the correlation function, which naturally emerge from this calculation. Out-of-time-ordered correlation functions of operators that are distributed around the thermal circle take a time $t_{s}\sim \beta$ to decay at low temperatures. We generalize these exact results by demonstrating that out-of-time-ordered correlation functions averaged over any ensemble of Hamiltonians that is invariant under unitary conjugation $H \rightarrow {U} H {U}^{\dagger}$ exhibit power-law decay to an asymptotic value. We argue that this late-time behavior is a generic feature of unitary dynamics with energy conservation. Finally, by comparing the OTOC with a commutator-anticommutator correlation function, we examine whether there is a time window over which a typical Hamiltonian behaves as a "coherent scrambler" in the language of Refs. \cite{Kitaev_IAS, Kitaev_Suh}.
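For intuition, the averaged OTOC described above can be estimated numerically at small Hilbert space dimension. The sketch below is illustrative only: a single GUE draw stands in for the ensemble average, the matrix normalization is a simplifying assumption, and `otoc` evaluates $F(t) = \mathrm{Tr}[\rho\, W(t)\, V\, W(t)\, V]$ with $W(t) = e^{iHt} W e^{-iHt}$.

```python
import numpy as np

def gue(n, rng):
    # Draw an n x n Hermitian matrix from the Gaussian unitary ensemble
    # (normalization chosen for simplicity, not to match any paper convention).
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (a + a.conj().T) / 2

def otoc(h, v, w, t, beta=0.0):
    # F(t) = Tr[rho W(t) V W(t) V] with rho = exp(-beta H) / Z.
    evals, u = np.linalg.eigh(h)
    wt = (u @ np.diag(np.exp(1j * evals * t)) @ u.conj().T
          @ w @ u @ np.diag(np.exp(-1j * evals * t)) @ u.conj().T)
    rho = u @ np.diag(np.exp(-beta * evals)) @ u.conj().T
    rho /= np.trace(rho)
    return np.trace(rho @ wt @ v @ wt @ v)
```

At $t=0$ with $W=V$ and $W^2 = I$, the correlator reduces to $\mathrm{Tr}[\rho] = 1$, a useful sanity check.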
- Mar 23 2018 quant-ph cond-mat.dis-nn arXiv:1803.08321v1 The near-critical unitary dynamics of quantum Ising spin chains in transverse and longitudinal magnetic fields is studied using an artificial neural network representation of the wave function. A focus is set on the strong spatial correlations which build up in the system following a quench into the vicinity of the quantum critical point. We compare correlations observed following reinforcement learning of the network states with analytical solutions in integrable cases and with tDMRG simulations, as well as with predictions from a semi-classical discrete Truncated Wigner analysis. While the semi-classical approach excels mainly at short times and for small transverse fields, the neural-network representation provides accurate results over a much wider range of parameters. Where long-range spin-spin correlations build up in the long-time dynamics, we find qualitative agreement with exact results, while quantitative deviations are of similar size as for the semi-classically predicted correlations, and slow convergence is observed when increasing the number of hidden neurons.
- Mar 23 2018 quant-ph arXiv:1803.08259v1 We propose a reference-frame-independent measurement-device-independent quantum key distribution protocol with uncharacterized quantum bits, and we prove its security. The protocol can also be useful for implementations over a channel that has a very low bit error rate but suffers from large uncontrolled unitary rotations.
- The fine-grained energy spectrum of quantum chaotic systems is widely believed to be described by random matrix statistics. A basic scale in such a system is the energy range over which this behavior persists. We define the corresponding time scale by the time at which the linearly growing ramp region in the spectral form factor begins. We call this time $t_{\rm ramp}$. The purpose of this paper is to study this scale in many-body quantum systems that display strong chaos, sometimes called scrambling systems. We focus on randomly coupled qubit systems, both local and $k$-local (all-to-all interactions), and the Sachdev--Ye--Kitaev (SYK) model. Using numerical results, analytic estimates for random quantum circuits, and a heuristic analysis of Hamiltonian systems we find the following results. For geometrically local systems with a conservation law we find $t_{\rm ramp}$ is determined by the diffusion time across the system, order $N^2$ for a 1D chain of $N$ qubits. This is analogous to the behavior found for local one-body chaotic systems. For a $k$-local system like SYK the time is order $\log N$ but with a different prefactor and a different mechanism than the scrambling time. In the absence of any conservation laws, as in a generic random quantum circuit, we find $t_{\rm ramp} \sim \log N$, independent of connectivity.
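The spectral form factor whose ramp defines $t_{\rm ramp}$ can be computed directly from a spectrum. The sketch below uses one common convention, $K(t) = |Z(\beta + it)|^2 / |Z(\beta)|^2$ with $Z(\beta) = \sum_n e^{-\beta E_n}$; the normalization is an assumption, not necessarily the paper's.

```python
import numpy as np

def spectral_form_factor(evals, ts, beta=0.0):
    # K(t) = |sum_n exp(-(beta + i t) E_n)|^2 / |sum_n exp(-beta E_n)|^2.
    # `evals` is a 1D array of energy eigenvalues, `ts` an array of times.
    z = np.exp(-np.outer(beta + 1j * ts, evals)).sum(axis=1)
    z0 = np.exp(-beta * evals).sum()
    return np.abs(z) ** 2 / np.abs(z0) ** 2
```

With this normalization $K(0) = 1$; for a random-matrix-like spectrum, an ensemble or running-time average of $K(t)$ exhibits the dip-ramp-plateau structure discussed above.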
- Batch Normalization (BN) is a milestone technique in the development of deep learning, enabling various networks to train. However, normalizing along the batch dimension introduces problems --- BN's error increases rapidly when the batch size becomes smaller, caused by inaccurate batch statistics estimation. This limits BN's usage for training larger models and transferring features to computer vision tasks including detection, segmentation, and video, which require small batches constrained by memory consumption. In this paper, we present Group Normalization (GN) as a simple alternative to BN. GN divides the channels into groups and computes within each group the mean and variance for normalization. GN's computation is independent of batch size, and its accuracy is stable across a wide range of batch sizes. On ResNet-50 trained on ImageNet, GN has 10.6% lower error than its BN counterpart when using a batch size of 2; when using typical batch sizes, GN is comparably good with BN and outperforms other normalization variants. Moreover, GN can be naturally transferred from pre-training to fine-tuning. GN can outperform or compete with its BN-based counterparts for object detection and segmentation on COCO, and for video classification on Kinetics, showing that GN can effectively replace the powerful BN in a variety of tasks. GN can be easily implemented with a few lines of code in modern libraries.
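As the abstract notes, GN takes only a few lines. A NumPy sketch for NCHW tensors (mirroring the idea, not the paper's exact pseudocode) makes the batch-size independence explicit: statistics are computed per sample, per channel group.

```python
import numpy as np

def group_norm(x, gamma, beta, num_groups, eps=1e-5):
    # x: (N, C, H, W). Split C channels into `num_groups` groups and
    # normalize each (sample, group) slice by its own mean and variance,
    # so the result does not depend on the batch size N.
    n, c, h, w = x.shape
    xg = x.reshape(n, num_groups, c // num_groups, h, w)
    mean = xg.mean(axis=(2, 3, 4), keepdims=True)
    var = xg.var(axis=(2, 3, 4), keepdims=True)
    xg = (xg - mean) / np.sqrt(var + eps)
    # Per-channel learned scale and shift, as in BN.
    return xg.reshape(n, c, h, w) * gamma.reshape(1, c, 1, 1) + beta.reshape(1, c, 1, 1)
```

Setting `num_groups=1` recovers Layer Normalization and `num_groups=C` recovers Instance Normalization, which is why GN interpolates between the two.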
- Unseen Action Recognition (UAR) aims to recognise novel action categories without training examples. While previous methods focus on inner-dataset seen/unseen splits, this paper proposes a pipeline using a large-scale training source to achieve a Universal Representation (UR) that can generalise to a more realistic Cross-Dataset UAR (CD-UAR) scenario. We first address UAR as a Generalised Multiple-Instance Learning (GMIL) problem and discover 'building-blocks' from the large-scale ActivityNet dataset using distribution kernels. Essential visual and semantic components are preserved in a shared space to achieve the UR, which can efficiently generalise to new datasets. Predicted UR exemplars can be improved by a simple semantic adaptation, after which an unseen action can be recognised directly using the UR at test time. Without further training, extensive experiments demonstrate significant improvements over the UCF101 and HMDB51 benchmarks.
- Deep reinforcement learning has been successfully applied to several visual-input tasks using model-free methods. In this paper, we propose a model-based approach that combines learning a DNN-based transition model with Monte Carlo tree search to solve a block-placing task in Minecraft. Our learned transition model predicts the next frame and the reward one step ahead, given the last four frames of the agent's first-person-view image and the current action. A Monte Carlo tree search algorithm then uses this model to plan the best sequence of actions for the agent to perform. On the proposed task in Minecraft, our model-based approach reaches performance comparable to that of the Deep Q-Network, but learns faster and is thus more sample-efficient.
- Mar 23 2018 quant-ph arXiv:1803.08443v1 The dynamics of a system that is initially correlated with an environment is almost always non-Markovian. Hence, it is important to characterise such initial correlations experimentally and witness them in physically realistic settings. One such setting is weak-field phase control, where chemical reactions are sought to be controlled by the phase of shaped weak laser pulses. In this manuscript, we show how weak quantum controllability can be combined with quantum preparations to witness initial correlations between the system and the environment. Furthermore, we show how weak-field phase control can be applied to witness when the quantum regression formula does not apply.
- Mar 23 2018 math.MG arXiv:1803.08224v1 We study a new construction of bodies from a given convex body in $\mathbb{R}^{n}$ which are isomorphic to (weighted) floating bodies. We establish several properties of this new construction, including its relation to $p$-affine surface areas.
- Mar 23 2018 cs.CV arXiv:1803.08496v1 A new passive approach called Generalized Scene Reconstruction (GSR) enables "generalized scenes" to be effectively reconstructed. Generalized scenes are defined to be "boundless" spaces that include non-Lambertian, partially transmissive, textureless and finely-structured matter. A new data structure called a plenoptic octree is introduced to enable efficient (database-like) light and matter field reconstruction in devices such as mobile phones, augmented reality (AR) glasses and drones. To satisfy threshold requirements for GSR accuracy, scenes are represented as systems of partially polarized light, radiometrically interacting with matter. To demonstrate GSR, a prototype imaging polarimeter is used to reconstruct (in generalized light fields) highly reflective, hail-damaged automobile body panels. Follow-on GSR experiments are described.
- We present a method for generating colored 3D shapes from natural language. To this end, we first learn joint embeddings of freeform text descriptions and colored 3D shapes. Our model combines and extends learning by association and metric learning approaches to learn implicit cross-modal connections, and produces a joint representation that captures the many-to-many relations between language and physical properties of 3D shapes such as color and shape. To evaluate our approach, we collect a large dataset of natural language descriptions for physical 3D objects in the ShapeNet dataset. With this learned joint embedding we demonstrate text-to-shape retrieval that outperforms baseline approaches. Using our embeddings with a novel conditional Wasserstein GAN framework, we generate colored 3D shapes from text. Our method is the first to connect natural language text with realistic 3D objects exhibiting rich variations in color, texture, and shape detail. See video at https://youtu.be/zraPvRdl13Q
- Mar 23 2018 cs.CL arXiv:1803.08493v1 This paper introduces a simple and explicit measure of word importance in a global context, including very small contexts (10+ sentences). After generating a word-vector space containing both 2-gram clauses and single tokens, it became clear that more contextually significant words disproportionately define clause meanings. Using this simple relationship in a weighted bag-of-words sentence embedding model results in sentence vectors that outperform the state-of-the-art for subjectivity/objectivity analysis, as well as paraphrase detection, and fall within those produced by state-of-the-art models for six other transfer learning tests. The metric was then extended to a sentence/document summarizer, an improved (and context-aware) cosine distance and a simple document stop word identifier. The sigmoid-global context weighted bag of words is presented as a new baseline for sentence embeddings.
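The exact importance metric is specific to the paper, but the "sigmoid-weighted bag of words" idea can be sketched generically. In the sketch below, the `importance` scores and the scale/shift parameters `a` and `b` are placeholders, not the paper's values.

```python
import numpy as np

def sentence_embedding(tokens, vectors, importance, a=1.0, b=0.0):
    # Weighted bag-of-words: each word vector is scaled by a sigmoid of its
    # corpus-level importance score, so contextually significant words
    # contribute more to the sentence vector. Unknown words get score 0.
    weights = np.array([1.0 / (1.0 + np.exp(-(a * importance.get(t, 0.0) + b)))
                        for t in tokens])
    vecs = np.stack([vectors[t] for t in tokens])
    return (weights[:, None] * vecs).mean(axis=0)
```

With all-zero importance scores, every word gets weight 0.5 and the result reduces to a scaled plain average, which is a convenient sanity check.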
- Mar 23 2018 cs.CV arXiv:1803.08489v1 The main challenge in applying state-of-the-art deep learning methods to predict image quality in-the-wild is the relatively small size of existing quality-scored datasets. The reason for the lack of larger datasets is the massive resources required in generating diverse and publishable content. We present a new systematic and scalable approach to create large-scale, authentic and diverse image datasets for Image Quality Assessment (IQA). We show how we built an IQA database, KonIQ-10k, consisting of 10,073 images, on which we performed very large scale crowdsourcing experiments in order to obtain reliable quality ratings from 1,467 crowd workers (1.2 million ratings). We argue for its ecological validity by analyzing the diversity of the dataset, by comparing it to state-of-the-art IQA databases, and by checking the reliability of our user studies.
- Mar 23 2018 cs.RO arXiv:1803.08478v1 This paper presents a versatile robotic system for sewing 3D structured objects. Leveraging a customized robotic sewing device and closed-loop visual servoing control, an all-in-one solution for sewing personalized stent grafts is demonstrated. Stitch size planning and automatic knot tying are proposed as the two key functions of the system. By using effective stitch size planning, sub-millimetre sewing accuracy is achieved for stitch sizes ranging from 2mm to 5mm. In addition, a thread manipulator for thread management and tension control is also proposed to perform successive knot tying to secure each stitch. Detailed laboratory experiments have been performed to assess the proposed instruments and allied algorithms. The proposed framework can be generalised to a wide range of applications including 3D industrial sewing, as well as transferred to other clinical areas such as surgical suturing.
- Word Sense Induction (WSI) is the task of automatically inducing word senses from corpora. The WSI task was first proposed to overcome the limitations of the manually annotated corpora required by word sense disambiguation systems. Even though several works have been proposed to induce word senses, existing systems are still very limited in the sense that they make use of structured, domain-specific knowledge sources. In this paper, we devise a method that leverages recent findings in word embeddings research to generate context embeddings, which are embeddings containing information about the semantic context of a word. In order to induce senses, we model the set of ambiguous words as a complex network. In the generated network, two instances (nodes) are connected if their respective context embeddings are similar. Upon using well-established community detection methods to cluster the obtained context embeddings, we found that the proposed method yields excellent performance on the WSI task. Our method outperformed competing algorithms and baselines in a completely unsupervised manner and without the need for any additional structured knowledge source.
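As a toy version of the pipeline above, one can link occurrences whose context embeddings have cosine similarity above a threshold and read off connected components as induced senses. The paper uses proper community detection methods; connected components are a simplified stand-in, and the threshold value is an assumption.

```python
import numpy as np

def induce_senses(context_embeddings, threshold=0.8):
    # Build a graph linking occurrences whose context embeddings have
    # cosine similarity >= threshold, then label each connected component
    # as one induced sense. Returns one sense label per occurrence.
    x = np.asarray(context_embeddings, dtype=float)
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    adj = (x @ x.T) >= threshold
    n = len(x)
    labels = [-1] * n
    sense = 0
    for i in range(n):
        if labels[i] != -1:
            continue
        stack = [i]          # depth-first traversal of one component
        while stack:
            j = stack.pop()
            if labels[j] != -1:
                continue
            labels[j] = sense
            stack.extend(k for k in range(n) if adj[j, k] and labels[k] == -1)
        sense += 1
    return labels
```

Two tight clusters of context vectors then come out with two distinct sense labels, mimicking an ambiguous word with two senses.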
- As more aspects of social interaction are digitally recorded, there is a growing need to develop privacy-preserving data analysis methods. Social scientists will be more likely to adopt these methods if doing so entails minimal change to their current methodology. Toward that end, we present a general and modular method for privatizing Bayesian inference for Poisson factorization, a broad class of models that contains some of the most widely used models in the social sciences. Our method satisfies local differential privacy, which ensures that no single centralized server need ever store the non-privatized data. To formulate our local-privacy guarantees, we introduce and focus on limited-precision local privacy---the local privacy analog of limited-precision differential privacy (Flood et al., 2013). We present two case studies, one involving social networks and one involving text corpora, that test our method's ability to form the posterior distribution over latent variables under different levels of noise, and demonstrate our method's utility over a naïve approach, wherein inference proceeds as usual, treating the privatized data as if it were not privatized.
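A standard local-privacy primitive for count data of the kind Poisson factorization consumes is the two-sided geometric mechanism, a discrete analog of the Laplace mechanism. The sketch below is a generic illustration of that primitive, not the paper's exact mechanism, and the parameterization via `alpha` is an assumption.

```python
import numpy as np

def privatize_counts(counts, alpha, rng):
    # Two-sided geometric mechanism: add integer noise with P(k) proportional
    # to alpha^{|k|} (0 < alpha < 1). The difference of two i.i.d. geometric
    # variables (shifted to start at 0) has exactly this distribution, so the
    # noisy counts stay integer-valued, matching count data.
    g1 = rng.geometric(1 - alpha, size=len(counts)) - 1
    g2 = rng.geometric(1 - alpha, size=len(counts)) - 1
    return np.asarray(counts) + g1 - g2
```

Smaller `alpha` means less noise and weaker privacy; the noise has mean zero, so aggregate statistics remain roughly unbiased.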
- Mar 23 2018 math.AP arXiv:1803.08470v1 We study the motion of smooth, closed, strictly convex hypersurfaces in $\mathbb{R}^{n+1}$ expanding in the direction of their normal vector field with speed depending on the $k$th elementary symmetric polynomial of the principal radii of curvature $\sigma_k$ and support function $h$. A homothetic self-similar solution to the flow that we will consider in this paper, if it exists, is a solution of the well-known $L_p$-Christoffel-Minkowski problem $\varphi h^{1-p}\sigma_k=c$. Here $\varphi$ is a preassigned positive smooth function defined on the unit sphere, and $c$ is a positive constant. For $1\leq k\leq n-1, p\geq k+1$, assuming the spherical Hessian of $\varphi^{\frac{1}{p+k-1}}$ is positive definite, we prove the $C^{\infty}$ convergence of the normalized flow to a homothetic self-similar solution. One of the highlights of our arguments is that we do not need the constant rank theorem/deformation lemma of Guan-Ma, and thus we give a partial answer to a question raised in Guan-Xia. Moreover, for $k=n, p\geq n+1$, we prove the $C^{\infty}$ convergence of the normalized flow to a homothetic self-similar solution without imposing any further condition on $\varphi.$ In the final section of the paper, for $1\leq k<n$, we give an example in which the spherical Hessian of $\varphi^{\frac{1}{p+k-1}}$ is negative definite at some point and the solution to the flow loses its smoothness.
- Mar 23 2018 cs.CV arXiv:1803.08467v1 We introduce BranchGAN, a novel training method that enables unconditioned generative adversarial networks (GANs) to learn image manifolds at multiple scales. What is unique about BranchGAN is that it is trained in multiple branches, progressively covering both the breadth and depth of the network, as resolutions of the training images increase to reveal finer-scale features. Specifically, each noise vector, as input to the generator network, is explicitly split into several sub-vectors, each corresponding to and trained to learn image representations at a particular scale. During training, we progressively "de-freeze" the sub-vectors, one at a time, as a new set of higher-resolution images is employed for training and more network layers are added. A consequence of such an explicit sub-vector designation is that we can directly manipulate and even combine latent (sub-vector) codes that are associated with specific feature scales. Experiments demonstrate the effectiveness of our training method in multi-scale, disentangled learning of image manifolds and synthesis, without any extra labels and without compromising quality of the synthesized high-resolution images. We further demonstrate two new applications enabled by BranchGAN.
- Mar 23 2018 cs.CL arXiv:1803.08463v1 In this report, we describe the named-entity recognition system we entered in the VLSP 2018 evaluation campaign. We formalize the task as a sequence labeling problem using the BIO encoding scheme. We apply a feature-based model which combines word, word-shape, Brown-cluster-based, and word-embedding-based features. We compare several methods for dealing with nested entities in the dataset, and show that combining the tags of entities at all levels when training a sequence labeling model (the joint-tag model) improves the accuracy of nested named-entity recognition.
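The BIO encoding and the joint-tag combination of nesting levels can be sketched as follows. The `+`-joining convention for combining level tags is an illustrative assumption; the report may use a different separator.

```python
def bio_encode(tokens, entities):
    # entities: list of (start, end, label) token spans, end exclusive.
    # B- marks the first token of an entity, I- the rest, O everything else.
    tags = ["O"] * len(tokens)
    for start, end, label in entities:
        tags[start] = "B-" + label
        for i in range(start + 1, end):
            tags[i] = "I-" + label
    return tags

def joint_tags(level_tags):
    # Combine the BIO tag sequences of all nesting levels into one joint
    # tag per token, so a single sequence labeler can predict nested entities.
    return ["+".join(ts) for ts in zip(*level_tags)]
```

A token inside both an outer and an inner entity thus receives one composite label such as `B-ORG+B-PER`, which the tagger treats as an ordinary class.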
- Mar 23 2018 cs.CV arXiv:1803.08450v1 Deep learning revolutionized data science, and recently its popularity has grown exponentially, as has the number of papers employing deep networks. Vision tasks such as human pose estimation did not escape this methodological change. The large number of deep architectures has led to a plethora of methods that are evaluated under different experimental protocols. Moreover, small changes in the architecture of the network, or in the data pre-processing procedure, together with the stochastic nature of the optimization methods, lead to notably different results, making it extremely difficult to sift out methods that significantly outperform others. Therefore, when proposing regression algorithms, practitioners proceed by trial-and-error. This situation motivated the current study, in which we perform a systematic evaluation and a statistical analysis of the performance of vanilla deep regression -- short for convolutional neural networks with a linear regression top layer. To the best of our knowledge, this is the first comprehensive analysis of deep regression techniques. We perform experiments on three vision problems and report confidence intervals for the median performance as well as the statistical significance of the results, if any. Surprisingly, the variability due to different data pre-processing procedures generally eclipses the variability due to modifications in the network architecture.
- Mar 23 2018 cs.NI arXiv:1803.08448v1 Edge computing caters to a wide range of use cases from latency sensitive to bandwidth constrained applications. However, the exact specifications of the edge that give the most benefit for each type of application are still unclear. We investigate the concrete conditions when the edge is feasible, i.e., when users observe performance gains from the edge while costs remain low for the providers, for an application that requires both low latency and high bandwidth: video analytics.
- Mar 23 2018 cond-mat.str-el cond-mat.stat-mech arXiv:1803.08445v1 We show how to accurately study 2D quantum critical phenomena using infinite projected entangled-pair states (iPEPS). We identify the presence of a finite correlation length in the optimal iPEPS approximation to Lorentz-invariant critical states, which we use to perform a finite correlation-length scaling (FCLS) analysis to determine critical exponents. This is analogous to the one-dimensional (1D) finite entanglement scaling with infinite matrix product states. We provide arguments for why this approach is also valid in 2D by identifying a class of states that, despite obeying the area law of entanglement, seems hard to describe with iPEPS. We apply these ideas to interacting spinless fermions on a honeycomb lattice and obtain critical exponents which are in agreement with quantum Monte Carlo results. Furthermore, we introduce a new scheme to locate the critical point without the need to compute higher-order moments of the order parameter. Finally, we also show how to obtain an improved estimate of the order parameter in gapless systems, with the 2D Heisenberg model as an example.
- Mar 23 2018 cs.CV arXiv:1803.08435v1 Deep generative models have shown success in automatically synthesizing missing image regions using surrounding context. However, users cannot directly decide what content to synthesize with such approaches. We propose an end-to-end network for image inpainting that uses a different image to guide the synthesis of new content to fill the hole. A key challenge addressed by our approach is synthesizing new content in regions where the guidance image and the context of the original image are inconsistent. We conduct four studies demonstrating that our approach yields more realistic image inpainting results than seven baselines.
- Mar 23 2018 cs.DC arXiv:1803.08426v1 Volunteer computing is currently successfully used to make hundreds of thousands of machines available free-of-charge to projects of general interest. However, the effort and cost involved in participating in and launching such projects may explain why only a few high-profile projects use it and why only 0.1% of Internet users participate in them. In this paper we present Pando, a new web-based volunteer computing system designed to be easy to deploy and which does not require dedicated servers. The tool uses new demand-driven stream abstractions and a WebRTC overlay based on a fat tree for connecting volunteers. Together, the stream abstractions and the fat-tree overlay enable a thousand browser tabs running on multiple machines to be used for computation, enough to tap into all machines bought as part of previous hardware investments made by a small- or medium-sized company or a university department. Moreover, the approach is based on a simple programming model that should be easy for JavaScript programmers to use directly and easy for compiler writers to target. We provide a command-line version of the tool, as well as all scripts and procedures necessary to replicate the experiments we performed on the Grid5000 testbed.
- Mar 23 2018 math.CV arXiv:1803.08422v1 The paper investigates the complex gradient descent method (CGD) for the best rational approximation of a given order to a function in the Hardy space on the unit disk. This is equivalent to finding the best Blaschke form with free poles. The adaptive Fourier decomposition (AFD) and cyclic AFD methods in the literature are based on a grid search technique, so their precision is limited by the grid spacing. The proposed method employs a fast search algorithm to find an initial guess for CGD, and then finds the target poles by gradient descent optimization. Hence, it can reach higher precision at lower computational cost. Its validity and effectiveness are confirmed by several examples.
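As a toy illustration of CGD's refinement stage, gradient descent over complex variables can be written generically. The learning rate and the quadratic test cost below are illustrative only, not the paper's objective over Blaschke-form poles.

```python
import numpy as np

def complex_gd(f_grad, z0, lr=0.1, n_iter=200):
    # Wirtinger-style gradient descent on a real-valued cost of complex
    # variables: step against the conjugate gradient df/dz-bar, which for
    # real costs points in the direction of steepest ascent.
    z = np.asarray(z0, dtype=complex)
    for _ in range(n_iter):
        z = z - lr * f_grad(z)
    return z
```

For the cost $|z - c|^2$ the conjugate gradient is $z - c$, so the iterates contract geometrically toward $c$; in the paper's setting, the fast search supplies `z0` close enough to the best poles for such descent to converge to them.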
- The sizes of compressed images depend on their spatial resolution (number of pixels) and on their color resolution (number of color quantization levels). We introduce DaltonQuant, a new color quantization technique for image compression that cloud services can apply to images destined for a specific user with known color vision deficiencies. DaltonQuant improves compression in a user-specific but reversible manner, thereby improving a user's network bandwidth and data storage efficiency. DaltonQuant quantizes image data to account for user-specific color perception anomalies, using a new method for incremental color quantization based on a large corpus of color vision acuity data obtained from a popular mobile game. Servers that host images can revert DaltonQuant's image requantization and compression when those images must be transmitted to a different user, making the technique practical to deploy on a large scale. We evaluate DaltonQuant's compression performance on the Kodak PC reference image set and show that it improves compression by an additional 22%-29% over the state-of-the-art compressors TinyPNG and pngquant.
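The reversibility claim rests on the server retaining enough information to undo the requantization for other users. A minimal sketch of that idea on palette indices (the data layout and the sparse-undo record are assumptions, for illustration) is:

```python
def requantize(pixels, merge_map):
    # Replace palette indices per `merge_map` (colors the target user cannot
    # distinguish get merged), and keep a sparse record of the originals so
    # the server can undo the transformation for a different user.
    out, undo = [], {}
    for i, p in enumerate(pixels):
        q = merge_map.get(p, p)
        if q != p:
            undo[i] = p
        out.append(q)
    return out, undo

def revert(pixels, undo):
    # Restore the original palette indices from the sparse undo record.
    return [undo.get(i, p) for i, p in enumerate(pixels)]
```

Merging indistinguishable colors shrinks the palette, which is what improves the downstream compression, while the undo record keeps the operation lossless from the server's point of view.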
- Conversational agents have become ubiquitous, ranging from goal-oriented systems for helping with reservations to chit-chat models found in modern virtual assistants. In this survey paper, we explore this fascinating field. We look at some of the pioneering work that defined the field and gradually move to the current state-of-the-art models. We look at statistical, neural, generative-adversarial-network-based and reinforcement-learning-based approaches and how they evolved. Along the way, we discuss various challenges that the field faces: the lack of context in utterances, the absence of a good quantitative metric for comparing models, and the lack of trust in agents because they do not have a consistent persona, among others. We structure this paper in a way that answers these pertinent questions and discusses competing approaches to solve them.
- Parametric approaches to learning, such as deep learning (DL), are highly popular in nonlinear regression, in spite of the extreme difficulty of training them as their complexity increases (e.g., the number of layers in DL). In this paper, we present an alternative semi-parametric framework which foregoes the ordinarily required feedback, by introducing the novel idea of geometric regularization. We show that certain deep learning techniques, such as the residual network (ResNet) architecture, are closely related to our approach. Hence, our technique can be used to analyze these types of deep learning. Moreover, we present preliminary results which confirm that our approach can be easily trained to obtain complex structures.
- Mar 23 2018 cs.GT arXiv:1803.08415v1 Vehicle-to-Infrastructure (V2I) communications are increasingly supporting highway operations such as electronic toll collection, carpooling, and vehicle platooning. In this paper we study the incentives of strategic misbehavior by individual vehicles who can exploit the security vulnerabilities in V2I communications and impact the highway operations. We consider a V2I-enabled highway segment facing two classes of vehicles (agent populations), each with an authorized access to one server (subset of lanes). Vehicles are strategic in that they can misreport their class (type) to the system operator and get unauthorized access to the server dedicated to the other class. This misbehavior causes a congestion externality on the compliant vehicles, and thus, needs to be deterred. We focus on an environment where the operator is able to inspect the vehicles for misbehavior based on their reported types. The inspection is costly and successful detection incurs a fine on the misbehaving vehicle. We formulate a signaling game to study the strategic interaction between the vehicle classes and the operator. Our equilibrium analysis provides conditions on the cost parameters that govern the vehicles' incentive to misbehave, and determine the operator's optimal inspection strategy.
- Mar 23 2018 cs.CV arXiv:1803.08414v1In this paper, we adapt the Faster-RCNN framework for the detection of underground buried objects (i.e. hyperbola reflections) in B-scan ground penetrating radar (GPR) images. Due to the lack of real data for training, we propose to incorporate additional simulated radargrams generated from different configurations using the gprMax toolbox. Our designed CNN is first pre-trained on the grayscale Cifar-10 database. Then, the Faster-RCNN framework based on the pre-trained CNN is trained and fine-tuned on both real and simulated GPR data. Preliminary detection results show that the proposed technique can provide significant improvements compared to classical computer vision methods, and hence is quite promising for dealing with this specific kind of GPR data, even with few training samples.
- Mar 23 2018 cs.CV arXiv:1803.08412v1Inspired by group-based sparse coding, the recently proposed group sparsity residual (GSR) scheme has demonstrated superior performance in image processing. However, one challenge in GSR is estimating the residual using a proper reference for the group-based sparse coding (GSC), which should be as close to the truth as possible. Previous research utilized estimates from other algorithms (e.g., GMM or BM3D), which are either inaccurate or too slow. In this paper, we propose to use Non-Local Samples (NLS) as the reference in the GSR regime for image denoising, hence the name GSR-NLS. More specifically, we first obtain a good estimate of the group sparse coefficients via image nonlocal self-similarity, and then solve the GSR model with an effective iterative shrinkage algorithm. Experimental results demonstrate that the proposed GSR-NLS not only outperforms many state-of-the-art methods, but also offers a competitive advantage in speed.
- Mar 23 2018 cs.CV arXiv:1803.08410v1In laparoscopic surgery, image quality can be severely degraded by surgical smoke, which not only introduces error into the image processing (used in image-guided surgery), but also reduces visibility for the surgeons. In this paper, we propose to enhance laparoscopic images by decomposing them into an unwanted smoke component and an enhanced component using a variational approach. The proposed method relies on the observation that smoke has low contrast and low inter-channel differences. A cost function is defined based on this prior knowledge and is solved using an augmented Lagrangian method. The obtained unwanted smoke component is then subtracted from the original degraded image, resulting in the enhanced image. The obtained quantitative scores in terms of the FADE, JNBM and RE metrics show that our proposed method performs rather well. Furthermore, qualitative visual inspection of the results shows that it removes smoke effectively from the laparoscopic images.
- Mar 23 2018 cs.CL arXiv:1803.08409v1Machine Translation (MT) is being deployed for a range of use-cases by millions of people on a daily basis. There should, therefore, be no doubt as to the utility of MT. However, not everyone is convinced that MT can be useful, especially as a productivity enhancer for human translators. In this chapter, I address this issue, describing how MT is currently deployed, how its output is evaluated and how this could be enhanced, especially as MT quality itself improves. Central to these issues is the acceptance that there is no longer a single 'gold standard' measure of quality, such that the situation in which MT is deployed needs to be borne in mind, especially with respect to the expected 'shelf-life' of the translation itself.
- Mar 23 2018 cs.CV arXiv:1803.08407v1We introduce a novel RGB-D patch descriptor designed for detecting coplanar surfaces in SLAM reconstruction. The core of our method is a deep convolutional neural net that takes in RGB, depth, and normal information of a planar patch in an image and outputs a descriptor that can be used to find coplanar patches from other images. We train the network on 10 million triplets of coplanar and non-coplanar patches, and evaluate on a new coplanarity benchmark created from commodity RGB-D scans. Experiments show that our learned descriptor outperforms alternatives extended for this new task by a significant margin. In addition, we demonstrate the benefits of coplanarity matching in a robust RGB-D reconstruction formulation. We find that coplanarity constraints detected with our method are sufficient to get reconstruction results comparable to state-of-the-art frameworks on most scenes, but outperform other methods on standard benchmarks when combined with a simple keypoint method.
- We propose a new end-to-end single image dehazing method, called Densely Connected Pyramid Dehazing Network (DCPDN), which can jointly learn the transmission map, atmospheric light and dehazing all together. The end-to-end learning is achieved by directly embedding the atmospheric scattering model into the network, thereby ensuring that the proposed method strictly follows the physics-driven scattering model for dehazing. Inspired by the dense network that can maximize the information flow along features from different levels, we propose a new edge-preserving densely connected encoder-decoder structure with multi-level pyramid pooling module for estimating the transmission map. This network is optimized using a newly introduced edge-preserving loss function. To further incorporate the mutual structural information between the estimated transmission map and the dehazed result, we propose a joint-discriminator based on generative adversarial network framework to decide whether the corresponding dehazed image and the estimated transmission map are real or fake. An ablation study is conducted to demonstrate the effectiveness of each module evaluated at both estimated transmission map and dehazed result. Extensive experiments demonstrate that the proposed method achieves significant improvements over the state-of-the-art methods. Code will be made available at: https://github.com/hezhangsprinter
- Mar 23 2018 cs.HC arXiv:1803.08395v1Research on science fiction (sci-fi) in scientific publications has indicated the usage of sci-fi stories, movies or shows to inspire novel Human-Computer Interaction (HCI) research. Yet, to date, no studies have analysed sci-fi in a top-ranked computer science conference. For that reason, we examine the CHI main track for the presence and nature of sci-fi referrals in relationship to HCI research. We search for six sci-fi terms in a dataset of 5812 CHI main proceedings and code the context of 175 sci-fi referrals in 83 papers indexed in the CHI main track. In our results, we categorize these papers into five contemporary HCI research themes wherein sci-fi and HCI interconnect: 1) Theoretical Design Research; 2) New Interactions; 3) Human-Body Modification or Extension; 4) Human-Robot Interaction and Artificial Intelligence; and 5) Visions of Computing and HCI. In conclusion, we discuss results and implications located in the promising arena of sci-fi and HCI research.
- Mar 23 2018 cs.CV arXiv:1803.08394v1Iris recognition is used in many applications around the world, with enrollment sizes as large as over one billion persons in India's Aadhaar program. Large enrollment sizes can require special optimizations in order to achieve fast database searches. One such optimization that has been used in some operational scenarios is 1:First search. In this approach, instead of scanning the entire database, the search is terminated when the first sufficiently good match is found. This saves time, but ignores potentially better matches that may exist in the unexamined portion of the enrollments. At least one prominent and successful border-crossing program used this approach for nearly a decade, in order to allow users a fast "token-free" search. Our work investigates the search accuracy of 1:First and compares it to the traditional 1:N search. Several different scenarios are considered, emulating real environments as closely as possible: a range of enrollment sizes, closed- and open-set configurations, two iris matchers, and different permutations of the galleries. Results confirm the expected accuracy degradation using 1:First search, and also allow us to identify acceptable working parameters where significant search time reduction is achieved, while maintaining accuracy similar to 1:N search.
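The difference between the two search strategies described in this abstract can be sketched in a few lines. This is an illustrative toy (the function names, the Hamming-fraction matcher, and the threshold are assumptions, not the paper's matchers), showing why 1:First is faster but can return a worse match than 1:N:

```python
def search_1n(probe, gallery, dist):
    """Exhaustive 1:N search: scan the whole gallery, return the best match."""
    best_id, best_d = None, float("inf")
    for gid, template in gallery.items():
        d = dist(probe, template)
        if d < best_d:
            best_id, best_d = gid, d
    return best_id, best_d

def search_1first(probe, gallery, dist, threshold):
    """1:First search: stop at the first sufficiently good match.
    Faster on average, but may miss a better match later in the scan order,
    which is the accuracy degradation the paper measures."""
    for gid, template in gallery.items():
        d = dist(probe, template)
        if d <= threshold:
            return gid, d
    return None, None
```

If an acceptable-but-imperfect template happens to precede the true mate in the scan order, 1:First commits to it, while 1:N keeps scanning and finds the mate.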
- Mar 23 2018 stat.ME arXiv:1803.08393v1As the frontiers of applied statistics progress through increasingly complex experiments we must exploit increasingly sophisticated inferential models to analyze the observations we make. In order to avoid misleading or outright erroneous inferences we then have to be increasingly diligent in scrutinizing the consequences of those modeling assumptions. Fortunately model-based methods of statistical inference naturally define procedures for quantifying the scope of inferential outcomes and calibrating corresponding decision making processes. In this paper I review the construction and implementation of the particular procedures that arise within frequentist and Bayesian methodologies.
- Mar 23 2018 cs.HC arXiv:1803.08383v1Head-Up Displays (HUDs) were originally designed to present the main sensor data at the pilot's usual viewpoints during aircraft missions, because placing instrument information in the forward field of view enhances the pilot's ability to use both instrument and environmental information simultaneously. The first civilian motor vehicle with a monochrome HUD was released in 1988 by General Motors, as a technological improvement over the Head-Down Display (HDD) interface commonly used in the automobile industry. The HUD reduces the number and duration of the driver's sight deviations from the road by projecting the required information directly into the driver's line of vision. There are many studies on ways of presenting the information: standard one-earpiece presentation, three-dimensional audio presentation, and visual-only or audiovisual presentation. Results have shown that with a 3D auditory display, targets are acquired approximately 2.2 seconds faster than with a one-earpiece presentation. Nevertheless, a disadvantage arises when the driver's attention unconsciously shifts away from the road and focuses on processing the information presented by the HUD. For this reason, the timing, the manner, and the channel used to present information on a HUD are important. A solution is a context-aware multimodal proactive recommender system that features personalized content, combined with the use of car sensors to determine when the information has to be presented.
- Mar 23 2018 astro-ph.GA arXiv:1803.08382v1In a non-expanding universe surface brightness is independent of distance or redshift, while in an expanding universe it decreases rapidly with both. Similarly, for objects of the same luminosity, the angular radius of an object in a non-expanding universe declines with redshift, while in an expanding universe this radius increases for redshifts z>1.25. The author and colleagues have previously shown that data for the surface brightness of disk galaxies are compatible with a static universe with redshift linearly proportional to distance at all z (SEU hypothesis). In this paper we examine the more conventional hypothesis that the universe is expanding, but that the actual radii of galaxies of a given luminosity increase with time (decrease with z), as others have proposed. We show that the radii data for both disk and elliptical galaxies are incompatible with any of the published size-evolution predictions based on an expanding universe. We find that all the physical mechanisms proposed for size evolution, such as galaxy mergers, lead to predictions that are in quantitative contradiction with either the radius data or other data sets, such as the observed rate of galaxy mergers. In addition, we find that when the effect of telescope resolution is taken into account, the r-z relationships for disk and elliptical galaxies are identical. Both are excellently fit by SEU predictions. An overall comparison of cosmological models requires examining all available data-sets, but for this data-set there is a clear contradiction of predictions based on an expanding universe hypothesis.
- Mar 23 2018 astro-ph.HE arXiv:1803.08376v1Although several theories for the origin of cosmic rays in the region between the spectral `knee' and `ankle' exist, this problem is still unsolved. A variety of observations suggest that the transition from Galactic to extragalactic sources occurs in this energy range. In this work we examine whether a Galactic wind which eventually forms a termination shock far outside the Galactic plane can contribute as a possible source to the observed flux in the region of interest. Previous work by Bustard et al. (2017) estimated that particles can be accelerated to energies above the `knee', up to $R_\mathrm{max} = 10^{16}$ eV, for parameters drawn from a model of a Milky Way wind (Everett et al. 2017). A remaining question is whether the accelerated cosmic rays can propagate back into the Galaxy. To answer this crucial question, we simulate the propagation of the cosmic rays using the low-energy extension of the CRPropa framework, based on the solution of the transport equation via stochastic differential equations. The setup includes all relevant processes, including three-dimensional anisotropic spatial diffusion, advection, and the corresponding adiabatic cooling. We find that, assuming realistic parameters for the shock evolution, a possible Galactic termination shock can contribute significantly to the energy budget in the `knee' region and above. We estimate the resulting neutrino fluxes and find them to be below measurements from IceCube and limits by KM3NeT.
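The transport-equation-via-SDE approach the abstract mentions can be illustrated with a minimal Euler-Maruyama sketch. This is not CRPropa and not the paper's 3D anisotropic setup; it is a hypothetical 1D toy (all parameter names and values are assumptions) in which a pseudo-particle diffuses with coefficient `kappa` while an outward wind `u_wind` advects it away from the plane:

```python
import math
import random

def propagate(z0, t_max, dt, kappa, u_wind, rng):
    """One pseudo-particle trajectory: the 1D diffusion-advection SDE
    dz = u dt + sqrt(2 kappa) dW solved with Euler-Maruyama steps.
    z is the height above the Galactic plane."""
    z, t = z0, 0.0
    while t < t_max:
        z += u_wind * dt + math.sqrt(2.0 * kappa * dt) * rng.gauss(0.0, 1.0)
        t += dt
    return z

def back_propagation_fraction(n, z0, z_gal, **kw):
    """Fraction of n pseudo-particles injected at the shock height z0
    that end up below z_gal, i.e. diffuse back toward the Galaxy."""
    rng = random.Random(0)  # fixed seed for reproducibility
    hits = sum(1 for _ in range(n) if propagate(z0, rng=rng, **kw) < z_gal)
    return hits / n
```

Averaging many such trajectories approximates the solution of the corresponding transport equation; the back-propagation fraction is the kind of quantity that decides whether shock-accelerated particles can return to the Galaxy.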
- We introduce the use of rectified linear units (ReLU) as the classification function in a deep neural network (DNN). Conventionally, ReLU is used as an activation function in DNNs, with the Softmax function as their classification function. However, there have been several studies on using a classification function other than Softmax, and this study is an addition to those. We accomplish this by taking the activation of the penultimate layer $h_{n - 1}$ in a neural network, then multiplying it by weight parameters $\theta$ to get the raw scores $o_{i}$. Afterwards, we threshold the raw scores $o_{i}$ at $0$, i.e. $f(o) = \max(0, o_{i})$, where $f(o)$ is the ReLU function. We provide class predictions $\hat{y}$ via the argmax function, i.e. $\hat{y} = \arg\max_i f(o_i)$.
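The classification head described in the abstract is simple enough to sketch directly. A minimal NumPy version (the function name and the shapes are assumptions; only the math follows the abstract):

```python
import numpy as np

def relu_classify(h_penult, theta):
    """ReLU-as-classifier head: raw scores o = h @ theta, thresholded
    by f(o) = max(0, o), class predictions via argmax of f(o)."""
    o = h_penult @ theta            # raw scores o_i
    f = np.maximum(0.0, o)          # ReLU in place of softmax
    return np.argmax(f, axis=-1)    # y_hat
```

Note that when the maximal raw score is positive, the argmax of $f(o)$ coincides with the argmax of $o$ itself; when all raw scores are negative, ReLU zeroes them all out and the argmax becomes degenerate, which is one practical difference from a softmax head.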
- Supervised learning frequently boils down to determining hidden and bright parameters in a parameterized hypothesis space based on finite input-output samples. The hidden parameters determine the attributes of the hidden predictors, i.e. the nonlinear mechanism of an estimator, while the bright parameters characterize how the hidden predictors are linearly combined, i.e. the linear mechanism. In the traditional learning paradigm, hidden and bright parameters are not distinguished and are trained simultaneously in one learning process. Such one-stage learning (OSL) is amenable to theoretical analysis but suffers from a high computational burden. To overcome this difficulty, a two-stage learning (TSL) scheme, featuring learning through deterministic assignment of hidden parameters (LtDaHP), was proposed, in which the hidden parameters are generated deterministically using minimal Riesz energy points on a sphere and equally spaced points in an interval. We theoretically show that, with such a deterministic assignment of hidden parameters, LtDaHP with a neural network realization achieves almost the same generalization performance as OSL. We also present a series of simulations and application examples to demonstrate the advantages of LtDaHP.
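The two-stage idea can be sketched concretely: fix the hidden parameters deterministically (here, equally spaced knots in an interval, one of the two assignments the abstract names), then solve for the bright linear parameters in closed form. This is an illustrative 1D toy, not the paper's construction; the ReLU ridge features, function names, and default values are assumptions:

```python
import numpy as np

def fit_tsl(x, y, n_hidden=50, interval=(-np.pi, np.pi)):
    """Stage 1: hidden parameters (knot positions) are equally spaced
    in the interval and never trained. Stage 2: bright (linear)
    parameters are obtained by ordinary least squares."""
    centers = np.linspace(interval[0], interval[1], n_hidden)  # hidden, fixed
    phi = np.maximum(0.0, x[:, None] - centers[None, :])       # ReLU features
    w, *_ = np.linalg.lstsq(phi, y, rcond=None)                # bright params
    return centers, w

def predict_tsl(x, centers, w):
    return np.maximum(0.0, x[:, None] - centers[None, :]) @ w
```

Because only a linear least-squares problem is solved, training is a single closed-form step rather than an iterative optimization over all parameters, which is the computational advantage of TSL over OSL that the abstract points to.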
- Mar 23 2018 math.DS arXiv:1803.08368v1This is a combined expository and research paper which mainly presents preliminary connections and contrasts between classical complex dynamics and the semigroup dynamics of holomorphic functions. We review some existing results of rational and transcendental dynamics, examine how far these results generalize to holomorphic semigroup dynamics, and discuss what new phenomena occur.
- Deep neural networks are often trained in the over-parametrized regime (i.e. with far more parameters than training examples), and understanding why the training converges to solutions that generalize remains an open problem. Several studies have highlighted the fact that the training procedure, i.e. mini-batch Stochastic Gradient Descent (SGD) leads to solutions that have specific properties in the loss landscape. However, even with plain Gradient Descent (GD) the solutions found in the over-parametrized regime are pretty good and this phenomenon is poorly understood. We propose an analysis of this behavior for feedforward networks with a ReLU activation function under the assumption of small initialization and learning rate and uncover a quantization effect: The weight vectors tend to concentrate at a small number of directions determined by the input data. As a consequence, we show that for given input data there are only finitely many, "simple" functions that can be obtained, independent of the network size. This puts these functions in analogy to linear interpolations (for given input data there are finitely many triangulations, which each determine a function by linear interpolation). We ask whether this analogy extends to the generalization properties - while the usual distribution-independent generalization property does not hold, it could be that for e.g. smooth functions with bounded second derivative an approximation property holds which could "explain" generalization of networks (of unbounded size) to unseen inputs.
- We consider a Hamiltonian describing three quantum particles in dimension one interacting through two-body short-range potentials. We prove that, as a suitable scale parameter in the potential terms goes to zero, such Hamiltonian converges to one with zero-range (also called delta or point) interactions. The convergence is understood in norm resolvent sense. The two-body rescaled potentials are of the form $v^{\varepsilon}_{\sigma}(x_{\sigma})= \varepsilon^{-1} v_{\sigma}(\varepsilon^{-1}x_\sigma )$, where $\sigma = 23, 12, 31$ is an index that runs over all the possible pairings of the three particles, $x_{\sigma}$ is the relative coordinate between two particles, and $\varepsilon$ is the scale parameter. The limiting Hamiltonian is the one formally obtained by replacing the potentials $v_\sigma$ with $\alpha_\sigma \delta_\sigma$, where $\delta_\sigma$ is the Dirac delta-distribution centered on the coincidence hyperplane $x_\sigma=0$ and $\alpha_\sigma = \int_{\mathbb{R}} v_\sigma dx_\sigma$. To prove the convergence of the resolvents we make use of Faddeev's equations.
- Motivated by Supervised Opinion Analysis, we propose a novel framework devoted to Structured Output Learning with Abstention (SOLA). The structure prediction model is able to abstain from predicting some labels in the structured output at a cost chosen by the user in a flexible way. For that purpose, we decompose the problem into the learning of a pair of predictors, one devoted to structured abstention and the other, to structured output prediction. To compare fully labeled training data with predictions potentially containing abstentions, we define a wide class of asymmetric abstention-aware losses. Learning is achieved by surrogate regression in an appropriate feature space while prediction with abstention is performed by solving a new pre-image problem. Thus, SOLA extends recent ideas about Structured Output Prediction via surrogate problems and calibration theory and enjoys statistical guarantees on the resulting excess risk. Instantiated on a hierarchical abstention-aware loss, SOLA is shown to be relevant for fine-grained opinion mining and gives state-of-the-art results on this task. Moreover, the abstention-aware representations can be used to competitively predict user-review ratings based on a sentence-level opinion predictor.