- Apr 24 2017 quant-ph cond-mat.stat-mech arXiv:1704.06291v1One of the main questions of research on quantum many-body systems following unitary out of equilibrium dynamics is to find out how local expectation values equilibrate in time. For non-interacting models, this question is rather well understood. However, the best known bounds for general quantum systems are vastly crude, scaling unfavorable with the system size. Nevertheless, empirical and numerical evidence suggests that for generic interacting many-body systems, generic local observables, and sufficiently well-behaved states, the equilibration time does not depend strongly on the system size, but only the precision with which this occurs does. In this discussion paper, we aim at giving very simple and plausible arguments for why this happens. While our discussion does not yield rigorous results about equilibration time scales, we believe that it helps to clarify the essential underlying mechanisms, the intuition and important figure of merits behind equilibration. We then connect our arguments to common assumptions and numerical results in the field of equilibration and thermalization of closed quantum systems, such as the eigenstate thermalization hypothesis as well as rigorous results on interacting quantum many-body systems. Finally, we complement our discussions with numerical results - both in the case of examples and counter-examples of equilibrating systems.
- Apr 24 2017 quant-ph cond-mat.stat-mech arXiv:1704.06646v1We show that the physical mechanism for the equilibration of closed quantum systems is dephasing, and identify the energy scales that determine the equilibration timescale of a given observable. For realistic physical systems (e.g those with local Hamiltonians), our arguments imply timescales that do not increase with the system size, in contrast to previously known upper bounds. In particular, we show that, for such Hamiltonians, the matrix representation of local observables in the energy basis is banded, and that this property is crucial in order to derive equilibration times that are non-negligible in macroscopic systems. Finally, we give an intuitive interpretation to recent theorems on equilibration time-scales.
- Apr 24 2017 quant-ph arXiv:1704.06475v1The aim of this paper is to provide a quantum counterpart of the well known minimum-distance classifier named Nearest Mean Classifier (NMC). In particular, we refer to the following previous works: i) in Sergioli et al. 2016, we have introduced a detailed quantum version of the NMC, named Quantum Nearest Mean Classifier (QNMC), for two-dimensional problems and we have proposed a generalization to abitrary dimensions; ii) in Sergioli et al. 2017, the n-dimensional problem was analyzed in detail and a particular encoding for arbitrary n-feature vectors into density operators has been presented. In this paper, we introduce a new promizing encoding of arbitrary n-dimensional patterns into density operators, starting from the two-feature encoding provided in the first work. Further, unlike the NMC, the QNMC shows to be not invariant by rescaling the features of each pattern. This property allows us to introduce a free parameter whose variation provides, in some case, an improvement of the QNMC performance. We show experimental results where: i) the NMC and QNMC performances are compared on different datasets; ii) the effects of the non-invariance under uniform rescaling for the QNMC are investigated.
- Apr 24 2017 quant-ph physics.optics arXiv:1704.06309v1Quantum memory, capable of stopping flying photons and storing their quantum coherence, is essential for scalable quantum technologies. A broadband quantum memory operating at room temperature will enable building large-scale quantum systems for real-life applications, for instance, high-speed quantum repeater for long-distance quantum communication and synchronised multi-photon quantum sources for quantum computing and quantum simulation. Albeit advances of pushing bandwidth from narrowband to broadband and storage media from ultra-cold atomic gas to room-temperature atomic vapour, due to either intrinsic high noises or short lifetime, it is still challenging to find a room-temperature broadband quantum memory beyond conceptional demonstration. Here, we present a far off-resonance Duan-Lukin-Cirac-Zoller (DLCZ) protocol and for the first time demonstrate a genuine broadband quantum memory that is free of noise and limitation of time bandwidth product in room-temperature atoms. We observed a reported ever lowest unconditional noise level of 0.0001 and a cross-correlation between Stokes photon and anti-Stokes photon up to 28. A measurement of Cauchy-Schwarz inequality yields a violation of 320 standard deviations, which clearly indicates high-fidelity generation and preservation of non-classical correlation in room-temperature atoms. Without using any population preserving technique, we have reached a time bandwidth product up to 1400. Our results pave the avenue towards quantum memory-enabled applications in high speed at ambient condition, especially for scalable quantum communications and quantum computing.
- Apr 24 2017 cond-mat.dis-nn cond-mat.stat-mech arXiv:1704.06279v1Physical systems differing in their microscopic details often display strikingly similar behaviour when probed at low energies. Those universal properties, largely determining their physical characteristics, are revealed by the powerful renormalization group (RG) procedure, which systematically retains "slow" degrees of freedom and integrates out the rest. However, the important degrees of freedom may be difficult to identify. Here we demonstrate a machine learning (ML) algorithm capable of identifying the relevant degrees of freedom without any prior knowledge about the system. We introduce an artificial neural network based on a model-independent, information-theoretic characterization of a real-space RG procedure, performing this task. We apply the algorithm to classical statistical physics problems in two dimensions.
- Apr 24 2017 math.CO arXiv:1704.06536v2Hadwiger's Conjecture asserts that every $K_t$-minor-free graph has a proper $(t-1)$-colouring. We relax the conclusion in Hadwiger's Conjecture via improper colourings. We prove that every $K_t$-minor-free graph is $(2t-2)$-colourable with monochromatic components of order at most $\lceil{\frac12(t-2)}\rceil$. This result has no more colours and much smaller monochromatic components than all previous results in this direction. We then prove that every $K_t$-minor-free graph is $(t-1)$-colourable with monochromatic degree at most $t-2$. This is the best known degree bound for such a result. Both these theorems are based on a decomposition method of independent interest. We give analogous results for $K_{s,t}$-minor-free graphs, which lead to improved bounds on generalised colouring numbers for these classes. Finally, we prove that graphs containing no $K_t$-immersion are $2$-colourable with bounded monochromatic degree.
- Apr 24 2017 quant-ph arXiv:1704.06516v1We analyze the connection between Bell inequality violations and symmetric extendibility of quantum states. We prove that 2-qubit reduced states of multiqubit symmetric pure states do not violate the Bell-CHSH inequality. We then prove the more general converse that any 2-qubit state that violates the CHSH inequality cannot have a symmetric extension. We extend our analysis to qudits and provide a test for symmetric extendibility of 2-qudit states. We show that if a 2-qudit Bell inequality is monogamous, then any 2-qudit state that violates this inequality does not have a symmetric extension. For the specific case of 2-qutrit states, we use numerical evidence to conjecture that the CGLMP inequality is monogamous. Hence, the violation of the CGLMP inequality by any 2-qutrit state could be a sufficient condition for the non-existence of its symmetric extension.
- We study the problem of approximating the partition function of the ferromagnetic Ising model in graphs and hypergraphs. Our first result is a deterministic approximation scheme (an FPTAS) for the partition function in bounded degree graphs that is valid over the entire range of parameters $\beta$ (the interaction) and $\lambda$ (the external field), except for the case $\vert{\lambda}\vert=1$ (the "zero-field" case). A randomized algorithm (FPRAS) for all graphs, and all $\beta,\lambda$, has long been known. Unlike most other deterministic approximation algorithms for problems in statistical physics and counting, our algorithm does not rely on the "decay of correlations" property. Rather, we exploit and extend machinery developed recently by Barvinok, and Patel and Regts, based on the location of the complex zeros of the partition function, which can be seen as an algorithmic realization of the classical Lee-Yang approach to phase transitions. Our approach extends to the more general setting of the Ising model on hypergraphs of bounded degree and edge size, where no previous algorithms (even randomized) were known for a wide range of parameters. In order to achieve this extension, we establish a tight version of the Lee-Yang theorem for the Ising model on hypergraphs, improving a classical result of Suzuki and Fisher.
- Apr 24 2017 cs.CC arXiv:1704.06314v1We prove that any non-adaptive algorithm that tests whether an unknown Boolean function $f: \{0, 1\}^n\to \{0, 1\}$ is a $k$-junta or $\epsilon$-far from every $k$-junta must make $\widetilde{\Omega}(k^{3/2} / \epsilon)$ many queries for a wide range of parameters $k$ and $\epsilon$. Our result dramatically improves previous lower bounds from [BGSMdW13, STW15], and is essentially optimal given Blais's non-adaptive junta tester from [Blais08], which makes $\widetilde{O}(k^{3/2})/\epsilon$ queries. Combined with the adaptive tester of [Blais09] which makes $O(k\log k + k /\epsilon)$ queries, our result shows that adaptivity enables polynomial savings in query complexity for junta testing.
- Apr 24 2017 quant-ph arXiv:1704.06612v1We analyze the problem of quantum phase estimation where the set of allowed phases forms a discrete $N$ element subset of the whole $[0,2\pi]$ interval, $\varphi_n = 2\pi n/N$, $n=0,\dots N-1$ and study the discrete-to-continuous transition $N\rightarrow\infty$. In case $N \leq D$, where $D$ is the dimension of the probe space, the cost of estimation can be made zero. We show that for the standard phase estimation cost function $C(\varphi,\tilde{\varphi}) = 4\sin^2\frac{\varphi - \tilde{\varphi}}{2}$, the transition between discrete and continuous phase estimation is abrupt and happens already for $N=D+1$. We also analyze the relation between the problem of phase discrimination and estimation by considering step cost functions of a given width $\sigma$ around the true estimated value. By changing the ratio between the separation of allowed phases and the width of the cost function we can interpolate between discrimination-like and estimation-like regimes. While doing so, we show that a direct application of theory of covariant measurements for a discrete subgroup of the $U(1)$ group leads to suboptimal strategies due to an implicit requirement of estimating only the phases that appear in the prior distribution. We develop the theory of sub-covariant measurements to remedy this situation and demonstrate truly optimal estimation strategies when performing transition from a discrete to the continuous phase estimation regime.
- Apr 24 2017 physics.chem-ph cond-mat.mtrl-sci arXiv:1704.06439v2Accurate simulations of atomistic systems from first principles are limited by computational cost. In high-throughput settings, machine learning can potentially reduce these costs significantly by accurately interpolating between reference calculations. For this, kernel learning approaches crucially require a single Hilbert space accommodating any atomistic system. We introduce a many-body tensor representation that has proper mathematical structure, satisfies theoretical requirements, and can represent both molecules and crystals. Empirical evidence is presented for energy prediction errors below 1 kcal/mol for 7k organic molecules and 3 meV/atom for 11k elpasolite crystals. Applicability is demonstrated for phase diagrams of Pt-group/transition-metal binary systems.
- Apr 24 2017 cond-mat.mes-hall quant-ph arXiv:1704.06417v1We compare quantum dynamics in the presence of Markovian dephasing for a particle hopping on a chain and for an Ising domain wall whose motion leaves behind a string of flipped spins. Exact solutions show that on an infinite chain, the transport responses of the models are nearly identical. However, on finite-length chains, the broadening of discrete spectral lines is much more noticeable in the case of a domain wall.
- Apr 24 2017 quant-ph arXiv:1704.06359v1In quantum information W states are a central class of multipartite entangled states because of their robustness against noise and use in many quantum processes. Their generation however remains a demanding task whose difficulty increases with the number of particles. We report a simple scalable conceptual scheme where a single particle in an ancilla mode works as entanglement catalyst of W state for other $N$ separated identical particles. A crucial novel aspect of the scheme, which exploits basically spatial indistinguishability, is its universality, being applicable without essential changes to both bosons and fermions. Our proposal represents a new paradigm within experimental preparation of many-particle entanglement based on quantum indistinguishability.
- Apr 24 2017 quant-ph arXiv:1704.06289v1In this work we derive a lower bound for the minimum time required to implement a target unitary transformation through a classical time-dependent field in a closed quantum system. The bound depends on the target gate, the strength of the internal Hamiltonian and the highest permitted control field amplitude. These findings reveal some properties of the reachable set of operations, explicitly analyzed for a single qubit. Moreover, for fully controllable systems, we identify a lower bound for the time at which all unitary gates become reachable. We use numerical gate optimization in order to study the tightness of the obtained bounds. It is shown that in the single qubit case our analytical findings describe the relationship between the highest control field amplitude and the minimum evolution time remarkably well. Finally, we discuss both challenges and ways forward for obtaining tighter bounds for higher dimensional systems, offering a discussion about the mathematical form and the physical meaning of the bound.
- Recurrent neural network architectures can have useful computational properties, with complex temporal dynamics. However, evaluation of recurrent dynamic architectures requires solution of systems of differential equations, and the number of evaluations required to determine their response to a given input can vary with the input, or can be indeterminate altogether in the case of oscillations or instability. In feed-forward networks, by contrast, only a single pass through the network is needed to determine the response to a given input. Modern machine-learning systems are designed to operate efficiently on feed-forward architectures. We hypothesised that two-layer feedforward architectures with simple, deterministic dynamics could approximate the responses of single-layer recurrent network architectures. By identifying the fixed-point responses of a given recurrent network, we trained two-layer networks to directly approximate the fixed-point response to a given input. These feed-forward networks then embodied useful computations, including competitive interactions, information transformations and noise rejection. Our approach was able to find useful approximations to recurrent networks, which can then be evaluated in linear and deterministic time complexity.
- Apr 24 2017 astro-ph.CO arXiv:1704.06634v1Motivated by recent developments in perturbative calculations of the nonlinear evolution of large-scale structure, we present an iterative algorithm to reconstruct the initial conditions in a given volume starting from the dark matter distribution in real space. In our algorithm, objects are first iteratively moved back along estimated potential gradients until a nearly uniform catalog is obtained. The linear initial density is then estimated as the divergence of the cumulative displacement, with an optional second-order correction. This algorithm should undo non-linear effects up to one-loop order, including the higher-order infrared resummation piece. We test the method using dark matter simulations in real space. At redshift $z=0$, we find that after eight iterations the reconstructed density is more than $95\%$ correlated with the initial density at $k\le 0.35\; h\mathrm{Mpc}^{-1}$. The reconstruction also reduces the power in the difference between reconstructed and initial fields by more than two orders of magnitude at $k\le 0.2\; h\mathrm{Mpc}^{-1}$, and it extends the range of scales where the full broad-band shape of the power spectrum matches linear theory by a factor 2-3. As a specific application, we consider measurements of the Baryonic Acoustic Oscillation (BAO) scale that can be improved by reducing the degradation effects of large-scale flows. We find that the method improves the BAO signal-to-noise by a factor 2.7 at redshift $z=0$ and by a factor 2.5 at $z=0.6$ in our idealistic simulations. This improves standard BAO reconstruction by $70\%$ at $z=0$ and $30\%$ at $z=0.6$, and matches the optimal BAO signal and signal-to-noise of the linear density in the same volume.
- Compressive image recovery is a challenging problem that requires fast and accurate algorithms. Recently, neural networks have been applied to this problem with promising results. By exploiting massively parallel GPU processing architectures and oodles of training data, they are able to run orders of magnitude faster than existing methods. Unfortunately, these methods are difficult to train, often-times specific to a single measurement matrix, and largely unprincipled blackboxes. It was recently demonstrated that iterative sparse-signal-recovery algorithms can be unrolled to form interpretable deep neural networks. Taking inspiration from this work, we develop novel neural network architectures that mimic the behavior of the denoising-based approximate message passing (D-AMP) and denoising-based vector approximate message passing (D-VAMP) algorithms. We call these new networks \em Learned D-AMP (LDAMP) and \em Learned D-VAMP (LDVAMP). The LDAMP/LDVAMP networks are easy to train, can be applied to a variety of different measurement matrices, and come with a state-evolution heuristic that accurately predicts their performance. Most importantly, our networks outperform the state-of-the-art BM3D-AMP and NLR-CS algorithms in terms of both accuracy and runtime. At high resolutions, and when used with matrices which have fast matrix multiply implementations, LDAMP runs over $50\times$ faster than BM3D-AMP and hundreds of times faster than NLR-CS.
- Apr 24 2017 cs.DC arXiv:1704.06623v1With the surge of multi- and manycores, much research has focused on algorithms for mapping and scheduling on these complex platforms. Large classes of these algorithms face scalability problems. This is why diverse methods are commonly used for reducing the search space. While most such approaches leverage the inherent symmetry of architectures and applications, they do it in a problem-specific and intuitive way. However, intuitive approaches become impractical with growing hardware complexity, like Network-on-Chip interconnect or heterogeneous cores. In this paper, we present a formal framework that can determine the inherent symmetry of architectures and applications algorithmically and leverage these for problems in software synthesis. Our approach is based on the mathematical theory of groups and a generalization called inverse semigroups. We evaluate our approach in two state-of-the-art mapping frameworks. Even for the platforms with a handful of cores of today and moderate-size benchmarks, our approach consistently yields reductions of the overall execution time of algorithms, accelerating them by a factor up to 10 in our experiments, or improving the quality of the results.
- We propose a summarization approach for scientific articles which takes advantage of citation-context and the document discourse model. While citations have been previously used in generating scientific summaries, they lack the related context from the referenced article and therefore do not accurately reflect the article's content. Our method overcomes the problem of inconsistency between the citation summary and the article's content by providing context for each citation. We also leverage the inherent scientific article's discourse for producing better summaries. We show that our proposed method effectively improves over existing summarization approaches (greater than 30% improvement over the best performing baseline) in terms of \textscRouge scores on TAC2014 scientific summarization dataset. While the dataset we use for evaluation is in the biomedical domain, most of our approaches are general and therefore adaptable to other domains.
- Empirically, neural networks that attempt to learn programs from data have exhibited poor generalizability. Moreover, it has traditionally been difficult to reason about the behavior of these models beyond a certain level of input complexity. In order to address these issues, we propose augmenting neural architectures with a key abstraction: recursion. As an application, we implement recursion in the Neural Programmer-Interpreter framework on four tasks: grade-school addition, bubble sort, topological sort, and quicksort. We demonstrate superior generalizability and interpretability with small amounts of training data. Recursion divides the problem into smaller pieces and drastically reduces the domain of each neural network component, making it tractable to prove guarantees about the overall system's behavior. Our experience suggests that in order for neural architectures to robustly learn program semantics, it is necessary to incorporate a concept like recursion.
- Apr 24 2017 cs.CV arXiv:1704.06610v1The task of object viewpoint estimation has been a challenge since the early days of computer vision. To estimate the viewpoint (or pose) of an object, people have mostly looked at object intrinsic features, such as shape or appearance. Surprisingly, informative features provided by other, extrinsic elements in the scene, have so far mostly been ignored. At the same time, contextual cues have been proven to be of great benefit for related tasks such as object detection or action recognition. In this paper, we explore how information from other objects in the scene can be exploited for viewpoint estimation. In particular, we look at object configurations by following a relational neighbor-based approach for reasoning about object relations. We show that, starting from noisy object detections and viewpoint estimates, exploiting the estimated viewpoint and location of other objects in the scene can lead to improved object viewpoint predictions. Experiments on the KITTI dataset demonstrate that object configurations can indeed be used as a complementary cue to appearance-based viewpoint estimation. Our analysis reveals that the proposed context-based method can improve object viewpoint estimation by reducing specific types of viewpoint estimation errors commonly made by methods that only consider local information. Moreover, considering contextual information produces superior performance in scenes where a high number of object instances occur. Finally, our results suggest that, following a cautious relational neighbor formulation brings improvements over its aggressive counterpart for the task of object viewpoint estimation.
- Apr 24 2017 cs.CV arXiv:1704.06544v1Precise delineation of organs at risk (OAR) is a crucial task in radiotherapy treatment planning, which aims at delivering high dose to the tumour while sparing healthy tissues. In recent years algorithms showed high performance and the possibility to automate this task for many OAR. However, for some OAR precise delineation remains challenging. The esophagus with a versatile shape and poor contrast is among these structures. To tackle these issues we propose a 3D fully (convolutional neural network (CNN) driven random walk (RW) approach to automatically segment the esophagus on CT. First, a soft probability map is generated by the CNN. Then an active contour model (ACM) is fitted on the probability map to get a first estimation of the center line. The outputs of the CNN and ACM are then used in addition to CT Hounsfield values to drive the RW. Evaluation and training was done on 50 CTs with peer reviewed esophagus contours. Results were assessed regarding spatial overlap and shape similarities. The generated contours showed a mean Dice coefficient of 0.76, an average symmetric square distance of 1.36 mm and an average Hausdorff distance of 11.68 compared to the reference. These figures translate into a very good agreement with the reference contours and an increase in accuracy compared to other methods. We show that by employing a CNN accurate estimations of esophagus location can be obtained and refined by a post processing RW step. One of the main advantages compared to previous methods is that our network performs convolutions in a 3D manner, fully exploiting the 3D spatial context and performing an efficient and precise volume-wise prediction. The whole segmentation process is fully automatic and yields esophagus delineations in very good agreement with the used gold standard, showing that it can compete with previously published methods.
- We consider the cosmological implications of a gravitational theory containing two vector fields coupled minimally to gravity as well as a generalized Chern-Simons term that couples the two vector fields. One of the vector fields is the usual Maxwell field, while the other is a constrained vector field with constant norm included in the action via a Lagrange multiplier. The theory admits a de Sitter type solution, with healthy cosmological perturbations. We will show that there is 6 degrees of freedom propagate on top of de Sitter space-time, two tensor polarizations and four degrees of freedom related to two massless vector fields interacting with each other via Chern-Simons interaction term. We also investigate in detail the behavior of the geometric and physical parameters of a homogeneous and anisotropic Bianchi type I Universe, by using both analytical and numerical methods, by assuming that the matter content of the Universe can be described by the stiff causal and pressureless dust fluid equations of state. The time evolution of the Bianchi type I Universe strongly depends on the initial conditions of the physical and geometrical quantities, as well as on the numerical values of the model parameters. Two important observational parameters, the mean anisotropy parameter, and the deceleration parameter, are also studied in detail, and we show that independently of the matter equation of state the cosmological evolution of the Bianchi type I Universe always ends in an isotropic and exponentially accelerating, de Sitter type, phase.
- Apr 24 2017 quant-ph arXiv:1704.06514v1Fully autonomous precise control of qubits is crucial for quantum information processing, quantum communication, and quantum sensing applications. It requires minimal human intervention on the ability to model, to predict and to anticipate the quantum dynamics [1,2], as well as to precisely control and calibrate single qubit operations. Here, we demonstrate single qubit autonomous calibrations via closed-loop optimisations of electron spin quantum operations in diamond. The operations are examined by quantum state and process tomographic measurements at room temperature, and their performances against systematic errors are iteratively rectified by an optimal pulse engineering algorithm. We achieve an autonomous calibrated fidelity up to 1.00 on a time scale of minutes for a spin population inversion and up to 0.98 on a time scale of hours for a Hadamard gate within the experimental error of 2%. These results manifest a full potential for versatile quantum nanotechnologies.
- Apr 24 2017 quant-ph arXiv:1704.06504v1We introduce a new universal, measurement-based model of quantum computation, which is potentially attractive for implementation on ion trap and superconducting qubit devices. It differs from the traditional one-way model in two ways: (1) the resource states which play the role of graph states are prepared via 2-qubit Mølmer-Sørensen interactions and (2) the model only requires non-rotated Pauli X and Z measurements to obtain universality. We exploit the fact that the relevant resource states and measurements have a simple representation in the ZX-calculus to give a short, graphical proof of universality.
- Bandit structured prediction describes a stochastic optimization framework where learning is performed from partial feedback. This feedback is received in the form of a task loss evaluation to a predicted output structure, without having access to gold standard structures. We advance this framework by lifting linear bandit learning to neural sequence-to-sequence learning problems using attention-based recurrent neural networks. Furthermore, we show how to incorporate control variates into our learning algorithms for variance reduction and improved generalization. We present an evaluation on a neural machine translation task that shows improvements of up to 5.89 BLEU points for domain adaptation from simulated bandit feedback.
- We address personalization issues of image captioning, which have not been discussed yet in previous research. For a query image, we aim to generate a descriptive sentence, accounting for prior knowledge such as the user's active vocabularies in previous documents. As applications of personalized image captioning, we tackle two post automation tasks: hashtag prediction and post generation, on our newly collected Instagram dataset, consisting of 1.1M posts from 6.3K users. We propose a novel captioning model named Context Sequence Memory Network (CSMN). Its unique updates over previous memory network models include (i) exploiting memory as a repository for multiple types of context information, (ii) appending previously generated words into memory to capture long-term information without suffering from the vanishing gradient problem, and (iii) adopting CNN memory structure to jointly represent nearby ordered memory slots for better context understanding. With quantitative evaluation and user studies via Amazon Mechanical Turk, we show the effectiveness of the three novel features of CSMN and its performance enhancement for personalized image captioning over state-of-the-art captioning models.
- Apr 24 2017 cs.MM arXiv:1704.06444v1The panoramic video is widely used to build virtual reality (VR) and is expected to be one of the next generation Killer-Apps. Transmitting panoramic VR videos is a challenging task because of two problems: 1) panoramic VR videos are typically much larger than normal videos but they need to be transmitted with limited bandwidth in mobile networks. 2) high-resolution and fluent views should be provided to guarantee a superior user experience and avoid side-effects such as dizziness and nausea. To address these two problems, we propose a novel interactive streaming technology, namely Focus-based Interactive Streaming Framework (FISF). FISF consists of three parts: 1) we use the classic clustering algorithm DBSCAN to analyze real user data for Video Focus Detection (VFD); 2) we propose a Focus-based Interactive Streaming Technology (FIST), including a static version and a dynamic version; 3) we propose two optimization methods: focus merging and prefetch strategy. Experimental results show that FISF significantly outperforms the state-of-the-art. The paper is submitted to Sigcomm 2017, VR/AR Network on 31 Mar 2017 at 10:44:04am EDT.
- Apr 24 2017 cs.LG arXiv:1704.06440v1Two of the leading approaches for model-free reinforcement learning are policy gradient methods and $Q$-learning methods. $Q$-learning methods can be effective and sample-efficient when they work, however, it is not well-understood why they work, since empirically, the $Q$-values they estimate are very inaccurate. A partial explanation may be that $Q$-learning methods are secretly implementing policy gradient updates: we show that there is a precise equivalence between $Q$-learning and policy gradient methods in the setting of entropy-regularized reinforcement learning, that "soft" (entropy-regularized) $Q$-learning is exactly equivalent to a policy gradient method. We also point out a connection between $Q$-learning methods and natural policy gradient methods. Experimentally, we explore the entropy-regularized versions of $Q$-learning and policy gradients, and we find them to perform as well as (or slightly better than) the standard variants on the Atari benchmark. We also show that the equivalence holds in practical settings by constructing a $Q$-learning method that closely matches the learning dynamics of A3C without using a target network or $\epsilon$-greedy exploration schedule.
- We generalize the Caldero-Chapoton formula for cluster algebras of finite type to the skew-symmetrizable case. This is done by replacing representation categories of Dynkin quivers by categories of locally free modules over certain Iwanaga-Gorenstein algebras introduced in Part I. The proof relies on the realization of the positive part of the enveloping algebra of a simple Lie algebra of the same finite type as a convolution algebra of constructible functions on representation varieties of $H$, given in Part III. Along the way, we obtain a new result on the PBW basis of this convolution algebra.
- This article improves the existing proven rates of regret decay in optimal policy estimation. We give a margin-free result showing that the regret decay for estimating a within-class optimal policy is second-order for empirical risk minimizers over Donsker classes, with regret decaying at a faster rate than the standard error of an efficient estimator of the value of an optimal policy. We also give a result from the classification literature that shows that faster regret decay is possible via plug-in estimation provided a margin condition holds. Four examples are considered. In these examples, the regret is expressed in terms of either the mean value or the median value; the number of possible actions is either two or finitely many; and the sampling scheme is either independent and identically distributed or sequential, where the latter represents a contextual bandit sampling scheme.
- This paper addresses the problem of online tracking and classification of multiple objects in an image sequence. Our proposed solution is to first track all objects in the scene without relying on object-specific prior knowledge, which in other systems can take the form of hand-crafted features or user-based track initialization. We then classify the tracked objects with a fast-learning image classifier that is based on a shallow convolutional neural network architecture and demonstrate that object recognition improves when this is combined with object state information from the tracking algorithm. We argue that by transferring the use of prior knowledge from the detection and tracking stages to the classification stage we can design a robust, general purpose object recognition system with the ability to detect and track a variety of object types. We describe our biologically inspired implementation, which adaptively learns the shape and motion of tracked objects, and apply it to the Neovision2 Tower benchmark data set, which contains multiple object types. An experimental evaluation demonstrates that our approach is competitive with state-of-the-art video object recognition systems that do make use of object-specific prior knowledge in detection and tracking, while providing additional practical advantages by virtue of its generality.
- Apr 24 2017 astro-ph.IM astro-ph.EP arXiv:1704.06413v2Machine learning techniques are widely applied in many modern optical sky surveys, e.q. Pan-STARRS1, PTF/iPTF and Subaru/Hyper Suprime-Cam survey, to reduce human intervention for data verification. In this study, we have established a machine learning based real-bogus system to reject the false detections in the Subaru/Hyper-Suprime-Cam StrategicSurvey Program (HSC-SSP) source catalog. Therefore the HSC-SSP moving object detection pipeline can operate more effectively due to the reduction of false positives. To train the real-bogus system, we use the stationary sources as the real training set and the "flagged" data as the bogus set. The training set contains 47 features, most of which are photometric measurements and shape moments generated from the HSC image reduction pipeline (hscPipe). Our system can reach a true positive rate (tpr) ~96% with a false positive rate (fpr) ~ 1% or tpr ~99% at fpr ~5%. Therefore we conclude that the stationary sources are decent real training samples, and using photometry measurements and shape moments can reject the false positives effectively.
- Apr 24 2017 stat.AP arXiv:1704.06408v1We present a method to estimate a multivariate Gaussian distribution of diffusion tensor features in a set of brain regions based on a small sample of healthy individuals, and use this distribution to identify imaging abnormalities in subjects with mild traumatic brain injury. The multivariate model receives a \em apriori knowledge in the form of a neighborhood graph imposed on the precision matrix, which models brain region interactions, and an additional $L_1$ sparsity constraint. The model is then estimated using the graphical LASSO algorithm and the Mahalanobis distance of healthy and TBI subjects to the distribution mean is used to evaluate the discriminatory power of the model. Our experiments show that the addition of the \em apriori neighborhood graph results in significant improvements in classification performance compared to a model which does not take into account the brain region interactions or one which uses a fully connected prior graph. In addition, we describe a method, using our model, to detect the regions that contribute the most to the overall abnormality of the DTI profile of a subject's brain.
- Apr 24 2017 cs.HC arXiv:1704.06399v1In order to avoid the "Midas Touch" problem, gaze-based interfaces for selection often introduce a dwell time: a fixed amount of time the user must fixate upon an object before it is selected. Past interfaces have used a uniform dwell time across all objects. Here, we propose an algorithm for adjusting the dwell times of different objects based on the inferred probability that the user wishes to select them. In particular, we introduce a probabilistic model of gaze behavior and train it on natural gaze trajectories collected while users surf the web through a gaze-based web browser. Using this model, we can infer the probability that each hyperlink on the web page is the one the user intends to select. We assign shorter dwell times to more likely hyperlinks and longer dwell times to less likely hyperlinks. The resulting variable dwell time gaze-based browser enables subjects to surf the web faster and/or more reliably. We use a two-stage hidden Markov model based algorithm to model the natural eye gaze trajectories. We have evaluated this method objectively both in simulation and experimentally, and subjectively through questionnaires. Our results demonstrate that the proposed algorithm achieves a better tradeoff between accuracy and speed. This work demonstrates that natural eye movements can be used to infer user intent, and that this information can be used to improve gaze-based selection without increased cognitive load on users.
- Apr 24 2017 cs.CL arXiv:1704.06393v1Neural machine translation (NMT) becomes a new approach to machine translation and generates much more fluent results compared to statistical machine translation (SMT). However, SMT is usually better than NMT in translation adequacy. It is therefore a promising direction to combine the advantages of both NMT and SMT. In this paper, we propose a neural system combination framework leveraging multi-source NMT, which takes as input the outputs of NMT and SMT systems and produces the final translation. Extensive experiments on the Chinese-to-English translation task show that our model archives significant improvement by 5.3 BLEU points over the best single system output and 3.4 BLEU points over the state-of-the-art traditional system combination methods.
- Apr 24 2017 cs.CV arXiv:1704.06382v1Recent advances in 3D fully convolutional networks (FCN) have made it feasible to produce dense voxel-wise predictions of full volumetric images. In this work, we show that a multi-class 3D FCN trained on manually labeled CT scans of seven abdominal structures (artery, vein, liver, spleen, stomach, gallbladder, and pancreas) can achieve competitive segmentation results, while avoiding the need for handcrafting features or training organ-specific models. To this end, we propose a two-stage, coarse-to-fine approach that trains an FCN model to roughly delineate the organs of interest in the first stage (seeing $\sim$40% of the voxels within a simple, automatically generated binary mask of the patient's body). We then use these predictions of the first-stage FCN to define a candidate region that will be used to train a second FCN. This step reduces the number of voxels the FCN has to classify to $\sim$10% while maintaining a recall high of $>$99%. This second-stage FCN can now focus on more detailed segmentation of the organs. We respectively utilize training and validation sets consisting of 281 and 50 clinical CT images. Our hierarchical approach provides an improved Dice score of 7.5 percentage points per organ on average in our validation set. We furthermore test our models on a completely unseen data collection acquired at a different hospital that includes 150 CT scans with three anatomical labels (liver, spleen, and pancreas). In such challenging organs as the pancreas, our hierarchical approach improves the mean Dice score from 68.5 to 82.2%, achieving the highest reported average score on this dataset.
- Apr 24 2017 cs.CL arXiv:1704.06380v1Increased adaptability of RNN language models leads to improved predictions that benefit many applications. However, current methods do not take full advantage of the RNN structure. We show that the most widely-used approach to adaptation (concatenating the context with the word embedding at the input to the recurrent layer) is outperformed by a model that has some low-cost improvements: adaptation of both the hidden and output layers. and a feature hashing bias term to capture context idiosyncrasies. Experiments on language modeling and classification tasks using three different corpora demonstrate the advantages of the proposed techniques.
- Apr 24 2017 cs.CV arXiv:1704.06378v1One of the challenges in evaluating multi-object video detection, tracking and classification systems is having publically available data sets with which to compare different systems. However, the measures of performance for tracking and classification are different. Data sets that are suitable for evaluating tracking systems may not be appropriate for classification. Tracking video data sets typically only have ground truth track IDs, while classification video data sets only have ground truth class-label IDs. The former identifies the same object over multiple frames, while the latter identifies the type of object in individual frames. This paper describes an advancement of the ground truth meta-data for the DARPA Neovision2 Tower data set to allow both the evaluation of tracking and classification. The ground truth data sets presented in this paper contain unique object IDs across 5 different classes of object (Car, Bus, Truck, Person, Cyclist) for 24 videos of 871 image frames each. In addition to the object IDs and class labels, the ground truth data also contains the original bounding box coordinates together with new bounding boxes in instances where un-annotated objects were present. The unique IDs are maintained during occlusions between multiple objects or when objects re-enter the field of view. This will provide: a solid foundation for evaluating the performance of multi-object tracking of different types of objects, a straightforward comparison of tracking system performance using the standard Multi Object Tracking (MOT) framework, and classification performance using the Neovision2 metrics. These data have been hosted publically.
- A new recalibration post-processing method is presented to improve the quality of the posterior approximation when using Approximate Bayesian Computation (ABC) algorithms. Recalibration may be used in conjunction with existing post-processing methods, such as regression-adjustments. In addition, this work extends and strengthens the links between ABC and indirect inference algorithms, allowing more extensive use of misspecified auxiliary models in the ABC context. The method is illustrated using simulated examples to demonstrate the effects of recalibration under various conditions, and through an application to an analysis of stereological extremes both with and without the use of auxiliary models. Code to implement recalibration post-processing is available in the R package, abctools.
- Apr 24 2017 cs.CV arXiv:1704.06370v1In this paper, we present a real-time robust multi-view pedestrian detection and tracking system for video surveillance using neural networks which can be used in dynamic environments. The proposed system consists of two phases: multi-view pedestrian detection and tracking. First, pedestrian detection utilizes background subtraction to segment the foreground blob. An adaptive background subtraction method where each of the pixel of input image models as a mixture of Gaussians and uses an on-line approximation to update the model applies to extract the foreground region. The Gaussian distributions are then evaluated to determine which are most likely to result from a background process. This method produces a steady, real-time tracker in outdoor environment that consistently deals with changes of lighting condition, and long-term scene change. Second, the Tracking is performed at two phases: pedestrian classification and tracking the individual subject. A sliding window is applied on foreground binary image to select an input window which is used for selecting the input image patches from actually input frame. The neural networks is used for classification with PHOG features. Finally, a Kalman filter is applied to calculate the subsequent step for tracking that aims at finding the exact position of pedestrians in an input image. The experimental result shows that the proposed approach yields promising performance on multi-view pedestrian detection and tracking on different benchmark datasets.
- Apr 24 2017 cs.CV arXiv:1704.06369v3Thanks to the recent developments of Convolutional Neural Networks, the performance of face verification methods has increased rapidly. In a typical face verification method, feature normalization is a critical step for boosting performance. This motivates us to introduce and study the effect of normalization during training. But we find this is non-trivial, despite normalization being differentiable. We identify and study four issues related to normalization through mathematical analysis, which yields understanding and helps with parameter settings. Based on this analysis we propose two strategies for training using normalized features. The first is a modification of softmax loss, which optimizes cosine similarity instead of inner-product. The second is a reformulation of metric learning by introducing an agent vector for each class. We show that both strategies, and small variants, consistently improve performance by between 0.2% to 0.4% on the LFW dataset based on two models. This is significant because the performance of the two models on LFW dataset is close to saturation at over 98%. Codes and models are released on https://github.com/happynear/NormFace
- Training convolutional networks (CNN's) that fit on a single GPU with minibatch stochastic gradient descent has become effective in practice. However, there is still no effective method for training large CNN's that do not fit in the memory of a few GPU cards, or for parallelizing CNN training. In this work we show that a simple hard mixture of experts model can be efficiently trained to good effect on large scale hashtag (multilabel) prediction tasks. Mixture of experts models are not new (Jacobs et. al. 1991, Collobert et. al. 2003), but in the past, researchers have had to devise sophisticated methods to deal with data fragmentation. We show empirically that modern weakly supervised data sets are large enough to support naive partitioning schemes where each data point is assigned to a single expert. Because the experts are independent, training them in parallel is easy, and evaluation is cheap for the size of the model. Furthermore, we show that we can use a single decoding layer for all the experts, allowing a unified feature embedding space. We demonstrate that it is feasible (and in fact relatively painless) to train far larger models than could be practically trained with standard CNN architectures, and that the extra capacity can be well used on current datasets.
- Apr 24 2017 cs.CL arXiv:1704.06360v1We present SwellShark, a framework for building biomedical named entity recognition (NER) systems quickly and without hand-labeled data. Our approach views biomedical resources like lexicons as function primitives for autogenerating weak supervision. We then use a generative model to unify and denoise this supervision and construct large-scale, probabilistically labeled datasets for training high-accuracy NER taggers. In three biomedical NER tasks, SwellShark achieves competitive scores with state-of-the-art supervised benchmarks using no hand-labeled training data. In a drug name extraction task using patient medical records, one domain expert using SwellShark achieved within 5.1% of a crowdsourced annotation approach -- which originally utilized 20 teams over the course of several weeks -- in 24 hours.
- Feynman diagrams in $\phi^4$ theory have as their underlying structure 4-regular graphs. In particular, any 4-point $\phi^4$ graph can be uniquely derived from a 4-regular graph by deleting a vertex. The Feynman period is a simplified version of the Feynman integral, and is of special interest, as it maintains much of the important number theoretic information from the Feynman integral. It is also of structural interest, as it is known to be preserved by a number of graph theoretic operations. In particular, the vertex deleted in constructing the 4-point graph does not affect the Feynman period, and it is invariant under planar duality and the Schnetz twist, an operation that redirects edges incident to a 4-vertex cut. Further, a 4-regular graph may be produced by a 3-sum operation on triangles in two 4-regular graphs. The Feynman period of this graph with a vertex deleted is equal to the product of the Feynman periods of the two smaller graphs with one vertex deleted each. These operations currently explain all known instances of non-isomorphic 4-point $\phi^4$ graphs with equal periods. With this in mind, other graph invariants that are preserved by these operations for 4-point $\phi^4$ graphs are of interest, as they may provide insight into the Feynman period. In this thesis the extended graph permanent is introduced; an infinite sequence of residues from prime order finite fields. It is shown that this sequence is preserved by these three operations, and has a product property. Additionally, computational techniques will be established, and an alternate interpretation will be presented as the point count of a novel graph polynomial. Further, the previously existing $c_2$ invariant and Hepp bound are examined, two graph invariants that are conjectured to be preserved by these graph operations. A combinatorial approach to the $c_2$ invariant is introduced.
- Apr 24 2017 cs.CV arXiv:1704.06340v1We consider scenarios in which we wish to perform joint scene understanding, object tracking, activity recognition, and other tasks in environments in which multiple people are wearing body-worn cameras while a third-person static camera also captures the scene. To do this, we need to establish person-level correspondences across first- and third-person videos, which is challenging because the camera wearer is not visible from his/her own egocentric video, preventing the use of direct feature matching. In this paper, we propose a new semi-Siamese Convolutional Neural Network architecture to address this novel challenge. We formulate the problem as learning a joint embedding space for first- and third-person videos that considers both spatial- and motion-domain cues. A new triplet loss function is designed to minimize the distance between correct first- and third-person matches while maximizing the distance between incorrect ones. This end-to-end approach performs significantly better than several baselines, in part by learning the first- and third-person features optimized for matching jointly with the distance measure itself.
- Apr 24 2017 cs.LG arXiv:1704.06327v2Image clustering is one of the most important computer vision applications, which has been extensively studied in literature. However, current clustering methods mostly suffer from lack of efficiency and scalability when dealing with large-scale and high-dimensional data. In this paper, we propose a new clustering model, called DEeP Embedded RegularIzed ClusTering (DEPICT), which efficiently maps data into a discriminative embedding subspace and precisely predicts cluster assignments. DEPICT generally consists of a multinomial logistic regression function stacked on top of a multi-layer convolutional autoencoder. We define a clustering objective function using relative entropy (KL divergence) minimization, regularized by a prior for the frequency of clusters. An alternating strategy is then derived to optimize the objective by updating parameters and estimating cluster assignments. Furthermore, we employ the reconstruction loss functions in our autoencoder, as a data-dependent regularization term, to prevent the deep embedding function from overfitting. In order to benefit from end-to-end optimization and eliminate the necessity for layer-wise pretraining, we introduce a joint learning framework to minimize the unified clustering and reconstruction loss functions together and train all network layers simultaneously. Experimental results indicate the superiority and speed of DEPICT and our joint learning approach in real-world clustering tasks, in which no labeled data is available for hyper-parameter tuning.
- Apr 24 2017 cs.CV arXiv:1704.06326v1During the recent years, correlation filters have shown dominant and spectacular results for visual object tracking. The types of the features that are employed in these family of trackers significantly affect the performance of visual tracking. The ultimate goal is to utilize robust features invariant to any kind of appearance change of the object, while predicting the object location as properly as in the case of no appearance change. As the deep learning based methods has emerged, the study of learning features for specific tasks has accelerated. For instance, discriminative visual tracking methods based on deep architectures have been studied with promising performance. Nevertheless, correlation filter based (CFB) trackers confine themselves to use the pre-trained networks which are trained for object classification problem. To this end, in this manuscript the problem of learning deep fully convolutional features for the CFB visual tracking is formulated. In order to learn the proposed model, a novel and efficient backpropagation algorithm is presented based on the loss function of the network. The proposed learning framework enables the network model to be flexible for a custom design. Moreover, it alleviates the dependency on the network trained for classification. Extensive performance analysis shows the efficacy of the proposed custom design in the CFB tracking framework. By fine-tuning the convolutional parts of a state-of-the-art network and integrating this model to a CFB tracker, which is the winner of VOT2016, 18% increase is achieved in terms of expected average overlap, and tracking failures are decreased by 25%, while maintaining the superiority over the state-of-the-art methods in OTB-2013 and OTB-2015 tracking datasets.
- Apr 24 2017 cs.SY arXiv:1704.06325v1This paper presents a spatial-based trajectory planning method for automated vehicles under actuator, obstacle avoidance, and vehicle dimension constraints. Starting from a nonlinear kinematic bicycle model, vehicle dynamics are transformed to a road-aligned coordinate frame with path along the road centerline replacing time as the dependent variable. Space-varying vehicle dimension constraints are linearized around a reference path to pose convex optimization problems. Such constraints do not require to inflate obstacles by safety-margins and therefore maximize performance in very constrained environments. A sequential linear programming (SLP) algorithm is motivated. A linear program (LP) is solved at each SLP-iteration. The relation between LP formulation and maximum admissible traveling speeds within vehicle tire friction limits is discussed. The proposed method is evaluated in a roomy and in a tight maneuvering driving scenario, whereby a comparison to a semi-analytical clothoid-based path planner is given. Effectiveness is demonstrated particularly for very constrained environments, requiring to account for constraints and planning over the entire obstacle constellation space.
- Apr 24 2017 cs.FL arXiv:1704.06319v1We formalize and analyze a new problem in formal language theory termed control improvisation. Given a specification language, the problem is to produce an improviser, a probabilistic algorithm that randomly generates words in the language, subject to two additional constraints: the satisfaction of a quantitative soft constraint, and the exhibition of a specified amount of randomness. Control improvisation has many applications, including for example systematically generating random test vectors satisfying format constraints or preconditions while being similar to a library of seed inputs. Other applications include robotic surveillance, machine improvisation of music, and randomized variants of the supervisory control problem. We describe a general framework for solving the control improvisation problem, and use it to give efficient algorithms for several practical classes of instances with finite automaton and context-free grammar specifications. We also provide a detailed complexity analysis, establishing #P-hardness of the problem in many other cases. For these intractable cases, we show how symbolic techniques based on Boolean satisfiability (SAT) solvers can be used to find approximate solutions. Finally, we discuss an extension of control improvisation to multiple soft constraints that is useful in some applications.