- Oct 21 2016 quant-ph arXiv:1610.06546v1Given a Hermitian operator $\hat{H}=\langle G|\hat{U}|G\rangle$ that is the projection of an oracle $\hat{U}$ by state $|G\rangle$ created with oracle $\hat{G}$, the problem of Hamiltonian simulation is approximating the time evolution operator $e^{-i\hat{H}t}$ at time $t$ with error $\epsilon$. We show that this can be done with query complexity $\mathcal{O}(t+\log{(1/\epsilon))}$ to $\hat{G},\hat{U}$ that is optimal not just in asymptotic limits, but for all values $t,\epsilon$. Furthermore, only $2$ additional ancilla qubits are required in total, together with $\mathcal{O}(1)$ additional single and two-qubit gates per query. Our approach to Hamiltonian simulation subsumes important prior art considering Hamiltonians which are $d$-sparse or a linear combination of unitaries, leading to significant improvements in space complexity, as well as a quadratic speed-up for precision simulations. It also motivates useful new instances, such as where $\langle G|\hat{U}|G\rangle$ is a density matrix. A key technical result is `qubitization' which uses controlled-$\hat{U}$ and controlled-$\hat{G}$ to embed $\hat{H}$ in an invariant $\text{SU}(2)$ subspace. A large class of operator functions of $\hat{H}$ can then be computed with optimal query complexity, of which $e^{-i\hat{H}t}$ is a special case.
- Oct 21 2016 quant-ph arXiv:1610.06381v1We show a meaningful theory of classical communication over quantum channels when assisted by no-signalling (NS) and PPT-preserving (PPT) codes, for which both the optimal success probability of a given rate and one-shot $\epsilon$-error capacity are formalized as semidefinite programs (SDPs). Based on this, we obtain improved SDP finite blocklength converse bounds for quantum channels, which also reduce to the finite blocklength converse bound of Polyanskiy, Poor and Verdú for classical channels. Furthermore, we derive two SDP strong converse bounds for the classical capacity of a general quantum channel: for any code with a rate exceeding either of the two bounds of the channel, the success probability vanishes exponentially fast as the number of channel uses increases. This provides efficiently computable upper bounds for the classical capacity of a general quantum channel for the first time. In particular, we show that the classical capacity of an amplitude damping channel with parameter $\gamma$ is upper bounded by $\log_2(1+\sqrt{1-\gamma})$. Remarkably, we also establish the strong converse property for the classical and private capacities of a new class of quantum channels. Finally, we study the zero-error setting and provide efficiently computable upper bounds on the one-shot zero-error capacity of a general quantum channel.
- Oct 21 2016 quant-ph arXiv:1610.06499v1Several quantum key distribution (QKD) protocols employ iterative sifting. After each quantum transmission round, Alice and Bob disclose part of their setting information (including their basis choices) for the detected signals. The quantum phase of the protocol then ends when the numbers of detected signals per basis exceed certain pre-agreed threshold values. Recently, however, Pfister et al. [New J. Phys. 18 053001 (2016)] showed that iterative sifting makes QKD insecure, especially in the finite key regime, if the parameter estimation for privacy amplification uses the random sampling theory. This implies that a number of existing finite key security proofs could be flawed and cannot guarantee security. Here, we solve this serious problem by showing that the use of Azuma's inequality for parameter estimation makes QKD with iterative sifting secure again. This means that the existing protocols whose security proof employs this inequality remain secure even if they employ iterative sifting. Also, our results highlight a fundamental difference between the random sampling theorem and Azuma's inequality in proving security.
- Oct 21 2016 quant-ph arXiv:1610.06345v1We present a family of quantum money schemes with classical verification which display a number of benefits over previous proposals. Our schemes are based on hidden matching quantum retrieval games and they tolerate noise up to 23%, which we conjecture reaches 25% asymptotically as the dimension of the underlying hidden matching states is increased. Furthermore, we prove that 25% is the maximum tolerable noise for a wide class of quantum money schemes with classical verification, meaning our schemes are almost optimally noise tolerant. We use methods in semi-definite programming to prove security in a substantially different manner to previous proposals, leading to two main advantages: first, coin verification involves only a constant number of states (with respect to coin size), thereby allowing for smaller coins; second, the re-usability of coins within our scheme grows linearly with the size of the coin, which is known to be optimal. Lastly, we suggest methods by which the coins in our protocol could be implemented using weak coherent states and verified using existing experimental techniques, even in the presence of detector inefficiencies.
- Oct 21 2016 astro-ph.CO arXiv:1610.06559v1It is generally expected that heavy fields are present during inflation, which can leave their imprint in late-time cosmological observables. The main signature of these fields is a small amount of distinctly shaped non-Gaussianity, which if detected, would provide a wealth of information about the particle spectrum of the inflationary Universe. Here we investigate to what extent these signatures can be detected or constrained using futuristic 21-cm surveys. We construct model-independent templates that extract the squeezed-limit behavior of the bispectrum, and examine their overlap with standard inflationary shapes and secondary non-Gaussianities. We then use these templates to forecast detection thresholds for different masses and couplings using a 3D reconstruction of modes during the dark ages ($z\sim 30-100$). We consider interactions of several broad classes of models and quantify their detectability as a function of the baseline of a dark ages interferometer. Our analysis shows that there exists the tantalizing possibility of discovering new particles with different masses and interactions with future 21-cm surveys.
- Oct 21 2016 physics.hist-ph arXiv:1610.06518v1It is conventionally thought that New Zealand's distance from the large, northern hemisphere centres of learning, and our relatively small population and wealth, are detrimental to the contribution we can make to the advancement of scientific knowledge. Here the reverse point of view is argued.
- We investigate the potential for the LISA space-based interferometer to detect the stochastic gravitational wave background produced from different mechanisms during inflation. Focusing on well-motivated scenarios, we study the resulting contributions from particle production during inflation, inflationary spectator fields with varying speed of sound, effective field theories of inflation with specific patterns of symmetry breaking and models leading to the formation of primordial black holes. The projected sensitivities of LISA are used in a model-independent way for various detector designs and configurations. We demonstrate that LISA is able to probe these well-motivated inflationary scenarios beyond the irreducible vacuum tensor modes expected from any inflationary background.
- Oct 21 2016 quant-ph physics.optics arXiv:1610.06300v1We report an experimental realisation of a quantum random number generator using a plasmonic beamsplitter. Free-space single photons are converted into propagating single surface plasmon polaritons on a gold stripe waveguide via a grating. The surface plasmons are then guided to a region where they are scattered into one of two possible outputs. The presence of a plasmonic excitation in a given output determines the value of a random bit generated from the quantum scattering process. Using a stream of single surface plasmons injected into the beamsplitter we achieve a quantum random number generation rate of 2.37 Mbits/s even in the presence of loss. We characterise the quality of the random number sequence generated, finding it to be comparable to sequences from other quantum photonic-based devices. The compact nature of our nanophotonic device makes it suitable for tight integration in on-chip applications, such as in quantum computing and communication schemes.
- Trajectory tracking control for quadrotors is important for applications ranging from surveying and inspection, to film making. However, designing and tuning classical controllers, such as proportional-integral-derivative (PID) controllers, to achieve high tracking precision can be time-consuming and difficult, due to hidden dynamics and other non-idealities. The Deep Neural Network (DNN), with its superior capability of approximating abstract, nonlinear functions, proposes a novel approach for enhancing trajectory tracking control. This paper presents a DNN-based algorithm that improves the tracking performance of a classical feedback controller. Given a desired trajectory, the DNNs provide a tailored input to the controller based on their gained experience. The input aims to achieve a unity map between the desired and the output trajectory. The motivation for this work is an interactive "fly-as-you-draw" application, in which a user draws a trajectory on a mobile device, and a quadrotor instantly flies that trajectory with the DNN-enhanced control system. Experimental results demonstrate that the proposed approach improves the tracking precision for user-drawn trajectories after the DNNs are trained on selected periodic trajectories, suggesting the method's potential in real-world applications. Tracking errors are reduced by around 40-50 % for both training and testing trajectories from users, highlighting the DNNs' capability of generalizing knowledge.
- Oct 21 2016 cs.LO arXiv:1610.06552v1The notion of Reactive Turing machines (RTM) was proposed as an orthogonal extension of Turing machines with interaction. RTMs are used to define the notion of executable transition system in the same way as Turing machines are used to define the notion of computable function on natural numbers. RTMs inherited finiteness of all sets involved from Turing machines, and as a consequence, in a single step, an RTM can only communicate elements from a finite set of data. Some process calculi such as, e.g., the $\pi$-calculus, essentially use a form of infinite data in their definition and hence it immediately follows that transition systems specified in these calculi are not executable. On closer inspection, however, the $\pi$-calculus does not appear to use the infinite data in a non-computable manner. In this paper, we investigate several ways to relax the finiteness requirement. We start by considering a variant of RTMs in which all sets are allowed to be infinite, and then refine by adding extra restrictions. The most restricted variant of RTMs in which the sets of action and data symbols are still allowed to be infinite is the notion of RTM with atoms. We propose a notion of transition systems with atoms, and show that every effective legal transition system with atoms is executable by RTM with atoms. In such a way, we show that processes specified in the $\pi$-calculus are executable by RTM with atoms.
- Structural equation models (SEMs) and vector autoregressive models (VARMs) are two broad families of approaches that have been shown useful in effective brain connectivity studies. While VARMs postulate that a given region of interest in the brain is directionally connected to another one by virtue of time-lagged influences, SEMs assert that causal dependencies arise due to contemporaneous effects, and may even be adopted when nodal measurements are not necessarily multivariate time series. To unify these complementary perspectives, linear structural vector autoregressive models (SVARMs) that leverage both contemporaneous and time-lagged nodal data have recently been put forth. Albeit simple and tractable, linear SVARMs are quite limited since they are incapable of modeling nonlinear dependencies between neuronal time series. To this end, the overarching goal of the present paper is to considerably broaden the span of linear SVARMs by capturing nonlinearities through kernels, which have recently emerged as a powerful nonlinear modeling framework in canonical machine learning tasks, e.g., regression, classification, and dimensionality reduction. The merits of kernel-based methods are extended here to the task of learning the effective brain connectivity, and an efficient regularized estimator is put forth to leverage the edge sparsity inherent to real-world complex networks. Judicious kernel choice from a preselected dictionary of kernels is also addressed using a data-driven approach. Extensive numerical tests on ECoG data captured through a study on epileptic seizures demonstrate that it is possible to unveil previously unknown causal links between brain regions of interest.
- Oct 21 2016 cs.CL arXiv:1610.06550v1Most existing Neural Machine Translation models use groups of characters or whole words as their unit of input and output. We propose a model with a hierarchical char2word encoder, that takes individual characters both as input and output. We first argue that this hierarchical representation of the character encoder reduces computational complexity, and show that it improves translation performance. Secondly, by qualitatively studying attention plots from the decoder we find that the model learns to compress common words into a single embedding whereas rare words, such as names and places, are represented character by character.
- Oct 21 2016 stat.ML arXiv:1610.06545v1The goal of two-sample tests is to decide whether two probability distributions, denoted by $P$ and $Q$, are equal. One simple method to construct flexible two-sample tests is to use binary classifiers. More specifically, pair $n$ random samples drawn from $P$ with a positive label, and pair $n$ random samples drawn from $Q$ with a negative label. If the null hypothesis "$P = Q$" is true, the classification accuracy of a binary classifier on a hold-out subset of these data should remain near chance-level. Since the hold-out classification accuracy is an average of independent random variables under the null hypothesis, the two-sample test statistic follows a Binomial distribution. Furthermore, the decision boundary of our binary classifier provides interpretation on the differences between $P$ and $Q$. In particular this boundary can be useful to analyze which samples were correctly or incorrectly labeled by the classifier, with the least or most confidence. The goal of this paper is to revive the interest in classifier two-sample tests for a variety of applications, including independence testing, generative model evaluation, and causal discovery. To this end, we study their fundamentals, review prior literature on their applications, compare their performance against alternative state-of-the-art two-sample tests, and propose their use to evaluate generative adversarial network models applied to image synthesis. As a novel application of our research, we propose the use of conditional generative adversarial networks, together with classifier two-sample tests, to achieve state-of-the-art causal discovery.
- Oct 21 2016 cs.CL arXiv:1610.06542v1This year, the Nara Institute of Science and Technology (NAIST)/Carnegie Mellon University (CMU) submission to the Japanese-English translation track of the 2016 Workshop on Asian Translation was based on attentional neural machine translation (NMT) models. In addition to the standard NMT model, we make a number of improvements, most notably the use of discrete translation lexicons to improve probability estimates, and the use of minimum risk training to optimize the MT system for BLEU score. As a result, our system achieved the highest translation evaluation scores for the task.
- We propose an attention-enabled encoder-decoder model for the problem of grapheme-to-phoneme conversion. Most previous work has tackled the problem via joint sequence models that require explicit alignments for training. In contrast, the attention-enabled encoder-decoder model allows for jointly learning to align and convert characters to phonemes. We explore different types of attention models, including global and local attention, and our best models achieve state-of-the-art results on three standard data sets (CMUDict, Pronlex, and NetTalk).
- A novel learning Model Predictive Control technique is applied to the autonomous racing problem. The goal of the controller is to minimize the time to complete a lap. The proposed control strategy uses the data from previous laps to improve its performance while satisfying safety requirements. Moreover, a system identification technique is proposed to estimate the vehicle dynamics. Simulation results with the high fidelity simulator software CarSim show the effectiveness of the proposed control scheme.
- Understanding how users navigate in a network is of high interest in many applications. We consider a setting where only aggregate node-level traffic is observed and tackle the task of learning edge transition probabilities. We cast it as a preference learning problem, and we study a model where choices follow Luce's axiom. In this case, the $O(n)$ marginal counts of node visits are a sufficient statistic for the $O(n^2)$ transition probabilities. We show how to make the inference problem well-posed regardless of the network's structure, and we present ChoiceRank, an iterative algorithm that scales to networks that contains billions of nodes and edges. We apply the model to a month-long clickstream of the English Wikipedia and one year of rides on New York City's bicycle-sharing system. In both cases, we successfully recover the transition probabilities using only the network structure and marginal (node-level) traffic data.
- We consider unconstrained optimization problems where only "stochastic" estimates of the objective function are observable as replicates from a Monte Carlo oracle. The Monte Carlo oracle is assumed to provide no direct observations of the function gradient. We present ASTRO-DF --- a class of derivative-free trust-region algorithms, where a stochastic local interpolation model is constructed, optimized, and updated iteratively. Function estimation and model construction within ASTRO-DF is adaptive in the sense that the extent of Monte Carlo sampling is determined by continuously monitoring and balancing metrics of sampling error (or variance) and structural error (or model bias) within ASTRO-DF. Such balancing of errors is designed to ensure that Monte Carlo effort within ASTRO-DF is sensitive to algorithm trajectory, sampling more whenever an iterate is inferred to be close to a critical point and less when far away. We demonstrate the almost-sure convergence of ASTRO-DF's iterates to a first-order critical point when using linear or quadratic stochastic interpolation models. The question of using more complicated models, e.g., regression or stochastic kriging, in combination with adaptive sampling is worth further investigation and will benefit from the methods of proof presented here. We speculate that ASTRO-DF's iterates achieve the canonical Monte Carlo convergence rate, although a proof remains elusive.
- Oct 21 2016 cs.CL arXiv:1610.06498v1The authorship attribution is a problem of considerable practical and technical interest. Several methods have been designed to infer the authorship of disputed documents in multiple contexts. While traditional statistical methods based solely on word counts and related measurements have provided a simple, yet effective solution in particular cases; they are prone to manipulation. Recently, texts have been successfully modeled as networks, where words are represented by nodes linked according to textual similarity measurements. Such models are useful to identify informative topological patterns for the authorship recognition task. However, there is no consensus on which measurements should be used. Thus, we proposed a novel method to characterize text networks, by considering both topological and dynamical aspects of networks. Using concepts and methods from cellular automata theory, we devised a strategy to grasp informative spatio-temporal patterns from this model. Our experiments revealed an outperformance over traditional analysis relying only on topological measurements. Remarkably, we have found a dependence of pre-processing steps (such as the lemmatization) on the obtained results, a feature that has mostly been disregarded in related works. The optimized results obtained here pave the way for a better characterization of textual networks.
- Oct 21 2016 cs.CV arXiv:1610.06494v1This paper describes a dataset containing small images of text from everyday scenes. The purpose of the dataset is to support the development of new automated systems that can detect and analyze text. Although much research has been devoted to text detection and recognition in scanned documents, relatively little attention has been given to text detection in other types of images, such as photographs that are posted on social-media sites. This new dataset, known as COCO-Text-Patch, contains approximately 354,000 small images that are each labeled as "text" or "non-text". This dataset particularly addresses the problem of text verification, which is an essential stage in the end-to-end text detection and recognition pipeline. In order to evaluate the utility of this dataset, it has been used to train two deep convolution neural networks to distinguish text from non-text. One network is inspired by the GoogLeNet architecture, and the second one is based on CaffeNet. Accuracy levels of 90.2% and 90.9% were obtained using the two networks, respectively. All of the images, source code, and deep-learning trained models described in this paper will be publicly available
- The paper focuses on the problem of learning saccades enabling visual object search. The developed system combines reinforcement learning with a neural network for learning to predict the possible outcomes of its actions. We validated the solution in three types of environment consisting of (pseudo)-randomly generated matrices of digits. The experimental verification is followed by the discussion regarding elements required by systems mimicking the fovea movement and possible further research directions.
- Oct 21 2016 cs.AI arXiv:1610.06490v1A new approach to data stream clustering with the help of an ensemble of adaptive neuro-fuzzy systems is proposed. The proposed ensemble is formed with adaptive neuro-fuzzy self-organizing Kohonen maps in a parallel processing mode. A final result is chosen by the best neuro-fuzzy self-organizing Kohonen map.
- An architecture of a new neuro-fuzzy system is proposed. The basic idea of this approach is to tune both synaptic weights and membership functions with the help of the supervised learning and self-learning paradigms. The approach to solving the problem has to do with evolving online neuro-fuzzy systems that can process data under uncertainty conditions. The results prove the effectiveness of the developed architecture and the learning procedure.
- An evolving weighted neuro-neo-fuzzy-ANARX model and its learning procedures are introduced in the article. This system is basically used for time series forecasting. This system may be considered as a pool of elements that process data in a parallel manner. The proposed evolving system may provide online processing data streams.
- A modification of the neo-fuzzy neuron is proposed (an extended neo-fuzzy neuron (ENFN)) that is characterized by improved approximating properties. An adaptive learning algorithm is proposed that has both tracking and smoothing properties. An ENFN distinctive feature is its computational simplicity compared to other artificial neural networks and neuro-fuzzy systems.
- We present ORB-SLAM2 a complete SLAM system for monocular, stereo and RGB-D cameras, including map reuse, loop closing and relocalization capabilities. The system works in real-time in standard CPUs in a wide variety of environments from small hand-held indoors sequences, to drones flying in industrial environments and cars driving around a city. Our backend based on Bundle Adjustment with monocular and stereo observations allows for accurate trajectory estimation with metric scale. Our system includes a lightweight localization mode that leverages visual odometry tracks for unmapped regions and matches to map points that allow for zero-drift localization. The evaluation in 29 popular public sequences shows that our method achieves state-of-the-art accuracy, being in most cases the most accurate SLAM solution. We publish the source code, not only for the benefit of the SLAM community, but with the aim of being an out-of-the-box SLAM solution for researchers in other fields.
- Oct 21 2016 math.NA arXiv:1610.06469v1Blind deconvolution resolves unknown channel and input signal from their convolution. Recently, performance guarantees for fast algorithms under linear subspace models have been derived, and some of the results were further extended to a subsampled case and to a more general union of subspaces model, which represents sparsity in a certain transform domain. Motivated by a channel estimation problem in underwater acoustics, we consider a scenario where the unknown signal is observed through multiple unknown correlated channels, which are accurately approximated using a bilinear model by imposing structure on the system. We propose a new iterative algorithm for multichannel blind deconvolution using this bilinear channel model, which consists of computing an extreme eigenvector of given matrices. In particular, we show that the proposed iterative algorithm with a cheap initialization converges fast to a stable estimate of the unknown channel parameters. Our numerical results demonstrate that the empirical performance is consistent with the presented theory.
- This paper explores a simple question: How would we provide a high-quality search experience on Mars, where the fundamental physical limit is speed-of-light propagation delays on the order of tens of minutes? On Earth, users are accustomed to nearly instantaneous response times from search engines. Is it possible to overcome orders-of-magnitude longer latency to provide a tolerable user experience on Mars? In this paper, we formulate the searching from Mars problem as a tradeoff between "effort" (waiting for responses from Earth) and "data transfer" (pre-fetching or caching data on Mars). The contribution of our work is articulating this design space and presenting two case studies that explore the effectiveness of baseline techniques, using publicly available data from the TREC Total Recall and Sessions Tracks. We intend for this research problem to be aspirational and inspirational - even if one is not convinced by the premise of Mars colonization, there are Earth-based scenarios such as searching from a rural village in India that share similar constraints, thus making the problem worthy of exploration and attention from researchers.
- Hypothesis testing is an important cognitive process that supports human reasoning. In this paper, we introduce a computational hypothesis testing approach based on memory augmented neural networks. Our approach involves a hypothesis testing loop that reconsiders and progressively refines a previously formed hypothesis in order to generate new hypotheses to test. We apply the proposed approach to language comprehension task by using Neural Semantic Encoders (NSE). Our NSE models achieve the state-of-the-art results showing an absolute improvement of 1.2% to 2.6% accuracy over previous results obtained by single and ensemble systems on standard machine comprehension benchmarks such as the Children's Book Test (CBT) and Who-Did-What (WDW) news article datasets.
- Body-worn video (BWV) cameras are increasingly utilized by police departments to provide a record of police-public interactions. However, large-scale BWV deployment produces terabytes of data per week, necessitating the development of effective computational methods to identify salient changes in video. In work carried out at the 2016 RIPS program at IPAM, UCLA, we present a novel two-stage framework for video change-point detection. First, we employ state-of-the-art machine learning methods including convolutional neural networks and support vector machines for scene classification. We then develop and compare change-point detection algorithms utilizing mean squared-error minimization, forecasting methods, hidden Markov models, and maximum likelihood estimation to identify noteworthy changes. We test our framework on detection of vehicle exits and entrances in a BWV data set provided by the Los Angeles Police Department and achieve over 90% recall and nearly 70% precision -- demonstrating robustness to rapid scene changes, extreme luminance differences, and frequent camera occlusions.
- Oct 21 2016 stat.ML arXiv:1610.06434v1The ability of a human being to extrapolate previously gained knowledge to other domains inspired a new family of methods in machine learning called transfer learning. Transfer learning is often based on the assumption that objects in both target and source domains share some common feature and/or data space. In this paper, we propose a simple and intuitive approach that minimizes iteratively the distance between source and target task distributions by optimizing the kernel target alignment (KTA). We show that this procedure is suitable for transfer learning by relating it to Hilbert-Schmidt Independence Criterion (HSIC) and Quadratic Mutual Information (QMI) maximization. We run our method on benchmark computer vision data sets and show that it can outperform some state-of-art methods.
- The long-term memory of most connectionist systems lies entirely in the weights of the system. Since the number of weights is typically fixed, this bounds the total amount of knowledge that can be learned and stored. Though this is not normally a problem for a neural network designed for a specific task, such a bound is undesirable for a system that continually learns over an open range of domains. To address this, we describe a lifelong learning system that leverages a fast, though non-differentiable, content-addressable memory which can be exploited to encode both a long history of sequential episodic knowledge and semantic knowledge over many episodes for an unbounded number of domains. This opens the door for investigation into transfer learning, and leveraging prior knowledge that has been learned over a lifetime of experiences to new domains.
- Oct 21 2016 cs.LO arXiv:1610.06399v1We show that every (finite or not) typing derivation of system M, using non-idempotent intersection, which is the infinitary version of de Carvalho's system M_0 , can be represented in a rigid, non-idempotent intersection type system S. Namely, whereas non-idempotent intersection is represented by multisets in system M, system S resort to families of types indexed by integers, called tracks. Intersion is said to be sequential. The rigidity is here related to the fact that those indexes matter as well as the order in which the types are quoted in a familly of types.
- Oct 21 2016 cs.SE arXiv:1610.06371v1Precisely modeling complex systems like cyber-physical systems is often challenging, which may render model-based system verification techniques like model checking infeasible. To overcome this challenge, we propose a method called LAR to `verify' such complex systems through a combination of learning, abstraction and refinement. Instead of starting with system modeling, our method takes a set of concrete system traces as input. The output is either a counterexample with a bounded probability of being a spurious counterexample, or a probabilistic model based on which the given property is `verified'. The model could be viewed as a proof obligation, i.e., the property is verified if the model is correct. It can also be used for subsequent system analysis activities like runtime monitoring. Our method has been implemented as a self-contained software toolkit. The evaluation on multiple benchmark systems as well as a real-world water purification system show promising results.
- Assisted text input techniques can save time and effort and improve text quality. In this paper, we investigate how grounded and conditional extensions to standard neural language models can bring improvements in the tasks of word prediction and completion. These extensions incorporate a structured knowledge base and numerical values from the text into the context used to predict the next word. Our automated evaluation on a clinical dataset shows extended models significantly outperform standard models. Our best system uses both conditioning and grounding, because of their orthogonal benefits. For word prediction with a list of 5 suggestions, it improves recall from 25.03% to 71.28% and for word completion it improves keystroke savings from 34.35% to 44.81%, where theoretical bound for this dataset is 58.78%. We also perform a qualitative investigation of how models with lower perplexity occasionally fare better at the tasks. We found that at test time numbers have more influence on the document level than on individual word probabilities.
- Oct 21 2016 cs.CV arXiv:1610.06368v1Natural images contain often curvilinear structures, which might be disconnected, or partly occluded. Recovering the missing connection of disconnected structures is an open issue and needs appropriate geometric reasoning. We propose to find line co-occurrence statistics from the centerlines of blood vessels in retinal images and show its remarkable similarity to a well-known probabilistic model for the connectivity pattern in the primary visual cortex. Furthermore, the probabilistic model is trained from the data via statistics and used for automated grouping of interrupted vessels in a spectral clustering based approach. Several challenging image patches are investigated around junction points, where successful results indicate the perfect match of the trained model to the profiles of blood vessels in retinal images. Also, comparisons among several statistical models obtained from different datasets reveals their high similarity i.e., they are independent of the dataset. On top of that, the best approximation of the statistical model with the symmetrized extension of the probabilistic model on the projective line bundle is found with a least square error smaller than 2%. Apparently, the direction process on the projective line bundle is a good continuation model for vessels in retinal images.
- Oct 21 2016 cs.LO arXiv:1610.06362v1We investigate some subtle issues that arise when programming distributed computations over infinite data structures. To do this, we formalise a calculus that combines a call-by-name functional core with session-based communication primitives and that allows session operations to be performed "on demand". We develop a typing discipline that guarantees both normalisation of expressions and progress of processes and that uncovers an unexpected interplay between evaluation and communication.
- Linear codes are widely employed in communication systems, consumer electronics, and storage devices. All linear codes over finite fields can be generated by a generator matrix. Due to this, the generator matrix approach is called a fundamental construction of linear codes. This is the only known construction method that can produce all linear codes over finite fields. Recently, a defining-set construction of linear codes over finite fields has attracted a lot of attention, and have been employed to produce a huge number of classes of linear codes over finite fields. It was claimed that this approach can also generate all linear codes over finite fields. But so far, no proof of this claim is given in the literature. The objective of this paper is to prove this claim, and confirm that the defining-set approach is indeed a fundamental approach to constructing all linear codes over finite fields. As a byproduct, a trace representation of all linear codes over finite fields is presented.
- Oct 21 2016 cs.MM arXiv:1610.06347v1Image Forensics has already achieved great results for the source camera identification task on images. Standard approaches for data coming from Social Network Platforms cannot be applied due to different processes involved (e.g., scaling, compression, etc.). Over 1 billion images are shared each day on the Internet and obtaining information about their history from the moment they were acquired could be exploited for investigation purposes. In this paper, a classification engine for the reconstruction of the history of an image, is presented. Specifically, exploiting K-NN and decision trees classifiers and a-priori knowledge acquired through image analysis, we propose an automatic approach that can understand which Social Network Platform has processed an image and the software application used to perform the image upload. The engine makes use of proper alterations introduced by each platform as features. Results, in terms of global accuracy on a dataset of 2720 images, confirm the effectiveness of the proposed strategy.
- Oct 21 2016 cond-mat.stat-mech quant-ph arXiv:1610.06316v1Many studies of quantum-size heat engines assume that the dynamics of an internal system is unitary and that the extracted work is equal to the energy loss of the internal system. Both assumptions, however, should be under scrutiny. In the present paper, we analyze quantum-scale heat engines, employing the measurement-based formulation of the work extraction recently introduced by Hayashi and Tajima~[arXiv:1504.06150]. We first demonstrate the inappropriateness of the unitary time evolution of the internal system (namely the first assumption above) using a simple two-level system; we show that the variance of the energy transferred to an external system diverges when the dynamics of the internal system is approximated to a unitary time evolution. We second derive the quantum Jarzynski equality based on the formulation of Hayashi and Tajima as a relation for the work measured by an external macroscopic apparatus. The right-hand side of the equality reduces to unity for "natural" cyclic processes, but fluctuates wildly for non-cyclic ones, exceeding unity often. This fluctuation should be detectable in experiments and provide evidence for the present formulation.
- Oct 21 2016 math.LO arXiv:1610.06289v1The epsilon operator is a term-forming operator which replaces quantifiers in ordinary predicate logic. The application of this undervalued formalism has been hampered by the absence of well-behaved proof systems on the one hand, and accessible presentations of its theory on the other. One significant early result for the original axiomatic proof system for the epsilon-calculus is the first epsilon theorem, for which a proof is sketched. The system itself is discussed, also relative to possible semantic interpretations. The problems facing the development of proof-theoretically well-behaved systems are outlined.
- Oct 21 2016 astro-ph.SR arXiv:1610.06285v1Stellar astronomy, fueled by massive capital investments, advances in numerical modeling and theory, is resurgent and arguably is on the verge of a magnificent renaissance. Powerful time domain optical surveys, both on ground and in space, are producing data on variable stars on an unprecedented industrial scale. Those with deep knowledge of variable stars will stand to benefit from this resurgence. Notwithstanding these developments, in some astronomical communities, classical stellar astronomy has been in the doldrums. I offer a modest proposal to establish a basic level of familiarity with variable stellar phenomenology and an attractive scheme to make research in variable star astronomy visible, alluring and fashionable.
- Present day machine learning is computationally intensive and processes large amounts of data. It is implemented in a distributed fashion in order to address these scalability issues. The work is parallelized across a number of computing nodes. It is usually hard to estimate in advance how many nodes to use for a particular workload. We propose a simple framework for estimating the scalability of distributed machine learning algorithms. We measure the scalability by means of the speedup an algorithm achieves with more nodes. We propose time complexity models for gradient descent and graphical model inference. We validate our models with experiments on deep learning training and belief propagation. This framework was used to study the scalability of machine learning algorithms in Apache Spark.
- In this comment, we criticize three main conclusions of the letter\citeLee2016. We show that the concept of fractional winding number(FWN) is factitious, Lee's conclusions on Fig. 3 are finite-size effect and the breakdown of bulk-boundary correspondence (BBBC) cannot be explained by "defective".
- Oct 21 2016 cs.CL arXiv:1610.06272v1With the advent of word embeddings, lexicons are no longer fully utilized for sentiment analysis although they still provide important features in the traditional setting. This paper introduces a novel approach to sentiment analysis that integrates lexicon embeddings and an attention mechanism into Convolutional Neural Networks. Our approach performs separate convolutions for word and lexicon embeddings and provides a global view of the document using attention. Our models are experimented on both the SemEval'16 Task 4 dataset and the Stanford Sentiment Treebank, and show comparative or better results against the existing state-of-the-art systems. Our analysis shows that lexicon embeddings allow to build high-performing models with much smaller word embeddings, and the attention mechanism effectively dims out noisy words for sentiment analysis.
- Oct 21 2016 cs.DB arXiv:1610.06264v1We survey foundational features underlying modern graph query languages. We first discuss two popular graph data models: edge-labelled graphs, where nodes are connected to other nodes by directed, labelled edges; and property graphs, where nodes and edges can have attributes. Next we discuss the two most basic graph querying functionalities: graph patterns and navigational expressions. We start with graph patterns, in which a graph-structured query is matched against the data. Thereafter we discuss navigational expressions, in which patterns can be matched recursively against the graph to navigate paths of arbitrary length; we give an overview of what kinds of expressions have been proposed, and how such expressions can be combined with graph patterns. We also discuss a variety of semantics under which queries using the previous features can be evaluated, what effects the introduction of additional features and the selection of semantics has on complexity, as well as offering examples of said features in three modern languages that can be used to query graphs: SPARQL, Cypher and Gremlin. We conclude with discussion of the importance of formalisation for graph query languages, as well as possible future directions in which such languages can be extended.
- We consider the geometric properties of Hodge Cousin groups, introduced in an unpublished paper \citeOVV, emphasizing the case of Hodge Cousin groups corresponding to polarized $\mathbb{Q}$-Hodge structures. Basing on this consideration, we introduce the class of abelian Cousin groups and prove an analogue of Poincaré complete reducibility theorem for them.
- Oct 21 2016 math.CO arXiv:1610.06262v1A Latin square is reduced if its first row and column are in natural order. For Latin squares of a particular order $n$ there are four possible different parities. We confirm a conjecture of Stones and Wanless by showing asymptotic equality between the numbers of reduced Latin squares of each possible parity as the order $n\rightarrow\infty$.
- Until recently, research on artificial neural networks was largely restricted to systems with only two types of variable: Neural activities that represent the current or recent input and weights that learn to capture regularities among inputs, outputs and payoffs. There is no good reason for this restriction. Synapses have dynamics at many different time-scales and this suggests that artificial neural networks might benefit from variables that change slower than activities but much faster than the standard weights. These "fast weights" can be used to store temporary memories of the recent past and they provide a neurally plausible way of implementing the type of attention to the past that has recently proved very helpful in sequence-to-sequence models. By using fast weights we can avoid the need to store copies of neural activity patterns.
- The topological (or graph) structures of real-world networks are known to be predictive of multiple dynamic properties of the networks. Conventionally, a graph structure is represented using an adjacency matrix or a set of hand-crafted structural features. These representations either fail to highlight local and global properties of the graph or suffer from a severe loss of structural information. There lacks an effective graph representation, which hinges the realization of the predictive power of network structures. In this study, we propose to learn the represention of a graph, or the topological structure of a network, through a deep learning model. This end-to-end prediction model, named DeepGraph, takes the input of the raw adjacency matrix of a real-world network and outputs a prediction of the growth of the network. The adjacency matrix is first represented using a graph descriptor based on the heat kernel signature, which is then passed through a multi-column, multi-resolution convolutional neural network. Extensive experiments on five large collections of real-world networks demonstrate that the proposed prediction model significantly improves the effectiveness of existing methods, including linear or nonlinear regressors that use hand-crafted features, graph kernels, and competing deep learning methods.