- Suppose a large scale quantum computer becomes available over the Internet. Could we delegate universal quantum computations to this server, using only classical communication between client and server, in a way that is information-theoretically blind (i.e., the server learns nothing about the input apart from its size, with no cryptographic assumptions required)? In this paper we give strong indications that the answer is no. This contrasts with the situation where quantum communication between client and server is allowed --- where we know that such information-theoretically blind quantum computation is possible. It also contrasts with the case where cryptographic assumptions are allowed: there again, it is now known that there are quantum analogues of fully homomorphic encryption. In more detail, we observe that, if there exist information-theoretically secure classical schemes for performing universal quantum computations on encrypted data, then we get unlikely containments between complexity classes, such as ${\sf BQP} \subset {\sf NP/poly}$. Moreover, we prove that having such schemes for delegating quantum sampling problems, such as Boson Sampling, would lead to a collapse of the polynomial hierarchy. We then consider encryption schemes which allow one round of quantum communication and polynomially many rounds of classical communication, yielding a generalization of blind quantum computation. We give a complexity theoretic upper bound, namely ${\sf QCMA/qpoly} \cap {\sf coQCMA/qpoly}$, on the types of functions that admit such a scheme. This upper bound then lets us show that, under plausible complexity assumptions, such a protocol is no more useful than classical schemes for delegating ${\sf NP}$-hard problems to the server. Lastly, we comment on the implications of these results for the prospect of verifying a quantum computation through classical interaction with the server.
- Quantum computing is rapidly approaching the point of technological deployment. Functional quantum devices will require the ability to correct errors in order to be scalable and effective. A leading choice of error correction, in particular for modular or distributed architectures, is the surface code with logical two-qubit operations realised via "lattice surgery". These operations consist of "merges" and "splits" acting non-unitarily on the logical states and are not easily captured by standard circuit notation. This raises the question of how best to reason about lattice surgery in order to use quantum states and operations efficiently in architectures with complex resource management issues. In this paper we demonstrate that the operations of the ZX calculus, a form of quantum diagrammatic reasoning designed using category theory, match exactly the operations of lattice surgery. Red and green "spider" nodes match rough and smooth merges and splits, and follow the axioms of a dagger special associative Frobenius algebra. Some lattice surgery operations can require non-trivial correction operations, which are captured natively in the use of the ZX calculus in the form of ensembles of diagrams. We give a first taste of the power of the calculus as a language for surgery by considering two operations (magic state use and producing a CNOT) and show how ZX diagram re-write rules give lattice surgery procedures for these operations that are novel, efficient, and highly configurable.
- Let $V=\bigotimes_{k=1}^{N} V_{k}$ be the $N$ spin-$j$ Hilbert space with single-particle spaces of dimension $d=2j+1$. We fix an orthonormal basis $\{|m_i\rangle\}$ for each $V_{k}$, with weight $m_i\in \{-j,\ldots,j\}$. Let $V_{(w)}$ be the subspace of $V$ with constant weight $w$, with an orthonormal basis $\{|m_1,\ldots,m_N\rangle\}$ subject to $\sum_k m_k=w$. We show that the combinatorial properties of the constant weight condition impose strong constraints on the reduced density matrices of any vector $|\psi\rangle$ in the constant weight subspace, which limit the possible entanglement structures of $|\psi\rangle$. Our results find applications in the overlapping quantum marginal problems, quantum error-correcting codes, and the spin-network structures in quantum gravity.
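A minimal numerical sketch (ours, with assumed example values $N=4$, $j=1$, $w=0$) of the setting above: build the constant-weight basis, draw a random state in $V_{(w)}$, and observe one structural constraint, namely that single-site reduced density matrices are forced to be diagonal in the weight basis.

```python
import itertools
import numpy as np

N, j, w = 4, 1, 0                       # assumed example values
weights = np.arange(-j, j + 1)          # single-site weights {-j, ..., j}
d = len(weights)                        # single-particle dimension d = 2j + 1

# Basis of V_(w): product states |m_1,...,m_N> with sum_k m_k = w
basis = [m for m in itertools.product(range(d), repeat=N)
         if np.isclose(weights[list(m)].sum(), w)]

rng = np.random.default_rng(0)
amps = rng.normal(size=len(basis)) + 1j * rng.normal(size=len(basis))
amps /= np.linalg.norm(amps)            # random |psi> in V_(w)

# Reduced density matrix of site 0: trace out sites 1..N-1
rho = np.zeros((d, d), dtype=complex)
for a, ma in zip(amps, basis):
    for b, mb in zip(amps, basis):
        if ma[1:] == mb[1:]:
            rho[ma[0], mb[0]] += a * np.conj(b)

# The constant weight condition forces rho to be diagonal in the weight
# basis: off-diagonal terms would need m != m' with the same remainder sum.
print(np.round(rho, 3))
```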
- Apr 28 2017 quant-ph arXiv:1704.08668v1 We provide a new way to bound the security of quantum key distribution using only the diagrammatic behavior of complementary observables and essential uniqueness of purification for quantum channels. We begin by demonstrating a proof in the simplest case, where the eavesdropper doesn't noticeably disturb the channel at all and has no quantum memory. We then show how this case extends with almost no effort to account for quantum memory and noise.
- Apr 28 2017 quant-ph arXiv:1704.08314v1 We describe the hardware, gateware, and software developed at Raytheon BBN Technologies for dynamic quantum information processing experiments on superconducting qubits. In dynamic experiments, real-time qubit state information is fed back or fed forward within a fraction of the qubits' coherence time to dynamically change the implemented sequence. The hardware presented here covers both control and readout of superconducting qubits. For readout we created a custom signal processing gateware and software stack on commercial hardware to convert pulses in a heterodyne receiver into qubit state assignments with minimal latency, alongside data taking capability. For control, we developed custom hardware with gateware and software for pulse sequencing and steering information distribution that is capable of arbitrary control flow within a fraction of superconducting qubit coherence times. Both readout and control platforms make extensive use of FPGAs to enable tailored qubit control systems in a reconfigurable fabric suitable for iterative development.
- We study the strong duality of non-convex matrix factorization: we show that, under certain dual conditions, non-convex matrix factorization and its dual have the same optimum. This has been well understood for convex optimization, but little was known for matrix factorization. We formalize the strong duality of matrix factorization through a novel analytical framework, and show that the duality gap is zero for a wide class of matrix factorization problems. Although matrix factorization problems are hard to solve in full generality, under certain conditions the optimal solution of the non-convex program is the same as its bi-dual, and we can achieve global optimality of the non-convex program by solving its bi-dual. We apply our framework to matrix completion and robust Principal Component Analysis (PCA). While a long line of work has studied these problems, for basic problems in this area such as matrix completion, the information-theoretically optimal sample complexity was not known, and the sample complexity bounds are even larger if one also requires computational efficiency. In this work, we show that exact recoverability and strong duality hold with optimal sample complexity guarantees for matrix completion, and nearly-optimal guarantees for exact recoverability of robust PCA. For matrix completion, under the standard incoherence assumption that the underlying rank-$r$ matrix $X^*\in \mathbb{R}^{n\times n}$ with skinny SVD $U \Sigma V^T$ has $\max\{\|U^Te_i\|_2^2, \|V^Te_i\|_2^2\} \leq \frac{\mu r}{n}$ for all $i$, to the best of our knowledge we give (1) the first non-efficient algorithm achieving the optimal $O(\mu n r \log n)$ sample complexity, and (2) the first efficient algorithm achieving $O(\kappa^2\mu n r \log n)$ sample complexity, which matches the known $\Omega(\mu n r \log n)$ information-theoretic lower bound for constant condition number $\kappa$.
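As a quick illustration of the incoherence condition quoted above, the following numpy sketch (ours, with arbitrary sizes) computes the smallest $\mu$ for which a random rank-$r$ matrix satisfies $\max\{\|U^Te_i\|_2^2, \|V^Te_i\|_2^2\} \leq \frac{\mu r}{n}$:

```python
import numpy as np

n, r = 200, 5
rng = np.random.default_rng(1)
X = rng.normal(size=(n, r)) @ rng.normal(size=(r, n))   # rank-r matrix X*

U, s, Vt = np.linalg.svd(X, full_matrices=False)
U, V = U[:, :r], Vt[:r, :].T                            # skinny SVD factors

row_norms_U = (U ** 2).sum(axis=1)      # ||U^T e_i||_2^2 for each i
row_norms_V = (V ** 2).sum(axis=1)      # ||V^T e_i||_2^2 for each i

# Smallest mu satisfying the bound; the guarantees scale as O(mu * n * r * log n)
mu = n / r * max(row_norms_U.max(), row_norms_V.max())
print(f"incoherence parameter mu ~= {mu:.2f}")
```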
- Apr 28 2017 quant-ph arXiv:1704.08525v1 There is a long history of representing a quantum state using a quasi-probability distribution: a distribution allowing negative values. In this paper we extend such representations to deal with quantum channels. The result is a convex, strongly monoidal, functorial embedding of the category of trace preserving completely positive maps into the category of quasi-stochastic matrices. This establishes quantum theory as a subcategory of quasi-stochastic processes. Such an embedding is induced by a choice of minimal informationally complete POVMs. We show that any two such embeddings are naturally isomorphic. The embedding preserves the dagger structure of the categories if and only if the POVMs are symmetric, giving a new use of SIC-POVMs. We also study general convex embeddings of quantum theory and prove a dichotomy that such an embedding is either trivial or faithful. The results of this paper allow a clear explanation of the characteristic features of quantum mechanics coming from being epistemically restricted (no-cloning, teleportation) and having negative probabilities (Bell inequalities, computational speed-up).
- Apr 28 2017 math.CO arXiv:1704.08661v1 When considering binary strings, it's natural to wonder how many distinct subsequences might exist in a given string. Given that there is an existing algorithm which provides a straightforward way to compute the number of distinct subsequences in a fixed string, we might next be interested in the expected number of distinct subsequences in random strings. This expected value is already known for random binary strings where each letter in the string is, independently, equally likely to be a 1 or a 0. We generalize this result to random strings where the letter 1 appears independently with probability $\alpha \in [0,1]$. Also, we make some progress in the case of random strings from an arbitrary alphabet as well as when the string is generated by a two-state Markov chain.
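The "existing algorithm" for a fixed string is the standard dynamic program; a short sketch (conventions vary in the literature — here the empty subsequence is included):

```python
import random

def count_distinct_subsequences(s: str) -> int:
    """Standard DP: counts distinct subsequences of s, empty one included."""
    total = 1                       # just the empty subsequence so far
    last = {}                       # count recorded when each letter last appeared
    for ch in s:
        new_total = 2 * total - last.get(ch, 0)   # subtract double-counted ones
        last[ch] = total
        total = new_total
    return total

# A random string in the generalized model: each letter is 1 with prob. alpha = 0.3
s = "".join("1" if random.random() < 0.3 else "0" for _ in range(20))
print(s, count_distinct_subsequences(s))
```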
- Reversible logic has two main properties. First, the number of inputs is equal to the number of outputs. Second, it implements a one-to-one mapping; i.e., one can reconstruct the inputs from the outputs. These properties enable its applications in building quantum computing architectures. In this paper we study reverse engineering of reversible logic circuits, including reverse engineering of non-reversible functions embedded into reversible circuits. We propose the number of embeddings of non-reversible functions into a reversible circuit as the security metric for reverse engineering. We analyze the security benefits of automatic synthesis of reversible circuits. We use our proposed security metric to show that the functional synthesis approaches yield reversible circuits that are more resilient to reverse engineering than the structural synthesis approaches. Finally, we propose scrambling of the inputs and outputs of a reversible circuit to thwart reverse engineering.
- Apr 28 2017 cs.CV arXiv:1704.08345v1 Existing zero-shot learning (ZSL) models typically learn a projection function from a feature space to a semantic embedding space (e.g. attribute space). However, such a projection function is only concerned with predicting the semantic representations of the seen training classes (e.g. attribute prediction) or with classification. When applied to test data, which in the context of ZSL contains different (unseen) classes without training data, a ZSL model typically suffers from the projection domain shift problem. In this work, we present a novel solution to ZSL based on learning a Semantic AutoEncoder (SAE). Taking the encoder-decoder paradigm, an encoder aims to project a visual feature vector into the semantic space as in the existing ZSL models. However, the decoder exerts an additional constraint, that is, the projection/code must be able to reconstruct the original visual feature. We show that with this additional reconstruction constraint, the learned projection function from the seen classes is able to generalise better to the new unseen classes. Importantly, the encoder and decoder are linear and symmetric, which enables us to develop an extremely efficient learning algorithm. Extensive experiments on six benchmark datasets demonstrate that the proposed SAE significantly outperforms the existing ZSL models with the additional benefit of lower computational cost. Furthermore, when the SAE is applied to the supervised clustering problem, it also beats the state-of-the-art.
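A hedged sketch of the kind of linear, tied-weight autoencoder described above. The objective below, $\min_W \|X - W^\top S\|_F^2 + \lambda\|WX - S\|_F^2$, is our reading of the abstract, not verbatim from the paper; setting its gradient to zero yields a Sylvester equation, which gives an efficient closed-form solve. Shapes and $\lambda$ are placeholders.

```python
import numpy as np
from scipy.linalg import solve_sylvester

d_feat, d_sem, n = 512, 85, 1000
rng = np.random.default_rng(0)
X = rng.normal(size=(d_feat, n))        # visual features, one column per image
S = rng.normal(size=(d_sem, n))         # semantic (e.g. attribute) vectors
lam = 0.2

# Zero gradient gives  (S S^T) W + W (lam X X^T) = (1 + lam) S X^T
A = S @ S.T
B = lam * (X @ X.T)
C = (1 + lam) * (S @ X.T)
W = solve_sylvester(A, B, C)            # encoder; the tied decoder is W.T

S_hat = W @ X                           # project features into semantic space
print(S_hat.shape)                      # (85, 1000)
```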
- End-to-end learning refers to training a possibly complex learning system by applying gradient-based learning to the system as a whole. An end-to-end learning system is specifically designed so that all modules are differentiable. In effect, not only a central learning machine, but also all "peripheral" modules like representation learning and memory formation are covered by a holistic learning process. The power of end-to-end learning has been demonstrated on many tasks, like playing a whole array of Atari video games with a single architecture. While pushing for solutions to more challenging tasks, network architectures keep growing more and more complex. In this paper we ask the question whether and to what extent end-to-end learning is a future-proof technique in the sense of scaling to complex and diverse data processing architectures. We point out potential inefficiencies, and we argue in particular that end-to-end learning does not make optimal use of the modular design of present neural networks. Our surprisingly simple experiments demonstrate these inefficiencies, up to the complete breakdown of learning.
- Apr 28 2017 cs.CV arXiv:1704.08686v1 We introduce a new framework for learning dense correspondence between deformable 3D shapes. Existing learning-based approaches model shape correspondence as a labelling problem, where each point of a query shape receives a label identifying a point on some reference domain; the correspondence is then constructed a posteriori by composing the label predictions of two input shapes. We propose a paradigm shift and design a structured prediction model in the space of functional maps, linear operators that provide a compact representation of the correspondence. We model the learning process via a deep residual network which takes dense descriptor fields defined on two shapes as input, and outputs a soft map between the two given objects. The resulting correspondence is shown to be accurate on several challenging benchmarks comprising multiple categories, synthetic models, real scans with acquisition artifacts, topological noise, and partiality.
- This paper is the first of three chapters of the author's undergraduate thesis. We study the random matrix ensemble of covariance matrices arising from random $(d_b, d_w)$-regular bipartite graphs on a set of $M$ black vertices and $N$ white vertices, for $d_b \gg \log^4 N$. We simultaneously prove that the Green's functions of these covariance matrices and the adjacency matrices of the underlying graphs agree with the corresponding limiting law (e.g. the Marchenko-Pastur law for covariance matrices) down to the optimal scale. This is an improvement over the previously known mesoscopic results. As a consequence, we obtain eigenvector delocalization for the covariance matrix ensemble, as well as a weak rigidity estimate.
- The theoretical description of non-renewal stochastic systems is a challenge. Analytical results are often not available or can only be obtained under strong conditions, limiting their applicability. Also, numerical results have mostly been obtained by ad hoc Monte Carlo simulations, which are usually computationally expensive when a high degree of accuracy is needed. To gain quantitative insight into these systems under general conditions, we here introduce a numerical iterated first-passage time approach based on solving the time-dependent Fokker-Planck equation (FPE) to describe the statistics of non-renewal stochastic systems. We illustrate the approach using spike-triggered neuronal adaptation in the leaky and perfect integrate-and-fire models, respectively. The transition to stationarity of first-passage time moments and their sequential correlations occurs on a non-trivial timescale that depends on all system parameters. Surprisingly, this is so for both single exponential and scale-free power-law adaptation. The method works beyond the small noise and timescale separation approximations. It shows excellent agreement with direct Monte Carlo simulations and allows for the computation of transient and stationary distributions. We compare different methods to compute the evolution of the moments and serial correlation coefficients (SCC), and discuss the challenge of reliably computing the SCC, which we find to be very sensitive to numerical inaccuracies for both the leaky and perfect integrate-and-fire models. In conclusion, our methods provide a general picture of non-renewal dynamics in a wide range of stochastic systems exhibiting short- and long-range correlations.
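For contrast with the paper's FPE-based approach, here is the kind of direct Monte Carlo baseline it is validated against: a leaky integrate-and-fire neuron with spike-triggered adaptation, from which interspike-interval (ISI) serial correlation coefficients are estimated. All parameter values below are illustrative placeholders, and the run is kept short for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
dt, mu, tau_a, delta_a, D = 5e-4, 2.0, 0.5, 0.3, 0.1   # placeholder parameters

def simulate_isis(n_spikes=500):
    """Leaky integrate-and-fire with a spike-triggered adaptation current a."""
    v, a, t, t_last, isis = 0.0, 0.0, 0.0, 0.0, []
    while len(isis) < n_spikes:
        v += dt * (mu - v - a) + np.sqrt(2 * D * dt) * rng.normal()
        a -= dt * a / tau_a                 # exponential decay of adaptation
        t += dt
        if v >= 1.0:                        # threshold crossing: spike
            v = 0.0                         # voltage reset
            a += delta_a                    # spike-triggered adaptation jump
            isis.append(t - t_last)
            t_last = t
    return np.array(isis)

isis = simulate_isis()
x = isis - isis.mean()
scc = [x[:-k] @ x[k:] / (x @ x) for k in range(1, 6)]
print("SCC at lags 1..5:", np.round(scc, 3))
# Adaptation makes successive ISIs negatively correlated at lag 1 -- the
# non-renewal signature whose transient the paper tracks via iterated FPE solves.
```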
- Apr 28 2017 cs.CV arXiv:1704.08628v1 Text line detection and localization is a crucial step for full page document analysis, but still suffers from the heterogeneity of real life documents. In this paper, we present a new approach for full page text recognition. Localization of the text lines is based on regressions with Fully Convolutional Neural Networks and Multidimensional Long Short-Term Memory as contextual layers. In order to increase the efficiency of this localization method, only the position of the left side of each text line is predicted. The text recognizer is then in charge of predicting where the text to recognize ends. This method has shown good results for full page text recognition on the highly heterogeneous Maurdor dataset.
- Apr 28 2017 physics.soc-ph cond-mat.dis-nn arXiv:1704.08626v1 We study the extinction of long-lived epidemics on finite complex networks induced by intrinsic noise. Applying analytical techniques to the stochastic Susceptible-Infected-Susceptible model, we predict the distribution of large fluctuations, the most probable, or optimal, path through a network that leads to a disease-free state from an endemic state, and the average extinction time in general configurations. Our predictions agree with Monte Carlo simulations on several networks, including synthetic weighted and degree-distributed networks with degree correlations, and an empirical high school contact network. In addition, our approach quantifies characteristic scaling patterns for the optimal path and distribution of large fluctuations, both near and away from the epidemic threshold, in networks with heterogeneous eigenvector centrality and degree distributions.
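A toy Gillespie simulation (ours, not the paper's analytics) of stochastic SIS dynamics on a network, measuring extinction times. Parameters are placeholders set below the epidemic threshold so that runs terminate quickly; the paper's rare-event analysis targets the long-lived endemic regime, where extinction times grow exponentially.

```python
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)
G = nx.barabasi_albert_graph(200, 3, seed=0)
beta, mu = 0.05, 1.0          # infection rate per S-I edge, recovery rate

def extinction_time():
    infected = set(rng.choice(G.number_of_nodes(), size=20, replace=False).tolist())
    t = 0.0
    while infected:
        si_edges = [(u, v) for u in infected for v in G[u] if v not in infected]
        total_rate = mu * len(infected) + beta * len(si_edges)
        t += rng.exponential(1.0 / total_rate)
        if rng.random() < mu * len(infected) / total_rate:
            victim = list(infected)[rng.integers(len(infected))]
            infected.remove(victim)                                 # recovery event
        else:
            infected.add(si_edges[rng.integers(len(si_edges))][1])  # infection event
    return t

times = [extinction_time() for _ in range(20)]
print("mean extinction time:", round(float(np.mean(times)), 2))
```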
- Automatic affect recognition is a challenging task due to the various modalities with which emotions can be expressed. Applications can be found in many domains including multimedia retrieval and human computer interaction. In recent years, deep neural networks have been used with great success in determining emotional states. Inspired by this success, we propose an emotion recognition system using auditory and visual modalities. To capture the emotional content of various styles of speaking, robust features need to be extracted. To this end, we utilize a Convolutional Neural Network (CNN) to extract features from the speech, while for the visual modality we use a deep residual network (ResNet) of 50 layers. In addition to the importance of feature extraction, a machine learning algorithm also needs to be insensitive to outliers while being able to model the context. To tackle this problem, Long Short-Term Memory (LSTM) networks are utilized. The system is then trained in an end-to-end fashion where - by also taking advantage of the correlations of each of the streams - we manage to significantly outperform traditional approaches based on auditory and visual handcrafted features for the prediction of spontaneous and natural emotions on the RECOLA database of the AVEC 2016 research challenge on emotion recognition.
- Apr 28 2017 cs.CV arXiv:1704.08614v1 Computer vision systems are designed to work well within the context of everyday photography. However, artists often render the world around them in ways that do not resemble photographs. Artwork produced by people is not constrained to mimic the physical world, making it more challenging for machines to recognize. This work is a step toward teaching machines how to categorize images in ways that are valuable to humans. First, we collect a large-scale dataset of contemporary artwork from Behance, a website containing millions of portfolios from professional and commercial artists. We annotate Behance imagery with rich attribute labels for content, emotions, and artistic media. Furthermore, we carry out baseline experiments to show the value of this dataset for artistic style prediction, for improving the generality of existing object classifiers, and for the study of visual domain adaptation. We believe our Behance Artistic Media dataset will be a good starting point for researchers wishing to study artistic imagery and relevant problems.
- Apr 28 2017 quant-ph arXiv:1704.08600v1 We construct $d\times d$ dimensional bound entangled states, which, for any $d>2$, violate a bipartite Bell inequality introduced in this paper. We conjecture that the proposed class of Bell inequalities acts as a family of dimension witnesses for bound entangled states: for any $d>2$, there exists a Bell inequality from this class which can be violated with bound entangled states only if their Hilbert space dimension is at least $d\times d$. Numerical results support this conjecture up to $d=8$.
- Apr 28 2017 cs.DS arXiv:1704.08592v1 Finding central nodes is a fundamental problem in network analysis. Betweenness centrality is a well-known measure which quantifies the importance of a node based on the fraction of shortest paths going through it. Due to the dynamic nature of many of today's networks, algorithms that quickly update centrality scores have become a necessity. For betweenness, several dynamic algorithms have been proposed over the years, targeting different update types (incremental-only, decremental-only, fully-dynamic). In this paper we introduce a new dynamic algorithm for updating betweenness centrality after an edge insertion or an edge weight decrease. Our method is a combination of two independent contributions: a faster algorithm for updating pairwise distances as well as the number of shortest paths, and a faster algorithm for updating dependencies. Whereas the worst-case running time of our algorithm is the same as recomputation, our techniques considerably reduce the number of operations performed by existing dynamic betweenness algorithms.
- We introduce Lipschitz-Killing curvature (LKC) regression, a new method to produce $(1-\alpha)$ thresholds for signal detection in random fields that does not require knowledge of the spatial correlation structure. The idea is to fit observed empirical Euler characteristics to the Gaussian kinematic formula via generalized least squares, which quickly and easily provides statistical estimates of the LKCs --- complex topological quantities that can be extremely challenging to compute, both theoretically and numerically. With these estimates, we can then make use of a powerful parametric approximation via Euler characteristics for Gaussian random fields to generate accurate $(1-\alpha)$ thresholds and $p$-values. The main features of our proposed LKC regression method are easy implementation, conceptual simplicity, and facilitated diagnostics, which we demonstrate in a variety of simulations and applications.
- Apr 28 2017 cs.CV arXiv:1704.08545v1 We focus on the challenging task of real-time semantic segmentation in this paper. It has many practical applications, yet poses the fundamental difficulty that a large portion of the computation for pixel-wise label inference must be eliminated. We propose a compressed-PSPNet-based image cascade network (ICNet) that incorporates multi-resolution branches under proper label guidance to address this challenge. We provide an in-depth analysis of our framework and introduce cascade feature fusion to quickly achieve high-quality segmentation. Our system yields real-time inference on a single GPU card with decent quality results evaluated on the challenging Cityscapes dataset.
- Apr 28 2017 cs.CL arXiv:1704.08531v1 This paper aims to catalyze the discussions about text feature extraction techniques using neural network architectures. The research questions discussed in the paper focus on the state-of-the-art neural network techniques that have proven to be useful tools for language processing, language generation, text classification and other computational linguistics tasks.
- Despite the recent success of deep-learning based semantic segmentation, deploying a pre-trained road scene segmenter to a city whose images are not present in the training set would not achieve satisfactory performance due to dataset biases. Instead of collecting a large number of annotated images of each city of interest to train or refine the segmenter, we propose an unsupervised learning approach to adapt road scene segmenters across different cities. By utilizing Google Street View and its time-machine feature, we can collect unannotated images for each road scene at different times, so that the associated static-object priors can be extracted accordingly. By advancing a joint global and class-specific domain adversarial learning framework, adaptation of pre-trained segmenters to a given city can be achieved without the need for any user annotation or interaction. We show that our method improves the performance of semantic segmentation in multiple cities across continents, while it performs favorably against state-of-the-art approaches requiring annotated training data.
- This paper presents a novel deep neural network (DNN) based speech enhancement method that aims to enhance the magnitude and phase components of speech signals simultaneously. The novelty of the proposed method is two-fold. First, to avoid the difficulty of direct clean phase estimation, the proposed algorithm adopts real and imaginary (RI) spectrograms to prepare both input and output features. In this way, the clean phase spectrograms can be effectively estimated from the enhanced RI spectrograms. Second, based on the RI spectrograms, a multi-metrics learning (MML) criterion is derived to optimize multiple objective metrics simultaneously. Different from the concept of multi-task learning, which incorporates heterogeneous features in the output layers, the MML criterion uses an objective function that considers different representations of speech signals (RI spectrograms, log power spectrograms, and waveform) during the enhancement process. Experimental results show that the proposed method can notably outperform the conventional DNN-based speech enhancement system that enhances the magnitude spectrogram alone. Furthermore, the MML criterion can further improve some objective metrics without trading off other objective metric scores.
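A minimal sketch (our illustration, not the paper's network) of the RI-spectrogram representation: the STFT's real and imaginary parts serve as input/output features, and any enhanced RI pair determines both magnitude and phase on reconstruction. An identity mapping stands in for the DNN.

```python
import numpy as np
from scipy.signal import stft, istft

fs = 16000
t = np.arange(fs) / fs
noisy = np.sin(2 * np.pi * 440 * t) + 0.3 * np.random.default_rng(0).normal(size=fs)

_, _, Z = stft(noisy, fs=fs, nperseg=512)
ri_features = np.stack([Z.real, Z.imag])        # model input/output representation

# A DNN would map noisy RI features to enhanced ones; identity stands in here.
enhanced = ri_features[0] + 1j * ri_features[1]
_, recon = istft(enhanced, fs=fs, nperseg=512)

# Magnitude and phase are both implied by the enhanced RI pair:
mag, phase = np.abs(enhanced), np.angle(enhanced)
print(recon.shape, mag.shape, phase.shape)
```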
- Apr 28 2017 cs.SE arXiv:1704.08476v1 Version information plays an important role in spreadsheet understanding, maintenance, and quality improvement. However, end users rarely use version control tools to document spreadsheet version information. Thus, the spreadsheet version information is missing, and different versions of a spreadsheet coexist as individual and similar spreadsheets. Existing approaches try to recover spreadsheet version information by clustering these similar spreadsheets based on spreadsheet filenames or related email conversations. However, the applicability and accuracy of existing clustering approaches are limited because the necessary information (e.g., filenames and email conversations) is usually missing. We inspected the versioned spreadsheets in VEnron, which was extracted from the Enron corpus. In VEnron, the different versions of a spreadsheet are clustered into an evolution group. We observed that the versioned spreadsheets in each evolution group exhibit certain common features (e.g., similar table headers and worksheet names). Based on this observation, we proposed an automatic clustering algorithm, SpreadCluster. SpreadCluster learns the criteria of features from the versioned spreadsheets in VEnron, and then automatically clusters spreadsheets with similar features into the same evolution group. We applied SpreadCluster to all spreadsheets in the Enron corpus. The evaluation result shows that SpreadCluster can cluster spreadsheets with higher precision and recall than the filename-based approach used by VEnron. Based on the clustering result of SpreadCluster, we further created a new versioned spreadsheet corpus, VEnron2, which is much bigger than VEnron. We also applied SpreadCluster to two other spreadsheet corpora, FUSE and EUSES. The results show that SpreadCluster can cluster the versioned spreadsheets in these two corpora with high precision.
- Apr 28 2017 cs.DS arXiv:1704.08445v1 A novel landmark-based oracle (CFLAT) is presented, which provides earliest-arrival-time route plans in time-dependent road networks. To our knowledge, this is the first oracle that preprocesses combinatorial structures (collections of time-stamped min-travel-time-path trees) rather than travel-time functions. The preprocessed data structure is exploited by a new query algorithm (CFCA) which computes, in addition to earliest-arrival-time estimations, the actual connecting path that preserves the theoretical approximation guarantees, and accounts for the cost of constructing it. To make it practical and tackle the main burden of landmark-based oracles (their large preprocessing requirements), CFLAT is extensively engineered. A thorough experimental evaluation on two real-world benchmark instances shows that CFLAT achieves a significant improvement on preprocessing, approximation guarantees and query times, in comparison to previous landmark-based oracles, whose query algorithms do not account for path construction. It also achieves competitive query-time performance and approximation guarantees compared to state-of-the-art speedup heuristics for time-dependent road networks, whose query times in most cases do not account for path construction.
- The technique of hiding messages in digital data is called steganography. With improved sequencing techniques, increasing attempts have been made to hide messages in deoxyribonucleic acid (DNA) sequences, which have become a medium for steganography. Many detection schemes have been developed for conventional digital data, but these schemes are not applicable to DNA sequences because of DNA's complex internal structures. In this paper, we propose the first DNA steganalysis framework for detecting hidden messages and conduct an experiment based on the random oracle model. Among the suitable models for the framework, splice junction classification using deep recurrent neural networks (RNNs) is most appropriate for performing DNA steganalysis. In our DNA steganalysis approach, we extract the hidden layer composed of RNNs to model the internal structure of a DNA sequence. We provide security for steganography schemes based on mutual entropy and provide simulation results that illustrate how our model detects hidden messages, independent of regions of a targeted reference genome. We apply our method to human genome datasets and determine that hidden messages in DNA sequences with a minimum sample size of 100 are detectable, regardless of the presence of hidden regions.
- Apr 28 2017 physics.soc-ph cs.SI arXiv:1704.08437v1 In the modern era where highly-commodified cultural products compete heavily for mass consumption, finding the principles behind the complex process of how successful, "hit" products emerge remains a vital scientific goal that requires an interdisciplinary approach. Here we present a framework for tracing the cycle of prosperity-and-decline of a product to find insights into influential and potent factors that determine its success. As a rapid, high-throughput indicator of the preference of the public, popularity charts have emerged as a useful information source for finding the market performance patterns of products over time, which we call the on-chart life trajectories that show how the products enter the chart, fare inside it, and eventually exit from it. We propose quantitative parameters to characterise a life trajectory, and analyse a large-scale data set of nearly 7,000 songs from Gaon Chart, a major weekly Korean Pop (K-Pop) chart, covering a span of six years. We find that a significant role is played by non-musical extrinsic factors, such as the might of production companies and possibly an established fan base, in the on-chart success of songs, strongly indicative of the commodified nature of modern cultural products. We also discuss several nontrivial yet intriguing trajectories, which we call the "Late Bloomers" and the "Re-entrants", that appear to be strongly driven by serendipitous exposure in mass media and the changes of seasons.
- Apr 28 2017 cs.LG arXiv:1704.08432v1 Chemical-chemical interaction (CCI) plays a key role in predicting candidate drugs, toxicity, therapeutic effects, and biological functions. CCI data have been generated from text mining, experiments, similarities, and databases; to date, no learning-based CCI prediction method exists. In chemical analyses, computational approaches are required. The recent remarkable growth and outstanding performance of deep learning have attracted considerable research attention. However, even in state-of-the-art drug analyses, deep learning continues to be used only as a classifier. Nevertheless, its purpose includes not only simple classification, but also automated feature extraction. In this paper, we propose the first end-to-end learning method for CCI, named DeepCCI. Hidden features are derived from a simplified molecular input line entry system (SMILES), which is a string notation representing the chemical structure, instead of learning from crafted features. To discover hidden representations for the SMILES strings, we use convolutional neural networks (CNNs). To guarantee the commutative property for homogeneous interaction, we apply model sharing and hidden representation merging techniques. The performance of DeepCCI was compared with a plain deep classifier and conventional machine learning methods. The proposed DeepCCI showed the best performance in all seven evaluation metrics used. In addition, the commutative property was experimentally validated. The automatically extracted features through end-to-end SMILES learning alleviate the significant effort required for manual feature engineering and are expected to improve prediction performance in drug analyses.
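A hedged PyTorch sketch of the commutativity device described above: one shared encoder for both chemicals and a symmetric merge (elementwise sum), so the predicted interaction score is invariant to the order of the pair. The encoder, vocabulary, and sizes are our placeholders, not the DeepCCI architecture.

```python
import torch
import torch.nn as nn

class PairScorer(nn.Module):
    def __init__(self, vocab=64, emb=32, hid=128):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.conv = nn.Sequential(
            nn.Conv1d(emb, hid, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveMaxPool1d(1))
        self.head = nn.Sequential(nn.Linear(hid, 64), nn.ReLU(), nn.Linear(64, 1))

    def encode(self, smiles_ids):                      # (batch, seq) token ids
        x = self.embed(smiles_ids).transpose(1, 2)     # (batch, emb, seq)
        return self.conv(x).squeeze(-1)                # (batch, hid)

    def forward(self, a, b):
        # Elementwise sum is symmetric in (a, b), guaranteeing commutativity.
        merged = self.encode(a) + self.encode(b)
        return self.head(merged)

model = PairScorer()
a = torch.randint(0, 64, (8, 100))                     # token ids of SMILES strings
b = torch.randint(0, 64, (8, 100))
assert torch.allclose(model(a, b), model(b, a))        # order-invariant by construction
```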
- Apr 28 2017 cs.CL arXiv:1704.08430v1 Neural machine translation (NMT) heavily relies on an attention network to produce a context vector for each target word prediction. In practice, we find that context vectors for different target words are quite similar to one another and are therefore insufficient for discriminatively predicting target words. The reason for this might be that context vectors produced by the vanilla attention network are just a weighted sum of source representations that are invariant to decoder states. In this paper, we propose a novel GRU-gated attention model (GAtt) for NMT which enhances the degree of discrimination of context vectors by enabling source representations to be sensitive to the partial translation generated by the decoder. GAtt uses a gated recurrent unit (GRU) to combine two types of information: it treats a source annotation vector originally produced by the bidirectional encoder as the history state, and the corresponding previous decoder state as the input to the GRU. The GRU-combined information forms a new source annotation vector. In this way, we can obtain translation-sensitive source representations which are then fed into the attention network to generate discriminative context vectors. We further propose a variant that regards a source annotation vector as the current input and the previous decoder state as the history. Experiments on NIST Chinese-English translation tasks show that both GAtt-based models achieve significant improvements over the vanilla attention-based NMT. Further analyses of attention weights and context vectors demonstrate the effectiveness of GAtt in improving the discrimination power of representations and handling the challenging issue of over-translation.
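Our reading of the GAtt mechanism as a sketch: a GRU cell treats each source annotation as its hidden state and the previous decoder state as its input, producing translation-sensitive annotations before attention is computed. The dot-product scorer below is a simplification for brevity; the paper builds on the vanilla attention network.

```python
import torch
import torch.nn as nn

src_len, batch, d = 7, 2, 16
annotations = torch.randn(src_len, batch, d)   # from the bidirectional encoder
prev_dec_state = torch.randn(batch, d)

gru = nn.GRUCell(input_size=d, hidden_size=d)
# input = previous decoder state, hidden = source annotation (per source word)
refreshed = torch.stack([gru(prev_dec_state, annotations[i])
                         for i in range(src_len)])

# Attention over the refreshed, translation-sensitive annotations
scores = torch.einsum('sbd,bd->sb', refreshed, prev_dec_state)
weights = torch.softmax(scores, dim=0)
context = torch.einsum('sb,sbd->bd', weights, refreshed)
print(context.shape)                            # (batch, d) context vector
```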
- Apr 28 2017 cs.CL arXiv:1704.08390v1 This paper describes the Duluth system that participated in SemEval-2017 Task 6 #HashtagWars: Learning a Sense of Humor. The system participated in Subtasks A and B using N-gram language models, ranking highly in the task evaluation. This paper discusses the results of our system in the development and evaluation stages and from two post-evaluation runs.
- With the recent advancements in Artificial Intelligence (AI), various organizations and individuals have started debating whether the progress of AI is a blessing or a curse for the future of society. This paper conducts an investigation of how the public perceives the progress of AI by utilizing the data shared on Twitter. Specifically, this paper performs a comparative analysis of the understanding of users from two categories -- general AI-Tweeters (AIT) and expert AI-Tweeters (EAIT) -- who share posts about AI on Twitter. Our analysis revealed that users from both categories express distinct emotions and interests towards AI. Users from both categories regard AI as positive and are optimistic about the progress of AI, but the experts are more negative than the general AI-Tweeters. Characterization of users revealed that `London' is the most popular location from which users tweet about AI. Tweets posted by AIT are retweeted more than posts made by EAIT, which reveals greater diffusion of information from AIT.
- Apr 28 2017 cs.CL arXiv:1704.08387v1 We introduce a neural semantic parser which is interpretable and scalable. Our model converts natural language utterances to intermediate, domain-general natural language representations in the form of predicate-argument structures, which are induced with a transition system and subsequently mapped to target domains. The semantic parser is trained end-to-end using annotated logical forms or their denotations. We obtain competitive results on various datasets. The induced predicate-argument structures shed light on the types of representations useful for semantic parsing and how these are different from linguistically motivated ones.
- Apr 28 2017 cs.CL arXiv:1704.08384v1 Existing question answering methods infer answers either from a knowledge base or from raw text. While knowledge base (KB) methods are good at answering compositional questions, their performance is often affected by the incompleteness of the KB. In contrast, web text contains millions of facts that are absent in the KB, however in an unstructured form. Universal schema can support reasoning on the union of both structured KBs and unstructured text by aligning them in a common embedded space. In this paper we extend universal schema to natural language question answering, employing memory networks to attend to the large body of facts in the combination of text and KB. Our models can be trained in an end-to-end fashion on question-answer pairs. Evaluation results on the Spades fill-in-the-blank question answering dataset show that exploiting universal schema for question answering is better than using either a KB or text alone. This model also outperforms the current state-of-the-art by 8.5 $F_1$ points. Code and data are available at https://rajarshd.github.io/TextKBQA
- Genome-wide association studies (GWAS) have achieved great success in the genetic study of Alzheimer's disease (AD). Collaborative imaging genetics studies across different research institutions show the effectiveness of detecting genetic risk factors. However, the high dimensionality of GWAS data poses significant challenges in detecting risk SNPs for AD. Selecting relevant features is crucial in predicting the response variable. In this study, we propose a novel Distributed Feature Selection Framework (DFSF) to conduct large-scale imaging genetics studies across multiple institutions. To speed up the learning process, we propose a family of distributed group Lasso screening rules to identify irrelevant features and remove them from the optimization. Then we select the relevant group features by performing the group Lasso feature selection process over a sequence of parameters. Finally, we employ stability selection to rank the top risk SNPs that might help detect the early stage of AD. To the best of our knowledge, this is the first distributed feature selection model integrated with group Lasso feature selection and capable of detecting genetic risk factors across multiple research institutions. Empirical studies are conducted on 809 subjects with 5.9 million SNPs distributed across several institutions, demonstrating the efficiency and effectiveness of the proposed method.
- Apr 28 2017 cs.CV arXiv:1704.08382v1 This work introduces a novel framework for quantifying the presence and strength of recurrent dynamics in video data. Specifically, we provide continuous measures of periodicity (perfect repetition) and quasiperiodicity (superposition of periodic modes with non-commensurate periods), in a way which does not require segmentation, training, object tracking or 1-dimensional surrogate signals. Our methodology operates directly on video data. The approach combines ideas from nonlinear time series analysis (delay embeddings) and computational topology (persistent homology), by translating the problem of finding recurrent dynamics in video data, into the problem of determining the circularity or toroidality of an associated geometric space. Through extensive testing, we show the robustness of our scores with respect to several noise models/levels; we show that our periodicity score is superior to other methods when compared to human-generated periodicity rankings; and furthermore, we show that our quasiperiodicity score clearly indicates the presence of biphonation in videos of vibrating vocal folds.
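A small numpy sketch (ours) of the delay-embedding half of the pipeline. The paper operates directly on video; for brevity we embed a scalar toy signal, whose sliding-window point cloud traces a loop under periodicity and a torus under quasiperiodicity — the circularity/toroidality that persistent homology would then score.

```python
import numpy as np

t = np.linspace(0, 10 * np.pi, 2000)
signal = np.cos(t) + 0.5 * np.cos(np.sqrt(2) * t)    # quasiperiodic toy signal

def sliding_window(x, dim, tau):
    """Delay embedding: rows are (x_i, x_{i+tau}, ..., x_{i+(dim-1)tau})."""
    n = len(x) - (dim - 1) * tau
    return np.stack([x[i : i + n] for i in range(0, dim * tau, tau)], axis=1)

cloud = sliding_window(signal, dim=20, tau=5)
print(cloud.shape)   # point cloud; two non-commensurate modes trace out a torus
```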
- Apr 28 2017 cs.CL arXiv:1704.08381v1 Sequence-to-sequence models have shown strong performance across a broad range of applications. However, their application to parsing and generating text using Abstract Meaning Representation (AMR) has been limited, due to the relatively limited amount of labeled data and the non-sequential nature of the AMR graphs. We present a novel training procedure that can lift this limitation using millions of unlabeled sentences and careful preprocessing of the AMR graphs. For AMR parsing, our model achieves competitive results of 62.1 SMATCH, the current best score reported without significant use of external semantic resources. For AMR generation, our model establishes a new state-of-the-art performance of BLEU 33.8. We present extensive ablative and qualitative analysis including strong evidence that sequence-based AMR models are robust against ordering variations of graph-to-sequence conversions.
- Apr 28 2017 cs.MM arXiv:1704.08378v1 This paper presents an empirical study on applying convolutional neural networks (CNNs) to detecting J-UNIWARD, one of the most secure JPEG steganographic methods. Experiments guiding the architectural design of the CNNs have been conducted on the JPEG compressed BOSSBase containing 10,000 covers of size 512x512. Results have verified that both the pooling method and the depth of the CNNs are critical for performance. Results have also proved that a 20-layer CNN, in general, outperforms the most sophisticated feature-based methods, but its advantage gradually diminishes on hard-to-detect cases. To show that the performance generalizes to large-scale databases and to different cover sizes, one experiment has been conducted on the CLS-LOC dataset of ImageNet containing more than one million covers cropped to a unified size of 256x256. The proposed 20-layer CNN has cut the error achieved by a CNN recently proposed for large-scale JPEG steganalysis by 35%. Source code is available via GitHub: https://github.com/GuanshuoXu/deep_cnn_jpeg_steganalysis
- In machine learning, the use of artificial neural networks is the mainstream approach. Such a network consists of layers of neurons. These neurons are of the same type, characterized by two features: (1) an inner product of an input vector and a matching weighting vector of trainable parameters and (2) a nonlinear excitation function. Here we investigate the possibility of replacing the inner product with a quadratic function of the input vector, thereby upgrading the 1st order neuron to the 2nd order neuron, empowering individual neurons, and facilitating the optimization of neural networks. Also, numerical examples are provided to illustrate the feasibility and merits of the 2nd order neurons. Finally, further topics are discussed.
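A hedged sketch of a second-order neuron in the sense above, using the generic quadratic form $x^\top A x + w^\top x + b$ as the pre-activation; the paper's exact parameterization may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

class SecondOrderNeuron:
    def __init__(self, dim):
        self.A = 0.1 * rng.normal(size=(dim, dim))   # quadratic weights
        self.w = rng.normal(size=dim)                # linear weights
        self.b = 0.0

    def __call__(self, x):
        pre = x @ self.A @ x + self.w @ x + self.b   # quadratic pre-activation
        return np.tanh(pre)                          # nonlinear excitation

neuron = SecondOrderNeuron(4)
print(neuron(rng.normal(size=4)))
# A single such neuron can realize XOR-like (quadratic) decision boundaries
# that a first-order neuron cannot, which is one sense in which individual
# neurons are "empowered".
```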
- Apr 28 2017 cs.CL arXiv:1704.08352v1 Words can be represented by composing the representations of subword units such as word segments, characters, and/or character n-grams. While such representations are effective and may capture the morphological regularities of words, they have not been systematically compared, and it is not understood how they interact with different morphological typologies. On a language modeling task, we present experiments that systematically vary (1) the basic unit of representation, (2) the composition of these representations, and (3) the morphological typology of the language modeled. Our results extend previous findings that character representations are effective across typologies, and we find that a previously unstudied combination of character trigram representations composed with bi-LSTMs outperforms most others. But we also find room for improvement: none of the character-level models match the predictive accuracy of a model with access to true morphological analyses, even when learned from an order of magnitude more data.
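A sketch (ours) of the best-performing combination reported above: composing a word representation from its character trigrams with a bidirectional LSTM. Vocabulary handling, padding symbols, and sizes are placeholders.

```python
import torch
import torch.nn as nn

def char_trigrams(word):
    padded = f"#{word}#"                       # '#' as boundary marker (our choice)
    return [padded[i:i + 3] for i in range(len(padded) - 2)]

trigrams = char_trigrams("cats")               # ['#ca', 'cat', 'ats', 'ts#']
vocab = {t: i for i, t in enumerate(sorted(set(trigrams)))}

emb = nn.Embedding(len(vocab), 16)
bilstm = nn.LSTM(input_size=16, hidden_size=32, bidirectional=True, batch_first=True)

ids = torch.tensor([[vocab[t] for t in trigrams]])        # (1, num_trigrams)
out, (h, _) = bilstm(emb(ids))
word_vec = torch.cat([h[0], h[1]], dim=-1)                # (1, 64) word representation
print(word_vec.shape)
```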
- Apr 28 2017 hep-th astro-ph.CO arXiv:1704.08351v1 We study the likelihood that relative minima of random polynomial potentials support the slow-roll conditions for inflation. Consistent with renormalizability and boundedness, the coefficients that appear in the potential are chosen to be order one with respect to the energy scale at which inflation transpires. Investigation of the single field case illustrates a window in which the potentials satisfy the slow-roll conditions. When there are two scalar fields, we find that the probability depends on the choice of distribution for the coefficients. A uniform distribution yields a $0.05\%$ probability of finding a suitable minimum in the random potential, whereas a maximum entropy distribution yields a $0.1\%$ probability.
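A back-of-envelope numerical sketch (ours, in arbitrary units, not the paper's procedure) of the single-field experiment: draw order-one coefficients for a bounded quartic potential, locate its relative minima, and test the slow-roll parameters $\epsilon = \frac{1}{2}(V'/V)^2$ and $\eta = V''/V$ at a small displacement from each minimum. The displacement and tolerances are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def trial():
    c = rng.uniform(-1, 1, size=3)                 # order-one coefficients
    c4 = abs(rng.uniform(-1, 1))                   # positive quartic term: bounded below
    V = np.polynomial.Polynomial([0.0, c[0], c[1], c[2], c4])
    dV, d2V = V.deriv(), V.deriv(2)
    minima = [r.real for r in dV.roots()
              if abs(r.imag) < 1e-9 and d2V(r.real) > 0 and V(r.real) > 0]
    for phi in minima:
        p = phi + 0.1                              # arbitrary small displacement
        if V(p) <= 0:
            continue
        eps = 0.5 * (dV(p) / V(p)) ** 2            # slow-roll epsilon = (V'/V)^2 / 2
        eta = d2V(p) / V(p)                        # slow-roll eta = V''/V
        if eps < 1e-2 and abs(eta) < 1e-2:         # illustrative tolerance for "<< 1"
            return True
    return False

print("fraction of suitable potentials:", np.mean([trial() for _ in range(2000)]))
```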
- Apr 28 2017 cs.AI arXiv:1704.08350v1 Current measures of machine intelligence are either difficult to evaluate or lack the ability to test a robot's problem-solving capacity in open worlds. We propose a novel evaluation framework based on the formal notion of a MacGyver Test, which provides a practical way of assessing the resilience and resourcefulness of artificial agents.
- Apr 28 2017 cs.CV arXiv:1704.08328v1 In this thesis, we study two problems based on clustering algorithms. In the first problem, we study the role of visual attributes, using an agglomerative clustering algorithm to whittle down the search area when the number of classes is high, in order to improve clustering performance. We observe that as we add more attributes, the clustering performance increases overall. In the second problem, we study the role of clustering in aggregating templates in a 1:N open set protocol using multi-shot video as a probe. We observe that by increasing the number of clusters, the performance increases with respect to the baseline and reaches a peak, after which increasing the number of clusters causes the performance to degrade. Experiments are conducted using the recently introduced unconstrained IARPA Janus IJB-A, CS2, and CS3 face recognition datasets.
- Cross-modal audio-visual perception has been a long-standing topic in psychology and neurology, and various studies have discovered strong correlations in human perception of auditory and visual stimuli. Despite work in computational multimodal modeling, the problem of cross-modal audio-visual generation has not been systematically studied in the literature. In this paper, we make the first attempt to solve this cross-modal generation problem, leveraging the power of deep generative adversarial training. Specifically, we use conditional generative adversarial networks to achieve cross-modal audio-visual generation of musical performances. We explore different encoding methods for audio and visual signals, and work on two scenarios: instrument-oriented generation and pose-oriented generation. Being the first to explore this new problem, we compose two new datasets with pairs of images and sounds of musical performances of different instruments. Our experiments using both classification and human evaluation demonstrate that our model has the ability to generate one modality, i.e., audio/visual, from the other modality, i.e., visual/audio, to a good extent. Our experiments on various design choices along with the datasets will facilitate future research in this new problem space.
- The notion of events occupies a central role in modeling and has influenced computer science and philosophy. Recent developments in diagrammatic modeling have made it possible to examine the conceptual representation of events. This paper explores some aspects of the notion of events that are produced by applying a new diagrammatic methodology, with a focus on the interaction of events with such concepts as time, space, and objects. The proposed description applies to abstract machines where events form the dynamic phases of a system. The results of this nontechnical research can be utilized in the many fields where the notion of an event is used in interdisciplinary applications.
- Under the banner of `Big Data', the detection and classification of structure in extremely large, high dimensional, data sets is one of the central statistical challenges of our times. Among the most intriguing approaches to this challenge is `TDA', or `Topological Data Analysis', one of the primary aims of which is providing non-metric, but topologically informative, pre-analyses of data sets which make later, more quantitative analyses feasible. While TDA rests on strong mathematical foundations from Topology, in applications it has faced challenges due to an inability to handle issues of statistical reliability and robustness and, most importantly, an inability to make scientific claims with verifiable levels of statistical confidence. We propose a methodology for the parametric representation, estimation, and replication of persistence diagrams, the main diagnostic tool of TDA. The power of the methodology lies in the fact that even if only one persistence diagram is available for analysis -- the typical case for big data applications -- replications can be generated to allow for conventional statistical hypothesis testing. The methodology is conceptually simple and computationally practical, and provides a broadly effective statistical procedure for persistence diagram TDA analysis. We demonstrate the basic ideas on a toy example, and the power of the approach in a novel and revealing analysis of CMB non-homogeneity.
- Recurrence networks and the associated statistical measures have become important tools in the analysis of time series data. In this work, we test how effective the recurrence network measures are in analyzing real world data involving two main types of noise, white noise and colored noise. We use two prominent network measures as discriminating statistics for hypothesis testing using surrogate data, for the specific null hypothesis that the data are derived from a linear stochastic process. We show that the characteristic path length is an especially efficient discriminating measure, with the conclusions reasonably accurate even with a limited number of data points in the time series. We also highlight an additional advantage of the network approach in identifying the dimensionality of the system underlying the time series, through a convergence measure derived from the probability distribution of the local clustering coefficients. As examples of real world data, we use the light curves from a prominent black hole system and show that a combined analysis using three primary network measures can provide vital information regarding the nature of the temporal variability of light curves from different spectroscopic classes.
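An illustrative sketch (ours) of the discriminating statistic: build an $\epsilon$-recurrence network from a delay-embedded time series and compute its characteristic path length with networkx. In the hypothesis test, this value would be compared against the same statistic computed on surrogate data; embedding parameters and the recurrence rate are placeholder choices.

```python
import numpy as np
import networkx as nx
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(0)
x = rng.normal(size=500).cumsum()                 # toy colored-noise-like series

m, tau = 3, 2                                     # embedding dimension and delay
n = len(x) - (m - 1) * tau
pts = np.column_stack([x[i * tau : i * tau + n] for i in range(m)])

D = squareform(pdist(pts))
eps = np.quantile(D[D > 0], 0.05)                 # threshold fixing the recurrence rate
A = ((D < eps) & (D > 0)).astype(int)             # epsilon-recurrence network adjacency

G = nx.from_numpy_array(A)
giant = G.subgraph(max(nx.connected_components(G), key=len))
print("characteristic path length:", round(nx.average_shortest_path_length(giant), 2))
```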
- Apr 28 2017 cs.CV arXiv:1704.08331v1 Dynamic scene understanding is a challenging problem and motion segmentation plays a crucial role in solving it. Incorporating semantics and motion enhances the overall perception of a dynamic scene. For applications of outdoor robotic navigation, joint learning methods have not been extensively used for extracting spatio-temporal features or adding different priors into the formulation. The task becomes even more challenging without stereo information. This paper proposes an approach to fuse semantic features and motion clues using CNNs, to address the problem of monocular semantic motion segmentation. We deduce semantic and motion labels by integrating optical flow as a constraint with semantic features into a dilated convolution network. The pipeline consists of three main stages, i.e., feature extraction, feature amplification, and multi-scale context aggregation, to fuse the semantics and flow features. Our joint formulation shows significant improvements in monocular motion segmentation over state-of-the-art methods on the challenging KITTI tracking dataset.