- This paper considers the potential impact that the nascent technology of quantum computing may have on society. It focuses on three areas: cryptography, optimization, and simulation of quantum systems. We will also discuss some ethical aspects of these developments, and ways to mitigate the risks.
- Covert communication can prevent the opponent from knowing that a wireless communication has occurred. In the additive white Gaussian noise channels, if we only take the ambient noise into account, a square root law was obtained and the result shows that Alice can reliably and covertly transmit $\mathcal{O}(\sqrt{n})$ bits to Bob in $n$ channel uses. If additional "friendly" node closest to the adversary can produce artificial noise to aid in hiding the communication, the covert throughput can be improved. In this paper, we consider the covert communication in noisy wireless networks, where potential transmitters form a stationary Poisson point process. Alice wishes to communicate covertly to Bob without being detected by the warden Willie. In this scenario, Bob and Willie not only experience the ambient noise, but also the aggregated interference simultaneously. Although the random interference sources are not in collusion with Alice and Bob, our results show that uncertainty in noise and interference experienced by Willie is beneficial to Alice. When the distance between Alice and Willie $d_{a,w}=\omega(n^{\delta/4})$ ($\delta=2/\alpha$ is stability exponent), Alice can reliably and covertly transmit $\mathcal{O}(\log_2\sqrt{n})$ bits to Bob in $n$ channel uses, and there is no limitation on the transmit power of transmitters. Although the covert throughout is lower than the square root law and the friendly jamming scheme, the spatial throughout of the network is higher, and Alice does not presuppose to know the location of Willie. From the network perspective, the communications are hidden in the noisy wireless networks, and what Willie sees is merely a "\emphshadow" wireless network where he knows for certain some nodes are transmitting, but he cannot catch anyone red-handed.
- Dec 15 2017 cs.CG arXiv:1712.05081v1In this paper, we consider the problem of computing the minimum area triangle that circumscribes a given $n$-sided convex polygon touching edge-to-edge. In other words, we compute the minimum area triangle that is the intersection of 3 half-planes out of $n$ half-planes defined by a given convex polygon. Previously, $O(n\log n)$ time algorithms were known which are based on the technique for computing the minimum weight k-link path given in \citekgon94,kgon95soda. By applying the new technique proposed in Jin's recent work for the dual problem of computing the maximum area triangle inside a convex polygon, we solve the problem at hand in $O(n)$ time, thus justify Jin's claim that his technique may find applications in other polygonal inclusion problems. Our algorithm actually computes all the local minimal area circumscribing triangles touching edge-to-edge.
- The recent literature on deep learning offers new tools to learn a rich probability distribution over high dimensional data such as images or sounds. In this work we investigate the possibility of learning the prior distribution over neural network parameters using such tools. Our resulting variational Bayes algorithm generalizes well to new tasks, even when very few training examples are provided. Furthermore, this learned prior allows the model to extrapolate correctly far from a given task's training data on a meta-dataset of periodic signals.
- We analyse the Tangle --- a DAG-valued stochastic process where new vertices get attached to the graph at Poissonian times, and the attachment's locations are chosen by means of random walks on that graph. We prove existence of ("almost symmetric") Nash equilibria for the system where a part of players tries to optimize their attachment strategies. Then, we also present simulations that show that the "selfish" players will nevertheless cooperate with the network by choosing attachment strategies that are similar to the default one.
- Sequence-to-sequence models with soft attention have been successfully applied to a wide variety of problems, but their decoding process incurs a quadratic time and space cost and is inapplicable to real-time sequence transduction. To address these issues, we propose Monotonic Chunkwise Attention (MoChA), which adaptively splits the input sequence into small chunks over which soft attention is computed. We show that models utilizing MoChA can be trained efficiently with standard backpropagation while allowing online and linear-time decoding at test time. When applied to online speech recognition, we obtain state-of-the-art results and match the performance of a model using an offline soft attention mechanism. In document summarization experiments where we do not expect monotonic alignments, we show significantly improved performance compared to a baseline monotonic attention-based model.
- We define a monad on the category of complete metric spaces with short maps, which assigns to each space the space of Radon probability measures on it with finite first moment, equipped with the Kantorovich--Wasserstein distance. It is analogous to the Giry monad on the category of Polish spaces, and it extends a construction due to van Breugel for compact and for 1-bounded complete metric spaces. We prove that this Kantorovich monad arises from a colimit construction on finite powers, which formalizes the intuition that probability measures are limits of finite samples. The proof relies on a new criterion for when an ordinary left Kan extension of lax monoidal functors is a monoidal Kan extension. This colimit characterization allows for the development of integration theory and other things, such as the treatment of measures on spaces of measures, completely without measure theory. We also show that the category of algebras of the Kantorovich monad is equivalent to the category of closed convex subsets of Banach spaces with short affine maps as the morphisms.
- Dec 15 2017 cs.SI arXiv:1712.05359v1Sociotechnological and geospatial processes exhibit time varying structure that make insight discovery challenging. To detect abnormal moments in these processes, a definition of `normal' must be established. This paper proposes a new statistical model for such systems, modeled as dynamic networks, to address this challenge. It assumes that vertices fall into one of k types and that the probability of edge formation at a particular time depends on the types of the incident nodes and the current time. The time dependencies are driven by unique seasonal processes, which many systems exhibit (e.g., predictable spikes in geospatial or web traffic each day). The paper defines the model as a generative process and an inference procedure to recover the `normal' seasonal processes from data when they are unknown. An outline of anomaly detection experiments to be completed over Enron emails and New York City taxi trips is presented.
- We cast the problem of combinatorial auction design in a Bayesian framework in order to incorporate prior information into the auction process and minimize the number of rounds to convergence. We first develop a generative model of agent valuations and market prices such that clearing prices become maximum a posteriori estimates given observed agent valuations. This generative model then forms the basis of an auction process which alternates between refining estimates of agent valuations and computing candidate clearing prices. We provide an implementation of the auction using assumed density filtering to estimate valuations and expectation maximization to compute prices. An empirical evaluation over a range of valuation domains demonstrates that our Bayesian auction mechanism is highly competitive against the combinatorial clock auction in terms of rounds to convergence, even under the most favorable choices of price increment for this baseline.
- Dec 15 2017 cs.CV arXiv:1712.05271v1This paper performs a comprehensive and comparative evaluation of the state of the art local features for the task of image based 3D reconstruction. The evaluated local features cover the recently developed ones by using powerful machine learning techniques and the elaborately designed handcrafted features. To obtain a comprehensive evaluation, we choose to include both float type features and binary ones. Meanwhile, two kinds of datasets have been used in this evaluation. One is a dataset of many different scene types with groundtruth 3D points, containing images of different scenes captured at fixed positions, for quantitative performance evaluation of different local features in the controlled image capturing situations. The other dataset contains Internet scale image sets of several landmarks with a lot of unrelated images, which is used for qualitative performance evaluation of different local features in the free image collection situations. Our experimental results show that binary features are competent to reconstruct scenes from controlled image sequences with only a fraction of processing time compared to use float type features. However, for the case of large scale image set with many distracting images, float type features show a clear advantage over binary ones.
- To harness the complexity of their high-dimensional bodies during sensorimotor development, infants are guided by patterns of freezing and freeing of degrees of freedom. For instance, when learning to reach, infants free the degrees of freedom in their arm proximodistally, i.e. from joints that are closer to the body to those that are more distant. Here, we formulate and study computationally the hypothesis that such patterns can emerge spontaneously as the result of a family of stochastic optimization processes (evolution strategies with covariance-matrix adaptation), without an innate encoding of a maturational schedule. In particular, we present simulated experiments with an arm where a computational learner progressively acquires reaching skills through adaptive exploration, and we show that a proximodistal organization appears spontaneously, which we denote PDFF (ProximoDistal Freezing and Freeing of degrees of freedom). We also compare this emergent organization between different arm morphologies -- from human-like to quite unnatural ones -- to study the effect of different kinematic structures on the emergence of PDFF. Keywords: human motor learning; proximo-distal exploration; stochastic optimization; modelling; evolution strategies; cross-entropy methods; policy search; morphology.
- Dec 15 2017 cs.CV arXiv:1712.05248v1Recent random-forest (RF)-based image super-resolution approaches inherit some properties from dictionary-learning-based algorithms, but the effectiveness of the properties in RF is overlooked in the literature. In this paper, we present a novel feature-augmented random forest (FARF) for image super-resolution, where the conventional gradient-based features are augmented with gradient magnitudes and different feature recipes are formulated on different stages in an RF. The advantages of our method are that, firstly, the dictionary-learning-based features are enhanced by adding gradient magnitudes, based on the observation that the non-linear gradient magnitude are with highly discriminative property. Secondly, generalized locality-sensitive hashing (LSH) is used to replace principal component analysis (PCA) for feature dimensionality reduction and original high-dimensional features are employed, instead of the compressed ones, for the leaf-nodes' regressors, since regressors can benefit from higher dimensional features. This original-compressed coupled feature sets scheme unifies the unsupervised LSH evaluation on both image super-resolution and content-based image retrieval (CBIR). Finally, we present a generalized weighted ridge regression (GWRR) model for the leaf-nodes' regressors. Experiment results on several public benchmark datasets show that our FARF method can achieve an average gain of about 0.3 dB, compared to traditional RF-based methods. Furthermore, a fine-tuned FARF model can compare to or (in many cases) outperform some recent stateof-the-art deep-learning-based algorithms.
- Dec 15 2017 cs.AI arXiv:1712.05247v1This paper presents a framework for intrinsic point of interest discovery from trajectory databases. Intrinsic points of interest are regions of a geospatial area innately defined by the spatial and temporal aspects of trajectory data, and can be of varying size, shape, and resolution. Any trajectory database exhibits such points of interest, and hence are intrinsic, as compared to most other point of interest definitions which are said to be extrinsic, as they require trajectory metadata, external knowledge about the region the trajectories are observed, or other application-specific information. Spatial and temporal aspects are qualities of any trajectory database, making the framework applicable to data from any domain and of any resolution. The framework is developed under recent developments on the consistency of nonparametric hierarchical density estimators and enables the possibility of formal statistical inference and evaluation over such intrinsic points of interest. Comparisons of the POIs uncovered by the framework in synthetic truth data to thousands of parameter settings for common POI discovery methods show a marked improvement in fidelity without the need to tune any parameters by hand.
- Deep learning with 3D data such as reconstructed point clouds and CAD models has received great research interests recently. However, the capability of using point clouds with convolutional neural network has been so far not fully explored. In this technical report, we present a convolutional neural network for semantic segmentation and object recognition with 3D point clouds. At the core of our network is point-wise convolution, a convolution operator that can be applied at each point of a point cloud. Our fully convolutional network design, while being simple to implement, can yield competitive accuracy in both semantic segmentation and object recognition task.
- Dec 15 2017 cs.CV arXiv:1712.05231v1Most of existing correlation filter-based tracking approaches only estimate simple axis-aligned bounding boxes, and very few of them is capable of recovering the underlying similarity transformation. To a large extent, such limitation restricts the applications of such trackers for a wide range of scenarios. In this paper, we propose a novel correlation filter-based tracker with robust estimation of similarity transformation on the large displacements to tackle this challenging problem. In order to efficiently search in such a large 4-DoF space in real-time, we formulate the problem into two 2-DoF sub-problems and apply an efficient Block Coordinates Descent solver to optimize the estimation result. Specifically, we employ an efficient phase correlation scheme to deal with both scale and rotation changes simultaneously in log-polar coordinates. Moreover, a fast variant of correlation filter is used to predict the translational motion individually. Our experimental results demonstrate that the proposed tracker achieves very promising prediction performance compared with the state-of-the-art visual object tracking methods while still retaining the advantages of efficiency and simplicity in conventional correlation filter-based tracking methods.
- Modeling of music audio semantics has been previously tackled through learning of mappings from audio data to high-level tags or latent unsupervised spaces. The resulting semantic spaces are theoretically limited, either because the chosen high-level tags do not cover all of music semantics or because audio data itself is not enough to determine music semantics. In this paper, we propose a generic framework for semantics modeling that focuses on the perception of the listener, through EEG data, in addition to audio data. We implement this framework using a novel end-to-end 2-view Neural Network (NN) architecture and a Deep Canonical Correlation Analysis (DCCA) loss function that forces the semantic embedding spaces of both views to be maximally correlated. We also detail how the EEG dataset was collected and use it to train our proposed model. We evaluate the learned semantic space in a transfer learning context, by using it as an audio feature extractor in an independent dataset and proxy task: music audio-lyrics cross-modal retrieval. We show that our embedding model outperforms Spotify features and performs comparably to a state-of-the-art embedding model that was trained on 700 times more data. We further propose improvements to the model that are likely to improve its performance.
- With the advent of the Internet, large amount of digital text is generated everyday in the form of news articles, research publications, blogs, question answering forums and social media. It is important to develop techniques for extracting information automatically from these documents, as lot of important information is hidden within them. This extracted information can be used to improve access and management of knowledge hidden in large text corpora. Several applications such as Question Answering, Information Retrieval would benefit from this information. Entities like persons and organizations, form the most basic unit of the information. Occurrences of entities in a sentence are often linked through well-defined relations; e.g., occurrences of person and organization in a sentence may be linked through relations such as employed at. The task of Relation Extraction (RE) is to identify such relations automatically. In this paper, we survey several important supervised, semi-supervised and unsupervised RE techniques. We also cover the paradigms of Open Information Extraction (OIE) and Distant Supervision. Finally, we describe some of the recent trends in the RE techniques and possible future research directions. This survey would be useful for three kinds of readers - i) Newcomers in the field who want to quickly learn about RE; ii) Researchers who want to know how the various RE techniques evolved over time and what are possible future research directions and iii) Practitioners who just need to know which RE technique works best in various settings.
- We introduce a pair of tools, Rasa NLU and Rasa Core, which are open source python libraries for building conversational software. Their purpose is to make machine-learning based dialogue management and language understanding accessible to non-specialist software developers. In terms of design philosophy, we aim for ease of use, and bootstrapping from minimal (or no) initial training data. Both packages are extensively documented and ship with a comprehensive suite of tests. The code is available at https://github.com/RasaHQ/
- Recurrent Neural Networks (RNNs) are powerful sequence modeling tools. However, when dealing with high dimensional inputs, the training of RNNs becomes computational expensive due to the large number of model parameters. This hinders RNNs from solving many important computer vision tasks, such as Action Recognition in Videos and Image Captioning. To overcome this problem, we propose a compact and flexible structure, namely Block-Term tensor decomposition, which greatly reduces the parameters of RNNs and improves their training efficiency. Compared with alternative low-rank approximations, such as tensor-train RNN (TT-RNN), our method, Block-Term RNN (BT-RNN), is not only more concise (when using the same rank), but also able to attain a better approximation to the original RNNs with much fewer parameters. On three challenging tasks, including Action Recognition in Videos, Image Captioning and Image Generation, BT-RNN outperforms TT-RNN and the standard RNN in terms of both prediction accuracy and convergence rate. Specifically, BT-LSTM utilizes 17,388 times fewer parameters than the standard LSTM to achieve an accuracy improvement over 15.6\% in the Action Recognition task on the UCF11 dataset.
- Dec 15 2017 cs.SD arXiv:1712.05119v1In the use of deep neural networks, it is crucial to provide appropriate input representations for the network to learn from. In this paper, we propose an approach to learn a representation that focus on rhythmic representation which is named as DLR (Deep Learning Rhythmic representation). The proposed approach aims to learn DLR from the raw audio signal and use it for other music informatics tasks. A 1-dimensional convolutional network is utilised in the learning of DLR. In the experiment, we present the results from the source task and the target task as well as visualisations of DLRs. The results reveals that DLR provides compact rhythmic information which can be used on multi-tagging task.
- Dec 15 2017 cs.SI arXiv:1712.05112v1Hiring a head coach of a college sports team is vital which will definitely have a great influence on the later development of the team. However, a lot of attention has been focused on each coach's individual features. A systematic and quantitative analysis of the whole coach hiring market is lacking. In a coach hiring network, the coaches are actually voting with their feet. It is interesting to analyze what factors are affecting the "footprint" left by those head coaches. In this paper, we collect more than 12,000 head coach hiring records in two different popular sports from the NCAA. Using network-based methods, we build the coach hiring network in the NCAA men's basketball and football. We find that: (1).the coach hiring network is of great inequality in coach production with a Gini coefficient close to 0.60. (2).coaches prefer to work within the same geographical region and the same division to their alma maters'. (3).the coach production rankings we calculated using network-based methods are generally correlated to the authoritative rankings, but also show disaccord in specific time period. The results provide us a novel view and better understanding of the coach hiring market in the NCAA and shed new light on the coach hiring system.
- Dec 15 2017 cs.RO arXiv:1712.05109v1This paper proposes an approach for robots to perform co-working task alongside humans by using neuro-dynamical models. The proposed model comprised two models: an Autoencoder and a hierarchical recurrent neural network (RNN). We trained hierarchical RNN with various sensory-motor sequences and instructions. To acquire the interactive ability to switch and combine appropriate motions according to visual information and instructions from outside, we embedded the cyclic neuronal dynamics in a network. To evaluate our model, we designed a cloth-folding task that consists of four short folding motions and three patterns of instruction that indicate the direction of each short motion. The results showed that the robot can perform the task by switching or combining short motions with instructions and visual information. We also showed that the proposed model acquired relationships between the instructions and sensory-motor information in its internal neuronal dynamics.
- Dec 15 2017 cs.CR arXiv:1712.05090v1Virtualization has become more important since cloud computing is getting more and more popular than before. There is an increasing demand for security among the cloud customers. AMD plans to provide Secure Encrypted Virtualization (SEV) technology in its latest processor EPYC to protect virtual machines by encrypting its memory but without integrity protection. In this paper, we analyzed the weakness in the SEV design due to lack of integrity protection thus it is not so secure. Using different design flaw in physical address-based tweak algorithm to protect against ciphertext block move attacks, we found a realistic attack against SEV which could obtain the root privilege of an encrypted virtual machine protected by SEV. A demo to simulate the attack against a virtual machine protected by SEV is done in a Ryzen machine which supports Secure Memory Encryption (SME) technology since SEV enabled machine is still not available in market.
- We study domain-specific video streaming. Specifically, we target a streaming setting where the videos to be streamed from a server to a client are all in the same domain and they have to be compressed to a small size for low-latency transmission. Several popular video streaming services, such as the video game streaming services of GeForce Now and Twitch, fall in this category. While conventional video compression standards such as H.264 are commonly used for this task, we hypothesize that one can leverage the property that the videos are all in the same domain to achieve better video quality. Based on this hypothesis, we propose a novel video compression pipeline. Specifically, we first apply H.264 to compress domain-specific videos. We then train a novel binary autoencoder to encode the leftover domain-specific residual information frame-by-frame into binary representations. These binary representations are then compressed and sent to the client together with the H.264 stream. In our experiments, we show that our pipeline yields consistent gains over standard H.264 compression across several benchmark datasets while using the same channel bandwidth.
- Adaptability is central to autonomy. Intuitively, for high-dimensional learning problems such as navigating based on vision, internal models with higher complexity allow to accurately encode the information available. However, most learning methods rely on models with a fixed structure and complexity. In this paper, we present a self-supervised framework for robots to learn to navigate, without any prior knowledge of the environment, by incrementally building the structure of a deep network as new data becomes available. Our framework captures images from a monocular camera and self labels the images to continuously train and predict actions from a computationally efficient adaptive deep architecture based on Autoencoders (AE), in a self-supervised fashion. The deep architecture, named Reinforced Adaptive Denoising Autoencoders (RA-DAE), uses reinforcement learning to dynamically change the network structure by adding or removing neurons. Experiments were conducted in simulation and real-world indoor and outdoor environments to assess the potential of self-supervised navigation. RA-DAE demonstrates better performance than equivalent non-adaptive deep learning alternatives and can continue to expand its knowledge, trading-off past and present information.
- Dec 15 2017 cs.CV arXiv:1712.05083v1Existing single view, 3D face reconstruction methods can produce beautifully detailed 3D results, but typically only for near frontal, unobstructed viewpoints. We describe a system designed to provide detailed 3D reconstructions of faces viewed under extreme conditions, out of plane rotations, and occlusions. Motivated by the concept of bump mapping, we propose a layered approach which decouples estimation of a global shape from its mid-level details (e.g., wrinkles). We estimate a coarse 3D face shape which acts as a foundation and then separately layer this foundation with details represented by a bump map. We show how a deep convolutional encoder-decoder can be used to estimate such bump maps. We further show how this approach naturally extends to generate plausible details for occluded facial regions. We test our approach and its components extensively, quantitatively demonstrating the invariance of our estimated facial details. We further provide numerous qualitative examples showing that our method produces detailed 3D face shapes in viewing conditions where existing state of the art often break down.
- Dec 15 2017 cs.CV arXiv:1712.05080v1We propose a weakly supervised temporal action localization algorithm on untrimmed videos using convolutional neural networks. Our algorithm predicts temporal intervals of human actions given video-level class labels with no requirement of temporal localization information of actions. This objective is achieved by proposing a novel deep neural network that recognizes actions and identifies a sparse set of key segments associated with the actions through adaptive temporal pooling of video segments. We design the loss function of the network to comprise two terms--one for classification error and the other for sparsity of the selected segments. After recognizing actions with sparse attention weights for key segments, we extract temporal proposals for actions using temporal class activation mappings to estimate time intervals that localize target actions. The proposed algorithm attains state-of-the-art accuracy on the THUMOS14 dataset and outstanding performance on ActivityNet1.3 even with weak supervision.
- Dec 15 2017 cs.SE arXiv:1712.05078v1A Chair of Software Engineering existed at ETH Zurich, the Swiss Federal Insti-tute of Technology, from 1 October 2001 to 31 January 2016, under my leader-ship. Our work, summarized here, covered a wide range of theoretical and practi-cal topics, with object technology in the Eiffel method as the unifying thread .
- Dec 15 2017 cs.CV arXiv:1712.05055v1Recent studies have discovered that deep networks are capable of memorizing the entire data even when the labels are completely random. Since deep models are trained on big data where labels are often noisy, the ability to overfit noise can lead to poor performance. To overcome the overfitting on corrupted training data, we propose a novel technique to regularize deep networks in the data dimension. This is achieved by learning a neural network called MentorNet to supervise the training of the base network, namely, StudentNet. Our work is inspired by curriculum learning and advances the theory by learning a curriculum from data by neural networks. We demonstrate the efficacy of MentorNet on several benchmarks. Comprehensive experiments show that it is able to significantly improve the generalization performance of the state-of-the-art deep networks on corrupted training data.
- Dec 15 2017 cs.CV arXiv:1712.05053v1Skeletal bone age assessment is a common clinical practice to diagnose endocrine and metabolic disorders in child development. In this paper, we describe a fully automated deep learning approach to the problem of bone age assessment using data from Pediatric Bone Age Challenge organized by RSNA 2017. The dataset for this competition is consisted of 12.6k radiological images of left hand labeled by the bone age and sex of patients. Our approach utilizes several deep learning architectures: U-Net, ResNet-50, and custom VGG-style neural networks trained end-to-end. We use images of whole hands as well as specific parts of a hand for both training and inference. This approach allows us to measure importance of specific hand bones for the automated bone age analysis. We further evaluate performance of the method in the context of skeletal development stages. Our approach outperforms other common methods for bone age assessment.
- Dec 15 2017 cs.NE arXiv:1712.05043v1Deep Learning (DL) aims at learning the \emphmeaningful representations. A meaningful representation refers to the one that gives rise to significant performance improvement of associated Machine Learning (ML) tasks by replacing the raw data as the input. However, optimal architecture design and model parameter estimation in DL algorithms are widely considered to be intractable. Evolutionary algorithms are much preferable for complex and non-convex problems due to its inherent characteristics of gradient-free and insensitivity to local optimum. In this paper, we propose a computationally economical algorithm for evolving \emphunsupervised deep neural networks to efficiently learn \emphmeaningful representations, which is very suitable in the current Big Data era where sufficient labeled data for training is often expensive to acquire. In the proposed algorithm, finding an appropriate architecture and the initialized parameter values for a ML task at hand is modeled by one computational efficient gene encoding approach, which is employed to effectively model the task with a large number of parameters. In addition, a local search strategy is incorporated to facilitate the exploitation search for further improving the performance. Furthermore, a small proportion labeled data is utilized during evolution search to guarantee the learnt representations to be meaningful. The performance of the proposed algorithm has been thoroughly investigated over classification tasks. Specifically, error classification rate on MNIST with $1.15\%$ is reached by the proposed algorithm consistently, which is a very promising result against state-of-the-art unsupervised DL algorithms.
- Dec 15 2017 cs.NE arXiv:1712.05042v1Convolutional auto-encoders have shown their remarkable performance in stacking to deep convolutional neural networks for classifying image data during past several years. However, they are unable to construct the state-of-the-art convolutional neural networks due to their intrinsic architectures. In this regard, we propose a flexible convolutional auto-encoder by eliminating the constraints on the numbers of convolutional layers and pooling layers from the traditional convolutional auto-encoder. We also design an architecture discovery method by using particle swarm optimization, which is capable of automatically searching for the optimal architectures of the proposed flexible convolutional auto-encoder with much less computational resource and without any manual intervention. We use the designed architecture optimization algorithm to test the proposed flexible convolutional auto-encoder through utilizing one graphic processing unit card on four extensively used image classification datasets. Experimental results show that our work in this paper significantly outperform the peer competitors including the state-of-the-art algorithm.
- This paper considers the scheduling of parallel real-time tasks with arbitrary-deadlines. Each job of a parallel task is described as a directed acyclic graph (DAG). In contrast to prior work in this area, where decomposition-based scheduling algorithms are proposed based on the DAG-structure and inter-task interference is analyzed as self-suspending behavior, this paper generalizes the federated scheduling approach. We propose a reservation-based algorithm, called reservation-based federated scheduling, that dominates federated scheduling. We provide general constraints for the design of such systems and prove that reservation-based federated scheduling has a constant speedup factor with respect to any optimal DAG task scheduler. Furthermore, the presented algorithm can be used in conjunction with any scheduler and scheduling analysis suitable for ordinary arbitrary-deadline sporadic task sets, i.e., without parallelism.
- Dec 15 2017 cs.CV arXiv:1712.05021v1Hematoxylin and Eosin stained histopathology image analysis is essential for the diagnosis and study of complicated diseases such as cancer. Existing state-of-the-art approaches demand extensive amount of supervised training data from trained pathologists. In this work we synthesize in an unsupervised manner, large histopathology image datasets, suitable for supervised training tasks. We propose a unified pipeline that: a) generates a set of initial synthetic histopathology images with paired information about the nuclei such as segmentation masks; b) refines the initial synthetic images through a Generative Adversarial Network (GAN) to reference styles; c) trains a task-specific CNN and boosts the performance of the task-specific CNN with on-the-fly generated adversarial examples. Our main contribution is that the synthetic images are not only realistic, but also representative (in reference styles) and relatively challenging for training task-specific CNNs. We test our method for nucleus segmentation using images from four cancer types. When no supervised data exists for a cancer type, our method without supervision cost significantly outperforms supervised methods which perform across-cancer generalization. Even when supervised data exists for all cancer types, our approach without supervision cost performs better than supervised methods.
- Face recognition has seen a significant improvement by using the deep convolutional neural networks. In this work, we mainly study the influence of the 2D warping module for one-shot face recognition. To achieve this, we first propose a 2D-Warping Layer to generate new features for the novel classes during the training, then fine-tuning the network by adding the recent proposed fisher loss to learn more discriminative features. We evaluate the proposed method on two popular databases for unconstrained face recognition, the Labeled Faces in the Wild (LFW) and the Youtube Faces (YTF) database. In both cases, the proposed method achieves competitive results with the accuracy of 99.25\% for LFW and 94.3\% for YTF, separately. Moreover, the experimental results on MS-Celeb-1M one-shot faces dataset show that with the proposed method, the model achieves comparable results of 77.92\% coverage rate at precision = 99\% for the novel classes while still keeps top-1 accuracy of 99.80\% for the normal classes.
- Dec 15 2017 cs.LO arXiv:1712.04982v1Imprecise and incomplete specification of system \textitconfigurations threatens safety, security, functionality, and other critical system properties and uselessly enlarges the configuration spaces to be searched by configuration engineers and auto-tuners. To address these problems, this paper introduces \textitinterpreted formalisms based on real-world types for configurations. Configuration values are lifted to values of real-world types, which we formalize as \textitsubset types in Coq. Values of these types are dependent pairs whose components are values of underlying Coq types and proofs of additional properties about them. Real-world types both extend and further constrain \textitmachine-level configurations, enabling richer, proof-based checking of their consistency with real-world constraints. Tactic-based proof scripts are written once to automate the construction of proofs, if proofs exist, for configuration fields and whole configurations. \textitFailures to prove reveal real-world type errors. Evaluation is based on a case study of combinatorial optimization of Hadoop performance by meta-heuristic search over Hadoop configurations spaces.
- Dec 15 2017 cs.RO arXiv:1712.04965v1In this paper, we present a Model Predictive Control (MPC) framework based on path velocity decomposition paradigm for autonomous driving. The optimization underlying the MPC has a two layer structure wherein first, an appropriate path is computed for the vehicle followed by the computation of optimal forward velocity along it. The very nature of the proposed path velocity decomposition allows for seamless compatibility between the two layers of the optimization. A key feature of the proposed work is that it offloads most of the responsibility of collision avoidance to velocity optimization layer for which computationally efficient formulations can be derived. In particular, we extend our previously developed concept of time scaled collision cone (TSCC) constraints and formulate the forward velocity optimization layer as a convex quadratic programming problem. We perform validation on autonomous driving scenarios wherein proposed MPC repeatedly solves both the optimization layers in receding horizon manner to compute lane change, overtaking and merging maneuvers among multiple dynamic obstacles.
- Dec 15 2017 cs.CV arXiv:1712.04961v1Mobile virtual reality (VR) head mounted displays (HMD) have become popular among consumers in recent years. In this work, we demonstrate real-time egocentric hand gesture detection and localization on mobile HMDs. Our main contributions are: 1) A novel mixed-reality data collection tool to automatic annotate bounding boxes and gesture labels; 2) The largest-to-date egocentric hand gesture and bounding box dataset with more than 400,000 annotated frames; 3) A neural network that runs real time on modern mobile CPUs, and achieves higher than 76% precision on gesture recognition across 8 classes.
- The prediction of near surface wind speed is becoming increasingly vital for the operation of electrical energy grids as the capacity of installed wind power grows. The majority of predictive wind speed modeling has focused on point-based time-series forecasting. Effectively balancing demand and supply in the presence of distributed wind turbine electricity generation, however, requires the prediction of wind fields in space and time. Additionally, predictions of full wind fields are particularly useful for future power planning such as the optimization of electricity power supply systems. In this paper, we propose a composite artificial neural network (ANN) model to predict the 6-hour and 24-hour ahead average wind speed over a large area (~3.15*106 km2). The ANN model consists of a convolutional input layer, a Long Short-Term Memory (LSTM) hidden layer, and a transposed convolutional layer as the output layer. We compare the ANN model with two non-parametric models, a null persistence model and a mean value model, and find that the ANN model has substantially smaller error than each of these models. Additionally, the ANN model also generally performs better than integrated autoregressive moving average models, which are trained for optimal performance in specific locations.
- Dec 15 2017 cs.CV arXiv:1712.05277v1Depth cameras allow to setup reliable solutions for people monitoring and behavior understanding, specially when unstable or poor illumination conditions make unusable common RGB sensors. Therefore, we propose a complete framework for the estimation of the head and shoulder pose based on depth images only. A head detection and localization module is also included, in order to develop a complete end-to-end system. The core element of the framework is a Convolutional Neural Network, called POSEidon+, that receives as input three types of images and provides the 3D angles of the pose as output. Moreover, a Face-from-Depth component based on a Deterministic Conditional GAN model is able to hallucinate a face from the corresponding depth image and we empirically demonstrate that this positively impacts the system performances. We test the proposed framework on two public datasets, namely Biwi Kinect Head Pose and ICT-3DHP, and on Pandora, a new challenging dataset mainly inspired by the automotive setup. Experimental results show that our method overcomes all recent state-of-art works based on both intensity and depth input data, running in real time at more than 30 frames per second.
- Dec 15 2017 cs.SE arXiv:1712.05236v1Motivation: Automatically testing changes to code is an essential feature of continuous integration. For open-source code, without licensed dependencies, a variety of continuous integration services exist. The COnstraint-Based Reconstruction and Analysis (COBRA) Toolbox is a suite of open-source code for computational modelling with dependencies on licensed software. A novel automated framework of continuous integration in a semi-licensed environment is required for the development of the COBRA Toolbox and related tools of the COBRA community. Results: ARTENOLIS is a general-purpose infrastructure software application that implements continuous integration for open-source software with licensed dependencies. It uses a master-slave framework, tests code on multiple operating systems, and multiple versions of licensed software dependencies. ARTENOLIS ensures the stability, integrity, and cross-platform compatibility of code in the COBRA Toolbox and related tools. Availability and Implementation: The continuous integration server, core of the reproducibility and testing infrastructure, can be freely accessed under artenolis.lcsb.uni.lu. The continuous integration framework code is located in the /.ci directory and at the root of the repository freely available under github.com/opencobra/cobratoolbox.
- Dec 15 2017 cs.NI arXiv:1712.05004v1By means of network densification, ultra dense networks (UDNs) can efficiently broaden the network coverage and enhance the system throughput. In parallel, unmanned aerial vehicles (UAVs) communications and networking have attracted increasing attention recently due to their high agility and numerous applications. In this article, we present a vision of UAV-supported UDNs. Firstly, we present four representative scenarios to show the broad applications of UAV-supported UDNs in communications, caching and energy transfer. Then, we highlight the efficient power control in UAV-supported UDNs by discussing the main design considerations and methods in a comprehensive manner. Furthermore, we demonstrate the performance superiority of UAV-supported UDNs via case study simulations, compared to traditional fixed infrastructure based networks. In addition, we discuss the dominating technical challenges and open issues ahead.
- Dec 15 2017 cs.CY arXiv:1712.05376v1
- Dec 15 2017 cs.NI arXiv:1712.05344v1
- Dec 15 2017 cs.CV arXiv:1712.05319v1
- Dec 15 2017 cs.NI arXiv:1712.05307v1
- Dec 15 2017 cs.SY arXiv:1712.05278v1