# Computer Science (cs)

• This paper considers the potential impact that the nascent technology of quantum computing may have on society. It focuses on three areas: cryptography, optimization, and simulation of quantum systems. We will also discuss some ethical aspects of these developments, and ways to mitigate the risks.
• Covert communication can prevent an opponent from knowing that a wireless communication has occurred. In additive white Gaussian noise channels, if only the ambient noise is taken into account, a square root law holds: Alice can reliably and covertly transmit $\mathcal{O}(\sqrt{n})$ bits to Bob in $n$ channel uses. If an additional "friendly" node closest to the adversary can produce artificial noise to help hide the communication, the covert throughput can be improved. In this paper, we consider covert communication in noisy wireless networks, where potential transmitters form a stationary Poisson point process. Alice wishes to communicate covertly to Bob without being detected by the warden Willie. In this scenario, Bob and Willie experience not only the ambient noise but also the aggregated interference. Although the random interference sources are not in collusion with Alice and Bob, our results show that uncertainty in the noise and interference experienced by Willie is beneficial to Alice. When the distance between Alice and Willie satisfies $d_{a,w}=\omega(n^{\delta/4})$ (where $\delta=2/\alpha$ is the stability exponent), Alice can reliably and covertly transmit $\mathcal{O}(\log_2\sqrt{n})$ bits to Bob in $n$ channel uses, and there is no limitation on the transmit power of transmitters. Although the covert throughput is lower than under the square root law and the friendly jamming scheme, the spatial throughput of the network is higher, and Alice does not need to know the location of Willie in advance. From the network perspective, the communications are hidden in the noisy wireless network, and what Willie sees is merely a "\emph{shadow}" wireless network in which he knows for certain that some nodes are transmitting but cannot catch anyone red-handed.
• In this paper, we consider the problem of computing the minimum area triangle that circumscribes a given $n$-sided convex polygon touching edge-to-edge. In other words, we compute the minimum area triangle that is the intersection of 3 half-planes out of the $n$ half-planes defined by a given convex polygon. Previously, $O(n\log n)$ time algorithms were known, based on the technique for computing the minimum weight k-link path given in \cite{kgon94,kgon95soda}. By applying the new technique proposed in Jin's recent work for the dual problem of computing the maximum area triangle inside a convex polygon, we solve the problem at hand in $O(n)$ time, thus justifying Jin's claim that his technique may find applications in other polygonal inclusion problems. Our algorithm in fact computes all the locally minimal area circumscribing triangles touching edge-to-edge.
• Dec 15 2017 stat.ML cs.LG arXiv:1712.05016v1
The recent literature on deep learning offers new tools to learn a rich probability distribution over high dimensional data such as images or sounds. In this work we investigate the possibility of learning the prior distribution over neural network parameters using such tools. Our resulting variational Bayes algorithm generalizes well to new tasks, even when very few training examples are provided. Furthermore, this learned prior allows the model to extrapolate correctly far from a given task's training data on a meta-dataset of periodic signals.
• Dec 15 2017 math.PR cs.GT arXiv:1712.05385v1
We analyse the Tangle, a DAG-valued stochastic process in which new vertices are attached to the graph at Poissonian times and the attachment locations are chosen by means of random walks on that graph. We prove the existence of ("almost symmetric") Nash equilibria for the system in which some of the players try to optimize their attachment strategies. We also present simulations showing that the "selfish" players will nevertheless cooperate with the network by choosing attachment strategies that are similar to the default one.
• Dec 15 2017 cs.CL stat.ML arXiv:1712.05382v1
Sequence-to-sequence models with soft attention have been successfully applied to a wide variety of problems, but their decoding process incurs a quadratic time and space cost and is inapplicable to real-time sequence transduction. To address these issues, we propose Monotonic Chunkwise Attention (MoChA), which adaptively splits the input sequence into small chunks over which soft attention is computed. We show that models utilizing MoChA can be trained efficiently with standard backpropagation while allowing online and linear-time decoding at test time. When applied to online speech recognition, we obtain state-of-the-art results and match the performance of a model using an offline soft attention mechanism. In document summarization experiments where we do not expect monotonic alignments, we show significantly improved performance compared to a baseline monotonic attention-based model.
• We define a monad on the category of complete metric spaces with short maps, which assigns to each space the space of Radon probability measures on it with finite first moment, equipped with the Kantorovich--Wasserstein distance. It is analogous to the Giry monad on the category of Polish spaces, and it extends a construction due to van Breugel for compact and for 1-bounded complete metric spaces. We prove that this Kantorovich monad arises from a colimit construction on finite powers, which formalizes the intuition that probability measures are limits of finite samples. The proof relies on a new criterion for when an ordinary left Kan extension of lax monoidal functors is a monoidal Kan extension. This colimit characterization allows for the development of integration theory and other things, such as the treatment of measures on spaces of measures, completely without measure theory. We also show that the category of algebras of the Kantorovich monad is equivalent to the category of closed convex subsets of Banach spaces with short affine maps as the morphisms.
• Sociotechnological and geospatial processes exhibit time-varying structure that makes insight discovery challenging. To detect abnormal moments in these processes, a definition of 'normal' must be established. This paper proposes a new statistical model for such systems, modeled as dynamic networks, to address this challenge. It assumes that vertices fall into one of k types and that the probability of edge formation at a particular time depends on the types of the incident nodes and the current time. The time dependencies are driven by unique seasonal processes, which many systems exhibit (e.g., predictable spikes in geospatial or web traffic each day). The paper defines the model as a generative process and gives an inference procedure to recover the 'normal' seasonal processes from data when they are unknown. An outline of anomaly detection experiments to be completed over Enron emails and New York City taxi trips is presented.
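The generative idea in the abstract above (edge probability depends on the two vertex types and the current time) can be sketched as follows. This is only an illustration under stated assumptions: the sinusoidal daily cycle, the `amplitude` parameter, and the names `edge_probability`, `sample_snapshot`, and `base_rates` are hypothetical and are not taken from the paper, which defines its own seasonal processes.

```python
import math
import random


def edge_probability(type_u, type_v, hour, base_rates, amplitude=0.5):
    """Edge probability for a pair of vertex types at a given hour.

    Assumption (not from the paper): each unordered type pair has a base
    rate that is modulated by a 24-hour sinusoidal cycle.
    """
    key = (min(type_u, type_v), max(type_u, type_v))
    seasonal = 1.0 + amplitude * math.sin(2.0 * math.pi * hour / 24.0)
    # Clamp to [0, 1] so the result is always a valid probability.
    return max(0.0, min(1.0, base_rates[key] * seasonal))


def sample_snapshot(types, hour, base_rates, rng):
    """Sample one undirected network snapshot as a list of (u, v) edges, u < v."""
    edges = []
    for u in range(len(types)):
        for v in range(u + 1, len(types)):
            p = edge_probability(types[u], types[v], hour, base_rates)
            if rng.random() < p:
                edges.append((u, v))
    return edges


# Hypothetical usage: four vertices of two types, sampled at 06:00.
rates = {(0, 0): 0.6, (0, 1): 0.1, (1, 1): 0.4}
snapshot = sample_snapshot([0, 0, 1, 1], hour=6, base_rates=rates, rng=random.Random(0))
```

Under this sketch, an anomaly detector would compare observed edge counts per type pair against the counts expected from the fitted seasonal rates; the inference procedure itself is not shown here.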
• We cast the problem of combinatorial auction design in a Bayesian framework in order to incorporate prior information into the auction process and minimize the number of rounds to convergence. We first develop a generative model of agent valuations and market prices such that clearing prices become maximum a posteriori estimates given observed agent valuations. This generative model then forms the basis of an auction process which alternates between refining estimates of agent valuations and computing candidate clearing prices. We provide an implementation of the auction using assumed density filtering to estimate valuations and expectation maximization to compute prices. An empirical evaluation over a range of valuation domains demonstrates that our Bayesian auction mechanism is highly competitive against the combinatorial clock auction in terms of rounds to convergence, even under the most favorable choices of price increment for this baseline.
• This paper performs a comprehensive and comparative evaluation of state-of-the-art local features for the task of image-based 3D reconstruction. The evaluated local features cover both recently developed ones based on powerful machine learning techniques and elaborately designed handcrafted features. To obtain a comprehensive evaluation, we include both float type features and binary ones. Two kinds of datasets are used in this evaluation. One is a dataset of many different scene types with groundtruth 3D points, containing images of different scenes captured at fixed positions, for quantitative performance evaluation of different local features under controlled image capturing conditions. The other dataset contains Internet-scale image sets of several landmarks with many unrelated images, which is used for qualitative performance evaluation of different local features on freely collected images. Our experimental results show that binary features are capable of reconstructing scenes from controlled image sequences with only a fraction of the processing time required by float type features. However, for large-scale image sets with many distracting images, float type features show a clear advantage over binary ones.
• To harness the complexity of their high-dimensional bodies during sensorimotor development, infants are guided by patterns of freezing and freeing of degrees of freedom. For instance, when learning to reach, infants free the degrees of freedom in their arm proximodistally, i.e. from joints that are closer to the body to those that are more distant. Here, we formulate and study computationally the hypothesis that such patterns can emerge spontaneously as the result of a family of stochastic optimization processes (evolution strategies with covariance-matrix adaptation), without an innate encoding of a maturational schedule. In particular, we present simulated experiments with an arm where a computational learner progressively acquires reaching skills through adaptive exploration, and we show that a proximodistal organization appears spontaneously, which we denote PDFF (ProximoDistal Freezing and Freeing of degrees of freedom). We also compare this emergent organization between different arm morphologies -- from human-like to quite unnatural ones -- to study the effect of different kinematic structures on the emergence of PDFF. Keywords: human motor learning; proximo-distal exploration; stochastic optimization; modelling; evolution strategies; cross-entropy methods; policy search; morphology.
• Recent random-forest (RF)-based image super-resolution approaches inherit some properties from dictionary-learning-based algorithms, but the effectiveness of these properties in RF is overlooked in the literature. In this paper, we present a novel feature-augmented random forest (FARF) for image super-resolution, where the conventional gradient-based features are augmented with gradient magnitudes and different feature recipes are formulated at different stages in an RF. The advantages of our method are as follows. Firstly, the dictionary-learning-based features are enhanced by adding gradient magnitudes, based on the observation that the non-linear gradient magnitudes are highly discriminative. Secondly, generalized locality-sensitive hashing (LSH) is used to replace principal component analysis (PCA) for feature dimensionality reduction, and the original high-dimensional features are employed, instead of the compressed ones, for the leaf-nodes' regressors, since regressors can benefit from higher dimensional features. This scheme of coupled original and compressed feature sets unifies the unsupervised LSH evaluation on both image super-resolution and content-based image retrieval (CBIR). Finally, we present a generalized weighted ridge regression (GWRR) model for the leaf-nodes' regressors. Experimental results on several public benchmark datasets show that our FARF method can achieve an average gain of about 0.3 dB compared to traditional RF-based methods. Furthermore, a fine-tuned FARF model can compare to or (in many cases) outperform some recent state-of-the-art deep-learning-based algorithms.
• This paper presents a framework for intrinsic point of interest discovery from trajectory databases. Intrinsic points of interest are regions of a geospatial area innately defined by the spatial and temporal aspects of trajectory data, and can be of varying size, shape, and resolution. Any trajectory database exhibits such points of interest, which is why they are called intrinsic, in contrast to most other point of interest definitions, which are said to be extrinsic because they require trajectory metadata, external knowledge about the region in which the trajectories are observed, or other application-specific information. Spatial and temporal aspects are qualities of any trajectory database, making the framework applicable to data from any domain and of any resolution. The framework builds on recent developments on the consistency of nonparametric hierarchical density estimators and enables formal statistical inference and evaluation over such intrinsic points of interest. Comparisons of the POIs uncovered by the framework in synthetic truth data to thousands of parameter settings for common POI discovery methods show a marked improvement in fidelity without the need to tune any parameters by hand.
• Dec 15 2017 cs.CV cs.LG arXiv:1712.05245v1
Deep learning with 3D data such as reconstructed point clouds and CAD models has received great research interest recently. However, the use of point clouds with convolutional neural networks has so far not been fully explored. In this technical report, we present a convolutional neural network for semantic segmentation and object recognition with 3D point clouds. At the core of our network is point-wise convolution, a convolution operator that can be applied at each point of a point cloud. Our fully convolutional network design, while being simple to implement, can yield competitive accuracy in both semantic segmentation and object recognition tasks.
• Most existing correlation filter-based tracking approaches only estimate simple axis-aligned bounding boxes, and very few of them are capable of recovering the underlying similarity transformation. To a large extent, this limitation restricts the use of such trackers in a wide range of scenarios. In this paper, we propose a novel correlation filter-based tracker with robust estimation of the similarity transformation under large displacements to tackle this challenging problem. To search such a large 4-DoF space efficiently in real time, we decompose the problem into two 2-DoF sub-problems and apply an efficient Block Coordinates Descent solver to optimize the estimation result. Specifically, we employ an efficient phase correlation scheme to deal with both scale and rotation changes simultaneously in log-polar coordinates. Moreover, a fast variant of the correlation filter is used to predict the translational motion separately. Our experimental results demonstrate that the proposed tracker achieves very promising prediction performance compared with state-of-the-art visual object tracking methods while retaining the efficiency and simplicity of conventional correlation filter-based tracking methods.
• Modeling of music audio semantics has been previously tackled through learning of mappings from audio data to high-level tags or latent unsupervised spaces. The resulting semantic spaces are theoretically limited, either because the chosen high-level tags do not cover all of music semantics or because audio data itself is not enough to determine music semantics. In this paper, we propose a generic framework for semantics modeling that focuses on the perception of the listener, through EEG data, in addition to audio data. We implement this framework using a novel end-to-end 2-view Neural Network (NN) architecture and a Deep Canonical Correlation Analysis (DCCA) loss function that forces the semantic embedding spaces of both views to be maximally correlated. We also detail how the EEG dataset was collected and use it to train our proposed model. We evaluate the learned semantic space in a transfer learning context, by using it as an audio feature extractor in an independent dataset and proxy task: music audio-lyrics cross-modal retrieval. We show that our embedding model outperforms Spotify features and performs comparably to a state-of-the-art embedding model that was trained on 700 times more data. We further propose improvements to the model that are likely to improve its performance.
• Dec 15 2017 cs.CL cs.AI cs.IR arXiv:1712.05191v1
With the advent of the Internet, a large amount of digital text is generated every day in the form of news articles, research publications, blogs, question answering forums and social media. It is important to develop techniques for extracting information automatically from these documents, as a lot of important information is hidden within them. This extracted information can be used to improve access to and management of knowledge hidden in large text corpora. Several applications, such as Question Answering and Information Retrieval, would benefit from this information. Entities like persons and organizations form the most basic unit of the information. Occurrences of entities in a sentence are often linked through well-defined relations; e.g., occurrences of a person and an organization in a sentence may be linked through relations such as "employed at". The task of Relation Extraction (RE) is to identify such relations automatically. In this paper, we survey several important supervised, semi-supervised and unsupervised RE techniques. We also cover the paradigms of Open Information Extraction (OIE) and Distant Supervision. Finally, we describe some of the recent trends in RE techniques and possible future research directions. This survey should be useful for three kinds of readers: i) newcomers to the field who want to quickly learn about RE; ii) researchers who want to know how the various RE techniques evolved over time and what the possible future research directions are; and iii) practitioners who just need to know which RE technique works best in various settings.
• We introduce a pair of tools, Rasa NLU and Rasa Core, which are open source Python libraries for building conversational software. Their purpose is to make machine-learning-based dialogue management and language understanding accessible to non-specialist software developers. In terms of design philosophy, we aim for ease of use and for bootstrapping from minimal (or no) initial training data. Both packages are extensively documented and ship with a comprehensive suite of tests. The code is available at https://github.com/RasaHQ/
• Recurrent Neural Networks (RNNs) are powerful sequence modeling tools. However, when dealing with high dimensional inputs, the training of RNNs becomes computationally expensive due to the large number of model parameters. This hinders RNNs from solving many important computer vision tasks, such as Action Recognition in Videos and Image Captioning. To overcome this problem, we propose a compact and flexible structure, namely Block-Term tensor decomposition, which greatly reduces the parameters of RNNs and improves their training efficiency. Compared with alternative low-rank approximations, such as tensor-train RNN (TT-RNN), our method, Block-Term RNN (BT-RNN), is not only more concise (when using the same rank), but also able to attain a better approximation to the original RNNs with far fewer parameters. On three challenging tasks, including Action Recognition in Videos, Image Captioning and Image Generation, BT-RNN outperforms TT-RNN and the standard RNN in terms of both prediction accuracy and convergence rate. Specifically, BT-LSTM utilizes 17,388 times fewer parameters than the standard LSTM to achieve an accuracy improvement over 15.6\% in the Action Recognition task on the UCF11 dataset.
• When using deep neural networks, it is crucial to provide appropriate input representations for the network to learn from. In this paper, we propose an approach to learn a representation that focuses on rhythm, which we name DLR (Deep Learning Rhythmic representation). The proposed approach aims to learn DLR from the raw audio signal and use it for other music informatics tasks. A 1-dimensional convolutional network is utilised in the learning of DLR. In the experiment, we present the results from the source task and the target task as well as visualisations of DLRs. The results reveal that DLR provides compact rhythmic information which can be used on a multi-tagging task.
• Hiring the head coach of a college sports team is a vital decision that strongly influences the team's later development. However, most attention has focused on each coach's individual features, and a systematic, quantitative analysis of the whole coach hiring market is lacking. In a coach hiring network, the coaches are effectively voting with their feet, so it is interesting to analyze which factors affect the "footprint" left by head coaches. In this paper, we collect more than 12,000 head coach hiring records in two popular sports from the NCAA. Using network-based methods, we build the coach hiring networks for NCAA men's basketball and football. We find that: (1) the coach hiring network shows great inequality in coach production, with a Gini coefficient close to 0.60; (2) coaches prefer to work within the same geographical region and the same division as their alma maters; (3) the coach production rankings we calculate using network-based methods are generally correlated with authoritative rankings, but disagree with them in specific time periods. The results provide a novel view and better understanding of the coach hiring market in the NCAA and shed new light on the coach hiring system.
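For readers unfamiliar with the inequality measure cited in the abstract above, the following is a minimal sketch of the standard Gini coefficient computed over a list of non-negative counts. The function name `gini` and the example counts are hypothetical; the paper's actual data and its exact definition of "coach production" are not reproduced here.

```python
def gini(values):
    """Gini coefficient of a list of non-negative numbers.

    Uses the sorted-rank formula
        G = 2 * sum_i(i * x_i) / (n * sum_i(x_i)) - (n + 1) / n
    with ranks i = 1..n over the values in ascending order.
    Returns 0.0 for an empty list or an all-zero list.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((rank + 1) * x for rank, x in enumerate(xs))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


# Hypothetical example: number of head coaches "produced" by five schools.
production = [0, 1, 1, 2, 16]
coefficient = gini(production)
```

A value of 0 means every school produces the same number of coaches, while values near 1 mean production is concentrated in very few schools.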
• This paper proposes an approach for robots to perform co-working tasks alongside humans by using neuro-dynamical models. The proposed model comprises two components: an Autoencoder and a hierarchical recurrent neural network (RNN). We trained the hierarchical RNN with various sensory-motor sequences and instructions. To acquire the interactive ability to switch and combine appropriate motions according to visual information and external instructions, we embedded cyclic neuronal dynamics in the network. To evaluate our model, we designed a cloth-folding task that consists of four short folding motions and three patterns of instruction that indicate the direction of each short motion. The results showed that the robot can perform the task by switching or combining short motions according to instructions and visual information. We also showed that the proposed model acquired relationships between the instructions and sensory-motor information in its internal neuronal dynamics.
• Dec 15 2017 cs.CR arXiv:1712.05090v1
Virtualization has become more important as cloud computing grows in popularity, and cloud customers increasingly demand security. AMD plans to provide Secure Encrypted Virtualization (SEV) technology in its latest processor, EPYC, to protect virtual machines by encrypting their memory, but without integrity protection. In this paper, we analyze the weakness in the SEV design caused by this lack of integrity protection, which makes it less secure than intended. Exploiting a design flaw in the physical address-based tweak algorithm that is meant to protect against ciphertext block move attacks, we found a realistic attack against SEV that could obtain root privilege on an encrypted virtual machine protected by SEV. Because SEV-enabled machines are not yet available on the market, we demonstrate a simulated attack against a virtual machine on a Ryzen machine that supports Secure Memory Encryption (SME) technology.
• We study domain-specific video streaming. Specifically, we target a streaming setting where the videos to be streamed from a server to a client are all in the same domain and they have to be compressed to a small size for low-latency transmission. Several popular video streaming services, such as the video game streaming services of GeForce Now and Twitch, fall in this category. While conventional video compression standards such as H.264 are commonly used for this task, we hypothesize that one can leverage the property that the videos are all in the same domain to achieve better video quality. Based on this hypothesis, we propose a novel video compression pipeline. Specifically, we first apply H.264 to compress domain-specific videos. We then train a novel binary autoencoder to encode the leftover domain-specific residual information frame-by-frame into binary representations. These binary representations are then compressed and sent to the client together with the H.264 stream. In our experiments, we show that our pipeline yields consistent gains over standard H.264 compression across several benchmark datasets while using the same channel bandwidth.
• Adaptability is central to autonomy. Intuitively, for high-dimensional learning problems such as navigating based on vision, internal models with higher complexity allow the available information to be encoded more accurately. However, most learning methods rely on models with a fixed structure and complexity. In this paper, we present a self-supervised framework for robots to learn to navigate, without any prior knowledge of the environment, by incrementally building the structure of a deep network as new data becomes available. Our framework captures images from a monocular camera and self-labels the images to continuously train and predict actions from a computationally efficient adaptive deep architecture based on Autoencoders (AE), in a self-supervised fashion. The deep architecture, named Reinforced Adaptive Denoising Autoencoders (RA-DAE), uses reinforcement learning to dynamically change the network structure by adding or removing neurons. Experiments were conducted in simulation and in real-world indoor and outdoor environments to assess the potential of self-supervised navigation. RA-DAE demonstrates better performance than equivalent non-adaptive deep learning alternatives and can continue to expand its knowledge, trading off past and present information.
• Existing single-view 3D face reconstruction methods can produce beautifully detailed 3D results, but typically only for near-frontal, unobstructed viewpoints. We describe a system designed to provide detailed 3D reconstructions of faces viewed under extreme conditions, out-of-plane rotations, and occlusions. Motivated by the concept of bump mapping, we propose a layered approach which decouples estimation of a global shape from its mid-level details (e.g., wrinkles). We estimate a coarse 3D face shape which acts as a foundation and then separately layer this foundation with details represented by a bump map. We show how a deep convolutional encoder-decoder can be used to estimate such bump maps. We further show how this approach naturally extends to generate plausible details for occluded facial regions. We test our approach and its components extensively, quantitatively demonstrating the invariance of our estimated facial details. We further provide numerous qualitative examples showing that our method produces detailed 3D face shapes in viewing conditions where existing state-of-the-art methods often break down.
• We propose a weakly supervised temporal action localization algorithm on untrimmed videos using convolutional neural networks. Our algorithm predicts temporal intervals of human actions given video-level class labels with no requirement of temporal localization information of actions. This objective is achieved by proposing a novel deep neural network that recognizes actions and identifies a sparse set of key segments associated with the actions through adaptive temporal pooling of video segments. We design the loss function of the network to comprise two terms--one for classification error and the other for sparsity of the selected segments. After recognizing actions with sparse attention weights for key segments, we extract temporal proposals for actions using temporal class activation mappings to estimate time intervals that localize target actions. The proposed algorithm attains state-of-the-art accuracy on the THUMOS14 dataset and outstanding performance on ActivityNet1.3 even with weak supervision.
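The two-term loss described in the abstract above (classification error plus sparsity of the selected segments) can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration: the attention-weighted sum used as temporal pooling, the binary cross-entropy classification term, the L1 sparsity penalty, the weight `beta`, and the name `sparse_pooling_loss` are not taken from the paper.

```python
import math


def sparse_pooling_loss(segment_scores, attention, labels, beta=0.1):
    """Return (total, classification, sparsity) for one video.

    segment_scores: T lists of C per-segment class logits.
    attention:      T weights in [0, 1], one per segment.
    labels:         C video-level binary labels (0 or 1).
    Assumption: total = binary cross-entropy + beta * L1 norm of attention.
    """
    num_classes = len(labels)
    # Attention-weighted temporal pooling of segment logits into video logits.
    pooled = [
        sum(a * scores[c] for a, scores in zip(attention, segment_scores))
        for c in range(num_classes)
    ]
    eps = 1e-12  # guards the logarithms against log(0)
    classification = 0.0
    for logit, y in zip(pooled, labels):
        p = 1.0 / (1.0 + math.exp(-logit))
        classification -= y * math.log(p + eps) + (1 - y) * math.log(1.0 - p + eps)
    classification /= num_classes
    # L1 penalty pushes most attention weights toward zero (few key segments).
    sparsity = sum(abs(a) for a in attention)
    return classification + beta * sparsity, classification, sparsity


# Hypothetical usage: 3 segments, 2 classes, only the second segment attended.
scores = [[0.2, -1.0], [3.0, -2.0], [0.1, 0.0]]
total, cls_term, sparse_term = sparse_pooling_loss(scores, [0.0, 1.0, 0.0], [1, 0])
```

In such a setup, a larger `beta` would trade classification accuracy for a smaller set of attended segments, which is what makes the attention weights usable for localization.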
• A Chair of Software Engineering existed at ETH Zurich, the Swiss Federal Institute of Technology, from 1 October 2001 to 31 January 2016, under my leadership. Our work, summarized here, covered a wide range of theoretical and practical topics, with object technology in the Eiffel method as the unifying thread.
• Recent studies have discovered that deep networks are capable of memorizing the entire data even when the labels are completely random. Since deep models are trained on big data where labels are often noisy, the ability to overfit noise can lead to poor performance. To overcome the overfitting on corrupted training data, we propose a novel technique to regularize deep networks in the data dimension. This is achieved by learning a neural network called MentorNet to supervise the training of the base network, namely, StudentNet. Our work is inspired by curriculum learning and advances the theory by learning a curriculum from data by neural networks. We demonstrate the efficacy of MentorNet on several benchmarks. Comprehensive experiments show that it is able to significantly improve the generalization performance of the state-of-the-art deep networks on corrupted training data.
• Skeletal bone age assessment is a common clinical practice to diagnose endocrine and metabolic disorders in child development. In this paper, we describe a fully automated deep learning approach to the problem of bone age assessment using data from the Pediatric Bone Age Challenge organized by RSNA 2017. The dataset for this competition consists of 12.6k radiological images of left hands labeled with the bone age and sex of the patients. Our approach utilizes several deep learning architectures: U-Net, ResNet-50, and custom VGG-style neural networks trained end-to-end. We use images of whole hands as well as specific parts of a hand for both training and inference. This approach allows us to measure the importance of specific hand bones for automated bone age analysis. We further evaluate the performance of the method in the context of skeletal development stages. Our approach outperforms other common methods for bone age assessment.
• Deep Learning (DL) aims at learning \emph{meaningful} representations. A meaningful representation is one that gives rise to a significant performance improvement in associated Machine Learning (ML) tasks when it replaces the raw data as the input. However, optimal architecture design and model parameter estimation in DL algorithms are widely considered to be intractable. Evolutionary algorithms are well suited to complex and non-convex problems because they are gradient-free and insensitive to local optima. In this paper, we propose a computationally economical algorithm for evolving \emph{unsupervised} deep neural networks to efficiently learn \emph{meaningful} representations, which is very suitable in the current Big Data era where sufficient labeled data for training is often expensive to acquire. In the proposed algorithm, finding an appropriate architecture and the initial parameter values for the ML task at hand is modeled by a computationally efficient gene encoding approach, which is employed to effectively model tasks with a large number of parameters. In addition, a local search strategy is incorporated to facilitate exploitation and further improve performance. Furthermore, a small proportion of labeled data is utilized during the evolutionary search to guarantee that the learnt representations are meaningful. The performance of the proposed algorithm has been thoroughly investigated over classification tasks. Specifically, the proposed algorithm consistently reaches a classification error rate of $1.15\%$ on MNIST, which is a very promising result against state-of-the-art unsupervised DL algorithms.
• Convolutional auto-encoders have shown remarkable performance when stacked into deep convolutional neural networks for classifying image data over the past several years. However, they are unable to construct state-of-the-art convolutional neural networks due to their intrinsic architectures. In this regard, we propose a flexible convolutional auto-encoder by eliminating the constraints on the numbers of convolutional layers and pooling layers in the traditional convolutional auto-encoder. We also design an architecture discovery method based on particle swarm optimization, which is capable of automatically searching for the optimal architectures of the proposed flexible convolutional auto-encoder with much less computational resource and without any manual intervention. We use the designed architecture optimization algorithm to test the proposed flexible convolutional auto-encoder, using a single graphics processing unit card, on four extensively used image classification datasets. Experimental results show that our work significantly outperforms the peer competitors, including the state-of-the-art algorithm.
• This paper considers the scheduling of parallel real-time tasks with arbitrary deadlines. Each job of a parallel task is described as a directed acyclic graph (DAG). In contrast to prior work in this area, where decomposition-based scheduling algorithms are proposed based on the DAG structure and inter-task interference is analyzed as self-suspending behavior, this paper generalizes the federated scheduling approach. We propose a reservation-based algorithm, called reservation-based federated scheduling, that dominates federated scheduling. We provide general constraints for the design of such systems and prove that reservation-based federated scheduling has a constant speedup factor with respect to any optimal DAG task scheduler. Furthermore, the presented algorithm can be used in conjunction with any scheduler and scheduling analysis suitable for ordinary arbitrary-deadline sporadic task sets, i.e., without parallelism.
• Dec 15 2017 cs.CV arXiv:1712.05021v1
Hematoxylin and Eosin stained histopathology image analysis is essential for the diagnosis and study of complicated diseases such as cancer. Existing state-of-the-art approaches demand an extensive amount of supervised training data from trained pathologists. In this work, we synthesize, in an unsupervised manner, large histopathology image datasets suitable for supervised training tasks. We propose a unified pipeline that: a) generates a set of initial synthetic histopathology images with paired information about the nuclei such as segmentation masks; b) refines the initial synthetic images through a Generative Adversarial Network (GAN) to reference styles; c) trains a task-specific CNN and boosts the performance of the task-specific CNN with on-the-fly generated adversarial examples. Our main contribution is that the synthetic images are not only realistic, but also representative (in reference styles) and relatively challenging for training task-specific CNNs. We test our method for nucleus segmentation using images from four cancer types. When no supervised data exists for a cancer type, our method, at no supervision cost, significantly outperforms supervised methods that rely on across-cancer generalization. Even when supervised data exists for all cancer types, our approach, at no supervision cost, performs better than supervised methods.
• Face recognition has improved significantly through the use of deep convolutional neural networks. In this work, we mainly study the influence of the 2D warping module on one-shot face recognition. To this end, we first propose a 2D-Warping Layer to generate new features for the novel classes during training, and then fine-tune the network by adding the recently proposed Fisher loss to learn more discriminative features. We evaluate the proposed method on two popular databases for unconstrained face recognition, the Labeled Faces in the Wild (LFW) and the YouTube Faces (YTF) databases. In both cases, the proposed method achieves competitive results, with accuracies of 99.25% for LFW and 94.3% for YTF, respectively. Moreover, the experimental results on the MS-Celeb-1M one-shot faces dataset show that, with the proposed method, the model achieves a comparable coverage rate of 77.92% at precision = 99% for the novel classes while still keeping a top-1 accuracy of 99.80% for the normal classes.
• Dec 15 2017 cs.LO arXiv:1712.04982v1
Imprecise and incomplete specification of system configurations threatens safety, security, functionality, and other critical system properties and needlessly enlarges the configuration spaces to be searched by configuration engineers and auto-tuners. To address these problems, this paper introduces interpreted formalisms based on real-world types for configurations. Configuration values are lifted to values of real-world types, which we formalize as subset types in Coq. Values of these types are dependent pairs whose components are values of underlying Coq types and proofs of additional properties about them. Real-world types both extend and further constrain machine-level configurations, enabling richer, proof-based checking of their consistency with real-world constraints. Tactic-based proof scripts are written once to automate the construction of proofs, if proofs exist, for configuration fields and whole configurations. Failures to prove reveal real-world type errors. Evaluation is based on a case study of combinatorial optimization of Hadoop performance by meta-heuristic search over Hadoop configuration spaces.
• In this paper, we present a Model Predictive Control (MPC) framework based on the path velocity decomposition paradigm for autonomous driving. The optimization underlying the MPC has a two-layer structure: first, an appropriate path is computed for the vehicle, and then the optimal forward velocity along it is computed. The very nature of the proposed path velocity decomposition allows for seamless compatibility between the two layers of the optimization. A key feature of the proposed work is that it offloads most of the responsibility for collision avoidance to the velocity optimization layer, for which computationally efficient formulations can be derived. In particular, we extend our previously developed concept of time scaled collision cone (TSCC) constraints and formulate the forward velocity optimization layer as a convex quadratic programming problem. We perform validation on autonomous driving scenarios in which the proposed MPC repeatedly solves both optimization layers in a receding horizon manner to compute lane change, overtaking and merging maneuvers among multiple dynamic obstacles.
• Mobile virtual reality (VR) head-mounted displays (HMDs) have become popular among consumers in recent years. In this work, we demonstrate real-time egocentric hand gesture detection and localization on mobile HMDs. Our main contributions are: 1) A novel mixed-reality data collection tool to automatically annotate bounding boxes and gesture labels; 2) The largest-to-date egocentric hand gesture and bounding box dataset, with more than 400,000 annotated frames; 3) A neural network that runs in real time on modern mobile CPUs and achieves higher than 76% precision on gesture recognition across 8 classes.
• The prediction of near surface wind speed is becoming increasingly vital for the operation of electrical energy grids as the capacity of installed wind power grows. The majority of predictive wind speed modeling has focused on point-based time-series forecasting. Effectively balancing demand and supply in the presence of distributed wind turbine electricity generation, however, requires the prediction of wind fields in space and time. Additionally, predictions of full wind fields are particularly useful for future power planning such as the optimization of electricity power supply systems. In this paper, we propose a composite artificial neural network (ANN) model to predict the 6-hour and 24-hour ahead average wind speed over a large area (~$3.15\times10^{6}$ km$^2$). The ANN model consists of a convolutional input layer, a Long Short-Term Memory (LSTM) hidden layer, and a transposed convolutional layer as the output layer. We compare the ANN model with two non-parametric models, a null persistence model and a mean value model, and find that the ANN model has substantially smaller error than each of these models. Additionally, the ANN model also generally performs better than integrated autoregressive moving average models, which are trained for optimal performance in specific locations.
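The two non-parametric baselines named in the abstract above can be illustrated with a minimal sketch. It assumes the usual reading of the terms: a persistence model forecasts the value at time t + h as the value observed at time t, and a mean value model forecasts the mean of the training data. The function names, the RMSE metric, and the toy wind speeds are illustrative assumptions, not taken from the paper.

```python
from math import sqrt

def persistence_forecast(series, horizon):
    """Forecast the value at t + horizon as the value observed at t."""
    return series[:-horizon]

def mean_value_forecast(train, n):
    """Forecast every future step as the mean of the training data."""
    mean = sum(train) / len(train)
    return [mean] * n

def rmse(pred, actual):
    """Root-mean-square error between two equally long sequences."""
    return sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual))

# Hypothetical hourly wind speeds (m/s) at a single grid cell.
speeds = [5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 8.5]
horizon = 6                          # 6-hour-ahead forecast
targets = speeds[horizon:]           # values to be predicted

persist = persistence_forecast(speeds, horizon)
mean_pred = mean_value_forecast(speeds[:horizon], len(targets))

persistence_rmse = rmse(persist, targets)
mean_rmse = rmse(mean_pred, targets)
```

For a gridded wind field, the same functions could be applied independently to each grid cell; a learned model is only useful if its error is below both baselines.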
• Depth cameras make it possible to set up reliable solutions for people monitoring and behavior understanding, especially when unstable or poor illumination conditions make common RGB sensors unusable. Therefore, we propose a complete framework for the estimation of the head and shoulder pose based on depth images only. A head detection and localization module is also included, in order to develop a complete end-to-end system. The core element of the framework is a Convolutional Neural Network, called POSEidon+, that receives three types of images as input and provides the 3D angles of the pose as output. Moreover, a Face-from-Depth component based on a Deterministic Conditional GAN model is able to hallucinate a face from the corresponding depth image, and we empirically demonstrate that this positively impacts system performance. We test the proposed framework on two public datasets, namely Biwi Kinect Head Pose and ICT-3DHP, and on Pandora, a new challenging dataset mainly inspired by the automotive setup. Experimental results show that our method outperforms all recent state-of-the-art works based on both intensity and depth input data, running in real time at more than 30 frames per second.
• Motivation: Automatically testing changes to code is an essential feature of continuous integration. For open-source code, without licensed dependencies, a variety of continuous integration services exist. The COnstraint-Based Reconstruction and Analysis (COBRA) Toolbox is a suite of open-source code for computational modelling with dependencies on licensed software. A novel automated framework of continuous integration in a semi-licensed environment is required for the development of the COBRA Toolbox and related tools of the COBRA community. Results: ARTENOLIS is a general-purpose infrastructure software application that implements continuous integration for open-source software with licensed dependencies. It uses a master-slave framework, tests code on multiple operating systems, and multiple versions of licensed software dependencies. ARTENOLIS ensures the stability, integrity, and cross-platform compatibility of code in the COBRA Toolbox and related tools. Availability and Implementation: The continuous integration server, core of the reproducibility and testing infrastructure, can be freely accessed under artenolis.lcsb.uni.lu. The continuous integration framework code is located in the /.ci directory and at the root of the repository freely available under github.com/opencobra/cobratoolbox.
• By means of network densification, ultra dense networks (UDNs) can efficiently broaden the network coverage and enhance the system throughput. In parallel, unmanned aerial vehicle (UAV) communications and networking have attracted increasing attention recently due to their high agility and numerous applications. In this article, we present a vision of UAV-supported UDNs. Firstly, we present four representative scenarios to show the broad applications of UAV-supported UDNs in communications, caching and energy transfer. Then, we highlight efficient power control in UAV-supported UDNs by discussing the main design considerations and methods in a comprehensive manner. Furthermore, we demonstrate the performance superiority of UAV-supported UDNs over traditional fixed-infrastructure-based networks via case study simulations. In addition, we discuss the dominant technical challenges and open issues ahead.
• The domain name system (DNS) translates human-friendly web addresses to computer-readable Internet Protocol addresses. This basic infrastructure is insecure and can be manipulated. Deployment of technology to secure the DNS has been slow, reaching about 20% of all websites based in the USA. Little is known about the efforts hospitals and health systems make to secure the domain name system for their websites. To investigate the prevalence of implementing Domain Name System Security Extensions (DNSSEC), we analyzed the websites of the 210 public hospitals in the state of Illinois, USA. Only one Illinois hospital website was found to have implemented DNSSEC by December 2017.
• Emerging 5G systems will need to efficiently support both broadband traffic (eMBB) and ultra-low-latency (URLLC) traffic. In these systems, time is divided into slots which are further sub-divided into minislots. From a scheduling perspective, eMBB resource allocations occur at slot boundaries, whereas to reduce latency URLLC traffic is pre-emptively overlapped at the minislot timescale, resulting in selective superposition/puncturing of eMBB allocations. This approach enables minimal URLLC latency at a potential rate loss to eMBB traffic. We study joint eMBB and URLLC schedulers for such systems, with the dual objectives of maximizing utility for eMBB traffic while satisfying instantaneous URLLC demands. For a linear rate loss model (loss to eMBB is linear in the amount of superposition/puncturing), we derive an optimal joint scheduler. Somewhat counter-intuitively, our results show that our dual objectives can be met by an iterative gradient scheduler for eMBB traffic that anticipates the expected loss from URLLC traffic, along with a URLLC demand scheduler that is oblivious to the eMBB channel states, utility functions and allocation decisions of the eMBB scheduler. Next, we consider a more general class of (convex) loss models and study optimal online joint eMBB/URLLC schedulers within the broad class of channel-state-dependent but time-homogeneous policies. We validate the characteristics and benefits of our schedulers via simulation.
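The linear rate loss model mentioned in the abstract above can be made concrete with a small sketch. It assumes one simple reading of "linear in the amount of superposition/puncturing": the eMBB rate in a slot shrinks in proportion to the fraction of the user's allocated resources that URLLC traffic punctures. The function name, its parameters, and the numbers are hypothetical and not taken from the paper.

```python
def embb_rate_linear_loss(base_rate, allocated, punctured):
    """eMBB rate in one slot under an assumed linear loss model.

    base_rate: rate the user would achieve with no puncturing
    allocated: resources (e.g. minislot-frequency units) given to the user
    punctured: how many of those resources URLLC traffic overwrote
    """
    if allocated <= 0:
        return 0.0
    if not 0 <= punctured <= allocated:
        raise ValueError("punctured must lie between 0 and allocated")
    # Loss is linear in the punctured fraction of the allocation.
    return base_rate * (1.0 - punctured / allocated)

# Hypothetical slot: 8 units allocated, 2 of them punctured by URLLC.
rate = embb_rate_linear_loss(10.0, 8, 2)
```

A convex loss model, as considered later in the abstract, would replace the linear factor with a more general function of the punctured fraction.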
• Precise 3D segmentation of infant brain tissues is an essential step towards comprehensive volumetric studies and quantitative analysis of early brain development. However, computing such segmentations is very challenging, especially for the infant brain at around 6 months of age, due to the poor image quality, among other difficulties inherent to infant brain MRI, e.g., the isointense contrast between white and gray matter and the severe partial volume effect due to small brain sizes. This study investigates the problem with an ensemble of semi-dense fully convolutional neural networks (CNNs), which employs T1-weighted and T2-weighted MR images as input. We demonstrate that the ensemble agreement is highly correlated with the segmentation errors. Therefore, our method provides measures that can guide local user corrections. To the best of our knowledge, this work is the first ensemble of 3D CNNs for suggesting annotations within images. Furthermore, inspired by the very recent success of dense networks, we propose a novel architecture, SemiDenseNet, which connects all convolutional layers directly to the end of the network. Our architecture allows the efficient propagation of gradients during training while limiting the number of parameters, requiring one order of magnitude fewer parameters than popular medical image segmentation networks such as 3D U-Net. Another contribution of our work is the study of the impact that early or late fusion of multiple image modalities might have on the performance of deep architectures. We report evaluations of our method on the public data of the MICCAI iSEG-2017 Challenge on 6-month infant brain MRI segmentation, and show very competitive results among 21 teams, ranking first or second in most metrics.
• Dec 15 2017 cs.LO cs.CC arXiv:1712.05310v1
Various extensions of public announcement logic have been proposed with quantification over announcements. The best-known extension is called arbitrary public announcement logic, APAL. It contains a primitive language construct Box phi intuitively expressing that 'after every public announcement of a formula, formula phi is true.' The logic APAL is undecidable and it has an infinitary axiomatization. Now consider restricting the APAL quantification to public announcements of boolean formulas only, such that Box phi intuitively expresses that 'after every public announcement of a boolean formula, formula phi is true.' This logic can therefore be called boolean arbitrary public announcement logic, BAPAL. The logic BAPAL is the subject of this work. It is decidable and it has a finitary axiomatization. These results may be considered of interest, as for various applications quantification over booleans is sufficient in formal specifications.
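The intuitive reading of Box phi under boolean announcements can be illustrated on a finite model. The sketch below is a toy evaluator, not the decision procedure from the paper. It relies on one observation: on a finite model with finitely many atoms, the worlds satisfying a boolean formula always form a union of groups of worlds that share a valuation, and every such union is definable by a boolean formula. A truthful announcement must also keep the current world. The formula encoding as tuples and all names are assumptions made for illustration.

```python
from itertools import combinations

def holds(worlds, val, rel, w, f):
    """True if formula f holds at world w of the model restricted to `worlds`.

    val maps each world to a frozenset of true atoms; rel maps each agent
    to a set of (world, world) pairs. Formulas are nested tuples.
    """
    op = f[0]
    if op == "atom":
        return f[1] in val[w]
    if op == "not":
        return not holds(worlds, val, rel, w, f[1])
    if op == "and":
        return (holds(worlds, val, rel, w, f[1])
                and holds(worlds, val, rel, w, f[2]))
    if op == "K":  # ("K", agent, formula): the agent knows the formula
        return all(holds(worlds, val, rel, v, f[2])
                   for (u, v) in rel[f[1]] if u == w and v in worlds)
    if op == "box":  # after every truthful boolean announcement
        classes = {}
        for u in worlds:
            classes.setdefault(val[u], set()).add(u)
        own = classes.pop(val[w])        # the announcement is true at w
        others = list(classes.values())
        for k in range(len(others) + 1):
            for combo in combinations(others, k):
                restricted = own.union(*combo)
                if not holds(restricted, val, rel, w, f[1]):
                    return False
        return True
    raise ValueError(f"unknown operator: {op!r}")

# Agent a cannot tell w1 (p true) from w2 (p false).
worlds = {"w1", "w2"}
val = {"w1": frozenset({"p"}), "w2": frozenset()}
rel = {"a": {(u, v) for u in worlds for v in worlds}}
knows_p = ("K", "a", ("atom", "p"))

box_result = holds(worlds, val, rel, "w1", ("box", knows_p))
diamond_result = holds(worlds, val, rel, "w1",
                       ("not", ("box", ("not", knows_p))))
```

In this toy model the trivial announcement leaves agent a uncertain about p, while announcing p removes w2, so some boolean announcement makes a know p but not every one does. The enumeration is exponential in the number of valuation groups and is meant only to make the semantics tangible.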
• Vehicular networks are one of the cornerstones of an Intelligent Transportation System (ITS). They are expected to provide ubiquitous network connectivity to moving vehicles while supporting various ITS services, some with very stringent requirements in terms of latency and reliability. Two vehicular networking technologies are envisioned to jointly support the full range of ITS services: DSRC (Dedicated Short Range Communication) for direct vehicle to vehicle/Road Side Unit (RSU) communications, and cellular technologies. To the best of our knowledge, approaches from the literature usually divide ITS services between these networks according to their requirements, with a single network in charge of supporting each service. Those that use both network technologies to offer multi-path routing, load balancing or path splitting for a better quality of experience of ITS services evidently assume separately controlled networks. Under the umbrella of SDN (Software Defined Networking), we propose in this paper a hybrid network architecture that enables the joint control of the networks providing connectivity to multi-homed vehicles, and we explore the opportunities brought by such an architecture. We show through some use cases that, in addition to the flexibility and fine-grained programmability brought by SDN, it opens the way towards the development of effective network control algorithms that are key to the successful support of ITS services, especially those with stringent QoS. We also show how these algorithms could benefit from information related to the environment or context in which vehicles evolve (traffic density, planned trajectory, etc.), which could be easily collected by data providers and made available via the cloud.
• This paper considers the integrated problem of quay crane assignment, quay crane scheduling, yard location assignment, and vehicle dispatching operations at a container terminal. The main objective is to minimize vessel turnover times and maximize the terminal throughput, which are key economic drivers in terminal operations. Due to their computational complexities, these problems are not optimized jointly in existing work. This paper revisits this limitation and proposes Mixed Integer Programming (MIP) and Constraint Programming (CP) models for the integrated problem, under some realistic assumptions. Experimental results show that the MIP formulation can only solve small instances, while the CP model finds optimal solutions in reasonable times for realistic instances derived from actual container terminal operations.
• We consider the symbolic controller synthesis approach to enforce safety specifications on perturbed, nonlinear control systems. In general, in each state of the system several control values might be applicable to enforce the safety requirement and in the implementation one has the burden of picking a particular control value out of possibly many. We present a class of implementation strategies to obtain a controller with certain performance guarantees. This class includes two existing implementation strategies from the literature, based on discounted payoff and mean-payoff games. We unify both approaches by using games characterized by a single discount factor determining the implementation. We evaluate different implementations from our class experimentally on two case studies. We show that the choice of the discount factor has a significant influence on the average long-term costs, and the best performance guarantee for the symbolic model does not result in the best implementation. Comparing the optimal choice of the discount factor here with the previously proposed values, the costs differ by a factor of up to 50. Our approach therefore yields a method to choose systematically a good implementation for safety controllers with quantitative objectives.
• In recent years, neural networks have been used to generate music pieces, especially symbolic melody. However, the long-term structure in the melody has posed great difficulty for designing a good model. In this paper, we present a hierarchical recurrent neural network for melody generation, which consists of three Long Short-Term Memory (LSTM) subnetworks working in a coarse-to-fine manner. Specifically, the three subnetworks generate bar profiles, beat profiles and notes in turn, and the outputs of the high-level subnetworks are fed into the low-level subnetworks, serving as guidance for generating the finer time-scale melody components. Two human behavior experiments demonstrate the advantage of this structure over a single-layer LSTM, which attempts to learn all hidden structures in melodies. In the third human behavior experiment, subjects are asked to judge whether the generated melody was composed by a human or a computer. The results show that 33.69% of the generated melodies are wrongly classified as human composed.

Bin Shi Oct 05 2017 00:07 UTC

Welcome to give the comments for this paper!

gae Jul 26 2017 21:19 UTC

For those interested in the literature on teleportation simulation of quantum channels, a detailed and *comprehensive* review is provided in Supplementary Note 8 of https://images.nature.com/original/nature-assets/ncomms/2017/170426/ncomms15043/extref/ncomms15043-s1.pdf
The note describes well the t

...(continued)
SHUAI ZHANG Jul 26 2017 00:20 UTC

I am still working on improving this survey. If you have any suggestions, questions or find any mistakes, please do not hesitate to contact me: shuai.zhang@student.unsw.edu.au.

Eddie Smolansky May 26 2017 05:23 UTC

Updated summary [here](https://github.com/eddiesmo/papers).

# How they made the dataset
- collect youtube videos
- automated filtering with yolo and landmark detection projects
- crowd source final filtering (AMT - give 50 face images to turks and ask which don't belong)
- quality control through s

...(continued)
Stefano Pirandola May 05 2017 05:45 UTC

Today I have seen on the arXiv the version 2 of this paper on quantum reading. I am sorry to say that this revision still misses to acknowledge important contributions from previous works, especially in relation to the methods on channel simulation and teleportation that are crucial for its claims.

...(continued)
Lei Cui May 03 2017 09:00 UTC

what's the value for $n$ of n-grams?

Robin Blume-Kohout Apr 07 2017 20:30 UTC

Zak, David: thanks! So (I think) this is a relation problem, not a decision problem (or even a partial function). Which is fine -- I'm happier with relation problems than with sampling problems, and the quantum part of Shor's algorithm is solving a relation problem, which is a pretty good pedigre

...(continued)