Aug 22 2017 cs.DB
In exploratory data analysis, analysts often have a need to identify histograms that possess a specific distribution, among a large class of candidate histograms, e.g., find histograms of countries whose income distribution is most similar to that of Greece. This distribution could be a new one that the user is curious about, or a known distribution from an existing histogram visualization. At present, this process of identification is brute-force, requiring the manual generation and evaluation of a large number of histograms. We present FastMatch: an end-to-end architecture for interactively retrieving the histogram visualizations that are most similar to a user-specified target, from a large collection of histograms. The primary technical contribution underlying FastMatch is a sublinear algorithm, HistSim, a theoretically sound sampling-based approach to identify the top-$k$ closest histograms under $\ell_1$ distance. While HistSim can be used independently, within FastMatch we couple HistSim with a novel system architecture that is aware of practical considerations, employing block-based sampling policies and asynchronous statistics and computation, building on lightweight sampling engines developed in recent work. In our experiments on several real-world datasets, FastMatch obtains near-perfect accuracy with up to $100\times$ speedups over less sophisticated approaches.
Transfer learning borrows knowledge from a source domain to facilitate learning in a target domain. Two primary issues to be addressed in transfer learning are what and how to transfer. For a pair of domains, adopting different transfer learning algorithms results in different knowledge transferred between them. To discover the optimal transfer learning algorithm that maximally improves the learning performance in the target domain, researchers have to exhaustively explore all existing transfer learning algorithms, which is computationally intractable. As a trade-off, a sub-optimal algorithm is selected, which requires considerable expertise in an ad-hoc way. Meanwhile, it is widely accepted in educational psychology that human beings improve transfer learning skills of deciding what to transfer through meta-cognitive reflection on inductive transfer learning practices. Motivated by this, we propose a novel transfer learning framework known as Learning to Transfer (L2T) to automatically determine what and how to transfer are the best by leveraging previous transfer learning experiences. We establish the L2T framework in two stages: 1) we first learn a reflection function encrypting transfer learning skills from experiences; and 2) we infer what and how to transfer for a newly arrived pair of domains by optimizing the reflection function. Extensive experiments demonstrate the L2T's superiority over several state-of-the-art transfer learning algorithms and its effectiveness on discovering more transferable knowledge.
Aug 17 2017 cs.CL
Learning latent representations from long text sequences is an important first step in many natural language processing applications. Recurrent Neural Networks (RNNs) have become a cornerstone for this challenging task. However, the quality of sentences during RNN-based decoding (reconstruction) decreases with the length of the text. We propose a sequence-to-sequence, purely convolutional and deconvolutional autoencoding framework that is free of the above issue, while also being computationally efficient. The proposed method is simple, easy to implement and can be leveraged as a building block for many applications. We show empirically that compared to RNNs, our framework is better at reconstructing and correcting long paragraphs. Quantitative evaluation on semi-supervised text classification and summarization tasks demonstrate the potential for better utilization of long unlabeled text data.
Aug 16 2017 cs.PF
This paper presents the Graph Analytics Repository for Designing Next-generation Accelerators (GARDENIA), a benchmark suite for studying irregular algorithms on massively parallel accelerators. Existing generic benchmarks for accelerators have mainly focused on high performance computing (HPC) applications with limited control and data irregularity, while available graph analytics benchmarks do not apply state-of-the-art algorithms and/or optimization techniques. GARDENIA includes emerging irregular applications in big-data and machine learning domains which mimic massively multithreaded commercial programs running on modern large-scale datacenters. Our characterization shows that GARDENIA exhibits irregular microarchitectural behavior which is quite different from structured workloads and straightforward-implemented graph benchmarks.
Aug 16 2017 cs.DS
An efficient pairwise Boolean matching algorithm to solve the problem of matching single-output specified Boolean functions under input negation and/or input permutation and/or output negation (NPN) is proposed in this paper. We present the Structural Signature (SS) vector, which is composed of a 1st signature value, two symmetry marks, and a group mark. As a necessary condition for NPN Boolean matching, the structural signature is more effective than is the traditional signature. Two Boolean functions, f and g, may be equivalent when they have the same SS vector. The symmetry mark can distinguish symmetric variables and asymmetric variables and search multiple variable mappings in a single variable-mapping search operation, which reduces the search space significantly. Updating the SS vector using Shannon decomposition provides benefits in distinguishing unidentified variables, and the group mark and the phase collision check discover incorrect variable mappings quickly, which also speeds up the NPN Boolean matching process. Using the algorithm proposed in this paper, we tested both equivalent and non-equivalent matching peeds on the MCNC benchmark circuit sets and the random circuit sets. In the experiment, our algorithm is two times faster than competitors when testing equivalent circuits and averages at least one hundred times faster when testing non-equivalent circuits. The experimental results show that our approach is highly effective in solving the NPN Boolean matching problem.
Facing an unknown situation, a person may not be able to firmly elicit his/her preferences over different alternatives, so he/she tends to express uncertain preferences. Given a community of different persons expressing their preferences over certain alternatives under uncertainty, to get a collective representative opinion of the whole community, a preference fusion process is required. The aim of this work is to propose a preference fusion method that copes with uncertainty and escape from the Condorcet paradox. To model preferences under uncertainty, we propose to develop a model of preferences based on belief function theory that accurately describes and captures the uncertainty associated with individual or collective preferences. This work improves and extends the previous results. This work improves and extends the contribution presented in a previous work. The benefits of our contribution are twofold. On the one hand, we propose a qualitative and expressive preference modeling strategy based on belief-function theory which scales better with the number of sources. On the other hand, we propose an incremental distance-based algorithm (using Jousselme distance) for the construction of the collective preference order to avoid the Condorcet Paradox.
Accurate and efficient record linkage is an open challenge of particular relevance to Australian Government Agencies, who recognise that so-called wicked social problems are best tackled by forming partnerships founded on large-scale data fusion. Names and addresses are the most common attributes on which data from different government agencies can be linked. In this paper, we focus on the problem of address linking. Linkage is particularly problematic when the data has significant quality issues. The most common approach for dealing with quality issues is to standardise raw data prior to linking. If a mistake is made in standardisation, however, it is usually impossible to recover from it to perform linkage correctly. This paper proposes a novel algorithm for address linking that is particularly practical for linking large disparate sets of addresses, being highly scalable, robust to data quality issues and simple to implement. It obviates the need for labour intensive and problematic address standardisation. We demonstrate the efficacy of the algorithm by matching two large address datasets from two government agencies with good accuracy and computational efficiency.
Aug 04 2017 cs.CV
Low rank tensor representation underpins much of recent progress in tensor completion. In real applications, however, this approach is confronted with two challenging problems, namely (1) tensor rank determination; (2) handling real tensor data which only approximately fulfils the low-rank requirement. To address these two issues, we develop a data-adaptive tensor completion model which explicitly represents both the low-rank and non-low-rank structures in a latent tensor. Representing the non-low-rank structure separately from the low-rank one allows priors which capture the important distinctions between the two, thus enabling more accurate modelling, and ultimately, completion. Through defining a new tensor rank, we develop a sparsity induced prior for the low-rank structure, with which the tensor rank can be automatically determined. The prior for the non-low-rank structure is established based on a mixture of Gaussians which is shown to be flexible enough, and powerful enough, to inform the completion process for a variety of real tensor data. With these two priors, we develop a Bayesian minimum mean squared error estimate (MMSE) framework for inference which provides the posterior mean of missing entries as well as their uncertainty. Compared with the state-of-the-art methods in various applications, the proposed model produces more accurate completion results.
Aug 04 2017 cs.CV
In this paper, we introduce a new CT image denoising method based on the generative adversarial network (GAN) with Wasserstein distance and perceptual similarity. The Wasserstein distance is a key concept of the optimal transform theory, and promises to improve the performance of the GAN. The perceptual loss compares the perceptual features of a denoised output against those of the ground truth in an established feature space, while the GAN helps migrate the data noise distribution from strong to weak. Therefore, our proposed method transfers our knowledge of visual perception to the image denoising task, is capable of not only reducing the image noise level but also keeping the critical information at the same time. Promising results have been obtained in our experiments with clinical CT images.
Aug 04 2017 cs.SY
The composite load model (CLM) proposed by the Western Electricity Coordinating Council (WECC) is gaining increasing traction in industry, particularly in North America. At the same time, it has been recognized that further improvements in structure, initialization and aggregation methods are needed to enhance model accuracy. However, the lack of an open-source implementation of the WECC CLM has become a roadblock for many researchers for further improvement. To bridge this gap, this paper presents the first open reference implementation of the WECC CLM. Individual load components and the CLM are first developed and tested in Matlab, then translated to the high performance computing (HPC) based, parallel simulation framework - GridPACK. The main contributions of the paper include: 1) presenting important yet undocumented details of modeling and initializing the CLM, particularly for a parallel simulation frame-work like GridPACK; 2) implementation details of the load components such as the single-phase air conditioner motor; 3) implementing the CLM in a modular and extensible manner. The implementation has been tested at both the component as well as system levels and benchmarked against commercial simulation programs, with satisfactory accuracy.
During the last half decade, convolutional neural networks (CNNs) have triumphed over semantic segmentation, which is a core task of various emerging industrial applications such as autonomous driving and medical imaging. However, to train CNNs requires a huge amount of data, which is difficult to collect and laborious to annotate. Recent advances in computer graphics make it possible to train CNN models on photo-realistic synthetic data with computer-generated annotations. Despite this, the domain mismatch between the real images and the synthetic data significantly decreases the models' performance. Hence we propose a curriculum-style learning approach to minimize the domain gap in semantic segmentation. The curriculum domain adaptation solves easy tasks first in order to infer some necessary properties about the target domain; in particular, the first task is to learn global label distributions over images and local distributions over landmark superpixels. These are easy to estimate because images of urban traffic scenes have strong idiosyncrasies (e.g., the size and spatial relations of buildings, streets, cars, etc.). We then train the segmentation network in such a way that the network predictions in the target domain follow those inferred properties. In experiments, our method significantly outperforms the baselines as well as the only known existing approach to the same problem.
Compressive sensing (CS) has proved effective for tomographic reconstruction from sparsely collected data or under-sampled measurements, which are practically important for few-view CT, tomosynthesis, interior tomography, and so on. To perform sparse-data CT, the iterative reconstruction commonly use regularizers in the CS framework. Currently, how to choose the parameters adaptively for regularization is a major open problem. In this paper, inspired by the idea of machine learning especially deep learning, we unfold a state-of-the-art "fields of experts" based iterative reconstruction scheme up to a number of iterations for data-driven training, construct a Learned Experts' Assessment-based Reconstruction Network ("LEARN") for sparse-data CT, and demonstrate the feasibility and merits of our LEARN network. The experimental results with our proposed LEARN network produces a competitive performance with the well-known Mayo Clinic Low-Dose Challenge Dataset relative to several state-of-the-art methods, in terms of artifact reduction, feature preservation, and computational speed. This is consistent to our insight that because all the regularization terms and parameters used in the iterative reconstruction are now learned from the training data, our LEARN network utilizes application-oriented knowledge more effectively and recovers underlying images more favorably than competing algorithms. Also, the number of layers in the LEARN network is only 12, reducing the computational complexity of typical iterative algorithms by orders of magnitude.
Multi-Task Learning (MTL) is a learning paradigm in machine learning and its aim is to leverage useful information contained in multiple related tasks to help improve the generalization performance of all the tasks. In this paper, we give a survey for MTL. First, we classify different MTL algorithms into several categories: feature learning approach, low-rank approach, task clustering approach, task relation learning approach, dirty approach, multi-level approach and deep learning approach. In order to compare different approaches, we discuss the characteristics of each approach. In order to improve the performance of learning tasks further, MTL can be combined with other learning paradigms including semi-supervised learning, active learning, reinforcement learning, multi-view learning and graphical models. When the number of tasks is large or the data dimensionality is high, batch MTL models are difficult to handle this situation and online, parallel and distributed MTL models as well as feature hashing are reviewed to reveal the computational and storage advantages. Many real-world applications use MTL to boost their performance and we introduce some representative works. Finally, we present theoretical analyses and discuss several future directions for MTL.
One of the main benefits of a wrist-worn computer is its ability to collect a variety of physiological data in a minimally intrusive manner. Among these data, electrodermal activity (EDA) is readily collected and provides a window into a person's emotional and sympathetic responses. EDA data collected using a wearable wristband are easily influenced by motion artifacts (MAs) that may significantly distort the data and degrade the quality of analyses performed on the data if not identified and removed. Prior work has demonstrated that MAs can be successfully detected using supervised machine learning algorithms on a small data set collected in a lab setting. In this paper, we demonstrate that unsupervised learning algorithms perform competitively with supervised algorithms for detecting MAs on EDA data collected in both a lab-based setting and a real-world setting comprising about 23 hours of data. We also find, somewhat surprisingly, that incorporating accelerometer data as well as EDA improves detection accuracy only slightly for supervised algorithms and significantly degrades the accuracy of unsupervised algorithms.
Jul 26 2017 cs.CL
To learn a semantic parser from denotations, a learning algorithm must search over a combinatorially large space of logical forms for ones consistent with the annotated denotations. We propose a new online learning algorithm that searches faster as training progresses. The two key ideas are using macro grammars to cache the abstract patterns of useful logical forms found thus far, and holistic triggering to efficiently retrieve the most relevant patterns based on sentence similarity. On the WikiTableQuestions dataset, we first expand the search space of an existing model to improve the state-of-the-art accuracy from 38.7% to 42.7%, and then use macro grammars and holistic triggering to achieve an 11x speedup and an accuracy of 43.7%.
Domain mismatch between training and testing can lead to significant degradation in performance in many machine learning scenarios. Unfortunately, this is not a rare situation for automatic speech recognition deployments in real-world applications. Research on robust speech recognition can be regarded as trying to overcome this domain mismatch issue. In this paper, we address the unsupervised domain adaptation problem for robust speech recognition, where both source and target domain speech are presented, but word transcripts are only available for the source domain speech. We present novel augmentation-based methods that transform speech in a way that does not change the transcripts. Specifically, we first train a variational autoencoder on both source and target domain data (without supervision) to learn a latent representation of speech. We then transform nuisance attributes of speech that are irrelevant to recognition by modifying the latent representations, in order to augment labeled training data with additional data whose distribution is more similar to the target domain. The proposed method is evaluated on the CHiME-4 dataset and reduces the absolute word error rate (WER) by as much as 35% compared to the non-adapted baseline.
Jul 19 2017 cs.CV
Although Neural networks could achieve state-of-the-art performance while recongnizing images, they often suffer a tremendous defeat from adversarial examples--inputs generated by utilizing imperceptible but intentional perturbations to samples from the datasets. How to defense against adversarial examples is an important problem which is well worth to research. So far, only two well-known methods adversarial training and defensive distillation have provided a significant defense. In contrast to existing methods mainly based on model itself, we address the problem purely based on the adversarial examples itself. In this paper, a novel idea and the first framework based Generative Adversarial Nets named AE-GAN capable of resisting adversarial examples are proposed. Extensive experiments on benchmark datasets indicate that AE-GAN is able to defense against adversarial examples effectively.
Jul 19 2017 cs.IR
The recent boom of AI has seen the emergence of many human-computer conversation systems such as Google Assistant, Microsoft Cortana, Amazon Echo and Apple Siri. We introduce and formalize the task of predicting questions in conversations, where the goal is to predict the new question that the user will ask, given the past conversational context. This task can be modeled as a "sequence matching" problem, where two sequences are given and the aim is to learn a model that maps any pair of sequences to a matching probability. Neural matching models, which adopt deep neural networks to learn sequence representations and matching scores, have attracted immense research interests of information retrieval and natural language processing communities. In this paper, we first study neural matching models for the question retrieval task that has been widely explored in the literature, whereas the effectiveness of neural models for this task is relatively unstudied. We further evaluate the neural matching models in the next question prediction task in conversations. We have used the publicly available Quora data and Ubuntu chat logs in our experiments. Our evaluations investigate the potential of neural matching models with representation learning for question retrieval and next question prediction in conversations. Experimental results show that neural matching models perform well for both tasks.
Security-critical tasks require proper isolation from untrusted software. Chip manufacturers design and include trusted execution environments (TEEs) in their processors to secure these tasks. The integrity and security of the software in the trusted environment depend on the verification process of the system. We find a form of attack that can be performed on the current implementations of the widely deployed ARM TrustZone technology. The attack exploits the fact that the trustlet (TA) or TrustZone OS loading verification procedure may use the same verification key and may lack proper rollback prevention across versions. If an exploit works on an out-of-date version, but the vulnerability is patched on the latest version, an attacker can still use the same exploit to compromise the latest system by downgrading the software to an older and exploitable version. We did experiments on popular devices on the market including those from Google, Samsung and Huawei, and found that all of them have the risk of being attacked. Also, we show a real-world example to exploit Qualcomm's QSEE. In addition, in order to find out which device images share the same verification key, pattern matching schemes for different vendors are analyzed and summarized.
Jul 18 2017 cs.CL
We propose a neural reranking system for named entity recognition (NER). The basic idea is to leverage recurrent neural network models to learn sentence-level patterns that involve named entity mentions. In particular, given an output sentence produced by a baseline NER model, we replace all entity mentions, such as \textitBarack Obama, into their entity types, such as \textitPER. The resulting sentence patterns contain direct output information, yet is less sparse without specific named entities. For example, "PER was born in LOC" can be such a pattern. LSTM and CNN structures are utilised for learning deep representations of such sentences for reranking. Results show that our system can significantly improve the NER accuracies over two different baselines, giving the best reported results on a standard benchmark.
Jul 18 2017 cs.AI
Among the many anticipated roles for robots in the future is that of being a human teammate. Aside from all the technological hurdles that have to be overcome with respect to hardware and control to make robots fit to work with humans, the added complication here is that humans have many conscious and subconscious expectations of their teammates - indeed, we argue that teaming is mostly a cognitive rather than physical coordination activity. This introduces new challenges for the AI and robotics community and requires fundamental changes to the traditional approach to the design of autonomy. With this in mind, we propose an update to the classical view of the intelligent agent architecture, highlighting the requirements for mental modeling of the human in the deliberative process of the autonomous agent. In this article, we outline briefly the recent efforts of ours, and others in the community, towards developing cognitive teammates along these guidelines.
Jul 18 2017 cs.CL
Both bottom-up and top-down strategies have been used for neural transition-based constituent parsing. The parsing strategies differ in terms of the order in which they recognize productions in the derivation tree, where bottom-up strategies and top-down strategies take post-order and pre-order traversal over trees, respectively. Bottom-up parsers benefit from rich features from readily built partial parses, but lack lookahead guidance in the parsing process; top-down parsers benefit from non-local guidance for local decisions, but rely on a strong encoder over the input to predict a constituent hierarchy before its construction.To mitigate both issues, we propose a novel parsing system based on in-order traversal over syntactic trees, designing a set of transition actions to find a compromise between bottom-up constituent information and top-down lookahead information. Based on stack-LSTM, our psycholinguistically motivated constituent parsing system achieves 91.8 F1 on WSJ benchmark. Furthermore, the system achieves 93.6 F1 with supervised reranking and 94.2 F1 with semi-supervised reranking, which are the best results on the WSJ benchmark.
Analytical electron microscopy and spectroscopy of biological specimens, polymers, and other beam sensitive materials has been a challenging area due to irradiation damage. There is a pressing need to develop novel imaging and spectroscopic imaging methods that will minimize such sample damage as well as reduce the data acquisition time. The latter is useful for high-throughput analysis of materials structure and chemistry. In this work, we present a novel machine learning based method for dynamic sparse sampling of EDS data using a scanning electron microscope. Our method, based on the supervised learning approach for dynamic sampling algorithm and neural networks based classification of EDS data, allows a dramatic reduction in the total sampling of up to 90%, while maintaining the fidelity of the reconstructed elemental maps and spectroscopic data. We believe this approach will enable imaging and elemental mapping of materials that would otherwise be inaccessible to these analysis techniques.
Jul 13 2017 cs.CR
Intel Software Guard Extension (SGX) offers software applications enclave to protect their confidentiality and integrity from malicious operating systems. The SSL/TLS protocol, which is the de facto standard for protecting transport-layer network communications, has been broadly deployed for a secure communication channel. However, in this paper, we show that the marriage between SGX and SSL may not be smooth sailing. Particularly, we consider a category of side-channel attacks against SSL/TLS implementations in secure enclaves, which we call the control-flow inference attacks. In these attacks, the malicious operating system kernel may perform a powerful man-in-the-kernel attack to collect execution traces of the enclave programs at page, cacheline, or branch level, while positioning itself in the middle of the two communicating parties. At the center of our work is a differential analysis framework, dubbed Stacco, to dynamically analyze the SSL/TLS implementations and detect vulnerabilities that can be exploited as decryption oracles. Surprisingly, we found exploitable vulnerabilities in the latest versions of all the SSL/TLS libraries we have examined. To validate the detected vulnerabilities, we developed a man-in-the-kernel adversary to demonstrate Bleichenbacher attacks against the latest OpenSSL library running in the SGX enclave (with the help of Graphene) and completely broke the PreMasterSecret encrypted by a 4096-bit RSA public key with only 57286 queries. We also conducted CBC padding oracle attacks against the latest GnuTLS running in Graphene-SGX and an open-source SGX-implementation of mbedTLS (i.e., mbedTLS-SGX) that runs directly inside the enclave, and showed that it only needs 48388 and 25717 queries, respectively, to break one block of AES ciphertext. Empirical evaluation suggests these man-in-the-kernel attacks can be completed within 1 or 2 hours.
Fourier ptychographic microscopy (FPM) is a recently proposed computational imaging technique with both high resolution and wide field-of-view. In current FP experimental setup, the dark-field images with high-angle illuminations are easily submerged by stray light and background noise due to the low signal-to-noise ratio, thus significantly degrading the reconstruction quality and also imposing a major restriction on the synthetic numerical aperture (NA) of the FP approach. To this end, an overall and systematic data preprocessing scheme for noise removal from FP's raw dataset is provided, which involves sampling analysis as well as underexposed/overexposed treatments, then followed by the elimination of unknown stray light and suppression of inevitable background noise, especially Gaussian noise and CCD dark current in our experiments. The reported non-parametric scheme facilitates great enhancements of the FP's performance, which has been demonstrated experimentally that the benefits of noise removal by these methods far outweigh its defects of concomitant signal loss. In addition, it could be flexibly cooperated with the existing state-of-the-art algorithms, producing a stronger robustness of the FP approach in various applications.
Urban structure detection is a basic task in urban geography. Clustering is a core technology to detect the patterns of urban spatial structure, urban functional region, and so on. In big data era, diverse urban sensing datasets recording information like human behaviour and human social activity, suffer from complexity in high dimension and high noise. And unfortunately, the state-of-the-art clustering methods does not handle the problem with high dimension and high noise issues concurrently. In this paper, a probabilistic embedding clustering method is proposed. Firstly, we come up with a Probabilistic Embedding Model (PEM) to find latent features from high dimensional urban sensing data by learning via probabilistic model. By latent features, we could catch essential features hidden in high dimensional data known as patterns; with the probabilistic model, we can also reduce uncertainty caused by high noise. Secondly, through tuning the parameters, our model could discover two kinds of urban structure, the homophily and structural equivalence, which means communities with intensive interaction or in the same roles in urban structure. We evaluated the performance of our model by conducting experiments on real-world data and experiments with real data in Shanghai (China) proved that our method could discover two kinds of urban structure, the homophily and structural equivalence, which means clustering community with intensive interaction or under the same roles in urban space.
Recent progress in logic programming (e.g., the development of the Answer Set Programming paradigm) has made it possible to teach it to general undergraduate and even high school students. Given the limited exposure of these students to computer science, the complexity of downloading, installing and using tools for writing logic programs could be a major barrier for logic programming to reach a much wider audience. We developed an online answer set programming environment with a self contained file system and a simple interface, allowing users to write logic programs and perform several tasks over the programs.
Jul 05 2017 cs.CV
Fine-Grained Visual Categorization (FGVC) has achieved significant progress recently. However, the number of fine-grained species could be huge and dynamically increasing in real scenarios, making it difficult to recognize unseen objects under the current FGVC framework. This raises an open issue to perform large-scale fine-grained identification without a complete training set. Aiming to conquer this issue, we propose a retrieval task named One-Shot Fine-Grained Instance Retrieval (OSFGIR). "One-Shot" denotes the ability of identifying unseen objects through a fine-grained retrieval task assisted with an incomplete auxiliary training set. This paper first presents the detailed description to OSFGIR task and our collected OSFGIR-378K dataset. Next, we propose the Convolutional and Normalization Networks (CN-Nets) learned on the auxiliary dataset to generate a concise and discriminative representation. Finally, we present a coarse-to-fine retrieval framework consisting of three components, i.e., coarse retrieval, fine-grained retrieval, and query expansion, respectively. The framework progressively retrieves images with similar semantics, and performs fine-grained identification. Experiments show our OSFGIR framework achieves significantly better accuracy and efficiency than existing FGVC and image retrieval methods, thus could be a better solution for large-scale fine-grained object identification.
Jul 05 2017 cs.CV
Learning discriminative representations for unseen person images is critical for person Re-Identification (ReID). Most of current approaches learn deep representations in classification tasks, which essentially minimize the empirical classification risk on the training set. As shown in our experiments, such representations commonly focus on several body parts discriminative to the training set, rather than the entire human body. Inspired by the structural risk minimization principle in SVM, we revise the traditional deep representation learning procedure to minimize both the empirical classification risk and the representation learning risk. The representation learning risk is evaluated by the proposed part loss, which automatically generates several parts for an image, and computes the person classification loss on each part separately. Compared with traditional global classification loss, simultaneously considering multiple part loss enforces the deep network to focus on the entire human body and learn discriminative representations for different parts. Experimental results on three datasets, i.e., Market1501, CUHK03, VIPeR, show that our representation outperforms the existing deep representations.
Objective: Amyotrophic lateral sclerosis (ALS) is a rare disease, but is also one of the most common motor neuron diseases, and people of all races and ethnic backgrounds are affected. There is currently no cure. Brain computer interfaces (BCIs) can establish a communication channel directly between the brain and an external device by recognizing brain activities that reflect user intent. Therefore, this technology could help ALS patients in promoting functional independence through BCI-based speller systems and motor assistive devices. Methods: In this paper, two kinds of ERP-based speller systems were tested on 18 ALS patients to: (1) assess performance when they spelled 42 characters online continuously, without a break; and (2) to compare performance between a matrix-based speller paradigm (MS-P, mean visual angle 6 degree) and a new speller paradigm that used a larger visual angle called the large visual angle speller paradigm (LS-P, mean visual angle 8 degree). Results: Although results showed that there were no significant differences between the two paradigms in accuracy trend over continuous use (p>0.05), the fatigue during the LS-P condition was significantly lower than that of MS-P (p<0.05). Results also showed that continuous use slightly reduced the performance of this ERP-based BCI. Conclusion: 15 subjects obtained higher than 80% feedback accuracy (online output accuracy) and 9 subjects obtained higher than 90% feedback accuracy in one of the two paradigms, thus validating the BCI approaches in this study. Significance: Most ALS subjects in this study could spell effectively after continuous use of an ERP-based BCI. The new LS-P display may be easier for subjects to use, resulting in lower fatigue.
This paper investigates how high school students approach computing through an introductory computer science course situated in the Logic Programming (LP) paradigm. This study shows how novice students operate within the LP paradigm while engaging in foundational computing concepts and skills, and presents a case for LP as a viable paradigm choice for introductory CS courses.
Jun 27 2017 cs.CL
Starting from NMT, encoder-decoder neu- ral networks have been used for many NLP problems. Graph-based models and transition-based models borrowing the en- coder components achieve state-of-the-art performance on dependency parsing and constituent parsing, respectively. How- ever, there has not been work empirically studying the encoder-decoder neural net- works for transition-based parsing. We apply a simple encoder-decoder to this end, achieving comparable results to the parser of Dyer et al. (2015) on standard de- pendency parsing, and outperforming the parser of Vinyals et al. (2015) on con- stituent parsing.
Jun 27 2017 cs.GR
NURBS curve is widely used in Computer Aided Design and Computer Aided Geometric Design. When a single weight approaches infinity, the limit of a NURBS curve tends to the corresponding control point. In this paper, a kind of control structure of a NURBS curve, called regular control curve, is defined. We prove that the limit of the NURBS curve is exactly its regular control curve when all of weights approach infinity, where each weight is multiplied by a certain one-parameter function tending to infinity, different for each control point. Moreover, some representative examples are presented to show this property and indicate its application for shape deformation.
Jun 27 2017 cs.LG
Do GANS (Generative Adversarial Nets) actually learn the target distribution? The foundational paper of (Goodfellow et al 2014) suggested they do, if they were given sufficiently large deep nets, sample size, and computation time. A recent theoretical analysis in Arora et al (to appear at ICML 2017) raised doubts whether the same holds when discriminator has finite size. It showed that the training objective can approach its optimum value even if the generated distribution has very low support ---in other words, the training objective is unable to prevent mode collapse. The current note reports experiments suggesting that such problems are not merely theoretical. It presents empirical evidence that well-known GANs approaches do learn distributions of fairly low support, and thus presumably are not learning the target distribution. The main technical contribution is a new proposed test, based upon the famous birthday paradox, for estimating the support size of the generated distribution.
Jun 16 2017 cs.NA
Quasi-brittle materials such as concrete suffer from cracks during their life cycle, requiring great cost for conventional maintenance or replacement. In the last decades, self-healing materials are developed which are capable of filling and healing the cracks and regaining part of the stiffness and strength automatically after getting damaged, bringing the possibility of maintenance-free materials and structures. In this paper, a time dependent softening-healing law for self-healing quasi-brittle materials is presented by introducing limited material parameters with clear physical background. Strong Discontinuity embedded Approach (SDA) is adopted for evaluating the reliability of the model. In the numerical studies, values of healing parameters are firstly obtained by back analysis of experimental results of self-healing beams. Then numerical models regarding concrete members and structures built with self-healing and non-healing materials are simulated and compared for showing the capability of the self-healing material.
Jun 16 2017 cs.CV
Image segmentation is a fundamental problem in biomedical image analysis. Recent advances in deep learning have achieved promising results on many biomedical image segmentation benchmarks. However, due to large variations in biomedical images (different modalities, image settings, objects, noise, etc), to utilize deep learning on a new application, it usually needs a new set of training data. This can incur a great deal of annotation effort and cost, because only biomedical experts can annotate effectively, and often there are too many instances in images (e.g., cells) to annotate. In this paper, we aim to address the following question: With limited effort (e.g., time) for annotation, what instances should be annotated in order to attain the best performance? We present a deep active learning framework that combines fully convolutional network (FCN) and active learning to significantly reduce annotation effort by making judicious suggestions on the most effective annotation areas. We utilize uncertainty and similarity information provided by FCN and formulate a generalized version of the maximum set cover problem to determine the most representative and uncertain areas for annotation. Extensive experiments using the 2015 MICCAI Gland Challenge dataset and a lymph node ultrasound image segmentation dataset show that, using annotation suggestions by our method, state-of-the-art segmentation performance can be achieved by using only 50% of training data.
In this paper, we suggest a framework to make use of mutual information as a regularization criterion to train Auto-Encoders (AEs). In the proposed framework, AEs are regularized by minimization of the mutual information between input and encoding variables of AEs during the training phase. In order to estimate the entropy of the encoding variables and the mutual information, we propose a non-parametric method. We also give an information theoretic view of Variational AEs (VAEs), which suggests that VAEs can be considered as parametric methods that estimate entropy. Experimental results show that the proposed non-parametric models have more degree of freedom in terms of representation learning of features drawn from complex distributions such as Mixture of Gaussians, compared to methods which estimate entropy using parametric approaches, such as Variational AEs.
The Generative Adversarial Network (GAN) has achieved great success in generating realistic (real-valued) synthetic data. However, convergence issues and difficulties dealing with discrete data hinder the applicability of GAN to text. We propose a framework for generating realistic text via adversarial training. We employ a long short-term memory network as generator, and a convolutional network as discriminator. Instead of using the standard objective of GAN, we propose matching the high-dimensional latent feature distributions of real and synthetic sentences, via a kernelized discrepancy metric. This eases adversarial training by alleviating the mode-collapsing problem. Our experiments show superior performance in quantitative evaluation, and demonstrate that our model can generate realistic-looking sentences.
Jun 13 2017 cs.CV
As a major type of transportation equipments, bicycles, including electrical bicycles, are distributed almost everywhere in China. The accidents caused by bicycles have become a serious threat to the public safety. So bicycle detection is one major task of traffic video surveillance systems in China. In this paper, a method based on multi-feature and multi-frame fusion is presented for bicycle detection in low-resolution traffic videos. It first extracts some geometric features of objects from each frame image, then concatenate multiple features into a feature vector and use linear support vector machine (SVM) to learn a classifier, or put these features into a cascade classifier, to yield a preliminary detection result regarding whether an object is a bicycle. It further fuses these preliminary detection results from multiple frames to provide a more reliable detection decision, together with a confidence level of that decision. Experimental results show that this method based on multi-feature and multi-frame fusion can identify bicycles with high accuracy and low computational complexity. It is, therefore, applicable for real-time traffic video surveillance systems.
Jun 13 2017 cs.CV
Accurate detection of the myocardial infarction (MI) area is crucial for early diagnosis planning and follow-up management. In this study, we propose an end-to-end deep-learning algorithm framework (OF-RNN ) to accurately detect the MI area at the pixel level. Our OF-RNN consists of three different function layers: the heart localization layers, which can accurately and automatically crop the region-of-interest (ROI) sequences, including the left ventricle, using the whole cardiac magnetic resonance image sequences; the motion statistical layers, which are used to build a time-series architecture to capture two types of motion features (at the pixel-level) by integrating the local motion features generated by long short-term memory-recurrent neural networks and the global motion features generated by deep optical flows from the whole ROI sequence, which can effectively characterize myocardial physiologic function; and the fully connected discriminate layers, which use stacked auto-encoders to further learn these features, and they use a softmax classifier to build the correspondences from the motion features to the tissue identities (infarction or not) for each pixel. Through the seamless connection of each layer, our OF-RNN can obtain the area, position, and shape of the MI for each patient. Our proposed framework yielded an overall classification accuracy of 94.35% at the pixel level, from 114 clinical subjects. These results indicate the potential of our proposed method in aiding standardized MI assessments.
Jun 12 2017 cs.CV
Structured light illumination is an active 3-D scanning technique based on projecting/capturing a set of striped patterns and measuring the warping of the patterns as they reflect off a target object's surface. As designed, each pixel in the camera sees exactly one pixel from the projector; however, there are exceptions to this when the scanned surface has a complicated geometry with step edges and other discontinuities in depth or where the target surface has specularities that reflect light away from the camera. These situations are generally referred to multipath where a given camera pixel receives light from multiple positions from the projector. In the case of bimodal multipath, the camera pixel receives light from exactly two positions from the projector which occurs when light bounce back from a reflective surface or along a step edge where the edge slices through a pixel so that the pixel sees both a foreground and background surface. In this paper, we present a general mathematical model and address the bimodal multipath issue in a phase measuring profilometry scanner to measure the constructive and destructive interference between the two light paths, and by taking advantage of this interesting cue, separate the paths and make two separated depth measurements. We also validate our algorithm with both simulation and a number of challenging real cases.
Jun 12 2017 cs.CL
We present a state-of-the-art end-to-end Automatic Speech Recognition (ASR) model. We learn to listen and write characters with a joint Connectionist Temporal Classification (CTC) and attention-based encoder-decoder network. The encoder is a deep Convolutional Neural Network (CNN) based on the VGG network. The CTC network sits on top of the encoder and is jointly trained with the attention-based decoder. During the beam search process, we combine the CTC predictions, the attention-based decoder predictions and a separately trained LSTM language model. We achieve a 5-10\% error reduction compared to prior systems on spontaneous Japanese and Chinese speech, and our end-to-end model beats out traditional hybrid ASR systems.
In retailer management, the Newsvendor problem has widely attracted attention as one of basic inventory models. In the traditional approach to solving this problem, it relies on the probability distribution of the demand. In theory, if the probability distribution is known, the problem can be considered as fully solved. However, in any real world scenario, it is almost impossible to even approximate or estimate a better probability distribution for the demand. In recent years, researchers start adopting machine learning approach to learn a demand prediction model by using other feature information. In this paper, we propose a supervised learning that optimizes the demand quantities for products based on feature information. We demonstrate that the original Newsvendor loss function as the training objective outperforms the recently suggested quadratic loss function. The new algorithm has been assessed on both the synthetic data and real-world data, demonstrating better performance.
Jun 09 2017 cs.CV
Automatically generating a natural language description of an image is a task close to the heart of image understanding. In this paper, we present a multi-model neural network method closely related to the human visual system that automatically learns to describe the content of images. Our model consists of two sub-models: an object detection and localization model, which extract the information of objects and their spatial relationship in images respectively; Besides, a deep recurrent neural network (RNN) based on long short-term memory (LSTM) units with attention mechanism for sentences generation. Each word of the description will be automatically aligned to different objects of the input image when it is generated. This is similar to the attention mechanism of the human visual system. Experimental results on the COCO dataset showcase the merit of the proposed method, which outperforms previous benchmark models.
Jun 09 2017 cs.CV
Structured light illumination is an active 3-D scanning technique based on projecting/capturing a set of striped patterns and measuring the warping of the patterns as they reflect off a target object's surface. In the case of phase measuring profilometry (PMP), the projected patterns are composed of a rolling sinusoidal wave, but as a set of time-multiplexed patterns, PMP requires the target surface to remain motionless or for scanning to be performed at such high rates that any movement is small. But high speed scanning places a significant burden on the projector electronics to produce contone patterns inside of short exposure intervals. Binary patterns are, therefore, of great value, but converting contone patterns into binary comes with significant risk. As such, this paper introduces a contone-to-binary conversion algorithm for deriving binary patterns that best mimic their contone counterparts. Experimental results will show a greater than 3 times reduction in pattern noise over traditional halftoning procedures.
This paper addresses active state estimation with a team of robotic sensors. The states to be estimated are represented by spatially distributed, uncorrelated, stationary vectors. Given a prior belief on the geographic locations of the states, we cluster the states in moderately sized groups and propose a new hierarchical Dynamic Programming (DP) framework to compute optimal sensing policies for each cluster that mitigates the computational cost of planning optimal policies in the combined belief space. Then, we develop a decentralized assignment algorithm that dynamically allocates clusters to robots based on the pre-computed optimal policies at each cluster. The integrated distributed state estimation framework is optimal at the cluster level but also scales very well to large numbers of states and robot sensors. We demonstrate efficiency of the proposed method in both simulations and real-world experiments using stereoscopic vision sensors.
Jun 08 2017 cs.RO
In this paper, we address the problem of controlling a mobile stereo camera under image quantization noise. Assuming that a pair of images of a set of targets is available, the camera moves through a sequence of Next-Best-Views (NBVs), i.e., a sequence of views that minimize the trace of the targets' cumulative state covariance, constructed using a realistic model of the stereo rig that captures image quantization noise and a Kalman Filter (KF) that fuses the observation history with new information. The proposed controller decomposes control of the NBVs from motion control of the camera in the camera and world frames, respectively. This decomposition allows the camera to realize the goal NBVs while satisfying Field-of-View constraints, a task that is particularly challenging using complex sensing models. We provide simulations and real experiments that illustrate the ability of the proposed mobile camera system to accurately localize sets of targets. We also propose a novel data-driven technique to characterize unmodeled uncertainty, such as calibration errors, at the pixel level and show that this method ensures stability of the KF.
To explore large-scale population indoor interactions, we analyze 18,715 users' WiFi access logs recorded in a Chinese university campus during 3 months, and define two categories of human interactions, the event interaction (EI) and the temporal interaction (TI). The EI helps construct a transmission graph, and the TI helps build an interval graph. The dynamics of EIs show that their active durations are truncated power-law distributed, which is independent on the number of involved individuals. The transmission duration presents a truncated power-law behavior at the daily timescale with weekly periodicity. Besides, those `leaf' individuals in the aggregated contact network may participate in the `super-connecting cliques' in the aggregated transmission graph. Analyzing the dynamics of the interval graph, we find that the probability distribution of TIs' inter-event duration also displays a truncated power-law pattern at the daily timescale with weekly periodicity, while the pairwise individuals with burst interactions are prone to randomly select their interactive locations, and those individuals with periodic interactions have preferred interactive locations.
Network science has released its talents in social network analysis based on the information of static topologies. In reality social contacts are dynamic and evolve concurrently in time. Nowadays they can be recorded by ubiquitous information technologies, and generated into temporal social networks to provide new sights in social reality mining. Here, we define \emphcircle link to measure contextual relationships in three empirical social temporal networks, and find that the tendency of friends having frequent continuous interactions with their common friend prefer to be close, which can be considered as the extension of Granovetter's hypothesis in temporal social networks. Finally, we present a heuristic method based on circle link to predict relationships and acquire acceptable results.
Jun 06 2017 cs.NI
In this paper, we consider a wireless cooperative network with an energy harvesting relay which is powered by the energy harvested from ambient RF waves, such as that of a data packet. At any given time, the relay operates either in the energy harvesting (EH) mode or the data decoding (DD) mode, but not both. Separate energy and data buffers are kept at the relay to store the harvested energy and decoded data packets, respectively. In this paper, we optimize a time switching policy that switches between the EH mode and DD mode to maximize the system throughput or minimize the average transmission delay. Both static and dynamic time switching policies are derived. In particular, static policies are the ones where EH or DD mode is selected with a pre-determined probability. In contrast, in a dynamic policy, the mode is selected dynamically according to the states of data and energy buffers. We prove that the throughput-optimal static and dynamic policies keep the relay data buffer at the boundary of stability. More specifically, we show that the throughput-optimal dynamic policy has a threshold-based structure. Moreover, we prove that the delay-optimal dynamic policy also has a threshold-based structure and keeps at most one packet at the relay. We notice that the delay-optimal and throughput-optimal dynamic policies coincide in most cases. However, it is not true for optimal static policies. Finally, through extensive numerical results, we show the efficiency of optimal dynamic policies compared with the static ones in different conditions.
Jun 02 2017 cs.CV
Model distillation is an effective and widely used technique to transfer knowledge from a teacher to a student network. The typical application is to transfer from a powerful large network or ensemble to a small network, that is better suited to low-memory or fast execution requirements. In this paper, we present a deep mutual learning (DML) strategy where, rather than one way transfer between a static pre-defined teacher and a student, an ensemble of students learn collaboratively and teach each other throughout the training process. Our experiments show that a variety of network architectures benefit from mutual learning and achieve compelling results on CIFAR-100 recognition and Market-1501 person re-identification benchmarks. Surprisingly, it is revealed that no prior powerful teacher network is necessary -- mutual learning of a collection of simple student networks works, and moreover outperforms distillation from a more powerful yet static teacher.
This paper considers the slotted ALOHA protocol in a communication channel shared by N users. It is assumed that the channel has the multiple-packet reception (MPR) capability that allows the correct reception of up to M ($1 \leq M < N$) time-overlapping packets. To evaluate the reliability in the scenario that a packet needs to be transmitted within a strict delivery deadline D ($D \geq 1$) (in unit of slot) since its arrival at the head of queue, we consider the successful delivery probability (SDP) of a packet as performance metric of interest. We derive the optimal transmission probability that maximizes the SDP for any $1 \leq M < N$ and any $D \geq 1$, and show it can be computed by a fixed-point iteration. In particular, the case for D = 1 (i.e., throughput maximization) is first completely addressed in this paper. Based on these theoretical results, for real-life scenarios where N may be unknown and changing, we develop a distributed algorithm that enables each user to tune its transmission probability at runtime according to the estimate of N. Simulation results show that the proposed algorithm is effective in dynamic scenarios, with near-optimal performance.
Jun 01 2017 cs.CV
We propose in this paper a novel two-stage Projection Correction Modeling (PCM) framework for image reconstruction from (non-uniform) Fourier measurements. PCM consists of a projection stage (P-stage) motivated by the multi-scale Galerkin method and a correction stage (C-stage) with an edge guided regularity fusing together the advantages of total variation (TV) and total fractional variation (TFV). The P-stage allows for continuous modeling of the underlying image of interest. The given measurements are projected onto a space in which the image is well represented. We then enhance the reconstruction result at the C-stage that minimizes an energy functional consisting of a fidelity in the transformed domain and a novel edge guided regularity. We further develop efficient proximal algorithms to solve the corresponding optimization problem. Various numerical results in both 1D signals and 2D images have also been presented to demonstrate the superior performance of the proposed two-stage method to other classical one-stage methods.
Several recent works have empirically observed that Convolutional Neural Nets (CNNs) are (approximately) invertible. To understand this approximate invertibility phenomenon and how to leverage it more effectively, we focus on a theoretical explanation and develop a mathematical model of sparse signal recovery that is consistent with CNNs with random weights. We give an exact connection to a particular model of model-based compressive sensing (and its recovery algorithms) and random-weight CNNs. We show empirically that several learned networks are consistent with our mathematical analysis and then demonstrate that with such a simple theoretical framework, we can obtain reasonable re- construction results on real images. We also discuss gaps between our model assumptions and the CNN trained for classification in practical scenarios.
Consider a full-duplex (FD) bidirectional secure communication system, where two communication nodes, named Alice and Bob, simultaneously transmit and receive confidential information from each other, and an eavesdropper, named Eve, overhears the transmissions. Our goal is to maximize the sum secrecy rate (SSR) of the bidirectional transmissions by optimizing the transmit covariance matrices at Alice and Bob. To tackle this SSR maximization (SSRM) problem, we develop an alternating difference-of-concave (ADC) programming approach to alternately optimize the transmit covariance matrices at Alice and Bob. We show that the ADC iteration has a semi-closed-form beamforming solution, and is guaranteed to converge to a stationary solution of the SSRM problem. Besides the SSRM design, this paper also deals with a robust SSRM transmit design under a moment-based random channel state information (CSI) model, where only some roughly estimated first and second-order statistics of Eve's CSI are available, but the exact distribution or other high-order statistics is not known. This moment-based error model is new and different from the widely used bounded-sphere error model and the Gaussian random error model. Under the consider CSI error model, the robust SSRM is formulated as an outage probability-constrained SSRM problem. By leveraging the Lagrangian duality theory and DC programming, a tractable safe solution to the robust SSRM problem is derived. The effectiveness and the robustness of the proposed designs are demonstrated through simulations.
May 23 2017 cs.CR
Side-channel risks of Intel's SGX have recently attracted great attention. Under the spotlight is the newly discovered page-fault attack, in which an OS-level adversary induces page faults to observe the page-level access patterns of a protected process running in an SGX enclave. With almost all proposed defense focusing on this attack, little is known about whether such efforts indeed raises the bar for the adversary, whether a simple variation of the attack renders all protection ineffective, not to mention an in-depth understanding of other attack surfaces in the SGX system. In the paper, we report the first step toward systematic analyses of side-channel threats that SGX faces, focusing on the risks associated with its memory management. Our research identifies 8 potential attack vectors, ranging from TLB to DRAM modules. More importantly, we highlight the common misunderstandings about SGX memory side channels, demonstrating that high frequent AEXs can be avoided when recovering EdDSA secret key through a new page channel and fine-grained monitoring of enclave programs (at the level of 64B) can be done through combining both cache and cross-enclave DRAM channels. Our findings reveal the gap between the ongoing security research on SGX and its side-channel weaknesses, redefine the side-channel threat model for secure enclaves, and can provoke a discussion on when to use such a system and how to use it securely.
May 23 2017 cs.CV
Recurrent feedback connections in the mammalian visual system have been hypothesized to play a role in synthesizing input in the theoretical framework of analysis by synthesis. The comparison of internally synthesized representation with that of the input provides a validation mechanism during perceptual inference and learning. Inspired by these ideas, we proposed that the synthesis machinery can compose new, unobserved images by imagination to train the network itself so as to increase the robustness of the system in novel scenarios. As a proof of concept, we investigated whether images composed by imagination could help an object recognition system to deal with occlusion, which is challenging for the current state-of-the-art deep convolutional neural networks. We fine-tuned a network on images containing objects in various occlusion scenarios, that are imagined or self-generated through a deep generator network. Trained on imagined occluded scenarios under the object persistence constraint, our network discovered more subtle and localized image features that were neglected by the original network for object classification, obtaining better separability of different object classes in the feature space. This leads to significant improvement of object recognition under occlusion for our network relative to the original network trained only on un-occluded images. In addition to providing practical benefits in object recognition under occlusion, this work demonstrates the use of self-generated composition of visual scenes through the synthesis loop, combined with the object persistence constraint, can provide opportunities for neural networks to discover new relevant patterns in the data, and become more flexible in dealing with novel situations.
May 19 2017 cs.CL
Singlish can be interesting to the ACL community both linguistically as a major creole based on English, and computationally for information extraction and sentiment analysis of regional social media. We investigate dependency parsing of Singlish by constructing a dependency treebank under the Universal Dependencies scheme, and then training a neural network model by integrating English syntactic knowledge into a state-of-the-art parser trained on the Singlish treebank. Results show that English knowledge can lead to 25% relative error reduction, resulting in a parser of 84.47% accuracies. To the best of our knowledge, we are the first to use neural stacking to improve cross-lingual dependency parsing on low-resource languages. We make both our annotation and parser available for further research.
May 17 2017 cs.CY
China has encountered serious land loss problems along with urban expansion due to rapid urbanization. Without considering complicated spatiotemporal heterogeneity, previous studies could not extract urban transition rules at large scale well. This study proposed a random forest algorithm (RFA) based cellular automata (CA) model to simulate China's urban expansion and farmland loss in a fine scale from 2000 to 2030. The objectives of this study are to 1) mine urban conversion rules in different homogeneous economic development regions, and 2) simulate China's urban expansion process and farmland loss at high spatial resolution (30 meters). Firstly, we clustered several homogeneous economic development regions among China according to official statistical data. Secondly, we constructed a RFA-based CA model to mine complex urban conversion rules and carried out simulation of urban expansion and farmland loss at each homo-region. The proposed model was implemented on Tianhe-1 supercomputer located in Guangzhou, China. The accuracy evaluation demonstrates that the simulation result of proposed RFA-based CA model is more in agreement with actual land use change. This study proves that the primary factor of farmland loss in China is rapid urbanization from 2000, and the farmland loss rate is expected to slow down gradually and will stabilize from 2010 to 2030. It shows that China is able to preserve the 1.20 million km farmland without crossing the "red line" within the next 20 years, but the situation remains severe.
Impervious surface area is a direct consequence of the urbanization, which also plays an important role in urban planning and environmental management. With the rapidly technical development of remote sensing, monitoring urban impervious surface via high spatial resolution (HSR) images has attracted unprecedented attention recently. Traditional multi-classes models are inefficient for impervious surface extraction because it requires labeling all needed and unneeded classes that occur in the image exhaustively. Therefore, we need to find a reliable one-class model to classify one specific land cover type without labeling other classes. In this study, we investigate several one-class classifiers, such as Presence and Background Learning (PBL), Positive Unlabeled Learning (PUL), OCSVM, BSVM and MAXENT, to extract urban impervious surface area using high spatial resolution imagery of GF-1, China's new generation of high spatial remote sensing satellite, and evaluate the classification accuracy based on artificial interpretation results. Compared to traditional multi-classes classifiers (ANN and SVM), the experimental results indicate that PBL and PUL provide higher classification accuracy, which is similar to the accuracy provided by ANN model. Meanwhile, PBL and PUL outperforms OCSVM, BSVM, MAXENT and SVM models. Hence, the one-class classifiers only need a small set of specific samples to train models without losing predictive accuracy, which is supposed to gain more attention on urban impervious surface extraction or other one specific land cover type.
We demonstrate that a deep neural network can significantly improve optical microscopy, enhancing its spatial resolution over a large field-of-view and depth-of-field. After its training, the only input to this network is an image acquired using a regular optical microscope, without any changes to its design. We blindly tested this deep learning approach using various tissue samples that are imaged with low-resolution and wide-field systems, where the network rapidly outputs an image with remarkably better resolution, matching the performance of higher numerical aperture lenses, also significantly surpassing their limited field-of-view and depth-of-field. These results are transformative for various fields that use microscopy tools, including e.g., life sciences, where optical microscopy is considered as one of the most widely used and deployed techniques. Beyond such applications, our presented approach is broadly applicable to other imaging modalities, also spanning different parts of the electromagnetic spectrum, and can be used to design computational imagers that get better and better as they continue to image specimen and establish new transformations among different modes of imaging.
Blood pressure (BP) has been a difficult vascular risk factor to measure precisely and continuously due to its multiscale temporal dependencies. However, both pulse transit time (PTT) model and regression model fail to learn such dependencies and thus suffer from accuracy decay over time. In this work, we addressed the limitation of existing BP prediction models by formulating BP extraction as a sequence prediction problem in which both the input and target are temporal sequence. By incorporating both a bidirectional layer structure and a deep architecture in a standard long short term-memory (LSTM), we established a deep bidirectional LSTM (DB-LSTM) network that can adaptively discover the latent structures of different timescales in BP sequences and automatically learn such multiscale dependencies. We evaluated our proposed model on a static and follow-up continuous BP dataset, and the results show that DB-LSTM network can effectively learn different timescale dependencies in the BP sequences and advances the state-of-the-art by achieving superior accuracy performance than other leading methods on both datasets. To the best of our knowledge, this is the first study to validate the ability of recurrent neural networks to learn the multiscale dependencies of long-term continuous BP sequence.
Background: Mining gene modules from genomic data is an important step to detect gene members of pathways or other relations such as protein-protein interactions. In this work, we explore the plausibility of detecting gene modules by factorizing gene-phenotype associations from a phenotype ontology rather than the conventionally used gene expression data. In particular, the hierarchical structure of ontology has not been sufficiently utilized in clustering genes while functionally related genes are consistently associated with phenotypes on the same path in the phenotype ontology. Results: We propose a hierarchal Nonnegative Matrix Factorization (NMF)-based method, called Consistent Multiple Nonnegative Matrix Factorization (CMNMF), to factorize genome-phenome association matrix at two levels of the hierarchical structure in phenotype ontology for mining gene functional modules. CMNMF constrains the gene clusters from the association matrices at two consecutive levels to be consistent since the genes are annotated with both the child phenotype and the parent phenotype in the consecutive levels. CMNMF also restricts the identified phenotype clusters to be densely connected in the phenotype ontology hierarchy. In the experiments on mining functionally related genes from mouse phenotype ontology and human phenotype ontology, CMNMF effectively improved clustering performance over the baseline methods. Gene ontology enrichment analysis was also conducted to reveal interesting gene modules. Conclusions: Utilizing the information in the hierarchical structure of phenotype ontology, CMNMF can identify functional gene modules with more biological significance than the conventional methods. CMNMF could also be a better tool for predicting members of gene pathways and protein-protein interactions. Availability: https://github.com/nkiip/CMNMF
Phase recovery from intensity-only measurements forms the heart of coherent imaging techniques and holography. Here we demonstrate that a neural network can learn to perform phase recovery and holographic image reconstruction after appropriate training. This deep learning-based approach provides an entirely new framework to conduct holographic imaging by rapidly eliminating twin-image and self-interference related spatial artifacts. Compared to existing approaches, this neural network based method is significantly faster to compute, and reconstructs improved phase and amplitude images of the objects using only one hologram, i.e., requires less number of measurements in addition to being computationally faster. We validated this method by reconstructing phase and amplitude images of various samples, including blood and Pap smears, and tissue sections. These results are broadly applicable to any phase recovery problem, and highlight that through machine learning challenging problems in imaging science can be overcome, providing new avenues to design powerful computational imaging systems.
Private information retrieval (PIR) gets renewed attentions in recent years due to its information-theoretic reformulation and its application in distributed storage system (DSS). We study a variant of the information-theoretic and DSS-oriented PIR problem which considers retrieving several files simultaneously, known as the multi-file PIR. In this paper, we give an upper bound of the multi-file PIR capacity (the maximal number of bits privately retrieved per one bit of downloaded bit) for the degenerating cases. We then present two general multi-file PIR schemes according to different ratios of the number of desired files to the total number of files. In particular, if the ratio is at least half, then the rate of our scheme meets the upper bound of the capacity for the degenerating cases and is thus optimal.
In this paper, we address a major challenge confronting the Cloud Service Providers (CSPs) utilizing a tiered storage architecture - how to maximize their overall profit over a variety of storage tiers that offer distinct characteristics, as well as file placement and access request scheduling policies.
As a fundamental challenge in vast disciplines, link prediction aims to identify potential links in a network based on the incomplete observed information, which has broad applications ranging from uncovering missing protein-protein interaction to predicting the evolution of networks. One of the most influential methods rely on similarity indices characterized by the common neighbors or its variations. We construct a hidden space mapping a network into Euclidean space based solely on the connection structures of a network. Compared with real geographical locations of nodes, our reconstructed locations are in conformity with those real ones. The distances between nodes in our hidden space could serve as a novel similarity metric in link prediction. In addition, we hybrid our hidden space method with other state-of-the-art similarity methods which substantially outperforms the existing methods on the prediction accuracy. Hence, our hidden space reconstruction model provides a fresh perspective to understand the network structure, which in particular casts a new light on link prediction.
May 08 2017 cs.SY
The rapid emergence of electric vehicles (EVs) demands an advanced infrastructure of publicly accessible charging stations that provide efficient charging services. In this paper, we propose a new charging station operation mechanism, the JoAP, which jointly optimizes the EV admission control, pricing, and charging scheduling to maximize the charging station's profit. More specifically, by introducing a tandem queueing network model, we analytically characterize the average charging station profit as a function of the admission control and pricing policies. Based on the analysis, we characterize the optimal JoAP algorithm. Through extensive simulations, we demonstrate that the proposed JoAP algorithm on average can achieve 330\% and 531\% higher profit than a widely adopted benchmark method under two representative waiting-time penalty rates.
Following the presentation and proof of the hypothesis that image features are particularly perceived at points where the Fourier components are maximally in phase, the concept of phase congruency (PC) is introduced. Subsequently, a two-dimensional multi-scale phase congruency (2D-MSPC) is developed, which has been an important tool for detecting and evaluation of image features. However, the 2D-MSPC requires many parameters to be appropriately tuned for optimal image features detection. In this paper, we defined a criterion for parameter optimization of the 2D-MSPC, which is a function of its maximum and minimum moments. We formulated the problem in various optimal and suboptimal frameworks, and discussed the conditions and features of the suboptimal solutions. The effectiveness of the proposed method was verified through several examples, ranging from natural objects to medical images from patients with a neurological disease, multiple sclerosis.
May 03 2017 cs.SC
We generalize the notions of singularities and ordinary points from linear ordinary differential equations to D-finite systems. Ordinary points of a D-finite system are characterized in terms of its formal power series solutions. We also show that apparent singularities can be removed like in the univariate case by adding suitable additional solutions to the system at hand. Several algorithms are presented for removing and detecting apparent singularities. In addition, an algorithm is given for computing formal power series solutions of a D-finite system at apparent singularities.