Inspired by recent successes of Monte-Carlo tree search (MCTS) in a number of artificial intelligence (AI) application domains, we propose a model-based reinforcement learning (RL) technique that iteratively applies MCTS on batches of small, finite-horizon versions of the original infinite-horizon Markov decision process. The terminal condition of the finite-horizon problems, or the leaf-node evaluator of the decision tree generated by MCTS, is specified using a combination of an estimated value function and an estimated policy function. The recommendations generated by the MCTS procedure are then provided as feedback in order to refine, through classification and regression, the leaf-node evaluator for the next iteration. We provide the first sample complexity bounds for a tree search-based RL algorithm. In addition, we show that a deep neural network implementation of the technique can create a competitive AI agent for the popular multi-player online battle arena (MOBA) game King of Glory.
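The batched finite-horizon search described above can be illustrated with a minimal UCT loop in which leaf nodes of the tree are scored by an estimated value function. The chain MDP, the `step` interface, and the zero value function below are toy assumptions for illustration, not the paper's setup:

```python
import math

def mcts(root_state, step, value_fn, actions, horizon, n_iter=500, c=1.4):
    """Finite-horizon UCT; leaf nodes are scored by an estimated value function."""
    stats = {}  # (state, depth, action) -> [visit count, total return]

    def rollout(state, depth):
        if depth == horizon:
            return value_fn(state)  # leaf evaluator replaces deeper search
        n_s = sum(stats.get((state, depth, a), [0, 0])[0] for a in actions)
        def ucb(a):
            n, w = stats.get((state, depth, a), [0, 0])
            if n == 0:
                return float("inf")  # always try unvisited actions first
            return w / n + c * math.sqrt(math.log(n_s) / n)
        a = max(actions, key=ucb)
        next_state, reward = step(state, a)
        ret = reward + rollout(next_state, depth + 1)
        n, w = stats.get((state, depth, a), [0, 0])
        stats[(state, depth, a)] = [n + 1, w + ret]
        return ret

    for _ in range(n_iter):
        rollout(root_state, 0)
    # recommend the most-visited root action
    return max(actions, key=lambda a: stats.get((root_state, 0, a), [0, 0])[0])

# Toy chain MDP: action 1 earns reward 1 per step, action 0 earns nothing.
step = lambda s, a: (s + a, float(a))
best = mcts(0, step, value_fn=lambda s: 0.0, actions=[0, 1], horizon=3)
```

In the full algorithm, `value_fn` (together with a policy prior) would then be refit by regression and classification on the recommendations that `mcts` returns, and the loop repeated.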
End-to-end models have become increasingly popular in Mandarin speech recognition and have achieved promising performance. Mandarin is a tonal language, which differentiates it from English and requires special treatment of the acoustic modeling units. Several kinds of modeling units have been used for Mandarin, such as the phoneme, the syllable and the Chinese character. In this work, we explore two major end-to-end models for Mandarin speech recognition: the connectionist temporal classification (CTC) model and the attention-based encoder-decoder model. We compare the performance of modeling units at three different scales: the context-dependent phoneme (CDP), the syllable with tone and the Chinese character. We find that all three types of modeling units achieve comparable character error rates (CERs) in the CTC model, and that the Chinese character attention model outperforms the syllable attention model. Furthermore, we find that the Chinese character is a reasonable unit for Mandarin speech recognition. On the DidiCallcenter task, the Chinese character attention model achieves a CER of 5.68% and the CTC model a CER of 7.29%; on the DidiReading task, the CERs are 4.89% and 5.79%, respectively. Moreover, the attention model outperforms the CTC model on both datasets.
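Since the comparison above is reported in character error rate, here is a minimal sketch of that metric, computed as character-level edit distance normalized by reference length (the model architectures themselves are out of scope; the example strings are illustrative):

```python
def cer(ref, hyp):
    """Character error rate: Levenshtein distance over characters / len(ref)."""
    d = list(range(len(hyp) + 1))  # d[j] = distance between ref[:i] and hyp[:j]
    for i, r in enumerate(ref, 1):
        prev, d[0] = d[0], i
        for j, h in enumerate(hyp, 1):
            # minimum over deletion, insertion, and (mis)match
            prev, d[j] = d[j], min(d[j] + 1, d[j - 1] + 1, prev + (r != h))
    return d[-1] / len(ref)

# One substituted character out of four reference characters -> CER 0.25.
err = cer("今天天气", "今天天七")
```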
In recent years, deep learning models for intelligent condition monitoring, diagnosis and prognostics of mechanical systems and structures have gained increasing popularity. Previous studies, however, implicitly assume that the training and testing data are drawn from the same feature distribution. Unfortunately, this assumption is often invalid in real applications, limiting the applicability of traditional diagnosis approaches. Inspired by the idea of transfer learning, which leverages knowledge learnt from rich labeled data in a source domain to facilitate diagnosing a new but similar target task, a new intelligent fault diagnosis framework, i.e., the deep transfer network (DTN), which generalizes deep learning models to the domain adaptation scenario, is proposed in this paper. By extending marginal distribution adaptation (MDA) to joint distribution adaptation (JDA), the proposed framework can exploit the discriminative structure associated with the labeled data in the source domain to adapt the conditional distribution of unlabeled target data, and thus guarantees more accurate distribution matching. Extensive empirical evaluations on three fault datasets validate the applicability and practicability of DTN, which achieves state-of-the-art transfer results across diverse operating conditions, fault severities and fault types.
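Marginal distribution adaptation of the kind DTN extends typically minimizes a discrepancy between source- and target-domain feature distributions; a common choice is the (squared) maximum mean discrepancy, sketched here with an RBF kernel. The toy feature vectors are illustrative assumptions, not the paper's fault data:

```python
import math

def mmd(xs, xt, gamma=1.0):
    """Squared maximum mean discrepancy with an RBF kernel: the marginal
    distribution distance that MDA-style domain adaptation minimizes."""
    k = lambda a, b: math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
    mean = lambda A, B: sum(k(a, b) for a in A for b in B) / (len(A) * len(B))
    return mean(xs, xs) + mean(xt, xt) - 2 * mean(xs, xt)

# Identical samples -> zero discrepancy; well-separated samples -> large value.
same = mmd([[0.0], [1.0]], [[0.0], [1.0]])
far = mmd([[0.0], [0.2]], [[5.0], [5.2]])
```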
Jan 24 2018 cs.CL
We present assertion-based question answering (ABQA), an open-domain question answering task that takes a question and a passage as inputs, and outputs a semi-structured assertion consisting of a subject, a predicate and a list of arguments. An assertion conveys more evidence than a short answer span in reading comprehension, and it is more concise than a tedious passage in passage-based QA. These advantages make ABQA more suitable for human-computer interaction scenarios such as voice-controlled speakers. Further progress on ABQA requires a richer supervised dataset and powerful models of text understanding. To this end, we introduce a new dataset called WebAssertions, which includes hand-annotated QA labels for 358,427 assertions in 55,960 web passages. To address ABQA, we develop both generative and extractive approaches. The backbone of our generative approach is sequence-to-sequence learning. In order to capture the structure of the output assertion, we introduce a hierarchical decoder that first generates the structure of the assertion and then generates the words of each field. The extractive approach is based on learning to rank. Features at different levels of granularity are designed to measure the semantic relevance between a question and an assertion. Experimental results show that our approaches can infer question-aware assertions from a passage. We further evaluate our approaches by incorporating the ABQA results as additional features in passage-based QA. Results on two datasets show that ABQA features significantly improve the accuracy of passage-based QA.
In order to stimulate secure sensing for Internet of Things (IoT) applications such as healthcare and traffic monitoring, mobile crowdsensing (MCS) systems have to address security threats, such as jamming, spoofing and faked sensing attacks, during both the sensing and the information exchange processes in large-scale dynamic and heterogeneous networks. In this article, we investigate secure mobile crowdsensing and show how deep learning (DL) methods such as stacked autoencoders (SAEs), deep neural networks (DNNs), and convolutional neural networks (CNNs) can improve MCS security approaches, including authentication, privacy protection, countermeasures against faked sensing, intrusion detection and anti-jamming transmissions. We discuss the performance gain of these DL-based approaches over traditional security schemes and identify the challenges that need to be addressed to implement them in practical MCS systems.
To improve the signal-to-interference ratio (SIR) and make better use of the file diversity provided by random caching, we consider two types of linear receivers at users, i.e., the maximal ratio combining (MRC) receiver and the partial zero-forcing (PZF) receiver, in a large-scale cache-enabled single-input multiple-output (SIMO) network. First, for each receiver, by utilizing tools from stochastic geometry, we derive a tractable expression and a tight upper bound for the successful transmission probability (STP). In the case of the MRC receiver, we also derive a closed-form expression for the asymptotic outage probability in the low SIR threshold regime. Then, for each receiver, we maximize the STP. In the case of the MRC receiver, we consider the maximization of the tight upper bound on the STP by optimizing the caching distribution, which is a non-convex problem. We obtain a stationary point by solving an equivalent difference of convex (DC) programming problem using the concave-convex procedure (CCCP). We also obtain a closed-form asymptotically optimal solution in the low SIR threshold regime. In the case of the PZF receiver, we consider the maximization of the tight upper bound on the STP by optimizing the caching distribution and the degrees-of-freedom (DoF) allocation (for boosting the signal power), which is a mixed discrete-continuous problem. Based on structural properties, we obtain a low-complexity near-optimal solution using an alternating optimization approach. The analysis and optimization results reveal the impact of the antenna resources at users on random caching. Finally, numerical results show that the random caching design with the PZF receiver achieves significant performance gains over the random caching design with the MRC receiver and some baseline caching designs.
Dec 27 2017 cs.CV
The denoising of magnetic resonance (MR) images is a task of great importance for improving acquired image quality. Many methods with good performance have been proposed in the literature to recover noise-free images. However, the state-of-the-art denoising methods all require a time-consuming optimization process, and their performance strongly depends on the estimated noise-level parameter. In this manuscript we propose denoising Rician noise in MRI using a convolutional neural network. The advantage of the proposed methodology is that the learned model can be used directly in the denoising process, without optimization and even without the noise-level parameter. Specifically, a ten-layer convolutional neural network combined with residual learning and a multi-channel strategy is proposed. Two training regimes, training on a specific noise level and training on a general range of levels, were used to demonstrate the capability of our method. Experimental results on synthetic and real 3D MR data demonstrate that the proposed network achieves superior performance compared with other methods in terms of both peak signal-to-noise ratio and global structural similarity index. Without the noise-level parameter, our generally applicable model also outperforms the compared methods on both datasets. Furthermore, our trained model shows good generalizability.
By using smart radio devices, a jammer can dynamically change its jamming policy based on opposing security mechanisms; it can even induce the mobile device to enter a specific communication mode and then launch the jamming policy accordingly. On the other hand, mobile devices can exploit spread spectrum and user mobility to address both jamming and interference. In this paper, a two-dimensional anti-jamming mobile communication scheme is proposed in which a mobile device leaves a heavily jammed/interfered-with frequency or area. It is shown that, by applying reinforcement learning techniques, a mobile device can achieve an optimal communication policy without the need to know the jamming and interference model and the radio channel model in a dynamic game framework. More specifically, a hotbooting deep Q-network based two-dimensional mobile communication scheme is proposed that exploits experiences in similar scenarios to reduce the exploration time at the beginning of the game, and applies deep convolutional neural network and macro-action techniques to accelerate the learning speed in dynamic situations. Several real-world scenarios are simulated to evaluate the proposed method. These simulation results show that our proposed scheme can improve both the signal-to-interference-plus-noise ratio of the signals and the utility of the mobile devices against cooperative jamming compared with benchmark schemes.
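The scheme above builds on reinforcement learning; as a minimal sketch of the underlying idea (not the paper's hotbooting DQN with convolutional networks and macro-actions), here is tabular Q-learning against a hypothetical sweep jammer, where the two-dimensional state is collapsed to the jammer's last channel and all quantities are toy assumptions:

```python
import random

def train_anti_jamming(n_channels=4, episodes=5000, alpha=0.1, gamma=0.5, eps=0.1):
    """Tabular Q-learning: state = jammer's last channel, action = next transmit
    channel; reward 1 if transmission dodges the jammer, else 0."""
    rng = random.Random(0)
    q = [[0.0] * n_channels for _ in range(n_channels)]
    jam = 0                                    # jammer's current channel
    for _ in range(episodes):
        state = jam
        if rng.random() < eps:
            a = rng.randrange(n_channels)      # explore
        else:
            a = max(range(n_channels), key=lambda c: q[state][c])
        jam = (jam + 1) % n_channels           # sweep jammer hops one channel up
        reward = 0.0 if a == jam else 1.0      # success iff we avoid the jammer
        q[state][a] += alpha * (reward + gamma * max(q[jam]) - q[state][a])
    return q

q = train_anti_jamming()
# the learned greedy policy should avoid the jammer's next channel in each state
policy = [max(range(4), key=lambda c: q[s][c]) for s in range(4)]
```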
Familia is an open-source toolkit for pragmatic topic modeling in industry. Familia abstracts the industrial use of topic modeling into two paradigms: semantic representation and semantic matching. Efficient implementations of the two paradigms are made publicly available for the first time. Furthermore, we provide off-the-shelf topic models trained on large-scale industrial corpora, including Latent Dirichlet Allocation (LDA), SentenceLDA and Topical Word Embedding (TWE). We also describe typical applications that are successfully powered by topic modeling, in order to ease the confusion and difficulty software engineers face when selecting and utilizing topic models.
Monte Carlo Tree Search (MCTS), most famously used in game-play artificial intelligence (e.g., the game of Go), is a well-known strategy for constructing approximate solutions to sequential decision problems. Its primary innovation is the use of a heuristic, known as a default policy, to obtain Monte Carlo estimates of downstream values for states in a decision tree. This information is used to iteratively expand the tree towards regions of states and actions that an optimal policy might visit. However, to guarantee convergence to the optimal action, MCTS requires the entire tree to be expanded asymptotically. In this paper, we propose a new technique called Primal-Dual MCTS that utilizes sampled information relaxation upper bounds on potential actions, creating the possibility of "ignoring" parts of the tree that stem from highly suboptimal choices. This allows us to prove that despite converging to a partial decision tree in the limit, the recommended action from Primal-Dual MCTS is optimal. The new approach shows significant promise when used to optimize the behavior of a single driver navigating a graph while operating on a ride-sharing platform. Numerical experiments on a real dataset of 7,000 trips in New Jersey suggest that Primal-Dual MCTS improves upon standard MCTS by producing deeper decision trees and exhibits a reduced sensitivity to the size of the action space.
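The dual bounds used by Primal-Dual MCTS come from information relaxation: letting a policy peek at the sampled future before acting can only help, so the per-scenario hindsight-optimal return, averaged over scenarios, upper-bounds the value of any non-anticipative policy; branches whose bound falls below the incumbent can be ignored. A minimal illustration on hand-made scenario rewards (toy numbers, not the ride-sharing data):

```python
def dual_upper_bound(scenario_rewards):
    """Sampled information-relaxation (perfect-hindsight) bound: average over
    scenarios of the best achievable reward within each scenario."""
    return sum(max(r) for r in scenario_rewards) / len(scenario_rewards)

# Three sampled scenarios with per-action rewards; hindsight picks the best
# action separately in each scenario, so the bound dominates any fixed action.
scenarios = [[1.0, 4.0], [3.0, 2.0], [5.0, 0.0]]
ub = dual_upper_bound(scenarios)
best_fixed = max(sum(r[a] for r in scenarios) / len(scenarios) for a in (0, 1))
```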
The omnipresence of deep learning architectures such as deep convolutional neural networks (CNNs) is fueled by the synergistic combination of ever-increasing labeled datasets and specialized hardware. Despite this indisputable success, the reliance on huge amounts of labeled data and specialized hardware can be a limiting factor when approaching new applications. To help alleviate these limitations, we propose an efficient learning strategy for layer-wise unsupervised training of deep CNNs on conventional hardware in acceptable time. Our proposed strategy consists of randomly convexifying the reconstruction contractive auto-encoding (RCAE) learning objective and solving the resulting large-scale convex minimization problem in the frequency domain via coordinate descent (CD). The main advantages of our proposed learning strategy are: (1) a single tunable optimization parameter; (2) fast and guaranteed convergence; (3) possibilities for full parallelization. Numerical experiments show that our proposed learning strategy scales (in the worst case) linearly with image size, number of filters and filter size.
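As a much-simplified illustration of coordinate descent on a convex objective (a small dense quadratic here, not the convexified RCAE objective in the frequency domain), exact one-dimensional updates solve the problem with no step size to tune, which is the "single tunable parameter" appeal:

```python
def coordinate_descent(A, b, n_iter=200):
    """Minimize f(x) = 1/2 x^T A x - b^T x (A symmetric positive definite)
    by cycling through exact one-dimensional minimizations per coordinate."""
    n = len(b)
    x = [0.0] * n
    for _ in range(n_iter):
        for i in range(n):
            # optimal x_i given the others: (b_i - sum_{j != i} A_ij x_j) / A_ii
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - s) / A[i][i]
    return x

# Equivalent to solving A x = b for SPD A; here the minimizer is x* = (1, 2).
A = [[3.0, 1.0], [1.0, 2.0]]
b = [5.0, 5.0]
x = coordinate_descent(A, b)
```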
Existing designs for content dissemination do not fully explore and exploit potential caching and computation capabilities in advanced wireless networks. In this paper, we propose two partition-based caching designs, i.e., a coded caching design based on Random Linear Network Coding and an uncoded caching design. We consider the analysis and optimization of the two caching designs in a large-scale successive interference cancellation (SIC)-enabled wireless network. First, under each caching design, by utilizing tools from stochastic geometry and adopting appropriate approximations, we derive a tractable expression for the successful transmission probability in the general file size regime. To further obtain design insights, we also derive closed-form expressions for the successful transmission probability in the small and large file size regimes, respectively. Then, under each caching design, we consider the successful transmission probability maximization in the general file size regime, which is an NP-hard problem. By exploring structural properties, we successfully transform the original optimization problem into a Multiple-Choice Knapsack Problem (MCKP), and obtain a near-optimal solution with a 1/2 approximation guarantee and polynomial complexity. We also obtain closed-form asymptotically optimal solutions. The analysis and optimization results show the advantage of the coded caching design over the uncoded caching design, and reveal the impact of caching and SIC capabilities. Finally, by numerical results, we show that the two proposed caching designs achieve significant performance gains over some baseline caching designs.
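A Multiple-Choice Knapsack instance picks exactly one configuration per group under a shared budget. The paper uses a polynomial-time 1/2-approximation; for intuition only, here is the exact pseudo-polynomial dynamic program on a toy instance with small integer weights (the groups and budget below are made up):

```python
def mckp(groups, capacity):
    """Exact DP for the Multiple-Choice Knapsack Problem: choose exactly one
    (weight, value) item from each group, total weight <= capacity."""
    NEG = float("-inf")
    dp = [NEG] * (capacity + 1)  # dp[w] = best value using exactly weight w
    dp[0] = 0.0
    for group in groups:
        ndp = [NEG] * (capacity + 1)
        for w in range(capacity + 1):
            if dp[w] == NEG:
                continue
            for iw, iv in group:
                if w + iw <= capacity:
                    ndp[w + iw] = max(ndp[w + iw], dp[w] + iv)
        dp = ndp  # every feasible solution takes one item from this group
    return max(dp)

# Two groups, budget 4: the optimum pairs (1, 3.0) with (3, 7.0) for value 10.
groups = [[(1, 3.0), (2, 5.0)], [(2, 4.0), (3, 7.0)]]
best = mckp(groups, capacity=4)
```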
Heterogeneous wireless networks (HetNets) provide a powerful approach to meet the dramatic mobile traffic growth, but also impose a significant challenge on backhaul. Caching and multicasting at macro and pico base stations (BSs) are two promising methods to support massive content delivery and reduce backhaul load in HetNets. In this paper, we jointly consider caching and multicasting in a large-scale cache-enabled HetNet with backhaul constraints. We propose a hybrid caching design consisting of identical caching in the macro-tier and random caching in the pico-tier, and a corresponding multicasting design. By carefully handling different types of interferers and adopting appropriate approximations, we derive tractable expressions for the successful transmission probability in the general region as well as the high signal-to-noise ratio (SNR) and user density region, utilizing tools from stochastic geometry. Then, we consider the successful transmission probability maximization by optimizing the design parameters, which is a very challenging mixed discrete-continuous optimization problem due to the sophisticated structure of the successful transmission probability. By using optimization techniques and exploring the structural properties, we obtain a near optimal solution with superior performance and manageable complexity. This solution achieves better performance in the general region than any asymptotically optimal solution, under a mild condition. The analysis and optimization results provide valuable design insights for practical cache-enabled HetNets.
In heterogeneous networks (HetNets), strong interference due to spectrum reuse affects each user's signal-to-interference ratio (SIR) and hence is one limiting factor of network performance. In this paper, we propose a user-centric interference nulling (IN) scheme in a downlink large-scale HetNet to improve the coverage/outage probability by improving each user's SIR. The scheme utilizes at most the maximum IN degrees of freedom (DoF) at each macro-BS to cancel interference to uniformly selected macro (pico) users whose signal-to-individual-interference ratio (SIIR) is below a macro (pico) IN threshold; the maximum IN DoF and the two IN thresholds are the three design parameters. Using tools from stochastic geometry, we first obtain a tractable expression for the coverage (equivalently, outage) probability. Then, we analyze the asymptotic coverage/outage probability in the low and high SIR threshold regimes. The analytical results indicate that the maximum IN DoF can affect the order gain of the outage probability in the low SIR threshold regime, but cannot affect the order gain of the coverage probability in the high SIR threshold regime. Moreover, we characterize the maximum IN DoF that optimizes the asymptotic coverage/outage probability. The optimization results reveal that the IN scheme can linearly improve the outage probability in the low SIR threshold regime, but cannot improve the coverage probability in the high SIR threshold regime. Finally, numerical results show that the proposed scheme achieves good gains in coverage/outage probability over a maximum ratio beamforming scheme and a user-centric almost blank subframes (ABS) scheme.
Jan 05 2016 cs.DB
Modern Internet applications often produce a large volume of user activity records. Data analysts are interested in cohort analysis, or finding unusual user behavioral trends, in these large tables of activity records. In a traditional database system, cohort analysis queries are both painful to specify and expensive to evaluate. We propose to extend database systems to support cohort analysis. We do so by extending SQL with three new operators. We devise three different evaluation schemes for cohort query processing. Two of them adopt a non-intrusive approach. The third approach employs a columnar based evaluation scheme with optimizations specifically designed for cohort query processing. Our experimental results confirm the performance benefits of our proposed columnar database system, compared against the two non-intrusive approaches that implement cohort queries on top of regular relational databases.
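To see why cohort queries are painful to specify in a traditional database system, here is the kind of self-join an analyst writes in plain SQL today: assign each user to the cohort of their first active week, then count distinct active users per cohort and age. The schema and data are toy assumptions; the paper's proposed operators would replace this boilerplate:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE activity (user TEXT, week INTEGER)")
conn.executemany("INSERT INTO activity VALUES (?, ?)", [
    ("a", 1), ("a", 2), ("b", 1), ("c", 2), ("c", 3),
])
# birth: each user's cohort week; main query: retention per (cohort, age)
rows = conn.execute("""
    WITH birth AS (SELECT user, MIN(week) AS cohort FROM activity GROUP BY user)
    SELECT b.cohort, a.week - b.cohort AS age, COUNT(DISTINCT a.user) AS active
    FROM activity a JOIN birth b ON a.user = b.user
    GROUP BY b.cohort, age ORDER BY b.cohort, age
""").fetchall()
```

Each result row reads as: of the users born in week `cohort`, `active` of them were still active `age` weeks later.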
Caching and multicasting at base stations are two promising approaches to support massive content delivery over wireless networks. However, existing analysis and designs do not fully explore and exploit the potential advantages of the two approaches. In this paper, we consider the analysis and optimization of caching and multicasting in a large-scale cache-enabled wireless network. We propose a random caching and multicasting scheme with a design parameter. By carefully handling different types of interferers and adopting appropriate approximations, we derive a tractable expression for the successful transmission probability in the general region, utilizing tools from stochastic geometry. We also obtain a closed-form expression for the successful transmission probability in the high signal-to-noise ratio (SNR) and user density region. Then, we consider the successful transmission probability maximization, which is a very complex non-convex problem in general. Using optimization techniques, we develop an iterative numerical algorithm to obtain a locally optimal caching and multicasting design in the general region. To reduce complexity and maintain superior performance, we also derive an asymptotically optimal caching and multicasting design in the asymptotic region, based on a two-step optimization framework. Finally, numerical simulations show that the asymptotically optimal design achieves a significant gain in successful transmission probability over some baseline schemes in the general region.
In this paper, we consider a finite-horizon Markov decision process (MDP) for which the objective at each stage is to minimize a quantile-based risk measure (QBRM) of the sequence of future costs; we call the overall objective a dynamic quantile-based risk measure (DQBRM). In particular, we consider optimizing dynamic risk measures where the one-step risk measures are QBRMs, a class of risk measures that includes the popular value at risk (VaR) and the conditional value at risk (CVaR). Although there is considerable theoretical development of risk-averse MDPs in the literature, the computational challenges have not been explored as thoroughly. We propose data-driven and simulation-based approximate dynamic programming (ADP) algorithms to solve the risk-averse sequential decision problem. We address the issue of inefficient sampling for risk applications in simulated settings and present a procedure, based on importance sampling, to direct samples toward the "risky region" as the ADP algorithm progresses. Finally, we show numerical results of our algorithms in the context of an application involving risk-averse bidding for energy storage.
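The one-step QBRMs named above include VaR and CVaR; on an empirical sample of costs they can be computed as below. This uses one common empirical convention (quantile index via the ceiling rule, tail averaged from the VaR upward); definitions vary, and the sample costs are made up:

```python
import math

def var_cvar(costs, alpha):
    """Empirical value at risk (the alpha-quantile of cost) and conditional
    value at risk (mean cost in the tail at or beyond VaR)."""
    s = sorted(costs)
    idx = max(0, math.ceil(alpha * len(s)) - 1)  # empirical alpha-quantile index
    var = s[idx]
    tail = [c for c in s if c >= var]
    return var, sum(tail) / len(tail)

# One extreme outcome dominates the tail: VaR ignores it, CVaR does not.
costs = [1.0, 2.0, 3.0, 4.0, 100.0]
var, cvar = var_cvar(costs, alpha=0.75)
```

The gap between the two outputs is exactly why CVaR is often preferred for risk-averse objectives: it is sensitive to how bad the tail is, not just where it starts.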