Genome-wide association studies (GWAS) have achieved great success in the genetic study of Alzheimer's disease (AD). Collaborative imaging genetics studies across different research institutions have shown their effectiveness in detecting genetic risk factors. However, the high dimensionality of GWAS data poses significant challenges for detecting risk SNPs for AD, making the selection of relevant features crucial for predicting the response variable. In this study, we propose a novel Distributed Feature Selection Framework (DFSF) for conducting large-scale imaging genetics studies across multiple institutions. To speed up the learning process, we propose a family of distributed group Lasso screening rules that identify irrelevant features and remove them from the optimization. We then select the relevant group features by running the group Lasso feature selection process over a sequence of parameters. Finally, we employ stability selection to rank the top risk SNPs, which may help detect AD at an early stage. To the best of our knowledge, this is the first distributed feature selection model that integrates group Lasso feature selection with the detection of genetic risk factors across multiple research institutions. Empirical studies on 809 subjects with 5.9 million SNPs, distributed across several institutions, demonstrate the efficiency and effectiveness of the proposed method.
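The group-level screening and selection described above rest on the group Lasso's block soft-thresholding operator, which zeroes out whole feature groups at once. A minimal numpy sketch of that operator (the group layout and regularization level are illustrative, not the paper's distributed screening rules):

```python
import numpy as np

def group_soft_threshold(w, groups, lam):
    """Proximal operator of the group Lasso penalty: each group is
    shrunk toward zero, and groups whose L2 norm falls below lam are
    zeroed out entirely (the all-or-nothing behaviour that screening
    rules exploit to discard irrelevant groups before optimization)."""
    out = np.zeros_like(w)
    for g in groups:                          # g: index array of one group
        norm = np.linalg.norm(w[g])
        if norm > lam:
            out[g] = (1.0 - lam / norm) * w[g]
    return out

w = np.array([3.0, 4.0, 0.1, -0.1])
groups = [np.array([0, 1]), np.array([2, 3])]
res = group_soft_threshold(w, groups, 1.0)    # second group is screened to zero
```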
In this paper, we consider a coverage problem for uncertain points in a tree. Let T be a tree containing a set P of n (weighted) demand points, where the location of each demand point $P_i\in P$ is uncertain but known to be one of $m_i$ points on T, each associated with a probability. Given a covering range $\lambda$, the problem is to find a minimum number of points (called centers) on T at which to build facilities for serving (or covering) these demand points, in the sense that for each uncertain point $P_i\in P$, the expected distance from $P_i$ to at least one center is no more than $\lambda$. The problem has not been studied before. We present an $O(|T|+M\log^2 M)$ time algorithm for the problem, where |T| is the number of vertices of T and M is the total number of locations of all uncertain points of P, i.e., $M=\sum_{P_i\in P}m_i$. In addition, by using this algorithm, we solve a k-center problem on T for the uncertain points of P.
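For intuition, the covering condition can be checked directly from the definition of expected distance. The sketch below does this on a path, a special case of a tree where distances reduce to absolute differences; the locations, probabilities, and candidate center are made-up values:

```python
def expected_distance(locations, probs, center):
    """E[d(P_i, c)] = sum_j p_ij * d(x_ij, c); on a path, d(x, c) = |x - c|."""
    return sum(p * abs(x - center) for x, p in zip(locations, probs))

# One uncertain point with three possible locations on a path:
locs, probs = [0.0, 4.0, 10.0], [0.5, 0.3, 0.2]
e = expected_distance(locs, probs, 2.0)   # candidate center at x = 2
# P_i is covered by this center iff e <= lambda (here e = 3.2)
```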
Apr 25 2017 cs.NI
With the emergence of fifth generation (5G) mobile communication systems, millimeter wave transmission is believed to be a promising solution for vehicular networks, especially for vehicle-to-vehicle (V2V) communications. In millimeter wave V2V communications, different vehicular networking services have different quality requirements for V2V multi-hop links. To evaluate the quality of different V2V wireless links, this paper proposes a new link quality indicator that accounts for the real-time and reliability requirements of V2V multi-hop links. Moreover, different weight factors are configured in the indicator to reflect how strongly different types of services depend on real-time performance and reliability. Based on the proposed link quality indicator, the relationship between V2V link quality and one-hop communication distance under different vehicle densities is analyzed. Simulation results indicate that link quality improves as vehicle density increases, and that for a fixed vehicle density there exists an optimal one-hop communication distance.
Apr 24 2017 cs.CL
Neural machine translation (NMT) has become a new approach to machine translation and generates much more fluent results than statistical machine translation (SMT). However, SMT is usually better than NMT in translation adequacy. It is therefore a promising direction to combine the advantages of both NMT and SMT. In this paper, we propose a neural system combination framework leveraging multi-source NMT, which takes as input the outputs of NMT and SMT systems and produces the final translation. Extensive experiments on the Chinese-to-English translation task show that our model achieves significant improvements of 5.3 BLEU points over the best single system output and 3.4 BLEU points over state-of-the-art traditional system combination methods.
The proliferation of social media in communication and information dissemination has made it an ideal platform for spreading rumors. Automatically debunking rumors at their early stage of diffusion is known as \textit{early rumor detection}, which refers to dealing with sequential posts regarding disputed factual claims that exhibit certain variations and high textual duplication over time. Thus, identifying trending rumors demands an efficient yet flexible model that can capture long-range dependencies among postings and produce distinct representations for accurate early detection. However, applying conventional classification algorithms to early rumor detection is challenging, since they rely on hand-crafted features that require intensive manual effort when the number of posts is large. This paper presents a deep attention model based on recurrent neural networks (RNNs) that learns to \textit{selectively} attend to temporal hidden representations of sequential posts for identifying rumors. The proposed model embeds soft attention into the recurrence to simultaneously pool distinct features with particular focus and produce hidden representations that capture contextual variations of relevant posts over time. Extensive experiments on real datasets collected from social media websites demonstrate that (1) the deep attention based RNN model outperforms state-of-the-art methods that rely on hand-crafted features; (2) the soft attention mechanism can effectively distill the parts of the original posts relevant to rumors in advance; and (3) the proposed method detects rumors more quickly and accurately than its competitors.
Apr 18 2017 cs.NE
In this paper, we propose a Hybrid Ant Colony Optimization algorithm (HACO) for the Next Release Problem (NRP). NRP, an NP-hard problem in requirements engineering, is to balance customer requests, resource constraints, and requirement dependencies through requirement selection. Inspired by the success of Ant Colony Optimization (ACO) algorithms in solving NP-hard problems, we design HACO to approximately solve NRP. As in traditional ACO algorithms, multiple artificial ants are employed to construct new solutions; during the solution construction phase, both pheromone trails and neighborhood information determine the choices of every ant. In addition, a local search (first-found hill climbing) is incorporated into HACO to improve solution quality. Extensive experiments on typical NRP test instances show that HACO outperforms the existing algorithms (GRASP and simulated annealing) in terms of both solution quality and running time.
We study an extension of active learning in which the learning algorithm may ask the annotator to compare the distances of two examples from the boundary of their label-class. For example, in a recommendation system application (say for restaurants), the annotator may be asked whether she liked or disliked a specific restaurant (a label query), or which of two restaurants she liked more (a comparison query). We focus on the class of half spaces, and show that under natural assumptions, such as large margin or bounded bit-description of the input examples, it is possible to reveal all the labels of a sample of size $n$ using approximately $O(\log n)$ queries. This implies an exponential improvement over classical active learning, where only label queries are allowed. We complement these results by showing that if any of these assumptions is removed then, in the worst case, $\Omega(n)$ queries are required. Our results follow from a new general framework of active learning with additional queries. We identify a combinatorial dimension, called the \emph{inference dimension}, that captures the query complexity when each additional query is determined by $O(1)$ examples (such as comparison queries, each of which is determined by the two compared examples). Our results for half spaces follow by bounding the inference dimension in the cases discussed above.
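The flavor of the exponential saving is easiest to see in one dimension, where the labels of $n$ known points under a threshold classifier can all be recovered by a single binary search with $O(\log n)$ label queries; the paper's comparison queries extend this kind of inference to half spaces. A self-contained sketch (the point set and threshold are illustrative):

```python
def infer_labels(xs, label_query):
    """Recover all labels of the sorted points xs under a 1-D threshold
    classifier by binary-searching for the first positive point, so
    only O(log n) label queries are spent instead of n."""
    lo, hi = 0, len(xs)               # invariant: first +1 label lies in xs[lo:hi]
    queries = 0
    while lo < hi:
        mid = (lo + hi) // 2
        queries += 1
        if label_query(xs[mid]) == 1:
            hi = mid
        else:
            lo = mid + 1
    return [-1] * lo + [1] * (len(xs) - lo), queries

xs = list(range(100))                 # 100 known points, labels unknown
labels, q = infer_labels(xs, lambda x: 1 if x > 37.5 else -1)
```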
Apr 11 2017 cs.MM
Noise is often introduced into host audio by common signal processing operations, and it usually alters the high-frequency components of an audio signal. Embedding a watermark by adjusting low-frequency coefficients can therefore improve the robustness of a watermarking scheme. The moving average sequence is a low-frequency feature of an audio signal. This work proposes a method that embeds the watermark into the maximal coefficient of the discrete cosine transform (DCT) of a moving average sequence. Subjective and objective tests reveal that the proposed watermarking scheme maintains high audio quality while being highly robust to common digital signal processing operations, including additive noise, sampling rate change, bit resolution transformation, MP3 compression, random cropping, and especially low-pass filtering.
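The embedding idea can be sketched with a quantization-index-modulation rule on the largest DCT coefficient of the moving-average sequence. The window length, quantization step, and QIM rule below are illustrative assumptions, not the paper's exact scheme (which also reconstructs the host audio from the modified sequence):

```python
import numpy as np
from scipy.fft import dct, idct

def embed_bit(ma, bit, step=0.5):
    """Embed one bit by quantization index modulation on the
    largest-magnitude DCT coefficient of the moving-average sequence."""
    c = dct(ma, norm="ortho")
    k = int(np.argmax(np.abs(c)))
    q = int(np.round(c[k] / step))
    if q % 2 != bit:                  # snap to an even (bit 0) or odd (bit 1) multiple
        q += 1
    c[k] = q * step
    return idct(c, norm="ortho")

def extract_bit(ma_wm, step=0.5):
    c = dct(ma_wm, norm="ortho")
    k = int(np.argmax(np.abs(c)))
    return int(np.round(c[k] / step)) % 2

t = np.linspace(0.0, 1.0, 64)
frame = 2.0 + 0.5 * np.sin(2 * np.pi * t)              # toy audio frame
ma = np.convolve(frame, np.ones(8) / 8, mode="valid")  # moving-average (low-frequency) feature
bit_out = extract_bit(embed_bit(ma, 1))                # recovers the embedded bit
```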
Apr 07 2017 cs.CV
Person re-identification (ReID) is an important task in wide area video surveillance which focuses on identifying people across different cameras. Recently, deep learning networks with a triplet loss have become a common framework for person ReID. However, the triplet loss focuses mainly on obtaining correct orders on the training set. It still suffers from weak generalization from the training set to the testing set, resulting in inferior performance. In this paper, we design a quadruplet loss, which leads to model outputs with a larger inter-class variation and a smaller intra-class variation compared to the triplet loss. As a result, our model has better generalization ability and achieves higher performance on the testing set. In particular, a quadruplet deep network using margin-based online hard negative mining is proposed based on the quadruplet loss for person ReID. In extensive experiments, the proposed network outperforms most of the state-of-the-art algorithms on representative datasets, which clearly demonstrates the effectiveness of our proposed method.
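A common form of the quadruplet loss adds, to the usual triplet term, a second term that pushes the anchor-positive distance below the distance between samples of two other classes. A numpy sketch with illustrative margins and toy 2-D embeddings:

```python
import numpy as np

def quadruplet_loss(a, p, n1, n2, m1=1.0, m2=0.5):
    """Quadruplet loss sketch: the standard triplet term plus a
    cross-class term over two negatives from different classes;
    the margin values here are illustrative."""
    d = lambda x, y: float(np.sum((x - y) ** 2))  # squared Euclidean distance
    triplet = max(0.0, d(a, p) - d(a, n1) + m1)   # anchor-relative ordering
    push = max(0.0, d(a, p) - d(n1, n2) + m2)     # cross-class separation
    return triplet + push

a = np.array([0.0, 0.0])            # anchor embedding
pos = np.array([0.1, 0.0])          # same identity as the anchor
n1 = np.array([0.5, 0.0])           # a different identity, too close to the anchor
n2 = np.array([0.0, 2.0])           # a third identity
loss = quadruplet_loss(a, pos, n1, n2)   # only the triplet term is active here
```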
This paper considers a massive MIMO system, where K single-antenna transmit terminals communicate with an N-antenna receiver. By massive MIMO, we assume N >> K >> 1. We propose a novel blind detection scheme that exploits the channel sparsity inherent to the angular domain of the receive antenna array, simultaneously estimating the channel and data by factorizing the received signal matrix. We show that the overhead for channel acquisition can be largely compensated by the potential gain due to the channel sparsity. Specifically, by exploiting the channel sparsity, our proposed scheme can achieve a degree of freedom (DoF) arbitrarily close to K(1-1/T), with T being the channel coherence time, provided that N is sufficiently large and the channel is sufficiently sparse. This achievable DoF has a fractional gap of only 1/T from the ideal DoF of K, a remarkable advance in understanding the performance limit of massive MIMO systems. We further show that the performance advantage of our proposed scheme in the asymptotic SNR regime carries over to the practical SNR regime, and we present an efficient message-passing algorithm to jointly estimate the channel and detect the data via matrix factorization. Numerical results demonstrate that our proposed scheme significantly outperforms its counterparts in the practical SNR regime under various system configurations.
This paper presents the next evolution of FD-MIMO technology for beyond 5G, where the antennas of the FD-MIMO system are placed in a distributed manner throughout the cell in a multi-cell deployment scenario. This system, referred to as Distributed FD-MIMO (D-FD-MIMO), is capable of providing higher cell-average throughput as well as a more uniform user experience compared to the conventional FD-MIMO system. System-level simulations are performed to evaluate performance. Our results show that the proposed D-FD-MIMO system achieves 1.4-2 times the cell-average throughput of the FD-MIMO system. Insights into the source of the performance gain are provided, and hardware implementation challenges and potential standards impact are also discussed.
Apr 03 2017 cs.NI
The volume and types of traffic data in mobile cellular networks have been increasing continuously, and traffic data change dynamically along several dimensions such as time and space. Traffic modeling is thus essential for the theoretical analysis and energy-efficient design of future ultra-dense cellular networks. In this paper, we aim to build a tractable and accurate model describing the traffic variation pattern of a single base station in real cellular networks. First, a sinusoid superposition model is proposed for describing the temporal traffic variation of multiple base stations, based on real data from a current cellular network. It shows that the mean traffic volume of many base stations in an area changes periodically and has three main frequency components. Then, the lognormal distribution is verified for spatial modeling of real traffic data, and the spatial traffic distributions at both off-peak and busy times are analyzed. Moreover, the parameters of the model are presented for three typical regions: park, campus, and central business district. Finally, an approach for combined spatial-temporal traffic modeling of a single base station is proposed based on the temporal and spatial traffic distributions of multiple base stations. All three models are evaluated through comparison with real data from current cellular networks, and the results show that they can accurately describe the variation pattern of real traffic data.
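A sinusoid superposition with known periods can be fit as a linear least-squares problem, since a sinusoid of any phase is a linear combination of a sine and a cosine at the same frequency. The sketch below fits a mean level plus a few fixed-period components to a synthetic traffic trace; the periods (24, 12, and 8 hours) and the trace itself are illustrative assumptions, not the paper's data:

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(0.0, 72.0, 1.0)         # three days of hourly samples
true = 10 + 3*np.sin(2*np.pi*t/24 + 0.4) + 1.5*np.sin(2*np.pi*t/12 + 1.1)
y = true + rng.normal(0.0, 0.2, t.size)   # synthetic noisy traffic trace

periods = [24.0, 12.0, 8.0]           # assumed main frequency components
cols = [np.ones_like(t)]              # mean traffic level
for P in periods:                     # a*sin + b*cos spans any phase
    cols += [np.sin(2*np.pi*t/P), np.cos(2*np.pi*t/P)]
A = np.column_stack(cols)
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
fit = A @ coef                        # reconstructed periodic traffic pattern
```

The amplitude of each component is then `np.hypot` of its sine and cosine coefficients.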
3-dimensional Multiple-Input Multiple-Output (3D MIMO) systems have received great interest recently because of their spatial diversity advantage and capability for full-dimensional beamforming, making them promising candidates for the practical realization of massive MIMO. In this paper, we present low-cost test equipment (a channel sounder) and post-processing algorithms suitable for investigating 3D MIMO channels, as well as results from a measurement campaign for obtaining elevation and azimuth characteristics in an outdoor-to-indoor (O2I) environment. Due to limitations in available antenna switches, our channel sounder consists of a hybrid switched/virtual cylindrical array with effectively 480 antenna elements at the base station (BS). The virtual setup increased the overall MIMO measurement duration, thereby introducing phase drift errors into the measurements. Using reference antenna measurements, we estimate and correct for the phase errors during post-processing. We provide the elevation and azimuth angular spreads for measurements in urban macro-cellular (UMa) and urban micro-cellular (UMi) environments, and study their dependence on the UE height. Based on measurements with the UE placed on different floors, we study the feasibility of separating users in the elevation domain. The measured channel impulse responses are also used to study the channel hardening aspects of massive MIMO and the optimality of the Maximum Ratio Combining (MRC) receiver.
Mar 31 2017 cs.CR
Chip designers outsource chip fabrication to external foundries, but at the risk of IP theft. Logic locking, a promising solution to mitigate this threat, adds extra logic gates (key gates) and inputs (key bits) to the chip so that it functions correctly only when the correct key, known only to the designer but not the foundry, is applied. In this paper, we identify a new vulnerability in all existing logic locking schemes. Prior attacks on logic locking have assumed that, in addition to the design of the locked chip, the attacker has access to a working copy of the chip. Our attack does not require a working copy, and yet we successfully recover a significant fraction of key bits from the design of the locked chip alone. Empirically, we demonstrate the success of our attack on eight large benchmark circuits from a benchmark suite tailored specifically for logic synthesis research, for two different logic locking schemes. Then, to address this vulnerability, we initiate the study of provably secure logic locking mechanisms. We formalize, for the first time to our knowledge, a precise notion of security for logic locking. We establish that any locking procedure that is secure under our definition is guaranteed to counter our desynthesis attack, as well as all other such known attacks. We then devise a new logic locking procedure, Meerkat, that guarantees that the locked chip reveals no information about the key or the designer's intended functionality. A main insight behind Meerkat is that canonical representations of Boolean functionality via Reduced Ordered Binary Decision Diagrams (ROBDDs) can be leveraged effectively to provide security. We analyze Meerkat with regard to its security properties and the overhead it incurs. As such, our work is a contribution to both the foundations and practice of securing digital ICs.
Mar 31 2017 cs.CV
Outdoor lighting has an extremely high dynamic range, which makes capturing outdoor environment maps notoriously challenging without special equipment. In this work, we propose an alternative approach: we first capture lighting with a regular, low dynamic range (LDR) omnidirectional camera, and aim to recover the high dynamic range (HDR) after the fact via a novel, learning-based tonemapping method. We propose a deep autoencoder framework that regresses linear, high dynamic range data from non-linear, saturated, low dynamic range panoramas. We validate our method through a wide set of experiments on synthetic data, as well as on a novel dataset of real photographs with ground truth. Our approach finds applications in a variety of settings, ranging from outdoor light capture to image matching.
In this paper, we consider efficient differentially private empirical risk minimization from the viewpoint of optimization algorithms. For strongly convex and smooth objectives, we prove that gradient descent with output perturbation not only achieves nearly optimal utility, but also significantly improves the running time of previous state-of-the-art private optimization algorithms, for both $\epsilon$-DP and $(\epsilon, \delta)$-DP. For non-convex but smooth objectives, we propose an RRPSGD (Random Round Private Stochastic Gradient Descent) algorithm, which provably converges to a stationary point with a privacy guarantee. Besides the expected utility bounds, we also provide guarantees in high-probability form. Experiments demonstrate that our algorithm consistently outperforms existing methods in both utility and running time.
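Output perturbation can be sketched as plain gradient descent followed by Gaussian noise calibrated to the L2 sensitivity of the final iterate. The sensitivity value must come from the analysis (for strongly convex, smooth, Lipschitz losses it shrinks with the sample size); here it is simply a parameter, and the loss, step size, and constants below are illustrative rather than the paper's exact algorithm:

```python
import numpy as np

def dp_gd(grad, w0, steps, lr, sensitivity, eps, delta, rng):
    """Gradient descent with Gaussian output perturbation: run plain GD,
    then add noise scaled to the (assumed, analysis-supplied) L2
    sensitivity of the final iterate via the Gaussian mechanism."""
    w = np.asarray(w0, dtype=float).copy()
    for _ in range(steps):
        w = w - lr * grad(w)
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    return w + rng.normal(0.0, sigma, size=w.shape)

target = np.array([1.0, 2.0])
grad = lambda w: w - target           # strongly convex quadratic loss
rng = np.random.default_rng(0)
w_priv = dp_gd(grad, np.zeros(2), steps=200, lr=0.1, sensitivity=1e-3,
               eps=1.0, delta=1e-5, rng=rng)   # close to the true minimizer
```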
Mar 28 2017 cs.CV
Precise segmentation is a prerequisite for accurate quantification of imaged objects. It is a very challenging task in many medical imaging applications due to relatively poor image quality and data scarcity. In this work, we present an innovative segmentation paradigm, named Deep Poincare Map (DPM), which couples dynamical system theory with a novel deep learning based approach. Firstly, we model the image segmentation process as a dynamical system in which a limit cycle models the boundary of the region of interest (ROI). Secondly, instead of segmenting the ROI directly, a convolutional neural network is employed to predict the vector field of the dynamical system. Finally, the boundary of the ROI is identified using the Poincare map and flow integration. We demonstrate that our segmentation model can be built using a very limited amount of training data. By cross-validation, we achieve a mean Dice score of 94% against the manual delineation (ground truth) of the left ventricle ROI defined by clinical experts on a cardiac MRI dataset. Compared with other state-of-the-art methods, the proposed DPM method is adaptive, accurate, and robust, and it is straightforward to apply to other medical imaging applications.
Mar 27 2017 cs.DC
Memory caches are being aggressively used in today's data-parallel systems such as Spark, Tez, and Piccolo. However, prevalent systems employ rather simple cache management policies--notably the Least Recently Used (LRU) policy--that are oblivious to the application semantics of data dependency, expressed as a directed acyclic graph (DAG). Without this knowledge, memory caching can at best be performed by "guessing" the future data access patterns based on historical information (e.g., access recency and/or frequency), which frequently results in inefficient, erroneous caching with a low hit ratio and long response times. In this paper, we propose a novel cache replacement policy, Least Reference Count (LRC), which exploits the application-specific DAG information to optimize cache management. LRC evicts the cached data blocks whose reference count is smallest, where the reference count of a data block is defined as the number of its dependent child blocks that have not yet been computed. We demonstrate the efficacy of LRC through both empirical analysis and cluster deployments against popular benchmarking workloads. Our Spark implementation shows that, compared with LRU, LRC speeds up typical applications by 60%.
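The eviction rule itself is simple to state in code. A toy sketch of the policy (the class API and the tiny DAG are made up; the real Spark implementation derives and updates reference counts from the application DAG automatically):

```python
class LRCCache:
    """Least Reference Count (LRC) eviction sketch. A block's reference
    count is the number of its not-yet-computed dependent child blocks
    in the application DAG; on overflow, the cached block with the
    smallest remaining count is evicted."""

    def __init__(self, capacity, ref_counts):
        self.capacity = capacity
        self.ref = dict(ref_counts)   # block id -> remaining dependent children
        self.data = {}

    def put(self, block, value):
        if block not in self.data and len(self.data) >= self.capacity:
            victim = min(self.data, key=lambda b: self.ref.get(b, 0))
            del self.data[victim]     # evict the least-referenced block
        self.data[block] = value

    def child_computed(self, block):
        # one dependent child of `block` has finished computing
        self.ref[block] = max(0, self.ref.get(block, 0) - 1)

# Blocks a, b, c with 2, 1, and 1 pending children respectively:
cache = LRCCache(capacity=2, ref_counts={"a": 2, "b": 1, "c": 1})
cache.put("a", "block-a")
cache.put("b", "block-b")
cache.child_computed("b")             # b's only child is done -> count 0
cache.put("c", "block-c")             # evicts b (count 0), keeps a (count 2)
```

Unlike LRU, the victim here is chosen by how many future consumers a block still has, not by when it was last touched.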
Mar 23 2017 cs.HC
Music history, referring to the records of users' listening or downloading history in online music services, is the primary source for music service providers to analyze users' preferences and thus provide personalized recommendations. To engage users with the service and improve user experience, it would be beneficial to provide visual analyses of a user's music history as well as visualized recommendations to that user. In this paper, we take a user-centric approach to the design of such visual analyses. We start by investigating user needs for such visual analyses and recommendations, then propose several different visualization schemes, and perform a pilot study to collect user feedback on the designed schemes. We further conduct user studies to verify the utility of the proposed schemes; the results not only demonstrate the effectiveness of our proposed visualizations, but also provide important insights to guide future visualization design.
Mar 23 2017 cs.SY
This paper studies the physical consequences of unobservable false data injection (FDI) attacks designed using only information from inside a sub-network of the power system. The goal of the attack is to overload a chosen target line without being detected via measurements. To overcome the limited information, a multiple linear regression model is developed to learn the relationship between the external network and the attack sub-network from historical data. The worst possible consequences of such FDI attacks are evaluated by solving a bi-level optimization problem, wherein the first level models the limited attack resources and the second level formulates the system response to such attacks via DC optimal power flow (OPF). The limited-information attack model is reflected in a DC OPF formulation that only takes into account the system information of the attack sub-network. The vulnerability of the system to this attack model is illustrated on the IEEE 24-bus RTS and IEEE 118-bus systems.
Mar 22 2017 cs.CL
Recurrent neural networks (RNNs), especially long short-term memory (LSTM) RNNs, are effective networks for sequential tasks like speech recognition. Deeper LSTM models perform well on large-vocabulary continuous speech recognition because of their impressive learning ability, but deeper networks are also more difficult to train. We introduce a training framework with layer-wise training and exponential moving average methods for deeper LSTM models. Within this framework, LSTM models of more than 7 layers are successfully trained on Shenma voice search data in Mandarin, and they outperform deep LSTM models trained by the conventional approach. Moreover, for online streaming speech recognition applications, a shallow model with a low real-time factor is distilled from the very deep model, with little loss in recognition accuracy during the distillation process. As a result, the model trained with the proposed framework achieves a 14\% relative reduction in character error rate compared to the original model with similar real-time capability. Furthermore, a novel transfer learning strategy with segmental minimum Bayes-risk training is also introduced in the framework, which makes it possible for training on only a small part of the dataset to outperform training on the full dataset from the beginning.
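One ingredient named above, the exponential moving average of model weights, can be sketched in a few lines; the decay value and flat dict-of-arrays parameter layout are illustrative, not the paper's configuration:

```python
import numpy as np

class EMAWeights:
    """Maintain an exponential moving average (shadow copy) of model
    parameters alongside the raw optimizer-updated parameters."""

    def __init__(self, params, decay=0.999):
        self.decay = decay
        self.shadow = {k: v.copy() for k, v in params.items()}

    def update(self, params):
        for k, v in params.items():
            # shadow <- decay * shadow + (1 - decay) * current
            self.shadow[k] = self.decay * self.shadow[k] + (1.0 - self.decay) * v

params = {"w": np.zeros(3)}
ema = EMAWeights(params, decay=0.5)   # small decay so the effect is visible
params["w"] = params["w"] + 1.0       # one (mock) optimizer step
ema.update(params)                    # shadow moves halfway toward the new weights
```

At evaluation time, the smoothed `shadow` weights are used in place of the raw ones.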
In this paper, energy efficient power allocation for downlink massive MIMO systems is investigated. A constrained non-convex optimization problem is formulated to maximize the energy efficiency (EE), which takes into account the quality of service (QoS) requirements. By exploiting the properties of fractional programming and the lower bound of the user data rate, the non-convex optimization problem is transformed into a convex optimization problem. The Lagrangian dual function method is utilized to convert the constrained convex problem into an unconstrained convex one. Due to the multi-variable coupling problem caused by the intra-user interference, it is intractable to derive an explicit solution to the above optimization problem. Exploiting the standard interference function, we propose an implicit iterative algorithm to solve the unconstrained convex optimization problem and obtain the optimal power allocation scheme. Simulation results show that the proposed iterative algorithm converges in just a few iterations, and demonstrate the impact of the number of users and the number of antennas on the EE.
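Fractional-programming steps of this kind typically follow Dinkelbach's method: repeatedly solve a concave subproblem in which the current energy-efficiency estimate acts as a penalty weight, until the estimate stabilizes. The single-user toy version below has a closed-form subproblem; the channel gain, circuit power, and power cap are made-up values, and the paper's multi-user QoS-constrained problem is considerably more involved:

```python
import numpy as np

def dinkelbach_ee(g=2.0, pc=0.5, pmax=4.0, tol=1e-9):
    """Dinkelbach iteration for a toy energy-efficiency problem:
    maximize EE(p) = log(1 + g*p) / (p + pc) over 0 <= p <= pmax.
    Each step solves max_p log(1 + g*p) - q*(p + pc) in closed form
    (stationary point p = 1/q - 1/g, clipped to the feasible range)
    and updates q to the resulting efficiency."""
    q, p = 0.0, pmax
    for _ in range(100):
        p = pmax if q <= 0 else min(pmax, max(0.0, 1.0 / q - 1.0 / g))
        q_new = np.log1p(g * p) / (p + pc)
        if abs(q_new - q) < tol:
            return q_new, p           # q has converged to the optimal EE
        q = q_new
    return q, p

q_star, p_star = dinkelbach_ee()
```

For these toy parameters the fixed point works out to $p^\* = e/2 - 1/2 \approx 0.859$ with efficiency $q^\* = 2/e \approx 0.736$.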
Mar 22 2017 cs.CV
Recent advances in generative adversarial networks (GANs) have shown promising potential in conditional image generation. However, how to generate high-resolution images remains an open problem. In this paper, we aim at generating high-resolution, well-blended images given composited copy-and-paste ones, i.e., realistic high-resolution image blending. To achieve this goal, we propose Gaussian-Poisson GAN (GP-GAN), a framework that combines the strengths of classical gradient-based approaches and GANs; to the best of our knowledge, it is the first work to explore the capability of GANs in the high-resolution image blending task. In particular, we propose the Gaussian-Poisson Equation to formulate high-resolution image blending as a joint optimisation constrained by gradient and colour information. The gradient information is obtained with gradient filters, and for the colour information we propose Blending GAN to learn the mapping between the composited image and the well-blended one. Compared to alternative methods, our approach delivers high-resolution, realistic images with fewer bleedings and unpleasant artefacts. Experiments confirm that our approach achieves state-of-the-art performance on the Transient Attributes dataset, and a user study on Amazon Mechanical Turk finds that the majority of workers favour the proposed approach.
Translating information between text and image is a fundamental problem in artificial intelligence that connects natural language processing and computer vision. In the past few years, performance in image caption generation has seen significant improvement through the adoption of recurrent neural networks (RNNs). Meanwhile, text-to-image generation has begun to produce plausible images using datasets of specific categories like birds and flowers. We have even seen image generation from multi-category datasets such as the Microsoft Common Objects in Context (MSCOCO) dataset through the use of generative adversarial networks (GANs). Synthesizing objects with a complex shape, however, is still challenging. For example, animals and humans have many degrees of freedom, which means that they can take on many complex shapes. We propose a new training method called Image-Text-Image (I2T2I), which integrates text-to-image and image-to-text (image captioning) synthesis to improve the performance of text-to-image synthesis. We demonstrate that I2T2I can generate better multi-category images using MSCOCO than the state-of-the-art. We also demonstrate that I2T2I can achieve transfer learning by using a pre-trained image captioning module to generate human images on the MPII Human Pose dataset.
The singular value decomposition (SVD) is a widely used matrix factorization tool that underlies many useful applications, e.g., recommendation systems, anomaly detection, and data compression. In the emerging Internet of Things (IoT) environment, there will be an increasing demand for data analysis to improve people's lives and create new points of economic growth. Moreover, due to the large scope of IoT, most of the data analysis work should be done at the network edge, i.e., handled by fog computing. However, the devices that provide fog computing may not be trustworthy, while data privacy is often a significant concern of IoT application users. Thus, when performing SVD for data analysis, the privacy of user data should be preserved. For these reasons, we propose in this paper a privacy-preserving fog computing framework for SVD computation. The security and performance analysis shows the practicality of the proposed framework. Furthermore, since different applications may utilize the result of the SVD operation in different ways, three applications with different objectives are introduced to show how the framework can flexibly serve different purposes, demonstrating the flexibility of the design.
Studies show that refining real-world categories into semantic subcategories contributes to better image modeling and classification. Previous image sub-categorization work relying on labeled images and WordNet's hierarchy is not only labor-intensive but also restricted to classifying images into NOUN subcategories. To tackle these problems, in this work we exploit general corpus information to automatically select and subsequently classify web images into semantically rich (sub-)categories. Two major challenges are studied: 1) noise in the labels of subcategories derived from the general corpus, and 2) noise in the labels of images retrieved from the web. Specifically, we first obtain the semantically refined subcategories from the text perspective and remove noise with a relevance-based approach. To suppress the noisy images induced by search error, we then formulate image selection and classifier learning as a multi-class multi-instance learning problem and propose to solve it with the cutting-plane algorithm. The experiments show significant performance gains from using the data generated by our approach on both image categorization and sub-categorization tasks. The proposed approach also consistently outperforms existing weakly supervised and web-supervised approaches.
Mar 16 2017 cs.CL
The last several years have seen intensive interest in exploring neural-network-based models for machine comprehension (MC) and question answering (QA). In this paper, we approach these problems by closely modelling questions in a neural network framework. We first introduce syntactic information to help encode questions. We then view different types of questions, and the information shared among them, as an adaptation task, and propose adaptation models for them. On the Stanford Question Answering Dataset (SQuAD), we show that these approaches can help attain better results over a competitive baseline.
In this paper, we present a novel real-time MIMO channel sounder for 28 GHz. Until now, the common practice for investigating the directional characteristics of millimeter-wave channels has been to use a rotating horn antenna. The sounder presented here is capable of performing horizontal and vertical beam steering with the help of phased arrays. Thanks to its fast beam-switching capability, the proposed sounder can perform measurements that are directionally resolved both at the transmitter (TX) and receiver (RX) in as little as 1.44 milliseconds, compared to the minutes or even hours required by rotating horn antenna sounders. This not only enables us to measure more points for better statistical inference but also allows us to perform directional analysis in dynamic environments. Equally importantly, the short measurement time combined with the high phase stability of our setup limits the phase drift between TX and RX, enabling phase-coherent sounding of all beam pairs even when TX and RX are physically separated and have no cabled connection for synchronization. This ensures that the measurement data are suitable for high-resolution parameter extraction algorithms. Along with the system design and specifications, this paper also discusses the measurements performed to verify the sounder. Furthermore, we present sample measurements from a channel sounding campaign performed on a residential street.
We develop a method to estimate travel latency cost functions from data in multi-class transportation networks, which accommodate different types of vehicles with very different characteristics (e.g., cars and trucks). Leveraging our earlier work on inverse variational inequalities, we develop a data-driven approach to estimate the travel latency cost functions. Extensive numerical experiments using benchmark networks, ranging from moderate-sized to large-sized, demonstrate the effectiveness and efficiency of our approach.
The practical deployment of massive multiple-input multiple-output (MIMO) in future fifth generation (5G) wireless communication systems is challenging due to its high hardware cost and power consumption. One promising solution to address this challenge is to adopt the low-resolution analog-to-digital converter (ADC) architecture. However, the practical implementation of such an architecture is challenging due to the complex signal processing required to compensate for the coarse quantization caused by low-resolution ADCs. Therefore, a few high-resolution ADCs are reserved in the recently proposed mixed-ADC architecture to enable low-complexity transceiver algorithms. In contrast to previous works over Rayleigh fading channels, we investigate the performance of mixed-ADC massive MIMO systems over Rician fading channels, which are more general for 5G scenarios such as the Internet of Things (IoT). Specifically, novel closed-form approximate expressions for the uplink achievable rate are derived for both perfect and imperfect channel state information (CSI). The derived results show that, as the Rician $K$-factor increases, the achievable rate converges to a fixed value. We also obtain the power-scaling law that the transmit power of each user can be scaled down proportionally to the inverse of the number of base station (BS) antennas for both perfect and imperfect CSI. Moreover, we reveal the trade-off between the achievable rate and energy efficiency with respect to key system parameters, including the quantization bits, number of BS antennas, Rician $K$-factor, user transmit power, and CSI quality. Finally, numerical results are provided to show that the mixed-ADC architecture can achieve a better energy-rate trade-off compared with both the ideal infinite-resolution and the low-resolution ADC architectures.
Loyalty is an essential component of multi-community engagement. When users have the choice to engage with a variety of different communities, they often become loyal to just one, focusing on that community at the expense of others. However, it is unclear how loyalty is manifested in user behavior, or whether loyalty is encouraged by certain community characteristics. In this paper we operationalize loyalty as a user-community relation: users loyal to a community consistently prefer it over all others; loyal communities retain their loyal users over time. By exploring this relation using a large dataset of discussion communities from Reddit, we reveal that loyalty is manifested in remarkably consistent behaviors across a wide spectrum of communities. Loyal users employ language that signals collective identity and engage with more esoteric, less popular content, indicating they may play a curational role in surfacing new material. Loyal communities have denser user-user interaction networks and lower rates of triadic closure, suggesting that community-level loyalty is associated with more cohesive interactions and less fragmentation into subgroups. We exploit these general patterns to predict future rates of loyalty. Our results show that a user's propensity to become loyal is apparent from their first interactions with a community, suggesting that some users are intrinsically loyal from the very beginning.
We consider sequential detection based on quantized data in the presence of an eavesdropper. Stochastic encryption is employed as a countermeasure that flips the quantization bits at each sensor according to certain probabilities, where the flipping probabilities are known only to the legitimate fusion center (LFC) and not to the eavesdropping fusion center (EFC). As a result, the LFC employs the optimal sequential probability ratio test (SPRT) for sequential detection, whereas the EFC employs a mismatched SPRT (MSPRT). We characterize the asymptotic performance of the MSPRT in terms of the expected sample size as a function of the vanishing error probabilities. We show that when the detection error probabilities are set to be the same at the LFC and EFC, every symmetric stochastic encryption is ineffective in the sense that it leads to the same expected sample size at the LFC and EFC. Next, in the asymptotic regime of small detection error probabilities, we show that every stochastic encryption degrades the performance of quantized sequential detection at the LFC by increasing the expected sample size, and that the expected sample size required at the EFC is no smaller than that required at the LFC. We then investigate the stochastic encryption that is optimal in the sense of maximizing the difference between the expected sample sizes required at the EFC and LFC. Although this optimization problem is nonconvex, we show that if the acceptable tolerance of the increase in the expected sample size at the LFC induced by the stochastic encryption is small enough, then the globally optimal stochastic encryption can be obtained analytically; moreover, the optimal scheme flips only one type of quantized bit (i.e., 1 or 0) and keeps the other type unchanged.
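The bit-flipping operation at each sensor can be sketched as follows (a minimal illustration; the function name and interface are ours, and in practice the flipping probabilities would be shared only with the LFC):

```python
import random

def stochastic_encrypt(bits, p0, p1, rng=random):
    """Flip each quantized bit independently: a 0 is flipped with
    probability p0 and a 1 with probability p1. A symmetric scheme
    has p0 == p1; the optimal scheme described in the abstract flips
    only one type of bit (e.g., p1 = 0)."""
    return [1 - b if rng.random() < (p1 if b else p0) else b
            for b in bits]
```

With p0 = p1 the scheme is symmetric, which the analysis shows to be ineffective when both fusion centers use the same detection error probabilities.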
Mar 08 2017 cs.HC
In the Google Play store, an introduction page is associated with every mobile application (app), from which users acquire its details, including screenshots, description, reviews, etc. However, it remains a challenge to identify which items influence users most when downloading an app. To explore users' perspective, we conduct a survey on this question. The results of the survey suggest that participants pay the most attention to the app description, which gives users a quick overview of the app. Although there exist some guidelines about how to write a good app description to attract more downloads, it is hard to define a high-quality app description, and there is no tool to evaluate the quality of app descriptions. In this paper, we employ crowdsourcing to extract the attributes that affect the quality of app descriptions. First, we download some app descriptions from Google Play and invite participants to rate their quality with a score from one (very poor) to five (very good). The participants are also requested to explain the reasons for every score. By analyzing the reasons, we extract the attributes that the participants consider important when evaluating the quality of app descriptions. Finally, we train supervised learning models on a sample of 100 app descriptions. In our experiments, the support vector machine model obtains up to 62% accuracy. In addition, we find that the permissions, the number of paragraphs, and the average number of words per feature play key roles in defining a good app description.
Mar 07 2017 cs.SE
Developers increasingly rely on API tutorials to facilitate software development. However, it remains a challenging task for them to discover relevant API tutorial fragments explaining unfamiliar APIs. Existing supervised approaches suffer from the heavy burden of manually preparing corpus-specific annotated data and features. In this study, we propose a novel unsupervised approach, namely Fragment Recommender for APIs with PageRank and Topic model (FRAPT). FRAPT addresses the two main challenges of the task and effectively determines relevant tutorial fragments for APIs. In FRAPT, a Fragment Parser identifies APIs in tutorial fragments and replaces ambiguous pronouns and variables with related ontologies and API names, so as to address the pronoun and variable resolution challenge. A Fragment Filter then employs a set of non-explanatory detection rules to remove non-explanatory fragments, thus addressing the non-explanatory fragment identification challenge. Finally, two correlation scores are computed and aggregated to determine the relevant fragments for APIs, by applying both a topic model and the PageRank algorithm to the retained fragments. Extensive experiments over two publicly available tutorial corpora show that FRAPT improves the state-of-the-art approach by 8.77% and 12.32%, respectively, in terms of F-measure. The effectiveness of the key components of FRAPT is also validated.
Mar 07 2017 cs.SE
Developers prefer to utilize third-party libraries when implementing certain functionalities, and Application Programming Interfaces (APIs) are frequently used by them. Facing an unfamiliar API, developers tend to consult tutorials as learning resources. Unfortunately, the segments explaining a specific API are scattered across tutorials, so it remains a challenging issue to find the relevant segments. In this study, we propose a more accurate model to find the exact tutorial fragments explaining APIs. This new model consists of a text classifier with domain-specific features. More specifically, we discover two important indicators that complement traditional text-based features, namely co-occurring APIs and knowledge-based API extensions. In addition, we incorporate Word2Vec, a semantic similarity metric, to enhance the new model. Extensive experiments over two publicly available tutorial datasets show that our new model can find up to 90% of the fragments explaining APIs and improves the state-of-the-art model by up to 30% in terms of F-measure.
Mar 06 2017 cs.CL
As training data grow rapidly, large-scale parallel training on multi-GPU clusters is now widely applied to neural network model learning. We present a new approach that applies the exponential moving average method to large-scale parallel training of neural network models. It is a non-interference strategy: the exponential moving average model is not broadcast to the distributed workers to update their local models after model synchronization during training, and it serves as the final model of the training system. Fully-connected feed-forward neural networks (DNNs) and deep unidirectional long short-term memory (LSTM) recurrent neural networks (RNNs) are successfully trained with the proposed method for large-vocabulary continuous speech recognition on Shenma voice search data in Mandarin. The character error rate (CER) of Mandarin speech recognition is further reduced compared with state-of-the-art parallel training approaches.
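The non-interference EMA strategy reduces to a single server-side update after each synchronization; a minimal sketch (parameter lists stand in for model weights, and the decay value is an assumption, not the paper's setting):

```python
def ema_update(ema_params, new_params, decay=0.999):
    """Server-side exponential moving average of the synchronized model
    parameters. Workers never receive this EMA copy (non-interference);
    it is kept aside and used as the final model after training."""
    return [decay * e + (1.0 - decay) * p
            for e, p in zip(ema_params, new_params)]
```

Because the EMA model is never broadcast back, the update adds no communication cost to the parallel training loop.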
Mar 06 2017 cs.CV
One of the major challenges in Minimally Invasive Surgery (MIS) such as laparoscopy is the lack of depth perception. In recent years, laparoscopic scene tracking and surface reconstruction have been a focus of investigation to provide rich additional information to aid the surgical process and compensate for the depth perception issue. However, robust 3D surface reconstruction and augmented reality with depth perception on the reconstructed scene are yet to be reported. This paper presents our work in this area. First, we adopt a state-of-the-art visual simultaneous localization and mapping (SLAM) framework - ORB-SLAM - and extend the algorithm for use in MIS scenes for reliable endoscopic camera tracking and salient point mapping. We then develop a robust global 3D surface reconstruction framework based on the sparse point clouds extracted from the SLAM framework. Our approach is to combine an outlier removal filter with a Moving Least Squares smoothing algorithm and then employ Poisson surface reconstruction to obtain smooth surfaces from the unstructured sparse point cloud. Our proposed method has been quantitatively evaluated against ground-truth camera trajectories and the organ model surface used to render the synthetic simulation videos. In vivo laparoscopic videos used in the tests have demonstrated the robustness and accuracy of our proposed framework in both camera tracking and surface reconstruction, illustrating the potential of our algorithm for depth augmentation and depth-corrected augmented reality in MIS with monocular endoscopes.
We consider a D2D-enabled cellular network where user equipments (UEs) owned by rational users are incentivized to form D2D pairs using tokens. They exchange tokens electronically to "buy" and "sell" D2D services. Meanwhile, the devices have the ability to choose the transmission mode, i.e., receiving data via cellular links or D2D links. Thus, taking the different benefits brought by diverse traffic types into account, the UEs can utilize their tokens more efficiently via transmission mode selection. In this paper, the optimal transmission mode selection strategy as well as the token collection policy are investigated to maximize the long-term utility in a dynamic network environment. The optimal policy is proved to be a threshold strategy, and the thresholds have a monotonicity property. Numerical simulations verify our observations, and the gain from transmission mode selection is observed.
Process modeling and understanding are fundamental for advanced human-computer interfaces and automation systems. Recent research has focused on activity recognition, but little work has addressed process progress detection from sensor data. We introduce a real-time, sensor-based system for modeling, recognizing, and estimating the completeness of a process. We implemented a multimodal CNN-LSTM structure to extract spatio-temporal features from different sensory datatypes, and used a novel deep regression structure for overall completeness estimation. By combining process completeness estimation with a Gaussian mixture model, our system can predict the process phase from the estimated completeness. We also introduce the rectified hyperbolic tangent (rtanh) activation function and a conditional loss to help the training process. Using the completeness estimation result and performance speed calculations, we also implemented an online estimator of the remaining time. We tested this system using data obtained from a medical process (trauma resuscitation) and sport events (swim competitions). Our system outperformed existing implementations for phase prediction during trauma resuscitation, achieving over 80% process phase detection accuracy, with a completeness estimation error below 9% and a remaining-time estimation error below 18% of the duration on both datasets.
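The abstract names but does not define the rtanh activation; one plausible reading, suitable for a completeness regression head whose target lies in [0, 1), is tanh clipped at zero (an assumption on our part, not the paper's stated definition):

```python
import math

def rtanh(x):
    # Rectified hyperbolic tangent: max(0, tanh(x)). Like ReLU it is
    # zero for negative inputs, but its output is bounded above by 1,
    # matching a 0-100% completeness target. (Hypothetical definition.)
    return max(0.0, math.tanh(x))
```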
Under Markovian assumptions we leverage a Central Limit Theorem (CLT) related to the test statistic in the composite hypothesis Hoeffding test so as to derive a new estimator for the threshold needed by the test. We first show the advantages of our estimator over an existing estimator by conducting extensive numerical experiments. We find that our estimator controls better for false alarms while maintaining satisfactory detection probabilities. We then apply the Hoeffding test with our threshold estimator to detecting anomalies in both communication and transportation networks. The former application seeks to enhance cyber security and the latter aims at building smarter transportation systems in cities.
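For reference, the Hoeffding test statistic in its classical i.i.d. form is the KL divergence between the empirical distribution and the null model; the paper works under Markovian assumptions and estimates the threshold via a CLT, neither of which is shown in this sketch:

```python
import math

def hoeffding_statistic(counts, null_probs):
    """Empirical KL divergence D(p_hat || q) between observed symbol
    counts and the null distribution q. The Hoeffding test raises an
    alarm when n * D exceeds a threshold (here: the quantity whose
    threshold the abstract's estimator targets)."""
    n = sum(counts)
    kl = 0.0
    for c, q in zip(counts, null_probs):
        if c > 0:
            p = c / n
            kl += p * math.log(p / q)
    return kl
```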
Feb 23 2017 cs.NI
In this paper, we develop a framework for an innovative perceptive mobile (i.e., cellular) network that integrates sensing with communication and supports a wide range of new applications in transportation, surveillance, and environmental sensing. Three types of sensing methods implemented in the base stations are proposed, using either uplink or downlink multiuser communication signals. The required changes to system hardware and the major technical challenges are briefly discussed. We also demonstrate the feasibility of estimating sensing parameters by developing a compressive sensing based scheme and providing simulation results to validate its effectiveness.
Airborne laser scanning (lidar) point clouds can be processed to extract tree-level information over large forested landscapes. Existing procedures typically detect more than 90% of overstory trees, yet they barely detect 60% of understory trees because of the reduced number of lidar points penetrating the top canopy layer. Although understory trees provide limited financial value, they offer habitat for numerous wildlife species and are important for stand development. Here we model tree identification accuracy as a function of point cloud density by decomposing the lidar point cloud into overstory and multiple understory canopy layers, estimating the fraction of points representing the different layers, and inspecting tree identification accuracy as a function of point density. We show that at a density of about 170 pt/m² understory tree identification accuracy likely plateaus, which we regard as the point density required for reasonable identification of understory trees. Given the advancements in lidar sensor technology, point clouds can feasibly reach the required density to enable effective identification of individual understory trees, ultimately making remote quantification of forest resources more accurate. The layer decomposition methodology can also be adopted for other similar remote sensing or advanced imaging applications, such as geological subsurface modelling or biomedical tissue analysis.
Caching at mobile devices, accompanied by device-to-device (D2D) communications, is one promising technique to accommodate the exponentially increasing mobile data traffic. While most previous works ignored user mobility, there are some recent works taking it into account. However, the duration of user contact times has been ignored, making it difficult to explicitly characterize the effect of mobility. In this paper, we adopt the alternating renewal process to model the duration of both the contact and inter-contact times, and investigate how the caching performance is affected by mobility. The data offloading ratio, i.e., the proportion of requested data that can be delivered via D2D links, is taken as the performance metric. We first approximate the distribution of the communication time for a given user by beta distribution through moment matching. With this approximation, an accurate expression of the data offloading ratio is derived. For the homogeneous case where the average contact and inter-contact times of different user pairs are identical, we prove that the data offloading ratio increases with the user moving speed, assuming that the transmission rate remains the same. Simulation results are provided to show the accuracy of the approximate result, and also validate the effect of user mobility.
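The moment-matching step for the beta approximation of the communication-time distribution can be sketched as follows (a hypothetical helper, not the paper's code: given the mean and variance of the fraction of time a user pair is in contact, solve for the Beta(a, b) parameters):

```python
def beta_moment_match(mean, var):
    """Return (a, b) such that Beta(a, b) has the given mean and
    variance; requires 0 < mean < 1 and 0 < var < mean * (1 - mean)."""
    if not (0.0 < mean < 1.0 and 0.0 < var < mean * (1.0 - mean)):
        raise ValueError("moments incompatible with a Beta distribution")
    # Standard method-of-moments solution for the Beta distribution.
    common = mean * (1.0 - mean) / var - 1.0
    return mean * common, (1.0 - mean) * common
```

The fitted Beta(a, b) then replaces the intractable exact distribution when deriving the data offloading ratio.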
Feb 21 2017 cs.CV
3D face reconstruction from a single image is a classical and challenging problem, with wide applications in many areas. Inspired by recent works in face animation from RGB-D or monocular video inputs, we develop a novel method for reconstructing 3D faces from unconstrained 2D images, using a coarse-to-fine optimization strategy. First, a smooth coarse 3D face is generated from an example-based bilinear face model, by aligning the projection of 3D face landmarks with 2D landmarks detected from the input image. Afterwards, using global corrective deformation fields, the coarse 3D face is refined using photometric consistency constraints, resulting in a medium face shape. Finally, a shape-from-shading method is applied on the medium face to recover fine geometric details. Our method outperforms state-of-the-art approaches in terms of accuracy and detail recovery, which is demonstrated in extensive experiments using real world models and publicly available datasets.
Millimeter wave (mm-wave) communications is considered a promising technology for 5G networks. Exploiting beamforming gains with large-scale antenna arrays to combat the increased path loss at mm-wave bands is one of its defining features. However, previous works on mm-wave network analysis usually adopted oversimplified antenna patterns for tractability, which can lead to significant deviation from the performance with actual antenna patterns. In this paper, using tools from stochastic geometry, we carry out a comprehensive investigation on the impact of directional antenna arrays in mm-wave networks. We first present a general and tractable framework for coverage analysis with arbitrary distributions for interference power and arbitrary antenna patterns. It is then applied to mm-wave ad hoc and cellular networks, where two sophisticated antenna patterns with desirable accuracy and analytical tractability are proposed to approximate the actual antenna pattern. Compared with previous works, the proposed approximate antenna patterns help to obtain more insights on the role of directional antenna arrays in mm-wave networks. In particular, it is shown that the coverage probabilities of both types of networks increase as a non-decreasing concave function with the antenna array size. The analytical results are verified to be effective and reliable through simulations, and numerical results also show that large-scale antenna arrays are required for satisfactory coverage in mm-wave networks.
Densifying the network and deploying more antennas at each access point are two principal ways to boost the capacity of wireless networks. However, due to the complicated distributions of random signal and interference channel gains, largely induced by various space-time processing techniques, it is highly challenging to quantitatively characterize the performance of dense multi-antenna networks. In this paper, using tools from stochastic geometry, a tractable framework is proposed for the analytical evaluation of such networks. The major result is an innovative representation of the coverage probability, as an induced $\ell_1$-norm of a Toeplitz matrix. This compact representation incorporates lots of existing analytical results on single- and multi-antenna networks as special cases, and its evaluation is almost as simple as the single-antenna case with Rayleigh fading. To illustrate its effectiveness, we apply the proposed framework to investigate two kinds of prevalent dense wireless networks, i.e., physical layer security aware networks and millimeter-wave networks. In both examples, in addition to tractable analytical results of relevant performance metrics, insightful design guidelines are also analytically obtained.
Groups of small and medium enterprises (SMEs) back each other and form guarantee networks in order to obtain loans from banks. The risk over the networked enterprises may cause significant contagious damage. To mitigate such risks, we propose a hybrid feature representation, which is fed into a gradient boosting model for credit risk assessment of the guarantee network. An empirical study is performed on ten years of guarantee loan records from commercial banks. We find that often hundreds or thousands of enterprises back each other and constitute a sparse complex network. We study the risk of various structures of the loan guarantee network and observe a high correlation between defaults and centrality, as well as with the communities of the network. In particular, our quantitative risk evaluation model shows promising prediction performance on real-world data, which can be useful to both regulators and stakeholders.
Feb 07 2017 cs.CV
We introduce a system that recognizes concurrent activities from real-world data captured by multiple sensors of different types. The recognition is achieved in two steps. First, we extract spatial and temporal features from the multimodal data. We feed each datatype into a convolutional neural network that extracts spatial features, followed by a long short-term memory network that extracts temporal information from the sensory data. The extracted features are then fused for decision making in the second step, where we achieve concurrent activity recognition with a single classifier that encodes a binary output vector in which each element indicates whether the corresponding activity type is currently in progress. We tested our system with three datasets from different domains recorded using different sensors and achieved performance comparable to existing systems designed specifically for those domains. Our system is the first to address concurrent activity recognition with multisensory data using a single model, which is scalable, simple to train, and easy to deploy.
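The single-classifier, binary-output-vector formulation amounts to multi-label thresholding of per-activity scores; a minimal sketch (the threshold value and interface are assumptions, not the paper's exact head):

```python
import numpy as np

def concurrent_activities(scores, threshold=0.5):
    """One sigmoid score per activity type from a single classifier;
    element i of the binary output vector is 1 iff activity i is
    currently in progress, so several activities can be active at once,
    unlike a softmax head that forces exactly one label."""
    return (np.asarray(scores, dtype=float) >= threshold).astype(int)
```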
Mobile-edge computing (MEC) has recently emerged as a prominent technology to liberate mobile devices from computationally intensive workloads, by offloading them to the proximate MEC server. To make offloading effective, the radio and computational resources need to be dynamically managed, to cope with the time-varying computation demands and wireless fading channels. In this paper, we develop an online joint radio and computational resource management algorithm for multi-user MEC systems, with the objective of minimizing the long-term average weighted sum power consumption of the mobile devices and the MEC server, subject to a task buffer stability constraint. Specifically, at each time slot, the optimal CPU-cycle frequencies of the mobile devices are obtained in closed form, and the optimal transmit power and bandwidth allocation for computation offloading are determined with the Gauss-Seidel method; while for the MEC server, both the optimal frequencies of the CPU cores and the optimal MEC server scheduling decision are derived in closed form. In addition, a delay-improved mechanism is proposed to reduce the execution delay. Rigorous performance analysis is conducted for the proposed algorithm and its delay-improved version, indicating that the weighted sum power consumption and execution delay obey an $\left[O\left(1\slash V\right),O\left(V\right)\right]$ tradeoff with $V$ as a control parameter. Simulation results are provided to validate the theoretical analysis and demonstrate the impacts of various parameters.
Jan 25 2017 cs.CV
Recent works such as DEEPDESC and DEEPCOMPARE have proposed learning robust local image descriptors with a Siamese convolutional neural network directly from images, instead of handcrafting them like traditional descriptors such as SIFT and MROGH. Though these algorithms show state-of-the-art results on the Multi-View Stereo (MVS) dataset, they fail to accomplish many challenging real-world tasks such as stitching image panoramas, primarily due to their limited performance in finding correspondences. In this paper, we propose a novel hybrid algorithm with which we are able to harness the power of a learning-based approach along with the discriminative advantages that traditional descriptors offer. We also propose the PhotoSynth dataset, an order of magnitude larger than the traditional MVS dataset in terms of the number of scenes, images, and patches, along with positive and negative correspondences. Our PhotoSynth dataset also has better coverage of viewpoint, scale, and lighting challenges than the MVS dataset. We evaluate our approach on two datasets which provide images with large viewpoint differences and wide baselines. One is the Graffiti scene from the Oxford Affine Covariant Regions Dataset (ACRD), for matching images under 2D affine transformations; the other is the Fountain-P11 dataset, for images under 3D projective transformations. We report, to the best of our knowledge, the best results to date on the ACRD Graffiti scene, compared to descriptors such as SIFT and MROGH or any other learnt descriptors such as DEEPDESC.
Mobile-edge computing (MEC) has emerged as a prominent technique to provide mobile services with high computation requirement, by migrating the computation-intensive tasks from the mobile devices to the nearby MEC servers. To reduce the execution latency and device energy consumption, in this paper, we jointly optimize task offloading scheduling and transmit power allocation for MEC systems with multiple independent tasks. A low-complexity sub-optimal algorithm is proposed to minimize the weighted sum of the execution delay and device energy consumption based on alternating minimization. Specifically, given the transmit power allocation, the optimal task offloading scheduling, i.e., to determine the order of offloading, is obtained with the help of flow shop scheduling theory. Besides, the optimal transmit power allocation with a given task offloading scheduling decision will be determined using convex optimization techniques. Simulation results show that task offloading scheduling is more critical when the available radio and computational resources in MEC systems are relatively balanced. In addition, it is shown that the proposed algorithm achieves near-optimal execution delay along with a substantial device energy saving.
Jan 17 2017 cs.DB
We propose hMDAP, a hybrid framework for large-scale data analytical processing on Spark, to support multi-paradigm processing (including OLAP, machine learning, and graph analysis) in distributed environments. The framework features a three-layer data processing module and a business process module which controls the former. We will demonstrate the strength of hMDAP using real-world traffic scenarios.
Due to the exponential complexity of the resources required for quantum state tomography (QST), approaches that identify quantum states with less effort and faster speed are being sought. In this paper, we provide a tailored, efficient method for reconstructing mixed quantum states of up to $12$ (or even more) qubits from an incomplete set of observables subject to noise. Our method is applicable to any pure state $\rho$, and can be extended to many states of interest in quantum information tasks, such as the multi-particle entangled $W$ state, the GHZ state, and cluster states that are matrix product operators of low dimension. The method applies the quantum density matrix constraints to a quantum compressive sensing optimization problem, and exploits a modified Quantum Alternating Direction Method of Multipliers (Quantum-ADMM) to accelerate convergence. Our algorithm takes $8$, $35$, and $226$ seconds, respectively, to reconstruct arbitrary superposition state density matrices of $10$, $11$, and $12$ qubits with acceptable fidelity, using less than $1\%$ of the expectation measurements; to our knowledge, this is the fastest realization to date on a normal desktop. We further discuss applications of this method using experimental data of mixed states obtained in an ion trap experiment with up to $8$ qubits.
We investigate random search processes on complex networks and for the first time derive an exact expression for the partial cover time that quantifies the time a walker needs to visit multiple targets. Based on that, we find some invariant metrics like the effects of source location and the scale exponent of the size effect, which are independent of the target number. Interestingly, we observe the slow, logarithmic increase of the global partial cover time with the target number across various real networks. This suggests that more unvisited targets could be easily found by spending only a little extra time. This finding has practical applications in a broad range of areas where random searches are used to model complex dynamical processes.
Jan 11 2017 cs.AI
Forecasting the flow of crowds is of great importance to traffic management and public safety, and it is very challenging because the flow is affected by many complex factors, including spatial dependencies (nearby and distant), temporal dependencies (closeness, period, trend), and external conditions (e.g., weather and events). We propose a deep-learning-based approach, called ST-ResNet, to collectively forecast two types of crowd flows (i.e., inflow and outflow) in each and every region of a city. We design an end-to-end structure of ST-ResNet based on unique properties of spatio-temporal data. More specifically, we employ the residual neural network framework to model the temporal closeness, period, and trend properties of crowd traffic. For each property, we design a branch of residual convolutional units, each of which models the spatial properties of crowd traffic. ST-ResNet learns to dynamically aggregate the outputs of the three residual neural networks based on data, assigning different weights to different branches and regions. The aggregation is further combined with external factors, such as weather and day of the week, to predict the final traffic of crowds in each and every region. We have developed a real-time system based on the Microsoft Azure Cloud, called UrbanFlow, providing crowd flow monitoring and forecasting in Guiyang City, China. In addition, we present an extensive experimental evaluation on two types of crowd flows in Beijing and New York City (NYC), where ST-ResNet outperforms nine well-known baselines.
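The branch-aggregation step can be sketched as a parametric, per-region weighted fusion. The snippet below is a simplified stand-in (the weight maps would be learned in ST-ResNet; here they are supplied as inputs) showing the shape of the computation: each branch output is weighted element-wise, the external-factor term is added, and the result is squashed with tanh.

```python
import numpy as np

def fuse(x_close, x_period, x_trend, w_c, w_p, w_t, ext):
    # x_*: per-region outputs of the closeness/period/trend branches.
    # w_*: per-region weight maps (learned in the paper, fixed here).
    # ext: contribution of external factors (weather, day of week, ...).
    return np.tanh(w_c * x_close + w_p * x_period + w_t * x_trend + ext)
```

The tanh keeps the predicted flows in a normalized [-1, 1] range, matching a min-max-scaled training target.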
User-generated social media data are exploding and in high demand in both the public and private sectors. The disclosure of complete and intact social media data exacerbates the threats to user privacy. In this paper, we first identify a text-based user-linkage attack on current social media data publishing practices, in which the real users behind anonymous IDs in a published dataset can be pinpointed from the users' unprotected text data. Then we propose the first framework in the literature for differentially private social media data publishing. Within our framework, social media data service providers can publish perturbed datasets that provide differential privacy to social media users while offering high data utility to social media data consumers. Our differential privacy mechanism is based on a novel notion of $\epsilon$-text indistinguishability, which we propose to thwart the text-based user-linkage attack. Extensive experiments on real-world and simulated datasets confirm that our framework can simultaneously enable high-level differential privacy protection and high data utility.
Hybrid precoding is a cost-effective approach to support directional transmission for millimeter wave (mmWave) communications. While existing works on hybrid precoding mainly focus on single-user single-carrier transmission, in practice multicarrier transmission is needed to cope with the greatly increased bandwidth, and multiuser MIMO can provide additional spatial multiplexing gains. In this paper, we propose a new hybrid precoding structure for multiuser OFDM mmWave systems, which greatly simplifies the hybrid precoder design and is able to approach the performance of the fully digital precoder. In particular, two groups of phase shifters are combined to map the signals from radio frequency (RF) chains to antennas. An effective hybrid precoding algorithm based on alternating minimization (AltMin) is then proposed, which alternately optimizes the digital and analog precoders. A major algorithmic innovation is a LASSO formulation for the analog precoder, which yields computationally efficient algorithms. Simulation results show the performance gain of the proposed algorithm and reveal that canceling the interuser interference is critical in multiuser OFDM hybrid precoding systems.
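The alternating loop can be sketched as follows. Note this is not the paper's LASSO-based update: for brevity, the analog step here uses a common phase-extraction heuristic (take the entry-wise phases of $\mathbf{F}_{\mathrm{opt}}\mathbf{F}_{\mathrm{BB}}^{H}$), while the digital step is a least-squares fit. The goal is the same: factor the fully digital precoder into a unit-modulus analog part and an unconstrained digital part.

```python
import numpy as np

def altmin_hybrid_precoder(F_opt, n_rf, n_iter=50, seed=0):
    # Alternately fit the analog (unit-modulus) precoder F_rf and the
    # digital precoder F_bb so that F_rf @ F_bb approximates F_opt.
    rng = np.random.default_rng(seed)
    n_t, n_s = F_opt.shape
    # random unit-modulus initialization of the analog precoder
    F_rf = np.exp(1j * rng.uniform(0.0, 2 * np.pi, (n_t, n_rf)))
    for _ in range(n_iter):
        # digital step: unconstrained least squares given the analog part
        F_bb, *_ = np.linalg.lstsq(F_rf, F_opt, rcond=None)
        # analog step: keep only the phases (phase-extraction heuristic)
        F_rf = np.exp(1j * np.angle(F_opt @ F_bb.conj().T))
    return F_rf, F_bb
```

With `n_rf` at or above the number of streams, a few dozen iterations typically bring `F_rf @ F_bb` close to `F_opt` in Frobenius norm.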
Driven by the visions of the Internet of Things and 5G communications, recent years have seen a paradigm shift in mobile computing, from centralized Mobile Cloud Computing towards Mobile Edge Computing (MEC). The main feature of MEC is to push mobile computing, network control, and storage to the network edges (e.g., base stations and access points) so as to enable computation-intensive and latency-critical applications at resource-limited mobile devices. MEC promises dramatic reductions in latency and mobile energy consumption, tackling the key challenges for materializing the 5G vision. The promised gains of MEC have motivated extensive efforts in both academia and industry on developing the technology. A main thrust of MEC research is to seamlessly merge the two disciplines of wireless communications and mobile computing, resulting in a wide range of new designs ranging from techniques for computation offloading to network architectures. This paper provides a comprehensive survey of state-of-the-art MEC research with a focus on joint radio-and-computational resource management. We also present a research outlook consisting of a set of promising directions for MEC research, including MEC system deployment, cache-enabled MEC, mobility management for MEC, green MEC, and privacy-aware MEC. Advancements in these directions will facilitate the transformation of MEC from theory to practice. Finally, we introduce recent standardization efforts on MEC as well as some typical MEC application scenarios.
Jan 04 2017 cs.CL
Deep stacked RNNs are usually hard to train. Adding shortcut connections across different layers is a common way to ease the training of stacked networks. However, extra shortcuts make the recurrent step more complicated. To simplify the stacked architecture, we propose a framework called the shortcut block, which is a marriage of the gating mechanism and shortcuts that discards the self-connected part of the LSTM cell. We present extensive empirical experiments showing that this design makes training easy and improves generalization. We explore various shortcut block topologies and compositions to assess their effectiveness. Based on this architecture, we obtain a 6% relative improvement over the state of the art on the CCGbank supertagging dataset. We also obtain comparable results on the POS tagging task.
This paper presents a distributed approach that scales to segment tree crowns within a LiDAR point cloud representing an arbitrarily large forested area. The approach uses a single-processor tree segmentation algorithm as a building block in order to process the data, delivered in the shape of tiles, in parallel. The distributed processing is performed in a master-slave manner, in which the master maintains the global map of the tiles and coordinates the slaves that segment tree crowns within and across the boundaries of the tiles. Trees lying across tile boundaries introduced a minimal bias in the number of detected trees, which was quantified and adjusted for. Theoretical and experimental analyses of the runtime of the approach revealed a near-linear speedup. The estimated number of trees categorized by crown class and the associated error margins, as well as the height distribution of the detected trees, aligned well with field estimations, verifying that the distributed approach works correctly. The approach provides individual tree locations and point cloud segments for a forest-level area in a timely manner, which can be used to create detailed remotely sensed forest inventories. Although the approach was presented for tree segmentation within LiDAR point clouds, the idea can also be generalized to scale up the processing of other big spatial datasets. Highlights:
- A scalable distributed approach for tree segmentation was developed and theoretically analyzed.
- ~2 million trees in a 7440 ha forest were segmented in 2.5 hours using 192 cores.
- 2% false positive trees were identified as a result of the distributed run.
- The approach can be used to scale up the processing of other big spatial data.
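The master-slave tiling pattern can be sketched with a thread pool standing in for the slave processes. Here `segment_tile` is a trivial placeholder for the single-processor segmentation building block (it just "detects" one tree per point taller than 2 m), not the actual algorithm, and no cross-boundary merging is shown.

```python
from concurrent.futures import ThreadPoolExecutor

def segment_tile(tile):
    # Placeholder for the single-processor tree segmentation algorithm:
    # a "tile" is a list of point heights; detect one tree per point > 2 m.
    return [h for h in tile if h > 2.0]

def master(tiles, n_workers=4):
    # Master: hand tiles to workers in parallel, then merge the per-tile
    # results into one global list of detected trees (order preserved).
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        per_tile = list(pool.map(segment_tile, tiles))
    return [tree for result in per_tile for tree in result]
```

In the actual system the workers are processes on separate machines, and the master additionally resolves trees detected on both sides of a tile boundary.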
This paper presents a non-parametric approach for segmenting trees from airborne LiDAR data in deciduous forests. Based on the LiDAR point cloud, the approach collects crown information such as steepness and height on-the-fly to delineate crown boundaries, and most importantly, does not require a priori assumptions about crown shape and size. The approach segments trees iteratively within a given area, starting from the tallest and proceeding to the smallest until all trees have been segmented. To evaluate its performance, the approach was applied to the University of Kentucky Robinson Forest, a deciduous closed-canopy forest with complex terrain and vegetation conditions. The approach identified 94% of dominant and co-dominant trees with a false detection rate of 13%. About 62% of intermediate, overtopped, and dead trees were also detected, with a false detection rate of 15%. The overall segmentation accuracy was 77%. Correlations of the segmentation scores of the proposed approach with local terrain and stand metrics were not significant, which is likely an indication of the robustness of the approach, as results are not sensitive to differences in terrain and stand structures.
In this paper, we focus on one of the representative 5G network scenarios, namely multi-tier heterogeneous cellular networks. User association is investigated in order to reduce the downlink co-channel interference. First, in order to analyze multi-tier heterogeneous cellular networks, where the base stations in different tiers usually adopt different transmission powers, we propose a Transmission Power Normalization Model (TPNM), which is able to convert a multi-tier cellular network into a single-tier network in which all base stations have the same normalized transmission power. Using TPNM, the signal and interference received at any point in the complex multi-tier environment can be analyzed by considering the same point in the equivalent single-tier cellular network model, thus significantly simplifying the analysis. On this basis, we propose a new user association scheme for heterogeneous cellular networks, in which the base station that causes the smallest interference to other co-channel mobile stations is chosen from the set of candidate base stations that satisfy the quality-of-service (QoS) constraint for the intended mobile station. Numerical results show that the proposed user association scheme significantly reduces the downlink interference compared with existing schemes while maintaining reasonably good QoS.
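One natural reading of the normalization (an assumed model for illustration, not taken verbatim from the paper) is that, under power-law path loss $d^{-\alpha}$, a tier-$k$ base station with power $P_k$ at distance $d$ is equivalent to a reference base station with power $P_0$ placed at the rescaled distance $d\,(P_0/P_k)^{1/\alpha}$, since both yield the same received power. The sketch below verifies this equivalence.

```python
def received_power(p, d, alpha):
    # Received power under power-law path loss d^(-alpha).
    return p * d ** (-alpha)

def normalized_distance(d, p_k, p0, alpha):
    # Distance at which a reference-power (p0) base station produces the
    # same received power as a power-p_k base station at distance d:
    # p0 * d_n^(-alpha) = p_k * d^(-alpha)  =>  d_n = d * (p0/p_k)^(1/alpha)
    return d * (p0 / p_k) ** (1.0 / alpha)
```

Replacing every base station by its reference-power equivalent at the rescaled distance is what collapses the multi-tier network into a single-tier one.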
An airborne LiDAR point cloud representing a forest contains 3D data from which vertical stand structure can be derived. This paper presents a tree segmentation approach for multi-story stands that iteratively strips canopy layers off the point cloud and segments individual tree crowns within each layer, using a digital-surface-model-based tree segmentation method as a building block. We analyze the vertical distributions of LiDAR points within overlapping locales in order to determine the local height thresholds for stripping a canopy layer. Unlike previous work that stripped rigid layers within constrained areas, the local layering method strips flexible (in thickness and height) canopy layers within unconstrained areas, and can also be utilized as a robust vertical stratification of the canopy, independent of the tree segmentation method applied to each layer. Statistical analyses showed that layering strongly improves the detection of understory trees at the cost of moderately increasing their over-segmentation rate, while only slightly affecting the segmentation quality of overstory trees. Results obtained from layering the canopy suggest that acquiring denser LiDAR point clouds (becoming affordable due to advancements in sensor technology and platforms) would allow segmenting understory trees as accurately as overstory trees. Keywords: LiDAR remote sensing, multi-layered stand, canopy layering, vertical stratification, individual tree segmentation.
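The height-threshold idea can be illustrated with a drastic 1D simplification of the paper's locale-wise analysis: build a vertical histogram of point heights and cut at the emptiest bin between the two most populated bins, which separates the top canopy layer from the vegetation below it. The sketch below is illustrative only.

```python
import numpy as np

def layer_threshold(heights, bin_width=1.0):
    # Histogram the point heights, find the two most populated bins
    # (candidate canopy layers), and cut at the emptiest bin between them.
    heights = np.asarray(heights, dtype=float)
    edges = np.arange(0.0, heights.max() + 2 * bin_width, bin_width)
    hist, edges = np.histogram(heights, bins=edges)
    p1, p2 = np.sort(np.argsort(hist)[-2:])          # the two tallest peaks
    valley = p1 + int(np.argmin(hist[p1:p2 + 1]))    # emptiest bin between
    return edges[valley] + 0.5 * bin_width
```

For a bimodal stand with an understory layer near 5 m and an overstory layer near 20 m, the returned cut falls in the sparse gap between the two layers.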
Dec 23 2016 cs.CV
Neural image/video captioning models can generate accurate descriptions, but their internal process of mapping regions to words is a black box and therefore difficult to explain. Top-down neural saliency methods can find important regions given a high-level semantic task such as object classification, but cannot use a natural language sentence as the top-down input for the task. In this paper, we propose Caption-Guided Visual Saliency to expose the region-to-word mapping in modern encoder-decoder networks and demonstrate that it is learned implicitly from caption training data, without any pixel-level annotations. Our approach can produce spatial or spatiotemporal heatmaps both for predicted captions and for arbitrary query sentences. It recovers saliency without the overhead of introducing explicit attention layers, and can be used to analyze a variety of existing model architectures and improve their design. Evaluation on large-scale video and image datasets demonstrates that our approach achieves captioning performance comparable to existing methods while providing more accurate saliency heatmaps. Our code is available at visionlearninggroup.github.io/caption-guided-saliency/.
In this paper, we propose a new task and solution for vision and language: the generation of grounded visual questions. Visual question answering (VQA) is an emerging topic that links textual questions with visual input. To the best of our knowledge, there has been no automatic method for generating reasonable and versatile questions: so far, almost all textual questions, as well as the corresponding answers, have been generated manually. To this end, we propose a system that automatically generates visually grounded questions. First, the visual input is analyzed with a deep captioning model. Second, the captions, along with VGG-16 features, are used as input to our proposed question generator to produce visually grounded questions. Finally, to enable the generation of versatile questions, a question type selection module selects reasonable question types and provides them as parameters for question generation; this is done using a hybrid LSTM with both visual and answer input. Our system is trained on the VQA and Visual7W datasets and shows reasonable results on automatically generating new visual questions. We also propose a quantitative metric for automatic evaluation of question quality.
In this paper we consider the problem of robot navigation in simple maze-like environments where the robot has to rely on its onboard sensors to perform the navigation task. In particular, we are interested in solutions to this problem that do not require localization, mapping or planning. Additionally, we require that our solution can quickly adapt to new situations (e.g., changing navigation goals and environments). To meet these criteria we frame this problem as a sequence of related reinforcement learning tasks. We propose a successor feature based deep reinforcement learning algorithm that can learn to transfer knowledge from previously mastered navigation tasks to new problem instances. Our algorithm substantially decreases the required learning time after the first task instance has been solved, which makes it easily adaptable to changing environments. We validate our method in both simulated and real robot experiments with a Robotino and compare it to a set of baseline methods including classical planning-based navigation.
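The transfer mechanism behind successor features can be shown in two lines: action values factor as $Q(s,a) = \psi(s,a)^\top \mathbf{w}$, so adapting to a new navigation goal only requires a new reward-weight vector $\mathbf{w}$, while the successor features $\psi$ learned on earlier tasks are reused. The sketch below uses toy numbers, not the paper's network.

```python
import numpy as np

def q_values(psi, w):
    # psi: (n_actions, d) successor features for the current state;
    # w:   (d,) reward weights describing the current task.
    # Q(s, a) = psi(s, a) . w  -- the successor-feature value decomposition.
    return psi @ w

def best_action(psi, w):
    # Greedy action under the current task's reward weights.
    return int(np.argmax(q_values(psi, w)))
```

Swapping `w` while keeping `psi` fixed is exactly the cheap adaptation to a changed goal that makes the first task instance expensive and the later ones fast.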
Dec 15 2016 cs.CL
Product Community Question Answering (PCQA) provides useful information about products and their features (aspects) that may not be well addressed by product descriptions and reviews. We observe that a product's compatibility issues with other products are frequently discussed in PCQA, and that such issues are most often raised for accessories via yes/no questions, e.g., "Does this mouse work with Windows 10?". In this paper, we address the problem of extracting compatible and incompatible products from yes/no questions in PCQA. This problem naturally admits a two-stage framework: first, we perform Complementary Entity (product) Recognition (CER) on the yes/no questions; second, we identify the polarities of the yes/no answers to assign each complementary entity a compatibility label (compatible, incompatible, or unknown). We leverage an existing unsupervised method for the first stage, and for the second stage a 3-class classifier that combines a distant PU-learning method (learning from positive and unlabeled examples) with a binary classifier. The benefit of distant PU-learning is that it can cover more implicit yes/no answers without using any human-annotated data. We conduct experiments on 4 products to show that the proposed method is effective.
Stabilization and trajectory control of a quadrotor carrying a suspended load with a fixed, known mass have been extensively studied in recent years. However, the load mass is not always known beforehand and may vary during practical transportation tasks. This mass uncertainty introduces uncertain disturbances to the quadrotor system, degrading the stability and trajectory tracking performance of existing controllers. To improve the quadrotor's stability and trajectory tracking capability in this situation, we fully investigate the impacts of an uncertain load mass on the quadrotor. By comparing the performance of three different controllers -- the proportional-derivative (PD) controller, the sliding mode controller (SMC), and the model predictive controller (MPC) -- stabilization, rather than trajectory tracking error, is shown to be the aspect most affected by load mass uncertainty. A critical load mass exists for the quadrotor to maintain a desired transportation performance. Moreover, simulation results verify that a controller with strong robustness against disturbances is a good choice for practical applications.
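The steady-state effect of load-mass uncertainty on a PD controller is easy to reproduce in one dimension: if the gravity feedforward is computed with the nominal mass, the altitude settles with an offset proportional to the mass error, $(m_\mathrm{true}-m_\mathrm{nom})\,g/k_p$. The sketch below uses hypothetical gains and masses, not the paper's model.

```python
def simulate_pd_altitude(mass_true, mass_nominal, z_ref=1.0, kp=20.0, kd=8.0,
                         g=9.81, dt=0.001, t_end=5.0):
    # PD altitude controller designed for mass_nominal, applied to a
    # quadrotor whose true mass (body + load) is mass_true.
    z, vz = 0.0, 0.0
    for _ in range(int(t_end / dt)):
        # thrust: gravity feedforward for the *nominal* mass + PD feedback
        thrust = mass_nominal * g + kp * (z_ref - z) + kd * (0.0 - vz)
        az = thrust / mass_true - g       # true dynamics use the true mass
        vz += az * dt                      # explicit Euler integration
        z += vz * dt
    return z
```

With a matched mass the quadrotor settles at the reference; with a 20% heavier load it settles about $0.2\,g/k_p \approx 0.1$ m low, illustrating why robustness to this disturbance matters.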
This paper proposes a unified framework for effective rate analysis over arbitrarily correlated and not necessarily identical multiple-input single-output (MISO) fading channels, which uses a moment generating function (MGF) based approach and the H transform representation. The proposed framework has the potential to simplify the cumbersome analysis procedure of the probability density function (PDF) based approach. Moreover, the effective rates over two specific fading scenarios are investigated, namely independent but not necessarily identically distributed (i.n.i.d.) MISO hyper Fox's H fading channels and arbitrarily correlated generalized K fading channels. Exact analytical representations for these two scenarios are also presented. By substituting the corresponding parameters, the effective rates in various practical fading scenarios, such as Rayleigh, Nakagami-m, Weibull/Gamma, and generalized K fading channels, are readily available. In addition, asymptotic approximations are provided for the proposed H transform and MGF based approach, as well as for the effective rate over i.n.i.d. MISO hyper Fox's H fading channels. Simulations under various fading scenarios are also presented, supporting the validity of the proposed method.
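For reference, the quantity being analyzed is the standard effective-capacity definition from the literature (stated here for orientation; the paper's contribution is evaluating it via the MGF and H transform). For a MISO link with QoS exponent $\theta$, block duration $T$, bandwidth $B$, channel vector $\mathbf{h}$, and per-block spectral efficiency $C=\log_2\!\left(1+\mathrm{SNR}\,\lVert\mathbf{h}\rVert^2\right)$:

```latex
R(\theta)
= -\frac{1}{\theta T B}\,
  \ln \mathbb{E}\!\left[ e^{-\theta T B\, C} \right]
= -\frac{1}{A}\,\log_2 \mathbb{E}\!\left[
    \left(1+\mathrm{SNR}\,\lVert\mathbf{h}\rVert^{2}\right)^{-A}
  \right],
\qquad
A \triangleq \frac{\theta T B}{\ln 2}.
```

The expectation is over the fading distribution, which is where the MGF/H-transform machinery of the paper enters.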
Dec 09 2016 cs.CV
Hashing methods have attracted much attention for large-scale image retrieval, and deep hashing methods have recently achieved promising results by taking advantage of the representation power of deep networks. However, existing deep hashing methods treat all hash bits equally. On one hand, a large number of images share the same distance to a query image because the Hamming distance is discrete, which prevents fine-grained retrieval since the ranking of these images is ambiguous. On the other hand, different hash bits actually contribute to the retrieval differently, so treating them equally degrades retrieval accuracy. To address these two problems, we propose the query-adaptive deep weighted hashing (QaDWH) approach, which can perform fine-grained image retrieval for different queries via a weighted Hamming distance. First, a novel deep hashing network is designed to jointly learn the hash codes and the corresponding class-wise hash bit weights, so that the learned weights reflect the importance of each hash bit for each image class. Second, a query-adaptive image retrieval method is proposed, which rapidly generates a query image's hash bit weights by fusing the semantic probabilities of the query with the learned class-wise weights. Fine-grained image retrieval is then performed with the weighted Hamming distance, which provides more accurate ranking than the original Hamming distance. Extensive experiments on three widely used datasets show that the proposed approach outperforms state-of-the-art hashing methods.
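The retrieval step reduces to ranking by a weighted Hamming distance, where each disagreeing bit contributes its query-adaptive weight rather than a flat 1. The sketch below uses toy codes and weights (not learned ones) to show how the weighting breaks ties that the plain Hamming distance leaves ambiguous.

```python
import numpy as np

def weighted_hamming(query, codes, weights):
    # query:   (n_bits,) binary code of the query image
    # codes:   (n, n_bits) binary codes of the database images
    # weights: (n_bits,) query-adaptive bit weights
    # Each bit where a database code disagrees with the query contributes
    # its weight instead of a flat 1 (plain Hamming distance).
    return (codes != query).astype(float) @ weights

def ranked_indices(query, codes, weights):
    # Database indices sorted from nearest to farthest.
    return np.argsort(weighted_hamming(query, codes, weights))
```

In the example below, both database codes differ from the query in exactly one bit, so the plain Hamming distance ties them; the bit weights (0.9 vs. 0.5) resolve the ranking.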