results for au:Xu_J in:cs

- A super point is a special kind of host that contacts a huge number of other hosts in the network. Estimating its cardinality, the number of other hosts contacting it, plays an important role in network management. However, all existing work focuses on super point cardinality estimation under a discrete time window, which incurs great latency and ignores many measuring periods. A sliding time window measures super point cardinality at a finer granularity than a discrete time window, but is also more complex. This paper is the first to introduce an algorithm that estimates super point cardinality under a sliding time window from distributed edge routers. The algorithm's ability to perform sliding estimation comes from a novel method, proposed in this paper, for recording the time at which a host appears. Based on this method, two sliding cardinality estimators, a sliding rough estimator and a sliding linear estimator, are devised for super point detection and cardinality estimation, respectively. When these two estimators are used together, the algorithm consumes the least memory while achieving the highest accuracy. The algorithm can be deployed in a distributed environment and acquires the global super points' cardinalities by merging the estimators of distributed nodes. Both estimators can process packets in parallel, which makes it possible to handle high-speed networks in real time on a GPU. Experiments on real-world traffic show that this algorithm has the highest accuracy and the smallest memory footprint compared with other methods when running under a discrete time window. Under a sliding time window, it achieves the same performance as under a discrete time window.
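The abstract does not specify the estimators' internals; as a minimal sketch, the timestamp-recording idea can be applied to a plain linear (bitmap) counting estimator, where each bucket stores the last time any host hashed into it, so that cardinality can be queried for any recent window. Bucket count, hash choice, and window handling below are all assumptions, not the paper's design:

```python
import math
import hashlib

class SlidingLinearEstimator:
    """Linear-counting estimator whose buckets record last-seen times,
    so the distinct-host count can be estimated over a sliding window.
    (Illustrative sketch only; not the paper's sliding linear estimator.)"""

    def __init__(self, num_buckets=1024):
        self.m = num_buckets
        self.last_seen = [None] * num_buckets  # timestamp per bucket

    def _bucket(self, host):
        h = int(hashlib.md5(host.encode()).hexdigest(), 16)
        return h % self.m

    def record(self, host, t):
        b = self._bucket(host)
        if self.last_seen[b] is None or t > self.last_seen[b]:
            self.last_seen[b] = t

    def estimate(self, now, window):
        # A bucket counts as "set" only if it was touched inside the window.
        zeros = sum(1 for ts in self.last_seen
                    if ts is None or ts <= now - window)
        zeros = max(zeros, 1)  # guard against log(0)
        return self.m * math.log(self.m / zeros)

est = SlidingLinearEstimator()
for i in range(500):
    est.record(f"10.0.{i // 256}.{i % 256}", t=i)
# Only the 100 hosts seen in the last 100 time units count:
print(round(est.estimate(now=499, window=100)))  # roughly 100
```

Replacing the single bit of classical linear counting with a timestamp is what makes the structure incrementally updatable under a sliding window: old appearances simply age out of the zero-count instead of requiring a reset.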
- This paper contributes to cross-lingual image annotation and retrieval in terms of data and methods. We propose COCO-CN, a novel dataset enriching MS-COCO with manually written Chinese sentences and tags. For more effective annotation acquisition, we develop a recommendation-assisted collective annotation system, automatically providing an annotator with several tags and sentences deemed to be relevant with respect to the pictorial content. Having 20,342 images annotated with 27,218 Chinese sentences and 70,993 tags, COCO-CN is currently the largest Chinese-English dataset applicable for cross-lingual image tagging, captioning and retrieval. We develop methods per task for effectively learning from cross-lingual resources. Extensive experiments on the multiple tasks justify the viability of our dataset and methods.
- This paper proposes a novel non-oscillatory pattern (NOP) learning scheme for several oscillatory data analysis problems including signal decomposition, super-resolution, and signal sub-sampling. To the best of our knowledge, the proposed NOP is the first algorithm for these problems with fully non-stationary oscillatory data with close and crossover frequencies, and general oscillatory patterns. NOP is capable of handling complicated situations while existing algorithms fail; even in simple cases, e.g., stationary cases with trigonometric patterns, numerical examples show that NOP admits competitive or better performance in terms of accuracy and robustness than several state-of-the-art algorithms.
- May 22 2018 cs.GR arXiv:1805.07794v1 To carry out autonomous 3D scanning and online reconstruction of unknown indoor scenes, one has to find a balance between global exploration of the entire scene and local scanning of the objects within it. In this work, we propose a novel approach, which provides object-aware guidance for autoscanning, for exploring, reconstructing, and understanding an unknown scene within one navigation pass. Our approach interleaves between object analysis to identify the next best object (NBO) for global exploration, and object-aware information gain analysis to plan the next best view (NBV) for local scanning. First, an objectness-based segmentation method is introduced to extract semantic objects from the current scene surface via a multi-class graph cuts minimization. Then, an object of interest (OOI) is identified as the NBO which the robot aims to visit and scan. The robot then conducts fine scanning on the OOI with views determined by the NBV strategy. When the OOI is recognized as a full object, it can be replaced by its most similar 3D model in a shape database. The algorithm iterates until all of the objects are recognized and reconstructed in the scene. Various experiments and comparisons have shown the feasibility of our proposed approach.
- This paper proposes a novel user cooperation approach in both computation and communication for mobile edge computing (MEC) systems to improve the energy efficiency for latency-constrained computation. We consider a basic three-node MEC system consisting of a user node, a helper node, and an access point (AP) node attached with an MEC server, in which the user has latency-constrained and computation-intensive tasks to be executed. We consider two different computation offloading models, namely the partial and binary offloading, respectively. Under this setup, we focus on a particular finite time block and develop an efficient four-slot transmission protocol to enable the joint computation and communication cooperation. Besides the local task computing over the whole block, the user can offload some computation tasks to the helper in the first slot, and the helper cooperatively computes these tasks in the remaining time; while in the second and third slots, the helper works as a cooperative relay to help the user offload some other tasks to the AP for remote execution in the fourth slot. For both cases with partial and binary offloading, we jointly optimize the computation and communication resources allocation at both the user and the helper (i.e., the time and transmit power allocations for offloading, and the CPU frequencies for computing), so as to minimize their total energy consumption while satisfying the user's computation latency constraint. Although the two problems are non-convex in general, we propose efficient algorithms to solve them optimally. Numerical results show that the proposed joint computation and communication cooperation approach significantly improves the computation capacity and energy efficiency at the user and helper nodes, as compared to other benchmark schemes without such a joint design.
- May 16 2018 cs.IR arXiv:1805.05737v1 Assessing relevance between a query and a document is challenging in ad-hoc retrieval due to its diverse patterns, i.e., a document could be relevant to a query as a whole or partially, as long as it provides sufficient information for users' needs. Such diverse relevance patterns require an ideal retrieval model to be able to assess relevance at the right granularity adaptively. Unfortunately, most existing retrieval models compute relevance at a single granularity, either document-wide or passage-level, or use a fixed combination strategy, restricting their ability to capture diverse relevance patterns. In this work, we propose a data-driven method to allow relevance signals at different granularities to compete with each other for the final relevance assessment. Specifically, we propose a HIerarchical Neural maTching model (HiNT) which consists of two stacked components, namely a local matching layer and a global decision layer. The local matching layer focuses on producing a set of local relevance signals by modeling the semantic matching between a query and each passage of a document. The global decision layer accumulates local signals into different granularities and allows them to compete with each other to decide the final relevance score. Experimental results demonstrate that our HiNT model outperforms existing state-of-the-art retrieval models significantly on benchmark ad-hoc retrieval datasets.
- May 15 2018 cs.CV arXiv:1805.05029v2 Cross-view classification, which means classifying samples from heterogeneous views, is a significant yet challenging problem in computer vision. A promising approach to handle this problem is multi-view subspace learning (MvSL), which intends to find a common subspace for multi-view data. Despite the satisfactory results achieved by existing methods, the performance of previous work degrades dramatically when multi-view data lies on nonlinear manifolds. To circumvent this drawback, we propose Multi-view Common Component Discriminant Analysis (MvCCDA) to handle view discrepancy, discriminability and nonlinearity in a joint manner. Specifically, our MvCCDA incorporates supervised information and local geometric information into the common component extraction process to learn a discriminant common subspace and to discover the nonlinear structure embedded in multi-view data. We develop a kernel method of MvCCDA to further boost its performance. Beyond the kernel extension, optimization and complexity analysis of MvCCDA are also presented for completeness. Our MvCCDA is competitive with the state-of-the-art MvSL based methods on four benchmark datasets, demonstrating its superiority.
- May 15 2018 cs.CL arXiv:1805.05181v1 The goal of sentiment-to-sentiment "translation" is to change the underlying sentiment of a sentence while keeping its content. The main challenge is the lack of parallel data. To solve this problem, we propose a cycled reinforcement learning method that enables training on unpaired data by collaboration between a neutralization module and an emotionalization module. We evaluate our approach on two review datasets, Yelp and Amazon. Experimental results show that our approach significantly outperforms the state-of-the-art systems. In particular, the proposed method substantially improves the content preservation performance. The BLEU score is improved from 1.64 to 22.46 and from 0.56 to 14.06 on the two datasets, respectively.
- This paper considers an unmanned aerial vehicle (UAV)-enabled wireless power transfer (WPT) system, in which a UAV equipped with a directional antenna is dispatched to deliver wireless energy to charge two energy receivers (ERs) on the ground. Under this setup, we maximize the common (or minimum) energy received by the two ERs over a particular finite charging period, by jointly optimizing the altitude, trajectory, and transmit beamwidth of the UAV, subject to the UAV's maximum speed constraints, as well as the maximum/minimum altitude and beamwidth constraints. However, the common energy maximization is a non-convex optimization problem that is generally difficult to solve optimally. To tackle this problem, we first ignore the maximum UAV speed constraints and solve the relaxed problem optimally. The optimal solution to the relaxed problem reveals that the UAV should hover above two symmetric locations during the whole charging period, with the corresponding altitude and beamwidth optimized. Next, we study the original problem with the maximum UAV speed constraints considered, for which a heuristic hover-fly-hover trajectory design is proposed based on the optimal symmetric-location-hovering solution to the relaxed problem. Numerical results validate that thanks to the employment of directional antenna with adaptive beamwidth and altitude control, our proposed design significantly improves the common energy received by the two ERs, as compared to other benchmark schemes.
- May 10 2018 cs.CV arXiv:1805.03344v2 Person re-identification (ReID) aims to identify pedestrians observed from different camera views based on visual appearance. It is a challenging task due to large pose variations, complex background clutters and severe occlusions. Recently, human pose estimation by predicting joint locations has been largely improved in accuracy. It is reasonable to use pose estimation results for handling pose variations and background clutters, and such attempts have obtained great improvement in ReID performance. However, we argue that the pose information has not yet been fully exploited for person ReID. In this work, we introduce a novel framework called Attention-Aware Compositional Network (AACN) for person ReID. AACN consists of two main components: Pose-guided Part Attention (PPA) and Attention-aware Feature Composition (AFC). PPA is learned and applied to mask out undesirable background features in pedestrian feature maps. Furthermore, pose-guided visibility scores are estimated for body parts to deal with part occlusion in the proposed AFC module. Extensive experiments with ablation analysis show the effectiveness of our method, and state-of-the-art results are achieved on several public datasets, including Market-1501, CUHK03, CUHK01, SenseReID, CUHK03-NP and DukeMTMC-reID.
- Imaging in low light is challenging due to low photon count and low SNR. Short-exposure images suffer from noise, while long exposure can induce blur and is often impractical. A variety of denoising, deblurring, and enhancement techniques have been proposed, but their effectiveness is limited in extreme conditions, such as video-rate imaging at night. To support the development of learning-based pipelines for low-light image processing, we introduce a dataset of raw short-exposure low-light images, with corresponding long-exposure reference images. Using the presented dataset, we develop a pipeline for processing low-light images, based on end-to-end training of a fully-convolutional network. The network operates directly on raw sensor data and replaces much of the traditional image processing pipeline, which tends to perform poorly on such data. We report promising results on the new dataset, analyze factors that affect performance, and highlight opportunities for future work. The results are shown in the supplementary video at https://youtu.be/qWKUFK7MWvg
- This letter considers a mobile edge computing (MEC) system with one access point (AP) serving multiple users over a multicarrier channel, in the presence of a malicious eavesdropper. In this system, each user can execute its computation tasks by partitioning them into two parts, which are computed locally and offloaded to the AP, respectively. We exploit physical-layer security to protect the multiuser computation offloading from being overheard by the eavesdropper. Under this setup, we minimize the weighted sum-energy consumption of these users, subject to the newly imposed secrecy offloading rate constraints and the computation latency constraints, by jointly optimizing their computation and communication resource allocations. We propose an efficient algorithm to solve this problem.
- In this paper we propose a novel reinforcement learning based model for sequence tagging, referred to as MM-Tag. Inspired by the success and methodology of AlphaGo Zero, MM-Tag formalizes the problem of sequence tagging with a Monte Carlo tree search (MCTS) enhanced Markov decision process (MDP) model, in which the time steps correspond to the positions of words in a sentence from left to right, and each action corresponds to assigning a tag to a word. Two long short-term memory networks (LSTM) are used to summarize the past tag assignments and words in the sentence. Based on the outputs of the LSTMs, the policy for guiding the tag assignment and the value for predicting the tagging accuracy of the whole sentence are produced. The policy and value are then strengthened with MCTS, which takes the produced raw policy and value as inputs, simulates and evaluates the possible tag assignments at the subsequent positions, and outputs a better search policy for assigning tags. A reinforcement learning algorithm is proposed to train the model parameters. Our work is the first to apply the MCTS enhanced MDP model to the sequence tagging task. We show that MM-Tag can accurately predict the tags thanks to the exploratory decision making mechanism introduced by MCTS. Experimental results on a chunking benchmark show that MM-Tag outperforms state-of-the-art sequence tagging baselines including CRF and CRF with LSTM.
- We consider securing a distributed machine learning system wherein the data is kept confidential by its providers, who are recruited as workers to help the learner train a $d$-dimensional model. In each communication round, up to $q$ out of the $m$ workers suffer Byzantine faults; faulty workers are assumed to have complete knowledge of the system and can collude to behave arbitrarily adversarially against the learner. We assume that each worker keeps a local sample of size $n$. (Thus, the total number of data points is $N=nm$.) Of particular interest is the high-dimensional regime $d \gg n$. We propose a secured variant of the classical gradient descent method which can tolerate up to a constant fraction of Byzantine workers. We show that the iterates converge to an estimation error of $O(\sqrt{q/N} + \sqrt{d/N})$ in $O(\log N)$ rounds. The core of our method is a robust gradient aggregator based on the iterative filtering algorithm proposed by Steinhardt et al. for robust mean estimation. We establish a uniform concentration of the sample covariance matrix of gradients, and show that the aggregated gradient, as a function of the model parameter, converges uniformly to the true gradient function. As a by-product, we develop a new concentration inequality for sample covariance matrices of sub-exponential distributions, which might be of independent interest.
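The iterative filtering aggregator itself is fairly involved; as a stand-in, the sketch below uses a coordinate-wise median, a different but standard robust aggregator, only to illustrate the setting in which some workers submit arbitrarily corrupted gradients:

```python
import numpy as np

def coordinatewise_median_aggregate(worker_grads):
    """Robustly aggregate worker gradients. The paper's aggregator is an
    iterative-filtering estimator; the coordinate-wise median here is a
    simpler robust baseline shown only to illustrate Byzantine tolerance."""
    return np.median(np.stack(worker_grads), axis=0)

rng = np.random.default_rng(0)
true_grad = np.ones(5)
honest = [true_grad + 0.01 * rng.standard_normal(5) for _ in range(8)]
byzantine = [np.full(5, 1e6) for _ in range(2)]  # colluding faulty workers
agg = coordinatewise_median_aggregate(honest + byzantine)
print(np.allclose(agg, true_grad, atol=0.1))  # True: median resists outliers
```

With 2 of 10 workers Byzantine, the per-coordinate median always falls among honest values, whereas a plain mean would be dragged to roughly 2e5 per coordinate.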
- One of the most significant bottlenecks in training large scale machine learning models on a parameter server (PS) is the communication overhead, because the model gradients must be frequently exchanged between the workers and servers during the training iterations. Gradient quantization has been proposed as an effective approach to reducing the communication volume. One key issue in gradient quantization is setting the number of bits for quantizing the gradients. A small number of bits can significantly reduce the communication overhead but hurts the gradient accuracy, and vice versa. An ideal quantization method would dynamically balance the communication overhead and model accuracy, by adjusting the number of bits according to the knowledge learned from the immediate past training iterations. Existing methods, however, quantize the gradients either with a fixed number of bits, or with predefined heuristic rules. In this paper we propose a novel adaptive quantization method within the framework of reinforcement learning. The method, referred to as MQGrad, formalizes the selection of quantization bits as actions in a Markov decision process (MDP) whose state records the information collected from the past optimization iterations (e.g., the sequence of loss function values). During the training iterations of a machine learning algorithm, MQGrad continuously updates the MDP state according to the changes of the loss function. Based on this information, it learns to select the optimal actions (numbers of bits) to quantize the gradients. Experimental results based on a benchmark dataset showed that MQGrad can accelerate the learning of a large scale deep neural network while keeping its prediction accuracy.
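MQGrad's learned bit-selection policy is not detailed in the abstract; the sketch below shows only the underlying quantization step whose bit width such a policy would choose each round (the uniform min-max scheme is an assumption, used here to make the bits-vs-error tradeoff concrete):

```python
import numpy as np

def quantize(grad, bits):
    """Uniformly quantize a gradient vector to `bits` bits per entry.
    Only the integer codes (plus two floats) would be transmitted;
    the return value is the dequantized gradient seen by the server."""
    levels = 2 ** bits - 1
    lo, hi = grad.min(), grad.max()
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = np.round((grad - lo) / scale).astype(np.int64)  # what is sent
    return codes * scale + lo                               # reconstruction

rng = np.random.default_rng(1)
g = rng.standard_normal(1000)
errs = [np.abs(quantize(g, b) - g).max() for b in (2, 4, 8)]
for b, e in zip((2, 4, 8), errs):
    print(b, round(e, 4))  # more bits -> smaller quantization error
```

The monotone error decrease is exactly the tradeoff the abstract describes: fewer bits shrink the message but enlarge the gradient error, which is why an adaptive controller is useful.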
- We present a novel cross-view classification algorithm where the gallery and probe data come from different views. A popular approach to tackle this problem is the multi-view subspace learning (MvSL) that aims to learn a latent subspace shared by multi-view data. Despite promising results obtained on some applications, the performance of existing methods deteriorates dramatically when the multi-view data is sampled from nonlinear manifolds or suffers from heavy outliers. To circumvent this drawback, motivated by the Divide-and-Conquer strategy, we propose Multi-view Hybrid Embedding (MvHE), a unique method of dividing the problem of cross-view classification into three subproblems and building one model for each subproblem. Specifically, the first model is designed to remove view discrepancy, whereas the second and third models attempt to discover the intrinsic nonlinear structure and to increase discriminability in intra-view and inter-view samples respectively. The kernel extension is conducted to further boost the representation power of MvHE. Extensive experiments are conducted on four benchmark datasets. Our methods demonstrate overwhelming advantages against the state-of-the-art MvSL based cross-view classification approaches in terms of classification accuracy and robustness.
- Apr 18 2018 cs.CV arXiv:1804.06332v1 Autonomous driving has harsh requirements of small model size and energy efficiency, in order to enable the embedded system to achieve real-time on-board object detection. Recent deep convolutional neural network based object detectors have achieved state-of-the-art accuracy. However, such models are trained with numerous parameters, and their high computational costs and large storage prohibit deployment to memory- and computation-resource-limited systems. Low-precision neural networks are popular techniques for reducing the computation requirements and memory footprint. Among them, the binary weight neural network (BWN) is the extreme case, which quantizes the floating-point weights into just $1$ bit. BWNs are difficult to train and suffer from accuracy degradation due to the extremely low-bit representation. To address this problem, we propose a knowledge transfer (KT) method to aid the training of BWNs using a full-precision teacher network. We build DarkNet- and MobileNet-based binary weight YOLO-v2 detectors and conduct experiments on the KITTI benchmark for car, pedestrian and cyclist detection. The experimental results show that the proposed method maintains high detection accuracy while reducing the model size of DarkNet-YOLO from 257 MB to 8.8 MB and MobileNet-YOLO from 193 MB to 7.9 MB.
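The abstract does not pin down the binarization scheme; a common BWN choice (scaled sign, as in XNOR-Net-style methods, assumed here for illustration) replaces each float weight tensor with a single shared scale and one sign bit per weight:

```python
import numpy as np

def binarize_weights(w):
    """Binarize a float weight tensor to {-alpha, +alpha}, with alpha the
    mean absolute value. This is a standard BWN formulation, assumed here
    since the abstract does not specify the exact scheme."""
    alpha = np.abs(w).mean()
    return alpha * np.sign(w), alpha

rng = np.random.default_rng(2)
w = rng.standard_normal((3, 3))
wb, alpha = binarize_weights(w)
# Each weight is now 1 bit plus one shared float scale, vs 32 bits each:
print(np.allclose(np.abs(wb), alpha))  # True: only the sign varies
```

Storing one bit per weight plus a per-tensor scale is what yields roughly the 32x model-size reductions the abstract reports (257 MB to 8.8 MB).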
- We introduce the problem of hidden Hamiltonian cycle recovery, where there is an unknown Hamiltonian cycle in an $n$-vertex complete graph that needs to be inferred from noisy edge measurements. The measurements are independent and distributed according to $P_n$ for edges in the cycle and $Q_n$ otherwise. This formulation is motivated by a problem in genome assembly, where the goal is to order a set of contigs (genome subsequences) according to their positions on the genome using long-range linking measurements between the contigs. Computing the maximum likelihood estimate in this model reduces to a Traveling Salesman Problem (TSP). Despite the NP-hardness of TSP, we show that a simple linear programming (LP) relaxation, namely the fractional $2$-factor (F2F) LP, recovers the hidden Hamiltonian cycle with high probability as $n \to \infty$ provided that $\alpha_n - \log n \to \infty$, where $\alpha_n \triangleq -2 \log \int \sqrt{d P_n d Q_n}$ is the Rényi divergence of order $\frac{1}{2}$. This condition is information-theoretically optimal in the sense that, under mild distributional assumptions, $\alpha_n \geq (1+o(1)) \log n$ is necessary for any algorithm to succeed regardless of the computational cost. Departing from the usual proof techniques based on dual witness construction, the analysis relies on the combinatorial characterization (in particular, the half-integrality) of the extreme points of the F2F polytope. Represented as bicolored multi-graphs, these extreme points are further decomposed into simpler "blossom-type" structures for the large deviation analysis and counting arguments. Evaluation of the algorithm on real data shows improvements over existing approaches.
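The F2F relaxation itself is easy to state: minimize total edge cost subject to every vertex having fractional degree 2, with each edge variable in [0, 1]. A toy instance on $K_4$ (with made-up costs standing in for the log-likelihood ratios the paper would derive from the edge measurements) shows the LP returning the hidden cycle as an integral vertex solution:

```python
import numpy as np
from scipy.optimize import linprog

# Fractional 2-factor LP on K4: min c.x  s.t.  every vertex has fractional
# degree exactly 2 and 0 <= x_e <= 1. Costs are illustrative only.
edges = [(0, 1), (1, 2), (2, 3), (0, 3), (0, 2), (1, 3)]
cost = np.array([1, 1, 1, 1, 10, 10])  # hidden-cycle edges are cheap

A_eq = np.zeros((4, len(edges)))       # vertex-edge incidence matrix
for j, (u, v) in enumerate(edges):
    A_eq[u, j] = A_eq[v, j] = 1
b_eq = np.full(4, 2)                   # fractional degree 2 everywhere

res = linprog(cost, A_eq=A_eq, b_eq=b_eq, bounds=[(0, 1)] * len(edges))
print(np.round(res.x, 3))  # -> [1. 1. 1. 1. 0. 0.]: the hidden cycle 0-1-2-3
```

Since every feasible point has total edge mass 4, the unique minimizer puts all mass on the four cheap cycle edges; the paper's contribution is showing this integrality persists with high probability under the random measurement model.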
- This technical report provides the description and the derivation of a novel nonlinear unknown input and state estimation algorithm (NUISE) for mobile robots. The algorithm is designed for real-world robots with nonlinear dynamic models and subject to stochastic noises on sensing and actuation. Leveraging sensor readings and planned control commands, the algorithm detects and quantifies anomalies on both sensors and actuators. Later, we elaborate the dynamic models of two distinctive mobile robots for the purpose of demonstrating the application of NUISE. This report serves as a supplementary document for [1].
- Apr 10 2018 cs.CV arXiv:1804.02603v1 Most previous image denoising methods focus on additive white Gaussian noise (AWGN). However, the real-world noisy image denoising problem is attracting increasing attention with the advance of computer vision techniques. In order to promote the study of this problem and complement the existing real-world image denoising datasets, we construct a new benchmark dataset which contains comprehensive real-world noisy images of different natural scenes. These images are captured by different cameras under different camera settings. We evaluate the different denoising methods on our new dataset as well as previous datasets. Extensive experimental results demonstrate that the recently proposed methods designed specifically for realistic noise removal based on sparse or low-rank theories achieve better denoising performance and are more robust than other competing methods, and that the newly proposed dataset is more challenging. The constructed dataset of real photographs is publicly available at https://github.com/csjunxu/PolyUDataset for researchers to investigate new real-world image denoising methods. We will add more analysis on the noise statistics in the real photographs of our new dataset in the next version of this article.
- This paper studies unmanned aerial vehicle (UAV) enabled wireless communication, where a rotary-wing UAV is dispatched to send/collect data to/from multiple ground nodes (GNs). We aim to minimize the total UAV energy consumption, including both propulsion energy and communication related energy, while satisfying the communication throughput requirement of each GN. To this end, we first derive an analytical propulsion power consumption model for rotary-wing UAVs, and then formulate the energy minimization problem by jointly optimizing the UAV trajectory and communication time allocation among GNs, as well as the total mission completion time. The problem is difficult to solve optimally, as it is non-convex and involves infinitely many variables over time. To tackle this problem, we first consider the simple fly-hover-communicate design, where the UAV successively visits a set of hovering locations and communicates with one corresponding GN when hovering at each location. For this design, we propose an efficient algorithm to optimize the hovering locations and durations, as well as the flying trajectory connecting these hovering locations, by leveraging the travelling salesman problem (TSP) and convex optimization techniques. Next, we consider the general case where the UAV also communicates while flying. We propose a new path discretization method to transform the original problem into a discretized equivalent with a finite number of optimization variables, for which we obtain a locally optimal solution by applying the successive convex approximation (SCA) technique. Numerical results show the significant performance gains of the proposed designs over benchmark schemes, in achieving energy-efficient communication with rotary-wing UAVs.
- Apr 05 2018 cs.CV arXiv:1804.01422v1 In this paper, we propose a simple but effective semantic-based aggregation (SBA) method. The proposed SBA utilizes the discriminative filters of deep convolutional layers as semantic detectors. Moreover, we propose an effective unsupervised strategy to select some semantic detectors to generate the "probabilistic proposals", which highlight certain discriminative patterns of objects and suppress the noise of the background. The final global SBA representation can then be acquired by aggregating the regional representations weighted by the selected "probabilistic proposals" corresponding to various semantic content. Our unsupervised SBA is easy to generalize and achieves excellent performance on various tasks. We conduct comprehensive experiments and show that our unsupervised SBA outperforms the state-of-the-art unsupervised and supervised aggregation methods on image retrieval, place recognition and cloud classification.
- Apr 02 2018 cs.NI arXiv:1803.11449v1 A network super point is a special kind of host which plays an important role in network management and security. For a core network, detecting super points in real time is a burdensome task because it requires plenty of computing resources to keep up with the high speed of packets. Previous works try to solve this problem by using expensive memory, such as static random access memory, and multiple CPU cores. But the number of cores in a CPU is small, and each core has a high price. In this work, we use a popular parallel computing platform, the graphics processing unit (GPU), to mine a core network's super points. We propose a double-direction hash function group which can map hosts randomly and restore them from a dense structure. Because of the high randomness and simple computation of the double-direction hash functions, our algorithm reduces memory consumption to less than one-fourth of that of other algorithms. Thanks to this small memory requirement, a low-cost GPU, worth only 200 dollars, is fast enough to deal with a high-speed network of up to 750 Gb/s. No other algorithm can cope with such high-bandwidth traffic as accurately as ours on such a cheap platform. Experiments on traffic collected from a core network demonstrate the advantage of our efficient algorithm.
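The double-direction hash function group is not specified in the abstract; the sketch below shows only the core invertibility idea it relies on: multiplication by an odd constant is a bijection on 32-bit values, so a host mapped into a dense structure can be restored without storing an IP list. The specific constant and the single-function form are assumptions:

```python
# A reversible 32-bit hash: multiplying by an odd constant modulo 2**32
# is a bijection, so the hashed value maps back to the original IP.
# (Illustrative only; the paper uses a *group* of such functions.)
MASK = (1 << 32) - 1
A = 2654435761               # odd multiplier (Knuth's constant)
A_INV = pow(A, -1, 1 << 32)  # modular inverse exists because A is odd

def h(ip32):
    """Forward direction: scatter an IPv4 address pseudo-randomly."""
    return (ip32 * A) & MASK

def h_inv(y):
    """Backward direction: recover the address from its hash."""
    return (y * A_INV) & MASK

ip = 0x0A000001  # 10.0.0.1
print(hex(h_inv(h(ip))))  # -> '0xa000001': the host is restored exactly
```

Being able to run the hash in both directions is what removes the need to store the raw host addresses alongside the counting structure.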
- Mar 30 2018 cs.DC arXiv:1803.11036v1 A sliding super point is a special host, defined under a sliding time window, that a huge number of other hosts contact. It plays an important role in network security and management. But detecting sliding super points in real time in today's high-speed networks, which contain several distributed routers, is a hard task. Distributed sliding super point detection requires an algorithm that can estimate the number of contacting hosts incrementally, scan packets faster than their arrival rate, and reconstruct the sliding super points at the end of a time period. No existing algorithm satisfies these three requirements simultaneously. To solve this problem, this paper proposes the first distributed sliding super point detection algorithm running on a GPU. The advantage of this algorithm comes from a novel sliding estimator, which estimates the number of contacting hosts incrementally under a sliding window, and a set of reversible hash functions, by which the sliding super points can be regained without storing additional data such as an IP list. The algorithm has two main procedures, packet scanning and sliding super point reconstruction, both of which run in parallel without any data-reading conflicts. When deployed on a low-cost GPU, the algorithm can handle traffic with bandwidth as high as 680 Gb/s. Real-world core network traffic is used to evaluate the performance of this sliding super point detection algorithm on a cheap GPU, an Nvidia GTX950 with 4 GB of graphics memory. Experiments comparing it with other algorithms under a discrete time window show that it has the highest accuracy. Under a sliding time window, it performs the same as under a discrete time window, where no other algorithm can work.
- Mar 29 2018 cs.NI arXiv:1803.10369v1 A super point is a special host in a network which communicates with many other hosts in a certain time period. The number of hosts contacting a super point is called its cardinality. Cardinality estimation plays an important role in network management and security. All existing works focus on estimating a super point's cardinality under a discrete time window. But a discrete time window causes great delay, and the accuracy of the estimate depends on where the window starts. A sliding time window, which moves forward by a small slice each time, offers a more accurate and timely scale for monitoring super point cardinality. On the other hand, cardinality estimation under a sliding time window is more difficult, because it requires an algorithm to record cardinalities incrementally and report them immediately at the end of the sliding duration. This paper solves this problem by devising SRLA, an algorithm that works under a sliding time window. SRLA records host cardinalities in a novel structure that can be updated incrementally. In order to reduce the cardinality estimation time at the end of every sliding time window, SRLA generates a super point candidate list while scanning packets and calculates the cardinalities of the hosts in the candidate list only. It also has the ability to run in parallel to deal with high-speed networks at line speed. This paper gives the way to deploy SRLA on a common GPU. Experiments on real-world traffic with 40 GB/s bandwidth show that SRLA successfully estimates super point cardinalities within 100 milliseconds per sliding time window when running on a low-cost Nvidia GPU, a GTX650 with 1 GB of memory. SRLA's estimation time is much smaller than that of other algorithms, which consume more than 2000 milliseconds under a discrete time window.
- In this paper, we focus on solving a distributed convex optimization problem in a network, where each agent has its own convex cost function and the goal is to minimize the sum of the agents' cost functions while obeying the network connectivity structure. In order to minimize the sum of the cost functions, we consider a new distributed gradient-based method where each node maintains two estimates, namely, an estimate of the optimal decision variable and an estimate of the gradient for the average of the agents' objective functions. From the viewpoint of an agent, the information about the decision variable is pushed to the neighbors, while the information about the gradients is pulled from the neighbors (hence giving the name "push-pull gradient method"). The method unifies the algorithms with different types of distributed architecture, including decentralized (peer-to-peer), centralized (master-slave), and semi-centralized (leader-follower) architecture. We show that the algorithm converges linearly for strongly convex and smooth objective functions over a directed static network. In our numerical test, the algorithm performs well even for time-varying directed networks.
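The two estimates each agent maintains can be sketched on a toy quadratic objective. This is a simplification, assuming an undirected network with a single doubly stochastic mixing matrix `W`, whereas the paper's push-pull method uses a row-stochastic matrix for pulling decision variables and a column-stochastic matrix for pushing gradients on directed graphs:

```python
import numpy as np

def push_pull_quadratic(b, W, alpha=0.2, iters=500):
    """Gradient-tracking sketch of the push-pull idea for minimizing
    f(x) = sum_i 0.5 * (x - b[i])^2 over a network with mixing matrix W.
    Each agent keeps a decision estimate x[i] (mixed with neighbors) and
    a tracker y[i] of the average gradient (also mixed with neighbors)."""
    b = np.asarray(b, dtype=float)
    grad = lambda x: x - b                  # local gradients of 0.5*(x - b_i)^2
    x = np.zeros(len(b))
    y = grad(x)                             # tracker starts at local gradients
    for _ in range(iters):
        x_next = W @ x - alpha * y          # consensus step + step along tracker
        y = W @ y + grad(x_next) - grad(x)  # track the average gradient
        x = x_next
    return x
```

On a 4-node ring with self-weight 1/2 and neighbor weights 1/4, every agent's estimate converges linearly to the minimizer of the sum, i.e. the average of the `b[i]`.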
- Mar 20 2018 cs.CV arXiv:1803.06598v1 Cascaded Regression (CR) based methods have been proposed to solve the facial landmark detection problem; they learn a series of descent directions through multiple cascaded regressors trained separately in coarse and fine stages. They outperform traditional gradient-descent-based methods in both accuracy and running speed. However, cascaded regression is not robust enough, because each regressor's training data comes from the output of the previous regressor. Moreover, training multiple regressors requires substantial computing resources, especially for deep-learning-based methods. In this paper, we develop a Self-Iterative Regression (SIR) framework to improve model efficiency. Only one self-iterative regressor is trained to learn the descent directions for samples from coarse to fine stages, and parameters are iteratively updated by the same regressor. Specifically, we propose a Landmarks-Attention Network (LAN) as our regressor, which concurrently learns features around each landmark and obtains the holistic location increment. As a result, not only are the remaining regressors removed, simplifying the training process, but the number of model parameters is also significantly decreased. Experiments demonstrate that with only 3.72M model parameters, our proposed method achieves state-of-the-art performance.
- Mar 16 2018 cs.CV arXiv:1803.05471v1 Aim: Early detection and correct diagnosis of lung cancer are the most important steps in improving patient outcomes. This study aims to assess which deep learning models perform best in lung cancer diagnosis. Methods: Non-small cell lung carcinoma and small cell lung carcinoma biopsy specimens were consecutively obtained and stained. The specimen slides were diagnosed by two experienced pathologists (over 20 years of experience). Several deep learning models were trained to discriminate cancer from non-cancer biopsies. Result: The deep learning models achieve reasonable AUCs, from 0.8810 to 0.9119. Conclusion: Deep learning analysis could help speed up the detection process for whole-slide images (WSI) while maintaining a detection rate comparable to that of a human observer.
- As one of the most fascinating machine learning techniques, the deep neural network (DNN) has demonstrated excellent performance in various intelligent tasks such as image classification. DNNs achieve such performance, to a large extent, by performing expensive training over huge volumes of training data. To reduce the data storage and transfer overhead in smart resource-limited Internet-of-Things (IoT) systems, effective data compression is a "must-have" feature before transferring real-time produced datasets for training or classification. While there are many well-known image compression approaches (such as JPEG), we find for the first time that a human-vision-based image compression approach such as JPEG is not an optimal solution for DNN systems, especially at high compression ratios. To this end, we develop an image compression framework tailored for DNN applications, named "DeepN-JPEG", to embrace the deeply cascaded information processing mechanism of DNN architectures. Extensive experiments, based on the "ImageNet" dataset with various state-of-the-art DNNs, show that "DeepN-JPEG" can achieve a ~3.5x higher compression rate than the popular JPEG solution while maintaining the same image recognition accuracy, demonstrating its great potential for storage and power efficiency in DNN-based smart IoT system design.
- This paper studies a new mobile edge computing (MEC) setup where an unmanned aerial vehicle (UAV) is served by cellular ground base stations (GBSs) for computation offloading. The UAV flies between a given pair of initial and final locations, during which it needs to accomplish certain computation tasks by offloading them to some selected GBSs along its trajectory for parallel execution. Under this setup, we aim to minimize the UAV's mission completion time by optimizing its trajectory jointly with the computation offloading scheduling, subject to the UAV's maximum speed constraint and the computation capacity constraints at the GBSs. The joint UAV trajectory and computation offloading optimization problem is, however, non-convex and thus difficult to solve optimally. To tackle this problem, we propose an efficient algorithm to obtain a high-quality suboptimal solution. Numerical results show that the proposed design significantly reduces the UAV's mission completion time compared to benchmark schemes.
- Laser power has become a viable solution for providing convenient and sustainable energy supply to unmanned aerial vehicles (UAVs). In this paper, we study a laser-powered UAV wireless communication system, where a laser transmitter sends laser beams to charge a fixed-wing UAV in flight, and the UAV uses the harvested laser energy to communicate with a ground station. To maintain the UAV's sustainable operation, its total energy consumption cannot exceed that harvested from the laser transmitter. Under such a laser energy harvesting constraint, we maximize the downlink communication throughput from the UAV to the ground station over a finite time duration, by jointly optimizing the UAV's trajectory and its transmit power allocation. However, due to the complicated UAV energy consumption model, this problem is non-convex and difficult to solve. To tackle the problem, we first consider a special case with a double-circular UAV trajectory that balances the tradeoff between laser energy harvesting and wireless communication performance at the UAV. Next, based on the obtained double-circular trajectory, we propose an efficient solution to the general problem by applying the techniques of alternating optimization and sequential convex programming (SCP). Finally, numerical results are provided to validate the communication throughput performance of the proposed design.
- Feb 27 2018 cs.CL arXiv:1802.08970v1 Generating plausible and fluent sentences with desired properties has long been a challenge. Most recent works use recurrent neural networks (RNNs) and their variants to predict the next word given the previous sequence and a target label. In this paper, we propose a novel framework that generates constrained sentences via Gibbs sampling. Candidate sentences are revised and updated iteratively, with sampled new words replacing old ones. Our experiments show the effectiveness of the proposed method in generating plausible and diverse sentences.
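The revise-and-update loop can be illustrated with a toy Gibbs sampler. Here `score` is a hypothetical stand-in for the neural language model and property constraint the paper would use; everything about this sketch is illustrative, not the paper's model:

```python
import math
import random

def gibbs_revise(words, vocab, score, sweeps=60, temp=0.1, seed=0):
    """Toy constrained generation: repeatedly resample one position at a
    time, holding the rest fixed, with probability proportional to
    exp(score / temp). Lower temperatures concentrate mass on
    higher-scoring sentences."""
    rng = random.Random(seed)
    words = list(words)
    for _ in range(sweeps):
        pos = rng.randrange(len(words))
        # score every candidate word for this position given the rest
        cands = [words[:pos] + [w] + words[pos + 1:] for w in vocab]
        weights = [math.exp(score(c) / temp) for c in cands]
        # sample a replacement word proportionally to its weight
        r = rng.random() * sum(weights)
        for w, wt in zip(vocab, weights):
            r -= wt
            if r <= 0:
                words[pos] = w
                break
    return words
```

With a peaked (low-temperature) distribution the chain drifts toward sentences satisfying the constraint while still sampling diverse alternatives among equally scored words.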
- We propose a novel locally adaptive learning estimator for enhancing the inter- and intra-class discriminative capabilities of deep neural networks, which can be used as an improved loss layer for semantic image segmentation tasks. Most loss layers compute pixel-wise cost between feature maps and ground truths, ignoring spatial layouts and interactions between neighboring pixels of the same object category, so networks cannot be effectively sensitive to intra-class connections. Stride by stride, our method first applies an adaptive pooling filter to the predicted feature maps, merging the predicted distributions over small groups of neighboring pixels of the same category, and then computes the cost between each merged distribution vector and its category label. This design draws groups of neighboring predictions from the same category into the estimate of prediction correctness for that category, and hence trains networks to be more sensitive to regional connections between adjacent pixels based on their categories. In experiments on the Pascal VOC 2012 segmentation dataset, consistently improved results show that our proposed approach achieves better segmentation masks than previous counterparts.
- This paper studies a multi-user cooperative mobile-edge computing (MEC) system, in which a local mobile user can offload intensive computation tasks to multiple nearby edge devices serving as helpers for remote execution. We focus on the scenario where the local user has a number of independent tasks that can be executed in parallel but cannot be further partitioned. We consider a time division multiple access (TDMA) communication protocol, in which the local user can offload computation tasks to the helpers and download results from them over pre-scheduled time slots. Under this setup, we minimize the local user's computation latency by optimizing the task assignment jointly with the time and power allocations, subject to individual energy constraints at the local user and the helpers. However, the joint task assignment and wireless resource allocation problem is a mixed-integer non-linear program (MINLP) that is hard to solve optimally. To tackle this challenge, we first relax it into a convex problem, and then propose an efficient suboptimal solution based on the optimal solution to the relaxed convex problem. Finally, numerical results show that our proposed joint design significantly reduces the local user's computation latency, as compared against other benchmark schemes that design the task assignment separately from the offloading/downloading resource allocations and local execution.
- Low-rank matrix completion (MC) has achieved great success in many real-world data applications. A latent feature model formulation is usually employed and, to improve prediction performance, the similarities between latent variables can be exploited by pairwise learning, e.g., the graph regularized matrix factorization (GRMF) method. However, existing GRMF approaches often use a squared L2 norm to measure the pairwise difference, which may be overly influenced by dissimilar pairs and lead to inferior prediction. To fully empower pairwise learning for matrix completion, we propose a general optimization framework that allows a rich class of (non-)convex pairwise penalty functions. A new and efficient algorithm is further developed to uniformly solve the optimization problem, with a theoretical convergence guarantee. In an important situation where the latent variables form a small number of subgroups, its statistical guarantee is also fully characterized. In particular, we theoretically characterize the complexity-regularized maximum likelihood estimator, as a special case of our framework. It has a better error bound when compared to the standard trace-norm regularized matrix completion. We conduct extensive experiments on both synthetic and real datasets to demonstrate the superior performance of this general framework.
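A minimal sketch of the baseline this framework generalizes: matrix factorization with the squared-L2 graph (pairwise) penalty of GRMF, fit by plain gradient descent. Swapping the penalty gradient for a non-convex one (e.g. a capped or SCAD-type penalty) is where the paper's framework departs; all names and hyperparameters here are illustrative:

```python
import numpy as np

def grmf(M, mask, edges, rank=2, lam=0.01, lr=0.05, iters=3000, seed=0):
    """Graph-regularized matrix factorization sketch.
    Loss: ||mask * (M - U V^T)||_F^2 + lam * sum_(i,j) ||U_i - U_j||^2
    over the given similarity edges between rows. The squared-L2
    pairwise penalty is the baseline the paper generalizes; replacing
    the penalty gradient changes only the inner loop below."""
    rng = np.random.default_rng(seed)
    m, n = M.shape
    U = 0.1 * rng.standard_normal((m, rank))
    V = 0.1 * rng.standard_normal((n, rank))
    for _ in range(iters):
        R = mask * (U @ V.T - M)          # residual on observed entries only
        gU = 2 * R @ V
        gV = 2 * R.T @ U
        for i, j in edges:                # pairwise (graph) penalty gradient
            d = U[i] - U[j]
            gU[i] += 2 * lam * d
            gU[j] -= 2 * lam * d
        U -= lr * gU
        V -= lr * gV
    return U, V
```

A non-convex penalty would down-weight the pull between latent vectors that are far apart, so dissimilar pairs no longer dominate the fit.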
- Redundancy is a fundamental characteristic of many biological processes such as those in the genetic, visual, muscular and nervous system; yet its function has not been fully understood. The conventional interpretation of redundancy is that it serves as a fault-tolerance mechanism, which leads to redundancy's de facto application in man-made systems for reliability enhancement. On the contrary, our previous works have demonstrated an example where redundancy can be engineered solely for enhancing other aspects of the system, namely accuracy and precision. This design was inspired by the binocular structure of the human vision which we believe may share a similar operation. In this paper, we present a unified theory describing how such utilization of redundancy is feasible through two complementary mechanisms: representational redundancy (RPR) and entangled redundancy (ETR). Besides the previous works, we point out two additional examples where our new understanding of redundancy can be applied to justify a system's superior performance. One is the human musculoskeletal system (HMS) - a biological instance, and one is the deep residual neural network (ResNet) - an artificial counterpart. We envision that our theory would provide a framework for the future development of bio-inspired redundant artificial systems as well as assist the studies of the fundamental mechanisms governing various biological processes.
- Sensing is the process of deriving signals from the environment that allows artificial systems to interact with the physical world. The Shannon theorem specifies the maximum rate at which information can be acquired. However, this upper bound is hard to achieve in many man-made systems. Biological visual systems, on the other hand, have highly efficient signal representation and processing mechanisms that allow precise sensing. In this work, we argue that redundancy is one of the critical characteristics behind such superior performance. We show the architectural advantages of utilizing redundant sensing, including correction of mismatch error and significant precision enhancement. As a proof-of-concept demonstration, we have designed a heuristic-based analog-to-digital converter - a zero-dimensional quantizer. Through Monte Carlo simulation with the error probability distribution known a priori, performance approaching the Shannon limit is feasible. In actual measurements without knowledge of the error distribution, we observe at least 2 bits of extra precision. The results may also help explain biological processes, including the dominance of binocular vision, the functional roles of fixational eye movements, and the structural mechanisms allowing hyperacuity.
- In this paper we study the differentially private Empirical Risk Minimization (ERM) problem in different settings. For smooth (strongly) convex loss functions with or without (non-)smooth regularization, we give algorithms that achieve either optimal or near-optimal utility bounds with lower gradient complexity than previous work. For ERM with a smooth convex loss function in the high-dimensional ($p\gg n$) setting, we give an algorithm that achieves the upper bound with lower gradient complexity than previous ones. Finally, we generalize the expected excess empirical risk from convex loss functions to non-convex ones satisfying the Polyak-Lojasiewicz condition and give a tighter upper bound on the utility than that in \cite{ijcai2017-548}.
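For intuition, a generic gradient-perturbation recipe for private ERM (clip per-example gradients, add Gaussian noise to the sum) can be sketched as below. This is not the paper's algorithm, and the noise scale `sigma` is a placeholder; a real deployment would calibrate it to a target privacy budget:

```python
import numpy as np

def dp_gd(X, y, sigma=0.5, clip=1.0, lr=0.1, iters=100, seed=0):
    """Noisy (Gaussian-mechanism) gradient descent for logistic ERM, a
    generic gradient-perturbation sketch. Per-example gradients are
    clipped to norm `clip`, summed, and perturbed with Gaussian noise
    of scale sigma * clip before the update."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    w = np.zeros(p)
    for _ in range(iters):
        margins = y * (X @ w)
        # per-example logistic-loss gradients, shape (n, p)
        g = (-y / (1 + np.exp(margins)))[:, None] * X
        norms = np.linalg.norm(g, axis=1, keepdims=True)
        g = g / np.maximum(1.0, norms / clip)      # clip per-example norm
        noise = sigma * clip * rng.standard_normal(p)
        w -= lr * (g.sum(axis=0) + noise) / n      # noisy average gradient step
    return w
```

Because the noise is added once per iteration to the whole summed gradient, its per-example effect shrinks as `n` grows, which is the usual source of the utility-privacy tradeoffs such papers bound.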
- In this paper, we study the Empirical Risk Minimization problem in the non-interactive local model of differential privacy. In the case of constant or low dimensionality ($p\ll n$), we first show that if the ERM loss function is $(\infty, T)$-smooth, then we can avoid a dependence of the sample complexity, to achieve error $\alpha$, on the exponential of the dimensionality $p$ with base $1/\alpha$ (i.e., $\alpha^{-p}$), which answers a question in [smith 2017 interaction]. Our approach is based on polynomial approximation. We then propose player-efficient algorithms with $1$-bit communication complexity and $O(1)$ computation cost per player; the error bound is asymptotically the same as the original one. With additional assumptions, we also give a server-efficient algorithm. Next, we consider the high-dimensional case ($n\ll p$) and show that if the loss function is a convex generalized linear function, then we can obtain an error bound that depends on the Gaussian width of the underlying constrained set instead of on $p$, which is lower than that in [smith 2017 interaction].
- Feb 13 2018 cs.CV arXiv:1802.03617v1 Bronchoscopy inspection, as a follow-up procedure to radiological imaging, plays a key role in diagnosing lung disease and determining treatment plans. Doctors need to decide in a timely manner whether to biopsy a patient during bronchoscopy. However, they also need to be very selective with biopsies, as biopsies may cause uncontrollable, life-threatening bleeding of the lung tissue. To help doctors be more selective with biopsies and to provide a second opinion on diagnosis, in this work we propose a computer-aided diagnosis (CAD) system for lung diseases, including cancers and tuberculosis (TB). The system is based on transfer learning, for which we propose a novel method: sentential fine-tuning. Compared to traditional fine-tuning methods, our method achieves the best performance: an overall accuracy of 77.0% on a dataset of 81 normal cases, 76 tuberculosis cases and 277 lung cancer cases, versus 73% and 68% for the traditional transfer learning methods. The detection accuracies of our method for cancer, TB and normal cases are 87%, 54% and 91%, respectively. This indicates that the CAD system has the potential to improve lung disease diagnosis accuracy in bronchoscopy and might help doctors be more selective with biopsies.
- In this paper, we revisit the large-scale constrained linear regression problem and propose faster methods based on some recent developments in sketching and optimization. Our algorithms combine (accelerated) mini-batch SGD with a new method called two-step preconditioning to achieve an approximate solution with a time complexity lower than that of the state-of-the-art techniques for the low precision case. Our idea can also be extended to the high precision case, which gives an alternative implementation to the Iterative Hessian Sketch (IHS) method with significantly improved time complexity. Experiments on benchmark and synthetic datasets suggest that our methods indeed outperform existing ones considerably in both the low and high precision cases.
- Feb 06 2018 cs.CL arXiv:1802.01345v2 Existing text generation methods tend to produce repeated and "boring" expressions. To tackle this problem, we propose a new text generation model, called Diversity-Promoting Generative Adversarial Network (DP-GAN). The proposed model assigns low reward for repeated text and high reward for "novel" text, encouraging the generator to produce diverse and informative text. Moreover, we propose a novel language-model based discriminator, which can better distinguish novel text from repeated text without the saturation problem of existing classifier-based discriminators. The experimental results on review generation and dialogue generation tasks demonstrate that our method can generate substantially more diverse and informative text than existing baseline methods. The code is available at https://github.com/lancopku/DPGAN
- Jan 23 2018 cs.CV arXiv:1801.06742v2 Sufficient training data is normally required to train deep models. However, the number of pedestrian images per ID in person re-identification (re-ID) datasets is usually limited, since manual annotation is required across multiple camera views. To produce more training data for deep models, a generative adversarial network (GAN) can be leveraged to generate samples for person re-ID. However, samples generated by a vanilla GAN usually lack labels. In this paper, we therefore propose a virtual label, the Multi-pseudo Regularized Label (MpRL), and assign it to the generated images. With MpRL, the generated samples are used to supplement real training data when training a deep model in a semi-supervised fashion. Considering the data bias between generated and real samples, MpRL assigns different contributions from the predefined training classes. These contribution-based virtual labels are automatically assigned to generated samples to reduce ambiguous predictions during training. Meanwhile, MpRL relies only on the predefined training classes, without using extra classes. Furthermore, to reduce over-fitting, MpRL regularizes the learning process. To verify the effectiveness of MpRL, two state-of-the-art convolutional neural networks (CNNs) are adopted in our experiments. Experiments demonstrate that by assigning MpRL to generated samples, we can further improve person re-ID performance on three datasets, i.e., Market-1501, DukeMTMC-reID, and CUHK03. The proposed method obtains +6.29%, +6.30% and +5.58% improvements in rank-1 accuracy over a strong CNN baseline, respectively, and outperforms state-of-the-art methods.
- Jan 19 2018 cs.DC arXiv:1801.05868v1 Mobile Edge Computing (MEC) pushes computing functionalities away from the centralized cloud to the network edge, thereby meeting the latency requirements of many emerging mobile applications and saving backhaul network bandwidth. Although many existing works have studied computation offloading policies, service caching is an equally, if not more, important design topic of MEC, yet it receives much less attention. Service caching refers to caching application services and their related databases/libraries in the edge server (e.g. an MEC-enabled BS), thereby enabling the corresponding computation tasks to be executed. Because only a small number of application services can be cached in a resource-limited edge server at the same time, which services to cache must be judiciously decided to maximize edge computing performance. In this paper, we investigate the extremely compelling but much less studied problem of dynamic service caching in MEC-enabled dense cellular networks. We propose an efficient online algorithm, called OREO, which jointly optimizes dynamic service caching and task offloading to address a number of key challenges in MEC systems, including service heterogeneity, unknown system dynamics, spatial demand coupling and decentralized coordination. Our algorithm is developed based on Lyapunov optimization and Gibbs sampling, works online without requiring future information, and achieves provably close-to-optimal performance. Simulation results show that our algorithm can effectively reduce computation latency for end users while keeping energy consumption low.
- Graph embedding has been proven to be efficient and effective in facilitating graph analysis. In this paper, we present a novel spectral framework called NOn-Backtracking Embedding (NOBE), which offers a new perspective that organizes graph data at a deep level by tracking the flow traversing on the edges with backtracking prohibited. Further, by analyzing the non-backtracking process, a technique called graph approximation is devised, which provides a channel to transform the spectral decomposition on an edge-to-edge matrix to that on a node-to-node matrix. Theoretical guarantees are provided by bounding the difference between the corresponding eigenvalues of the original graph and its graph approximation. Extensive experiments conducted on various real-world networks demonstrate the efficacy of our methods on both macroscopic and microscopic levels, including clustering and structural hole spanner detection.
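The edge-to-edge operator underlying such non-backtracking embeddings is straightforward to construct; this sketch (dense, for small graphs only) builds the matrix whose eigenvectors a NOBE-style method would then decompose. The construction is standard; details of the paper's embedding and graph-approximation steps are not reproduced here:

```python
import numpy as np

def non_backtracking_matrix(edges):
    """Build the non-backtracking (edge-to-edge) operator B for an
    undirected graph: B[(u->v), (v->w)] = 1 iff w != u, i.e. a walk may
    continue through v to any neighbor except straight back to u.
    Returns the dense matrix and the list of directed arcs indexing it."""
    directed = []
    for u, v in edges:
        directed += [(u, v), (v, u)]     # each undirected edge -> two arcs
    index = {e: i for i, e in enumerate(directed)}
    B = np.zeros((len(directed), len(directed)))
    for (u, v) in directed:
        for (a, b) in directed:
            if a == v and b != u:        # continue the walk, no U-turn
                B[index[(u, v)], index[(a, b)]] = 1.0
    return B, directed
```

Row `(u, v)` has exactly `deg(v) - 1` ones, reflecting the prohibited backtrack; spectral decomposition of this (non-symmetric) matrix is what the graph-approximation technique transfers back to a node-to-node matrix.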
- This paper studies an unmanned aerial vehicle (UAV)-enabled wireless powered communication network (WPCN), in which a UAV is dispatched as a mobile access point (AP) to serve a set of ground users periodically. The UAV employs the radio frequency (RF) wireless power transfer (WPT) to charge the users in the downlink, and the users use the harvested RF energy to send independent information to the UAV in the uplink. Unlike the conventional WPCN with fixed APs, the UAV-enabled WPCN can exploit the mobility of the UAV via trajectory design, jointly with the wireless resource allocation optimization, to maximize the system throughput. In particular, we aim to maximize the uplink common (minimum) throughput among all ground users over a finite UAV flight period, subject to its maximum speed constraint and the users' energy neutrality constraints. The resulting problem is non-convex and thus difficult to solve optimally. To tackle this challenge, we first consider an ideal case without the UAV's maximum speed constraint, and obtain the optimal solution to the relaxed problem. The optimal solution shows that the UAV should successively hover above a finite number of ground locations for downlink WPT, as well as above each of the ground users for uplink communication. Next, we consider the general problem with the UAV's maximum speed constraint. Based on the above multi-location-hovering solution, we first propose an efficient successive hover-and-fly trajectory design, jointly with the downlink and uplink wireless resource allocation, and then propose a locally optimal solution by applying the techniques of alternating optimization and successive convex programming (SCP). Numerical results show that the proposed UAV-enabled WPCN achieves significant throughput gains over the conventional WPCN with fixed-location AP.
- Suppose a database containing $M$ records is replicated across $N$ servers, and a user wants to privately retrieve one record by accessing the servers such that the identity of the retrieved record is kept secret from any up to $T$ servers. A scheme designed for this purpose is called a $T$-private information retrieval ($T$-PIR) scheme. Three metrics are of concern for PIR schemes: (1) rate, indicating the amount of retrieved information per unit of downloaded data, whose highest achievable value is characterized by the capacity; (2) sub-packetization, reflecting the implementation complexity of linear schemes; (3) field size. We consider linear schemes over a finite field. In this paper, a general $T$-PIR scheme simultaneously attaining optimality in almost all three metrics is presented. Specifically, we design a linear capacity-achieving $T$-PIR scheme with sub-packetization $dn^{M-1}$ over a finite field $\mathbb{F}_q$, $q\geq N$. The sub-packetization $dn^{M-1}$, where $d={\rm gcd}(N,T)$ and $n=N/d$, has been proved to be optimal in our previous work. The field size of all existing capacity-achieving $T$-PIR schemes must be larger than $Nt^{M-2}$ where $t=T/d$, while our scheme reduces the field size by an exponential factor.
- Deep learning has achieved impressive results on many problems. However, it requires a high degree of expertise or a lot of experience to tune the hyperparameters well, and such a manual tuning process is likely to be biased. Moreover, it is not practical to try out as many hyperparameter configurations in deep learning as in other machine learning scenarios, because evaluating each single configuration means training a deep neural network, which usually takes quite a long time. The Hyperband algorithm achieves state-of-the-art performance on various hyperparameter optimization problems in deep learning. However, Hyperband does not utilize the history of previously explored hyperparameter configurations, so the solution found can be suboptimal. We propose to combine Hyperband with Bayesian optimization (which does not ignore history when sampling the next trial configuration). Experimental results show that our combined approach is superior to other hyperparameter optimization approaches, including the Hyperband algorithm.
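Successive halving, the subroutine Hyperband repeats across brackets, is easy to sketch. In the combination this abstract proposes, the random `sample_config` below would be replaced by a Bayesian-optimization sampler fitted to the evaluation history; all names here are illustrative:

```python
def successive_halving(sample_config, evaluate, n=27, min_budget=1, eta=3):
    """Successive halving, the core of Hyperband: evaluate many
    configurations on a small budget, keep the best 1/eta of them, and
    grow the budget by a factor eta each round until one survives.
    `evaluate(config, budget)` should return a loss (lower is better)."""
    configs = [sample_config() for _ in range(n)]
    budget = min_budget
    while len(configs) > 1:
        scored = sorted(configs, key=lambda c: evaluate(c, budget))
        configs = scored[:max(1, len(configs) // eta)]  # keep the best 1/eta
        budget *= eta                                   # ...on a larger budget
    return configs[0]
```

In hyperparameter tuning, `budget` would be training epochs; cheap low-budget rounds weed out poor configurations before the expensive full-budget training runs.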
- A fundamental challenge in wireless multicast has been how to simultaneously achieve high-throughput and low-delay for reliably serving a large number of users. In this paper, we show how to harness substantial throughput and delay gains by exploiting multi-channel resources. We develop a new scheme called Multi-Channel Moving Window Codes (MC-MWC) for multi-channel multi-session wireless multicast. The salient features of MC-MWC are three-fold. (i) High throughput: we show that MC-MWC achieves order-optimal throughput in the many-user many-channel asymptotic regime. Moreover, the number of channels required by a conventional channel-allocation based scheme is shown to be doubly-exponentially larger than that required by MC-MWC. (ii) Low delay: using large deviations theory, we show that the delay of MC-MWC decreases linearly with the number of channels, while the delay reduction of conventional schemes is no more than a finite constant. (iii) Low feedback overhead: the feedback overhead of MC-MWC is a constant that is independent of both the number of receivers in each session and the number of sessions in the network. Finally, our trace-driven simulation and numerical results validate the analytical results and show that the implementation complexity of MC-MWC is low.
- We consider an $\ell_2$-regularized non-convex optimization problem for recovering signals from their noisy phaseless observations. We design and study the performance of a message passing algorithm that aims to solve this optimization problem. We consider the asymptotic setting $m,n \rightarrow \infty$, $m/n \rightarrow \delta$ and obtain sharp performance bounds, where $m$ is the number of measurements and $n$ is the signal dimension. We show that for complex signals the algorithm can perform accurate recovery with only $m= \left(\frac{64}{\pi^2}-4\right)n \approx 2.5n$ measurements. Also, we provide sharp analysis on the sensitivity of the algorithm to noise. We highlight the following facts about our message passing algorithm: (i) Adding $\ell_2$ regularization to the non-convex loss function can be beneficial. (ii) Spectral initialization has marginal impact on the performance of the algorithm. The sharp analyses in this paper, not only enable us to compare the performance of our method with other phase recovery schemes, but also shed light on designing better iterative algorithms for other non-convex optimization problems.
- Although prior works have exploited the UAV's mobility to enhance the wireless communication performance under different setups, the fundamental capacity limits of UAV-enabled/aided multiuser communication systems have not yet been characterized. To fill this gap, we consider in this paper a UAV-enabled two-user broadcast channel (BC), where a UAV flying at a constant altitude is deployed to send independent information to two users at different fixed locations on the ground. We aim to characterize the capacity region of this new type of BC over a given UAV flight duration, by jointly optimizing the UAV's trajectory and transmit power/rate allocations over time, subject to the UAV's maximum speed and maximum transmit power constraints. First, to draw essential insights, we consider two special cases with asymptotically large/low UAV flight duration/speed, respectively. For the former case, it is shown that a simple hover-fly-hover (HFH) UAV trajectory with time division multiple access (TDMA) based orthogonal multiuser transmission is capacity-achieving, while in the latter case, the UAV should hover at a fixed location that is nearer to the user with larger achievable rate and in general superposition coding (SC) based non-orthogonal transmission with interference cancellation at the receiver of the nearer user is required. Next, we consider the general case with finite UAV speed and flight duration. We show that the optimal UAV trajectory should follow a general HFH structure, i.e., the UAV successively hovers at a pair of initial and final locations above the line segment of the two users each with a certain amount of time and flies unidirectionally between them at the maximum speed, and SC is generally needed.
- Dec 29 2017 cs.SI physics.soc-ph arXiv:1712.09648v1. Our world produces massive data every day; they exist in diverse forms, from pairwise data and matrices to time series and trajectories. Meanwhile, we have access to the versatile toolkit of network analysis. Networks also have different forms; from simple networks to higher-order networks, each representation has different capabilities in carrying information. For researchers who want to leverage the power of the network toolkit and apply it beyond network data to sequential data, diffusion data, and many more, the question is: how to represent big data and networks? This dissertation makes a first step toward answering the question. It proposes the higher-order network, which is a critical piece for representing higher-order interaction data; it introduces a scalable algorithm for building the network, and visualization tools for interactive exploration. Finally, it presents broad applications of the higher-order network in the real world.
- Dec 29 2017 cs.SI physics.soc-ph arXiv:1712.09658v2. A major branch of anomaly detection methods relies on dynamic networks: raw sequence data is first converted to a series of networks, then critical change points are identified in the evolving network structure. However, existing approaches use first-order networks (FONs) to represent the underlying raw data, which may lose important higher-order sequence patterns, making higher-order anomalies undetectable in subsequent analysis. We present a novel higher-order anomaly detection method that is both parameter-free and scalable, building on an improved higher-order network (HON) construction algorithm. We show that the proposed higher-order anomaly detection algorithm is effective in discovering anomalies of variable order. Our data include a synthetic dataset of 11 billion web clickstreams and real-world taxi trajectory data.
- Hybrid circuit and packet switching for data center networking (DCN) has received considerable research attention recently. A hybrid-switched DCN employs a much faster circuit switch that is reconfigurable at a nontrivial cost, and a much slower packet switch that is reconfigurable at no cost, to interconnect its racks of servers. The research problem is, given a traffic demand matrix (between the racks), how to compute a good circuit switch configuration schedule so that the vast majority of the traffic demand is removed by the circuit switch, leaving a remaining demand matrix that contains only small elements for the packet switch to handle. In this paper, we propose two new hybrid switch scheduling algorithms under two different scheduling constraints. Our first algorithm, called 2-hop Eclipse, strikes a much better tradeoff between the resulting performance (of the hybrid switch) and the computational complexity (of the algorithm) than the state-of-the-art solution Eclipse/Eclipse++. Our second algorithm, called BFF (best first fit), is the first hybrid switching solution that exploits the potential partial reconfiguration capability of the circuit switch for performance gains.
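To make the scheduling problem concrete, here is a deliberately naive sketch (mine, not the paper's 2-hop Eclipse or BFF algorithm) that extracts a single circuit configuration, i.e. a matching of input racks to output racks, from a demand matrix by greedily serving the largest remaining entries:

```python
# Illustrative only: greedily build one circuit-switch configuration
# (a partial matching of inputs to outputs) from a traffic demand matrix.
def greedy_configuration(demand):
    """Pick the largest positive demand entries whose row/column are unused."""
    n = len(demand)
    entries = sorted(((demand[i][j], i, j) for i in range(n) for j in range(n)),
                     reverse=True)
    used_in, used_out, matching = set(), set(), {}
    for d, i, j in entries:
        if d > 0 and i not in used_in and j not in used_out:
            matching[i] = j
            used_in.add(i)
            used_out.add(j)
    return matching

demand = [[0, 9, 1],
          [8, 0, 2],
          [3, 4, 0]]
print(greedy_configuration(demand))  # {0: 1, 1: 0}
```

A real schedule is a sequence of such configurations with per-configuration durations, and must account for the nontrivial reconfiguration cost that the paper's algorithms optimize against.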
- Dec 13 2017 cs.CV arXiv:1712.04119v1. Positron emission tomography (PET) is widely used in various clinical applications, including the diagnosis of cancer, heart disease, and neurological disorders. The use of radioactive tracers in PET imaging raises concerns due to the risk of radiation exposure. To minimize this potential risk, efforts have been made to reduce the amount of radio-tracer used. However, lowering the dose results in a low signal-to-noise ratio (SNR) and loss of information, both of which heavily affect clinical diagnosis. Besides, the ill-conditioning of low-dose PET image reconstruction makes it a difficult problem for iterative reconstruction algorithms. Previously proposed methods are typically complicated and slow, yet still cannot yield satisfactory results at significantly low doses. Here, we propose a deep learning method to resolve this issue: an encoder-decoder residual deep network with concatenated skip connections. Experiments show that the proposed method can reconstruct a low-dose PET image to standard-dose quality with only one two-hundredth of the dose. Different cost functions for training the model are explored. A multi-slice input strategy is introduced to provide the network with more structural information and make it more robust to noise. Evaluation on ultra-low-dose clinical data shows that the proposed method achieves better results than state-of-the-art methods and reconstructs images of comparable quality using only 0.5% of the original regular dose.
- Consider the problem of private information retrieval (PIR) over a distributed storage system where $M$ records are stored across $N$ servers by using an $[N,K]$ MDS code. For simplicity, this problem is usually referred to as the coded-PIR problem. The capacity of coded-PIR with privacy against any individual server was determined by Banawan and Ulukus in 2016, i.e., $\mathcal{C}_{\tiny C-PIR}=(1+\frac{K}{N}+\dots+\frac{K^{M-1}}{N^{M-1}})^{-1}$. They also presented a linear capacity-achieving scheme with sub-packetization $KN^{M}$. In this paper, we focus on minimizing the sub-packetization of linear capacity-achieving coded-PIR schemes. We prove that the sub-packetization of any linear capacity-achieving coded-PIR scheme in the nontrivial cases (i.e., $N>K\geq 1$ and $M>1$) must be no less than $Kn^{M-1}$, where $n=N/{\rm gcd}(N,K)$. Moreover, we design a linear capacity-achieving coded-PIR scheme with sub-packetization $Kn^{M-1}$ for all $N>K\geq 1$ and $M>1$. Therefore, $Kn^{M-1}$ is the optimal sub-packetization for linear capacity-achieving coded-PIR schemes.
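The capacity formula and the optimal sub-packetization above are easy to evaluate for concrete parameters; a small sketch (function names mine, parameter values illustrative):

```python
import math

# Capacity and optimal sub-packetization quoted in the abstract.
def coded_pir_capacity(N, K, M):
    """C = (1 + K/N + ... + (K/N)^(M-1))^(-1)."""
    return 1 / sum((K / N) ** i for i in range(M))

def optimal_subpacketization(N, K, M):
    """K * n^(M-1) with n = N / gcd(N, K), for N > K >= 1 and M > 1."""
    n = N // math.gcd(N, K)
    return K * n ** (M - 1)

# Example: an [N=4, K=2] MDS code storing M=3 records.
print(coded_pir_capacity(4, 2, 3))        # 1/(1 + 1/2 + 1/4) = 4/7 ≈ 0.5714
print(optimal_subpacketization(4, 2, 3))  # n = 2, so 2 * 2^2 = 8
```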
- In the first part of this paper, inspired by the geometric method of Jean-Pierre Marec, we consider the two-impulse Hohmann transfer problem between two coplanar circular orbits as a constrained nonlinear programming problem. By using the Kuhn-Tucker theorem, we analytically prove the global optimality of the Hohmann transfer. Two sets of feasible solutions are found: one, corresponding to the Hohmann transfer, is the global minimum; the other is a local minimum. In the second part, we formulate the Hohmann transfer problem as two-point and multi-point boundary-value problems by using the calculus of variations. With the help of the Matlab solver bvp4c, two numerical examples are solved successfully, verifying that the Hohmann transfer is indeed the solution of these boundary-value problems. Via static and dynamic constrained optimization, the solution to the orbit transfer problem proposed by W. Hohmann ninety-two years ago, and its global optimality, are rediscovered.
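For reference, the textbook two-impulse Hohmann delta-v between coplanar circular orbits (standard formulas, not specific to this paper's variational treatment) can be sketched as:

```python
import math

# Classical Hohmann transfer between circular orbits of radii r1 and r2
# around a body with gravitational parameter mu.
def hohmann_delta_v(mu, r1, r2):
    a = (r1 + r2) / 2                                     # transfer-ellipse semi-major axis
    dv1 = math.sqrt(mu / r1) * (math.sqrt(r2 / a) - 1)    # burn at departure orbit
    dv2 = math.sqrt(mu / r2) * (1 - math.sqrt(r1 / a))    # circularization burn at arrival
    return dv1 + dv2

# LEO to GEO: mu ≈ 398600 km^3/s^2, r1 ≈ 6678 km, r2 ≈ 42164 km.
print(round(hohmann_delta_v(398600.0, 6678.0, 42164.0), 2))  # roughly 3.89 km/s
```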
- This paper develops a novel methodology for using symbolic knowledge in deep learning. From first principles, we derive a semantic loss function that bridges between neural output vectors and logical constraints. This loss function captures how close the neural network is to satisfying the constraints on its output. An experimental evaluation shows that our semantic loss function effectively guides the learner to achieve (near-)state-of-the-art results on semi-supervised multi-class classification. Moreover, it significantly increases the ability of the neural network to predict structured objects, such as rankings and paths. These discrete concepts are tremendously difficult to learn, and benefit from a tight integration of deep learning and symbolic reasoning methods.
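As a concrete illustration of the idea above, for an "exactly one of $n$ outputs is true" constraint the semantic loss reduces to the negative log-probability that independent Bernoulli outputs satisfy the constraint. The sketch below is my own simplification for this single constraint, not the paper's general computation:

```python
import math

# Semantic loss for the exactly-one constraint:
# L = -log sum_i p_i * prod_{j != i} (1 - p_j),
# where p is the vector of predicted output probabilities.
def semantic_loss_exactly_one(probs):
    sat = 0.0                       # probability mass on satisfying assignments
    for i, p in enumerate(probs):
        term = p
        for j, q in enumerate(probs):
            if j != i:
                term *= (1 - q)
        sat += term
    return -math.log(sat)

# A near-one-hot prediction incurs a small loss; a uniform one incurs more.
print(semantic_loss_exactly_one([0.96, 0.02, 0.02]))  # close to 0
print(semantic_loss_exactly_one([1/3, 1/3, 1/3]))     # larger
```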
- We study the min-cost seed selection problem in online social networks, where the goal is to select a set of seed nodes with minimum total cost such that the expected number of influenced nodes in the network exceeds a predefined threshold. We propose several algorithms that outperform previous work both in theoretical approximation ratio and in experimental performance. In the case where nodes have heterogeneous costs, our algorithms are the first bi-criteria approximation algorithms with polynomial running time and provable logarithmic performance bounds under a general contagion model. In the case where users have uniform costs, our algorithms achieve a logarithmic approximation ratio with provable time complexity that is smaller than that of existing algorithms by orders of magnitude. We conduct extensive experiments using real social networks. The experimental results show that our algorithms significantly outperform existing algorithms both in total cost and in running time, and also scale well to billion-scale networks.
- This paper proposes the first end-to-end deep framework for high dynamic range (HDR) imaging of dynamic scenes with large-scale foreground motions. In state-of-the-art deep HDR imaging such as [13], the problem is formulated as an image composition problem: input images are first aligned using optical flows, which remain error-prone due to occlusion and large motions. In our end-to-end approach, HDR imaging is formulated as an image translation problem and no optical flows are used. Moreover, our simple translation network can automatically hallucinate plausible HDR details in the presence of total occlusion, saturation, and under-exposure, which are otherwise almost impossible to recover by conventional optimization approaches. We perform extensive qualitative and quantitative comparisons to show that our end-to-end HDR approach produces excellent results in which color artifacts and geometry distortion are significantly reduced compared with existing state-of-the-art methods.
- Convolutional Neural Networks (CNNs) and locally connected layers are limited in capturing the importance and relations of different local receptive fields, which are often crucial for tasks such as face verification, visual question answering, and word sequence prediction. To tackle this issue, we propose a novel locally smoothed neural network (LSNN) in this paper. The main idea is to represent the weight matrix of the locally connected layer as the product of a kernel and a smoother, where the kernel is shared over different local receptive fields, and the smoother determines the importance and relations of different local receptive fields. Specifically, a multivariate Gaussian function is utilized to generate the smoother, modeling the location relations among different local receptive fields. Furthermore, content information can also be leveraged by setting the mean and precision of the Gaussian function according to the content. Experiments on variants of MNIST clearly show our advantages over CNNs and locally connected layers.
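A minimal sketch of the smoother idea, assuming a 1-D layout of receptive fields with illustrative parameter names (the paper's construction is more general):

```python
import math

# Gaussian smoother over local receptive-field positions: a field's weight
# reflects its importance, governed by a mean mu and spread sigma.
def gaussian_smoother(num_fields, mu, sigma):
    w = [math.exp(-0.5 * ((i - mu) / sigma) ** 2) for i in range(num_fields)]
    s = sum(w)
    return [x / s for x in w]   # normalized importance of each local field

weights = gaussian_smoother(num_fields=5, mu=2.0, sigma=1.0)
print([round(w, 3) for w in weights])  # peaks at the center field (index 2)
```

Making `mu` and `sigma` functions of the input content gives the content-dependent variant the abstract mentions.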
- We consider the problem of estimating the means of the two Gaussians in a 2-Gaussian mixture, which is unbalanced and corrupted by noise of an arbitrary distribution. We present a robust algorithm to estimate the parameters, together with upper bounds on the number of samples required for the estimate to be correct, where the bounds are parametrised by the dimension, the ratio of the mixing coefficients, a measure of the separation of the two Gaussians related to the Mahalanobis distance, and a condition number of the covariance matrix. In theory, this is the first sample-complexity result for imbalanced mixtures corrupted by adversarial noise. In practice, our algorithm outperforms the vanilla Expectation-Maximisation (EM) algorithm in terms of estimation error.
- Nov 21 2017 cs.CL arXiv:1711.07010v3. Named Entity Recognition and Relation Extraction for Chinese literature text are regarded as highly difficult problems, partially because of the lack of annotated datasets. In this paper, we build a discourse-level dataset from hundreds of Chinese literature articles to improve this task. To build a high-quality dataset, we propose two tagging methods to solve the problem of data inconsistency: a heuristic tagging method and a machine-assisted tagging method. Based on this corpus, we also introduce several widely used models to conduct experiments. Experimental results not only show the usefulness of the proposed dataset, but also provide baselines for further research. The dataset is available at https://github.com/lancopku/Chinese-Literature-NER-RE-Dataset.
- Automatic conflict detection has grown in relevance with the advent of body-worn technology, but existing metrics such as turn-taking and overlap are poor indicators of conflict in police-public interactions. Moreover, standard techniques to compute them fall short when applied to such diversified and noisy contexts. We develop a pipeline catered to this task combining adaptive noise removal, non-speech filtering and new measures of conflict based on the repetition and intensity of phrases in speech. We demonstrate the effectiveness of our approach on body-worn audio data collected by the Los Angeles Police Department.
- This paper studies an unmanned aerial vehicle (UAV)-enabled multicast channel, in which a UAV serves as a mobile transmitter to deliver common information to a set of $K$ ground users. We aim to characterize the capacity of this channel over a finite UAV communication period, subject to its maximum speed constraint and an average transmit power constraint. To achieve the capacity, the UAV should use a sufficiently long code that spans its whole communication period. Accordingly, the multicast channel capacity is achieved by maximizing the minimum achievable time-averaged rate of the $K$ users, jointly optimizing the UAV's trajectory and transmit power allocation over time. However, this problem is non-convex and difficult to solve optimally. To tackle it, we first consider a relaxed problem that ignores the maximum UAV speed constraint, and obtain its globally optimal solution via the Lagrange dual method. The optimal solution reveals that the UAV should hover above a finite number of ground locations, with an optimal hovering duration and transmit power at each location. Next, based on this multi-location-hovering solution, we present a successive hover-and-fly trajectory design and obtain the corresponding optimal transmit power allocation for the case with the maximum UAV speed constraint. Numerical results show that our proposed joint UAV trajectory and transmit power optimization significantly improves the achievable rate of the UAV-enabled multicast channel, and greatly outperforms the conventional multicast channel with a fixed-location transmitter.
- In this addendum, we show that the switching algorithm QPS-SERENA can be converted into R(QPS-SERENA), an algorithm for computing approximate Maximum Weight Matchings (MWM). Empirically, R(QPS-SERENA) computes a $(1-\epsilon)$-MWM within linear time (with respect to the number of edges $N^2$) for any fixed $\epsilon\in (0,1)$, for complete bipartite graphs with i.i.d. uniform edge weight distributions. This efficacy matches that of the state-of-the-art solution, although we so far cannot prove any theoretical guarantees on the time complexity needed to attain a certain approximation ratio. We then similarly convert QPS-SERENADE into R(QPS-SERENADE), which empirically should output a $(1-\epsilon)$-MWM within only $O(N \log N)$ time for the same type of complete bipartite graphs described above.
- Nov 07 2017 cs.CL arXiv:1711.01427v1. In recent years, neural networks have proven to be effective in Chinese word segmentation. However, this promising performance relies on large-scale training data; neural networks with conventional architectures cannot achieve the desired results on low-resource datasets due to the lack of labelled training data. In this paper, we propose a deep stacking framework to improve performance on word segmentation tasks with insufficient data by integrating datasets from diverse domains. Our framework consists of two parts: domain-based models and deep stacking networks. The domain-based models are used to learn knowledge from different datasets; the deep stacking networks are designed to integrate the domain-based models. To reduce model conflicts, we add communication paths among models and design various structures of deep stacking networks, including Gaussian-based Stacking Networks, Concatenate-based Stacking Networks, Sequence-based Stacking Networks, and Tree-based Stacking Networks. We conduct experiments on six low-resource datasets from various domains. Our proposed framework shows significant performance improvements on all datasets compared with several strong baselines.
- Suppose a database containing $M$ records is replicated across $N$ servers, and a user wants to privately retrieve one record by accessing the servers such that the identity of the retrieved record is kept secret from any set of up to $T$ colluding servers. A scheme designed for this purpose is called a private information retrieval (PIR) scheme. In practice, achieving capacity and small sub-packetization are both desired for PIR schemes, because the former implies the highest download rate and the latter usually means simple realization. For general values of $N,T,M$, the only known capacity-achieving PIR scheme was designed by Sun and Jafar in 2016, with sub-packetization $N^M$. In this paper, we design a linear capacity-achieving PIR scheme with much smaller sub-packetization $dn^{M-1}$, where $d={\rm gcd}(N,T)$ and $n=N/d$. Furthermore, we prove that any linear capacity-achieving PIR scheme must have sub-packetization no less than $dn^{M-1}$, implying that our scheme has the optimal sub-packetization. Moreover, compared with Sun and Jafar's scheme, our scheme reduces the field size by a factor of $\frac{1}{Nd^{M-2}}$.
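A quick numerical comparison of the two sub-packetization levels mentioned above (function name and parameter values illustrative):

```python
import math

# Sub-packetization: Sun-Jafar's capacity-achieving scheme uses N^M;
# the scheme in this paper uses d * n^(M-1) with d = gcd(N, T), n = N/d.
def subpacketizations(N, T, M):
    d = math.gcd(N, T)
    n = N // d
    return N ** M, d * n ** (M - 1)

# Example: N = 6 servers, T = 4 colluding servers, M = 3 records.
sun_jafar, this_paper = subpacketizations(6, 4, 3)
print(sun_jafar, this_paper)  # 216 vs 2 * 3^2 = 18
```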
- Nov 01 2017 cs.CL arXiv:1710.11332v1. Recently, encoder-decoder models have been widely used in social media text summarization. However, these models sometimes erroneously select noise words from irrelevant sentences as part of a summary, degrading performance. To suppress irrelevant sentences and focus on key information, we propose an effective approach based on learning a sentence weight distribution. In our model, we build a multi-layer perceptron to predict sentence weights. During training, we use the ROUGE score in place of the estimated sentence weight, and try to minimize the gap between estimated weights and predicted weights. In this way, we encourage our model to focus on the key sentences, which have high relevance to the summary. Experimental results show that our approach outperforms baselines on a large-scale social media corpus.
- Nov 01 2017 cs.CL arXiv:1710.11334v1. In recent years, more research has been devoted to subtasks of complete shallow discourse parsing, such as identifying discourse connectives and the arguments of connectives. There is a need for a full discourse parser that pulls these subtasks together, so we develop a discourse parser that turns free text into discourse relations. The parser includes a connective identifier, an argument identifier, a sense classifier, and a non-explicit identifier, connected in a pipeline. Each component applies a maximum entropy model with abundant lexical and syntactic features extracted from the Penn Discourse Treebank (PDTB). The head-based representation of the PDTB is adopted in the argument identifier, which turns the problem of identifying the arguments of a discourse connective into finding the head and end of the arguments. In the non-explicit identifier, contextual type features, such as high-frequency words that reflect the discourse relation, are introduced to improve its performance. Experimental results show considerable performance compared with other methods.