results for au:Xu_J in:cs

- Dec 13 2017 cs.CV arXiv:1712.04119v1 Positron emission tomography (PET) is widely used in various clinical applications, including cancer diagnosis, heart disease, and neurological disorders. The use of radioactive tracers in PET imaging raises concerns due to the risk of radiation exposure. To minimize this potential risk, efforts have been made to reduce the amount of radio-tracer used. However, lowering the dose results in a low Signal-to-Noise Ratio (SNR) and loss of information, both of which heavily affect clinical diagnosis. Besides, the ill-conditioning of low-dose PET image reconstruction makes it a difficult problem for iterative reconstruction algorithms. Previously proposed methods are typically complicated and slow, yet still cannot yield satisfactory results at significantly low doses. Here, we propose a deep learning method to resolve this issue with an encoder-decoder residual deep network with concatenated skip connections. Experiments show the proposed method can reconstruct low-dose PET images to standard-dose quality with only one two-hundredth of the dose. Different cost functions for training the model are explored. A multi-slice input strategy is introduced to provide the network with more structural information and make it more robust to noise. Evaluation on ultra-low-dose clinical data shows that the proposed method achieves better results than state-of-the-art methods and reconstructs images of comparable quality using only 0.5% of the original regular dose.
- Consider the problem of private information retrieval (PIR) over a distributed storage system where $M$ records are stored across $N$ servers by using an $[N,K]$ MDS code. For simplicity, this problem is usually referred to as the coded-PIR problem. The capacity of coded-PIR with privacy against any individual server was determined by Banawan and Ulukus in 2016, i.e., $\mathcal{C}_{\tiny C-PIR}=(1+\frac{K}{N}+\dots+\frac{K^{M-1}}{N^{M-1}})^{-1}$. They also presented a linear capacity-achieving scheme with sub-packetization $KN^{M}$. In this paper we focus on minimizing the sub-packetization for linear capacity-achieving coded-PIR schemes. We prove that the sub-packetization of all linear capacity-achieving coded-PIR schemes in the nontrivial cases (i.e. $N>K\geq 1$ and $M>1$) must be no less than $Kn^{M-1}$, where $n=N/{\rm gcd}(N,K)$. Moreover, we design a linear capacity-achieving coded-PIR scheme with sub-packetization $Kn^{M-1}$ for all $N>K\geq 1$ and $M>1$. Therefore, $Kn^{M-1}$ is the optimal sub-packetization for linear capacity-achieving coded-PIR schemes.
- In the first part of this paper, inspired by the geometric method of Jean-Pierre Marec, we consider the two-impulse Hohmann transfer problem between two coplanar circular orbits as a constrained nonlinear programming problem. By using the Kuhn-Tucker theorem, we analytically prove the global optimality of the Hohmann transfer. Two sets of feasible solutions are found: one of which, corresponding to the Hohmann transfer, is the global minimum; the other is a local minimum. In the second part, we formulate the Hohmann transfer problem as two-point and multi-point boundary-value problems by using the calculus of variations. With the help of the Matlab solver bvp4c, two numerical examples are solved successfully, which verifies that the Hohmann transfer is indeed the solution of these boundary-value problems. Via static and dynamic constrained optimization, the solution to the orbit transfer problem proposed by W. Hohmann ninety-two years ago and its global optimality are re-discovered.
- This paper develops a novel methodology for using symbolic knowledge in deep learning. From first principles, we derive a semantic loss function that bridges between neural output vectors and logical constraints. This loss function captures how close the neural network is to satisfying the constraints on its output. An experimental evaluation shows that our semantic loss function effectively guides the learner to achieve (near-)state-of-the-art results on semi-supervised multi-class classification. Moreover, it significantly increases the ability of the neural network to predict structured objects, such as rankings and paths. These discrete concepts are tremendously difficult to learn, and benefit from a tight integration of deep learning and symbolic reasoning methods.
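For a concrete instance of such a loss, consider the "exactly one output is true" constraint over independent Bernoulli outputs: the semantic loss is the negative log-probability that a sampled assignment satisfies the constraint. A minimal sketch that enumerates the satisfying (one-hot) assignments directly; efficient evaluation for general constraints would require knowledge compilation, which this sketch does not attempt:

```python
import math

def semantic_loss_exactly_one(p):
    """Semantic loss for the exactly-one constraint: -log of the total
    probability that independent Bernoulli outputs with parameters p
    produce a one-hot vector. Satisfying assignments are enumerated."""
    sat = 0.0
    for j in range(len(p)):
        term = p[j]                      # variable j is true ...
        for i, pi in enumerate(p):
            if i != j:
                term *= (1.0 - pi)       # ... and all others are false
        sat += term
    return -math.log(sat)

# A near-one-hot output incurs low loss; a maximally uncertain one does not.
print(semantic_loss_exactly_one([0.9, 0.05, 0.05]))  # small
print(semantic_loss_exactly_one([0.5, 0.5, 0.5]))    # larger
```

Minimizing this term alongside a standard supervised loss nudges the network's output distribution toward assignments that satisfy the logical constraint.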
- This paper proposes the first end-to-end deep framework for high dynamic range (HDR) imaging of dynamic scenes with large-scale foreground motions. In state-of-the-art deep HDR imaging such as [13], the problem is formulated as an image composition problem, by first aligning input images using optical flows, which are still error-prone due to occlusion and large motions. In our end-to-end approach, HDR imaging is formulated as an image translation problem and no optical flows are used. Moreover, our simple translation network can automatically hallucinate plausible HDR details in the presence of total occlusion, saturation and under-exposure, which are otherwise almost impossible to recover by conventional optimization approaches. We perform extensive qualitative and quantitative comparisons to show that our end-to-end HDR approach produces excellent results where color artifacts and geometry distortion are significantly reduced compared with existing state-of-the-art methods.
- Convolutional Neural Networks (CNN) and the locally connected layer are limited in capturing the importance and relations of different local receptive fields, which are often crucial for tasks such as face verification, visual question answering, and word sequence prediction. To tackle this issue, we propose a novel locally smoothed neural network (LSNN) in this paper. The main idea is to represent the weight matrix of the locally connected layer as the product of a kernel and a smoother, where the kernel is shared over different local receptive fields, and the smoother determines the importance and relations of different local receptive fields. Specifically, a multi-variate Gaussian function is utilized to generate the smoother, modeling the location relations among different local receptive fields. Furthermore, content information can also be leveraged by setting the mean and precision of the Gaussian function according to the content. Experiments on variants of MNIST clearly show our advantages over CNN and the locally connected layer.
- We consider the problem of estimating means of two Gaussians in a 2-Gaussian mixture, which is not balanced and is corrupted by noise of an arbitrary distribution. We present a robust algorithm to estimate the parameters, together with upper bounds on the numbers of samples required for the estimate to be correct, where the bounds are parametrised by the dimension, ratio of the mixing coefficients, a measure of the separation of the two Gaussians, related to Mahalanobis distance, and a condition number of the covariance matrix. In theory, this is the first sample-complexity result for imbalanced mixtures corrupted by adversarial noise. In practice, our algorithm outperforms the vanilla Expectation-Maximisation (EM) algorithm in terms of estimation error.
- Nov 21 2017 cs.CL arXiv:1711.07010v3 Named Entity Recognition and Relation Extraction for Chinese literature text are regarded as highly difficult problems, partially because of the lack of tagging sets. In this paper, we build a discourse-level dataset from hundreds of Chinese literature articles for improving this task. To build a high-quality dataset, we propose two tagging methods to solve the problem of data inconsistency: a heuristic tagging method and a machine-assisted tagging method. Based on this corpus, we also introduce several widely used models to conduct experiments. Experimental results not only show the usefulness of the proposed dataset, but also provide baselines for further research. The dataset is available at https://github.com/lancopku/Chinese-Literature-NER-RE-Dataset.
- Automatic conflict detection has grown in relevance with the advent of body-worn technology, but existing metrics such as turn-taking and overlap are poor indicators of conflict in police-public interactions. Moreover, standard techniques to compute them fall short when applied to such diversified and noisy contexts. We develop a pipeline catered to this task combining adaptive noise removal, non-speech filtering and new measures of conflict based on the repetition and intensity of phrases in speech. We demonstrate the effectiveness of our approach on body-worn audio data collected by the Los Angeles Police Department.
- This paper studies an unmanned aerial vehicle (UAV)-enabled multicast channel, in which a UAV serves as a mobile transmitter to deliver common information to a set of $K$ ground users. We aim to characterize the capacity of this channel over a finite UAV communication period, subject to its maximum speed constraint and an average transmit power constraint. To achieve the capacity, the UAV should use a sufficiently long code that spans its whole communication period. Accordingly, the multicast channel capacity is achieved by maximizing the minimum achievable time-averaged rate of the $K$ users, jointly optimizing the UAV's trajectory and transmit power allocation over time. However, this problem is non-convex and difficult to solve optimally. To tackle this problem, we first consider a relaxed problem that ignores the maximum UAV speed constraint, and obtain its globally optimal solution via the Lagrange dual method. The optimal solution reveals that the UAV should hover above a finite number of ground locations, with an optimal hovering duration and transmit power at each location. Next, based on such a multi-location-hovering solution, we present a successive hover-and-fly trajectory design and obtain the corresponding optimal transmit power allocation for the case with the maximum UAV speed constraint. Numerical results show that our proposed joint UAV trajectory and transmit power optimization significantly improves the achievable rate of the UAV-enabled multicast channel, and also greatly outperforms the conventional multicast channel with a fixed-location transmitter.
- In this addendum, we show that the switching algorithm QPS-SERENA can be converted to R(QPS-SERENA), an algorithm for computing approximate Maximum Weight Matchings (MWM). Empirically, R(QPS-SERENA) computes a $(1-\epsilon)$-MWM within linear time (with respect to the number of edges $N^2$) for any fixed $\epsilon\in (0,1)$, for complete bipartite graphs with i.i.d. uniform edge weight distributions. This efficacy matches that of the state-of-the-art solution, although we so far cannot prove any theoretical guarantees on the time complexity needed to attain a certain approximation ratio. We have similarly converted QPS-SERENADE to R(QPS-SERENADE), which, empirically, should output a $(1-\epsilon)$-MWM within only $O(N \log N)$ time for the same type of complete bipartite graphs described above.
- Nov 07 2017 cs.CL arXiv:1711.01427v1 In recent years, neural networks have proven to be effective in Chinese word segmentation. However, this promising performance relies on large-scale training data. Neural networks with conventional architectures cannot achieve the desired results on low-resource datasets due to the lack of labelled training data. In this paper, we propose a deep stacking framework to improve performance on word segmentation tasks with insufficient data by integrating datasets from diverse domains. Our framework consists of two parts: domain-based models and deep stacking networks. The domain-based models are used to learn knowledge from different datasets. The deep stacking networks are designed to integrate the domain-based models. To reduce model conflicts, we innovatively add communication paths among models and design various structures of deep stacking networks, including Gaussian-based Stacking Networks, Concatenate-based Stacking Networks, Sequence-based Stacking Networks and Tree-based Stacking Networks. We conduct experiments on six low-resource datasets from various domains. Our proposed framework shows significant performance improvements on all datasets compared with several strong baselines.
- Suppose a database containing $M$ records is replicated across $N$ servers, and a user wants to privately retrieve one record by accessing the servers such that the identity of the retrieved record is kept secret from any up to $T$ servers. A scheme designed for this purpose is called a private information retrieval (PIR) scheme. In practice, capacity achievement and small sub-packetization are both desired for PIR schemes, because the former implies the highest download rate and the latter usually means simple realization. For general values of $N,T,M$, the only known capacity-achieving PIR scheme was designed by Sun and Jafar in 2016 with sub-packetization $N^M$. In this paper, we design a linear capacity-achieving PIR scheme with much smaller sub-packetization $dn^{M-1}$, where $d={\rm gcd}(N,T)$ and $n=N/d$. Furthermore, we prove that any linear capacity-achieving PIR scheme must have sub-packetization no less than $dn^{M-1}$, implying that our scheme has the optimal sub-packetization. Moreover, compared with Sun and Jafar's scheme, our scheme reduces the field size by a factor of $\frac{1}{Nd^{M-2}}$.
- Nov 01 2017 cs.CL arXiv:1710.11332v1 Recently, encoder-decoder models have been widely used in social media text summarization. However, these models sometimes erroneously select noise words in irrelevant sentences as part of a summary, thus degrading performance. In order to suppress irrelevant sentences and focus on key information, we propose an effective approach that learns the sentence weight distribution. In our model, we build a multi-layer perceptron to predict sentence weights. During training, we use the ROUGE score as an alternative to the estimated sentence weight, and try to minimize the gap between estimated weights and predicted weights. In this way, we encourage our model to focus on the key sentences, which have high relevance to the summary. Experimental results show that our approach outperforms baselines on a large-scale social media corpus.
- Nov 01 2017 cs.CL arXiv:1710.11334v1 In recent years, more research has been devoted to subtasks of complete shallow discourse parsing, such as identifying discourse connectives and the arguments of connectives. There is a need for a full discourse parser that pulls these subtasks together, so we develop a discourse parser turning free text into discourse relations. The parser includes a connective identifier, an arguments identifier, a sense classifier and a non-explicit identifier, which are connected in a pipeline. Each component applies a maximum entropy model with abundant lexical and syntactic features extracted from the Penn Discourse Treebank. The head-based representation of the PDTB is adopted in the arguments identifier, which turns the problem of identifying the arguments of a discourse connective into finding the head and end of the arguments. In the non-explicit identifier, contextual type features, such as high-frequency words that reflect the discourse relation, are introduced to improve its performance. Compared with other methods, our experimental results achieve considerable performance.
- Most of today's high-speed switches and routers adopt an input-queued crossbar switch architecture. Such a switch needs to compute a matching (crossbar schedule) between the input ports and output ports during each switching cycle (time slot). A key research challenge in designing large (in number of input/output ports $N$) input-queued crossbar switches is to develop crossbar scheduling algorithms that can compute "high quality" matchings - i.e. those that result in high switch throughput (ideally $100\%$) and low queueing delays for packets - at line rates. SERENA is arguably the best algorithm in that regard: It outputs excellent matching decisions that result in $100\%$ switch throughput and near-optimal queueing delays. However, since SERENA is a centralized algorithm with $O(N)$ computational complexity, it cannot support switches that both are large (in terms of $N$) and have a very high line rate per port. In this work, we propose SERENADE (SERENA, the Distributed Edition), a parallel algorithm suite that emulates SERENA in only $O(\log N)$ (distributed) iterations between input ports and output ports, and hence has a time complexity of only $O(\log N)$ per port. Through extensive simulations, we show that all three variants in the SERENADE suite can, either provably or empirically, achieve 100% throughput, and that they have similar delay performance as SERENA under heavy traffic loads.
- Oct 20 2017 cs.IR arXiv:1710.06997v1 When applying learning-to-rank algorithms to Web search, a large number of features are usually designed to capture the relevance signals. Most of these features are computed based on the extracted textual elements, link analysis, and user logs. However, Web pages are not solely linked texts, but have a structured layout organizing a large variety of elements in different styles. Such layout itself can convey useful visual information, indicating the relevance of a Web page. For example, the query-independent layout (i.e., raw page layout) can help identify the page quality, while the query-dependent layout (i.e., the page rendered with matched query words) can further reveal rich structural information (e.g., size, position and proximity) of the matching signals. However, such visual information of layout has seldom been utilized in Web search in the past. In this work, we propose to learn rich visual features automatically from the layout of Web pages (i.e., Web page snapshots) for relevance ranking. Both query-independent and query-dependent snapshots are considered as the new inputs. We then propose a novel visual perception model inspired by humans' visual search behavior when viewing pages to extract the visual features. This model can be learned end-to-end together with traditional human-crafted features. We also show that such visual features can be efficiently acquired in the online setting with an extended inverted indexing scheme. Experiments on benchmark collections demonstrate that learning visual features from Web page snapshots can significantly improve the performance of relevance ranking in ad-hoc Web retrieval tasks.
- Oct 18 2017 cs.NI arXiv:1710.06121v1 Detecting anomalous behaviors such as network failures or intentional attacks in the large-scale Internet is a vital but challenging task. While numerous techniques based on Internet traffic have been developed over the years, anomaly detection for structured datasets via complex networks has only recently come into focus. In this paper, an anomaly detection method for large-scale Internet topology is proposed by considering the changes caused by network crashes. In order to quantify the dynamic changes of Internet topology, the network path changes coefficient (NPCC) is put forward, which highlights the abnormal state of the Internet after it is attacked continuously. Furthermore, we propose a decision function, inspired by the Fibonacci sequence, to determine whether the Internet is abnormal or not: the current Internet is abnormal if its NPCC is beyond the normal domain structured by the previous k NPCCs of the Internet topology. Finally, the new anomaly detection method was tested on the topology data of three Internet anomaly events. The results show that the detection accuracy for all events is over 97%, and the detection precision for each event is 90.24%, 83.33% and 66.67%, respectively, when k = 36. According to the experimental values of the index F_1, we found that the bigger k is, the better the detection performance, and that our method performs better for anomalies caused by network failures than for those caused by intentional attacks. Compared with traditional anomaly detection, our approach may be simpler and more powerful for governments or organizations in detecting large-scale abnormal events.
- Oct 17 2017 cs.IR arXiv:1710.05649v1 This paper concerns a deep learning approach to relevance ranking in information retrieval (IR). Existing deep IR models such as DSSM and CDSSM directly apply neural networks to generate ranking scores, without an explicit understanding of relevance. In the human judgment process, a relevance label is generated in three steps: 1) relevant locations are detected, 2) local relevances are determined, 3) local relevances are aggregated to output the relevance label. In this paper we propose a new deep learning architecture, namely DeepRank, to simulate the above human judgment process. Firstly, a detection strategy is designed to extract the relevant contexts. Then, a measure network is applied to determine the local relevances by utilizing a convolutional neural network (CNN) or two-dimensional gated recurrent units (2D-GRU). Finally, an aggregation network with sequential integration and a term gating mechanism is used to produce a global relevance score. DeepRank well captures important IR characteristics, including exact/semantic matching signals, proximity heuristics, query term importance, and diverse relevance requirements. Experiments on both the benchmark LETOR dataset and large-scale clickthrough data show that DeepRank can significantly outperform learning-to-rank methods and existing deep learning methods.
- Mobile Edge Computing (MEC) pushes computing functionalities away from the centralized cloud to the proximity of data sources, thereby reducing service provision latency and saving backhaul network bandwidth. Although computation offloading has been extensively studied in the literature, service caching is an equally, if not more, important design topic of MEC, yet receives much less attention. Service caching refers to caching application services and their related data (libraries/databases) in the edge server, e.g. a MEC-enabled Base Station (BS), enabling corresponding computation tasks to be executed. Since only a small number of services can be cached in a resource-limited edge server at the same time, which services to cache has to be judiciously decided to maximize system performance. In this paper, we investigate collaborative service caching in MEC-enabled dense small cell (SC) networks. We propose an efficient decentralized algorithm, called CSC (Collaborative Service Caching), in which a network of small cell BSs optimizes service caching collaboratively to address a number of key challenges in MEC systems, including service heterogeneity, spatial demand coupling, and decentralized coordination. Our algorithm is developed based on parallel Gibbs sampling by exploiting the special structure of the considered problem using graph coloring. The algorithm significantly improves time efficiency compared to conventional Gibbs sampling, yet guarantees provable convergence and optimality. CSC is further extended to SC networks with selfish BSs, where a coalitional game is formulated to incentivize collaboration. A coalition formation algorithm is developed by employing merge-and-split rules, and it ensures the stability of the SC coalitions.
- Graph-based semi-supervised learning (GSSL) has an intuitive representation and can be improved by exploiting matrix calculation. However, it has to perform iterative optimization to achieve a preset objective, which usually leads to low efficiency. Another inconvenience of GSSL is that when new data arrive, the graph construction and the optimization have to be conducted all over again. We propose a sound assumption, arguing that: the neighboring data points are not in a peer-to-peer relation, but in a partially ordered relation induced by the local density and the distance between the data; and the label of a center can be regarded as the contribution of its followers. Starting from this assumption, we develop a highly efficient non-iterative label propagation algorithm based on a novel data structure named the optimal leading forest (LaPOLeaF). The major weaknesses of traditional GSSL are addressed by this study. We further scale LaPOLeaF to accommodate big data by utilizing the block distance matrix technique, parallel computing, and Locality-Sensitive Hashing (LSH). Experiments on large datasets have shown the promising results of the proposed methods.
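The non-iterative propagation over a parent-pointer forest can be illustrated in a few lines. This is a simplified sketch of a single root-to-leaf pass in which an unlabeled follower inherits the label of its nearest labeled ancestor; the function name and the plain inheritance rule are illustrative assumptions, not LaPOLeaF's exact weighting scheme:

```python
def propagate_labels(parent, labels):
    """One-pass label propagation over a parent-pointer forest.
    parent[i] is node i's parent index, or -1 for a root.
    labels[i] is a class id or None. Each unlabeled node inherits
    the label of its nearest labeled ancestor (None if there is none)."""
    out = list(labels)

    def resolve(i):
        # Memoized via out: each node is resolved at most once.
        if out[i] is None and parent[i] != -1:
            out[i] = resolve(parent[i])
        return out[i]

    for i in range(len(parent)):
        resolve(i)
    return out

# Forest: node 0 is a labeled root; 1 and 2 follow 0; 3 follows 2.
print(propagate_labels([-1, 0, 0, 2], ["A", None, None, None]))
# → ['A', 'A', 'A', 'A']
```

Because the forest is built once and propagation is a single pass, there is no iterative optimization to redo when the labels change, which is the efficiency argument the abstract makes.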
- Imitation learning holds the promise to address challenging robotic tasks such as autonomous navigation. It however requires a human supervisor to oversee the training process and send correct control commands to robots without feedback, which is error-prone and expensive. To minimize human involvement and avoid manual labeling of data in robotic autonomous navigation with imitation learning, this paper proposes a novel semi-supervised imitation learning solution based on a multi-sensory design. This solution includes a suboptimal sensor policy based on sensor fusion to automatically label states encountered by the robot, avoiding human supervision during training. In addition, a recording policy is developed to throttle the adverse effect of learning too much from the suboptimal sensor policy. This solution allows the robot to learn a navigation policy in a self-supervised manner. With extensive experiments in indoor environments, this solution achieves near-human performance in most of the tasks and even surpasses human performance in cases of unexpected events such as hardware failures or human operation errors. To the best of our knowledge, this is the first work that synthesizes sensor fusion and imitation learning to enable robotic autonomous navigation in the real world without human supervision.
- In this paper, we use house price data ranging from January 2004 to October 2016 to predict the average house price of November and December 2016 for each district in Beijing, Shanghai, Guangzhou and Shenzhen. We apply the Autoregressive Integrated Moving Average model to generate the baseline and LSTM networks to build the prediction model. These algorithms are compared in terms of Mean Squared Error. The result shows that the LSTM model has excellent properties with respect to predicting time series. Also, stateful LSTM networks and stacked LSTM networks are employed to further improve the accuracy of the house price prediction model.
- This paper studies a new unmanned aerial vehicle (UAV)-enabled wireless power transfer (WPT) system, where a UAV-mounted mobile energy transmitter (ET) is dispatched to deliver wireless energy to a set of on-ground energy receivers (ERs). We investigate how the UAV should optimally exploit its mobility via trajectory design to maximize the energy transferred to all ERs during a finite period. First, we consider the maximization of the sum energy received by all ERs by optimizing the UAV's trajectory subject to its maximum speed constraint. We obtain its optimal solution, which shows that the UAV should hover at one single fixed location during the whole period. However, the sum-energy maximization incurs a "near-far" fairness issue. To overcome this issue, we consider a different problem to maximize the minimum received energy among all ERs. We first consider an ideal case by ignoring the UAV's maximum speed constraint, and show that the relaxed problem can be optimally solved via the Lagrange dual method. Then, for the general case with the UAV's maximum speed constraint considered, we propose a new successive hover-and-fly trajectory motivated by the optimal trajectory in the ideal case, and obtain efficient trajectory designs by applying successive convex programming (SCP).
- Sep 20 2017 cs.NE arXiv:1709.06114v2 The performance of recycled concrete as a building material is an important research subject. Given the complex composition of recycled concrete, conventional methods for forecasting slump scarcely obtain satisfactory results. Based on the theory of nonlinear prediction, we propose a recycled concrete slump prediction model based on geometric semantic genetic programming (GSGP), combined with recycled concrete features. Tests show that the model can accurately predict recycled concrete slump: the established prediction model is used to calculate the slump of recycled concrete with different mixing ratios in practical projects, and the predicted values are compared with the experimental values. By comparing the model with several other nonlinear prediction models, we conclude that GSGP has higher accuracy and reliability than conventional methods.
- As traditional neural networks consume a significant amount of computing resources during back propagation, Sun et al. (2017) propose a simple yet effective technique (meProp) to alleviate this problem. In this technique, only a small subset of the full gradients is computed to update the model parameters. In this paper we extend this technique to the Convolutional Neural Network (CNN) to reduce calculation in back propagation, and the surprising results verify its validity in CNN: only 5% of the gradients are passed back, but the model still achieves the same effect as the traditional CNN, or even better. We also show that the top-$k$ selection of gradients leads to a sparse calculation in back propagation, which may bring significant computational benefits given the high computational complexity of the convolution operation in CNN.
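The top-$k$ gradient selection at the heart of this technique can be sketched in a few lines. The function name and the flat-vector setting are illustrative assumptions (in meProp the selection is applied inside the backward pass of each layer, not to a standalone vector):

```python
import numpy as np

def top_k_sparsify(grad, k):
    """Keep only the k largest-magnitude entries of a gradient vector
    and zero out the rest, as in meProp-style sparse back propagation."""
    if k >= grad.size:
        return grad.copy()
    sparse = np.zeros_like(grad)
    idx = np.argpartition(np.abs(grad), -k)[-k:]  # indices of the top-k magnitudes
    sparse[idx] = grad[idx]
    return sparse

g = np.array([0.1, -3.0, 0.02, 2.5, -0.4])
print(top_k_sparsify(g, 2))  # only -3.0 and 2.5 survive
```

Only the retained entries participate in the downstream matrix products of the backward pass, which is where the computational saving comes from.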
- There is a special type of text for which the order of the rows makes no difference (e.g., a word list). For compressing such texts, traditional lossless compression methods are not the ideal choice. A new method that achieves better compression results for this type of text is proposed: the texts are pre-processed by a method named SSE and are then compressed by a traditional lossless compression method. Comparison shows that an improved compression result is achieved.
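The benefit of exploiting row-order invariance can be demonstrated with a simple stand-in preprocessing step: sorting the rows before handing the text to a generic compressor. Note this is only an illustration of the general idea, not the SSE transform itself, whose details the abstract does not specify:

```python
import random
import zlib

def compressed_sizes(lines):
    """Compress the text as-is and with its rows sorted. For texts whose
    row order carries no information, the sorted variant typically
    compresses better because shared prefixes end up adjacent."""
    raw = "\n".join(lines).encode()
    reordered = "\n".join(sorted(lines)).encode()
    return len(zlib.compress(raw, 9)), len(zlib.compress(reordered, 9))

# A shuffled word list: the row order is irrelevant, so we may reorder freely.
words = [f"prefix_{i:04d}" for i in range(1000)]
random.seed(0)
random.shuffle(words)
plain_size, sorted_size = compressed_sizes(words)
print(plain_size, sorted_size)  # the sorted variant is smaller here
```

Because the transform is a permutation of rows, the pre-processing is lossless up to row order, which by assumption does not matter for this class of texts.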
- The recent development of CNN-based image dehazing has revealed the effectiveness of end-to-end modeling. However, extending the idea to end-to-end video dehazing has not been explored yet. In this paper, we propose an End-to-End Video Dehazing Network (EVD-Net) to exploit the temporal consistency between consecutive video frames. A thorough study has been conducted over a number of structure options to identify the best temporal fusion strategy. Furthermore, we build an End-to-End United Video Dehazing and Detection Network (EVDD-Net), which concatenates and jointly trains EVD-Net with a video object detection model. The resulting augmented end-to-end pipeline has demonstrated much more stable and accurate detection results on hazy video.
- This paper studies the problem of estimating the graphon model - the underlying generating mechanism of a network. Graphon estimation arises in many applications such as predicting missing links in networks and learning user preferences in recommender systems. The graphon model deals with a random graph of $n$ vertices such that each pair of vertices $i$ and $j$ is connected independently with probability $\rho \times f(x_i,x_j)$, where $x_i$ is the unknown $d$-dimensional label of vertex $i$, $f$ is an unknown symmetric function, and $\rho$ is a scaling parameter characterizing the graph sparsity. Recent studies have identified the minimax error rate of estimating the graphon from a single realization of the random graph. However, there exists a wide gap between the known error rates of computationally efficient estimation procedures and the minimax optimal error rate. Here we analyze a spectral method, namely the universal singular value thresholding (USVT) algorithm, in the relatively sparse regime with average vertex degree $n\rho=\Omega(\log n)$. When $f$ belongs to a Hölder or Sobolev space with smoothness index $\alpha$, we show the error rate of USVT is at most $(n\rho)^{ -2 \alpha / (2\alpha+d)}$, approaching the minimax optimal error rate $\log (n\rho)/(n\rho)$ for $d=1$ as $\alpha$ increases. Furthermore, when $f$ is analytic, we show the error rate of USVT is at most $\log^d (n\rho)/(n\rho)$. In the special case of the stochastic block model with $k$ blocks, the error rate of USVT is at most $k/(n\rho)$, which is larger than the minimax optimal error rate by at most a multiplicative factor $k/\log k$. This coincides with the computational gap observed for community detection. A key step of our analysis is to derive the eigenvalue decay rate of the edge probability matrix using piecewise polynomial approximations of the graphon function $f$.
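The USVT estimator itself is only a few lines of linear algebra: compute the SVD of the observed adjacency matrix, zero out singular values below a universal threshold of order $\sqrt{n}$, and clip the result to $[0,1]$. A minimal sketch, where the threshold constant and the rank-1 test graphon are illustrative choices:

```python
import numpy as np

def usvt(A, eta=0.01):
    """Universal Singular Value Thresholding: estimate a structured mean
    matrix from a noisy observation A by keeping only singular values
    above (2 + eta) * sqrt(n), then clipping to the valid range [0, 1]."""
    n = max(A.shape)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    s_kept = np.where(s > (2.0 + eta) * np.sqrt(n), s, 0.0)
    est = U @ np.diag(s_kept) @ Vt
    return np.clip(est, 0.0, 1.0)  # edge probabilities live in [0, 1]

# Rank-1 edge-probability matrix P observed through Bernoulli noise.
rng = np.random.default_rng(0)
n = 400
x = rng.uniform(0.2, 0.8, size=n)
P = np.outer(x, x)                       # true graphon values f(x_i, x_j) = x_i * x_j
A = (rng.random((n, n)) < P).astype(float)
err = np.linalg.norm(usvt(A) - P) / np.linalg.norm(P)
print(err)  # substantially below the raw-observation error
```

The threshold separates the signal singular values (which scale like $n$) from the noise singular values (which concentrate around $2\sqrt{n}$), which is what makes the single cut-off "universal".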
- Merging mobile edge computing (MEC) functionality with the dense deployment of base stations (BSs) provides enormous benefits such as real proximity and low-latency access to computing resources. However, the envisioned integration creates many new challenges, among which mobility management (MM) is a critical one. Simply applying existing radio access oriented MM schemes leads to poor performance, mainly due to the co-provisioning of radio access and computing services by the MEC-enabled BSs. In this paper, we develop a novel user-centric energy-aware mobility management (EMM) scheme, in order to optimize the delay due to both radio access and computation, under the long-term energy consumption constraint of the user. Based on Lyapunov optimization and multi-armed bandit theories, EMM works in an online fashion without future system state information, and effectively handles imperfect system state information. Theoretical analysis explicitly takes radio handover and computation migration cost into consideration and proves a bounded deviation on both the delay performance and energy consumption compared to the oracle solution with exact and complete future system information. The proposed algorithm also effectively handles the scenario in which candidate BSs randomly switch on/off during the offloading process of a task. Simulations show that the proposed algorithms can achieve close-to-optimal delay performance while satisfying the user energy consumption constraint.
- We present an approach to accelerating a wide variety of image processing operators. Our approach uses a fully-convolutional network that is trained on input-output pairs that demonstrate the operator's action. After training, the original operator need not be run at all. The trained network operates at full resolution and runs in constant time. We investigate the effect of network architecture on approximation accuracy, runtime, and memory footprint, and identify a specific architecture that balances these considerations. We evaluate the presented approach on ten advanced image processing operators, including multiple variational models, multiscale tone and detail manipulation, photographic style transfer, nonlocal dehazing, and nonphotorealistic stylization. All operators are approximated by the same model. Experiments demonstrate that the presented approach is significantly more accurate than prior approximation schemes. It increases approximation accuracy as measured by PSNR across the evaluated operators by 8.5 dB on the MIT-Adobe dataset (from 27.5 to 36 dB) and reduces DSSIM by a multiplicative factor of 3 compared to the most accurate prior approximation scheme, while being the fastest. We show that our models generalize across datasets and across resolutions, and investigate a number of extensions of the presented approach. The results are shown in the supplementary video at https://youtu.be/eQyfHgLx8Dc
- Designing an e-commerce recommender system that serves hundreds of millions of active users is a daunting challenge. From a human vision perspective, there are two key factors that affect users' behaviors: items' attractiveness and their matching degree with users' interests. This paper proposes Telepath, a vision-based bionic recommender system model, which understands users from this perspective. Telepath is a combination of a convolutional neural network (CNN), a recurrent neural network (RNN) and deep neural networks (DNNs). Its CNN subnetwork simulates the human vision system to extract key visual signals of items' attractiveness and generate corresponding activations. Its RNN and DNN subnetworks simulate the cerebral cortex to understand users' interest based on the activations generated from browsed items. In practice, the Telepath model has been launched to JD's recommender system and advertising system. For one of the major item recommendation blocks on the JD app, click-through rate (CTR), gross merchandise value (GMV) and orders have increased 1.59%, 8.16% and 8.71% respectively. For several major ads publishers of the JD demand-side platform, CTR, GMV and return on investment have increased 6.58%, 61.72% and 65.57% respectively by the first launch, and further increased 2.95%, 41.75% and 41.37% respectively by the second launch.
- This paper studies the wireless surveillance of a hybrid automatic repeat request (HARQ) based suspicious communication link over Rayleigh fading channels. We propose a proactive eavesdropping approach, where a half-duplex monitor can opportunistically jam the suspicious link to exploit its potential retransmissions for overhearing more efficiently. In particular, we consider that the suspicious link uses at most two HARQ rounds for transmitting the same data packet, and we focus on two cases without and with HARQ combining at the monitor receiver. In both cases, we aim to maximize the successful eavesdropping probability at the monitor, by adaptively allocating the jamming power in the first HARQ round according to fading channel conditions, subject to an average jamming power constraint. For both cases, we show that the optimal jamming power allocation follows a threshold-based policy, and the monitor jams with constant power when the eavesdropping channel gain is less than the threshold. Numerical results show that the proposed proactive eavesdropping scheme achieves higher successful eavesdropping probability than the conventional passive eavesdropping, and HARQ combining can help further improve the eavesdropping performance.
- Aug 31 2017 cs.NE arXiv:1708.09116v2 Genetic programming has been widely used in the engineering field. Compared with conventional genetic programming and artificial neural networks, geometric semantic genetic programming (GSGP) is superior in convergence and computing efficiency. In this paper, GSGP is adopted for the classification and regression analysis of a sample dataset. Furthermore, a model for slope stability analysis is established on the basis of geometric semantics. According to the results of the study based on GSGP, the method can analyze slope stability objectively and is highly precise in predicting slope stability and safety factors. Hence, the predicted results can be used as a reference for slope safety design.
- Computational elucidation of membrane protein (MP) structures is challenging, partially due to the lack of sufficient solved structures for homology modeling. Here we describe a high-throughput deep transfer learning method that first predicts MP contacts by learning from non-membrane proteins (non-MPs) and then predicts three-dimensional structure models using the predicted contacts as distance restraints. Tested on 510 non-redundant MPs, our method has contact prediction accuracy at least 0.18 better than existing methods, predicts correct folds for 218 MPs (TMscore at least 0.6), and generates three-dimensional models with RMSD less than 4 Angstrom and 5 Angstrom for 57 and 108 MPs, respectively. A rigorous blind test in the continuous automated model evaluation (CAMEO) project shows that our method predicted high-resolution three-dimensional models for two recent test MPs of 210 residues with RMSD close to 2 Angstrom. We estimate that our method could predict correct folds for between 1,345 and 1,871 reviewed human multi-pass MPs, including a few hundred new folds, which shall facilitate the discovery of drugs targeting membrane proteins.
- Long Short-Term Memory (LSTM) is the primary recurrent neural network architecture for acoustic modeling in automatic speech recognition systems. Residual learning is an efficient method to help neural networks converge easier and faster. In this paper, we propose several types of residual LSTM methods for our acoustic modeling. Our experiments indicate that, compared with classic LSTM, our architecture shows more than 8% relative reduction in Phone Error Rate (PER) on TIMIT tasks. At the same time, our residual fast LSTM approach shows 4% relative reduction in PER on the same task. Besides, we find that all these architectures achieve good results on the THCHS-30, Librispeech and Switchboard corpora.
- Mobile robots are cyber-physical systems where the cyberspace and the physical world are strongly coupled. Attacks against mobile robots can transcend cyber defenses and escalate into disastrous consequences in the physical world. In this paper, we focus on the detection of active attacks that are capable of directly influencing robot mission operation. By leveraging the physical dynamics of mobile robots, we develop RIDS, a novel robot intrusion detection system that can detect actuator attacks as well as sensor attacks for nonlinear mobile robots subject to stochastic noises. We implement and evaluate RIDS on a Khepera mobile robot against concrete attack scenarios via various attack channels, including signal interference, sensor spoofing, logic bombs, and physical damage. Evaluation over 20 experiments shows that the average false positive and false negative rates are both below 1%. The average detection delay for each attack remains within 0.40s.
- Aug 01 2017 cs.DC arXiv:1707.09455v2 The amount of data moved over dedicated and non-dedicated network links increases much faster than the network capacity, but current solutions fail to guarantee even the promised achievable transfer throughputs. In this paper, we propose a novel dynamic throughput optimization model based on mathematical modeling with offline knowledge discovery/analysis and adaptive online decision making. In the offline analysis, we mine historical transfer logs to perform knowledge discovery about the transfer characteristics. The online phase uses the discovered knowledge from the offline analysis along with real-time investigation of the network condition to optimize the protocol parameters. As real-time investigation is expensive and provides only partial knowledge about the current network status, our model uses historical knowledge about the network and data to reduce the real-time investigation overhead while ensuring near optimal throughput for each transfer. Our network- and data-agnostic solution is tested over different networks and achieves up to 93% accuracy compared with the optimal achievable throughput possible on those networks.
- Aug 01 2017 cs.CV arXiv:1707.09862v1 Existing manifold learning methods are not appropriate for the image retrieval task, because most of them are unable to process the query image and they incur substantial additional computational cost, especially for large scale databases. Therefore, we propose the iterative manifold embedding (IME) layer, whose weights are learned off-line by an unsupervised strategy, to explore the intrinsic manifolds from incomplete data. On a large scale database that contains 27,000 images, the IME layer is more than 120 times faster than other manifold learning methods at embedding the original representations at query time. In the off-line learning stage, we iteratively embed the original descriptors of database images, which lie on a manifold in a high dimensional space, into manifold-based representations to generate the IME representations. According to the original descriptors and the IME representations of database images, we estimate the weights of the IME layer by ridge regression. In the on-line retrieval stage, we employ the IME layer to map the original representation of the query image with negligible time cost (2 milliseconds). We experiment on five public standard datasets for image retrieval. The proposed IME layer significantly outperforms related dimension reduction methods and manifold learning methods. Without post-processing, our IME layer outperforms state-of-the-art image retrieval methods that use post-processing on most datasets, while requiring less computation.
- Jul 31 2017 cs.CL arXiv:1707.09168v1 The charge prediction task is to determine appropriate charges for a given case, which is helpful for legal assistant systems where the user input is a fact description. We argue that relevant law articles play an important role in this task, and therefore propose an attention-based neural network method to jointly model the charge prediction task and the relevant article extraction task in a unified framework. The experimental results show that, besides providing a legal basis, the relevant articles can also clearly improve the charge prediction results, and our full model can effectively predict appropriate charges for cases with different expression styles.
- Jul 26 2017 cs.IR arXiv:1707.07700v1 The effectiveness of information retrieval (IR) systems has become more important than ever. Deep IR models have gained increasing attention for their ability to automatically learn features from raw text; thus, many deep IR models have been proposed recently. However, the learning process of these deep IR models resembles a black box. Therefore, it is necessary to identify the differences between features automatically learned by deep IR models and hand-crafted features used in traditional learning-to-rank approaches. Furthermore, it is valuable to investigate the differences between these deep IR models themselves. This paper conducts a deep investigation of deep IR models. Specifically, we conduct an extensive empirical study on two different datasets, Robust and LETOR4.0. We first compare the automatically learned features and hand-crafted features in terms of query term coverage, document length, embeddings and robustness, which reveals a number of disadvantages of learned features compared with hand-crafted ones. Based on this, we establish guidelines for improving existing deep IR models. Furthermore, we compare two different categories of deep IR models, i.e. representation-focused models and interaction-focused models. It is shown that the two types of deep IR models focus on different categories of words: topic-related words and query-related words.
- Radio-frequency (RF) signals enabled wireless information and power transfer (WIPT) is a cost-effective technique to achieve two-way communications and at the same time provide energy supplies for low-power wireless devices. However, the information transmission in WIPT is vulnerable to eavesdropping by the energy receivers (ERs). To achieve secrecy communications with information nodes (INs) while satisfying the energy transfer requirement of ERs, an efficient solution is to exploit a dual use of the energy signals also as useful interference or artificial noise (AN) to interfere with the ERs, thus preventing their potential information eavesdropping. Towards this end, this article provides an overview on the joint design of energy and information signals to achieve energy-efficient and secure WIPT under various practical setups, including simultaneous wireless information and power transfer (SWIPT), wireless powered cooperative relaying and jamming, and wireless powered communication networks (WPCN). We also present some research directions that are worth pursuing in the future.
- This paper studies the secrecy communication in an orthogonal frequency division multiplexing (OFDM) system, where a source sends confidential information to a destination in the presence of a potential eavesdropper. We employ wireless powered cooperative jamming to improve the secrecy rate of this system with the assistance of a cooperative jammer, which works in the harvest-then-jam protocol over two time-slots. In the first slot, the source sends dedicated energy signals to power the jammer; in the second slot, the jammer uses the harvested energy to jam the eavesdropper, in order to protect the simultaneous secrecy communication from the source to the destination. In particular, we consider two types of receivers at the destination, namely Type-I and Type-II receivers, which do not have and have the capability of canceling the (a-priori known) jamming signals, respectively. For both types of receivers, we maximize the secrecy rate at the destination by jointly optimizing the transmit power allocation at the source and the jammer over sub-carriers, as well as the time allocation between the two time-slots. First, we present the globally optimal solution to this problem via the Lagrange dual method, which, however, is of high implementation complexity. Next, to balance the tradeoff between algorithm complexity and performance, we propose alternative low-complexity solutions based on minorization maximization and heuristic successive optimization, respectively. Simulation results show that the proposed approaches significantly improve the secrecy rate, as compared to benchmark schemes without joint power and time allocation.
- This paper proposes an image dehazing model built with a convolutional neural network (CNN), called All-in-One Dehazing Network (AOD-Net). It is designed based on a re-formulated atmospheric scattering model. Instead of estimating the transmission matrix and the atmospheric light separately as most previous models did, AOD-Net directly generates the clean image through a light-weight CNN. Such a novel end-to-end design makes it easy to embed AOD-Net into other deep models, e.g., Faster R-CNN, for improving high-level task performance on hazy images. Experimental results on both synthesized and natural hazy image datasets demonstrate the superior performance of our method over the state-of-the-art in terms of PSNR, SSIM and subjective visual quality. Furthermore, when concatenating AOD-Net with Faster R-CNN and training the joint pipeline from end to end, we witness a large improvement of the object detection performance on hazy images.
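The re-formulation behind AOD-Net folds the transmission $t(x)$ and atmospheric light $A$ of the classic scattering model into a single module $K(x)$, so that the clean image is recovered as $J(x) = K(x)\,I(x) - K(x) + b$. The sketch below shows only this algebraic recombination step; in AOD-Net itself $K(x)$ is predicted by the light-weight CNN, which is omitted here.

```python
import numpy as np

def aod_reconstruct(I, K, b=1.0):
    """Recover the clean image J from the hazy image I and the unified
    module K via J = K * I - K + b. In AOD-Net, K is produced by a small
    CNN; here it is simply an input array (or scalar)."""
    return K * I - K + b
```

One can check consistency with the classic model $I = J t + A(1-t)$: choosing $K = \big((I-A)/t + (A-b)\big)/(I-1)$ makes the recombination return exactly $J$.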
- Jul 19 2017 cs.CL arXiv:1707.05635v1 Representing texts as fixed-length vectors is central to many language processing tasks. Most traditional methods build text representations based on the simple Bag-of-Words (BoW) representation, which loses the rich semantic relations between words. Recent advances in natural language processing have shown that semantically meaningful representations of words can be efficiently acquired by distributed models, making it possible to build text representations based on a better foundation called the Bag-of-Word-Embedding (BoWE) representation. However, existing text representation methods using BoWE often lack sound probabilistic foundations or cannot well capture the semantic relatedness encoded in word vectors. To address these problems, we introduce the Spherical Paragraph Model (SPM), a probabilistic generative model based on BoWE, for text representation. SPM has good probabilistic interpretability and can fully leverage the rich semantics of words, the word co-occurrence information as well as the corpus-wide information to help the representation learning of texts. Experimental results on topical classification and sentiment analysis demonstrate that SPM can achieve new state-of-the-art performances on several benchmark datasets.
- Mobile edge computing (MEC) has been regarded as a promising technique to enhance the computation capabilities of wireless devices, by enabling them to offload computation-intensive tasks to base stations (BSs) at the network edge. This paper studies a new multiuser MEC system by exploiting the multi-antenna non-orthogonal multiple access (NOMA)-based computation offloading. In this system, multiple users can simultaneously offload their computation tasks to one multi-antenna BS over the same time/frequency resources for remote execution, and the BS can employ successive interference cancellation (SIC) for information decoding. We consider the partial offloading case, such that each user can partition the computation task into two parts for local computing and offloading, respectively. Under this setup, we minimize the weighted sum of the energy consumption at all users subject to their computation latency constraints, by jointly optimizing each user's task partition, local central processing unit (CPU) frequency for local computing, and transmit power and rate for offloading, as well as the BS's decoding order for SIC. We present an efficient algorithm to obtain the globally optimal solution to this problem by applying the Lagrange dual method. Numerical results show that the proposed NOMA-based partial offloading design significantly improves the energy efficiency of the multiuser MEC system, as compared to benchmark schemes with orthogonal multiple access (OMA)-based partial offloading, local computing only, and full offloading only.
- Jul 05 2017 cs.CV arXiv:1707.01058v2 This work makes the first attempt to generate an articulated human motion sequence from a single image. On the one hand, we utilize paired inputs including human skeleton information as motion embedding and a single human image as appearance reference, to generate novel motion frames, based on the conditional GAN infrastructure. On the other hand, a triplet loss is employed to pursue appearance smoothness between consecutive frames. As the proposed framework is capable of jointly exploiting the image appearance space and the articulated/kinematic motion space, it generates realistic articulated motion sequences, in contrast to most previous video generation methods, which yield blurred motion effects. We test our model on two human action datasets, KTH and Human3.6M, and the proposed framework generates very promising results on both datasets.
- Trimming techniques are efficient ways to generate complex geometries in Computer-Aided Design (CAD). In this paper, an improved isogeometric analysis (IGA) method for trimmed geometries is proposed. We will show that the proposed method reduces the numerical error of physical solution by 50% for simple trimmed geometries, and the condition number of stiffness matrix is also decreased. Furthermore, the number of integration elements and integration points involved in the solving process can be significantly reduced compared to previous approaches, drastically improving the computational efficiency for IGA problems on the trimmed geometry. Several examples are illustrated to show the effectiveness of the proposed approach.
- Jul 04 2017 cs.CG arXiv:1707.00607v2 In this paper, we propose a general framework for constructing IGA-suitable planar B-spline parameterizations from given complex CAD boundaries consisting of a set of B-spline curves. Instead of forming the computational domain by a simple boundary, planar domains with high genus and more complex boundary curves are considered. Firstly, some pre-processing operations including Bézier extraction and subdivision are performed on each boundary curve in order to generate a high-quality planar parameterization; then a robust planar domain partition framework is proposed to construct high-quality patch-meshing results with few singularities from the discrete boundary formed by connecting the end points of the resulting boundary segments. After the topology information generation of the quadrilateral decomposition, the optimal placement of interior Bézier curves corresponding to the interior edges of the quadrangulation is constructed by a global optimization method to achieve a patch-partition with high quality. Finally, after the imposition of C1/G1-continuity constraints on the interfaces of neighboring Bézier patches with respect to each quad in the quadrangulation, the high-quality Bézier patch parameterization is obtained by a C1-constrained local optimization method to achieve uniform and orthogonal iso-parametric structures while keeping the continuity conditions between patches. The efficiency and robustness of the proposed method are demonstrated by several examples which are compared to results obtained by the skeleton-based parameterization approach.
- In many applications involving large datasets or online updating, stochastic gradient descent (SGD) provides a scalable way to compute parameter estimates and has gained increasing popularity due to its numerical convenience and memory efficiency. While the asymptotic properties of SGD-based estimators were established decades ago, statistical inference such as interval estimation remains largely unexplored. Traditional resampling methods such as the bootstrap are not computationally feasible since they require repeatedly drawing independent samples from the entire dataset. The plug-in method is not applicable when there are no explicit formulas for the covariance matrix of the estimator. In this paper, we propose a scalable inferential procedure for stochastic gradient descent, which, upon the arrival of each observation, updates the SGD estimate as well as a large number of randomly perturbed SGD estimates. The proposed method is easy to implement in practice. We establish its theoretical properties for a general class of models that includes generalized linear models and quantile regression models as special cases. The finite-sample performance and numerical utility are evaluated by simulation studies and two real data applications.
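The perturbation idea described above can be sketched as follows: alongside the usual SGD iterate, maintain B replicate iterates whose gradient steps are rescaled by i.i.d. mean-one, variance-one random weights, and read interval estimates off the replicates' spread. This is an illustrative toy for one-dimensional mean estimation; the weight distribution (Exponential(1)) and step-size schedule are our own choices, not necessarily the paper's.

```python
import numpy as np

def sgd_with_perturbed_replicates(stream, dim, B=100, lr0=0.5, seed=0):
    """Online SGD with B randomly perturbed replicate estimates.
    Model: least-squares mean estimation, loss (x - theta)^2 / 2,
    so the gradient at an observation x is (theta - x)."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(dim)
    replicates = np.zeros((B, dim))
    for t, x in enumerate(stream, start=1):
        lr = lr0 / t ** 0.51                  # standard decaying step size
        theta = theta - lr * (theta - x)      # plain SGD update
        w = rng.exponential(1.0, size=(B, 1)) # mean-1, variance-1 weights
        replicates = replicates - lr * w * (replicates - x)
    # interval estimate from the empirical quantiles of the replicates
    lo, hi = np.quantile(replicates, [0.025, 0.975], axis=0)
    return theta, lo, hi
```

Each observation triggers one update of the estimate and one vectorized update of all replicates, so the cost per step is O(B d) with no need to revisit past data, which is the point of the procedure.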
- Jun 26 2017 cs.SI arXiv:1706.07484v1 What tweet features are associated with higher effectiveness in tweets? Through the mining of 122 million engagements of 2.5 million original tweets, we present a systematic review of tweet time, entities, composition, and user account features. We show that the relationship between various features and tweeting effectiveness is non-linear; for example, tweets that use a few hashtags have higher effectiveness than those using no or too many hashtags. This research closely relates to various industrial applications that are based on tweet features, including the analysis of advertising campaigns, the prediction of user engagement, the extraction of signals for automated trading, etc.
- Low-resolution digital-to-analog converters (DACs) and analog-to-digital converters (ADCs) are considered to reduce cost and power consumption in multiuser massive multiple-input multiple-output (MIMO). Using the Bussgang theorem, we derive the asymptotic downlink achievable rate with respect to the resolutions of both DACs and ADCs, i.e., $b_{DA}$ and $b_{AD}$, under the assumption of a large antenna number, $N$, and a fixed user load ratio, $\beta$. We characterize the rate loss caused by finite-resolution converters and reveal that the quantization distortion is negligible at low signal-to-noise ratio (SNR) even with low-resolution converters at both sides. To maintain the same rate loss at high SNR, we find that one more bit of DAC resolution is needed when the number of scheduled users grows such that $\beta$ increases fourfold. More specifically, for a one-bit rate loss requirement, $b_{DA}$ can be set to $\left\lceil b_{AD}+\frac{1}{2}\log\beta \right\rceil$ given $b_{AD}$. Similar observations on ADCs are also obtained with numerical verifications.
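The closing rule can be checked with a few lines of arithmetic. The base-2 logarithm below is inferred from the statement that quadrupling $\beta$ costs one extra DAC bit; it is our reading of the formula, not a detail spelled out in the abstract.

```python
import math

def dac_bits(b_ad, beta):
    """DAC resolution suggested by the one-bit rate loss rule quoted above:
    b_DA = ceil(b_AD + 0.5 * log2(beta)), where beta is the user load
    ratio and b_AD the ADC resolution."""
    return math.ceil(b_ad + 0.5 * math.log2(beta))
```

For example, with 4-bit ADCs, quadrupling the load ratio from 1 to 4 raises the required DAC resolution from 4 to 5 bits, matching the stated scaling.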
- This paper studies a new unmanned aerial vehicle (UAV)-enabled wireless power transfer (WPT) system, where a UAV-mounted energy transmitter (ET) broadcasts wireless energy to charge distributed energy receivers (ERs) on the ground. In particular, we consider a basic two-user scenario, and investigate how the UAV can optimally exploit its mobility to maximize the amount of energy transferred to the two ERs during a given charging period. We characterize the achievable energy region of the two ERs, by optimizing the UAV's trajectory subject to a maximum speed constraint. We show that when the distance between the two ERs is smaller than a certain threshold, the boundary of the energy region is achieved when the UAV hovers above a fixed location between them for all time; while when their distance is larger than the threshold, to achieve the boundary of the energy region, the UAV in general needs to hover and fly between two different locations above the line connecting them. Numerical results show that the optimized UAV trajectory can significantly improve the WPT efficiency and fairness of the two ERs, especially when the UAV's maximum speed is large and/or the charging duration is long.
- Jun 12 2017 cs.CL arXiv:1706.02861v3 Endowing a chatbot with personality or an identity is quite challenging but critical to deliver more realistic and natural conversations. In this paper, we address the issue of generating responses that are coherent to a pre-specified agent profile. We design a model consisting of three modules: a profile detector to decide whether a post should be responded using the profile and which key should be addressed, a bidirectional decoder to generate responses forward and backward starting from a selected profile value, and a position detector that predicts a word position from which decoding should start given a selected profile value. We show that general conversation data from social media can be used to generate profile-coherent responses. Manual and automatic evaluation shows that our model can deliver more coherent, natural, and diversified responses.
- Jun 09 2017 cs.CL arXiv:1706.02459v1 Current Chinese social media text summarization models are based on an encoder-decoder framework. Although the generated summaries are literally similar to the source texts, they have low semantic relevance. In this work, our goal is to improve the semantic relevance between source texts and summaries for Chinese social media summarization. We introduce a Semantic Relevance Based neural model to encourage high semantic similarity between texts and summaries. In our model, the source text is represented by a gated attention encoder, while the summary representation is produced by a decoder. Besides, the similarity score between the representations is maximized during training. Our experiments show that the proposed model outperforms baseline systems on a social media corpus.
- Jun 02 2017 cs.CV arXiv:1706.00212v2 Key to automatically generating natural scene images is properly arranging the various spatial elements, especially in the depth direction. To this end, we introduce a novel depth structure preserving scene image generation network (DSP-GAN), which favors a hierarchical and heterogeneous architecture, for the purpose of depth structure preserving scene generation. The main trunk of the proposed infrastructure is built on a Hawkes point process that models the spatial dependency between different depth layers. Within each layer, generative adversarial sub-networks are trained collaboratively to generate realistic scene components, conditioned on the layer information produced by the point process. We evaluate our model on a subset of the SUN dataset with annotated scene images and demonstrate that it is capable of generating depth-realistic natural scene images.
- Jun 02 2017 cs.CR arXiv:1706.00324v3 We present a new, but simple, randomised order-preserving encryption (OPE) scheme based on the general approximate common divisor problem (GACDP). This appears to be the first OPE scheme to be based on a computational hardness primitive, rather than a security game. This scheme requires only $O(1)$ arithmetic operations for encryption and decryption. We show that the scheme has optimal information leakage under the assumption of uniformly distributed plaintexts, and we indicate that this property extends to some non-uniform distributions. We report on an extensive evaluation of our algorithms. The results clearly demonstrate highly favourable execution times in comparison with existing OPE schemes.
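To illustrate the general shape of a GACDP-style OPE scheme (ciphertexts as noisy multiples of a secret divisor, O(1) arithmetic per operation), here is a toy sketch. It is emphatically NOT the paper's construction or parameterisation, and has no security guarantees; it only demonstrates why order is preserved when the per-encryption noise stays below the secret divisor.

```python
import secrets

def ope_encrypt(m, q):
    """Toy randomised order-preserving encryption: c = m * q + r with
    fresh noise 0 <= r < q. For distinct plaintexts m < m', any noise
    choices satisfy m*q + r < (m+1)*q <= m'*q + r', so order is kept."""
    r = secrets.randbelow(q)    # fresh randomness on every encryption
    return m * q + r

def ope_decrypt(c, q):
    """Decryption is a single integer division by the secret divisor q."""
    return c // q
```

Note that equal plaintexts encrypt to different ciphertexts (the scheme is randomised), so only strict order among distinct plaintexts is meaningful.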
- May 30 2017 cs.CV arXiv:1705.09912v1 Most of the existing denoising algorithms are developed for grayscale images, while it is not trivial to extend them to color image denoising because the noise statistics in the R, G, B channels can be very different for real noisy images. In this paper, we propose a multi-channel (MC) optimization model for real color image denoising under the weighted nuclear norm minimization (WNNM) framework. We concatenate the RGB patches to make use of the channel redundancy, and introduce a weight matrix to balance the data fidelity of the three channels in consideration of their different noise statistics. The proposed MC-WNNM model does not have an analytical solution. We reformulate it into a linear equality-constrained problem and solve it with the alternating direction method of multipliers. Each alternating update step has a closed-form solution and convergence is guaranteed. Extensive experiments on both synthetic and real noisy image datasets demonstrate the superiority of the proposed MC-WNNM over state-of-the-art denoising methods.
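The core building block of WNNM-style denoising is the weighted singular-value soft-thresholding operator: each singular value of a noisy patch matrix is shrunk by its own weight, so that small, noise-dominated singular values are suppressed more. The sketch below shows only this per-step proximal update, not the full MC-WNNM solver, which additionally balances the R/G/B channels with a weight matrix and runs ADMM.

```python
import numpy as np

def weighted_svt(Y, weights):
    """Weighted singular value (soft-)thresholding: shrink each singular
    value s_i of Y by w_i and reconstruct. Larger weights kill small
    singular values entirely, yielding a low-rank denoised estimate."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    s_shrunk = np.maximum(s - np.asarray(weights), 0.0)
    return U @ np.diag(s_shrunk) @ Vt
```

For example, thresholding a matrix with singular values (5, 1) by weights (2, 2) leaves a rank-one matrix with singular value 3: the weaker component, presumed noise, is removed.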
- We consider the problem of distributed statistical machine learning in adversarial settings, where some unknown and time-varying subset of working machines may be compromised and behave arbitrarily to prevent an accurate model from being learned. This setting captures the potential adversarial attacks faced by Federated Learning -- a modern machine learning paradigm proposed by Google researchers that has been intensively studied for ensuring user privacy. Formally, we focus on a distributed system consisting of a parameter server and $m$ working machines. Each working machine keeps $N/m$ data samples, where $N$ is the total number of samples. The goal is to collectively learn the underlying true model parameter of dimension $d$. In classical batch gradient descent methods, the gradients reported to the server by the working machines are aggregated via simple averaging, which is vulnerable to a single Byzantine failure. In this paper, we propose a Byzantine gradient descent method based on the geometric median of means of the gradients. We show that our method can tolerate $q \le (m-1)/2$ Byzantine failures, and the parameter estimate converges in $O(\log N)$ rounds with an estimation error of $\sqrt{d(2q+1)/N}$, hence approaching the optimal error rate $\sqrt{d/N}$ in the centralized and failure-free setting. The total computational complexity of our algorithm is $O((Nd/m) \log N)$ at each working machine and $O(md + kd \log^3 N)$ at the central server, and the total communication cost is $O(m d \log N)$. We further provide an application of our general results to the linear regression problem. A key challenge in the above problem is that Byzantine failures create arbitrary and unspecified dependencies among the iterations and the aggregated gradients. We prove that the aggregated gradient converges uniformly to the true gradient function.
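The aggregation rule itself is easy to sketch: partition the reported gradients into groups, average within each group, then take the geometric median of the group means (computed here by Weiszfeld iteration). The code below is a minimal illustration with made-up names, not the authors' implementation.

```python
import numpy as np

def geometric_median(points, iters=100, eps=1e-8):
    """Weiszfeld iteration for the geometric median of a set of vectors."""
    z = points.mean(axis=0)
    for _ in range(iters):
        d = np.maximum(np.linalg.norm(points - z, axis=1), eps)  # avoid /0
        w = 1.0 / d
        z_new = (w[:, None] * points).sum(axis=0) / w.sum()
        if np.linalg.norm(z_new - z) < eps:
            break
        z = z_new
    return z

def robust_aggregate(gradients, k):
    """Geometric median of means: split the m reported gradients into k
    groups, average within each group, then take the geometric median,
    which has breakdown point 1/2 and so resists a minority of outliers."""
    groups = np.array_split(gradients, k)
    means = np.stack([g.mean(axis=0) for g in groups])
    return geometric_median(means)
```

A single Byzantine machine can corrupt only the group it lands in; as long as corrupted groups remain a minority, the geometric median stays near the honest gradient.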
- May 15 2017 cs.CV arXiv:1705.04505v1 Most existing image denoising methods learn image priors from either external data or the noisy image itself to remove noise. However, priors learned from external data may not be adaptive to the image to be denoised, while priors learned from the given noisy image may not be accurate due to the interference of the corrupting noise. Meanwhile, the noise in real-world noisy images is very complex and hard to describe with simple distributions such as the Gaussian, making real noisy image denoising a very challenging problem. We propose to exploit the information in both external data and the given noisy image, and develop an external prior guided internal prior learning method for real noisy image denoising. We first learn external priors from an independent set of clean natural images. With the aid of the learned external priors, we then learn internal priors from the given noisy image to refine the prior model. The external and internal priors are formulated as a set of orthogonal dictionaries to efficiently reconstruct the desired image. Extensive experiments are performed on several real noisy image datasets. The proposed method demonstrates highly competitive denoising performance, outperforming state-of-the-art denoising methods including those designed for real noisy images.
- Small cell base stations (SBSs) endowed with cloud-like computing capabilities are considered a key enabler of edge computing (EC), which provides ultra-low latency and location-awareness for a variety of emerging mobile applications and the Internet of Things. However, due to the limited computation resources of an individual SBS, providing high-quality computation services to its users faces significant challenges when it is overloaded with an excessive amount of computation workload. In this paper, we propose collaborative edge computing among SBSs by forming SBS coalitions to share computation resources with each other, thereby accommodating more computation workload in the edge system and reducing reliance on the remote cloud. A novel SBS coalition formation algorithm is developed based on coalitional game theory to cope with various new challenges in small-cell-based edge systems, including the co-provisioning of radio access and computing services, cooperation incentives, and potential security risks. To address these challenges, the proposed method (1) allows collaboration at both the user-SBS association stage and the SBS peer offloading stage by exploiting the ultra-dense deployment of SBSs, (2) develops a payment-based incentive mechanism that implements proportionally fair utility division to form stable SBS coalitions, and (3) builds a social trust network for managing security risks among SBSs due to collaboration. Systematic simulations in practical scenarios are carried out to evaluate the efficacy and performance of the proposed method, showing that substantial edge computing performance improvements can be achieved.
- We propose a method for demonstrating sub-community structure in scientific networks of relatively small size by analyzing databases of publications. Research relationships between the network members can be visualized as a graph with vertices corresponding to authors and with edges indicating joint authorship. Using a fast clustering algorithm combined with a graph layout algorithm, we demonstrate how to display these clustering results in an attractive and informative way. The small size of the graph allows us to develop tools that keep track of how these research sub-communities evolve in time, as well as to present the research articles that create the links between the network members. These tools are included in a web app, where the visitor can easily identify the various sub-communities, also providing valuable information for administrative purposes. Our method was developed for the GEAR mathematical network and can be applied to other networks.
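As a minimal illustration of the pipeline's first step, the sketch below builds a co-authorship graph from author lists and extracts connected components as a crude stand-in for the fast clustering algorithm (which the abstract does not name); all identifiers are illustrative.

```python
from itertools import combinations

def coauthor_graph(papers):
    """Build an undirected co-authorship graph: vertices are authors,
    edges link any two authors who share at least one paper."""
    adj = {}
    for authors in papers:
        for a in authors:
            adj.setdefault(a, set())
        for a, b in combinations(authors, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj

def components(adj):
    """Connected components of the graph -- a crude first cut at
    sub-communities before any finer modularity-based clustering."""
    seen, comps = set(), []
    for v in adj:
        if v in seen:
            continue
        stack, comp = [v], set()
        while stack:
            u = stack.pop()
            if u in comp:
                continue
            comp.add(u)
            stack.extend(adj[u] - comp)
        seen |= comp
        comps.append(comp)
    return comps
```

A real deployment would feed this adjacency structure into a community-detection and layout algorithm; the component step merely shows how joint authorship induces the graph.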
- May 04 2017 cs.CV arXiv:1705.01247v3 In this paper, we propose a simple but effective semantic part-based weighting aggregation (PWA) for image retrieval. The proposed PWA utilizes the discriminative filters of deep convolutional layers as part detectors. Moreover, we propose an effective unsupervised strategy to select part detectors to generate "probabilistic proposals", which highlight certain discriminative parts of objects and suppress background noise. The final global PWA representation is then acquired by aggregating the regional representations weighted by the selected "probabilistic proposals" corresponding to various semantic content. We conduct comprehensive experiments on four standard datasets and show that our unsupervised PWA outperforms the state-of-the-art unsupervised and supervised aggregation methods. Code is available at https://github.com/XJhaoren/PWA.
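The aggregation step can be sketched as follows: each selected channel of a convolutional feature map acts as a part detector whose normalised activation map weights a sum-pooling of all channels. This is a simplified illustration with made-up names; real PWA pipelines typically add further normalisation and whitening steps that are omitted here.

```python
import numpy as np

def pwa_aggregate(feature_maps, selected, eps=1e-12):
    """Part-based weighted aggregation (sketch).
    feature_maps: (C, H, W) conv activations; selected: channel indices used
    as part detectors. Each selected channel yields a spatial weight map (a
    "probabilistic proposal"); the descriptor concatenates the weighted
    sum-pooled features, one C-dimensional block per part detector."""
    blocks = []
    for c in selected:
        w = feature_maps[c]
        w = w / (w.sum() + eps)                             # proposal weights
        blocks.append((feature_maps * w).sum(axis=(1, 2)))  # weighted pooling
    v = np.concatenate(blocks)
    return v / (np.linalg.norm(v) + eps)                    # L2-normalise
```

Channels that fire on discriminative object parts concentrate the pooling weight there, which is what suppresses background clutter in the final descriptor.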
- Wireless surveillance is becoming increasingly important to protect public security by legitimately eavesdropping suspicious wireless communications. This paper studies the wireless surveillance of a two-hop suspicious communication link by a half-duplex legitimate monitor. By exploiting the suspicious link's two-hop nature, the monitor can adaptively choose among the following three eavesdropping modes to improve the eavesdropping performance: (I) passive eavesdropping, intercepting both hops to decode the message collectively; (II) proactive eavesdropping via noise jamming over the first hop; and (III) proactive eavesdropping via hybrid jamming over the second hop. In both proactive eavesdropping modes, the (noise/hybrid) jamming over one hop serves to reduce the end-to-end communication rate of the suspicious link, thereby making interception over the other hop easier. Under this setup, we maximize the eavesdropping rate at the monitor by jointly optimizing the eavesdropping mode selection as well as the transmit power for noise and hybrid jamming. Numerical results show that the eavesdropping mode selection significantly improves the eavesdropping rate as compared to each individual eavesdropping mode.
- We present an optical flow estimation approach that operates on the full four-dimensional cost volume. This direct approach shares the structural benefits of leading stereo matching pipelines, which are known to yield high accuracy. To this day, such approaches have been considered impractical due to the size of the cost volume. We show that the full four-dimensional cost volume can be constructed in a fraction of a second due to its regularity. We then exploit this regularity further by adapting semi-global matching to the four-dimensional setting. This yields a pipeline that achieves significantly higher accuracy than state-of-the-art optical flow methods while being faster than most. Our approach outperforms all published general-purpose optical flow methods on both Sintel and KITTI 2015 benchmarks.
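Constructing the full four-dimensional cost volume is indeed cheap thanks to its regularity: for every pixel and every displacement in a search window, store a feature distance. A minimal sketch follows; the feature maps, search radius, and choice of Euclidean distance are illustrative, not the paper's exact matching cost.

```python
import numpy as np

def cost_volume(f1, f2, r):
    """Full 4-D cost volume for optical flow: for each pixel (i, j) of f1 and
    each displacement (u, v) in [-r, r]^2, store the feature distance to the
    displaced f2. f1, f2: (H, W, C) feature maps; returns (H, W, 2r+1, 2r+1).
    The inner loop is a single whole-image shift, which is why the regular
    structure makes construction fast."""
    H, W, C = f1.shape
    d = 2 * r + 1
    pad = np.pad(f2, ((r, r), (r, r), (0, 0)), mode="edge")
    cv = np.empty((H, W, d, d))
    for du in range(d):
        for dv in range(d):
            shifted = pad[du:du + H, dv:dv + W]  # f2 displaced by (du-r, dv-r)
            cv[:, :, du, dv] = np.linalg.norm(f1 - shifted, axis=2)
    return cv
```

A semi-global matching pass would then regularise this volume along scanlines in all four displacement dimensions, as the abstract describes.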
- This paper proposes a novel joint computation and communication cooperation approach in mobile edge computing (MEC) systems, which enables user cooperation in both computation and communication for improving the MEC performance. In particular, we consider a basic three-node MEC system that consists of a user node, a helper node, and an access point (AP) node attached with an MEC server. We focus on the user's latency-constrained computation over a finite block, and develop a four-slot protocol for implementing the joint computation and communication cooperation. Under this setup, we jointly optimize the computation and communication resource allocation at both the user and the helper, so as to minimize their total energy consumption subject to the user's computation latency constraint. We provide the optimal solution to this problem. Numerical results show that the proposed joint cooperation approach significantly improves the computation capacity and the energy efficiency at the user and helper nodes, as compared to other benchmark schemes without such a joint design.
- When tracking user-specific online activities, each user's preference is revealed in the form of choices and comparisons. For example, a user's purchase history tracks her choices, i.e. which item was chosen among a subset of offerings. A user's comparisons are observed either explicitly as in movie ratings or implicitly as in viewing times of news articles. Given such individualized ordinal data, we address the problem of collaboratively learning representations of the users and the items. The learned features can be used to predict a user's preference for an unseen item, to be used in recommendation systems. This also allows one to compute similarities among users and items to be used for categorization and search. Motivated by the empirical successes of the MultiNomial Logit (MNL) model in marketing and transportation, and also more recent successes in word embedding and crowdsourced image embedding, we pose this problem as learning the MNL model parameters that best explain the data. We propose a convex optimization for learning the MNL model, and show that it is minimax optimal up to a logarithmic factor by comparing its performance to a fundamental lower bound. This characterizes the minimax sample complexity of the problem, and proves that the proposed estimator cannot be improved upon other than by a logarithmic factor. Further, the analysis identifies how the accuracy depends on the topology of sampling via the spectrum of the sampling graph. This provides a guideline for designing surveys when one can choose which items are to be compared. This is accompanied by numerical simulations on synthetic and real datasets confirming our theoretical predictions.
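For intuition, fitting an MNL model from choice data reduces to a concave maximum-likelihood problem over item utilities. The sketch below fits scalar utilities by plain gradient ascent; the paper instead proposes a convex program for learning full low-rank embeddings, so the names, step sizes, and simplifications here are illustrative only.

```python
import numpy as np

def fit_mnl(choices, n_items, lr=0.5, iters=500):
    """Fit item utilities theta of a MultiNomial Logit model by gradient
    ascent on the log-likelihood. Under MNL, the probability of choosing
    item c from an offered set S is exp(theta_c) / sum_{s in S} exp(theta_s).
    choices: list of (offered_set, chosen_item) pairs."""
    theta = np.zeros(n_items)
    for _ in range(iters):
        grad = np.zeros(n_items)
        for offered, chosen in choices:
            s = np.array(offered)
            p = np.exp(theta[s] - theta[s].max())  # stable softmax over S
            p /= p.sum()
            grad[s] -= p                           # expected choice indicator
            grad[chosen] += 1.0                    # observed choice indicator
        theta += lr * grad / len(choices)
        theta -= theta.mean()                      # fix the shift invariance
    return theta
```

With items chosen more often when offered, the recovered utilities order the items accordingly, which is the basic signal the embedding method exploits.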
- Suppose there are $N$ distributed databases, each storing a full set of $M$ independent files. A user wants to retrieve $r$ out of the $M$ files without revealing the identity of the $r$ files. When $r=1$ this is the classic problem of private information retrieval (PIR). In this paper we study the problem of private multi-file retrieval (PMFR), which covers the case of general $r$. We first prove an upper bound on the capacity of PMFR schemes, which indicates the minimum possible download size per unit of retrieved files. We then design a general PMFR scheme that attains the upper bound when $r\geq\frac{M}{2}$, thus achieving the optimal communication cost. As $r$ decreases, we show that the trivial approach of executing $r$ independent PIR instances achieves near-optimal communication cost. Compared with capacity-achieving PIR schemes, our PMFR scheme reduces the number of subpackages needed for each file from $N^M$ to $N^2$, implying a great reduction in implementation complexity.
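For reference, the $r=1$ replicated-storage case admits the classic two-server XOR scheme, which makes the privacy mechanism concrete: each server alone sees a uniformly random subset and learns nothing about the desired index. This is a textbook toy, not the paper's PMFR construction.

```python
import secrets
from functools import reduce

def xor_pir(db, i):
    """Classic 2-server PIR toy (r = 1, fully replicated servers, 1-bit
    files): query server 1 with a uniformly random subset S of indices and
    server 2 with S symmetric-difference {i}. The XOR of the two answers
    cancels every index except i, while each query alone is uniform and so
    reveals nothing about i."""
    M = len(db)
    S = {j for j in range(M) if secrets.randbits(1)}
    S2 = S ^ {i}                       # differs from S exactly at index i
    a1 = reduce(lambda x, y: x ^ y, (db[j] for j in S), 0)
    a2 = reduce(lambda x, y: x ^ y, (db[j] for j in S2), 0)
    return a1 ^ a2                     # XOR over the symmetric difference
```

Coded PMFR schemes replace replication with an MDS code and sub-packetize the files, which is exactly where the $N^M$ versus $N^2$ comparison in the abstract arises.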
- Mobile Edge Computing (MEC) (a.k.a. fog computing) has recently emerged to enable low-latency and location-aware data processing at the edge of mobile networks. Providing grid power supply in support of MEC, however, is costly and even infeasible, thus mandating on-site renewable energy as a major or even sole power supply in many scenarios. Nonetheless, the high intermittency and unpredictability of energy harvesting create many new challenges for performing effective MEC. In this paper, we develop an algorithm called GLOBE that performs joint geographical load balancing (GLB) (for computation workload) and admission control (for communication data traffic), for optimizing the system performance of a network of MEC-enabled base stations. By leveraging Lyapunov optimization with the perturbation technique, GLOBE operates online without requiring future system information and addresses significant challenges caused by battery state dynamics and energy causality constraints. We prove that GLOBE achieves a close-to-optimal system performance compared to the offline algorithm that knows full future information, and present a critical tradeoff between battery capacity and system performance. Simulation results validate our analysis and demonstrate the superior performance of GLOBE compared to benchmark algorithms.
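The drift-plus-penalty principle behind Lyapunov-based controllers like GLOBE can be illustrated with a one-queue admission-control toy: each slot, admit traffic only when the utility weight V outweighs the current backlog, which bounds the queue at the price of some lost throughput. This sketch is illustrative and far simpler than GLOBE's battery-perturbed, multi-station design.

```python
def drift_plus_penalty(arrivals, service, V):
    """Toy Lyapunov drift-plus-penalty controller. Each slot, the admission
    decision x in {0, a} minimises Q*x - V*x (drift plus penalty), i.e. admit
    iff V >= Q; then the queue is served. Larger V favours throughput over
    delay, and the backlog stays bounded by V + max arrival."""
    Q, admitted = 0.0, 0.0
    for a, s in zip(arrivals, service):
        x = a if V >= Q else 0.0   # greedy per-slot decision, no future info
        admitted += x
        Q = max(Q + x - s, 0.0)    # queue update with service s
    return admitted, Q
```

The same pattern, with virtual queues for the battery and energy-causality constraints, is what lets such algorithms run online with provable distance to the offline optimum.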
- The (ultra-)dense deployment of small-cell base stations (SBSs) endowed with cloud-like computing functionalities paves the way for pervasive mobile edge computing (MEC), enabling ultra-low latency and location-awareness for a variety of emerging mobile applications and the Internet of Things. To handle spatially uneven computation workloads in the network, cooperation among SBSs via workload peer offloading is essential to avoid large computation latency at overloaded SBSs and provide high quality of service to end users. However, performing effective peer offloading faces many unique challenges in small cell networks due to limited energy resources committed by self-interested SBS owners, uncertainties in the system dynamics and co-provisioning of radio access and computing services. This paper develops a novel online SBS peer offloading framework, called OPEN, by leveraging the Lyapunov technique, in order to maximize the long-term system performance while keeping the energy consumption of SBSs below individual long-term constraints. OPEN works online without requiring information about future system dynamics, yet provides provably near-optimal performance compared to the oracle solution that has the complete future information. In addition, this paper formulates a novel peer offloading game among SBSs, analyzes its equilibrium and efficiency loss in terms of the price of anarchy in order to thoroughly understand SBSs' strategic behaviors, thereby enabling decentralized and autonomous peer offloading decision making. Extensive simulations are carried out and show that peer offloading among SBSs dramatically improves the edge computing performance.