results for au:Chen_L in:cs

- Mar 28 2017 cs.CV arXiv:1703.08912v1 In this paper, we investigate the contribution of color names to salient object detection. Each input image is first converted to the color name space, which consists of 11 probabilistic channels. By exploring the topological structure relationship between the figure and the ground, we obtain a saliency map through a linear combination of a set of sequential attention maps. To overcome the limitation of only exploiting the surroundedness cue, two global cues with respect to color names are invoked to guide the computation of another weighted saliency map. Finally, we integrate the two saliency maps into a unified framework to infer the saliency result. In addition, an improved post-processing procedure is introduced to effectively suppress the background while uniformly highlighting the salient objects. Experimental results show that the proposed model produces more accurate saliency maps and performs well against 23 saliency models in terms of three evaluation metrics on three public datasets.
- Mar 27 2017 cs.CV arXiv:1703.08388v1 Face recognition (FR) methods report significant performance gains by adopting convolutional neural network (CNN) based learning methods. Although CNNs are mostly trained by optimizing the softmax loss, the recent trend shows an improvement of accuracy with different strategies, such as task-specific CNN learning with different loss functions, fine-tuning on the target dataset, metric learning and concatenating features from multiple CNNs. Incorporating these tasks obviously requires additional effort. Moreover, it demotivates the discovery of efficient CNN models for FR that are trained only with identity labels. We focus on this fact and propose an easily trainable, single-CNN-based FR method. Our CNN model exploits the residual learning framework. Additionally, it uses normalized features to compute the loss. Our extensive experiments show excellent generalization on different datasets. We obtain very competitive and state-of-the-art results on the LFW, IJB-A, YouTube Faces and CACD datasets.
- The (ultra-)dense deployment of small-cell base stations (SBSs) endowed with cloud-like computing functionalities paves the way for pervasive mobile edge computing (MEC), enabling ultra-low latency and location-awareness for a variety of emerging mobile applications and the Internet of Things. To handle spatially uneven computation workloads in the network, cooperation among SBSs via workload peer offloading is essential to avoid large computation latency at overloaded SBSs and provide high quality of service to end users. However, performing effective peer offloading faces many unique challenges in small cell networks due to limited energy resources committed by self-interested SBS owners, uncertainties in the system dynamics and co-provisioning of radio access and computing services. This paper develops a novel online SBS peer offloading framework, called OPEN, by leveraging the Lyapunov technique, in order to maximize the long-term system performance while keeping the energy consumption of SBSs below individual long-term constraints. OPEN works online without requiring information about future system dynamics, yet provides provably near-optimal performance compared to the oracle solution that has the complete future information. In addition, this paper formulates a novel peer offloading game among SBSs, analyzes its equilibrium and efficiency loss in terms of the price of anarchy in order to thoroughly understand SBSs' strategic behaviors, thereby enabling decentralized and autonomous peer offloading decision making. Extensive simulations are carried out and show that peer offloading among SBSs dramatically improves the edge computing performance.
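The Lyapunov technique referenced above typically enforces long-term energy constraints through per-SBS virtual queues; a minimal sketch of the queue update, with all names and numbers hypothetical:

```python
def update_virtual_queue(q, energy_used, energy_budget):
    """Virtual energy-deficit queue: grows when a slot's energy use
    exceeds the long-term per-slot budget, drains otherwise. Keeping
    this queue stable keeps average consumption within the budget."""
    return max(q + energy_used - energy_budget, 0.0)

# A queue that stays bounded over time indicates the long-term
# energy constraint is being met on average.
q = 0.0
for energy_used in [3.0, 1.0, 2.0, 4.0, 1.0]:
    q = update_virtual_queue(q, energy_used, energy_budget=2.5)
```

In OPEN-style designs, offloading decisions then trade off a performance objective against the current queue lengths, which is what yields the long-term guarantee without future information.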
- Mobile edge computing (a.k.a. fog computing) has recently emerged to enable in-situ processing of delay-sensitive applications at the edge of mobile networks. Providing grid power supply in support of mobile edge computing, however, is costly and even infeasible (in certain rugged or under-developed areas), thus mandating on-site renewable energy as a major or even sole power supply in increasingly many scenarios. Nonetheless, the high intermittency and unpredictability of renewable energy make it very challenging to deliver a high quality of service to users in energy harvesting mobile edge computing systems. In this paper, we address the challenge of incorporating renewables into mobile edge computing and propose an efficient reinforcement learning-based resource management algorithm, which learns on-the-fly the optimal policy of dynamic workload offloading (to the centralized cloud) and edge server provisioning to minimize the long-term system cost (including both service delay and operational cost). Our online learning algorithm uses a decomposition of the (offline) value iteration and (online) reinforcement learning, thus achieving a significant improvement of learning rate and run-time performance when compared to standard reinforcement learning algorithms such as Q-learning. We prove the convergence of the proposed algorithm and analytically show that the learned policy has a simple monotone structure amenable to practical implementation. Our simulation results validate the efficacy of our algorithm, which significantly improves the edge computing performance compared to fixed or myopic optimization schemes and conventional reinforcement learning algorithms.
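For context, the standard tabular Q-learning baseline that the abstract compares against performs the following bootstrapped update (this is the textbook method, not the paper's decomposed algorithm; all state and action names are illustrative):

```python
def q_learning_step(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning update: nudge Q(s, a) toward the target
    r + gamma * max_b Q(s_next, b)."""
    best_next = max(Q.get((s_next, b), 0.0) for b in actions)
    old = Q.get((s, a), 0.0)
    Q[(s, a)] = old + alpha * (r + gamma * best_next - old)
    return Q[(s, a)]

# Example: one offloading decision in state "high_load" incurs cost
# (reward -2), after which the system moves to state "low_load".
Q = {}
v = q_learning_step(Q, "high_load", "offload", -2.0, "low_load",
                    actions=["offload", "local"])
```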
- Mar 14 2017 cs.CY arXiv:1703.04174v1 Structured Peer Learning (SPL) is a form of peer-based supplemental instruction that focuses on mentoring, guidance, and development of technical, communication, and social skills in both the students receiving assistance and the students in teaching roles. This paper explores the methodology, efficacy, and reasoning behind the practical realization of an SPL program designed to increase student knowledge and success in undergraduate Computer Science courses. Students expressed an increased level of comfort when asking for help from student teachers versus traditional educational resources, historically showed an increased average grade in lower-level courses, and felt that the program positively impacted their desire to continue in or switch to a Computer Science major. Additionally, results indicated that advances in programming, analytical thinking, and abstract analysis skills were evident in not only the students but also the student teachers, suggesting a strong bidirectional flow of knowledge.
- Mar 06 2017 cs.CV arXiv:1703.01243v1 One of the major challenges in Minimally Invasive Surgery (MIS) such as laparoscopy is the lack of depth perception. In recent years, laparoscopic scene tracking and surface reconstruction have been a focus of investigation to provide rich additional information to aid the surgical process and compensate for the depth perception issue. However, robust 3D surface reconstruction and augmented reality with depth perception on the reconstructed scene are yet to be reported. This paper presents our work in this area. First, we adopt a state-of-the-art visual simultaneous localization and mapping (SLAM) framework - ORB-SLAM - and extend the algorithm for use in MIS scenes for reliable endoscopic camera tracking and salient point mapping. We then develop a robust global 3D surface reconstruction framework based on the sparse point clouds extracted from the SLAM framework. Our approach is to combine an outlier removal filter with a Moving Least Squares smoothing algorithm and then employ Poisson surface reconstruction to obtain smooth surfaces from the unstructured sparse point cloud. Our proposed method has been quantitatively evaluated against ground-truth camera trajectories and the organ model surface used to render the synthetic simulation videos. In vivo laparoscopic videos used in the tests have demonstrated the robustness and accuracy of our proposed framework for both camera tracking and surface reconstruction, illustrating the potential of our algorithm for depth augmentation and depth-corrected augmented reality in MIS with monocular endoscopes.
- Mar 03 2017 cs.DS arXiv:1703.00575v1 In a recent paper, Braun, Chung and Graham [1] addressed a single-processor scheduling problem with time restrictions. Given a fixed integer $B \geq 2$, there is a set of jobs to be processed by a single processor subject to the following B-constraint: for any real $x$, no unit time interval $[x, x+1)$ is allowed to intersect more than $B$ jobs. The problem has been shown to be NP-hard when $B$ is part of the input, and it was left as an open question whether it remains NP-hard when $B$ is fixed [1, 5, 7]. This paper answers this question by proving that the problem is NP-hard even when $B=2$. A PTAS is also presented for any constant $B \geq 2$.
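The B-constraint above is easy to check for a concrete schedule: a job occupying $[s, e)$ intersects the unit interval $[x, x+1)$ exactly when $s - 1 < x < e$, so the intersection count only changes at the breakpoints $s - 1$ and $e$. A hypothetical checker:

```python
def max_unit_interval_load(jobs):
    """Max number of jobs intersecting any unit interval [x, x+1).
    Job (s, e) intersects [x, x+1) iff s - 1 < x < e, so the count is
    piecewise constant with breakpoints at s - 1 and e; testing one x
    inside each piece between consecutive breakpoints suffices."""
    pts = sorted({s - 1 for s, e in jobs} | {e for s, e in jobs})
    candidates = [(a + b) / 2 for a, b in zip(pts, pts[1:])]
    best = 0
    for x in candidates:
        best = max(best, sum(1 for s, e in jobs if s - 1 < x < e))
    return best

def satisfies_b_constraint(jobs, B):
    """True iff no unit time interval intersects more than B jobs."""
    return max_unit_interval_load(jobs) <= B
```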
- Mar 01 2017 cs.CV arXiv:1702.08692v1 Fine-grained recognition is a challenging task due to the small intra-category variances. Most top-performing fine-grained recognition methods leverage parts of objects for better performance, and therefore require part annotations that are extremely expensive to obtain. In this paper, we propose a novel cascaded deep CNN detection framework for fine-grained recognition which is trained to detect the whole object without considering parts. Nevertheless, most current top-performing detection networks use the N+1 class (N object categories plus background) softmax loss, and the background category, with many more training samples, dominates the feature learning progress, so that the features are not good for object categories with fewer samples. To bridge this gap, we introduce a cascaded structure to eliminate background and exploit a one-vs-rest loss to capture more minute variances among different subordinate categories. Experiments show that our proposed recognition framework achieves performance comparable to state-of-the-art, part-free, fine-grained recognition methods on the CUB-200-2011 Bird dataset. Moreover, our method even outperforms most part-based methods, while needing no part annotations at the training stage and no annotations at all at the test stage.
- In this paper we study the use of coding techniques to accelerate machine learning (ML). Coding techniques, such as prefix codes, have been extensively studied and used to accelerate low-level data processing primitives such as scans in a relational database system. However, there is little work on how to exploit them to accelerate ML algorithms. In fact, applying coding techniques for faster ML faces a unique challenge: one needs to consider both how the codes fit into the optimization algorithm used to train a model, and the interplay between the model structure and the coding scheme. Surprisingly and intriguingly, our study demonstrates that a slight variant of the classical Lempel-Ziv-Welch (LZW) coding scheme is a good fit for several popular ML algorithms, resulting in substantial runtime savings. Comprehensive experiments on several real-world datasets show that our LZW-based ML algorithms exhibit speedups of up to 31x compared to a popular and state-of-the-art ML library, with no changes to ML accuracy, even though the implementations of our LZW variants are not heavily tuned. Thus, our study reveals a new avenue for accelerating ML algorithms using coding techniques and we hope this opens up a new direction for more research.
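For reference, the classical LZW scheme that the paper above builds on maintains a growing dictionary of seen substrings and emits the code of the longest known prefix at each step (the paper's ML-oriented variant differs in details not shown here):

```python
def lzw_encode(text):
    """Textbook LZW: start from the 256 single-byte codes, emit the
    code of the longest dictionary match, and add that match plus the
    next character as a new dictionary entry."""
    dictionary = {chr(i): i for i in range(256)}
    w, out = "", []
    for ch in text:
        if w + ch in dictionary:
            w += ch
        else:
            out.append(dictionary[w])
            dictionary[w + ch] = len(dictionary)
            w = ch
    if w:
        out.append(dictionary[w])
    return out
```

Repeated substrings compress well: `lzw_encode("ababab")` emits codes for "a", "b", then reuses the new "ab" entry twice.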
- In the Best-$k$-Arm problem, we are given $n$ stochastic bandit arms, each associated with an unknown reward distribution. We are required to identify the $k$ arms with the largest means by taking as few samples as possible. In this paper, we make progress towards a complete characterization of the instance-wise sample complexity bounds for the Best-$k$-Arm problem. On the lower bound side, we obtain a novel complexity term to measure the sample complexity that every Best-$k$-Arm instance requires. This is derived by an interesting and nontrivial reduction from the Best-$1$-Arm problem. We also provide an elimination-based algorithm that matches the instance-wise lower bound within doubly-logarithmic factors. The sample complexity of our algorithm strictly dominates the state-of-the-art for Best-$k$-Arm (modulo constant factors).
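As a simpler illustration of the elimination idea (sequential halving, not the paper's instance-optimal algorithm), each round samples every surviving arm equally and discards the worse half until $k$ arms remain:

```python
import random

def halving_top_k(pull, n, k, budget_per_round=300):
    """Halving-style elimination sketch: sample the surviving arms
    equally, keep the empirically better half (never dropping below
    k survivors)."""
    active = list(range(n))
    while len(active) > k:
        pulls = budget_per_round // len(active) + 1
        means = {i: sum(pull(i) for _ in range(pulls)) / pulls
                 for i in active}
        keep = max(k, len(active) // 2)
        active = sorted(active, key=means.get, reverse=True)[:keep]
    return set(active)

# Four Bernoulli arms with well-separated means; the top two should
# survive with high probability.
random.seed(0)
mus = [0.9, 0.8, 0.2, 0.1]
result = halving_top_k(lambda i: float(random.random() < mus[i]), 4, 2)
```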
- Feb 10 2017 cs.CL arXiv:1702.02584v1 For the purpose of automatically evaluating speakers' humor usage, we build a presentation corpus containing humorous utterances based on TED talks. Compared to previous data resources supporting humor recognition research, ours has several advantages, including (a) both positive and negative instances coming from a homogeneous data set, (b) containing a large number of speakers, and (c) being open. Focusing on using lexical cues for humor recognition, we systematically compare a newly emerging text classification method based on Convolutional Neural Networks (CNNs) with a well-established conventional method using linguistic knowledge. The CNN method shows advantages in both achieving higher recognition accuracy and being able to learn essential features automatically.
- Thanks to its closed-form solution, the least squares support vector machine (LSSVM) has been widely used for classification and regression problems, achieving performance comparable to other types of SVMs. However, LSSVM has two drawbacks: sensitivity to outliers and lack of sparseness. Robust LSSVM (R-LSSVM) partly overcomes the first drawback via a nonconvex truncated loss function, but the current algorithms for R-LSSVM with dense solutions still face the second drawback and are inefficient for training large-scale problems. In this paper, we interpret the robustness of R-LSSVM from a re-weighted viewpoint and give a primal R-LSSVM via the representer theorem. The new model may have a sparse solution if the corresponding kernel matrix has low rank. Then, by approximating the kernel matrix with a low-rank matrix and smoothing the loss function with an entropy penalty function, we propose a convergent sparse R-LSSVM (SR-LSSVM) algorithm to achieve the sparse solution of primal R-LSSVM, which overcomes both drawbacks of LSSVM simultaneously. The proposed algorithm has lower complexity than existing algorithms and is very efficient for training large-scale problems. Many experimental results illustrate that SR-LSSVM can achieve better or comparable performance with less training time than related algorithms, especially for training large-scale problems.
- Feb 03 2017 cs.DB arXiv:1702.00567v1 Data fusion has played an important role in data mining because high-quality data is required in many applications. As online data may be out-of-date and errors in the data may propagate through copying and referring between sources, it is hard to achieve satisfactory results by merely applying existing data fusion methods to fuse Web data. In this paper, we make use of the crowd to achieve a high-quality data fusion result. We design a framework that selects a set of tasks to ask the crowd in order to improve the confidence of data. Since data are correlated and crowds may provide incorrect answers, how to select a proper set of tasks to ask the crowd is a very challenging problem. We design an approximation solution to address this challenge, since we prove that the problem is NP-hard. To further improve the efficiency, we design a pruning strategy and a preprocessing method, which effectively improve the performance of the proposed approximation solution. Furthermore, we find that under certain scenarios, we are not interested in all the facts, but only a specific set of facts. Thus, for these specific scenarios, we also develop another approximation solution which is much faster than the general approximation solution. We verify the solutions with extensive experiments on a real crowdsourcing platform.
- Feb 01 2017 cs.DS arXiv:1701.08809v1 We investigate the problem of scheduling the maintenance of edges in a network, motivated by the goal of minimizing outages in transportation or telecommunication networks. We focus on maintaining connectivity between two nodes over time; for the special case of path networks, this is related to the problem of minimizing the busy time of machines. We show that the problem can be solved in polynomial time in arbitrary networks if preemption is allowed. If preemption is restricted to integral time points, the problem is NP-hard, and in the non-preemptive case we give strong non-approximability results. Furthermore, we give tight bounds on the power of preemption, that is, the maximum ratio of the values of non-preemptive and preemptive optimal solutions. Interestingly, the preemptive and the non-preemptive problem can be solved efficiently on paths, whereas we show that mixing both leads to a weakly NP-hard problem that allows for a simple 2-approximation.
- Merging Mobile Edge Computing (MEC), which is an emerging paradigm to meet the increasing computation demands from mobile devices, with the dense deployment of Base Stations (BSs), is foreseen as a key step towards the next generation mobile networks. However, new challenges arise for designing energy-efficient networks, since the radio access resources and computing resources of BSs have to be jointly managed, and yet they are complexly coupled with traffic in both the spatial and temporal domains. In this paper, we address the challenge of incorporating MEC into dense cellular networks, and propose an efficient online algorithm, called ENGINE (ENErgy constrained offloadINg and slEeping), which makes joint computation offloading and BS sleeping decisions in order to maximize the quality of service while keeping the energy consumption low. Our algorithm leverages the Lyapunov optimization technique, works online and achieves close-to-optimal performance without using future information. Our simulation results show that our algorithm can effectively reduce energy consumption without sacrificing the user quality of service.
- Merging mobile edge computing with the dense deployment of small cell base stations promises enormous benefits such as real proximity and ultra-low latency access to cloud functionalities. However, the envisioned integration creates many new challenges, and one of the most significant is mobility management, which is becoming a key bottleneck to overall system performance. Simply applying existing solutions leads to poor performance due to the highly overlapped coverage areas of multiple base stations in the proximity of the user and the co-provisioning of radio access and computing services. In this paper, we develop a novel user-centric mobility management scheme, leveraging Lyapunov optimization and multi-armed bandit theories, in order to maximize the edge computation performance for the user while keeping the user's communication energy consumption below a constraint. The proposed scheme effectively handles the uncertainties present at multiple levels in the system and provides both short-term and long-term performance guarantees. Simulation results show that our proposed scheme can significantly improve the computation performance (compared to the state of the art) while satisfying the communication energy constraint.
- Jan 17 2017 cs.CV arXiv:1701.04043v1 Tensor principal component analysis (TPCA) is a multi-linear extension of principal component analysis which converts a set of correlated measurements into several principal components. In this paper, we propose a new robust TPCA method to extract the principal components of the multi-way data based on tensor singular value decomposition. The tensor is split into a number of blocks of the same size. The low rank component of each block tensor is extracted using an iterative tensor singular value thresholding method. The principal components of the multi-way data are the concatenation of all the low rank components of all the block tensors. We give the block tensor incoherence conditions to guarantee the successful decomposition. This factorization has similar optimality properties to that of the low rank matrix derived from singular value decomposition. Experimentally, we demonstrate its effectiveness in two applications, including motion separation for surveillance videos and illumination normalization for face images.
- The energy internet is one of the most promising future energy infrastructures, as it could enhance energy efficiency and improve system operating flexibility. Analogous to the micro-grid, the micro energy internet puts more emphasis on the distribution level and the consumer side. The concept and design principles of the smart micro energy internet are proposed to accommodate micro-grids, distributed poly-generation systems, energy storage facilities, and related energy distribution infrastructures. The dispatch and control system of the smart micro energy internet should be responsive to external disturbances and able to approach a satisfactory operating point that balances multiple criteria, such as safety, economy, and environmental protection. To realize the vision of the smart micro energy internet, engineering game theory based dispatch and control strategies are investigated. Based on the proposed architecture and energy management system, a prototype of the first domestic solar-based smart micro energy internet has been established at Qinghai University.
- Dec 30 2016 cs.NI arXiv:1612.08778v1 Cellular communication networks are plagued with redundant capacity, which results in low utilization and poor cost-effectiveness of network capital investments. The redundant capacity can be exploited to deliver secondary traffic that is ultra-elastic and delay-tolerant. In this paper, we propose an analytical framework to study the capacity-delay tradeoff of elastic/secondary traffic in large scale cellular networks with spectrum aggregation. Our framework integrates stochastic geometry and queueing theory models and gives analytical insights into the capacity-delay performance in the interference limited regime. Closed-form results are obtained to characterize the mean delay and delay distribution as functions of per user throughput capacity. The impacts of spectrum aggregation, user and base station (BS) densities, traffic session payload, and primary traffic dynamics on the capacity-delay tradeoff relationship are investigated. The fundamental capacity limit is derived and its scaling behavior is revealed. Our analysis shows the feasibility of providing secondary communication services over cellular networks and highlights some critical design issues.
- Differential space time block coding (STBC) achieves full spatial diversity and avoids channel estimation overhead. Over highly frequency-selective channels, STBC is integrated with orthogonal frequency division multiplexing (OFDM) to achieve high performance. However, low-cost implementation of differential STBC-OFDM using direct-conversion transceivers is sensitive to In-phase/Quadrature-phase imbalance (IQI). In this paper, we quantify the performance impact of IQI at the receiver front-end on differential STBC-OFDM systems and propose a compensation algorithm to mitigate its effect. The proposed receiver IQI compensation works in an adaptive decision-directed manner without using known pilots or training sequences, which reduces the rate loss due to training overhead. Our numerical results show that our proposed compensation algorithm can effectively mitigate receive IQI in differential STBC-OFDM.
- Differential space time block coding (STBC) achieves full spatial diversity and avoids channel estimation overhead. Over highly frequency-selective channels, STBC is integrated with orthogonal frequency division multiplexing (OFDM) to efficiently mitigate intersymbol interference effects. However, low-cost implementation of STBC-OFDM with direct-conversion transceivers is sensitive to In-phase/Quadrature-phase imbalance (IQI). In this paper, we quantify the performance impact of IQI at both the transmitter and receiver radio frequency front-ends on differential STBC-OFDM systems which has not been investigated before in the literature. In addition, we propose a widely-linear compensation algorithm at the receiver to mitigate the performance degradation caused by the IQI at the transmitter and receiver ends. Moreover, a parameter-based generalized algorithm is proposed to extract the IQI parameters and improve the performance under high-mobility. The adaptive compensation algorithms are blind and work in a decision-directed manner without using known pilots or training sequences. Numerical results show that our proposed compensation algorithms can effectively mitigate IQI in differential STBC-OFDM.
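The receive-side IQI model commonly used in this line of work distorts a baseband sample s into r = K1*s + K2*conj(s); when the mismatch parameters are known, a widely-linear inversion recovers s exactly. The papers above estimate the parameters blindly, which is the hard part; the gain/phase values here are hypothetical:

```python
import cmath

def iqi_distort(s, g=1.05, phi=0.05):
    """Standard receive-IQI model: the observed sample is
    r = K1*s + K2*conj(s), where g and phi are the amplitude and
    phase mismatch between the I and Q branches."""
    k1 = (1 + g * cmath.exp(-1j * phi)) / 2
    k2 = (1 - g * cmath.exp(1j * phi)) / 2
    return k1 * s + k2 * s.conjugate()

def iqi_compensate(r, g=1.05, phi=0.05):
    """Widely-linear inversion with known mismatch parameters:
    s = (conj(K1)*r - K2*conj(r)) / (|K1|^2 - |K2|^2)."""
    k1 = (1 + g * cmath.exp(-1j * phi)) / 2
    k2 = (1 - g * cmath.exp(1j * phi)) / 2
    return (k1.conjugate() * r - k2 * r.conjugate()) / (abs(k1) ** 2 - abs(k2) ** 2)

s = 0.7 + 0.3j        # a transmitted baseband sample
r = iqi_distort(s)    # what the imbalanced front-end observes
s_hat = iqi_compensate(r)
```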
- Dec 23 2016 cs.DB arXiv:1612.07448v4 Providing machine learning (ML) over relational data is a mainstream requirement for data analytics systems. While almost all the ML tools require the input data to be presented as a single table, many datasets are multi-table, which forces data scientists to join those tables first, leading to data redundancy and runtime waste. Recent works on "factorized" ML mitigate this issue for a few specific ML algorithms by pushing ML through joins. But their approaches require a manual rewrite of ML implementations. Such piecemeal methods create a massive development overhead when extending such ideas to other ML algorithms. In this paper, we show that it is possible to mitigate this overhead by leveraging a popular formal algebra to represent the computations of many ML algorithms: linear algebra. We introduce a new logical data type to represent normalized data and devise a framework of algebraic rewrite rules to convert a large set of linear algebra operations over denormalized data into operations over normalized data. We show how this enables us to automatically "factorize" several popular ML algorithms, thus unifying and generalizing several prior works. We prototype our framework in the popular ML environment R and an industrial R-over-RDBMS tool. Experiments with both synthetic and real normalized data show that our framework also yields significant speed-ups, up to 36x on real data.
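The core rewrite idea can be seen on a single aggregate: a sum over a joined (denormalized) column equals a count-weighted sum over the normalized dimension table, so the join never has to be materialized. A toy sketch with a hypothetical two-table schema:

```python
from collections import Counter

def sum_after_join(fact_keys, dim_values):
    """Factorized SUM: instead of joining and summing the repeated
    dimension values, weight each dimension value by how often its
    key appears in the fact table."""
    counts = Counter(fact_keys)
    return sum(counts[k] * v for k, v in dim_values.items())

# The fact table references dim key 1 twice and key 2 once, so the
# joined column would be [10.0, 10.0, 5.0].
total = sum_after_join([1, 1, 2], {1: 10.0, 2: 5.0})
```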
- In the near future, there will likely be special-purpose quantum computers with 40-50 high-quality qubits. This paper lays general theoretical foundations for how to use such devices to demonstrate "quantum supremacy": that is, a clear quantum speedup for some task, motivated by the goal of overturning the Extended Church-Turing Thesis as confidently as possible. First, we study the hardness of sampling the output distribution of a random quantum circuit, along the lines of a recent proposal by the Quantum AI group at Google. We show that there's a natural hardness assumption, which has nothing to do with sampling, yet implies that no efficient classical algorithm can pass a statistical test that the quantum sampling procedure's outputs do pass. Compared to previous work, the central advantage is that we can now talk directly about the observed outputs, rather than about the distribution being sampled. Second, in an attempt to refute our hardness assumption, we give a new algorithm for simulating a general quantum circuit with n qubits and m gates in polynomial space and m^O(n) time. We then discuss why this and other known algorithms fail to refute our assumption. Third, resolving an open problem of Aaronson and Arkhipov, we show that any strong quantum supremacy theorem--of the form "if approximate quantum sampling is classically easy, then PH collapses"--must be non-relativizing. Fourth, refuting a conjecture by Aaronson and Ambainis, we show that the Fourier Sampling problem achieves a constant versus linear separation between quantum and randomized query complexities. Fifth, we study quantum supremacy relative to oracles in P/poly. Previous work implies that, if OWFs exist, then quantum supremacy is possible relative to such oracles. We show that some assumption is needed: if SampBPP=SampBQP and NP is in BPP, then quantum supremacy is impossible relative to such oracles.
- Dec 09 2016 cs.DB arXiv:1612.02564v3 A huge amount of data with both spatial and textual information, e.g., geo-tagged tweets, is flooding the Internet. Such spatio-textual data streams contain valuable information for millions of users with various interests in different keywords and locations. Publish/subscribe systems enable efficient and effective information distribution by allowing users to register continuous queries with both spatial and textual constraints. However, the explosive growth of data scale and user base has posed challenges to the existing centralized publish/subscribe systems for spatio-textual data streams. In this paper, we propose our distributed publish/subscribe system, called PS2Stream, which digests a massive spatio-textual data stream and directs the stream to target users with registered interests. Compared with existing systems, PS2Stream achieves a better workload distribution in terms of both minimizing the total amount of workload and balancing the load of workers. To achieve this, we propose a new workload distribution algorithm considering both space and text properties of the data. Additionally, PS2Stream supports dynamic load adjustments to adapt to changes in the workload, which makes PS2Stream adaptive. Extensive empirical evaluation, on a commercial cloud computing platform with real data, validates the superiority of our system design and the advantages of our techniques on system performance improvement.
- Heart diseases constitute a global health burden, and the problem is exacerbated by the error-prone nature of listening to and interpreting heart sounds. This motivates the development of automated classification to screen for abnormal heart sounds. Existing machine learning-based systems achieve accurate classification of heart sound recordings but rely on expert features that have not been thoroughly evaluated on noisy recordings. Here we propose a segmental convolutional neural network architecture that achieves automatic feature learning from noisy heart sound recordings. Our experiments show that our best model, trained on noisy recording segments acquired with an existing hidden semi-Markov model-based approach, attains a classification accuracy of 87.5% on the 2016 PhysioNet/CinC Challenge dataset, compared to the 84.6% accuracy of the state-of-the-art statistical classifier trained and evaluated on the same dataset. Our results indicate the potential of using neural network-based methods to increase the accuracy of automated classification of heart sound recordings for improved screening of heart diseases.
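A segment-then-aggregate pipeline of the kind described can be sketched as fixed-length windowing plus a majority vote over per-segment predictions (the paper derives its segments from a hidden semi-Markov model instead; everything below is illustrative):

```python
def segment(signal, win, hop):
    """Split a 1-D recording into fixed-length, possibly overlapping
    windows that a segment-level classifier would score."""
    return [signal[i:i + win] for i in range(0, len(signal) - win + 1, hop)]

def majority_vote(labels):
    """Aggregate segment-level labels into one recording-level label."""
    return max(set(labels), key=labels.count)

# A 10-sample toy "recording" split into 4-sample windows with hop 2,
# and a toy aggregation over three segment predictions.
windows = segment(list(range(10)), win=4, hop=2)
label = majority_vote(["normal", "abnormal", "abnormal"])
```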
- There exist multitudes of cloud performance metrics, including workload performance, application placement, software/hardware optimization, scalability, capacity, reliability, agility and so on. In this paper, we consider jointly optimizing the performance of the software applications in the cloud. The challenges lie in bringing a diversity of raw data into a tidy data format, unifying performance data from multiple systems based on timestamps, and assessing the quality of the processed performance data. Even after verifying the quality of cloud performance data, additional challenges block the optimization of cloud computing. We identify these challenges from the perspectives of the computing environment, data collection, performance analytics and the production environment.
- Nov 22 2016 cs.NI arXiv:1611.06302v1 With the dense deployment of small cell networks, low-cost backhaul schemes for small cell base stations (SBSs) have attracted great attention. Self-backhaul using cellular communication technology is considered a promising solution. Although some excellent works have been done on self-backhaul in small cell networks, most of them do not consider the recent advances in full-duplex (FD) and massive multiple-input and multiple-output (MIMO) technologies. In this paper, we propose a self-backhaul scheme for small cell networks by combining FD and massive MIMO technologies. In our proposed scheme, the macro base station (MBS) is equipped with massive MIMO antennas, and the SBSs have FD communication capability. By treating the SBSs as special macro users, we can achieve simultaneous transmission on the access link of users and the backhaul link of SBSs in the same frequency band. Furthermore, considering the existence of inter-tier and intra-tier interference, we formulate the power allocation problem of the MBS and SBSs as an optimization problem. Because the formulated power allocation problem is non-convex, we transform the original problem into a difference of convex program (DCP) by the successive convex approximation method (SCAM) and variable transformation, and then solve it using a constrained concave convex procedure (CCCP) based iterative algorithm. Finally, extensive simulations are conducted with different system configurations to verify the effectiveness of the proposed scheme.
- Nov 18 2016 cs.CV arXiv:1611.05594v1 Visual attention has been successfully applied in structural prediction tasks such as visual captioning and question answering. Existing visual attention models are generally spatial, i.e., the attention is modeled as spatial probabilities that re-weight the last conv-layer feature map of a CNN which encodes an input image. However, we argue that such spatial attention does not necessarily conform to the attention mechanism --- a dynamic feature extractor that combines contextual fixations over time, as CNN features are naturally spatial, channel-wise and multi-layer. In this paper, we introduce a novel convolutional neural network dubbed SCA-CNN that incorporates Spatial and Channel-wise Attentions in a CNN. In the task of image captioning, SCA-CNN dynamically modulates the sentence generation context in multi-layer feature maps, encoding where (i.e., attentive spatial locations at multiple layers) and what (i.e., attentive channels) the visual attention is. We evaluate the SCA-CNN architecture on three benchmark image captioning datasets: Flickr8K, Flickr30K, and MSCOCO. SCA-CNN achieves significant improvements over state-of-the-art visual attention-based image captioning methods.
- Device-to-device (D2D) computation offloading has recently been proposed to enhance mobile edge computing (MEC) performance by exploiting spare computing resources in proximate user devices, thereby alleviating computation burdens on the network infrastructure and enabling truly pervasive edge computing. A key challenge in this new mobile computing paradigm is how to provide self-interested users with incentives to participate in D2D computing. Although incentive mechanism design has been intensively studied in the literature, this paper considers a much more challenging yet under-investigated problem in which user incentives are intricately coupled with security risks. This coupling is extremely important since D2D-enhanced MEC systems are vulnerable to distributed attacks, such as distributed denial of service (DDoS) attacks, due to their autonomous nature. In this paper, we build a novel mathematical framework incorporating game theory and classic epidemic models to investigate the interplay between user incentives and security risks in D2D-enhanced MEC systems under infectious DDoS attacks. A key result derived from this analysis is a phase change effect between persistent and non-persistent DDoS attacks, which is significantly different in nature from classic epidemic results for non-strategic users. Based on this, we determine the optimal reward in a contract-based incentive mechanism that the network operator should offer to users in order to maximize the operator's utility. The optimal solution exhibits an interesting "Less is More" effect: although giving users a higher reward promotes more participation, it may harm the operator's utility. This is because too much participation fosters persistent DDoS attacks and, as a result, the *effective* participation level does not improve. Extensive simulations are carried out to verify the analytic conclusions.
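The epidemic side of such a framework can be illustrated with the classic susceptible-infected-susceptible (SIS) model, which exhibits exactly this kind of phase change between die-out and persistence. The sketch below is a minimal mean-field simulation, not the paper's game-theoretic model; `beta` (infection pressure) and `gamma` (recovery rate) are illustrative parameters.

```python
def sis_steady_state(beta, gamma, i0=0.01, dt=0.01, steps=100_000):
    """Euler-integrate the mean-field SIS dynamics i' = beta*i*(1-i) - gamma*i
    and return the (approximate) long-run infected fraction."""
    i = i0
    for _ in range(steps):
        i += dt * (beta * i * (1 - i) - gamma * i)
        i = min(max(i, 0.0), 1.0)  # keep the fraction in [0, 1]
    return i

# below the threshold beta/gamma = 1 the infection dies out (non-persistent);
# above it, an endemic level 1 - gamma/beta persists
print(sis_steady_state(beta=0.5, gamma=1.0))  # -> ~0.0
print(sis_steady_state(beta=2.0, gamma=1.0))  # -> ~0.5
```

In the paper's setting the infection parameters themselves depend on strategic participation decisions, which is what makes its phase-change result differ from this non-strategic baseline.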
- A main focus of machine learning research has been improving the generalization accuracy and efficiency of prediction models. Many models such as SVM, random forest, and deep neural nets have been proposed and achieved great success. However, what emerges as missing in many applications is actionability, i.e., the ability to turn prediction results into actions. For example, in applications such as customer relationship management, clinical prediction, and advertisement, users need not only accurate prediction, but also actionable instructions which can transfer an input to a desirable goal (e.g., higher profits, lower morbidity rates, higher ad hit rates). Existing efforts at deriving such actionable knowledge are few, and are limited to simple action models restricted to changing only one attribute per action. The dilemma is that in many real applications the action models are far more complex, and extracting an optimal solution from them is harder. In this paper, we propose a novel approach that achieves actionability by combining learning with planning, two core areas of AI. In particular, we propose a framework to extract actionable knowledge from random forest, one of the most widely used and best off-the-shelf classifiers. We formulate the actionability problem as a sub-optimal action planning (SOAP) problem, which is to find a plan that alters certain features of a given input so that the random forest yields a desirable output, while minimizing the total cost of actions. Technically, the SOAP problem is formulated in the SAS+ planning formalism and solved using a Max-SAT based approach. Our experimental results demonstrate the effectiveness and efficiency of the proposed approach on a personal credit dataset and other benchmarks. Our work represents a new application of automated planning to an emerging and challenging machine learning paradigm.
- Oct 27 2016 cs.DB arXiv:1610.08411v1 For decades, crowdsourcing, which outsources tasks to human workers, has gained much attention from both academia and industry. Existing works considered improving task accuracy through voting or learning methods, but they usually did not fully take into account reducing the latency of task completion. When a task requester posts a group of tasks (e.g., sentiment analysis), one can only obtain the answers of all tasks after the last task is accomplished. As a consequence, the time delay of even one task in this group could delay the next step of the task requester's work by minutes to days, which is quite undesirable for the task requester. Motivated by the importance of both task accuracy and latency, in this paper we propose a novel crowdsourcing framework, namely the Fast and Reliable crOwdsourcinG framework (FROG), which intelligently assigns tasks to workers, such that the latencies of tasks are reduced and the expected accuracies of tasks are met. Specifically, our FROG framework consists of two important components, the task scheduler and notification modules. For the task scheduler module, we formalize a FROG task scheduling (FROG-TS) problem, in which the server actively assigns tasks to workers so as to achieve high reliability and low latency. We prove that the FROG-TS problem is NP-hard. Thus, we design two heuristic approaches, request-based and batch-based scheduling. For the notification module, we define an efficient worker notifying (EWN) problem, which only sends task invitations to those workers with high probabilities of accepting the tasks. To tackle the EWN problem, we propose a smooth kernel density estimation approach to estimate the probability that a worker accepts a task invitation. Through extensive experiments, we demonstrate the effectiveness and efficiency of our proposed FROG platform on both real and synthetic data sets.
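The smooth kernel density estimation used for the EWN problem can be sketched generically. Below is a plain Gaussian KDE (an illustration of the technique, not necessarily the paper's exact estimator), applied here to hypothetical acceptance delays observed for one worker.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a smooth (Gaussian) kernel density estimate built from samples."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))
    def density(x):
        # sum of Gaussian bumps, one centred on each observed sample
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
                          for s in samples)
    return density

# hypothetical delays (minutes) before this worker accepted past invitations
accept_delay = gaussian_kde([1.0, 1.5, 2.0, 8.0], bandwidth=0.5)
print(accept_delay(1.5), accept_delay(5.0))  # density is higher near 1.5
```

Smoothing matters because each worker contributes only a handful of observations; a raw histogram would be too spiky to rank workers by acceptance probability.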
- Most real-world networks are too large to be measured or studied directly and there is substantial interest in estimating global network properties from smaller sub-samples. One of the most important global properties is the number of vertices/nodes in the network. Estimating the number of vertices in a large network is a major challenge in computer science, epidemiology, demography, and intelligence analysis. In this paper we consider a population random graph G = (V, E) from the stochastic block model (SBM) with K communities/blocks. A sample is obtained by randomly choosing a subset W and letting G(W) be the induced subgraph in G of the vertices in W. In addition to G(W), we observe the total degree of each sampled vertex and its block membership. Given this partial information, we propose an efficient PopULation Size Estimation algorithm, called PULSE, that correctly estimates the size of the whole population as well as the size of each community. To support our theoretical analysis, we perform an exhaustive set of experiments to study the effects of sample size, K, and SBM model parameters on the accuracy of the estimates. The experimental results also demonstrate that PULSE significantly outperforms a widely-used method called the network scale-up estimator in a wide variety of scenarios. We conclude with extensions and directions for future work.
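The network scale-up estimator that PULSE is compared against can be sketched simply: under uniform vertex sampling, the fraction of a sampled vertex's neighbours that land inside the sample estimates the sampling fraction. This is an illustration of the baseline idea only, not the PULSE algorithm (which additionally exploits the block structure).

```python
def scale_up_estimate(num_sampled, total_degrees, insample_degrees):
    """Scale-up style population size estimate. With vertices sampled
    uniformly at rate p, E[in-sample degree] = p * total degree, so
    p ~ sum(w_i) / sum(d_i) and N ~ num_sampled / p."""
    p_hat = sum(insample_degrees) / sum(total_degrees)
    return num_sampled / p_hat

# 100 sampled vertices, each of total degree 10, with 1 neighbour in-sample:
# the sampling rate looks like 0.1, so the population estimate is 1000
print(scale_up_estimate(100, [10] * 100, [1] * 100))  # -> 1000.0
```

Heterogeneous degrees across blocks are precisely what breaks this simple pooled ratio, which motivates a block-aware estimator.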
- With the dense deployment of remote radio heads (RRHs), huge network power consumption has become a great challenge for green cloud radio access networks (Cloud-RANs), and multiuser downlink beamforming has been proposed as a promising solution. Moreover, the increasing number of mobile users (MUs) makes admission control essential for a Cloud-RAN with limited fronthaul capacity and a predefined power budget. In this paper, we consider the problem of joint multiuser downlink beamforming and admission control (JBAC) to maximize the number of admitted MUs in the network and reduce the network power consumption, while taking into account the quality of service requirements of the MUs, the power budget constraints and the fronthaul limitation. It is shown that the JBAC problem is a mixed integer nonlinear problem, and remains non-convex even when continuous relaxation is adopted. Therefore, we first transform the JBAC problem into a Mixed-Integer Semidefinite Program. Then, we propose a bound-improving Branch and Bound algorithm to yield a near-optimal solution. For practical application, a polynomial-time heuristic algorithm is proposed to derive a sub-optimal solution. Extensive simulations are conducted with different system configurations to show the effectiveness of the two proposed schemes.
- As the use of crowdsourcing increases, it is important to think about performance optimization. For this purpose, it is possible to think about each worker as an HPU (Human Processing Unit), and to draw inspiration from performance optimization on traditional computers or cloud nodes with CPUs. However, as we characterize HPUs in detail for this purpose, we find that there are important differences between CPUs and HPUs, leading to the need for completely new optimization algorithms. In this paper, we study the specific optimization problem of obtaining results fastest for a crowdsourced job with a fixed total budget. In crowdsourcing, jobs are usually broken down into sets of small tasks, which are assigned to workers one at a time. We consider three scenarios of increasing complexity: Identical Round Homogeneous tasks, Multiplex Round Homogeneous tasks, and Multiple Round Heterogeneous tasks. For each scenario, we analyze the stochastic behavior of the HPU clock-rate as a function of the remuneration offered. After that, we develop an optimal Budget Allocation strategy to minimize the latency for job completion. We validate our results through extensive simulations and experiments on Amazon Mechanical Turk.
- Frequency estimation of multiple sinusoids is significant in both theory and application. In some application scenarios, only sub-Nyquist samples are available to estimate the frequencies. A conventional approach is to sample the signals at several lower rates. In this paper, we address frequency estimation of the signals in the time domain through undersampled data. We analyze the impact of undersampling and demonstrate that three sub-Nyquist channels are generally enough to estimate the frequencies provided the undersampling ratios are pairwise coprime. We deduce the condition that leads to the failure of resolving frequency ambiguity when two coprime undersampling channels are utilized. When three-channel sub-Nyquist samples are used jointly, the frequencies can be determined uniquely and the correct frequencies are estimated. Numerical experiments verify the correctness of our analysis and conclusion.
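The coprime-channel idea can be illustrated for integer frequencies of a complex exponential, where channel k observes the alias f mod r_k. The brute-force Chinese-remainder search below is a toy sketch, not the paper's estimator, but it shows why two coprime channels can leave the ambiguity unresolved while three pin the frequency down.

```python
def consistent_frequencies(observations, f_max):
    """All integer frequencies f in [0, f_max) whose alias f mod r matches
    the observed residue on every channel. `observations` is a list of
    (residue, sampling_rate) pairs."""
    return [f for f in range(f_max)
            if all(f % r == res for res, r in observations)]

rates = (7, 11, 13)          # pairwise coprime undersampling ratios
true_f = 500                 # unknown frequency, below 7 * 11 * 13 = 1001
obs = [(true_f % r, r) for r in rates]

print(consistent_frequencies(obs[:2], 1001))  # two channels: still ambiguous
print(consistent_frequencies(obs, 1001))      # three channels: unique -> [500]
```

Real signals alias by folding rather than by a clean modulus, and noise perturbs the residues, which is exactly the ambiguity analysis the abstract refers to.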
- Sep 28 2016 cs.CV arXiv:1609.08436v5 The detection of the ground plane and free space remains challenging for non-flat planes, especially with varying latitudinal and longitudinal slopes or in the case of multiple ground planes. In this paper, we propose a framework for ground plane detection with stereo vision. The main contribution of this paper is a newly proposed descriptor which is applied to the disparity image to obtain a disparity texture image. The ground plane regions can be distinguished from their surroundings effectively in the disparity texture image. Because the descriptor is computed over local areas of the image, it handles the problem of non-flat planes well. We also present a complete framework to detect the ground plane regions based on the disparity texture image with a convolutional neural network architecture.
- Sep 12 2016 cs.CC arXiv:1609.02888v3 In both query and communication complexity, we give separations between the class NISZK, containing those problems with non-interactive statistical zero knowledge proof systems, and the class UPP, containing those problems with randomized algorithms with unbounded error. These results significantly improve on earlier query separations of Vereschagin [Ver95] and Aaronson [Aar12] and earlier communication complexity separations of Klauck [Kla11] and Razborov and Sherstov [RS10]. In addition, our results imply an oracle relative to which the class NISZK is not contained in PP. This answers an open question of Watrous from 2002 [Aar]. The technical core of our result is a stronger hardness amplification theorem for approximate degree, which roughly says that composing the gapped-majority function with any function of high approximate degree yields a function with high threshold degree. Using our techniques, we also give oracles relative to which the following two separations hold: perfect zero knowledge (PZK) is not contained in its complement (coPZK), and SZK (indeed, even NISZK) is not contained in PZK (indeed, even HVPZK). Along the way, we show that HVPZK is contained in PP in a relativizing manner. We prove a number of implications of these results, which may be of independent interest outside of structural complexity. Specifically, our oracle separation implies that certain parameters of the Polarization Lemma of Sahai and Vadhan [SV03] cannot be much improved in a black-box manner. Additionally, it implies new lower bounds for property testing algorithms with error probability arbitrarily close to 1/2. Finally, our results imply that two-message protocols in the streaming interactive proofs model of Cormode et al. [CTY11] are surprisingly powerful in the sense that, with just logarithmic cost, they can compute functions outside of UPP^CC.
- Aug 29 2016 cs.CV arXiv:1608.07444v1 It is well known that clothing fashion is a distinctive and often habitual trend in the style in which a person dresses. Clothing fashions are usually expressed with visual stimuli such as style, color, and texture. However, it is not clear which visual stimulus places higher/lower influence on the updating of clothing fashion. In this study, computer vision and machine learning techniques are employed to analyze the influence of different visual stimuli on clothing-fashion updates. Specifically, a classification-based model is proposed to quantify the influence of different visual stimuli, in which each visual stimulus's influence is quantified by its corresponding accuracy in fashion classification. Experimental results demonstrate that, on clothing-fashion updates, the style holds a higher influence than the color, and the color holds a higher influence than the texture.
- Multi-view data clustering refers to categorizing a data set by making good use of related information from multiple representations of the data. It has become important because more and more data can be collected in a variety of ways, in different settings and from different sources, so each data set can be represented by different sets of features to form different views of it. Many approaches have been proposed to improve clustering performance by exploring and integrating heterogeneous information underlying different views. In this paper, we propose a new multi-view fuzzy clustering approach called MinimaxFCM that uses minimax optimization based on the well-known Fuzzy C-Means. In MinimaxFCM, the consensus clustering results are generated based on a minimax optimization in which the maximum disagreement among the differently weighted views is minimized. Moreover, the weight of each view can be learned automatically in the clustering process. In addition, there is only one parameter to be set besides the fuzzifier. The detailed problem formulation, the derivation of the updating rules, and an in-depth analysis of the proposed MinimaxFCM are provided. Experimental studies on nine multi-view data sets, including real-world image and document data sets, have been conducted. We observed that MinimaxFCM outperforms related multi-view clustering approaches in terms of clustering accuracy, demonstrating its great potential for multi-view data analysis.
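MinimaxFCM builds on standard Fuzzy C-Means. For reference, a minimal single-view FCM (alternating membership and center updates; not the multi-view minimax version) can be sketched as follows.

```python
def fcm(points, c, m=2.0, iters=50):
    """Minimal single-view Fuzzy C-Means, the base algorithm such multi-view
    variants extend. points: list of equal-length vectors.
    Returns (centers, memberships)."""
    # deterministic init: spread the initial centers over the data
    centers = [list(points[i * len(points) // c]) for i in range(c)]
    U = []
    for _ in range(iters):
        U = []
        for x in points:
            d = [max(sum((a - b) ** 2 for a, b in zip(x, v)) ** 0.5, 1e-12)
                 for v in centers]
            # membership update: u_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1))
            U.append([1.0 / sum((d[j] / d[k]) ** (2.0 / (m - 1.0))
                                for k in range(c)) for j in range(c)])
        # center update: mean weighted by u_ij^m
        for j in range(c):
            w = [U[i][j] ** m for i in range(len(points))]
            centers[j] = [sum(wi * x[t] for wi, x in zip(w, points)) / sum(w)
                          for t in range(len(points[0]))]
    return centers, U

pts = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]]
centers, memberships = fcm(pts, 2)
print(centers)  # two centers, near 0.1 and 5.1
```

The multi-view extension replaces the single distance with a per-view weighted cost and minimizes the worst (maximum) view disagreement, which is the minimax step the abstract describes.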
- Incremental clustering approaches have been proposed for handling large data when the given data set is too large to be stored. The key idea of these approaches is to find representatives to represent each cluster in each data chunk; the final data analysis is then carried out based on the representatives identified across all chunks. However, most incremental approaches are designed for single-view data. As large multi-view data generated from multiple sources become prevalent, there is a need for incremental clustering approaches that handle data which is both large and multi-view. In this paper we propose a new incremental clustering approach called incremental minimax optimization based fuzzy clustering (IminimaxFCM) to handle large multi-view data. In IminimaxFCM, representatives with multiple views are identified to represent each cluster by integrating multiple complementary views using minimax optimization. The detailed problem formulation, the derivation of the updating rules, and an in-depth analysis of the proposed IminimaxFCM are provided. Experimental studies on several real-world multi-view data sets have been conducted. We observed that IminimaxFCM outperforms related incremental fuzzy clustering approaches in terms of clustering accuracy, demonstrating its great potential for large multi-view data analysis.
- In the best arm identification (Best-1-Arm) problem, we are given $n$ stochastic bandit arms, each associated with a reward distribution with an unknown mean. We would like to identify the arm with the largest mean with probability $1-\delta$, using as few samples as possible. Understanding the sample complexity of Best-1-Arm has attracted significant attention over the last decade. However, the optimal sample complexity is still unknown. Recently, Chen and Li made an interesting conjecture, called the gap-entropy conjecture, concerning the instance optimal sample complexity of Best-1-Arm. Given a Best-1-Arm instance, let $\mu_{[i]}$ denote the $i$th largest mean and $\Delta_{[i]}=\mu_{[1]}-\mu_{[i]}$ denote the corresponding gap. $H(I)=\sum_{i=2}^{n}\Delta_{[i]}^{-2}$ denotes the complexity of the instance. The gap-entropy conjecture states that for any instance $I$, $\Omega(H(I)\cdot(\log\delta^{-1}+\mathsf{Ent}(I)))$ is an instance lower bound, where $\mathsf{Ent}(I)$ is an entropy-like term completely determined by the $\Delta_{[i]}$s, and that there is a $\delta$-correct algorithm for Best-1-Arm with sample complexity $O(H(I)\cdot(\log\delta^{-1}+\mathsf{Ent}(I))+\Delta_{[2]}^{-2}\log\log\Delta_{[2]}^{-1})$. If the conjecture is true, we would have a complete understanding of the instance-wise sample complexity of Best-1-Arm. We make significant progress towards the resolution of the gap-entropy conjecture. For the upper bound, we provide a highly nontrivial $\delta$-correct algorithm which requires $$O(H(I)\cdot(\log\delta^{-1}+\mathsf{Ent}(I))+\Delta_{[2]}^{-2}\log\log\Delta_{[2]}^{-1}\operatorname{polylog}(n,\delta^{-1}))$$ samples. For the lower bound, we show that for any Best-1-Arm instance with all gaps of the form $2^{-k}$, any $\delta$-correct monotone algorithm requires at least $\Omega(H(I)\cdot(\log\delta^{-1}+\mathsf{Ent}(I)))$ samples in expectation.
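The instance complexity H(I) that both bounds revolve around is straightforward to compute from the arm means. A small illustrative helper (assuming the best mean is strictly the largest, so no gap is zero):

```python
def gap_complexity(means):
    """H(I) = sum_{i >= 2} Delta_[i]^{-2}, where Delta_[i] is the gap between
    the largest mean and the i-th largest. Assumes a unique best mean."""
    mu = sorted(means, reverse=True)
    return sum((mu[0] - m) ** -2 for m in mu[1:])

# gaps 0.5 and 1.0 give H(I) = 0.5^-2 + 1.0^-2 = 5.0
print(gap_complexity([1.0, 0.5, 0.0]))  # -> 5.0
```

Small gaps dominate: an arm within 0.01 of the best alone contributes 10,000 to H(I), which is why near-optimal arms drive the sample complexity.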
- Jul 19 2016 cs.DS arXiv:1607.04913v2 The depth first search (DFS) tree is one of the most well-known data structures for designing efficient graph algorithms. Given an undirected graph $G=(V,E)$ with $n$ vertices and $m$ edges, the textbook algorithm takes $O(n+m)$ time to construct a DFS tree. In this paper, we study the problem of maintaining a DFS tree when the graph is undergoing updates. Formally, we show the following results. Incremental DFS tree: Given any arbitrary online sequence of edge or vertex insertions, we can report a DFS tree in $O(n)$ worst case time per operation, with $O\left(\min\{m \log n + n \log^2 n, n^2\}\right)$ preprocessing time. Fault tolerant DFS tree: There exists a data structure of size $O(m \log n)$, such that given any set $\mathcal{F}$ of failed vertices or edges, it can report a DFS tree of the graph $G \setminus \mathcal{F}$ in $O(n|\mathcal{F}|\log^2 n)$ time. Fully dynamic DFS tree: Given any arbitrary online sequence of edge or vertex updates, we can report a DFS tree in $O(\sqrt{nm}\log^{1.5} n)$ worst case time per operation. Our result for the incremental DFS tree improves the previous $O(n \log^3 n)$ worst case update time of Baswana et al., and matches the trivial $\Omega(n)$ lower bound when it is required to explicitly output the DFS tree. Our other results likewise improve the corresponding state-of-the-art results of Baswana et al. Our work builds on the framework of the breakthrough work by Baswana et al., together with a novel use of a tree-partition lemma by Duan and Zhang, and the celebrated fractional cascading technique by Chazelle and Guibas.
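For context, the static construction that these dynamic results build on is the plain textbook O(n+m) DFS. A minimal sketch producing the DFS tree as a parent array:

```python
import sys

def dfs_tree(adj, root=0):
    """Textbook O(n+m) DFS tree as a parent array: the root's parent is -1,
    vertices unreachable from the root stay None."""
    sys.setrecursionlimit(10_000)  # enough for small examples
    parent = [None] * len(adj)
    parent[root] = -1
    def visit(u):
        for v in adj[u]:
            if parent[v] is None:   # tree edge u -> v
                parent[v] = u
                visit(v)
    visit(root)
    return parent

# 4-cycle 0-1-2-3-0: DFS from 0 walks 0->1->2->3; edge 3-0 becomes a back edge
print(dfs_tree([[1, 3], [0, 2], [1, 3], [2, 0]]))  # -> [-1, 0, 1, 2]
```

The dynamic-DFS results in the abstract are about reporting such a tree after each update far faster than rerunning this construction from scratch.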
- Jul 05 2016 cs.CC arXiv:1607.00945v2 The width measure treedepth, also known as vertex ranking, centered coloring and elimination tree height, is a well-established notion which has recently seen a resurgence of interest. Since graphs of bounded treedepth are more restricted than graphs of bounded tree- or pathwidth, we are interested in the algorithmic utility of this additional structure. On the negative side, we show that no dynamic programming algorithm on treedepth decompositions of depth $t$ can solve Dominating Set using $O((3-\epsilon)^t \cdot \log n)$ space for any $\epsilon > 0$. This result implies the same space lower bound for dynamic programming algorithms on tree and path decompositions. We supplement this result with analogous space lower bounds of $O((3-\epsilon)^t \cdot \log n)$ for 3-Coloring and $O((2-\epsilon)^t \cdot \log n)$ for Vertex Cover. This formalizes the common intuition that dynamic programming algorithms on graph decompositions necessarily consume a lot of space, and complements known results on the time complexity of problems restricted to low-treewidth classes. We then show that treedepth lends itself to the design of branching algorithms. This class of algorithms has, in general, distinct advantages over dynamic programming algorithms: a) they use less space than algorithms based on dynamic programming, b) they are easy to parallelize, and c) they provide possible solutions before terminating. Specifically, we design for Dominating Set a pure branching algorithm that runs in time $t^{O(t^2)}\cdot n$ and uses space $O(t^3 \log t + t \log n)$, and a hybrid of branching and dynamic programming that achieves a running time of $O(3^t \log t \cdot n)$ while using $O(2^t t \log t + t \log n)$ space. Algorithms for 3-Coloring and Vertex Cover with space complexity $O(t \cdot \log n)$ and time complexity $O(3^t \cdot n)$ and $O(2^t\cdot n)$, respectively, are included for completeness.
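The flavor of a branching algorithm is easy to show on Vertex Cover: any uncovered edge must have an endpoint in the cover, so branch on each endpoint. The classic bounded search tree below is parameterized by the cover size k rather than by treedepth, so it illustrates the branching paradigm only, not the paper's algorithms; note the polynomial space and the fact that it returns a concrete solution, two of the advantages listed above.

```python
def vertex_cover_branch(edges, k):
    """Bounded search tree for Vertex Cover: pick any uncovered edge (u, v);
    some endpoint must be in the cover, so branch on u and on v.
    Returns a cover of size <= k, or None. Roughly O(2^k * |edges|) time."""
    if not edges:
        return set()        # everything covered
    if k == 0:
        return None         # edges remain but no budget left
    u, v = edges[0]
    for pick in (u, v):
        rest = [e for e in edges if pick not in e]  # edges still uncovered
        sub = vertex_cover_branch(rest, k - 1)
        if sub is not None:
            return sub | {pick}
    return None

triangle = [(0, 1), (1, 2), (0, 2)]
print(vertex_cover_branch(triangle, 1))  # None: a triangle needs 2 vertices
print(vertex_cover_branch(triangle, 2))  # a valid cover of size 2
```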
- Jul 01 2016 cs.CV arXiv:1606.09349v1 Zero-shot learning (ZSL) extends the conventional image classification technique to a more challenging situation where the test image categories are not seen in the training samples. Most studies on ZSL utilize side information such as attributes or word vectors to bridge the relations between the seen classes and the unseen classes. However, existing approaches to ZSL typically exploit a shared space for each type of side information independently, which cannot make full use of the complementary knowledge of different types of side information. To this end, this paper presents an MBFA-ZSL approach to embed different types of side information as well as the visual features into one shared space. Specifically, we first develop an algorithm named Multi-Battery Factor Analysis (MBFA) to build a unified semantic space, and then employ multiple types of side information in it to achieve ZSL. The closed-form solution makes MBFA-ZSL simple to implement and efficient to run on large datasets. Extensive experiments on the popular AwA, CUB, and SUN datasets show its significant superiority over the state-of-the-art approaches.
- In this paper, we present subgraph2vec, a novel approach for learning latent representations of rooted subgraphs from large graphs, inspired by recent advancements in Deep Learning and Graph Kernels. These latent representations encode semantic substructure dependencies in a continuous vector space, which is easily exploited by statistical models for tasks such as graph classification, clustering, link prediction and community detection. subgraph2vec leverages local information obtained from the neighbourhoods of nodes to learn their latent representations in an unsupervised fashion. We demonstrate that subgraph vectors learnt by our approach could be used in conjunction with classifiers such as CNNs, SVMs and relational data clustering algorithms to achieve significantly superior accuracies. Also, we show that the subgraph vectors could be used for building a deep learning variant of the Weisfeiler-Lehman graph kernel. Our experiments on several benchmark and large-scale real-world datasets reveal that subgraph2vec achieves significant improvements in accuracy over existing graph kernels on both supervised and unsupervised learning tasks. Specifically, on two real-world program analysis tasks, namely code clone and malware detection, subgraph2vec outperforms state-of-the-art kernels by more than 17% and 4%, respectively.
- It is well-known that malware constantly evolves so as to evade detection, and this causes the entire malware population to be non-stationary. Contrary to this fact, prior works on machine learning based Android malware detection have assumed that the distribution of the observed malware characteristics (i.e., features) does not change over time. In this work, we address the problem of malware population drift and propose a novel online machine learning based framework, named DroidOL, to handle it and effectively detect malware. In order to perform accurate detection, security-sensitive behaviors are captured from apps in the form of inter-procedural control-flow sub-graph features using a state-of-the-art graph kernel. In order to perform scalable detection and to adapt to the drift and evolution in the malware population, an online passive-aggressive classifier is used. In a large-scale comparative analysis with more than 87,000 apps, DroidOL achieves 84.29% accuracy, outperforming two state-of-the-art malware detection techniques by more than 20% in their typical batch learning setting and by more than 3% when they are continuously re-trained. Our experimental findings strongly indicate that online learning based approaches are highly suitable for real-world malware detection.
- In this paper, we propose a novel graph kernel specifically to address a challenging problem in the field of cyber-security, namely, malware detection. Previous research has revealed the following: (1) graph representations of programs are ideally suited for malware detection as they are robust against several attacks; (2) besides capturing topological neighbourhoods (i.e., structural information) from these graphs, it is important to capture the context under which the neighbourhoods are reachable to accurately detect malicious neighbourhoods. We observe that state-of-the-art graph kernels, such as the Weisfeiler-Lehman kernel (WLK), capture the structural information well but fail to capture contextual information. To address this, we develop the Contextual Weisfeiler-Lehman kernel (CWLK), which is capable of capturing both types of information. We show that for the malware detection problem, CWLK is more expressive and hence more accurate than WLK while maintaining comparable efficiency. Through our large-scale experiments with more than 50,000 real-world Android apps, we demonstrate that CWLK outperforms two state-of-the-art graph kernels (including WLK) and three malware detection techniques by more than 5.27% and 4.87% F-measure, respectively, while maintaining high efficiency. This high accuracy and efficiency make CWLK suitable for large-scale real-world malware detection.
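The structural half shared by the last three abstracts, the Weisfeiler-Lehman relabeling underlying WLK, is easy to sketch: each iteration compresses the pair (own label, sorted neighbour labels) into a new label, and the multiset of refined labels feeds the kernel's feature map. The toy version below omits the contextual information that CWLK adds.

```python
def wl_refine(adj, labels, iterations=2):
    """Weisfeiler-Lehman label refinement: repeatedly replace each node's
    label by a compressed id of (own label, sorted neighbour labels).
    `adj` is an adjacency list; `labels` the initial node labels."""
    table = {}  # maps each signature to a compact integer id
    for _ in range(iterations):
        labels = [table.setdefault(
            (labels[u], tuple(sorted(labels[v] for v in adj[u]))),
            len(table)) for u in range(len(adj))]
    return labels

# path 0-1-2 with uniform initial labels:
# the endpoints stay alike while the centre node gets its own label
print(wl_refine([[1], [0, 2], [1]], [0, 0, 0]))
```

Comparing the refined-label multisets of two graphs (e.g., via a dot product of label histograms) yields the WLK similarity the abstracts build on.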
- Jun 14 2016 cs.CC arXiv:1606.04016v2 We study the following problem: with the power of postselection (classically or quantumly), what is your ability to answer adaptive queries to certain languages? More specifically, for what kinds of computational classes $\mathcal{C}$ does $\mathsf{P}^{\mathcal{C}}$ belong to $\mathsf{PostBPP}$ and $\mathsf{PostBQP}$? While a complete answer to the above question seems out of reach given the present state of computational complexity theory, we study the analogous question in query complexity, which sheds light on the limitation of *relativized* methods (the relativization barrier) for the above question. Informally, we show that, for a partial function $f$, if there is no efficient *small bounded-error* algorithm for $f$ classically or quantumly (in the world of query complexity, being efficient means using $O(\operatorname{polylog}(n))$ time), then there is no efficient postselection bounded-error algorithm to answer adaptive queries to $f$ classically or quantumly. Our results imply a new proof for the classical oracle separation $\mathsf{P}^{\mathsf{NP}^{\mathcal{O}}} \not\subset \mathsf{PP}^{\mathcal{O}}$. They also lead to a new oracle separation $\mathsf{P}^{\mathsf{SZK}^{\mathcal{O}}} \not\subset \mathsf{PP}^{\mathcal{O}}$. Our result also implies a hardness amplification construction for polynomial approximation: given a function $f$ on $n$ bits, we construct an adaptive version of $f$, denoted by $F$, on $O(m \cdot n)$ bits, such that if $f$ requires large degree to approximate to error $2/3$ in a certain one-sided sense, then $F$ requires large degree to approximate even to error $1/2 - 2^{-m}$. Our construction achieves the same amplification as the work of Thaler (ICALP, 2016), by composing a function with $O(\log n)$ *deterministic* query complexity.
- Jun 06 2016 cs.CV arXiv:1606.00915v1 In this work we address the task of semantic image segmentation with Deep Learning and make three main contributions that are experimentally shown to have substantial practical merit. First, we highlight convolution with upsampled filters, or 'atrous convolution', as a powerful tool in dense prediction tasks. Atrous convolution allows us to explicitly control the resolution at which feature responses are computed within Deep Convolutional Neural Networks. It also allows us to effectively enlarge the field of view of filters to incorporate larger context without increasing the number of parameters or the amount of computation. Second, we propose atrous spatial pyramid pooling (ASPP) to robustly segment objects at multiple scales. ASPP probes an incoming convolutional feature layer with filters at multiple sampling rates and effective fields-of-view, thus capturing objects as well as image context at multiple scales. Third, we improve the localization of object boundaries by combining methods from DCNNs and probabilistic graphical models. The commonly deployed combination of max-pooling and downsampling in DCNNs achieves invariance but takes a toll on localization accuracy. We overcome this by combining the responses at the final DCNN layer with a fully connected Conditional Random Field (CRF), which is shown both qualitatively and quantitatively to improve localization performance. Our proposed "DeepLab" system sets the new state-of-the-art at the PASCAL VOC-2012 semantic image segmentation task, reaching 79.7% mIOU on the test set, and advances the results on three other datasets: PASCAL-Context, PASCAL-Person-Part, and Cityscapes. All of our code is made publicly available online.
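The effect of atrous convolution is easy to see in one dimension: spacing the filter taps `rate` samples apart enlarges the field of view without adding parameters. A toy pure-Python sketch (valid region only, no padding), not the DeepLab implementation:

```python
def atrous_conv1d(x, w, rate):
    """1-D atrous (dilated) convolution: taps of filter w are applied `rate`
    samples apart, enlarging the field of view from len(w) to
    rate * (len(w) - 1) + 1 while keeping the same number of weights."""
    span = rate * (len(w) - 1)
    return [sum(w[k] * x[i + k * rate] for k in range(len(w)))
            for i in range(len(x) - span)]

x = [0, 0, 0, 1, 0, 0, 0]
print(atrous_conv1d(x, [1, 1, 1], rate=1))  # field of view 3 -> [0, 1, 1, 1, 0]
print(atrous_conv1d(x, [1, 1, 1], rate=2))  # field of view 5, same 3 weights -> [0, 1, 0]
```

In 2-D frameworks this corresponds to the dilation parameter of a convolution layer, which is how the resolution/field-of-view control described above is exposed in practice.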
- Jun 01 2016 cs.DB arXiv:1605.09675v2 Recently, with the rapid development of mobile devices and crowdsourcing platforms, spatial crowdsourcing has attracted much attention from the database community. Specifically, spatial crowdsourcing refers to sending a location-based request to workers according to their positions, and workers need to physically move to specified locations to conduct tasks. Many works have studied task assignment problems in spatial crowdsourcing; however, their problem definitions differ considerably from each other. As a result, no work has compared the performance of existing algorithms for task assignment in spatial crowdsourcing. In this paper, we present a comprehensive experimental comparison of most existing algorithms for task assignment in spatial crowdsourcing. Specifically, we first give general definitions of spatial workers and spatial tasks, based on the definitions in existing works studying task assignment problems in spatial crowdsourcing, so that the existing algorithms can be applied to the same synthetic and real datasets. Then, we provide a uniform implementation of all the algorithms for task assignment problems in spatial crowdsourcing. Finally, based on the results on both synthetic and real datasets, we summarize the strengths and weaknesses of the tested algorithms, which can guide further research in this area and practical implementations of spatial crowdsourcing systems.
- May 30 2016 cs.LG arXiv:1605.08481v1 The best arm identification problem (BEST-1-ARM) is the most basic pure exploration problem in stochastic multi-armed bandits. The problem has a long history and has attracted significant attention over the last decade. However, we do not yet have a complete understanding of its optimal sample complexity: the state-of-the-art algorithms achieve a sample complexity of $O(\sum_{i=2}^{n} \Delta_{i}^{-2}(\ln\delta^{-1} + \ln\ln\Delta_i^{-1}))$ ($\Delta_{i}$ is the difference between the largest mean and the $i^{th}$ mean), while the best known lower bound is $\Omega(\sum_{i=2}^{n} \Delta_{i}^{-2}\ln\delta^{-1})$ for general instances and $\Omega(\Delta^{-2} \ln\ln \Delta^{-1})$ for two-arm instances. We propose to study instance-wise optimality for the BEST-1-ARM problem. Previous work has proved that it is impossible to have an instance-optimal algorithm for the 2-arm problem. However, we conjecture that, modulo the additive term $\Omega(\Delta_2^{-2} \ln\ln \Delta_2^{-1})$ (which is an upper bound and worst-case lower bound for the 2-arm problem), there is an instance-optimal algorithm for BEST-1-ARM. Moreover, we introduce a new quantity, called the gap entropy of a best-arm problem instance, and conjecture that it is the instance-wise lower bound. Hence, resolving this conjecture would provide a final answer to this old and basic problem.
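The quoted upper bound is easy to evaluate for a concrete instance; the following sketch computes the gap-dependent expression with all hidden constants set to 1, so the value is only indicative of scale (purely illustrative; the paper's algorithms are not reproduced here).

```python
import math

def best_arm_upper_bound(means, delta):
    """Evaluate sum_i Delta_i^{-2} * (ln(1/delta) + ln ln(1/Delta_i))
    over the non-best arms of a BEST-1-ARM instance, with all hidden
    constants set to 1 (illustrative only).

    Assumes a unique best arm and gaps strictly between 0 and 1, so
    that ln ln(1/Delta_i) is defined.
    """
    best = max(means)
    total = 0.0
    for mu in sorted(means, reverse=True)[1:]:   # arms i = 2..n
        gap = best - mu                          # Delta_i
        total += gap ** -2 * (math.log(1 / delta)
                              + math.log(math.log(1 / gap)))
    return total

bound_hard = best_arm_upper_bound([0.9, 0.8, 0.8], 0.05)  # gaps of 0.1
bound_easy = best_arm_upper_bound([0.9, 0.5, 0.5], 0.05)  # gaps of 0.4
```

Shrinking the gaps from 0.4 to 0.1 inflates the expression by roughly the $\Delta^{-2}$ factor, which is the intuition behind the gap-dependent bounds above.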
- We study the pure exploration problem subject to a matroid constraint (Best-Basis) in a stochastic multi-armed bandit game. In a Best-Basis instance, we are given $n$ stochastic arms with unknown reward distributions, as well as a matroid $\mathcal{M}$ over the arms. Let the weight of an arm be the mean of its reward distribution. Our goal is to identify a basis of $\mathcal{M}$ with the maximum total weight, using as few samples as possible. The problem is a significant generalization of the best arm identification problem and the top-$k$ arm identification problem, which have attracted significant attention in recent years. We study both the exact and PAC versions of Best-Basis, and provide algorithms with nearly optimal sample complexities for these versions. Our results generalize and/or improve on several previous results for the top-$k$ arm identification problem and the combinatorial pure exploration problem when the combinatorial constraint is a matroid.
- May 09 2016 cs.FL arXiv:1605.01810v1 The Schützenberger product of monoids is a key tool for the algebraic treatment of language concatenation. In this paper we generalize the Schützenberger product to the level of monoids in an algebraic category $\mathscr{D}$, leading to a uniform view of the corresponding constructions for monoids (Schützenberger), ordered monoids (Pin), idempotent semirings (Klíma and Polák) and algebras over a field (Reutenauer). In addition, assuming that $\mathscr{D}$ is part of a Stone-type duality, we derive a characterization of the languages recognized by Schützenberger products.
- We experimentally characterize the performance of mobile VLC and propose using OCT precoding to combat mobility-induced performance degradation. Results show that for approximately 300-Mb/s mobile VLC transmission, OCT precoding outperforms adaptively loaded DMT and offers a significant reduction in packet loss rate.
- May 06 2016 cs.LG arXiv:1605.01623v1 The convergence of Stochastic Gradient Descent (SGD) using convex loss functions has been widely studied. However, vanilla SGD methods using convex losses cannot perform well with noisy labels, which adversely affect the update of the primal variable in SGD methods. Unfortunately, noisy labels are ubiquitous in real-world applications such as crowdsourcing. To handle noisy labels, in this paper we present a family of robust losses for SGD methods. By employing our robust losses, SGD methods successfully reduce the negative effects caused by noisy labels on each update of the primal variable. We not only show that the convergence rate is O(1/T) for SGD methods using robust losses, but also provide a robustness analysis of two representative robust losses. Comprehensive experimental results on six real-world datasets show that SGD methods using robust losses are considerably more robust than other baseline methods in most situations, with fast convergence.
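The paper's specific robust losses are not given in the abstract; as a hedged illustration of the general idea, here is SGD on a 1-D linear classifier with a truncated (clipped) logistic loss. The truncation zeroes the gradient of any example whose loss exceeds a cap, so a mislabeled point stops dragging the primal variable around. This clipped loss is an illustrative stand-in, not the losses proposed in the paper.

```python
import math
import random

def sgd_robust(data, epochs=20, lr=0.1, clip=1.0, seed=0):
    """SGD for a 1-D linear classifier w with a truncated logistic loss.

    The loss min(log(1 + exp(-y*w*x)), clip) caps the penalty any single
    (possibly mislabeled) example can inflict: once an example's loss
    exceeds the truncation point, its gradient is zero and the update
    skips it.  Illustrative stand-in for the paper's robust losses.
    """
    rng = random.Random(seed)
    w = 0.0
    for _ in range(epochs):
        for x, y in rng.sample(data, len(data)):   # shuffled pass
            margin = y * w * x
            if math.log(1 + math.exp(-margin)) < clip:   # not truncated
                w -= lr * (-y * x / (1 + math.exp(margin)))
    return w

# four clean points and one mislabeled outlier at x = 3
data = [(1.0, 1), (2.0, 1), (-1.0, -1), (-2.0, -1), (3.0, -1)]
w = sgd_robust(data)
```

Once the learned weight makes the outlier's loss exceed the cap, the outlier is ignored and the clean points determine the classifier.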
- Apr 11 2016 cs.SI physics.soc-ph arXiv:1604.02444v1 Almost all real-world networks are subject to constant evolution, and many evolving networks have been investigated to uncover the underlying mechanisms of their organization and development. Compared with the rapid expansion of empirical studies exploring evolution mechanisms, link prediction methods that correspond to these mechanisms remain deficient. Real-world information always contains hints of what will happen next, which is also the case in observed evolving networks. In this paper, we first propose a structure-dependent index to strengthen the robustness of link prediction methods. We then treat the observed links and their timestamps in evolving networks as known information. We envision evolving networks as dynamic systems and model the evolutionary dynamics of node similarity. Based on the iterative updating of nodes' network positions, the potential trend of an evolving network is uncovered, which improves the accuracy of future link prediction. Experiments on various real-world networks show that the proposed index performs better than baseline methods and that the spatial-temporal position drift model performs well on real-world evolving networks.
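As background for the structural indices this abstract builds on, here is a minimal common-neighbors similarity score, the simplest classical link prediction index (a generic illustration; the paper's structure-dependent index and temporal drift model are richer than this sketch).

```python
def common_neighbor_scores(edges, nodes):
    """Score each non-adjacent node pair by its number of common
    neighbors -- the simplest structural similarity index used in
    link prediction; higher scores suggest likelier future links.
    """
    nbrs = {v: set() for v in nodes}
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    scores = {}
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            if v not in nbrs[u]:                 # only unlinked pairs
                scores[(u, v)] = len(nbrs[u] & nbrs[v])
    return scores

# a small graph where 1 and 4 are unlinked but share neighbors 2 and 3
scores = common_neighbor_scores([(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)],
                                [1, 2, 3, 4])
```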
- MIMO interference network optimization is important for increasingly crowded wireless communication networks. We provide a new algorithm, named the Dual Link algorithm, for the classic problem of weighted sum-rate maximization for MIMO multiaccess channels (MAC), broadcast channels (BC), and general MIMO interference channels with Gaussian input and a total power constraint. For MIMO MAC/BC, the algorithm finds optimal signals that achieve the capacity region boundary. For interference channels under the Gaussian input assumption, two of the previous state-of-the-art algorithms are the WMMSE algorithm and the polite water-filling (PWF) algorithm. The WMMSE algorithm is provably convergent, while the PWF algorithm takes advantage of the optimal transmit signal structure and converges the fastest in most situations, but is not guaranteed to converge in all situations. It is highly desirable to design an algorithm that has the advantages of both. The Dual Link algorithm is such an algorithm: its fast and guaranteed convergence is important for distributed implementation and time-varying channels. In addition, the technique and a scaling invariance property used in the convergence proof may find applications in other non-convex problems in communication networks.
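The polite water-filling algorithm mentioned above builds on classic water-filling; a textbook water-filling sketch over parallel Gaussian channels (not the Dual Link algorithm itself) illustrates the optimal transmit power structure being exploited.

```python
def water_filling(gains, total_power, tol=1e-9):
    """Classic water-filling over parallel Gaussian channels.

    Allocates power p_i = max(0, mu - 1/g_i) so that sum(p_i) equals
    total_power, finding the water level mu by bisection.  Strong
    channels get more power; channels below the water level get none.
    """
    lo, hi = 0.0, total_power + max(1.0 / g for g in gains)
    while hi - lo > tol:
        mu = (lo + hi) / 2
        used = sum(max(0.0, mu - 1.0 / g) for g in gains)
        if used > total_power:
            hi = mu
        else:
            lo = mu
    mu = (lo + hi) / 2
    return [max(0.0, mu - 1.0 / g) for g in gains]

# three channels with decreasing gain; the weakest gets no power
p = water_filling([1.0, 0.5, 0.25], total_power=3.0)
```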
- In real-world and online social networks, individuals receive and transmit information in real time. Cascading information transmissions (phone calls, text messages, social media posts) may be understood as a realization of a diffusion process operating on the network, and its branching path can be represented by a directed tree. One important feature of dynamic real-world diffusion processes is that the process may not traverse every edge in the network on which it operates. When the network itself is unknown, the path of the diffusion process may reveal some, but not all, of the edges connecting nodes that have received the diffusing information. The network reconstruction/inference problem is to estimate the connections that are not revealed by the diffusion processes. This problem naturally arises in many disciplines. Most existing work on network reconstruction studies this problem by deriving a likelihood function for the realized diffusion process given full knowledge of the network on which it operates, and attempting to find the network topology that maximizes this likelihood. The major challenge in this approach is the intractability of the optimization problem. In this paper, we focus on the network reconstruction problem for a broad class of real-world diffusion processes, exemplified by a network diffusion scheme called respondent-driven sampling (RDS) that is widely used in epidemiology. We prove that under a reasonable model of network diffusion, the likelihood of an observed RDS realization is a Bayesian log-submodular model. We propose a novel, accurate, and computationally efficient variational inference algorithm for the network reconstruction problem under this model. The algorithm allows the subjects' reported total degrees to deviate from their true degrees in the underlying graph structure.
- Mar 28 2016 cs.CV arXiv:1603.07957v1 This paper proposes a novel semi-supervised method for object recognition. First, based on Boost Picking, a universal algorithm, Boost Picking Teaching (BPT), is proposed to train an effective binary classifier using only a few labeled examples and a large amount of unlabeled data. Then, an ensemble strategy is detailed to synthesize multiple BPT-trained binary classifiers into a high-performance multi-class classifier. The rationality of the strategy is also analyzed in theory. Finally, the proposed method is tested on two databases, CIFAR-10 and CIFAR-100. Using 2% labeled data and 98% unlabeled data, the accuracies of the proposed method on the two datasets are 78.39% and 50.77%, respectively.
- In this paper, we consider the problem of actively learning a linear classifier through query synthesis, where the learner can construct artificial queries in order to estimate the true decision boundaries. This problem has recently gained a lot of interest in automated science and adversarial reverse engineering, for which only heuristic algorithms are known. In such applications, queries can be constructed de novo to elicit information (e.g., automated science) or to evade detection with minimal cost (e.g., adversarial reverse engineering). We develop a general framework, called dimension coupling (DC), that 1) reduces a d-dimensional learning problem to d-1 low-dimensional sub-problems, 2) solves each sub-problem efficiently, 3) appropriately aggregates the results and outputs a linear classifier, and 4) provides a theoretical guarantee for all possible schemes of aggregation. The proposed method is provably resilient to noise. We show that the DC framework avoids the curse of dimensionality: its computational complexity scales linearly with the dimension. Moreover, we show that the query complexity of DC is near optimal (within a constant factor of the optimal algorithm). To further support our theoretical analysis, we compare the performance of DC with existing work and observe that DC consistently outperforms prior art in terms of query complexity while often running orders of magnitude faster.
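Each one-dimensional sub-problem in a framework like DC amounts to locating a decision threshold with synthesized queries; a noise-free bisection sketch shows why the query cost is logarithmic in the desired precision (the paper's aggregation and noise handling are substantially more involved).

```python
def locate_threshold(label, lo=0.0, hi=1.0, queries=30):
    """Estimate the threshold t of a 1-D classifier
    label(x) = +1 iff x >= t by synthesizing membership queries at
    interval midpoints.  Noise-free bisection: each query halves the
    interval, so precision eps costs O(log(1/eps)) queries.
    """
    for _ in range(queries):
        mid = (lo + hi) / 2
        if label(mid) >= 0:     # mid is on the positive side: t <= mid
            hi = mid
        else:                   # mid is on the negative side: t > mid
            lo = mid
    return (lo + hi) / 2

# hidden threshold at 0.37; the learner only sees +/-1 answers
t_hat = locate_threshold(lambda x: 1 if x >= 0.37 else -1)
```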
- Mar 02 2016 cs.HC arXiv:1603.00441v1 As the number of vehicles continues to grow, parking spaces are at a premium on city streets. Additionally, due to the lack of knowledge about street parking spaces, heuristically circling the block not only wastes drivers' time and fuel, but also increases city congestion. In the wake of the recent trend to build convenient, green and energy-efficient smart cities, we rethink common techniques adopted by high-profile smart parking systems, and present a user-engaged (crowdsourcing) and sonar-based prototype to identify urban on-street parking spaces. The prototype includes an ultrasonic sensor, a GPS receiver and associated Arduino micro-controllers. It is mounted on the passenger side of a car to measure the distance from the vehicle to the nearest roadside obstacle. Multiple road tests are conducted around Wheatley, Oxford to gather results and emulate the crowdsourcing approach. By extracting parked vehicles' features from the collected trace, a supervised learning algorithm is developed to estimate roadside parking occupancy and spot illegally parked vehicles. A quantity estimation model is derived to calculate the number of sensing units required to cover urban streets. The estimate is quantitatively compared to a fixed sensing solution. The results show that the crowdsourcing approach would need substantially fewer sensors than the fixed sensing system.
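As a toy illustration of inferring occupancy from a sonar distance trace: a run of close-range readings indicates a parked vehicle between gaps of open curb. The thresholds and run-length rule here are hypothetical stand-ins for the paper's supervised, feature-based classifier.

```python
def count_parked(distances, gap_thresh=2.0, min_len=3):
    """Count parked vehicles in a roadside distance trace: every run
    of at least min_len consecutive readings closer than gap_thresh
    metres is counted as one parked vehicle.  Toy rule; the paper
    learns this decision from labeled traces instead.
    """
    count, run = 0, 0
    for d in distances:
        if d < gap_thresh:
            run += 1                       # still alongside an obstacle
        else:
            if run >= min_len:
                count += 1                 # run just ended: one vehicle
            run = 0
    if run >= min_len:                     # trace ended mid-vehicle
        count += 1
    return count

# open curb reads ~5 m; parked cars read ~1 m; the 2-sample blip is noise
trace = [5, 5, 1.2, 1.1, 1.3, 5, 5, 1.0, 1.1, 5, 1.2, 1.1, 1.0, 1.2, 5]
n_cars = count_parked(trace)
```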
- Recently, the deep neural network (derived from the artificial neural network) has attracted many researchers' attention with its outstanding performance. However, since this network requires high-performance GPUs and large storage, it is very hard to use on individual devices. To improve the deep neural network, many attempts have been made to refine the network structure or training strategy. Unlike those attempts, in this paper we focus on the basic propagation function of the artificial neural network and propose the binarized deep neural network. This network is a pure binary system, in which all values and calculations are binarized. As a result, our network can save considerable computational resources and storage, making it possible to use on various devices. Moreover, the experimental results prove the feasibility of the proposed network.
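A fully binarized layer can be sketched in a few lines: inputs, weights and activations all live in {-1, +1}, so each multiply reduces to an XNOR and each neuron's pre-activation to a popcount-style sum (an illustrative toy, not the paper's exact propagation function).

```python
def sign(v):
    """Binarize a pre-activation to {-1, +1} (ties go to +1)."""
    return 1 if v >= 0 else -1

def binarized_layer(inputs, weights):
    """One fully binarized layer: since inputs and weights are all in
    {-1, +1}, each product x*w equals XNOR on the sign bits, and the
    sum is a popcount -- no floating-point multiply-accumulate needed.
    """
    return [sign(sum(x * w for x, w in zip(inputs, row)))
            for row in weights]

x = [1, -1, 1, 1]
W = [[1, 1, -1, 1],     # neuron 1
     [-1, 1, 1, -1]]    # neuron 2
out = binarized_layer(x, W)
```

Storing one bit per weight instead of a 32-bit float is the source of the storage and compute savings the abstract claims.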
- Compressive sensing (CS) is a data acquisition technique that measures sparse or compressible signals at a sampling rate lower than their Nyquist rate. Results show that sparse signals can be reconstructed using greedy algorithms, often requiring prior knowledge such as the signal sparsity or the noise level. As a substitute for prior knowledge, cross validation (CV), a statistical method that examines whether a model overfits its data, has been proposed to determine the stopping condition of greedy algorithms. This paper first analyzes cross validation in a general compressive sensing framework and develops general cross validation techniques that can be used to understand CV-based sparse recovery algorithms. Furthermore, we provide theoretical analysis for OMP-CV, a cross validation modification of orthogonal matching pursuit, which has very good sparse recovery performance. Finally, numerical experiments are given to validate our theoretical results and investigate the behavior of OMP-CV.
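The CV stopping idea can be sketched with plain matching pursuit, a simpler cousin of OMP: hold out some measurements and stop the greedy iteration as soon as the held-out residual stops shrinking. This is illustrative only; OMP-CV itself re-solves a least-squares problem over the selected support at each step.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matching_pursuit_cv(cols_train, y_train, cols_val, y_val, max_iter=10):
    """Greedy sparse recovery with a cross-validation stopping rule.

    Each step adds a multiple of the dictionary column most correlated
    with the training residual; iteration stops when the held-out
    validation residual stops decreasing, so neither the sparsity nor
    the noise level needs to be known in advance.
    """
    n = len(cols_train)
    coef = [0.0] * n
    resid = list(y_train)

    def val_error(c):
        err = [y_val[i] - sum(c[j] * cols_val[j][i] for j in range(n))
               for i in range(len(y_val))]
        return dot(err, err)

    best = val_error(coef)
    for _ in range(max_iter):
        j = max(range(n), key=lambda k: abs(dot(cols_train[k], resid))
                / dot(cols_train[k], cols_train[k]))
        step = dot(cols_train[j], resid) / dot(cols_train[j], cols_train[j])
        coef[j] += step
        resid = [r - step * c for r, c in zip(resid, cols_train[j])]
        err = val_error(coef)
        if err >= best:            # validation error rose: overfitting
            coef[j] -= step        # roll back the last step and stop
            break
        best = err
    return coef

# sparse x = (3, 0, 0, -2) seen through 4 training and 2 validation rows
cols_train = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
y_train = [3, 0, 0, -2]
cols_val = [[1, 1], [1, -1], [1, 1], [1, -1]]
y_val = [1, 5]
x_hat = matching_pursuit_cv(cols_train, y_train, cols_val, y_val)
```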
- This paper proposes a universal method, Boost Picking, to train supervised classification models mainly with unlabeled data. Boost Picking adopts only two weak classifiers to estimate and correct the error. It is theoretically proved that Boost Picking can train a supervised model mainly with unlabeled data as effectively as the same model trained with 100% labeled data, provided that the recalls of the two weak classifiers are both greater than zero and the sum of their precisions is greater than one. Based on Boost Picking, we present "Test along with Training (TawT)" to improve the generalization of supervised models. Both Boost Picking and TawT are successfully tested on various small datasets.
- Feb 19 2016 cs.CV arXiv:1602.05660v1 This letter proposes a synthetic aperture radar (SAR) image registration method named Feature-Area Optimization (FAO). First, the traditional area-based optimization model is reconstructed and decomposed into three key but uncertain factors: initialization, slice set and regularization. Next, structural features are extracted by scale invariant feature transform (SIFT) in dual-resolution space (SIFT-DRS), a novel SIFT-like method dedicated to FAO. Then, the three key factors are determined based on these features. Finally, solving the factor-determined optimization model yields the registration result. A series of experiments demonstrates that the proposed method can register multi-temporal SAR images accurately and efficiently.
- Eilenberg-type correspondences, relating varieties of languages (e.g. of finite words, infinite words, or trees) to pseudovarieties of finite algebras, form the backbone of algebraic language theory. Numerous such correspondences are known in the literature. We demonstrate that they all arise from the same recipe: one models languages and the algebras recognizing them by monads on an algebraic category, and applies a Stone-type duality. Our main contribution is a variety theorem that covers e.g. Wilke's and Pin's work on $\infty$-languages, the variety theorem for cost functions of Daviaud, Kuperberg, and Pin, and unifies the two previous categorical approaches of Bojańczyk and of Adámek et al. In addition we derive a number of new results, including an extension of the local variety theorem of Gehrke, Grigorieff, and Pin from finite to infinite words.
- WeChat is a mobile messaging application that had 549 million active users as of Q1 2015, and "WeChat Moments" (WM) serves its social-networking function, allowing users to post/share links to web pages. WM differs from other social networks in that it imposes many restrictions on the information diffusion process to mitigate information overload. In this paper, we conduct a measurement study of information diffusion in the WM network by crawling and analyzing the spreading statistics of more than 160,000 pages involving approximately 40 million users. Specifically, we identify the relationship between the number of posted pages and the number of views, the diffusion path length, and the similarity and distribution of users' locations, as well as their connections with the GDP of the users' provinces. For each individual WM page, we measure its temporal characteristics (e.g., its lifetime and its popularity within a time period); for each individual user, we evaluate how many of, or how likely, one's friends will view his posted pages. Our results will help businesses decide when and how to release marketing pages over WM for better publicity.
- Feb 02 2016 cs.NI arXiv:1602.00066v1 The base station (BS) in a multi-channel cognitive radio (CR) network has to broadcast to secondary (or unlicensed) receivers/users on more than one broadcast channel via channel hopping (CH), because a single broadcast channel can be reclaimed by the primary (or licensed) user, leading to broadcast failures. Meanwhile, a secondary receiver needs to synchronize its clock with the BS's clock to avoid broadcast failures caused by possible clock drift between the CH sequences of the secondary receiver and the BS. In this paper, we propose a CH-based broadcast protocol called SASS, which enables a BS to successfully broadcast to secondary receivers over multiple broadcast channels via channel hopping. Specifically, the CH sequences are constructed on the basis of a mathematical construct, the Self-Adaptive Skolem sequence. Moreover, each secondary receiver under SASS is able to adaptively synchronize its clock with that of the BS without any information exchange, regardless of the amount of clock drift.
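The base combinatorial object is easy to exhibit: a Skolem sequence of order n fills positions 1..2n so that each k in 1..n occurs exactly twice, at positions exactly k apart (one exists iff n mod 4 is 0 or 1). A small backtracking search finds one; the paper's Self-Adaptive variant is not specified in the abstract, so this shows only the underlying construct.

```python
def skolem_sequence(n):
    """Find a Skolem sequence of order n by backtracking, or return
    None if none exists.  In the result s of length 2n, each value k
    in 1..n appears at two positions i and i+k.
    """
    seq = [0] * (2 * n)

    def place(k):
        if k == 0:
            return True                      # all values placed
        for i in range(2 * n - k):
            if seq[i] == 0 and seq[i + k] == 0:
                seq[i] = seq[i + k] = k      # try the pair (i, i+k)
                if place(k - 1):
                    return True
                seq[i] = seq[i + k] = 0      # backtrack
        return False

    return seq if place(n) else None

s = skolem_sequence(4)   # order 4 exists, e.g. [4, 2, 3, 2, 4, 3, 1, 1]
```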
- User behaviour targeting is essential in online advertising. Compared with sponsored search keyword targeting and contextual advertising page content targeting, user behaviour targeting builds users' interest profiles by tracking their online behaviour and then delivers relevant ads according to each user's interests, which leads to higher targeting accuracy and thus improved advertising performance. Current user profiling methods include building keywords and topic tags or mapping users onto a hierarchical taxonomy. However, to our knowledge, no previous work explicitly investigates the similarity of users' online visits and incorporates such similarity into their ad response prediction. In this work, we propose a general framework which learns user profiles based on their online browsing behaviour and transfers the learned knowledge to the prediction of their ad response. Technically, we propose a transfer learning model based on probabilistic latent factor graphical models, where the users' ad response profiles are generated from their online browsing profiles. Large-scale experiments based on real-world data demonstrate significant improvement of our solution over several strong baselines.
- Dec 31 2015 cs.DB arXiv:1512.08518v5 Spatial crowdsourcing refers to a system that periodically assigns a number of location-based workers to nearby spatial tasks (e.g., taking photos or videos at some spatial locations). Previous works on spatial crowdsourcing usually design task assignment strategies that maximize some assignment score; these strategies, however, consider only the workers/tasks available in the system at the time of assignment. Such strategies may achieve only local optimality, as they neglect future workers/tasks that may join the system. In contrast, in this paper, we aim to achieve "globally" optimal task assignments by considering not only present but also future (via prediction) workers/tasks. Specifically, we formalize an important problem, namely prediction-based spatial crowdsourcing (PB-SC), which seeks a "globally" optimal strategy for worker-and-task assignments over both present and predicted task/worker locations, such that the total assignment quality score is maximized under a traveling budget constraint. We design an effective grid-based prediction method to estimate spatial distributions of workers/tasks in the future, and then utilize the predictions in our task assignment procedure. We prove that the PB-SC problem is NP-hard, and thus intractable. Therefore, we propose efficient approximation algorithms to tackle the PB-SC problem, including greedy and divide-and-conquer (D&C) approaches, which can efficiently assign workers to spatial tasks with high quality scores and low budget consumption by considering both current and future task/worker distributions. Through extensive experiments, we demonstrate the efficiency and effectiveness of our PB-SC processing approaches on real/synthetic data.
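A minimal greedy baseline for this kind of assignment, repeatedly picking the best-scoring feasible worker-task pair under a travel budget, can be sketched as follows. The scoring and budget model here are illustrative stand-ins, not the paper's exact PB-SC formulation.

```python
def greedy_assign(workers, tasks, score, travel_cost, budget):
    """Greedy spatial task assignment: scan (worker, task) pairs in
    decreasing order of quality score, assigning each worker and task
    at most once while the total travel cost stays within budget.
    Simple one-shot baseline; the paper's approaches also account for
    predicted future workers/tasks.
    """
    pairs = sorted(((score(w, t), w, t) for w in workers for t in tasks),
                   reverse=True)
    assigned, used_w, used_t, spent = [], set(), set(), 0.0
    for s, w, t in pairs:
        cost = travel_cost(w, t)
        if w in used_w or t in used_t or spent + cost > budget:
            continue
        assigned.append((w, t))
        used_w.add(w)
        used_t.add(t)
        spent += cost
    return assigned, spent

def dist(a, b):
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

# two workers, three tasks; score favors short trips (hypothetical rule)
workers = [(0, 0), (5, 5)]
tasks = [(0, 1), (5, 4), (9, 9)]
matches, spent = greedy_assign(workers, tasks,
                               score=lambda w, t: 1 / (1 + dist(w, t)),
                               travel_cost=dist, budget=5.0)
```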