results for au:Zhou_Y in:cs

- Mar 28 2017 cs.CV arXiv:1703.08603v1It has been well demonstrated that adversarial examples, i.e., natural images with visually imperceptible perturbations added, generally exist for deep networks to fail on image classification. In this paper, we extend adversarial examples to semantic segmentation and object detection which are much more difficult. Our observation is that both segmentation and detection are based on classifying multiple targets on an image (e.g., the basic target is a pixel or a receptive field in segmentation, and an object proposal in detection), which inspires us to optimize a loss function over a set of pixels/proposals for generating adversarial perturbations. Based on this idea, we propose a novel algorithm named Dense Adversary Generation (DAG), which generates a large family of adversarial examples, and applies to a wide range of state-of-the-art deep networks for segmentation and detection. We also find that the adversarial perturbations can be transferred across networks with different training data, based on different architectures, and even for different recognition tasks. In particular, the transferability across networks with the same architecture is more significant than in other cases. Besides, summing up heterogeneous perturbations often leads to better transfer performance, which provides an effective method of black-box adversarial attack.
- In this work we introduce a conditional accelerated lazy stochastic gradient descent algorithm with optimal number of calls to a stochastic first-order oracle and convergence rate $O\left(\frac{1}{\varepsilon^2}\right)$ improving over the projection-free, Online Frank-Wolfe based stochastic gradient descent of Hazan and Kale [2012] with convergence rate $O\left(\frac{1}{\varepsilon^4}\right)$.
- Data deduplication is able to effectively identify and eliminate redundant data and only maintain a single copy of files and chunks. Hence, it is widely used in cloud storage systems to save storage space and network bandwidth. However, the occurrence of deduplication can be easily identified by monitoring and analyzing network traffic, which leads to the risk of user privacy leakage. The attacker can carry out a very dangerous side channel attack, i.e., learn-the-remaining-information (LRI) attack, to reveal users' privacy information by exploiting the side channel of network traffic in deduplication. Existing work addresses the LRI attack at the cost of the high bandwidth efficiency of deduplication. In order to address this problem, we propose a simple yet effective scheme, called randomized redundant chunk scheme (RRCS), to significantly mitigate the risk of the LRI attack while maintaining the high bandwidth efficiency of deduplication. The basic idea behind RRCS is to add randomized redundant chunks to mix up the real deduplication states of files used for the LRI attack, which effectively obfuscates the view of the attacker, who attempts to exploit the side channel of network traffic for the LRI attack. Our security analysis shows that RRCS could significantly mitigate the risk of the LRI attack. We implement the RRCS prototype and evaluate it by using three large-scale real-world datasets. Experimental results demonstrate the efficiency and efficacy of RRCS.
- Simulation-based learning (SBL) is gaining popularity as a low-cost and convenient training technique in a vast range of applications. However, for a SBL platform to be fully utilized as an effective training tool, it is essential that feedback on performance is provided automatically in real-time during training. It is the aim of this paper to develop an efficient and effective feedback extraction method for the provision of real-time feedback in SBL. Existing methods either have low effectiveness in improving novice skills or suffer from low efficiency, resulting in their inability to be used in real-time. In this paper, we propose a neural network based method (NNFB) to extract feedback based on the adversarial technique. NNFB utilizes a bounded update to minimize a L1 regularized loss via back-propagation. We empirically show that NNFB can be used to extract simple, yet effective feedback. Also, NNFB was observed to have high effectiveness and efficiency when compared to existing methods, thus making it a promising option for real-time feedback extraction in SBL.
- Maximum rank-distance (MRD) codes are extremal codes in the space of $m\times n$ matrices over a finite field, equipped with the rank metric. Up to generalizations, the classical examples of such codes were constructed in the 1970s and are today known as Gabidulin codes. Motivated by several recent approaches to construct MRD codes that are inequivalent to Gabidulin codes, we study the equivalence issue for Gabidulin codes themselves. This shows in particular that the family of Gabidulin codes already contains a huge subset of MRD codes that are pairwise inequivalent, provided that $2\le m\le n-2$.
- Feb 09 2017 cs.CV arXiv:1702.02512v1This paper presents a robust and efficient semi-dense visual odometry solution for RGB-D cameras. The core of our method is a 2D-3D ICP pipeline which estimates the pose of the sensor by registering the projection of a 3D semi-dense map of the reference frame with the 2D semi-dense region extracted in the current frame. The processing is speeded up by efficiently implemented approximate nearest neighbour fields under the Euclidean distance criterion, which permits the use of compact Gauss-Newton updates in the optimization. The registration is formulated as a maximum a posterior problem to deal with outliers and sensor noises, and consequently the equivalent weighted least squares problem is solved by iteratively reweighted least squares method. A variety of robust weight functions are tested and the optimum is determined based on the characteristics of the sensor model. Extensive evaluation on publicly available RGB-D datasets shows that the proposed method predominantly outperforms existing state-of-the-art methods.
- Transportation mode detection with personal devices has been investigated for over ten years due to its importance in monitoring ones' activities, understanding human mobility, and assisting traffic management. However, two main limitations are still preventing it from large-scale deployments: high power consumption, and the lack of high-volume and diverse labeled data. In order to reduce power consumption, existing approaches are sampling using fewer sensors and with lower frequency, which however lead to a lower accuracy. A common way to obtain labeled data is recording the ground truth while collecting data, but such method cannot apply to large-scale deployment due to its inefficiency. To address these issues, we adopt a new low-frequency sampling manner with a hierarchical transportation mode identification algorithm and propose an offline data labeling approach with its manual and automatic implementations. Through a real-world large-scale experiment and comparison with related works, our sampling manner and algorithm are proved to consume much less energy while achieving a competitive accuracy around 85%. The new offline data labeling approach is also validated to be efficient and effective in providing ground truth for model training and testing.
- We present a new class of decentralized first-order methods for nonsmooth and stochastic optimization problems defined over multiagent networks. Considering that communication is a major bottleneck in decentralized optimization, our main goal in this paper is to develop algorithmic frameworks which can significantly reduce the number of inter-node communications. We first propose a decentralized primal-dual method which can find an $\epsilon$-solution both in terms of functional optimality gap and feasibility residual in $O(1/\epsilon)$ inter-node communication rounds when the objective functions are convex and the local primal subproblems are solved exactly. Our major contribution is to present a new class of decentralized primal-dual type algorithms, namely the decentralized communication sliding (DCS) methods, which can skip the inter-node communications while agents solve the primal subproblems iteratively through linearizations of their local objective functions. By employing DCS, agents can still find an $\epsilon$-solution in $O(1/\epsilon)$ (resp., $O(1/\sqrt{\epsilon})$) communication rounds for general convex functions (resp., strongly convex functions), while maintaining the $O(1/\epsilon^2)$ (resp., $O(1/\epsilon)$) bound on the total number of intra-node subgradient evaluations. We also present a stochastic counterpart for these algorithms, denoted by SDCS, for solving stochastic optimization problems whose objective function cannot be evaluated exactly. In comparison with existing results for decentralized nonsmooth and stochastic optimization, we can reduce the total number of inter-node communication rounds by orders of magnitude while still maintaining the optimal complexity bounds on intra-node stochastic subgradient evaluations. The bounds on the subgradient evaluations are actually comparable to those required for centralized nonsmooth and stochastic optimization.
- Jan 13 2017 cs.AI arXiv:1701.03322v1We argue that extensibility is a key challenge for knowledge representation. For this purpose, we propose assertional logic - a knowledge model for easier extension with new AI building blocks. In assertional logic, all syntactic objects are categorized as set theoretic constructs including individuals, concepts and operators, and all kinds of knowledge are formalized by equality assertions. When extending with a new building block, one only needs to consider its interactions with the basic form of knowledge (i.e., equality assertions) without going deeper into its interactions with other existing ones. We first present a primitive form of assertional logic that uses minimal assumed knowledge and constructs. Then, we show how to extend it by definitions, which are special kinds of knowledge, i.e., assertions. As a case study, we show how assertional logic can be used to unify logic and probability, and more important AI building blocks including time.
- Jan 10 2017 cs.CV arXiv:1701.01924v1Image blur and image noise are common distortions during image acquisition. In this paper, we systematically study the effect of image distortions on the deep neural network (DNN) image classifiers. First, we examine the DNN classifier performance under four types of distortions. Second, we propose two approaches to alleviate the effect of image distortion: re-training and fine-tuning with noisy images. Our results suggest that, under certain conditions, fine-tuning with noisy images can alleviate much effect due to distorted inputs, and is more practical than re-training.
- Jan 10 2017 cs.CV arXiv:1701.01833v1Deep Convolution Neural Networks (DCNNs) are capable of learning unprecedentedly effective image representations. However, their ability in handling significant local and global image rotations remains limited. In this paper, we propose Active Rotating Filters (ARFs) that actively rotate during convolution and produce feature maps with location and orientation explicitly encoded. An ARF acts as a virtual filter bank containing the filter itself and its multiple unmaterialised rotated versions. During back-propagation, an ARF is collectively updated using errors from all its rotated versions. DCNNs using ARFs, referred to as Oriented Response Networks (ORNs), can produce within-class rotation-invariant deep features while maintaining inter-class discrimination for classification tasks. The oriented response produced by ORNs can also be used for image and object orientation estimation tasks. Over multiple state-of-the-art DCNN architectures, such as VGG, ResNet, and STN, we consistently observe that replacing regular filters with the proposed ARFs leads to significant reduction in the number of network parameters and improvement in classification performance. We report the best results on several commonly used benchmarks.
- Dec 28 2016 cs.CV arXiv:1612.08230v1Deep neural networks have been widely adopted for automatic organ segmentation from CT-scanned images. However, the segmentation accuracy on some small organs (e.g., the pancreas) is sometimes below satisfaction, arguably because deep networks are easily distracted by the complex and variable background region which occupies a large fraction of the input volume. In this paper, we propose a coarse-to-fine approach to deal with this problem. We train two deep neural networks using different regions of the input volume. The first one, the coarse-scaled model, takes the entire volume as its input. It is used for roughly locating the spatial position of the pancreas. The second one, the fine-scaled model, only sees a small input region covering the pancreas, thus eliminating the background noise and providing more accurate segmentation especially around the boundary areas. At the testing stage, we first use the coarse-scaled model to roughly locate the pancreas, then adopt the fine-scaled model to refine the initial segmentation in an iterative manner to obtain increasingly better segmentation. We evaluate our algorithm on the NIH pancreas segmentation dataset with 82 volumes, and outperform the state-of-the-art [18] by more than 4% measured by the Dice-Sorensen Coefficient (DSC). In addition, we report 62.43% DSC for our worst case, significantly better than 34.11% reported in [18].
- Dec 22 2016 cs.CV physics.optics arXiv:1612.07120v1We have designed a single-pixel camera with imaging around corners based on computational ghost imaging. It can obtain the image of an object when the camera cannot look at the object directly. Our imaging system explores the fact that a bucket detector in a ghost imaging setup has no spatial resolution capability. A series of experiments have been designed to confirm our predictions. This camera has potential applications for imaging around corner or other similar environments where the object cannot be observed directly.
- Designing chaotic maps with complex dynamics is a challenging topic. This paper introduces the nonlinear chaotic processing (NCP) model, which contains six basic nonlinear operations. Each operation is a general framework that can use existing chaotic maps as seed maps to generate a huge number of new chaotic maps. The proposed NCP model can be easily extended by introducing new nonlinear operations or by arbitrarily combining existing ones. The properties and chaotic behaviors of the NCP model are investigated. To show its effectiveness and usability, as examples, we provide four new chaotic maps generated by the NCP model and evaluate their chaotic performance using Lyapunov exponent, Shannon entropy, correlation dimension and initial state sensitivity. The experimental results show that these chaotic maps have more complex chaotic behaviors than existing ones.
- Nov 30 2016 cs.CV arXiv:1611.09580v1Visual surveillance systems have become one of the largest data sources of Big Visual Data in real world. However, existing systems for video analysis still lack the ability to handle the problems of scalability, expansibility and error-prone, though great advances have been achieved in a number of visual recognition tasks and surveillance applications, e.g., pedestrian/vehicle detection, people/vehicle counting. Moreover, few algorithms explore the specific values/characteristics in large-scale surveillance videos. To address these problems in large-scale video analysis, we develop a scalable video parsing and evaluation platform through combining some advanced techniques for Big Data processing, including Spark Streaming, Kafka and Hadoop Distributed Filesystem (HDFS). Also, a Web User Interface is designed in the system, to collect users' degrees of satisfaction on the recognition tasks so as to evaluate the performance of the whole system. Furthermore, the highly extensible platform running on the long-term surveillance videos makes it possible to develop more intelligent incremental algorithms to enhance the performance of various visual recognition tasks.
- Nov 28 2016 cs.SI arXiv:1611.08351v1According to NSDUH (National Survey on Drug Use and Health), 20 million Americans consumed drugs in the past few 30 days. Combating illicit drug use is of great interest to public health and law enforcement agencies. Despite of the importance, most of the existing studies on drug uses rely on surveys. Surveys on sensitive topics such as drug use may not be answered truthfully by the people taking them. Selecting a representative sample to survey is another major challenge. In this paper, we explore the possibility of using big multimedia data, including both images and text, from social media in order to discover drug use patterns at fine granularity with respect to demographics. Instagram posts are searched and collected by drug related terms by analyzing the hashtags supplied with each post. A large and dynamic dictionary of frequent drug related slangs is used to find these posts. User demographics are extracted using robust face image analysis algorithms. These posts are then mined to find common trends with regard to the time and location they are posted, and further in terms of age and gender of the drug users. Furthermore, by studying the accounts followed by the users of drug related posts, we extract common interests shared by drug users.
- Generalized twisted Gabidulin codes are one of the few known families of maximum rank matrix codes over finite fields. As a subset of m by n matrices, when m=n, the automorphism group of any generalized twisted Gabidulin code has been completely determined by the authors recently. In this paper, we consider the same problem for m<n. Under certain conditions on their parameters, we determine their middle nuclei and right nuclei, which are important invariants with respect to the equivalence for rank metric codes. Furthermore, we also use them to derive necessary conditions on the automorphisms of generalized twisted Gabidulin codes.
- Oct 11 2016 cs.CV arXiv:1610.02579v1The visual cues from multiple support regions of different sizes and resolutions are complementary in classifying a candidate box in object detection. Effective integration of local and contextual visual cues from these regions has become a fundamental problem in object detection. In this paper, we propose a gated bi-directional CNN (GBD-Net) to pass messages among features from different support regions during both feature learning and feature extraction. Such message passing can be implemented through convolution between neighboring support regions in two directions and can be conducted in various layers. Therefore, local and contextual visual patterns can validate the existence of each other by learning their nonlinear relationships and their close interactions are modeled in a more complex way. It is also shown that message passing is not always helpful but dependent on individual samples. Gated functions are therefore needed to control message transmission, whose on-or-offs are controlled by extra visual evidence from the input sample. The effectiveness of GBD-Net is shown through experiments on three object detection datasets, ImageNet, Pascal VOC2007 and Microsoft COCO. This paper also shows the details of our approach in wining the ImageNet object detection challenge of 2016, with source code provided on \urlhttps://github.com/craftGBD/craftGBD.
- Oct 11 2016 cs.CL arXiv:1610.02806v1We describe an attentive encoder that combines tree-structured recursive neural networks and sequential recurrent neural networks for modelling sentence pairs. Since existing attentive models exert attention on the sequential structure, we propose a way to incorporate attention into the tree topology. Specially, given a pair of sentences, our attentive encoder uses the representation of one sentence, which generated via an RNN, to guide the structural encoding of the other sentence on the dependency parse tree. We evaluate the proposed attentive encoder on three tasks: semantic similarity, paraphrase identification and true-false question selection. Experimental results show that our encoder outperforms all baselines and achieves state-of-the-art results on two tasks.
- Mean-squared-error (MSE) is one of the most widely used performance metrics for the designs and analysis of multi-input-multiple-output (MIMO) communications. Weighted MSE minimization, a more general formulation of MSE minimization, plays an important role in MIMO transceiver optimization. While this topic has a long history and has been extensively studied, existing treatments on the methods in solving the weighted MSE optimization are more or less sporadic and non-systematic. In this paper, we firstly review the two major methodologies, Lagrange multiplier method and majorization theory based method, and their common procedures in solving the weighted MSE minimization. Then some problems and limitations of the methods that were usually neglected or glossed over in existing literature are provided. These problems are fundamental and of critical importance for the corresponding MIMO transceiver optimizations. In addition, a new extended matrix-field weighted MSE model is proposed. Its solutions and applications are discussed in details. Compared with existing models, this new model has wider applications, e.g., nonlinear MIMO transceiver designs and capacity-maximization transceiver designs for general MIMO networks.
- Trajectory analysis is essential in many applications. In this paper, we address the problem of representing motion trajectories in a highly informative way, and consequently utilize it for analyzing trajectories. Our approach first leverages the complete information from given trajectories to construct a thermal transfer field which provides a context-rich way to describe the global motion pattern in a scene. Then, a 3D tube is derived which depicts an input trajectory by integrating its surrounding motion patterns contained in the thermal transfer field. The 3D tube effectively: 1) maintains the movement information of a trajectory, 2) embeds the complete contextual motion pattern around a trajectory, 3) visualizes information about a trajectory in a clear and unified way. We further introduce a droplet-based process. It derives a droplet vector from a 3D tube, so as to characterize the high-dimensional 3D tube information in a simple but effective way. Finally, we apply our tube-and-droplet representation to trajectory analysis applications including trajectory clustering, trajectory classification & abnormality detection, and 3D action recognition. Experimental comparisons with state-of-the-art algorithms demonstrate the effectiveness of our approach.
- Aug 31 2016 cs.NE arXiv:1608.08607v2The balance between convergence and diversity is a key issue of evolutionary multi-objective optimization. The recently proposed stable matching-based selection provides a new perspective to handle this balance under the framework of decomposition multi-objective optimization. In particular, the stable matching between subproblems and solutions, which achieves an equilibrium between their mutual preferences, implicitly strikes a balance between the convergence and diversity. Nevertheless, the original stable matching model has a high risk of matching a solution with a unfavorable subproblem which finally leads to an imbalanced selection result. In this paper, we propose an adaptive two-level stable matching-based selection for decomposition multi-objective optimization. Specifically, borrowing the idea of stable matching with incomplete lists, we match each solution with one of its favorite subproblems by restricting the length of its preference list during the first-level stable matching. During the second-level stable matching, the remaining subproblems are thereafter matched with their favorite solutions according to the classic stable matching model. In particular, we develop an adaptive mechanism to automatically set the length of preference list for each solution according to its local competitiveness. The performance of our proposed method is validated and compared with several state-of-the-art evolutionary multi-objective optimization algorithms on 62 benchmark problem instances. Empirical results fully demonstrate the competitive performance of our proposed method on problems with complicated Pareto sets and those with more than three objectives.
- Aug 30 2016 cs.CV arXiv:1608.07724v1Based on life-long observations of physical, chemical, and biologic phenomena in the natural world, humans can often easily picture in their minds what an object will look like in the future. But, what about computers? In this paper, we learn computational models of object transformations from time-lapse videos. In particular, we explore the use of generative models to create depictions of objects at future times. These models explore several different prediction tasks: generating a future state given a single depiction of an object, generating a future state given two depictions of an object at different times, and generating future states recursively in a recurrent framework. We provide both qualitative and quantitative evaluations of the generated results, and also conduct a human evaluation to compare variations of our models.
- Magnetic skyrmions are promising candidates for next-generation information carriers, owing to their small size, topological stability, and ultralow depinning current density. A wide variety of skyrmionic device concepts and prototypes have been proposed, highlighting their potential applications. Here, we report on a bioinspired skyrmionic device with synaptic plasticity. The synaptic weight of the proposed device can be strengthened/weakened by positive/negative stimuli, mimicking the potentiation/depression process of a biological synapse. Both short-term plasticity(STP) and long-term potentiation(LTP) functionalities have been demonstrated for a spiking time-dependent plasticity(STDP) scheme. This proposal suggests new possibilities for synaptic devices for use in spiking neuromorphic computing applications.
- Label ranking aims to learn a mapping from instances to rankings over a finite number of predefined labels. Random forest is a powerful and one of the most successfully general-purpose machine learning algorithms of modern times. In the literature, there seems no research has yet been done in applying random forest to label ranking. In this paper, We present a powerful random forest label ranking method which uses random decision trees to retrieve nearest neighbors that are not only similar in the feature space but also in the ranking space. We have developed a novel two-step rank aggregation strategy to effectively aggregate neighboring rankings discovered by the random forest into a final predicted ranking. Compared with existing methods, the new random forest method has many advantages including its intrinsically scalable tree data structure, highly parallel-able computational architecture and much superior performances. We present extensive experimental results to demonstrate that our new method achieves the best predictive accuracy performances compared with state-of-the-art methods for datasets with complete ranking and datasets with only partial ranking information.
- This paper investigates the compress-and-forward scheme for an uplink cloud radio access network (C-RAN) model, where multi-antenna base-stations (BSs) are connected to a cloud-computing based central processor (CP) via capacity-limited fronthaul links. The BSs compress the received signals with Wyner-Ziv coding and send the representation bits to the CP; the CP performs the decoding of all the users' messages. Under this setup, this paper makes progress toward the optimal structure of the fronthaul compression and CP decoding strategies for the compress-and-forward scheme in C-RAN. On the CP decoding strategy design, this paper shows that under a sum fronthaul capacity constraint, a generalized successive decoding strategy of the quantization and user message codewords that allows arbitrary interleaved order at the CP achieves the same rate region as the optimal joint decoding. Further, it is shown that a practical strategy of successively decoding the quantization codewords first, then the user messages, achieves the same maximum sum rate as joint decoding under individual fronthaul constraints. On the joint optimization of user transmission and BS quantization strategies, this paper shows that if the input distributions are assumed to be Gaussian, then under joint decoding, the optimal quantization scheme for maximizing the achievable rate region is Gaussian. Moreover, Gaussian input and Gaussian quantization with joint decoding achieve to within a constant gap of the capacity region of the Gaussian multiple-input multiple-output (MIMO) uplink C-RAN model. Finally, this paper addresses the computational aspect of optimizing uplink MIMO C-RAN by showing that under fixed Gaussian input, the sum rate maximization problem over the Gaussian quantization noise covariance matrices can be formulated as convex optimization problems, thereby facilitating its efficient solution.
- Aug 16 2016 cs.DM arXiv:1608.04217v1Given a set of $n$ elements separated by a pairwise distance matrix, the minimum differential dispersion problem (Min-Diff DP) aims to identify a subset of m elements (m < n) such that the difference between the maximum sum and the minimum sum of the inter-element distances between any two chosen elements is minimized. We propose an effective iterated local search (denoted by ILS_MinDiff) for Min-Diff DP. To ensure an effective exploration and exploitation of the search space, the proposed ILS_MinDiff algorithm iterates through three sequential search phases: a fast descent-based neighborhood search phase to find a local optimum from a given starting solution, a local optima exploring phase to visit nearby high-quality solutions around a given local optimum, and a local optima escaping phase to move away from the current search region. Experimental results on six data sets of 190 benchmark instances demonstrate that ILS_MinDiff competes favorably with the state-of-the-art algorithms by finding 130 improved best results (new upper bounds).
- Jul 18 2016 cs.CV arXiv:1607.04564v3Vehicle detection and annotation for streaming video data with complex scenes is an interesting but challenging task for urban traffic surveillance. In this paper, we present a fast framework of Detection and Annotation for Vehicles (DAVE), which effectively combines vehicle detection and attributes annotation. DAVE consists of two convolutional neural networks (CNNs): a fast vehicle proposal network (FVPN) for vehicle-like objects extraction and an attributes learning network (ALN) aiming to verify each proposal and infer each vehicle's pose, color and type simultaneously. These two nets are jointly optimized so that abundant latent knowledge learned from the ALN can be exploited to guide FVPN training. Once the system is trained, it can achieve efficient vehicle detection and annotation for real-world traffic surveillance data. We evaluate DAVE on a new self-collected UTS dataset and the public PASCAL VOC2007 car and LISA 2010 datasets, with consistent improvements over existing algorithms.
- Jul 14 2016 cs.CY arXiv:1607.03575v1Ads are an important revenue source for mobile app development, especially for free apps, whose expense can be compensated by ad revenue. The ad benefits also carry with costs. For example, too many ads can interfere the user experience, leading to less user retention and reduced earnings ultimately. In the paper, we aim at understanding the ad costs from users perspective. We utilize app reviews, which are widely recognized as expressions of user perceptions, to identify the ad costs concerned by users. Four types of ad costs, i.e., number of ads, memory/CPU overhead, traffic usage, and bettery consumption, have been discovered from user reviews. To verify whether different ad integration schemes generate different ad costs, we first obtain the commonly used ad schemes from 104 popular apps, and then design a framework named IntelliAd to automatically measure the ad costs of each scheme. To demonstrate whether these costs indeed influence users reactions, we finally observe the correlations between the measured ad costs and the user perceptions. We discover that the costs related to memory/CPU overhead and battery consumption are more concerned by users, while the traffic usage is less concerned by users in spite of its obvious variations among different schemes in the experiments. Our experimental results provide the developers with suggestions on better incorporating ads into apps and, meanwhile, ensuring the user experience.
- For each rank metric code $\mathcal{C}\subseteq \mathbb{K}^{m\times n}$, we associate a translation structure, the kernel of which is showed to be invariant with respect to the equivalence on rank metric codes. When $\mathbb{K}$ is a finite field $\mathbb{F}_q$ and $\mathcal{C}$ is a maximum rank distance code with minimum distance $d<\min\{m,n\}$, its kernel is proved to be $\mathbb{F}_q$. For an arbitrary linear rank metric code, we also propose and investigate other two invariants which are called its middle nucleus and right nucleus. For a linear maximum rank distance code in $\mathbb{F}_q^{m\times n}$ with $m\leqslant n$, we show that its right nucleus must be a finite field under the condition $\max\{d,m-d+2\} \geqslant \left\lfloor \frac{n}{2} \right\rfloor +1$ and its middle nucleus is always a finite field. For three types of rank metric codes in $\mathbb{F}_{2}^{n\times n}$ derived from an arbitrary quadratic APN function on $\mathbb{F}_{2^n}$, we get similar results on their kernels and middle (right) nuclei. For several known maximum rank distance codes, their middle (right) nuclei are calculated.
- We present an eigenspectrum partitioning scheme without inversion for the recently described real-space electronic transport code, TRANSEC. The primary advantage of TRANSEC is its highly parallel algorithm, which enables studying conductance in large systems. The present scheme adds a new source of parallelization, significantly enhancing TRANSEC's parallel scalability, especially for systems with many electrons. In principle, partitioning could enable super-linear parallel speedup, as we demonstrate in calculations within TRANSEC. In practical cases, we report better than five-fold improvement in CPU time and similar improvements in wall time, compared to previously-published large calculations. Importantly, the suggested scheme is relatively simple to implement. It can be useful for general large Hermitian or weakly non-Hermitian eigenvalue problems, whenever relatively accurate inversion via direct or iterative linear solvers is impractical.
- We study the phase retrieval problem, which solves quadratic system of equations, i.e., recovers a vector $\boldsymbol{x}\in \mathbb{R}^n$ from its magnitude measurements $y_i=|\langle \boldsymbol{a}_i, \boldsymbol{x}\rangle|, i=1,..., m$. We develop a gradient-like algorithm (referred to as RWF representing reshaped Wirtinger flow) by minimizing a nonconvex nonsmooth loss function. In comparison with existing nonconvex Wirtinger flow (WF) algorithm \citecandes2015phase, although the loss function becomes nonsmooth, it involves only the second power of variable and hence reduces the complexity. We show that for random Gaussian measurements, RWF enjoys geometric convergence to a global optimal point as long as the number $m$ of measurements is on the order of $n$, the dimension of the unknown $\boldsymbol{x}$. This improves the sample complexity of WF, and achieves the same sample complexity as truncated Wirtinger flow (TWF) \citechen2015solving, but without truncation in gradient loop. Furthermore, RWF costs less computationally than WF, and runs faster numerically than both WF and TWF. We further develop the incremental (stochastic) reshaped Wirtinger flow (IRWF) and show that IRWF converges linearly to the true signal. We further establish performance guarantee of an existing Kaczmarz method for the phase retrieval problem based on its connection to IRWF. We also empirically demonstrate that IRWF outperforms existing ITWF algorithm (stochastic version of TWF) as well as other batch algorithms.
- Apr 26 2016 cs.SI arXiv:1604.07096v1Drug use by people is on the rise and is of great interest to public health agencies and law enforcement agencies. As found by the National Survey on Drug Use and Health, 20 million Americans aged 12 years or older consumed illicit drugs in the past few 30 days. Given their ubiquity in everyday life, drug abuse related studies have received much and constant attention. However, most of the existing studies rely on surveys. Surveys present a fair number of problems because of their nature. Surveys on sensitive topics such as illicit drug use may not be answered truthfully by the people taking them. Selecting a representative sample to survey is another major challenge. In this paper, we explore the possibility of using big data from social media in order to understand illicit drug use behaviors. Instagram posts are collected using drug related terms by analyzing the hashtags supplied with each post. A large and dynamic dictionary of frequent illicit drug related slang is used to find these posts. These posts are studied to find common drug consumption behaviors with regard to time of day and week. Furthermore, by studying the accounts followed by the users of drug related posts, we hope to discover common interests shared by drug users.
- This paper considers the joint fronthaul compression and transmit beamforming design for the uplink cloud radio access network (C-RAN), in which multi-antenna user terminals communicate with a cloud-computing based centralized processor (CP) through multi-antenna base-stations (BSs) serving as relay nodes. A compress-and-forward relaying strategy, named the VMAC scheme, is employed, in which the BSs can either perform single-user compression or Wyner-Ziv coding to quantize the received signals and send the quantization bits to the CP via capacity-limited fronthaul links; the CP performs successive decoding with either successive interference cancellation (SIC) receiver or linear minimum-mean-square-error (MMSE) receiver. Under this setup, this paper investigates the joint optimization of the transmit beamformers at the users and the quantization noise covariance matrices at the BSs for maximizing the network utility. A novel weighted minimum-mean-square-error successive convex approximation (WMMSE-SCA) algorithm is first proposed for maximizing the weighted sum rate under the user transmit power and fronthaul capacity constraints with single-user compression. Assuming a heuristic decompression order, the proposed algorithm is then adapted for optimizing the transmit beamforming and fronthaul compression under Wyner-Ziv coding. This paper also proposes a low-complexity separate design consisting of optimizing transmit beamformers for the Gaussian vector multiple-access channel along with per-antenna quantizers with uniform quantization noise levels across the antennas at each BS. Numerical results show that with optimized beamforming and fronthaul compression, C-RAN can significantly outperform conventional cellular networks. Furthermore, the low complexity separate design already performs very close to the optimized joint design in regime of practical interest.
- Apr 07 2016 cs.NI arXiv:1604.01675v1Unraveling quality of experience (QoE) of video streaming is very challenging in bandwidth shared wireless networks. It is unclear how QoE metrics such as starvation probability and buffering time interact with dynamics of streaming traffic load. In this paper, we collect view records from one of the largest streaming providers in China over two weeks and perform an in-depth measurement study on flow arrival and viewing time that shed light on the real traffic pattern. Our most important observation is that the viewing time of streaming users fits a hyper-exponential distribution quite well. This implies that all the views can be categorized into two classes, short and long views with separated time scales. We then map the measured traffic pattern to bandwidth shared cellular networks and propose an analytical framework to compute the closed-form starvation probability on the basis of ordinary differential equations (ODEs). Our framework can be naturally extended to investigate practical issues including the progressive downloading and the finite video duration. Extensive trace-driven simulations validate the accuracy of our models. Our study reveals that the starvation metrics of the short and long views possess different sensitivities to the scheduling priority at base station. Hence, a better QoE tradeoff between the short and long views has a potential to be leveraged by offering them different scheduling weights. The flow differentiation involves tremendous technical and non-technical challenges because video content is owned by content providers but not the network operators and the viewing time of each session is unknown beforehand. To overcome these difficulties, we propose an online Bayesian approach to infer the viewing time of each incoming flow with the "least" information from content providers.
- Apr 04 2016 cs.AI arXiv:1604.00377v1Grouping problems aim to partition a set of items into multiple mutually disjoint subsets according to some specific criterion and constraints. Grouping problems cover a large class of important combinatorial optimization problems that are generally computationally difficult. In this paper, we propose a general solution approach for grouping problems, i.e., reinforcement learning based local search (RLS), which combines reinforcement learning techniques with descent-based local search. The viability of the proposed approach is verified on a well-known representative grouping problem (graph coloring) where a very simple descent-based coloring algorithm is applied. Experimental studies on popular DIMACS and COLOR02 benchmark graphs indicate that RLS achieves competitive performances compared to a number of well-known coloring algorithms.
- In this paper, we consider the robot motion (or task) planning problem under some given time bounded high level specifications. We use metric interval temporal logic (MITL), a member of the temporal logic family, to represent the task specification and then we provide a constructive way to generate a timed automaton and methods to look for accepting runs on the automaton to find a feasible motion (or path) sequence for the robot to complete the task.
- Mar 17 2016 cs.SE arXiv:1603.05069v1Modern cyber-physical systems (CPS) have a close inter-dependence between software and physical components. Automotive embedded systems are typical CPS, as physical chips, sensors and actuators are physical components and software embedded within are the cyber components. The current stage of embedded systems is highly complex in architecture design for both software and hardware. It is common in industrial practice that high level control algorithm development and low level code implementation on hardware platforms are developed separately with limited shared information. However, software code and hardware architecture become closely related with the increasing complexity. Correlated requirements and dependencies between hardware and software are emerging problems of industrial practice. We demonstrate in this paper a method to link model based system design with real-time simulations and analysis of the architecture model. This allows hardware software co-design and thus early selection of hardware architecture.
- Mar 14 2016 cs.AI arXiv:1603.03511v1In this paper, we propose a set theoretic approach for knowledge representation. While the syntax of an application domain is captured by set theoretic constructs including individuals, concepts and operators, knowledge is formalized by equality assertions. We first present a primitive form that uses minimal assumed knowledge and constructs. Then, assuming naive set theory, we extend it by definitions, which are special kinds of knowledge. Interestingly, we show that the primitive form is expressive enough to define logic operators, not only propositional connectives but also quantifiers.
- Mar 08 2016 cs.SI arXiv:1603.02175v1In this paper, we mine and learn to predict how similar a pair of users' interests towards videos are, based on demographic (age, gender and location) and social (friendship, interaction and group membership) information of these users. We use the video access patterns of active users as ground truth (a form of benchmark). We adopt tag-based user profiling to establish this ground truth, and justify why it is used instead of video-based methods, or many latent topic models such as LDA and Collaborative Filtering approaches. We then show the effectiveness of the different demographic and social features, and their combinations and derivatives, in predicting user interest similarity, based on different machine-learning methods for combining multiple features. We propose a hybrid tree-encoded linear model for combining the features, and show that it out-performs other linear and treebased models. Our methods can be used to predict user interest similarity when the ground-truth is not available, e.g. for new users, or inactive users whose interests may have changed from old access data, and is useful for video recommendation. Our study is based on a rich dataset from Tencent, a popular service provider of social networks, video services, and various other services in China.
- While the authors of Batch Normalization (BN) identify and address an important problem involved in training deep networks-- Internal Covariate Shift-- the current solution has certain drawbacks. Specifically, BN depends on batch statistics for layerwise input normalization during training which makes the estimates of mean and standard deviation of input (distribution) to hidden layers inaccurate for validation due to shifting parameter values (especially during initial training epochs). Also, BN cannot be used with batch-size 1 during training. We address these drawbacks by proposing a non-adaptive normalization technique for removing internal covariate shift, that we call Normalization Propagation. Our approach does not depend on batch statistics, but rather uses a data-independent parametric estimate of mean and standard-deviation in every layer thus being computationally faster compared with BN. We exploit the observation that the pre-activation before Rectified Linear Units follow Gaussian distribution in deep networks, and that once the first and second order statistics of any given dataset are normalized, we can forward propagate this normalization without the need for recalculating the approximate statistics for hidden layers.
- Mar 01 2016 cond-mat.mes-hall cs.ET arXiv:1602.08799v1Magnetic skyrmion holds promise as information carriers in the next-generation memory and logic devices, owing to the topological stability, small size and extremely low current needed to drive it. One of the most potential applications of skyrmion is to design racetrack memory (RM), named Sk-RM, instead of utilizing domain wall (DW). However, current studies face some key design challenges, e.g., skyrmion manipulation, data representation and synchronization etc. To address these challenges, we propose here a complementary Sk-RM structure with voltage manipulation. Functionality and performance of the proposed design are investigated with micromagnetic simulations.
- With the rapid growth of Internet technologies, cloud computing and social networks have become ubiquitous. An increasing number of people participate in social networks and massive online social data are obtained. In order to exploit knowledge from copious amounts of data obtained and predict social behavior of users, we urge to realize data mining in social networks. Almost all online websites use cloud services to effectively process the large scale of social data, which are gathered from distributed data centers. These data are so large-scale, high-dimension and widely distributed that we propose a distributed sparse online algorithm to handle them. Additionally, privacy-protection is an important point in social networks. We should not compromise the privacy of individuals in networks, while these social data are being learned for data mining. Thus we also consider the privacy problem in this article. Our simulations shows that the appropriate sparsity of data would enhance the performance of our algorithm and the privacy-preserving method does not significantly hurt the performance of the proposed algorithm.
- Feb 12 2016 cs.DC arXiv:1602.03770v1Load balancing, operator instance collocations and horizontal scaling are critical issues in Parallel Stream Processing Engines to achieve low data processing latency, optimized cluster utilization and minimized communication cost respectively. In previous work, these issues are typically tackled separately and independently. We argue that these problems are tightly coupled in the sense that they all need to determine the allocations of workloads and migrate computational states at runtime. Optimizing them independently would result in suboptimal solutions. Therefore, in this paper, we investigate how these three issues can be modeled as one integrated optimization problem. In particular, we first consider jobs where workload allocations have little effect on the communication cost, and model the problem of load balance as a Mixed-Integer Linear Program. Afterwards, we present an extended solution called ALBIC, which support general jobs. We implement the proposed techniques on top of Apache Storm, an open-source Parallel Stream Processing Engine. The extensive experimental results over both synthetic and real datasets show that our techniques clearly outperform existing approaches.
- Feb 11 2016 cs.CV arXiv:1602.03346v1Action parsing in videos with complex scenes is an interesting but challenging task in computer vision. In this paper, we propose a generic 3D convolutional neural network in a multi-task learning manner for effective Deep Action Parsing (DAP3D-Net) in videos. Particularly, in the training phase, action localization, classification and attributes learning can be jointly optimized on our appearancemotion data via DAP3D-Net. For an upcoming test video, we can describe each individual action in the video simultaneously as: Where the action occurs, What the action is and How the action is performed. To well demonstrate the effectiveness of the proposed DAP3D-Net, we also contribute a new Numerous-category Aligned Synthetic Action dataset, i.e., NASA, which consists of 200; 000 action clips of more than 300 categories and with 33 pre-defined action attributes in two hierarchical levels (i.e., low-level attributes of basic body part movements and high-level attributes related to action motion). We learn DAP3D-Net using the NASA dataset and then evaluate it on our collected Human Action Understanding (HAU) dataset. Experimental results show that our approach can accurately localize, categorize and describe multiple actions in realistic videos.
- Self-propagating malware (e.g., an Internet worm) exploits security loopholes in software to infect servers and then use them to scan the Internet for more vulnerable servers. While the mechanisms of worm infection and their propagation models are well understood, defense against worms remains an open problem. One branch of defense research investigates the behavioral difference between worm-infected hosts and normal hosts to set them apart. One particular observation is that a worm-infected host, which scans the Internet with randomly selected addresses, has a much higher connection-failure rate than a normal host. Rate-limit algorithms have been proposed to control the spread of worms by traffic shaping based on connection failure rate. However, these rate-limit algorithms can work properly only if it is possible to measure failure rates of individual hosts efficiently and accurately. This paper points out a serious problem in the prior method. To address this problem, we first propose a solution based on a highly efficient double-bitmap data structure, which places only a small memory footprint on the routers, while providing good measurement of connection failure rates whose accuracy can be tuned by system parameters. Furthermore, we propose another solution based on shared register array data structure, achieving better memory efficiency and much larger estimation range than our double-bitmap solution.
- Feb 08 2016 cs.DC arXiv:1602.01959v3In-memory caching of intermediate data and eager combining of data in shuffle buffers have been shown to be very effective in minimizing the re-computation and I/O cost in distributed data processing systems like Spark and Flink. However, it has also been widely reported that these techniques would create a large amount of long-living data objects in the heap, which may quickly saturate the garbage collector, especially when handling a large dataset, and hence would limit the scalability of the system. To eliminate this problem, we propose a lifetime-based memory management framework, which, by automatically analyzing the user-defined functions and data types, obtains the expected lifetime of the data objects, and then allocates and releases memory space accordingly to minimize the garbage collection overhead. In particular, we present Deca, a concrete implementation of our proposal on top of Spark, which transparently decomposes and groups objects with similar lifetimes into byte arrays and releases their space altogether when their lifetimes come to an end. An extensive experimental study using both synthetic and real datasets shows that, in comparing to Spark, Deca is able to 1) reduce the garbage collection time by up to 99.9%, 2) to achieve up to 22.7x speed up in terms of execution time in cases without data spilling and 41.6x speedup in cases with data spilling, and 3) to consume up to 46.6% less memory.
- Jan 14 2016 cond-mat.mes-hall cs.ET arXiv:1601.03085v1Magnetic domain-wall (DW) has been widely investigated for future memory and computing systems. However, energy efficiency and stability become two major challenges of DW-based systems. In this letter, we first propose exploiting skyrmions as on-chip and inter-chip interconnects for DW-based systems, owing to the topological stability, small size and ultra-low depinning current density. In the proposed technique, data are processed in the form of DWs but are transmitted instead in the form of skyrmions. The reversible conversion between a skyrmion and a DW pair can be physically achieved by connecting a wide and a narrow magnetic nanowire. Our proposed technique can realize highly compact, robust and energy-efficient on-chip and inter-chip interconnects for DW-based systems, enabling the system to take advantages of both the DW and skyrmion.
- Association rules is a very important part of data mining. It is used to find the interesting patterns from transaction databases. Apriori algorithm is one of the most classical algorithms of association rules, but it has the bottleneck in efficiency. In this article, we proposed a prefixed-itemset-based data structure for candidate itemset generation, with the help of the structure we managed to improve the efficiency of the classical Apriori algorithm.
- Jan 07 2016 cs.CV arXiv:1601.01145v2We address the vehicle detection and classification problems using Deep Neural Networks (DNNs) approaches. Here we answer to questions that are specific to our application including how to utilize DNN for vehicle detection, what features are useful for vehicle classification, and how to extend a model trained on a limited size dataset, to the cases of extreme lighting condition. Answering these questions we propose our approach that outperforms state-of-the-art methods, and achieves promising results on image with extreme lighting conditions.
- Dec 29 2015 cs.SE arXiv:1512.07950v1Android applications (apps) grow dramatically in recent years. Apps are user interface (UI) centric typically. Rapid UI responsiveness is key consideration to app developers. However, we still lack a handy tool for profiling app performance so as to diagnose performance problems. This paper presents PersisDroid, a tool specifically designed for this task. The key notion of PersisDroid is that the UI-triggered asynchronous executions also contribute to the UI performance, and hence its performance should be properly captured to facilitate performance diagnosis. However, Android allows tremendous ways to start the asynchronous executions, posing a great challenge to profiling such execution. This paper finds that they can be grouped into six categories. As a result, they can be tracked and profiled according to the specifics of each category with a dynamic instrumentation approach carefully tailored for Android. PersisDroid can then properly profile the asynchronous executions in task granularity, which equips it with low-overhead and high compatibility merits. Most importantly, the profiling data can greatly help the developers in detecting and locating performance anomalies. We code and open-source release PersisDroid. The tool is applied in diagnosing 20 open-source apps, and we find 11 of them contain potential performance problems, which shows its effectiveness in performance diagnosis for Android apps.
- Dec 22 2015 cs.AI arXiv:1512.06747v1Accurate and computationally efficient means for classifying human activities have been the subject of extensive research efforts. Most current research focuses on extracting complex features to achieve high classification accuracy. We propose a template selection approach based on Dynamic Time Warping, such that complex feature extraction and domain knowledge is avoided. We demonstrate the predictive capability of the algorithm on both simulated and real smartphone data.
- Dec 22 2015 cs.CR arXiv:1512.06578v1In e-healthcare record systems (EHRS), attribute-based encryption (ABE) appears as a natural way to achieve fine-grained access control on health records. Some proposals exploit key-policy ABE (KP-ABE) to protect privacy in such a way that all users are associated with specific access policies and only the ciphertexts matching the users' access policies can be decrypted. An issue with KP-ABE is that it requires an a priori formulation of access policies during key generation, which is not always practicable in EHRS because the policies to access health records are sometimes determined after key generation. In this paper, we revisit KPABE and propose a dynamic ABE paradigm, referred to as access policy redefinable ABE (APR-ABE). To address the above issue, APR-ABE allows users to redefine their access policies and delegate keys for the redefined ones; hence a priori precise policies are no longer mandatory. We construct an APR-ABE scheme with short ciphertexts and prove its full security in the standard model under several static assumptions.
- Dec 08 2015 cs.CV arXiv:1512.01691v1In this paper we present a framework for secure identification using deep neural networks, and apply it to the task of template protection for face authentication. We use deep convolutional neural networks (CNNs) to learn a mapping from face images to maximum entropy binary (MEB) codes. The mapping is robust enough to tackle the problem of exact matching, yielding the same code for new samples of a user as the code assigned during training. These codes are then hashed using any hash function that follows the random oracle model (like SHA-512) to generate protected face templates (similar to text based password protection). The algorithm makes no unrealistic assumptions and offers high template security, cancelability, and state-of-the-art matching performance. The efficacy of the approach is shown on CMU-PIE, Extended Yale B, and Multi-PIE face databases. We achieve high (~95%) genuine accept rates (GAR) at zero false accept rate (FAR) with up to 1024 bits of template security.
- In this paper, we propose a reachable set based collision avoidance algorithm for unmanned aerial vehicles (UAVs). UAVs have been deployed for agriculture research and management, surveillance and sensor coverage for threat detection and disaster search and rescue operations. It is essential for the aircraft to have on-board collision avoidance capability to guarantee safety. Instead of the traditional approach of collision avoidance between trajectories, we propose a collision avoidance scheme based on reachable sets and tubes. We then formulate the problem as a convex optimization problem seeking suitable control constraint sets for participating aircraft. We have applied the approach on a case study of two quadrotors and two fix-wing aircraft collision avoidance scenario.
- Matrix-parametrized models, including multiclass logistic regression and sparse coding, are used in machine learning (ML) applications ranging from computer vision to computational biology. When these models are applied to large-scale ML problems starting at millions of samples and tens of thousands of classes, their parameter matrix can grow at an unexpected rate, resulting in high parameter synchronization costs that greatly slow down distributed learning. To address this issue, we propose a Sufficient Factor Broadcasting (SFB) computation model for efficient distributed learning of a large family of matrix-parameterized models, which share the following property: the parameter update computed on each data sample is a rank-1 matrix, i.e., the outer product of two "sufficient factors" (SFs). By broadcasting the SFs among worker machines and reconstructing the update matrices locally at each worker, SFB improves communication efficiency --- communication costs are linear in the parameter matrix's dimensions, rather than quadratic --- without affecting computational correctness. We present a theoretical convergence analysis of SFB, and empirically corroborate its efficiency on four different matrix-parametrized ML models.
- We consider the problem of low rank approximation of binary matrices. Here we are given a $d \times n$ binary matrix $A$ and a small integer $k < d$. The goal is to find two binary matrices $U$ and $V$ of sizes $d \times k$ and $k \times n$ respectively, so that the Frobenius norm of $A-U V$ is minimized. There are two models of this problem, depending on the definition of the product of binary matrices: The $\mathrm{GF}(2)$ model and the Boolean semiring model. Previously, the only known results are $2$-approximation algorithms for the special case $k=1$ \citeKDD:ShenJY09, Jiang14 (where the two models are equivalent). In this paper, we give the first results for the general case $k>1$ for both $\mathrm{GF}(2)$ and Boolean model. For the $\mathrm{GF}(2)$ model, we show that a simple column-selection algorithm achieves $O(k)$-approximation. For the Boolean model, we develop a new algorithm and show that it is $O(2^k)$-approximation. For constant $k$, both algorithms run in polynomial time in the size of the matrix. We also show that the low rank binary matrix approximation problem is NP-hard even for $k=1$, solving a conjecture in \citeKoyuturk03.
- Nov 03 2015 cs.DS arXiv:1511.00648v2Given a constraint satisfaction problem (CSP) on $n$ variables, $x_1, x_2, \dots, x_n \in \{\pm 1\}$, and $m$ constraints, a global cardinality constraint has the form of $\sum_{i = 1}^{n} x_i = (1-2p)n$, where $p \in (\Omega(1), 1 - \Omega(1))$ and $pn$ is an integer. Let $AVG$ be the expected number of constraints satisfied by randomly choosing an assignment to $x_1, x_2, \dots, x_n$, complying with the global cardinality constraint. The CSP above average with the global cardinality constraint problem asks whether there is an assignment (complying with the cardinality constraint) that satisfies more than $(AVG+t)$ constraints, where $t$ is an input parameter. In this paper, we present an algorithm that finds a valid assignment satisfying more than $(AVG+t)$ constraints (if there exists one) in time $(2^{O(t^2)} + n^{O(d)})$. Therefore, the CSP above average with the global cardinality constraint problem is fixed-parameter tractable.
- In this paper, we present an optimization based method for path planning of a mobile robot subject to time bounded temporal constraints, in a dynamic environment. Temporal logic (TL) can address very complex task specification such as safety, coverage, motion sequencing etc. We use metric temporal logic (MTL) to encode the task specifications with timing constraints. We then translate the MTL formulae into mixed integer linear constraints and solve the associated optimization problem using a mixed integer linear program solver. This approach is different from the automata based methods which generate a finite abstraction of the environment and dynamics, and use an automata theoretic approach to formally generate a path that satisfies the TL. We have applied our approach on several case studies in complex dynamical environments subjected to timed temporal specifications.
- Oct 01 2015 cs.CV arXiv:1509.09243v1The normal compositional model (NCM) has been extensively used in hyperspectral unmixing. However, most of the previous research has focused on estimation of endmembers and/or their variability. Also, little work has employed spatial information in NCM. In this paper, we show that NCM can be used for calculating the uncertainty of the estimated endmembers with spatial priors incorporated for better unmixing. This results in a spatial compositional model (SCM) which features (i) spatial priors that force neighboring abundances to be similar based on their pixel similarity and (ii) a posterior that is obtained from a likelihood model which does not assume pixel independence. The resulting algorithm turns out to be easy to implement and efficient to run. We compared SCM with current state-of-the-art algorithms on synthetic and real images. The results show that SCM can in the main provide more accurate endmembers and abundances. Moreover, the estimated uncertainty can serve as a prediction of endmember error under certain conditions.
- Sep 08 2015 cs.CV arXiv:1509.01654v1Wearable cameras, such as Google Glass and Go Pro, enable video data collection over larger areas and from different views. In this paper, we tackle a new problem of locating the co-interest person (CIP), i.e., the one who draws attention from most camera wearers, from temporally synchronized videos taken by multiple wearable cameras. Our basic idea is to exploit the motion patterns of people and use them to correlate the persons across different videos, instead of performing appearance-based matching as in traditional video co-segmentation/localization. This way, we can identify CIP even if a group of people with similar appearance are present in the view. More specifically, we detect a set of persons on each frame as the candidates of the CIP and then build a Conditional Random Field (CRF) model to select the one with consistent motion patterns in different videos and high spacial-temporal consistency in each video. We collect three sets of wearable-camera videos for testing the proposed algorithm. All the involved people have similar appearances in the collected videos and the experiments demonstrate the effectiveness of the proposed algorithm.
- Sep 08 2015 cs.CV arXiv:1509.01719v1This paper introduces a new method to solve the cross-domain recognition problem. Different from the traditional domain adaption methods which rely on a global domain shift for all classes between source and target domain, the proposed method is more flexible to capture individual class variations across domains. By adopting a natural and widely used assumption -- "the data samples from the same class should lay on a low-dimensional subspace, even if they come from different domains", the proposed method circumvents the limitation of the global domain shift, and solves the cross-domain recognition by finding the compact joint subspaces of source and target domain. Specifically, given labeled samples in source domain, we construct subspaces for each of the classes. Then we construct subspaces in the target domain, called anchor subspaces, by collecting unlabeled samples that are close to each other and highly likely all fall into the same class. The corresponding class label is then assigned by minimizing a cost function which reflects the overlap and topological structure consistency between subspaces across source and target domains, and within anchor subspaces, respectively.We further combine the anchor subspaces to corresponding source subspaces to construct the compact joint subspaces. Subsequently, one-vs-rest SVM classifiers are trained in the compact joint subspaces and applied to unlabeled data in the target domain. We evaluate the proposed method on two widely used datasets: object recognition dataset for computer vision tasks, and sentiment classification dataset for natural language processing tasks. Comparison results demonstrate that the proposed method outperforms the comparison methods on both datasets.
- Sep 02 2015 cs.LG arXiv:1509.00181v7With the rapid growth in multimedia services and the enormous offers of video contents in online social networks, users have difficulty in obtaining their interests. Therefore, various personalized recommendation systems have been proposed. However, they ignore that the accelerated proliferation of social media data has led to the big data era, which has greatly impeded the process of video recommendation. In addition, none of them has considered both the privacy of users' contexts (e,g., social status, ages and hobbies) and video service vendors' repositories, which are extremely sensitive and of significant commercial value. To handle the problems, we propose a cloud-assisted differentially private video recommendation system based on distributed online learning. In our framework, service vendors are modeled as distributed cooperative learners, recommending videos according to user's context, while simultaneously adapting the video-selection strategy based on user-click feedback to maximize total user clicks (reward). Considering the sparsity and heterogeneity of big social media data, we also propose a novel geometric differentially private model, which can greatly reduce the performance (recommendation accuracy) loss. Our simulation shows the proposed algorithms outperform other existing methods and keep a delicate balance between computing accuracy and privacy preserving level.
- In this paper, we propose a multi-kernel classifier learning algorithm to optimize a given nonlinear and nonsmoonth multivariate classifier performance measure. Moreover, to solve the problem of kernel function selection and kernel parameter tuning, we proposed to construct an optimal kernel by weighted linear combination of some candidate kernels. The learning of the classifier parameter and the kernel weight are unified in a single objective function considering to minimize the upper boundary of the given multivariate performance measure. The objective function is optimized with regard to classifier parameter and kernel weight alternately in an iterative algorithm by using cutting plane algorithm. The developed algorithm is evaluated on two different pattern classification methods with regard to various multivariate performance measure optimization problems. The experiment results show the proposed algorithm outperforms the competing methods.
- Aug 21 2015 cs.DC arXiv:1508.04907v2Fault-tolerance techniques for stream processing engines can be categorized into passive and active approaches. A typical passive approach periodically checkpoints a processing task's runtime states and can recover a failed task by restoring its runtime state using its latest checkpoint. On the other hand, an active approach usually employs backup nodes to run replicated tasks. Upon failure, the active replica can take over the processing of the failed task with minimal latency. However, both approaches have their own inadequacies in Massively Parallel Stream Processing Engines (MPSPE). The passive approach incurs a long recovery latency especially when a number of correlated nodes fail simultaneously, while the active approach requires extra replication resources. In this paper, we propose a new fault-tolerance framework, which is Passive and Partially Active (PPA). In a PPA scheme, the passive approach is applied to all tasks while only a selected set of tasks will be actively replicated. The number of actively replicated tasks depends on the available resources. If tasks without active replicas fail, tentative outputs will be generated before the completion of the recovery process. We also propose effective and efficient algorithms to optimize a partially active replication plan to maximize the quality of tentative outputs. We implemented PPA on top of Storm, an open-source MPSPE and conducted extensive experiments using both real and synthetic datasets to verify the effectiveness of our approach.
- Aug 19 2015 cs.CV arXiv:1508.04224v1The problem of tag completion is to learn the missing tags of an image. In this paper, we propose to learn a tag scoring vector for each image by local linear learning. A local linear function is used in the neighborhood of each image to predict the tag scoring vectors of its neighboring images. We construct a unified objective function for the learning of both tag scoring vectors and local linear function parame- ters. In the objective, we impose the learned tag scoring vectors to be consistent with the known associations to the tags of each image, and also minimize the prediction error of each local linear function, while reducing the complexity of each local function. The objective function is optimized by an alternate optimization strategy and gradient descent methods in an iterative algorithm. We compare the proposed algorithm against different state-of-the-art tag completion methods, and the results show its advantages.
- Recently, Sheekey constructed a new family of maximal rank distance codes as sets of $q$-polynomials over $\mathbb{F}_{q^n}$, which are called the twisted Gabidulin codes. We investigate a generalization of them, which we call generalized twisted Gabidulin codes. Their Delsarte duals and adjoint codes are considered. We also completely determine the equivalence between different members of the generalized twisted Gabidulin codes, from which it follows that the generalized Gabidulin codes and the twisted Gabidulin codes are both proper subsets of this new family.
- Jul 14 2015 cs.CV arXiv:1507.03060v2One popular approach to interactively segment the foreground object of interest from an image is to annotate a bounding box that covers the foreground object. Then, a binary labeling is performed to achieve a refined segmentation. One major issue of the existing algorithms for such interactive image segmentation is their preference of an input bounding box that tightly encloses the foreground object. This increases the annotation burden, and prevents these algorithms from utilizing automatically detected bounding boxes. In this paper, we develop a new LooseCut algorithm that can handle cases where the input bounding box only loosely covers the foreground object. We propose a new Markov Random Fields (MRF) model for segmentation with loosely bounded boxes, including a global similarity constraint to better distinguish the foreground and background, and an additional energy term to encourage consistent labeling of similar-appearance pixels. This MRF model is then solved by an iterated max-flow algorithm. In the experiments, we evaluate LooseCut in three publicly-available image datasets, and compare its performance against several state-of-the-art interactive image segmentation algorithms. We also show that LooseCut can be used for enhancing the performance of unsupervised video segmentation and image saliency detection.
- Jul 02 2015 cs.CV arXiv:1507.00019v2In this paper, we study the problem of using contextual da- ta points of a data point for its classification problem. We propose to represent a data point as the sparse linear reconstruction of its context, and learn the sparse context to gather with a linear classifier in a su- pervised way to increase its discriminative ability. We proposed a novel formulation for context learning, by modeling the learning of context reconstruction coefficients and classifier in a unified objective. In this objective, the reconstruction error is minimized and the coefficient spar- sity is encouraged. Moreover, the hinge loss of the classifier is minimized and the complexity of the classifier is reduced. This objective is opti- mized by an alternative strategy in an iterative algorithm. Experiments on three benchmark data set show its advantage over state-of-the-art context-based data representation and classification methods.
- Jun 16 2015 cs.CV arXiv:1506.04340v1In this paper we present Deep Secure Encoding: a framework for secure classification using deep neural networks, and apply it to the task of biometric template protection for faces. Using deep convolutional neural networks (CNNs), we learn a robust mapping of face classes to high entropy secure codes. These secure codes are then hashed using standard hash functions like SHA-256 to generate secure face templates. The efficacy of the approach is shown on two face databases, namely, CMU-PIE and Extended Yale B, where we achieve state of the art matching performance, along with cancelability and high security with no unrealistic assumptions. Furthermore, the scheme can work in both identification and verification modes.