results for au:Wang_Y in:cs

- Genome-wide association studies (GWAS) have achieved great success in the genetic study of Alzheimer's disease (AD). Collaborative imaging genetics studies across different research institutions show the effectiveness of detecting genetic risk factors. However, the high dimensionality of GWAS data poses significant challenges in detecting risk SNPs for AD. Selecting relevant features is crucial in predicting the response variable. In this study, we propose a novel Distributed Feature Selection Framework (DFSF) to conduct the large-scale imaging genetics studies across multiple institutions. To speed up the learning process, we propose a family of distributed group Lasso screening rules to identify irrelevant features and remove them from the optimization. Then we select the relevant group features by performing the group Lasso feature selection process in a sequence of parameters. Finally, we employ the stability selection to rank the top risk SNPs that might help detect the early stage of AD. To the best of our knowledge, this is the first distributed feature selection model integrated with group Lasso feature selection as well as detecting the risk genetic factors across multiple research institutions system. Empirical studies are conducted on 809 subjects with 5.9 million SNPs which are distributed across several individual institutions, demonstrating the efficiency and effectiveness of the proposed method.
- Despite the recent success of deep-learning based semantic segmentation, deploying a pre-trained road scene segmenter to a city whose images are not presented in the training set would not achieve satisfactory performance due to dataset biases. Instead of collecting a large number of annotated images of each city of interest to train or refine the segmenter, we propose an unsupervised learning approach to adapt road scene segmenters across different cities. By utilizing Google Street View and its time-machine feature, we can collect unannotated images for each road scene at different times, so that the associated static-object priors can be extracted accordingly. By advancing a joint global and class-specific domain adversarial learning framework, adaptation of pre-trained segmenters to that city can be achieved without the need of any user annotation or interaction. We show that our method improves the performance of semantic segmentation in multiple cities across continents, while it performs favorably against state-of-the-art approaches requiring annotated training data.
- Apr 26 2017 cs.LO arXiv:1704.07774v1We make a mixture of Milner's $\pi$-calculus and our previous work on truly concurrent process algebra, which is called $\pi_{tc}$. We introduce syntax and semantics of $\pi_{tc}$, its properties based on strongly truly concurrent bisimilarities. Also, we include an axiomatization of $\pi_{tc}$. $\pi_{tc}$ can be used as a formal tool in verifying mobile systems in a truly concurrent flavor.
- A special homotopy continuation method, as a combination of the polyhedral homotopy and the linear product homotopy, is proposed for computing all the isolated solutions to a special class of polynomial systems. The root number bound of this method is between the total degree bound and the mixed volume bound and can be easily computed. The new algorithm has been implemented as a program called LPH using C++. Our experiments show its efficiency compared to the polyhedral or other homotopies on such systems. As an application, the algorithm can be used to find witness points on each connected component of a real variety.
- Apr 26 2017 cs.CV arXiv:1704.07711v1Signal decomposition is a classical problem in signal processing, which aims to separate an observed signal into two or more components each with its own property. Usually each component is described by its own subspace or dictionary. Extensive research has been done for the case where the components are additive, but in real world applications, the components are often non-additive. For example, an image may consist of a foreground object overlaid on a background, where each pixel either belongs to the foreground or the background. In such a situation, to separate signal components, we need to find a binary mask which shows the location of each component. Therefore it requires to solve a binary optimization problem. Since most of the binary optimization problems are intractable, we relax this problem to the approximated continuous problem, and solve it by alternating optimization technique, which solves the mask and the subspace projection for each component, alternatingly and iteratively. We show the application of the proposed algorithm for three applications: separation of text from background in images, separation of moving objects from a background undergoing global camera motion in videos, separation of sinusoidal and spike components in one dimensional signals. We demonstrate in each case that considering the non-additive nature of the problem can lead to significant improvement.
- Apr 25 2017 cs.SY arXiv:1704.07201v1Clock synchronization is a widely discussed topic in the engineering literature. Ensuring that individual clocks are closely aligned is important in network systems, since the correct timing of various events in a network is usually necessary for proper system implementation. However, many existing clock synchronization algorithms update clock values abruptly, resulting in discontinuous clocks which have been shown to lead to undesirable behavior. In this paper, we propose using the pulse-coupled oscillator model to guarantee clock continuity, demonstrating two general methods for achieving continuous phase evolution in any pulse-coupled oscillator network. We provide rigorous mathematical proof that the pulse-coupled oscillator algorithm is able to converge to the synchronized state when the phase continuity methods are applied. We provide simulation results supporting these proofs. We further investigate the convergence behavior of other pulse-coupled oscillator synchronization algorithms using the proposed methods.
- Apr 25 2017 cs.CV arXiv:1704.06972v1Recently, there has been a lot of interest in automatically generating descriptions for an image. Most existing language-model based approaches for this task learn to generate an image description word by word in its original word order. However, for humans, it is more natural to locate the objects and their relationships first, and then elaborate on each object, describing notable attributes. We present a coarse-to-fine method that decomposes the original image description into a skeleton sentence and its attributes, and generates the skeleton sentence and attribute phrases separately. By this decomposition, our method can generate more accurate and novel descriptions than the previous state-of-the-art. Experimental results on the MS-COCO and a larger scale Stock3M datasets show that our algorithm yields consistent improvements across different evaluation metrics, especially on the SPICE metric, which has much higher correlation with human ratings than the conventional metrics. Furthermore, our algorithm can generate descriptions with varied length, benefiting from the separate control of the skeleton and attributes. This enables image description generation that better accommodates user preferences.
- The proliferation of social media in communication and information dissemination has made it an ideal platform for spreading rumors. Automatically debunking rumors at their stage of diffusion is known as \textitearly rumor detection, which refers to dealing with sequential posts regarding disputed factual claims with certain variations and highly textual duplication over time. Thus, identifying trending rumors demands an efficient yet flexible model that is able to capture long-range dependencies among postings and produce distinct representations for the accurate early detection. However, it is a challenging task to apply conventional classification algorithms to rumor detection in earliness since they rely on hand-crafted features which require intensive manual efforts in the case of large amount of posts. This paper presents a deep attention model on the basis of recurrent neural networks (RNN) to learn \textitselectively temporal hidden representations of sequential posts for identifying rumors. The proposed model delves soft-attention into the recurrence to simultaneously pool out distinct features with particular focus and produce hidden representations that capture contextual variations of relevant posts over time. Extensive experiments on real datasets collected from social media websites demonstrate that (1) the deep attention based RNN model outperforms state-of-the-arts that rely on hand-crafted features; (2) the introduction of soft attention mechanism can effectively distill relevant parts to rumors from original posts in advance; (3) the proposed method detects rumors more quickly and accurately than competitors.
- Apr 13 2017 cs.AI arXiv:1704.03612v1In this paper, we develop a novel paradigm, namely hypergraph shift, to find robust graph modes by probabilistic voting strategy, which are semantically sound besides the self-cohesiveness requirement in forming graph modes. Unlike the existing techniques to seek graph modes by shifting vertices based on pair-wise edges (i.e, an edge with $2$ ends), our paradigm is based on shifting high-order edges (hyperedges) to deliver graph modes. Specifically, we convert the problem of seeking graph modes as the problem of seeking maximizers of a novel objective function with the aim to generate good graph modes based on sifting edges in hypergraphs. As a result, the generated graph modes based on dense subhypergraphs may more accurately capture the object semantics besides the self-cohesiveness requirement. We also formally prove that our technique is always convergent. Extensive empirical studies on synthetic and real world data sets are conducted on clustering and graph matching. They demonstrate that our techniques significantly outperform the existing techniques.
- Apr 12 2017 cs.CV arXiv:1704.03155v1Previous approaches for scene text detection have already achieved promising performances across various benchmarks. However, they usually fall short when dealing with challenging scenarios, even when equipped with deep neural network models, because the overall performance is determined by the interplay of multiple stages and components in the pipelines. In this work, we propose a simple yet powerful pipeline that yields fast and accurate text detection in natural scenes. The pipeline directly predicts words or text lines of arbitrary orientations and quadrilateral shapes in full images, eliminating unnecessary intermediate steps (e.g., candidate aggregation and word partitioning), with a single neural network. The simplicity of our pipeline allows concentrating efforts on designing loss functions and neural network architecture. Experiments on standard datasets including ICDAR 2015, COCO-Text and MSRA-TD500 demonstrate that the proposed algorithm significantly outperforms state-of-the-art methods in terms of both accuracy and efficiency. On the ICDAR 2015 dataset, the proposed algorithm achieves an F-score of 0.7820 at 13.2fps at 720p resolution.
- The millimeter-wave (mmWave) communication is envisioned to provide orders of magnitude capacity improvement. However, it is challenging to realize a sufficient link margin due to high path loss and blockages. To address this difficulty, in this paper, we explore the potential gain of ultra-densification for enhancing mmWave communications from a network-level perspective. By deploying the mmWave base stations (BSs) in an extremely dense and amorphous fashion, the access distance is reduced and the choice of serving BSs is enriched for each user, which are intuitively effective for mitigating the propagation loss and blockages. Nevertheless, co-channel interference under this model will become a performance-limiting factor. To solve this problem, we propose a large-scale channel state information (CSI) based interference coordination approach. Note that the large-scale CSI is highly location-dependent, and can be obtained with a quite low cost. Thus, the scalability of the proposed coordination framework can be guaranteed. Particularly, using only the large-scale CSI of interference links, a coordinated frequency resource block allocation problem is formulated for maximizing the minimum achievable rate of the users, which is uncovered to be a NP-hard integer programming problem. To circumvent this difficulty, a greedy scheme with polynomial-time complexity is proposed by adopting the bisection method and linear integer programming tools. Simulation results demonstrate that the proposed coordination scheme based on large-scale CSI only can still offer substantial gains over the existing methods. Moreover, although the proposed scheme is only guaranteed to converge to a local optimum, it performs well in terms of both user fairness and system efficiency.
- Apr 11 2017 cs.CV arXiv:1704.02446v1Prestack seismic data carries much useful information that can help us find more complex atypical reservoirs. Therefore, we are increasingly inclined to use prestack seismic data for seis- mic facies recognition. However, due to the inclusion of ex- cessive redundancy, effective feature extraction from prestack seismic data becomes critical. In this paper, we consider seis- mic facies recognition based on prestack data as an image clus- tering problem in computer vision (CV) by thinking of each prestack seismic gather as a picture. We propose a convo- lutional autoencoder (CAE) network for deep feature learn- ing from prestack seismic data, which is more effective than principal component analysis (PCA) in redundancy removing and valid information extraction. Then, using conventional classification or clustering techniques (e.g. K-means or self- organizing maps) on the extracted features, we can achieve seismic facies recognition. We applied our method to the prestack data from physical model and LZB region. The re- sult shows that our approach is superior to the conventionals.
- Apr 10 2017 cs.SI arXiv:1704.02042v1We propose a framework to measure, evaluate, and rank campaign effectiveness in the ongoing 2016 U.S. presidential election. Using Twitter data collected from Sept. 2015 to Jan. 2016, we first uncover the tweeting tactics of the candidates and second, using negative binomial regression and exploiting the variations in 'likes,' we evaluate the effectiveness of these tactics. Thirdly, we rank the candidates' campaign tactics by calculating the conditional expectation of their generated 'likes.' We show that while Ted Cruz and Marco Rubio put much weight on President Obama, this tactic is not being well received by their supporters. We demonstrate that Hillary Clinton's tactic of linking herself to President Obama resonates well with her supporters but the same is not true for Bernie Sanders. In addition, we show that Donald Trump is a major topic for all the other candidates and that the women issue is equally emphasized in Sanders' campaign as in Clinton's. Finally, we suggest two ways that politicians can use the feedback mechanism in social media to improve their campaign: (1) use feedback from social media to improve campaign tactics within social media; (2) prototype policies and test the public response from the social media.
- Apr 06 2017 cs.CV arXiv:1704.01256v1Traffic congestion is a widespread problem. Dynamic traffic routing systems and congestion pricing are getting importance in recent research. Lane prediction and vehicle density estimation is an important component of such systems. We introduce a novel problem of vehicle self-positioning which involves predicting the number of lanes on the road and vehicle's position in those lanes using videos captured by a dashboard camera. We propose an integrated closed-loop approach where we use the presence of vehicles to aid the task of self-positioning and vice-versa. To incorporate multiple factors and high-level semantic knowledge into the solution, we formulate this problem as a Bayesian framework. In the framework, the number of lanes, the vehicle's position in those lanes and the presence of other vehicles are considered as parameters. We also propose a bounding box selection scheme to reduce the number of false detections and increase the computational efficiency. We show that the number of box proposals decreases by a factor of 6 using the selection approach. It also results in large reduction in the number of false detections. The entire approach is tested on real-world videos and is found to give acceptable results.
- Mar 31 2017 cs.AI arXiv:1703.10316v1Knowledge graph embedding aims to embed entities and relations of knowledge graphs into low-dimensional vector spaces. Translating embedding methods regard relations as the translation from head entities to tail entities, which achieve the state-of-the-art results among knowledge graph embedding methods. However, a major limitation of these methods is the time consuming training process, which may take several days or even weeks for large knowledge graphs, and result in great difficulty in practical applications. In this paper, we propose an efficient parallel framework for translating embedding methods, called ParTrans-X, which enables the methods to be paralleled without locks by utilizing the distinguished structures of knowledge graphs. Experiments on two datasets with three typical translating embedding methods, i.e., TransE [3], TransH [17], and a more efficient variant TransE- AdaGrad [10] validate that ParTrans-X can speed up the training process by more than an order of magnitude.
- Mar 30 2017 cs.CV arXiv:1703.10025v1Extending state-of-the-art object detectors from image to video is challenging. The accuracy of detection suffers from degenerated object appearances in videos, e.g., motion blur, video defocus, rare poses, etc. Existing work attempts to exploit temporal information on box level, but such methods are not trained end-to-end. We present flow-guided feature aggregation, an accurate and end-to-end learning framework for video object detection. It leverages temporal coherence on feature level instead. It improves the per-frame features by aggregation of nearby features along the motion paths, and thus improves the video recognition accuracy. Our method significantly improves upon strong single-frame baselines in ImageNet VID, especially for more challenging fast moving objects. Our framework is principled, and on par with the best engineered systems winning the ImageNet VID challenges 2016, without additional bells-and-whistles. The code would be released.
- Mar 30 2017 cs.CV arXiv:1703.09746v1Very large-scale Deep Neural Networks (DNNs) have achieved remarkable successes in a large variety of computer vision tasks. However, the high computation intensity of DNNs makes it challenging to deploy these models on resource-limited systems. Some studies used low-rank approaches that approximate the filters by low-rank basis to accelerate the testing. Those works directly decomposed the pre-trained DNNs by Low-Rank Approximations (LRA). How to train DNNs toward lower-rank space for more efficient DNNs, however, remains as an open area. To solve the issue, in this work, we propose Force Regularization, which uses attractive forces to enforce filters so as to coordinate more weight information into lower-rank space. We mathematically and empirically prove that after applying our technique, standard LRA methods can reconstruct filters using much lower basis and thus result in faster DNNs. The effectiveness of our approach is comprehensively evaluated in ResNets, AlexNet, and GoogLeNet. In AlexNet, for example, Force Regularization gains 2x speedup on modern GPU without accuracy loss and 4.05x speedup on CPU by paying small accuracy degradation. Moreover, Force Regularization better initializes the low-rank DNNs such that the fine-tuning can converge faster toward higher accuracy. The obtained lower-rank DNNs can be further sparsified, proving that Force Regularization can be integrated with state-of-the-art sparsity-based acceleration methods.
- A text-to-speech synthesis system typically consists of multiple stages, such as a text analysis frontend, an acoustic model and an audio synthesis module. Building these components often requires extensive domain expertise and may contain brittle design choices. In this paper, we present Tacotron, an end-to-end generative text-to-speech model that synthesizes speech directly from characters. Given <text, audio> pairs, the model can be trained completely from scratch with random initialization. We present several key techniques to make the sequence-to-sequence framework perform well for this challenging task. Tacotron achieves a 3.82 subjective 5-scale mean opinion score on US English, outperforming a production parametric system in terms of naturalness. In addition, since Tacotron generates speech at the frame level, it's substantially faster than sample-level autoregressive methods.
- Average consensus is fundamental for distributed systems since it underpins key functionalities of such systems ranging from distributed information fusion, decision-making, to decentralized control. In order to reach an agreement, existing average consensus algorithms require each agent to exchange explicit state information with its neighbors. This leads to the disclosure of private state information, which is undesirable in cases where privacy is of concern. In this paper, we propose a novel approach that enables secure and privacy-preserving average consensus in a decentralized architecture in the absence of any trusted third-parties. By leveraging homomorphic cryptography, our approach can guarantee consensus to the exact value in a deterministic manner. The proposed approach is light-weight in computation and communication, and applicable to time-varying interaction topology cases. Implementation details and numerical examples are provided to demonstrate the capability of our approach.
- Mar 27 2017 cs.CV arXiv:1703.08493v1In the field of connectomics, neuroscientists seek to identify cortical connectivity comprehensively. Neuronal boundary detection from the Electron Microscopy (EM) images is often done to assist the automatic reconstruction of neuronal circuit. But the segmentation of EM images is a challenging problem, as it requires the detector to be able to detect both filament-like thin and blob-like thick membrane, while suppressing the ambiguous intracellular structure. In this paper, we propose multi-stage multi-recursive-input fully convolutional networks to address this problem. The multiple recursive inputs for one stage, i.e., the multiple side outputs with different receptive field sizes learned from the lower stage, provide multi-scale contextual boundary information for the consecutive learning. This design is biologically-plausible, as it likes a human visual system to compare different possible segmentation solutions to address the ambiguous boundary issue. Our multi-stage networks are trained end-to-end. It achieves promising results on a public available mouse piriform cortex dataset, which significantly outperforms other competitors.
- Outsourcing integrated circuit (IC) manufacturing to offshore foundries has grown exponentially in recent years. Given the critical role of ICs in the control and operation of vehicular systems and other modern engineering designs, such offshore outsourcing has led to serious security threats due to the potential of insertion of hardware trojans - malicious designs that, when activated, can lead to highly detrimental consequences. In this paper, a novel game-theoretic framework is proposed to analyze the interactions between a hardware manufacturer, acting as attacker, and an IC testing facility, acting as defender. The problem is formulated as a noncooperative game in which the attacker must decide on the type of trojan that it inserts while taking into account the detection penalty as well as the damage caused by the trojan. Meanwhile, the resource-constrained defender must decide on the best testing strategy that allows optimizing its overall utility which accounts for both damages and the fines. The proposed game is based on the robust behavioral framework of prospect theory (PT) which allows capturing the potential uncertainty, risk, and irrational behavior in the decision making of both the attacker and defender. For both, the standard rational expected utility (EUT) case and the PT case, a novel algorithm based on fictitious play is proposed and shown to converge to a mixed-strategy Nash equilibrium. For an illustrative case study, thorough analytical results are derived for both EUT and PT to study the properties of the reached equilibrium as well as the impact of key system parameters such as the defender-set fine. Simulation results assess the performance of the proposed framework under both EUT and PT and show that the use of PT will provide invaluable insights on the outcomes of the proposed hardware trojan game, in particular, and system security, in general.
- In this paper we analyze the gate activation signals inside the gated recurrent neural networks, and find the temporal structure of such signals is highly correlated with the phoneme boundaries. This correlation is further verified by a set of experiments for phoneme segmentation, in which better results compared to standard approaches were obtained.
- Data analysis often concerns not only the space where data come from, but also various types of maps attached to data. In recent years, several related structures have been used to study maps on data, including Reeb spaces, mappers and multiscale mappers. The construction of these structures also relies on the so-called \emphnerve of a cover of the domain. In this paper, we aim to analyze the topological information encoded in these structures in order to provide better understanding of these structures and facilitate their practical usage. More specifically, we show that the one-dimensional homology of the nerve complex $N(\mathcal{U})$ of a path-connected cover $\mathcal{U}$ of a domain $X$ cannot be richer than that of the domain $X$ itself. Intuitively, this result means that no new $H_1$-homology class can be "created" under a natural map from $X$ to the nerve complex $N(\mathcal{U})$. Equipping $X$ with a pseudometric $d$, we further refine this result and characterize the classes of $H_1(X)$ that may survive in the nerve complex using the notion of \emphsize of the covering elements in $\mathcal{U}$. These fundamental results about nerve complexes then lead to an analysis of the $H_1$-homology of Reeb spaces, mappers and multiscale mappers. The analysis of $H_1$-homology groups unfortunately does not extend to higher dimensions. Nevertheless, by using a map-induced metric, establishing a Gromov-Hausdorff convergence result between mappers and the domain, and interleaving relevant modules, we can still analyze the persistent homology groups of (multiscale) mappers to establish a connection to Reeb spaces.
- Mar 22 2017 cs.CV arXiv:1703.06993v2In this paper, we reveal the importance and benefits of introducing second-order operations into deep neural networks. We propose a novel approach named Second-Order Response Transform (SORT), which appends element-wise product transform to the linear sum of a two-branch network module. A direct advantage of SORT is to facilitate cross-branch response propagation, so that each branch can update its weights based on the current status of the other branch. Moreover, SORT augments the family of transform operations and increases the nonlinearity of the network, making it possible to learn flexible functions to fit the complicated distribution of feature space. SORT can be applied to a wide range of network architectures, including a branched variant of a chain-styled network and a residual network, with very light-weighted modifications. We observe consistent accuracy gain on both small (CIFAR10, CIFAR100 and SVHN) and big (ILSVRC2012) datasets. In addition, SORT is very efficient, as the extra computation overhead is less than 5%.
- Mar 21 2017 cs.NE arXiv:1703.06263v1In the evolutionary computation research community, the performance of most evolutionary algorithms (EAs) depends strongly on their implemented coordinate system. However, the commonly used coordinate system is fixed and not well suited for different function landscapes, EAs thus might not search efficiently. To overcome this shortcoming, in this paper we propose a framework, named ACoS, to adaptively tune the coordinate systems in EAs. In ACoS, an Eigen coordinate system is established by making use of the cumulative population distribution information, which can be obtained based on a covariance matrix adaptation strategy and an additional archiving mechanism. Since the population distribution information can reflect the features of the function landscape to some extent, EAs in the Eigen coordinate system have the capability to identify the modality of the function landscape. In addition, the Eigen coordinate system is coupled with the original coordinate system, and they are selected according to a probability vector. The probability vector aims to determine the selection ratio of each coordinate system for each individual, and is adaptively updated based on the collected information from the offspring. ACoS has been applied to two of the most popular EA paradigms, i.e., particle swarm optimization (PSO) and differential evolution (DE), for solving 30 test functions with 30 and 50 dimensions at the 2014 IEEE Congress on Evolutionary Computation. The experimental studies demonstrate its effectiveness.
- Mar 17 2017 cs.CG arXiv:1703.05475v2Graphs and network data are ubiquitous across a wide spectrum of scientific and application domains. Often in practice, an input graph can be considered as an observed snapshot of a (potentially continuous) hidden domain or process. Subsequent analysis, processing, and inferences are then performed on this observed graph. In this paper we advocate the perspective that an observed graph is often a noisy version of some discretized 1-skeleton of a hidden domain, and specifically we will consider the following natural network model: We assume that there is a true graph ${G^*}$ which is a certain proximity graph for points sampled from a hidden domain $\mathcal{X}$; while the observed graph $G$ is an Erd$\"{o}$s-R$\'{e}$nyi type perturbed version of ${G^*}$. Our network model is related to, and slightly generalizes, the much-celebrated small-world network model originally proposed by Watts and Strogatz. However, the main question we aim to answer is orthogonal to the usual studies of network models (which often focuses on characterizing / predicting behaviors and properties of real-world networks). Specifically, we aim to recover the metric structure of ${G^*}$ (which reflects that of the hidden space $\mathcal{X}$ as we will show) from the observed graph $G$. Our main result is that a simple filtering process based on the \emphJaccard index can recover this metric within a multiplicative factor of $2$ under our network model. Our work makes one step towards the general question of inferring structure of a hidden space from its observed noisy graph representation. In addition, our results also provide a theoretical understanding for Jaccard-Index-based denoising approaches.
- Mar 17 2017 cs.SE arXiv:1703.05375v1Agile techniques recently have received attention in developing safety-critical systems. However, a lack of empirical knowledge of performing safety assurance techniques in practice, especially safety analysis into agile development processes prevents further steps. In this article, we aim at investigating the feasibility and the effects of our S-Scrum development process, and stepwise improving and proposing an Optimized S-Scrum development process for safety-critical systems in a real environment. We conducted an exploratory case study in a one-year student project "Smart Home" at the University of Stuttgart, Germany. We participated in the project and collected quantitative and qualitative data from questionnaire, interviews, participant observation, physical artifacts, and documentation review. Furthermore, we evaluated the Optimized S-Scrum in industry by conducting interviews. The first-stage results showed that by integrating STPA (System-Theoretic Process Analysis) can ensure the safety during each sprint and enhance the safety of delivered products, while the agility of S-Scrum is slightly worse than the original Scrum. Six challenges have been explored: Management changes the team's priorities during an iteration; Disturbed safety-related communication; Non-functional requirements are determined too late; Insufficient upfront planning; Insufficient well-defined completion criteria; Excessive time to perform upfront planning. We investigated further the causalities and optimizations. The second-stage results revealed that the safety and agility have been improved after the optimizations. We have gained a positive assessment and suggestions from industry. The optimized S-Scrum is feasible for developing safety-critical systems concerning the capability to ensure safety and the acceptable agility in a student project. Further attempt is still needed in industrial projects.
- Mar 16 2017 cs.CV arXiv:1703.04611v2Subspace learning is an important problem, which has many applications in image and video processing. It can be used to find a low-dimensional representation of signals and images. But in many applications, the desired signal is heavily distorted by outliers and noise, which negatively affect the learned subspace. In this work, we present a novel algorithm for learning a subspace for signal representation, in the presence of structured outliers and noise. The proposed algorithm tries to jointly detect the outliers and learn the subspace for images. We present an alternating optimization algorithm for solving this problem, which iterates between learning the subspace and finding the outliers. This algorithm has been trained on a large number of image patches, and the learned subspace is used for image segmentation, and is shown to achieve better segmentation results than prior methods, including least absolute deviation fitting, k-means clustering based segmentation in DjVu, and shape primitive extraction and coding algorithm.
- Mar 14 2017 cs.CV arXiv:1703.04135v1Recently, Deep Convolutional Neural Networks (DCNNs) have made unprecedented progress, achieving the accuracy close to, or even better than human-level perception in various tasks. There is a timely need to map the latest software DCNNs to application-specific hardware, in order to achieve orders of magnitude improvement in performance, energy efficiency and compactness. Stochastic Computing (SC), as a low-cost alternative to the conventional binary computing paradigm, has the potential to enable massively parallel and highly scalable hardware implementation of DCNNs. One major challenge in SC based DCNNs is designing accurate nonlinear activation functions, which have a significant impact on the network-level accuracy but cannot be implemented accurately by existing SC computing blocks. In this paper, we design and optimize SC based neurons, and we propose highly accurate activation designs for the three most frequently used activation functions in software DCNNs, i.e, hyperbolic tangent, logistic, and rectified linear units. Experimental results on LeNet-5 using MNIST dataset demonstrate that compared with a binary ASIC hardware DCNN, the DCNN with the proposed SC neurons can achieve up to 61X, 151X, and 2X improvement in terms of area, power, and energy, respectively, at the cost of small precision degradation.In addition, the SC approach achieves up to 21X and 41X of the area, 41X and 72X of the power, and 198200X and 96443X of the energy, compared with CPU and GPU approaches, respectively, while the error is increased by less than 3.07%. ReLU activation is suggested for future SC based DCNNs considering its superior performance under a small bit stream length.
- Automatic decision-making approaches, such as reinforcement learning (RL), have been applied to (partially) solve the resource allocation problem adaptively in the cloud computing system. However, a complete cloud resource allocation framework exhibits high dimensions in state and action spaces, which prohibit the usefulness of traditional RL techniques. In addition, high power consumption has become one of the critical concerns in design and control of cloud computing systems, which degrades system reliability and increases cooling cost. An effective dynamic power management (DPM) policy should minimize power consumption while maintaining performance degradation within an acceptable level. Thus, a joint virtual machine (VM) resource allocation and power management framework is critical to the overall cloud computing system. Moreover, novel solution framework is necessary to address the even higher dimensions in state and action spaces. In this paper, we propose a novel hierarchical framework for solving the overall resource allocation and power management problem in cloud computing systems. The proposed hierarchical framework comprises a global tier for VM resource allocation to the servers and a local tier for distributed power management of local servers. The emerging deep reinforcement learning (DRL) technique, which can deal with complicated control problems with large state space, is adopted to solve the global tier problem. Furthermore, an autoencoder and a novel weight sharing structure are adopted to handle the high-dimensional state space and accelerate the convergence speed. On the other hand, the local tier of distributed server power managements comprises an LSTM based workload predictor and a model-free RL based power manager, operating in a distributed manner.
- Mar 07 2017 cs.SY arXiv:1703.01888v1We consider a multitask estimation problem where nodes in a network are divided into several connected clusters, with each cluster performing a least-mean-squares estimation of a different random parameter vector. Inspired by the adapt-then-combine diffusion strategy, we propose a multitask diffusion strategy whose mean stability can be ensured whenever individual nodes are stable in the mean, regardless of the inter-cluster cooperation weights. In addition, the proposed strategy is able to achieve an asymptotically unbiased estimation, when the parameters have same mean. We also develop an inter-cluster cooperation weights selection scheme that allows each node in the network to locally optimize its inter-cluster cooperation weights. Numerical results demonstrate that our approach leads to a lower average steady-state network mean-square deviation, compared with using weights selected by various other commonly adopted methods in the literature.
- Mar 07 2017 cs.SI arXiv:1703.01393v1Reciprocity in directed networks points to user's willingness to return favors in building mutual interactions. High reciprocity has been widely observed in many directed social media networks such as following relations in Twitter and Tumblr. Therefore, reciprocal relations between users are often regarded as a basic mechanism to create stable social ties and play a crucial role in the formation and evolution of networks. Each reciprocity relation is formed by two parasocial links in a back-and-forth manner with a time delay. Hence, understanding the delay can help us gain better insights into the underlying mechanisms of network dynamics. Meanwhile, the accurate prediction of delay has practical implications in advancing a variety of real-world applications such as friend recommendation and marketing campaign. For example, by knowing when will users follow back, service providers can focus on the users with potential long reciprocal delay for effective targeted marketing. This paper presents the initial investigation of the time delay in reciprocal relations. Our study is based on a large-scale directed network from Tumblr that consists of 62.8 million users and 3.1 billion user following relations with a timespan of multiple years (from 31 Oct 2007 to 24 Jul 2013). We reveal a number of interesting patterns about the delay that motivate the development of a principled learning model to predict the delay in reciprocal relations. Experimental results on the above mentioned dynamic networks corroborate the effectiveness of the proposed delay prediction model.
- Mar 06 2017 cs.CV arXiv:1703.01210v1This paper details the methodology and results of the EmotioNet challenge. This challenge is the first to test the ability of computer vision algorithms in the automatic analysis of a large number of images of facial expressions of emotion in the wild. The challenge was divided into two tracks. The first track tested the ability of current computer vision algorithms in the automatic detection of action units (AUs). Specifically, we tested the detection of 11 AUs. The second track tested the algorithms' ability to recognize emotion categories in images of facial expressions. Specifically, we tested the recognition of 16 basic and compound emotion categories. The results of the challenge suggest that current computer vision and machine learning algorithms are unable to reliably solve these two tasks. The limitations of current algorithms are more apparent when trying to recognize emotion. We also show that current algorithms are not affected by mild resolution changes, small occluders, gender or age, but that 3D pose is a major limiting factor on performance. We provide an in-depth discussion of the points that need special attention moving forward.
- Mar 06 2017 cs.CV arXiv:1703.01229v1Deep neural networks are playing an important role in state-of-the-art visual recognition. To represent high-level visual concepts, modern networks are equipped with large convolutional layers, which use a large number of filters and contribute significantly to model complexity. For example, more than half of the weights of AlexNet are stored in the first fully-connected layer (4,096 filters). We formulate the function of a convolutional layer as learning a large visual vocabulary, and propose an alternative way, namely Deep Collaborative Learning (DCL), to reduce the computational complexity. We replace a convolutional layer with a two-stage DCL module, in which we first construct a couple of smaller convolutional layers individually, and then fuse them at each spatial position to consider feature co-occurrence. In mathematics, DCL can be explained as an efficient way of learning compositional visual concepts, in which the vocabulary size increases exponentially while the model complexity only increases linearly. We evaluate DCL on a wide range of visual recognition tasks, including a series of multi-digit number classification datasets, and some generic image classification datasets such as SVHN, CIFAR and ILSVRC2012. We apply DCL to several state-of-the-art network structures, improving the recognition accuracy meanwhile reducing the number of parameters (16.82% fewer in AlexNet).
- Mar 03 2017 cs.CV arXiv:1703.00551v1We consider the problem of semantic image segmentation using deep convolutional neural networks. We propose a novel network architecture called the label refinement network that predicts segmentation labels in a coarse-to-fine fashion at several resolutions. The segmentation labels at a coarse resolution are used together with convolutional features to obtain finer resolution segmentation labels. We define loss functions at several stages in the network to provide supervisions at different stages. Our experimental results on several standard datasets demonstrate that the proposed model provides an effective way of producing pixel-wise dense image labeling.
- Mar 02 2017 cs.LO arXiv:1703.00159v1We design a calculus for true concurrency called CTC, including its syntax and operational semantics. CTC has good properties modulo several kinds of strongly truly concurrent bisimulations and weakly truly concurrent bisimulations, such as monoid laws, static laws, new expansion law for strongly truly concurrent bisimulations, $\tau$ laws for weakly truly concurrent bisimulations, and full congruences for strongly and weakly truly concurrent bisimulations, and also unique solution for recursion.
- Recently low displacement rank (LDR) matrices, or so-called structured matrices, have been proposed to compress large-scale neural networks. Empirical results have shown that neural networks with weight matrices of LDR matrices, referred as LDR neural networks, can achieve significant reduction in space and computational complexity while retaining high accuracy. We formally study LDR matrices in deep learning. First, we prove the universal approximation property of LDR neural networks with a mild condition on the displacement operators. We then show that the error bounds of LDR neural networks are as efficient as general neural networks with both single-layer and multiple-layer structure. Finally, we propose back-propagation based training algorithm for general LDR neural networks.
- In this paper, an approximated proximal alternating method (APAM) is proposed for solving non-convex and non-smooth problems. Our APAM approximately solves the subproblems to certain inexactness so that it requires neither computing/estimating the Lipschitz constants nor calculating exact solutions. With a special designed error condition, APAM is proved to globally converge to a critical point of the problem, which as far as we know, is so far the best convergence result in non-convex optimization. Moreover, two practical strategies for checking the inexactness criterion are theoretically analyzed, which are more valid than the commonly used criteria in practice. Our algorithm APAM is applied to non-convex dictionary learning problem on both synthetic and real-world data. The experimental results with detailed analyses and discussions are given to help verify the efficiency of APAM.
- This paper presents a survey on generalized Reed-Solomon codes and various decoding algorithms: Berlekamp-Massey decoding algorithms; Berlekamp-Welch decoding algorithms; Euclidean decoding algorithms; discrete Fourier decoding algorithms, Chien's search algorithm, and Forney's algorithm.
- Feb 28 2017 cs.NI arXiv:1702.08122v1Vehicle-to-infrastructure (V2I) communication may provide high data rates to vehicles via millimeter-wave (mmWave) microcellular networks. This paper uses stochastic geometry to analyze the coverage of urban mmWave microcellular networks. Prior work used a pathloss model with a line-of-sight probability function based on randomly oriented buildings, to determine whether a link was line-of-sight or non-line-of-sight. In this paper, we use a pathloss model inspired by measurements, which uses a Manhattan distance model and accounts for differences in pathloss exponents and losses when turning corners. In our model, users and base stations (BSs) are randomly located on a network formed by a two dimensional Poisson line process. Our model is well suited for urban microcellular networks where the base stations are deployed at street level. Based on this new approach, we derive the coverage probability under certain BS association rules to obtain closed-form solutions without much complexity. In addition, we draw two main conclusions from our work. First, non-line-of-sight BSs are not a major benefit for association or source of interference. Second, there is an ultradense regime where deploying (active) BSs does not enhance coverage.
- Feb 28 2017 cs.CL arXiv:1702.07793v1Deep learning approaches have been widely used in Automatic Speech Recognition (ASR) and they have achieved a significant accuracy improvement. Especially, Convolutional Neural Networks (CNNs) have been revisited in ASR recently. However, most CNNs used in existing work have less than 10 layers which may not be deep enough to capture all human speech signal information. In this paper, we propose a novel deep and wide CNN architecture denoted as RCNN-CTC, which has residual connections and Connectionist Temporal Classification (CTC) loss function. RCNN-CTC is an end-to-end system which can exploit temporal and spectral structures of speech signals simultaneously. Furthermore, we introduce a CTC-based system combination, which is different from the conventional frame-wise senone-based one. The basic subsystems adopted in the combination are different types and thus mutually complementary to each other. Experimental results show that our proposed single system RCNN-CTC can achieve the lowest word error rate (WER) on WSJ and Tencent Chat data sets, compared to several widely used neural network systems in ASR. In addition, the proposed system combination can offer a further error reduction on these two data sets, resulting in relative WER reductions of $14.91\%$ and $6.52\%$ on WSJ dev93 and Tencent Chat data sets respectively.
- Segmental structure is a common pattern in many types of sequences such as phrases in human languages. In this paper, we present a probabilistic model for sequences via their segmentations. The probability of a segmented sequence is calculated as the product of the probabilities of all its segments, where each segment is modeled using existing tools such as recurrent neural networks. Since the segmentation of a sequence is usually unknown in advance, we sum over all valid segmentations to obtain the final probability for the sequence. An efficient dynamic programming algorithm is developed for forward and backward computations without resorting to any approximation. We demonstrate our approach on text segmentation and speech recognition tasks. In addition to quantitative results, we also show that our approach can discover meaningful segments in their respective application contexts.
- We show that given an estimate $\widehat{A}$ that is close to a general high-rank positive semi-definite (PSD) matrix $A$ in spectral norm (i.e., $\|\widehat{A}-A\|_2 \leq \delta$), the simple truncated SVD of $\widehat{A}$ produces a multiplicative approximation of $A$ in Frobenius norm. This observation leads to many interesting results on general high-rank matrix estimation problems, which we briefly summarize below ($A$ is an $n\times n$ high-rank PSD matrix and $A_k$ is the best rank-$k$ approximation of $A$): (1) High-rank matrix completion: By observing $\Omega(\frac{n\max\{\epsilon^{-4},k^2\}\mu_0^2\|A\|_F^2\log n}{\sigma_{k+1}(A)^2})$ elements of $A$ where $\sigma_{k+1}\left(A\right)$ is the $\left(k+1\right)$-th singular value of $A$ and $\mu_0$ is the incoherence, the truncated SVD on a zero-filled matrix satisfies $\|\widehat{A}_k-A\|_F \leq (1+O(\epsilon))\|A-A_k\|_F$ with high probability. (2)High-rank matrix de-noising: Let $\widehat{A}=A+E$ where $E$ is a Gaussian random noise matrix with zero mean and $\nu^2/n$ variance on each entry. Then the truncated SVD of $\widehat{A}$ satisfies $\|\widehat{A}_k-A\|_F \leq (1+O(\sqrt{\nu/\sigma_{k+1}(A)}))\|A-A_k\|_F + O(\sqrt{k}\nu)$. (3) Low-rank Estimation of high-dimensional covariance: Given $N$ i.i.d.~samples $X_1,\cdots,X_N\sim\mathcal N_n(0,A)$, can we estimate $A$ with a relative-error Frobenius norm bound? We show that if $N = \Omega\left(n\max\{\epsilon^{-4},k^2\}\gamma_k(A)^2\log N\right)$ for $\gamma_k(A)=\sigma_1(A)/\sigma_{k+1}(A)$, then $\|\widehat{A}_k-A\|_F \leq (1+O(\epsilon))\|A-A_k\|_F$ with high probability, where $\widehat{A}=\frac{1}{N}\sum_{i=1}^N{X_iX_i^\top}$ is the sample covariance.
- Feb 23 2017 cs.CV arXiv:1702.06683v2The United States spends more than $1B each year on initiatives such as the American Community Survey (ACS), a labor-intensive door-to-door study that measures statistics relating to race, gender, education, occupation, unemployment, and other demographic factors. Although a comprehensive source of data, the lag between demographic changes and their appearance in the ACS can exceed half a decade. As digital imagery becomes ubiquitous and machine vision techniques improve, automated data analysis may provide a cheaper and faster alternative. Here, we present a method that determines socioeconomic trends from 50 million images of street scenes, gathered in 200 American cities by Google Street View cars. Using deep learning-based computer vision techniques, we determined the make, model, and year of all motor vehicles encountered in particular neighborhoods. Data from this census of motor vehicles, which enumerated 22M automobiles in total (8% of all automobiles in the US), was used to accurately estimate income, race, education, and voting patterns, with single-precinct resolution. (The average US precinct contains approximately 1000 people.) The resulting associations are surprisingly simple and powerful. For instance, if the number of sedans encountered during a 15-minute drive through a city is higher than the number of pickup trucks, the city is likely to vote for a Democrat during the next Presidential election (88% chance); otherwise, it is likely to vote Republican (82%). Our results suggest that automated systems for monitoring demographic trends may effectively complement labor-intensive approaches, with the potential to detect trends with fine spatial resolution, in close to real time.
- Feb 22 2017 cs.OH arXiv:1702.06391v1This work proves a new result on the correct convergence of Min-Sum Loopy Belief Propagation (LBP) in an interpolation problem on a square grid graph. The focus is on the notion of local solutions, a numerical quantity attached to each site of the graph that can be used for obtaining MAP estimates. The main result is that over an $N\times N$ grid graph with a one-run boundary configuration, the local solutions at each $i \in B$ can be calculated using Min-Sum LBP by passing difference messages in $2N$ iterations, which parallels the well-known convergence time in trees.
- Feb 21 2017 cs.CV arXiv:1702.05573v1We examine the problem of joint top-down active search of multiple objects under interaction, e.g., person riding a bicycle, cups held by the table, etc.. Such objects under interaction often can provide contextual cues to each other to facilitate more efficient search. By treating each detector as an agent, we present the first collaborative multi-agent deep reinforcement learning algorithm to learn the optimal policy for joint active object localization, which effectively exploits such beneficial contextual information. We learn inter-agent communication through cross connections with gates between the Q-networks, which is facilitated by a novel multi-agent deep Q-learning algorithm with joint exploitation sampling. We verify our proposed method on multiple object detection benchmarks. Not only does our model help to improve the performance of state-of-the-art active localization models, it also reveals interesting co-detection patterns that are intuitively interpretable.
- Feb 16 2017 cs.AI arXiv:1702.04594v1The Minimum Weight Dominating Set (MWDS) problem is an important generalization of the Minimum Dominating Set (MDS) problem with extensive applications. This paper proposes a new local search algorithm for the MWDS problem, which is based on two new ideas. The first idea is a heuristic called two-level configuration checking (CC2), which is a new variant of a recent powerful configuration checking strategy (CC) for effectively avoiding the recent search paths. The second idea is a novel scoring function based on the frequency of being uncovered of vertices. Our algorithm is called CC2FS, according to the names of the two ideas. The experimental results show that, CC2FS performs much better than some state-of-the-art algorithms in terms of solution quality on a broad range of MWDS benchmarks.
- Feb 15 2017 cs.CV arXiv:1702.04037v1Recently Trajectory-pooled Deep-learning Descriptors were shown to achieve state-of-the-art human action recognition results on a number of datasets. This paper improves their performance by applying rank pooling to each trajectory, encoding the temporal evolution of deep learning features computed along the trajectory. This leads to Evolution-Preserving Trajectory (EPT) descriptors, a novel type of video descriptor that significantly outperforms Trajectory-pooled Deep-learning Descriptors. EPT descriptors are defined based on dense trajectories, and they provide complimentary benefits to video descriptors that are not based on trajectories. In particular, we show that the combination of EPT descriptors and VideoDarwin leads to state-of-the-art performance on Hollywood2 and UCF101 datasets.
- Feb 15 2017 cs.CV arXiv:1702.04179v2Given a pedestrian image as a query, the purpose of person re-identification is to identify the correct match from a large collection of gallery images depicting the same person captured by disjoint camera views. The critical challenge is how to construct a robust yet discriminative feature representation to capture the compounded variations in pedestrian appearance. To this end, deep learning methods have been proposed to extract hierarchical features against extreme variability of appearance. However, existing methods in this category generally neglect the efficiency in the matching stage whereas the searching speed of a re-identification system is crucial in real-world applications. In this paper, we present a novel deep hashing framework with Convolutional Neural Networks (CNNs) for fast person re-identification. Technically, we simultaneously learn both CNN features and hash functions/codes to get robust yet discriminative features and similarity-preserving hash codes. Thereby, person re-identification can be resolved by efficiently computing and ranking the Hamming distances between images. A structured loss function defined over positive pairs and hard negatives is proposed to formulate a novel optimization problem so that fast convergence and more stable optimized solution can be obtained. Extensive experiments on two benchmarks CUHK03 \citeFPNN and Market-1501 \citeMarket1501 show that the proposed deep architecture is efficacy over state-of-the-arts.
- We argue that the implementation and verification of compilers for functional programming languages are greatly simplified by employing a higher-order representation of syntax known as Higher-Order Abstract Syntax or HOAS. The underlying idea of HOAS is to use a meta-language that provides a built-in and logical treatment of binding related notions. By embedding the meta-language within a larger programming or reasoning framework, it is possible to absorb the treatment of binding structure in the object language into the meta-theory of the system, thereby greatly simplifying the overall implementation and reasoning processes. We develop the above argument in this thesis by presenting and demonstrating the effectiveness of an approach to the verified implementation of compiler transformations for functional programs that exploits HOAS. In this approach, transformations on functional programs are first articulated in the form of rule-based relational specifications. These specifications are rendered into programs in the language lambda Prolog. On the one hand, these programs serve directly as implementations. On the other hand, they can be used as input to the Abella system which allows us to prove properties about them and thereby about the implementations. Both lambda Prolog and Abella support the use of the HOAS approach. Thus, they constitute a framework that can be used to test out the benefits of the HOAS approach in verified compilation. We use them to implement and verify a compiler for a representative functional programming language that embodies the transformations that form the core of many compilers for such languages. In both the programming and the reasoning phases, we show how the use of the HOAS approach significantly simplifies the representation, manipulation, analysis and reasoning of binding structure.
- Feb 14 2017 cs.DB arXiv:1702.03825v1The value proposition of a dataset often resides in the implicit interconnections or explicit relationships (patterns) among individual entities, and is often modeled as a graph. Effective visualization of such graphs can lead to key insights uncovering such value. In this article we propose a visualization method to explore graphs with numerical attributes associated with nodes (or edges) -- referred to as scalar graphs. Such numerical attributes can represent raw content information, similarities, or derived information reflecting important network measures such as triangle density and centrality. The proposed visualization strategy seeks to simultaneously uncover the relationship between attribute values and graph topology, and relies on transforming the network to generate a terrain map. A key objective here is to ensure that the terrain map reveals the overall distribution of components-of-interest (e.g. dense subgraphs, k-cores) and the relationships among them while being sensitive to the attribute values over the graph. We also design extensions that can capture the relationship across multiple numerical attributes (scalars). We demonstrate the efficacy of our method on several real-world data science tasks while scaling to large graphs with millions of nodes.
- Synapse crossbar is an elementary structure in Neuromorphic Computing Systems (NCS). However, the limited size of crossbars and heavy routing congestion impedes the NCS implementations of big neural networks. In this paper, we propose a two-step framework (namely, \textitgroup scissor) to scale NCS designs to big neural networks. The first step is \textitrank clipping, which integrates low-rank approximation into the training to reduce total crossbar area. The second step is \textitgroup connection deletion, which structurally prunes connections to reduce routing congestion between crossbars. Tested on convolutional neural networks of \textitLeNet on MNIST database and \textitConvNet on CIFAR-10 database, our experiments show significant reduction of crossbar area and routing area in NCS designs. Without accuracy loss, rank clipping reduces total crossbar area to 13.62\% and 51.81\% in the NCS designs of \textitLeNet and \textitConvNet, respectively. Following rank clipping, group connection deletion further reduces the routing area of \textitLeNet and \textitConvNet to 8.1\% and 52.06\%, respectively.
- We consider the problem of estimating and constructing component-wise confidence intervals of a sparse high-dimensional linear regression model when some covariates of the design matrix are missing completely at random. A variant of the Dantzig selector (Candes & Tao, 2007) is analyzed for estimating the regression model and a de-biasing argument is employed to construct component-wise confidence intervals under additional assumptions on the covariance of the design matrix. We also derive rates of convergence of the mean-square estimation error and the average confidence interval length, and show that the dependency over several model parameters (e.g., sparsity $s$, portion of observed covariates $\rho_*$, signal level $\|\beta_0\|_2$) are optimal in a minimax sense.
- The meta distribution of the signal-to-interference ratio (SIR) provides fine-grained information about the performance of individual links in a wireless network. This paper focuses on the analysis of the meta distribution of the SIR for both the cellular network uplink and downlink with fractional power control. For the uplink scenario, an approximation of the interfering user point process with a non-homogeneous Poisson point process is used. The moments of the meta distribution for both scenarios are calculated. Some bounds, the analytical expression, the mean local delay, and the beta approximation of the meta distribution are provided. The results give interesting insights into the effect of the power control in both the uplink and downlink. Detailed simulations show that the approximations made in the analysis are well justified.
- Feb 07 2017 cs.DC arXiv:1702.01508v1This paper made a short review of Cloud Computing and Big Data, and discussed the portability of general data mining algorithms to Cloud Computing platform. It revealed the Cloud Computing platform based on Map-Reduce cannot solve all the Big Data and data mining problems. Transplanting the general data mining algorithms to the real-time Cloud Computing platform will be one of the research focuses in Cloud Computing and Big Data.
- Feb 07 2017 cs.AI arXiv:1702.01510v1With the advent of modern computer networks, fault diagnosis has been a focus of research activity. This paper reviews the history of fault diagnosis in networks and discusses the main methods in information gathering section, information analyzing section and diagnosing and revolving section of fault diagnosis in networks. Emphasis will be placed upon knowledge-based methods with discussing the advantages and shortcomings of the different methods. The survey is concluded with a description of some open problems.
- Feb 07 2017 cs.CV arXiv:1702.01507v1Multi-camera tracking is quite different from single camera tracking, and it faces new technology and system architecture challenges. By analyzing the corresponding characteristics and disadvantages of the existing algorithms, problems in multi-camera tracking are summarized and some new directions for future work are also generalized.
- This paper presents a new technique for automatically synthesizing SQL queries from natural language. Our technique is fully automated, works for any database without requiring additional customization, and does not require users to know the underlying database schema. Our method achieves these goals by combining natural language processing, program synthesis, and automated program repair. Given the user's English description, our technique first uses semantic parsing to generate a query sketch, which is subsequently completed using type-directed program synthesis and assigned a confidence score using database contents. However, since the user's description may not accurately reflect the actual database schema, our approach also performs fault localization and repairs the erroneous part of the sketch. This synthesize-repair loop is repeated until the algorithm infers a query with a sufficiently high confidence score. We have implemented the proposed technique in a tool called Sqlizer and evaluate it on three different databases. Our experiments show that the desired query is ranked within the top 5 candidates in close to 90% of the cases.
- Influence maximization (IM), which selects a set of $k$ users (called seeds) to maximize the influence spread over a social network, is a fundamental problem in a wide range of applications such as viral marketing and network monitoring. Existing IM solutions fail to consider the highly dynamic nature of social influence, which results in either poor seed qualities or long processing time when the network evolves. To address this problem, we define a novel IM query named Stream Influence Maximization (SIM) on social streams. Technically, SIM adopts the sliding window model and maintains a set of $k$ seeds with the largest influence value over the most recent social actions. Next, we propose the Influential Checkpoints (IC) framework to facilitate continuous SIM query processing. The IC framework creates a checkpoint for each window slide and ensures an $\varepsilon$-approximate solution. To improve its efficiency, we further devise a Sparse Influential Checkpoints (SIC) framework which selectively keeps $O(\frac{\log{N}}{\beta})$ checkpoints for a sliding window of size $N$ and maintains an $\frac{\varepsilon(1-\beta)}{2}$-approximate solution. Experimental results on both real-world and synthetic datasets confirm the effectiveness and efficiency of our proposed frameworks against the state-of-the-art IM approaches.
- Feb 07 2017 cs.CV arXiv:1702.01334v1Iris is one of the popular biometrics that is widely used for identity authentication. Different features have been used to perform iris recognition in the past. Most of them are based on hand-crafted features designed by biometrics experts. Due to tremendous success of deep learning in computer vision problems, there has been a lot of interest in applying features learned by convolutional neural networks on general image recognition to other tasks such as segmentation, face recognition, and object detection. In this paper, we have investigated the application of deep features extracted from VGG-Net for iris recognition. The proposed scheme has been tested on two well-known iris databases, and has shown promising results with the best accuracy rate of 99.4\%, which outperforms the previous best result.
- Feb 03 2017 cs.NI arXiv:1702.00536v1We present the multi-hop extensions of the recently proposed energy-efficient time synchronization scheme for wireless sensor networks, which is based on the asynchronous source clock frequency recovery and reversed two-way message exchanges. We consider two hierarchical extensions based on packet relaying and time-translating gateways, respectively, and analyze their performance with respect to the number of layers and the delay variations through simulations. The simulation results demonstrate that the time synchronization performance of the packet relaying, which has lower complexity, is close to that of time-translating gateways.
- Feb 02 2017 cs.SI arXiv:1702.00048v1Motivated by the two paradoxical facts that the marginal cost of following one extra candidate is close to zero and that the majority of Twitter users choose to follow only one or two candidates, we study the Twitter follow behaviors observed in the 2016 U.S. presidential election. Specifically, we complete the following tasks: (1) analyze Twitter follow patterns of the presidential election on Twitter, (2) use negative binomial regression to study the effects of gender and occupation on the number of candidates that one follows, and (3) use multinomial logistic regression to investigate the effects of gender, occupation and celebrities on the choice of candidates to follow.
- In this paper we introduce a fully end-to-end approach for visual tracking in videos that learns to predict the bounding box locations of a target object at every frame. An important insight is that the tracking problem can be considered as a sequential decision-making process and historical semantics encode highly relevant information for future decisions. Based on this intuition, we formulate our model as a recurrent convolutional neural network agent that interacts with a video overtime, and our model can be trained with reinforcement learning (RL) algorithms to learn good tracking policies that pay attention to continuous, inter-frame correlation and maximize tracking performance in the long run. The proposed tracking algorithm achieves state-of-the-art performance in an existing tracking benchmark and operates at frame-rates faster than real-time. To the best of our knowledge, our tracker is the first neural-network tracker that combines convolutional and recurrent networks with RL algorithms.
- Temporal point processes are powerful tools to model event occurrences and have a plethora of applications in social sciences. In this paper, we consider the problem of how to design the optimal control policy for point processes, such that the stochastic system driven by the process is steered to a target state. In particular, we exploit the key insight to view the stochastic optimal control problem from the perspective of optimal measure and variational inference. We further propose a convex optimization framework and an efficient algorithm to update the policy adaptively to the current system state. Experiments on synthetic and real-world data show that our algorithm can steer the user activities much more accurately and efficiently than other stochastic control methods.
- This paper presents privileged multi-label learning (PrML) to explore and exploit the relationship between labels in multi-label learning problems. We suggest that for each individual label, it cannot only be implicitly connected with other labels via the low-rank constraint over label predictors, but also its performance on examples can receive the explicit comments from other labels together acting as an \emphOracle teacher. We generate privileged label feature for each example and its individual label, and then integrate it into the framework of low-rank based multi-label learning. The proposed algorithm can therefore comprehensively explore and exploit label relationships by inheriting all the merits of privileged information and low-rank constraints. We show that PrML can be efficiently solved by dual coordinate descent algorithm using iterative optimization strategy with cheap updates. Experiments on benchmark datasets show that through privileged label features, the performance can be significantly improved and PrML is superior to several competing methods in most cases.
- What people buy is an important aspect or view of lifestyles. Studying people's shopping patterns in different urban regions can not only provide valuable information for various commercial opportunities, but also enable a better understanding about urban infrastructure and urban lifestyle. In this paper, we aim to predict city-wide shopping patterns. This is a challenging task due to the sparsity of the available data -- over 60% of the city regions are unknown for their shopping records. To address this problem, we incorporate another important view of human lifestyles, namely mobility patterns. With information on "where people go", we infer "what people buy". Moreover, to model the relations between regions, we exploit spatial interactions in our method. To that end, Collective Matrix Factorization (CMF) with an interaction regularization model is applied to fuse the data from multiple views or sources. Our experimental results have shown that our model outperforms the baseline methods on two standard metrics. Our prediction results on multiple shopping patterns reveal the divergent demands in different urban regions, and thus reflect key functional characteristics of a city. Furthermore, we are able to extract the connection between the two views of lifestyles, and achieve a better or novel understanding of urban lifestyles.
- Jan 24 2017 cs.SI arXiv:1701.06250v2The 2016 U.S. presidential election has witnessed the major role of Twitter in the year's most important political event. Candidates used this social media platform extensively for online campaigns. Meanwhile, social media has been filled with rumors, which might have had huge impacts on voters' decisions. In this paper, we present a thorough analysis of rumor tweets from the followers of two presidential candidates: Hillary Clinton and Donald Trump. To overcome the difficulty of labeling a large amount of tweets as training data, we detect rumor tweets by matching them with verified rumor articles. We analyze over 8 million tweets collected from the followers of the two candidates. Our results provide answers to several primary concerns about rumors in this election, including: which side of the followers posted the most rumors, who posted these rumors, what rumors they posted, and when they posted these rumors. The insights of this paper can help us understand the online rumor behaviors in American politics.
- In rare disease physician targeting, a major challenge is how to identify physicians who are treating diagnosed or underdiagnosed rare diseases patients. Rare diseases have extremely low incidence rate. For a specified rare disease, only a small number of patients are affected and a fractional of physicians are involved. The existing targeting methodologies, such as segmentation and profiling, are developed under mass market assumption. They are not suitable for rare disease market where the target classes are extremely imbalanced. The authors propose a graphical model approach to predict targets by jointly modeling physician and patient features from different data spaces and utilizing the extra relational information. Through an empirical example with medical claim and prescription data, the proposed approach demonstrates better accuracy in finding target physicians. The graph representation also provides visual interpretability of relationship among physicians and patients. The model can be extended to incorporate more complex dependency structures. This article contributes to the literature of exploring the benefit of utilizing relational dependencies among entities in healthcare industry.
- A major challenge faced in the design of large-scale cyber-physical systems, such as power systems, the Internet of Things or intelligent transportation systems, is that traditional distributed optimal control methods do not scale gracefully, neither in controller synthesis nor in controller implementation, to systems composed of millions, billions or even trillions of interacting subsystems. This paper shows that this challenge can now be addressed by leveraging the recently introduced System Level Approach (SLA) to controller synthesis. In particular, in the context of the SLA, we define suitable notions of separability for control objective functions and system constraints such that the global optimization problem (or iterate update problems of a distributed optimization algorithm) can be decomposed into parallel subproblems. We then further show that if additional locality (i.e., sparsity) constraints are imposed, then these subproblems can be solved using local models and local decision variables. The SLA is essential to maintaining the convexity of the aforementioned problems under locality constraints. As a consequence, the resulting synthesis methods have $O(1)$ complexity relative to the size of the global system. We further show that many optimal control problems of interest, such as (localized) LQR and LQG, $\mathcal{H}_2$ optimal control with joint actuator and sensor regularization, and (localized) mixed $\mathcal{H}_2/\mathcal{L}_1$ optimal control problems, satisfy these notions of separability, and use these problems to explore tradeoffs in performance, actuator and sensing density, and average versus worst-case performance for a large-scale power inspired system.
- Jan 19 2017 cs.CV arXiv:1701.05003v1Given a query photo issued by a user (q-user), the landmark retrieval is to return a set of photos with their landmarks similar to those of the query, while the existing studies on the landmark retrieval focus on exploiting geometries of landmarks for similarity matches between candidate photos and a query photo. We observe that the same landmarks provided by different users over social media community may convey different geometry information depending on the viewpoints and/or angles, and may subsequently yield very different results. In fact, dealing with the landmarks with \illshapes caused by the photography of q-users is often nontrivial and has seldom been studied. In this paper we propose a novel framework, namely multi-query expansions, to retrieve semantically robust landmarks by two steps. Firstly, we identify the top-$k$ photos regarding the latent topics of a query landmark to construct multi-query set so as to remedy its possible \illshape. For this purpose, we significantly extend the techniques of Latent Dirichlet Allocation. Then, motivated by the typical \emphcollaborative filtering methods, we propose to learn a \emphcollaborative deep networks based semantically, nonlinear and high-level features over the latent factor for landmark photo as the training set, which is formed by matrix factorization over \emphcollaborative user-photo matrix regarding the multi-query set. The learned deep network is further applied to generate the features for all the other photos, meanwhile resulting into a compact multi-query set within such space. Extensive experiments are conducted on real-world social media data with both landmark photos together with their user information to show the superior performance over the existing methods.