results for au:Wu_L in:cs

- Mar 22 2018 cs.CR arXiv:1803.07751v1With the rapid development of health equipments, increasingly more patients have installed the implantable medical devices (IMD) in their bodies for diagnostic, monitoring, and therapeutic purposes. IMDs are extremely limited in computation power and battery capacity. Meanwhile, IMDs have to communicate with an external programmer device (i.e., IMD programmer) through the wireless channel, which put them under the risk of unauthorized access and malicious wireless attacks. In this paper, we propose a proxy-based fine-grained access control scheme for IMDs, which can prolong the IMD's lifetime by delegating the access control computations to the proxy device (e.g., smartphone). In our scheme, the proxy communicates with the IMD programmer through an audio cable, which is resistant to a number of wireless attacks. Additionally, we use the ciphertext-policy attribute-based encryption (CP-ABE) to enforce fine-grained access control. The proposed scheme is implemented on real emulator devices and evaluated through experimental tests. The experiments show that the proposed scheme is lightweight and effective.
- Mar 22 2018 cs.CR arXiv:1803.07760v1The Internet of Things (IoT) has become increasingly popular in people's daily lives. The pervasive IoT devices are encouraged to share data with each other in order to better serve the users. However, users are reluctant to share sensitive data due to privacy concerns. In this paper, we study the anonymous data aggregation for the IoT system, in which the IoT company servers, though not fully trustworthy, are used to assist the aggregation. We propose an efficient and accountable aggregation scheme that can preserve the data anonymity. We analyze the communication and computation overheads of the proposed scheme, and evaluate the total execution time and the per-user communication overhead with extensive simulations. The results show that our scheme is more efficient than the previous peer-shuffle protocol, especially for data aggregation from multiple providers.
- Mar 16 2018 cs.CL arXiv:1803.05567v1Machine translation has made rapid advances in recent years. Millions of people are using it today in online translation systems and mobile applications in order to communicate across language barriers. The question naturally arises whether such systems can approach or achieve parity with human translations. In this paper, we first address the problem of how to define and accurately measure human parity in translation. We then describe Microsoft's machine translation system and measure the quality of its translations on the widely used WMT 2017 news translation task from Chinese to English. We find that our latest neural machine translation system has reached a new state-of-the-art, and that the translation quality is at human parity when compared to professional human translations. We also find that it significantly exceeds the quality of crowd-sourced non-professional translations.
- In this paper, we propose a listwise approach for constructing user-specific rankings in recommendation systems in a collaborative fashion. We contrast the listwise approach to previous pointwise and pairwise approaches, which are based on treating either each rating or each pairwise comparison as an independent instance respectively. By extending the work of (Cao et al. 2007), we cast listwise collaborative ranking as maximum likelihood under a permutation model which applies probability mass to permutations based on a low rank latent score matrix. We present a novel algorithm called SQL-Rank, which can accommodate ties and missing data and can run in linear time. We develop a theoretical framework for analyzing listwise ranking methods based on a novel representation theory for the permutation model. Applying this framework to collaborative ranking, we derive asymptotic statistical rates as the number of users and items grow together. We conclude by demonstrating that our SQL-Rank method often outperforms current state-of-the-art algorithms for implicit feedback such as Weighted-MF and BPR and achieve favorable results when compared to explicit feedback algorithms such as matrix factorization and collaborative ranking.
- Understanding the generalization of deep learning has raised lots of concerns recently, where the learning algorithms play an important role in generalization performance, such as stochastic gradient descent (SGD). Along this line, we particularly study the anisotropic noise introduced by SGD, and investigate its importance for the generalization in deep neural networks. Through a thorough empirical analysis, it is shown that the anisotropic diffusion of SGD tends to follow the curvature information of the loss landscape, and thus is beneficial for escaping from sharp and poor minima effectively, towards more stable and flat minima. We verify our understanding through comparing this anisotropic diffusion with full gradient descent plus isotropic diffusion (i.e. Langevin dynamics) and other types of position-dependent noise.
- State-of-the-art deep neural networks are known to be vulnerable to adversarial examples, formed by applying small but malicious perturbations to the original inputs. Moreover, the perturbations can \textittransfer across models: adversarial examples generated for a specific model will often mislead other unseen models. Consequently the adversary can leverage it to attack deployed systems without any query, which severely hinder the application of deep learning, especially in the areas where security is crucial. In this work, we systematically study how two classes of factors that might influence the transferability of adversarial examples. One is about model-specific factors, including network architecture, model capacity and test accuracy. The other is the local smoothness of loss function for constructing adversarial examples. Based on these understanding, a simple but effective strategy is proposed to enhance transferability. We call it variance-reduced attack, since it utilizes the variance-reduced gradient to generate adversarial example. The effectiveness is confirmed by a variety of experiments on both CIFAR-10 and ImageNet datasets.
- For many machine learning problem settings, particularly with structured inputs such as sequences or sets of objects, a distance measure between inputs can be specified more naturally than a feature representation. However, most standard machine models are designed for inputs with a vector feature representation. In this work, we consider the estimation of a function $f:\mathcal{X} \rightarrow \R$ based solely on a dissimilarity measure $d:\mathcal{X}\times\mathcal{X} \rightarrow \R$ between inputs. In particular, we propose a general framework to derive a family of \emphpositive definite kernels from a given dissimilarity measure, which subsumes the widely-used \emphrepresentative-set method as a special case, and relates to the well-known \emphdistance substitution kernel in a limiting case. We show that functions in the corresponding Reproducing Kernel Hilbert Space (RKHS) are Lipschitz-continuous w.r.t. the given distance metric. We provide a tractable algorithm to estimate a function from this RKHS, and show that it enjoys better generalizability than Nearest-Neighbor estimates. Our approach draws from the literature of Random Features, but instead of deriving feature maps from an existing kernel, we construct novel kernels from a random feature map, that we specify given the distance measure. We conduct classification experiments with such disparate domains as strings, time series, and sets of vectors, where our proposed framework compares favorably to existing distance-based learning methods such as $k$-nearest-neighbors, distance-substitution kernels, pseudo-Euclidean embedding, and the representative-set method.
- Feb 12 2018 cs.CV arXiv:1802.03269v1This paper addresses the problem of unsupervised domain adaptation on the task of pedestrian detection in crowded scenes. First, we utilize an iterative algorithm to iteratively select and auto-annotate positive pedestrian samples with high confidence as the training samples for the target domain. Meanwhile, we also reuse negative samples from the source domain to compensate for the imbalance between the amount of positive samples and negative samples. Second, based on the deep network we also design an unsupervised regularizer to mitigate influence from data noise. More specifically, we transform the last fully connected layer into two sub-layers - an element-wise multiply layer and a sum layer, and add the unsupervised regularizer to further improve the domain adaptation accuracy. In experiments for pedestrian detection, the proposed method boosts the recall value by nearly 30% while the precision stays almost the same. Furthermore, we perform our method on standard domain adaptation benchmarks on both supervised and unsupervised settings and also achieve state-of-the-art results.
- The ongoing rapid development of the e-commercial and interest-base websites make it more pressing to evaluate objects' accurate quality before recommendation by employing an effective reputation system. The objects' quality are often calculated based on their historical information, such as selected records or rating scores, to help visitors to make decisions before watching, reading or buying. Usually high quality products obtain a higher average ratings than low quality products regardless of rating biases or errors. However many empirical cases demonstrate that consumers may be misled by rating scores added by unreliable users or deliberate tampering. In this case, users' reputation, i.e., the ability to rating trustily and precisely, make a big difference during the evaluating process. Thus, one of the main challenges in designing reputation systems is eliminating the effects of users' rating bias on the evaluation results. To give an objective evaluation of each user's reputation and uncover an object's intrinsic quality, we propose an iterative balance (IB) method to correct users' rating biases. Experiments on two online video-provided Web sites, namely MovieLens and Netflix datasets, show that the IB method is a highly self-consistent and robust algorithm and it can accurately quantify movies' actual quality and users' stability of rating. Compared with existing methods, the IB method has higher ability to find the "dark horses", i.e., not so popular yet good movies, in the Academy Awards.
- Jan 08 2018 cs.CV arXiv:1801.01760v1Person re-identification (\textitre-id) refers to matching pedestrians across disjoint yet non-overlapping camera views. The most effective way to match these pedestrians undertaking significant visual variations is to seek reliably invariant features that can describe the person of interest faithfully. Most of existing methods are presented in a supervised manner to produce discriminative features by relying on labeled paired images in correspondence. However, annotating pair-wise images is prohibitively expensive in labors, and thus not practical in large-scale networked cameras. Moreover, seeking comparable representations across camera views demands a flexible model to address the complex distributions of images. In this work, we study the co-occurrence statistic patterns between pairs of images, and propose to crossing Generative Adversarial Network (Cross-GAN) for learning a joint distribution for cross-image representations in a unsupervised manner. Given a pair of person images, the proposed model consists of the variational auto-encoder to encode the pair into respective latent variables, a proposed cross-view alignment to reduce the view disparity, and an adversarial layer to seek the joint distribution of latent representations. The learned latent representations are well-aligned to reflect the co-occurrence patterns of paired images. We empirically evaluate the proposed model against challenging datasets, and our results show the importance of joint invariant features in improving matching rates of person re-id with comparison to semi/unsupervised state-of-the-arts.
- Dec 27 2017 cs.CL arXiv:1712.08841v2Characters have commonly been regarded as the minimal processing unit in Natural Language Processing (NLP). But many non-latin languages have hieroglyphic writing systems, involving a big alphabet with thousands or millions of characters. Each character is composed of even smaller parts, which are often ignored by the previous work. In this paper, we propose a novel architecture employing two stacked Long Short-Term Memory Networks (LSTMs) to learn sub-character level representation and capture deeper level of semantic meanings. To build a concrete study and substantiate the efficiency of our neural architecture, we take Chinese Word Segmentation as a research case example. Among those languages, Chinese is a typical case, for which every character contains several components called radicals. Our networks employ a shared radical level embedding to solve both Simplified and Traditional Chinese Word Segmentation, without extra Traditional to Simplified Chinese conversion, in such a highly end-to-end way the word segmentation can be significantly simplified compared to the previous work. Radical level embeddings can also capture deeper semantic meaning below character level and improve the system performance of learning. By tying radical and character embeddings together, the parameter count is reduced whereas semantic knowledge is shared and transferred between two levels, boosting the performance largely. On 3 out of 4 Bakeoff 2005 datasets, our method surpassed state-of-the-art results by up to 0.4%. Our results are reproducible, source codes and corpora are available on GitHub.
- Dec 11 2017 cs.CL arXiv:1712.02856v2We present a simple yet elegant solution to train a single joint model on multi-criteria corpora for Chinese Word Segmentation (CWS). Our novel design requires no private layers in model architecture, instead, introduces two artificial tokens at the beginning and ending of input sentence to specify the required target criteria. The rest of the model including Long Short-Term Memory (LSTM) layer and Conditional Random Fields (CRFs) layer remains unchanged and is shared across all datasets, keeping the size of parameter collection minimal and constant. On Bakeoff 2005 and Bakeoff 2008 datasets, our innovative design has surpassed both single-criterion and multi-criteria state-of-the-art learning results. To the best knowledge, our design is the first one that has achieved the latest high performance on such large scale datasets. Source codes and corpora of this paper are available on GitHub.
- Dec 08 2017 cs.CY arXiv:1712.02547v1Prolonged sedentary behavior is prevalent among office workers and has been found to be detrimental to health. Preventing and reducing prolonged sedentary behavior require interventions, and persuasive technology is expected to make a contribution in this domain. In this paper, we use the framework of persuasive system design (PSD) principles to investigate the utilization and effectiveness of persuasive technology in intervention studies at reducing sedentary behavior at work. This systematic review reveals that reminders are the most frequently used PSD principle. The analysis on reminders shows that hourly PC reminders alone have no significant effect on reducing sedentary behavior at work, while coupling with education or other informative session seems to be promising. Details of deployed persuasive technology with behavioral theories and user experience evaluation are expected to be reported explicitly in the future intervention studies.
- Dec 08 2017 cs.IR arXiv:1712.02342v5Both reviews and user-item interactions (i.e., rating scores) have been widely adopted for user rating prediction. However, these existing techniques mainly extract the latent representations for users and items in an independent and static manner. That is, a single static feature vector is derived to encode her preference without considering the particular characteristics of each candidate item. We argue that this static encoding scheme is difficult to fully capture the users' preference. In this paper, we propose a novel context-aware user-item representation learning model for rating prediction, named CARL. Namely, CARL derives a joint representation for a given user-item pair based on their individual latent features and latent feature interactions. Then, CARL adopts Factorization Machines to further model higher-order feature interactions on the basis of the user-item pair for rating prediction. Specifically, two separate learning components are devised in CARL to exploit review data and interaction data respectively: review-based feature learning and interaction-based feature learning. In review-based learning component, with convolution operations and attention mechanism, the relevant features for a user-item pair are extracted by jointly considering their corresponding reviews. However, these features are only review-driven and may not be comprehensive. Hence, interaction-based learning component further extracts complementary features from interaction data alone, also on the basis of user-item pairs. The final rating score is then derived with a dynamic linear fusion mechanism. Experiments on five real-world datasets show that CARL achieves significantly better rating prediction accuracy than existing state-of-the-art alternatives. Also, with attention mechanism, we show that the relevant information in reviews can be highlighted to interpret the rating prediction.
- Recommender systems, inferring users' preferences from their historical activities and personal profiles, have been an enormous success in the last several years. Most of the existing works are based on the similarities of users, objects or both that derived from their purchases records in the online shopping platforms. Such approaches, however, are facing bottlenecks when the known information is limited. The extreme case is how to recommend products to new users, namely the so-called cold-start problem. The rise of the online social networks gives us a chance to break the glass ceiling. Birds of a feather flock together. Close friends may have similar hidden pattern of selecting products and the advices from friends are more trustworthy. In this paper, we integrate the individual's social relationships into recommender systems and propose a new method, called Social Mass Diffusion (SMD), based on a mass diffusion process in the combined network of users' social network and user-item bipartite network. The results show that the SMD algorithm can achieve higher recommendation accuracy than the Mass Diffusion (MD) purely on the bipartite network. Especially, the improvement is striking for small degree users. Moreover, SMD provides a good solution to the cold-start problem. The recommendation accuracy for new users significantly higher than that of the conventional popularity-based algorithm. These results may shed some light on the new designs of better personalized recommender systems and information services.
- The Lanczos method is one of the standard approaches for computing a few eigenpairs of a large, sparse, symmetric matrix. It is typically used with restarting to avoid unbounded growth of memory and computational requirements. Thick-restart Lanczos is a popular restarted variant because of its simplicity and numerically robustness. However, convergence can be slow for highly clustered eigenvalues so more effective restarting techniques and the use of preconditioning is needed. In this paper, we present a thick-restart preconditioned Lanczos method, TRPL+K, that combines the power of locally optimal restarting (+K) and preconditioning techniques with the efficiency of the thick-restart Lanczos method. TRPL+K employs an inner-outer scheme where the inner loop applies Lanczos on a preconditioned operator while the outer loop augments the resulting Lanczos subspace with certain vectors from the previous restart cycle to obtain eigenvector approximations with which it thick restarts the outer subspace. We first identify the differences from various relevant methods in the literature. Then, based on an optimization perspective, we show an asymptotic global quasi-optimality of a simplified TRPL+K method compared to an unrestarted global optimal method. Finally, we present extensive experiments showing that TRPL+K either outperforms or matches other state-of-the-art eigenmethods in both matrix-vector multiplications and computational time.
- Predicting fine-grained interests of users with temporal behavior is important to personalization and information filtering applications. However, existing interest prediction methods are incapable of capturing the subtle degreed user interests towards particular items, and the internal time-varying drifting attention of individuals is not studied yet. Moreover, the prediction process can also be affected by inter-personal influence, known as behavioral mutual infectivity. Inspired by point process in modeling temporal point process, in this paper we present a deep prediction method based on two recurrent neural networks (RNNs) to jointly model each user's continuous browsing history and asynchronous event sequences in the context of inter-user behavioral mutual infectivity. Our model is able to predict the fine-grained interest from a user regarding a particular item and corresponding timestamps when an occurrence of event takes place. The proposed approach is more flexible to capture the dynamic characteristic of event sequences by using the temporal point process to model event data and timely update its intensity function by RNNs. Furthermore, to improve the interpretability of the model, the attention mechanism is introduced to emphasize both intra-personal and inter-personal behavior influence over time. Experiments on real datasets demonstrate that our model outperforms the state-of-the-art methods in fine-grained user interest prediction.
- Sep 19 2017 cs.CV arXiv:1709.05769v1Fine-grained visual recognition typically depends on modeling subtle difference from object parts. However, these parts often exhibit dramatic visual variations such as occlusions, viewpoints, and spatial transformations, making it hard to detect. In this paper, we present a novel attention-based model to automatically, selectively and accurately focus on critical object regions with higher importance against appearance variations. Given an image, two different Convolutional Neural Networks (CNNs) are constructed, where the outputs of two CNNs are correlated through bilinear pooling to simultaneously focus on discriminative regions and extract relevant features. To capture spatial distributions among the local regions with visual attention, soft attention based spatial Long-Short Term Memory units (LSTMs) are incorporated to realize spatially recurrent yet visually selective over local input patterns. All the above intuitions equip our network with the following novel model: two-stream CNN layers, bilinear pooling layer, spatial recurrent layer with location attention are jointly trained via an end-to-end fashion to serve as the part detector and feature extractor, whereby relevant features are localized and extracted attentively. We show the significance of our network against two well-known visual recognition tasks: fine-grained image classification and person re-identification.
- The methodology of community detection can be divided into two principles: imposing a network model on a given graph, or optimizing a designed objective function. The former provides guarantees on theoretical detectability but falls short when the graph is inconsistent with the underlying model. The latter is model-free but fails to provide quality assurance for the detected communities. In this paper, we propose a novel unified framework to combine the advantages of these two principles. The presented method, SGC-GEN, not only considers the detection error caused by the corresponding model mismatch to a given graph, but also yields a theoretical guarantee on community detectability by analyzing Spectral Graph Clustering (SGC) under GENerative community models (GCMs). SGC-GEN incorporates the predictability on correct community detection with a measure of community fitness to GCMs. It resembles the formulation of supervised learning problems by enabling various community detection loss functions and model mismatch metrics. We further establish a theoretical condition for correct community detection using the normalized graph Laplacian matrix under a GCM, which provides a novel data-driven loss function for SGC-GEN. In addition, we present an effective algorithm to implement SGC-GEN, and show that the computational complexity of SGC-GEN is comparable to the baseline methods. Our experiments on 18 real-world datasets demonstrate that SGC-GEN possesses superior and robust performance compared to 6 baseline methods under 7 representative clustering metrics.
- Sep 13 2017 cs.CL arXiv:1709.03856v5We present StarSpace, a general-purpose neural embedding model that can solve a wide variety of problems: labeling tasks such as text classification, ranking tasks such as information retrieval/web search, collaborative filtering-based or content-based recommendation, embedding of multi-relational graphs, and learning word, sentence or document level embeddings. In each case the model works by embedding those entities comprised of discrete features and comparing them against each other -- learning similarities dependent on the task. Empirical results on a number of tasks show that StarSpace is highly competitive with existing methods, whilst also being generally applicable to new cases where those methods are not.
- Teams dominate the production of high-impact science and technology. Analyzing teamwork from more than 50 million papers, patents, and software products, 1954-2014, we demonstrate across this period that larger teams developed recent, popular ideas, while small teams disrupted the system by drawing on older and less prevalent ideas. Attention to work from large teams came immediately, while advances by small teams succeeded further into the future. Differences between small and large teams magnify with impact - small teams have become known for disruptive work and large teams for developing work. Differences in topic and re- search design account for part of the relationship between team size and disruption, but most of the effect occurs within people, controlling for detailed subject and article type. Our findings suggest the importance of supporting both small and large teams for the sustainable vitality of science and technology.
- Sep 06 2017 cs.CV arXiv:1709.01212v3Multi-view data clustering attracts more attention than their single view counterparts due to the fact that leveraging multiple independent and complementary information from multi-view feature spaces outperforms the single one. Multi-view Spectral Clustering aims at yielding the data partition agreement over their local manifold structures by seeking eigenvalue-eigenvector decompositions. However, as we observed, such classical paradigm still suffers from (1) overlooking the flexible local manifold structure, caused by (2) enforcing the low-rank data correlation agreement among all views; worse still, (3) LRR is not intuitively flexible to capture the latent data clustering structures. In this paper, we present the structured LRR by factorizing into the latent low-dimensional data-cluster representations, which characterize the data clustering structure for each view. Upon such representation, (b) the laplacian regularizer is imposed to be capable of preserving the flexible local manifold structure for each view. (c) We present an iterative multi-view agreement strategy by minimizing the divergence objective among all factorized latent data-cluster representations during each iteration of optimization process, where such latent representation from each view serves to regulate those from other views, such intuitive process iteratively coordinates all views to be agreeable. (d) We remark that such data-cluster representation can flexibly encode the data clustering structure from any view with adaptive input cluster number. To this end, (e) a novel non-convex objective function is proposed via the efficient alternating minimization strategy. The complexity analysis are also presented. The extensive experiments conducted against the real-world multi-view datasets demonstrate the superiority over state-of-the-arts.
- Aug 24 2017 cs.CV arXiv:1708.07089v2Aesthetic quality prediction is a challenging task in the computer vision community because of the complex interplay with semantic contents and photographic technologies. Recent studies on the powerful deep learning based aesthetic quality assessment usually use a binary high-low label or a numerical score to represent the aesthetic quality. However the scalar representation cannot describe well the underlying varieties of the human perception of aesthetics. In this work, we propose to predict the aesthetic score distribution (i.e., a score distribution vector of the ordinal basic human ratings) using Deep Convolutional Neural Network (DCNN). Conventional DCNNs which aim to minimize the difference between the predicted scalar numbers or vectors and the ground truth cannot be directly used for the ordinal basic rating distribution. Thus, a novel CNN based on the Cumulative distribution with Jensen-Shannon divergence (CJS-CNN) is presented to predict the aesthetic score distribution of human ratings, with a new reliability-sensitive learning method based on the kurtosis of the score distribution, which eliminates the requirement of the original full data of human ratings (without normalization). Experimental results on large scale aesthetic dataset demonstrate the effectiveness of our introduced CJS-CNN in this task.
- Aug 22 2017 cs.CL arXiv:1708.06073v2We describe the 2017 version of Microsoft's conversational speech recognition system, in which we update our 2016 system with recent developments in neural-network-based acoustic and language modeling to further advance the state of the art on the Switchboard speech recognition task. The system adds a CNN-BLSTM acoustic model to the set of model architectures we combined previously, and includes character-based and dialog session aware LSTM language models in rescoring. For system combination we adopt a two-stage approach, whereby subsets of acoustic models are first combined at the senone/frame level, followed by a word-level voting via confusion networks. We also added a confusion network rescoring step after system combination. The resulting system yields a 5.1\% word error rate on the 2000 Switchboard evaluation set.
- A signed tree-coloring of a signed graph $(G,\sigma)$ is a vertex coloring $c$ so that $G^{c}(i,\pm)$ is a forest for every $i\in c(u)$ and $u\in V(G)$, where $G^{c}(i,\pm)$ is the subgraph of $(G,\sigma)$ whose vertex set is the set of vertices colored by $i$ or $-i$ and edge set is the set of positive edges with two end-vertices colored both by $i$ or both by $-i$, along with the set of negative edges with one end-vertex colored by $i$ and the other colored by $-i$. If $c$ is a function from $V(G)$ to $M_n$, where $M_n$ is $\{\pm 1,\pm 2,\ldots,\pm k\}$ if $n=2k$, and $\{0,\pm 1,\pm 2,\ldots,\pm k\}$ if $n=2k+1$, then $c$ a signed tree-$n$-coloring of $(G,\sigma)$. The minimum integer $n$ such that $(G,\sigma)$ admits a signed tree-$n$-coloring is the signed vertex arboricity of $(G,\sigma)$, denoted by $va(G,\sigma)$. In this paper, we first show that two switching equivalent signed graphs have the same signed vertex arboricity, and then prove that $va(G,\sigma)\leq 3$ for every balanced signed triangulation and for every edge-maximal $K_5$-minor-free graph with balanced signature. This generalizes the well-known result that the vertex arboricity of every planar graph is at most 3.
- Aug 09 2017 cs.CV arXiv:1708.02288v3Low-Rank Representation (LRR) is arguably one of the most powerful paradigms for Multi-view spectral clustering, which elegantly encodes the multi-view local graph/manifold structures into an intrinsic low-rank self-expressive data similarity embedded in high-dimensional space, to yield a better graph partition than their single-view counterparts. In this paper we revisit it with a fundamentally different perspective by discovering LRR as essentially a latent clustered orthogonal projection based representation winged with an optimized local graph structure for spectral clustering; each column of the representation is fundamentally a cluster basis orthogonal to others to indicate its members, which intuitively projects the view-specific feature representation to be the one spanned by all orthogonal basis to characterize the cluster structures. Upon this finding, we propose our technique with the followings: (1) We decompose LRR into latent clustered orthogonal representation via low-rank matrix factorization, to encode the more flexible cluster structures than LRR over primal data objects; (2) We convert the problem of LRR into that of simultaneously learning orthogonal clustered representation and optimized local graph structure for each view; (3) The learned orthogonal clustered representations and local graph structures enjoy the same magnitude for multi-view, so that the ideal multi-view consensus can be readily achieved. The experiments over multi-view datasets validate its superiority.
- Prior asymptotic performance analyses are based on the series expansion of the moment-generating function (MGF) or the probability density function (PDF) of channel coefficients. However, these techniques fail for lognormal fading channels because the Taylor series of the PDF of a lognormal random variable is zero at the origin and the MGF does not have an explicit form. Although lognormal fading model has been widely applied in wireless communications and free-space optical communications, few analytical tools are available to provide elegant performance expressions for correlated lognormal channels. In this work, we propose a novel framework to analyze the asymptotic outage probabilities of selection combining (SC), equal-gain combining (EGC) and maximum-ratio combining (MRC) over equally correlated lognormal fading channels. Based on these closed-form results, we reveal the followings: i) the outage probability of EGC or MRC becomes an infinitely small quantity compared to that of SC at large signal-to-noise ratio (SNR); ii) channel correlation can result in an infinite performance loss at large SNR. More importantly, the analyses reveal insights into the long-standing problem of performance analyses over correlated lognormal channels at high SNR, and circumvent the time-consuming Monte Carlo simulation and numerical integration.
- Jul 25 2017 cs.CV arXiv:1707.07074v4Matching pedestrians across disjoint camera views, known as person re-identification (re-id), is a challenging problem that is of importance to visual recognition and surveillance. Most existing methods exploit local regions within spatial manipulation to perform matching in local correspondence. However, they essentially extract \emphfixed representations from pre-divided regions for each image and perform matching based on the extracted representation subsequently. For models in this pipeline, local finer patterns that are crucial to distinguish positive pairs from negative ones cannot be captured, and thus making them underperformed. In this paper, we propose a novel deep multiplicative integration gating function, which answers the question of \emphwhat-and-where to match for effective person re-id. To address \emphwhat to match, our deep network emphasizes common local patterns by learning joint representations in a multiplicative way. The network comprises two Convolutional Neural Networks (CNNs) to extract convolutional activations, and generates relevant descriptors for pedestrian matching. This thus, leads to flexible representations for pair-wise images. To address \emphwhere to match, we combat the spatial misalignment by performing spatially recurrent pooling via a four-directional recurrent neural network to impose spatial dependency over all positions with respect to the entire image. The proposed network is designed to be end-to-end trainable to characterize local pairwise feature interactions in a spatially aligned manner. To demonstrate the superiority of our method, extensive experiments are conducted over three benchmark data sets: VIPeR, CUHK03 and Market-1501.
- It is widely observed that deep learning models with learned parameters generalize well, even with much more model parameters than the number of training samples. We systematically investigate the underlying reasons why deep neural networks often generalize well, and reveal the difference between the minima (with the same training error) that generalize well and those they don't. We show that it is the characteristics the landscape of the loss function that explains the good generalization capability. For the landscape of loss function for deep networks, the volume of basin of attraction of good minima dominates over that of poor minima, which guarantees optimization methods with random initialization to converge to good minima. We theoretically justify our findings through analyzing 2-layer neural networks; and show that the low-complexity solutions have a small norm of Hessian matrix with respect to model parameters. For deeper networks, extensive numerical evidence helps to support our arguments.
- Jun 13 2017 cs.CV arXiv:1706.03160v2Person re-identification (re-id) aims to match pedestrians observed by disjoint camera views. It attracts increasing attention in computer vision due to its importance to surveillance system. To combat the major challenge of cross-view visual variations, deep embedding approaches are proposed by learning a compact feature space from images such that the Euclidean distances correspond to their cross-view similarity metric. However, the global Euclidean distance cannot faithfully characterize the ideal similarity in a complex visual feature space because features of pedestrian images exhibit unknown distributions due to large variations in poses, illumination and occlusion. Moreover, intra-personal training samples within a local range are robust to guide deep embedding against uncontrolled variations, which however, cannot be captured by a global Euclidean distance. In this paper, we study the problem of person re-id by proposing a novel sampling to mine suitable \textitpositives (i.e. intra-class) within a local range to improve the deep embedding in the context of large intra-class variations. Our method is capable of learning a deep similarity metric adaptive to local sample structure by minimizing each sample's local distances while propagating through the relationship between samples to attain the whole intra-class minimization. To this end, a novel objective function is proposed to jointly optimize similarity metric learning, local positive mining and robust deep embedding. This yields local discriminations by selecting local-ranged positive samples, and the learned features are robust to dramatic intra-class variations. Experiments on benchmarks show state-of-the-art results achieved by our method.
- In this paper, we study a new learning paradigm for Neural Machine Translation (NMT). Instead of maximizing the likelihood of the human translation as in previous works, we minimize the distinction between human translation and the translation given by an NMT model. To achieve this goal, inspired by the recent success of generative adversarial networks (GANs), we employ an adversarial training architecture and name it as Adversarial-NMT. In Adversarial-NMT, the training of the NMT model is assisted by an adversary, which is an elaborately designed Convolutional Neural Network (CNN). The goal of the adversary is to differentiate the translation result generated by the NMT model from that by human. The goal of the NMT model is to produce high quality translations so as to cheat the adversary. A policy gradient method is leveraged to co-train the NMT model and the adversary. Experimental results on English$\rightarrow$French and German$\rightarrow$English translation tasks show that Adversarial-NMT can achieve significantly better translation quality than several strong baselines.
- The proliferation of social media in communication and information dissemination has made it an ideal platform for spreading rumors. Automatically debunking rumors at their stage of diffusion is known as \textitearly rumor detection, which refers to dealing with sequential posts regarding disputed factual claims with certain variations and highly textual duplication over time. Thus, identifying trending rumors demands an efficient yet flexible model that is able to capture long-range dependencies among postings and produce distinct representations for the accurate early detection. However, it is a challenging task to apply conventional classification algorithms to rumor detection in earliness since they rely on hand-crafted features which require intensive manual efforts in the case of large amount of posts. This paper presents a deep attention model on the basis of recurrent neural networks (RNN) to learn \textitselectively temporal hidden representations of sequential posts for identifying rumors. The proposed model delves soft-attention into the recurrence to simultaneously pool out distinct features with particular focus and produce hidden representations that capture contextual variations of relevant posts over time. Extensive experiments on real datasets collected from social media websites demonstrate that (1) the deep attention based RNN model outperforms state-of-the-arts that rely on hand-crafted features; (2) the introduction of soft attention mechanism can effectively distill relevant parts to rumors from original posts in advance; (3) the proposed method detects rumors more quickly and accurately than competitors.
- Apr 13 2017 cs.AI arXiv:1704.03612v1In this paper, we develop a novel paradigm, namely hypergraph shift, to find robust graph modes by probabilistic voting strategy, which are semantically sound besides the self-cohesiveness requirement in forming graph modes. Unlike the existing techniques to seek graph modes by shifting vertices based on pair-wise edges (i.e, an edge with $2$ ends), our paradigm is based on shifting high-order edges (hyperedges) to deliver graph modes. Specifically, we convert the problem of seeking graph modes as the problem of seeking maximizers of a novel objective function with the aim to generate good graph modes based on sifting edges in hypergraphs. As a result, the generated graph modes based on dense subhypergraphs may more accurately capture the object semantics besides the self-cohesiveness requirement. We also formally prove that our technique is always convergent. Extensive empirical studies on synthetic and real world data sets are conducted on clustering and graph matching. They demonstrate that our techniques significantly outperform the existing techniques.
- Feb 15 2017 cs.CV arXiv:1702.04179v3Given a pedestrian image as a query, the purpose of person re-identification is to identify the correct match from a large collection of gallery images depicting the same person captured by disjoint camera views. The critical challenge is how to construct a robust yet discriminative feature representation to capture the compounded variations in pedestrian appearance. To this end, deep learning methods have been proposed to extract hierarchical features against extreme variability of appearance. However, existing methods in this category generally neglect the efficiency in the matching stage whereas the searching speed of a re-identification system is crucial in real-world applications. In this paper, we present a novel deep hashing framework with Convolutional Neural Networks (CNNs) for fast person re-identification. Technically, we simultaneously learn both CNN features and hash functions/codes to get robust yet discriminative features and similarity-preserving hash codes. Thereby, person re-identification can be resolved by efficiently computing and ranking the Hamming distances between images. A structured loss function defined over positive pairs and hard negatives is proposed to formulate a novel optimization problem so that fast convergence and more stable optimized solution can be obtained. Extensive experiments on two benchmarks CUHK03 \citeFPNN and Market-1501 \citeMarket1501 show that the proposed deep architecture is efficacy over state-of-the-arts.
- A considerable amount of machine learning algorithms take instance-feature matrices as their inputs. As such, they cannot directly analyze time series data due to its temporal nature, usually unequal lengths, and complex properties. This is a great pity since many of these algorithms are effective, robust, efficient, and easy to use. In this paper, we bridge this gap by proposing an efficient representation learning framework that is able to convert a set of time series with equal or unequal lengths to a matrix format. In particular, we guarantee that the pairwise similarities between time series are well preserved after the transformation. The learned feature representation is particularly suitable to the class of learning problems that are sensitive to data similarities. Given a set of $n$ time series, we first construct an $n\times n$ partially observed similarity matrix by randomly sampling $O(n \log n)$ pairs of time series and computing their pairwise similarities. We then propose an extremely efficient algorithm that solves a highly non-convex and NP-hard problem to learn new features based on the partially observed similarity matrix. We use the learned features to conduct experiments on both data classification and clustering tasks. Our extensive experimental results demonstrate that the proposed framework is both effective and efficient.
- Jan 19 2017 cs.CV arXiv:1701.05003v1Given a query photo issued by a user (q-user), the landmark retrieval is to return a set of photos with their landmarks similar to those of the query, while the existing studies on the landmark retrieval focus on exploiting geometries of landmarks for similarity matches between candidate photos and a query photo. We observe that the same landmarks provided by different users over social media community may convey different geometry information depending on the viewpoints and/or angles, and may subsequently yield very different results. In fact, dealing with the landmarks with \illshapes caused by the photography of q-users is often nontrivial and has seldom been studied. In this paper we propose a novel framework, namely multi-query expansions, to retrieve semantically robust landmarks by two steps. Firstly, we identify the top-$k$ photos regarding the latent topics of a query landmark to construct multi-query set so as to remedy its possible \illshape. For this purpose, we significantly extend the techniques of Latent Dirichlet Allocation. Then, motivated by the typical \emphcollaborative filtering methods, we propose to learn a \emphcollaborative deep networks based semantically, nonlinear and high-level features over the latent factor for landmark photo as the training set, which is formed by matrix factorization over \emphcollaborative user-photo matrix regarding the multi-query set. The learned deep network is further applied to generate the features for all the other photos, meanwhile resulting into a compact multi-query set within such space. Extensive experiments are conducted on real-world social media data with both landmark photos together with their user information to show the superior performance over the existing methods.
- Dec 07 2016 cs.CV arXiv:1612.01655v1Boundary incompleteness raises great challenges to automatic prostate segmentation in ultrasound images. Shape prior can provide strong guidance in estimating the missing boundary, but traditional shape models often suffer from hand-crafted descriptors and local information loss in the fitting procedure. In this paper, we attempt to address those issues with a novel framework. The proposed framework can seamlessly integrate feature extraction and shape prior exploring, and estimate the complete boundary with a sequential manner. Our framework is composed of three key modules. Firstly, we serialize the static 2D prostate ultrasound images into dynamic sequences and then predict prostate shapes by sequentially exploring shape priors. Intuitively, we propose to learn the shape prior with the biologically plausible Recurrent Neural Networks (RNNs). This module is corroborated to be effective in dealing with the boundary incompleteness. Secondly, to alleviate the bias caused by different serialization manners, we propose a multi-view fusion strategy to merge shape predictions obtained from different perspectives. Thirdly, we further implant the RNN core into a multiscale Auto-Context scheme to successively refine the details of the shape prediction map. With extensive validation on challenging prostate ultrasound images, our framework bridges severe boundary incompleteness and achieves the best performance in prostate boundary delineation when compared with several advanced methods. Additionally, our approach is general and can be extended to other medical image segmentation tasks, where boundary incompleteness is one of the main challenges.
- Nov 18 2016 cs.LG arXiv:1611.05521v1Learning hash functions/codes for similarity search over multi-view data is attracting increasing attention, where similar hash codes are assigned to the data objects characterizing consistently neighborhood relationship across views. Traditional methods in this category inherently suffer three limitations: 1) they commonly adopt a two-stage scheme where similarity matrix is first constructed, followed by a subsequent hash function learning; 2) these methods are commonly developed on the assumption that data samples with multiple representations are noise-free,which is not practical in real-life applications; 3) they often incur cumbersome training model caused by the neighborhood graph construction using all $N$ points in the database ($O(N)$). In this paper, we motivate the problem of jointly and efficiently training the robust hash functions over data objects with multi-feature representations which may be noise corrupted. To achieve both the robustness and training efficiency, we propose an approach to effectively and efficiently learning low-rank kernelized \footnoteWe use kernelized similarity rather than kernel, as it is not a squared symmetric matrix for data-landmark affinity matrix. hash functions shared across views. Specifically, we utilize landmark graphs to construct tractable similarity matrices in multi-views to automatically discover neighborhood structure in the data. To learn robust hash functions, a latent low-rank kernel function is used to construct hash functions in order to accommodate linearly inseparable data. In particular, a latent kernelized similarity matrix is recovered by rank minimization on multiple kernel-based similarity matrices. Extensive experiments on real-world multi-view datasets validate the efficacy of our method in the presence of error corruptions.
- In this paper we investigate the aesthetic image classification problem, also known as automatically classifying an image into low or high aesthetic quality, which is quite a challenging problem. Considering both the local and global information of images is quite important for image aesthetic quality assessment. Currently, a powerful inception module is proposed which shows very high performance in object classification. We have the observation that the inception module has the ability of considering both the local and global features in nature. Thus, in this paper, we propose a novel DCNN structure codenamed ILGNet for image aesthetics classification, which introduces the Inception module and connects intermediate Local layers to the Global layer for the output. In addition, the ILGNet is derived from part of the GoogLeNet. Thus, we can easily use a pre-trained image classification GoogleLeNet model on the ImageNet dataset and fine tune our connected local and global layer on the large scale aesthetics assessment AVA dataset. The experimental results show that the proposed ILGNet outperforms the state of the art results in image aesthetics assessment in the AVA benchmark. The time cost of both training and test of the ILGNet are significantly less than those of full GoogLeNet with only a little reduction of the classification accuracy. Our ILGNet can achieve similar classification accuracy as that of 2/3 GoogLeNet, whose computational cost is nearly twice of ours. This makes the aesthetic assessment model more easily to be integrated into mobile and embedded systems.
- Robotic challenges like the Amazon Picking Challenge (APC) or the DARPA Challenges are an established and important way to drive scientific progress. They make research comparable on a well-defined benchmark with equal test conditions for all participants. However, such challenge events occur only occasionally, are limited to a small number of contestants, and the test conditions are very difficult to replicate after the main event. We present a new physical benchmark challenge for robotic picking: the ACRV Picking Benchmark (APB). Designed to be reproducible, it consists of a set of 42 common objects, a widely available shelf, and exact guidelines for object arrangement using stencils. A well-defined evaluation protocol enables the comparison of \emphcomplete robotic systems -- including perception and manipulation -- instead of sub-systems only. Our paper also describes and reports results achieved by an open baseline system based on a Baxter robot.
- Multi-view spectral clustering, which aims at yielding an agreement or consensus data objects grouping across multi-views with their graph laplacian matrices, is a fundamental clustering problem. Among the existing methods, Low-Rank Representation (LRR) based method is quite superior in terms of its effectiveness, intuitiveness and robustness to noise corruptions. However, it aggressively tries to learn a common low-dimensional subspace for multi-view data, while inattentively ignoring the local manifold structure in each view, which is critically important to the spectral clustering; worse still, the low-rank minimization is enforced to achieve the data correlation consensus among all views, failing to flexibly preserve the local manifold structure for each view. In this paper, 1) we propose a multi-graph laplacian regularized LRR with each graph laplacian corresponding to one view to characterize its local manifold structure. 2) Instead of directly enforcing the low-rank minimization among all views for correlation consensus, we separately impose low-rank constraint on each view, coupled with a mutual structural consensus constraint, where it is able to not only well preserve the local manifold structure but also serve as a constraint for that from other views, which iteratively makes the views more agreeable. Extensive experiments on real-world multi-view data sets demonstrate its superiority.
- Aug 19 2016 cs.CL arXiv:1608.05129v1Sentiment in social media is increasingly considered as an important resource for customer segmentation, market understanding, and tackling other socio-economic issues. However, sentiment in social media is difficult to measure since user-generated content is usually short and informal. Although many traditional sentiment analysis methods have been proposed, identifying slang sentiment words remains untackled. One of the reasons is that slang sentiment words are not available in existing dictionaries or sentiment lexicons. To this end, we propose to build the first sentiment dictionary of slang words to aid sentiment analysis of social media content. It is laborious and time-consuming to collect and label the sentiment polarity of a comprehensive list of slang words. We present an approach to leverage web resources to construct an extensive Slang Sentiment word Dictionary (SlangSD) that is easy to maintain and extend. SlangSD is publicly available for research purposes. We empirically show the advantages of using SlangSD, the newly-built slang sentiment word dictionary for sentiment classification, and provide examples demonstrating its ease of use with an existing sentiment system.
- The increasing number of applications requiring the solution of large scale singular value problems have rekindled interest in iterative methods for the SVD. Some promising recent ad- vances in large scale iterative methods are still plagued by slow convergence and accuracy limitations for computing smallest singular triplets. Furthermore, their current implementations in MATLAB cannot address the required large problems. Recently, we presented a preconditioned, two-stage method to effectively and accurately compute a small number of extreme singular triplets. In this research, we present a high-performance software, PRIMME SVDS, that implements our hybrid method based on the state-of-the-art eigensolver package PRIMME for both largest and smallest singular values. PRIMME SVDS fills a gap in production level software for computing the partial SVD, especially with preconditioning. The numerical experiments demonstrate its superior performance compared to other state-of-the-art software and its good parallel performance under strong and weak scaling.
- Jun 07 2016 cs.CV arXiv:1606.01595v1Person re-identification is to seek a correct match for a person of interest across views among a large number of imposters. It typically involves two procedures of non-linear feature extractions against dramatic appearance changes, and subsequent discriminative analysis in order to reduce intra- personal variations while enlarging inter-personal differences. In this paper, we introduce a hybrid architecture which combines Fisher vectors and deep neural networks to learn non-linear representations of person images to a space where data can be linearly separable. We reinforce a Linear Discriminant Analysis (LDA) on top of the deep neural network such that linearly separable latent representations can be learnt in an end-to-end fashion. By optimizing an objective function modified from LDA, the network is enforced to produce feature distributions which have a low variance within the same class and high variance between classes. The objective is essentially derived from the general LDA eigenvalue problem and allows to train the network with stochastic gradient descent and back-propagate LDA gradients to compute the gradients involved in Fisher vector encoding. For evaluation we test our approach on four benchmark data sets in person re-identification (VIPeR [1], CUHK03 [2], CUHK01 [3], and Market1501 [4]). Extensive experiments on these benchmarks show that our model can achieve state-of-the-art results.
- Jun 07 2016 cs.CV arXiv:1606.01609v2In this paper, we present an end-to-end approach to simultaneously learn spatio-temporal features and corresponding similarity metric for video-based person re-identification. Given the video sequence of a person, features from each frame that are extracted from all levels of a deep convolutional network can preserve a higher spatial resolution from which we can model finer motion patterns. These low-level visual percepts are leveraged into a variant of recurrent model to characterize the temporal variation between time-steps. Features from all time-steps are then summarized using temporal pooling to produce an overall feature representation for the complete sequence. The deep convolutional network, recurrent layer, and the temporal pooling are jointly trained to extract comparable hidden-unit representations from input pair of time series to compute their corresponding similarity value. The proposed framework combines time series modeling and metric learning to jointly learn relevant features and a good similarity measure between time sequences of person. Experiments demonstrate that our approach achieves the state-of-the-art performance for video-based person re-identification on iLIDS-VID and PRID 2011, the two primary public datasets for this purpose.
- With the widespread use of mobile computing devices in contemporary society, our trajectories in the physical space and virtual world are increasingly closely connected. Using the anonymous smartphone data of $1 \times 10^5$ users in 30 days, we constructed the mobility network and the attention network to study the correlations between online and offline human behaviours. In the mobility network, nodes are physical locations and edges represent the movements between locations, and in the attention network, nodes are websites and edges represent the switch of users between websites. We apply the box-covering method to renormalise the networks. The investigated network properties include the size of box $l_B$ and the number of boxes $N(l_B)$. We find two universal classes of behaviours: the mobility network is featured by a small-world property, $N(l_B) \simeq e^{-l_B}$, whereas the attention network is characterised by a self-similar property $N(l_B) \simeq l_B^{-\gamma}$. In particular, with the increasing of the length of box $l_B$, the degree correlation of the network changes from positive to negative which indicates that there are two layers of structure in the mobility network. We use the results of network renormalisation to detect the community and map the structure of the mobility network. Further, we located the most relevant websites visited in these communities, and identified three typical location-based behaviours, including the shopping, dating, and taxi-calling. Finally, we offered a revised geometric network model to explain our findings in the perspective of spatial-constrained attachment.
- Jan 28 2016 cs.CV arXiv:1601.07255v2In this paper, we propose a deep end-to-end neu- ral network to simultaneously learn high-level features and a corresponding similarity metric for person re-identification. The network takes a pair of raw RGB images as input, and outputs a similarity value indicating whether the two input images depict the same person. A layer of computing neighborhood range differences across two input images is employed to capture local relationship between patches. This operation is to seek a robust feature from input images. By increasing the depth to 10 weight layers and using very small (3$\times$3) convolution filters, our architecture achieves a remarkable improvement on the prior-art configurations. Meanwhile, an adaptive Root- Mean-Square (RMSProp) gradient decent algorithm is integrated into our architecture, which is beneficial to deep nets. Our method consistently outperforms state-of-the-art on two large datasets (CUHK03 and Market-1501), and a medium-sized data set (CUHK01).
- Nov 30 2015 cs.CV arXiv:1511.08531v2Matching individuals across non-overlapping camera networks, known as person re-identification, is a fundamentally challenging problem due to the large visual appearance changes caused by variations of viewpoints, lighting, and occlusion. Approaches in literature can be categoried into two streams: The first stream is to develop reliable features against realistic conditions by combining several visual features in a pre-defined way; the second stream is to learn a metric from training data to ensure strong inter-class differences and intra-class similarities. However, seeking an optimal combination of visual features which is generic yet adaptive to different benchmarks is a unsoved problem, and metric learning models easily get over-fitted due to the scarcity of training data in person re-identification. In this paper, we propose two effective structured learning based approaches which explore the adaptive effects of visual features in recognizing persons in different benchmark data sets. Our framework is built on the basis of multiple low-level visual features with an optimal ensemble of their metrics. We formulate two optimization algorithms, CMCtriplet and CMCstruct, which directly optimize evaluation measures commonly used in person re-identification, also known as the Cumulative Matching Characteristic (CMC) curve.
- Nov 25 2015 physics.soc-ph cs.SI arXiv:1511.07616v1To uncover the mechanisms underlying the collaborative production of knowledge, we investigate a very large online Question and Answer system that includes the question asking and answering activities of millions of users over five years. We created knowledge networks in which nodes are questions and edges are the successive answering activities of users. We find that these networks have two common properties: 1) the mitigation of degree inequality among nodes; and 2) the assortative mixing of nodes. This means that, while the system tends to reduce attention investment on old questions in order to supply sufficient attention to new questions, it is not easy for novel knowledge be integrated into the existing body of knowledge. We propose a mixing model to combine preferential attachment and reversed preferential attachment processes to model the evolution of knowledge networks and successfully reproduce the ob- served patterns. Our mixing model is not only theoretically interesting but also provide insights into the management of online communities.
- Oct 13 2015 cs.CY arXiv:1510.03247v2The gerrymandering problem is a worldwide problem which sets great threat to democracy and justice in district based elections. Thanks to partisan redistricting commissions, district boundaries are often manipulated to benefit incumbents. Since an independent commission is hard to come by, the possibility of impartially generating districts with a computer is explored in this thesis. We have developed an algorithm to randomly produce legal redistricting schemes for Pennsylvania.
- Sep 18 2015 physics.soc-ph cs.SI arXiv:1509.05083v3Online communities are becoming increasingly important as platforms for large-scale human cooperation. These communities allow users seeking and sharing professional skills to solve problems collaboratively. To investigate how users cooperate to complete a large number of knowledge-producing tasks, we analyze StackExchange, one of the largest question and answer systems in the world. We construct attention networks to model the growth of 110 communities in the StackExchange system and quantify individual answering strategies using the linking dynamics of attention networks. We identify two types of users taking different strategies. One strategy (type A) aims at performing maintenance by doing simple tasks, while the other strategy (type B) aims investing time in doing challenging tasks. We find that the number of type A needs to be twice as big as type B users for a sustainable growth of communities.
- A number of applications require the computation of the trace of a matrix that is implicitly available through a function. A common example of a function is the inverse of a large, sparse matrix, which is the focus of this paper. When the evaluation of the function is expensive, the task is computationally challenging because the standard approach is based on a Monte Carlo method which converges slowly. We present a different approach that exploits the pattern correlation, if present, between the diagonal of the inverse of the matrix and the diagonal of some approximate inverse that can be computed inexpensively. We leverage various sampling and fitting techniques to fit the diagonal of the approximation to the diagonal of the inverse. Depending on the quality of the approximate inverse, our method may serve as a standalone kernel for providing a fast trace estimate with a small number of samples. Furthermore, the method can be used as a variance reduction method for Monte Carlo in some cases. This is decided dynamically by our algorithm. An extensive set of experiments with various technique combinations on several matrices from some real applications demonstrate the potential of our method.
- This paper discussed some job scheduling algorithms for Hadoop platform, and proposed a jobs scheduling optimization algorithm based on Bayes Classification viewing the shortcoming of those algorithms which are used. The proposed algorithm can be summarized as follows. In the scheduling algorithm based on Bayes Classification, the jobs in job queue will be classified into bad job and good job by Bayes Classification, when JobTracker gets task request, it will select a good job from job queue, and select tasks from good job to allocate JobTracker, then the execution result will feedback to the JobTracker. Therefore the scheduling algorithm based on Bayes Classification influence the job classification via learning the result of feedback with the JobTracker will select the most appropriate job to execute on TaskTracker every time. We need to consider the feature usage of job resource and the influence of TaskTracker resource on task execution, the former of which we call it job feature, for instance, the average usage rate of CPU and average usage rate of memory, the latter node feature, such as the usage rate of CPU and the size of idle physical memory, the two are called feature variables. Results show that it has a significant improvement in execution efficiency and stability of job scheduling.
- A novel algorithm and implementation of real-time identification and tracking of blob-filaments in fusion reactor data is presented. Similar spatio-temporal features are important in many other applications, for example, ignition kernels in combustion and tumor cells in a medical image. This work presents an approach for extracting these features by dividing the overall task into three steps: local identification of feature cells, grouping feature cells into extended feature, and tracking movement of feature through overlapping in space. Through our extensive work in parallelization, we demonstrate that this approach can effectively make use of a large number of compute nodes to detect and track blob-filaments in real time in fusion plasma. On a set of 30GB fusion simulation data, we observed linear speedup on 1024 processes and completed blob detection in less than three milliseconds using Edison, a Cray XC30 system at NERSC.
- Potts model based on a Markov process computation solves the community structure problem effectivelyMar 30 2015 physics.soc-ph cs.SI arXiv:1503.08035v1Potts model is a powerful tool to uncover community structure in complex networks. Here, we propose a new framework to reveal the optimal number of communities and stability of network structure by quantitatively analyzing the dynamics of Potts model. Specifically we model the community structure detection Potts procedure by a Markov process, which has a clear mathematical explanation. Then we show that the local uniform behavior of spin values across multiple timescales in the representation of the Markov variables could naturally reveal the network's hierarchical community structure. In addition, critical topological information regarding to multivariate spin configuration could also be inferred from the spectral signatures of the Markov process. Finally an algorithm is developed to determine fuzzy communities based on the optimal number of communities and the stability across multiple timescales. The effectiveness and efficiency of our algorithm are theoretically analyzed as well as experimentally validated.
- It is proposed a class of statistical estimators $\hat H =(\hat H_1, \ldots, \hat H_d)$ for the Hurst parameters $H=(H_1, \ldots, H_d)$ of fractional Brownian field via multi-dimensional wavelet analysis and least squares, which are asymptotically normal. These estimators can be used to detect self-similarity and long-range dependence in multi-dimensional signals, which is important in texture classification and improvement of diffusion tensor imaging (DTI) of nuclear magnetic resonance (NMR). Some fractional Brownian sheets will be simulated and the simulated data are used to validate these estimators. We find that when $H_i \geq 1/2$, the estimators are efficient, and when $H_i < 1/2$, there are some bias.
- The problem of software artifact retrieval has the goal to effectively locate software artifacts, such as a piece of source code, in a large code repository. This problem has been traditionally addressed through the textual query. In other words, information retrieval techniques will be exploited based on the textual similarity between queries and textual representation of software artifacts, which is generated by collecting words from comments, identifiers, and descriptions of programs. However, in addition to these semantic information, there are rich information embedded in source codes themselves. These source codes, if analyzed properly, can be a rich source for enhancing the efforts of software artifact retrieval. To this end, in this paper, we develop a feature extraction method on source codes. Specifically, this method can capture both the inherent information in the source codes and the semantic information hidden in the comments, descriptions, and identifiers of the source codes. Moreover, we design a heterogeneous metric learning approach, which allows to integrate code features and text features into the same latent semantic space. This, in turn, can help to measure the artifact similarity by exploiting the joint power of both code and text features. Finally, extensive experiments on real-world data show that the proposed method can help to improve the performances of software artifact retrieval with a significant margin.
- The computation of a few singular triplets of large, sparse matrices is a challenging task, especially when the smallest magnitude singular values are needed in high accuracy. Most recent efforts try to address this problem through variations of the Lanczos bidiagonalization method, but they are still challenged even for medium matrix sizes due to the difficulty of the problem. We propose a novel SVD approach that can take advantage of preconditioning and of any well designed eigensolver to compute both largest and smallest singular triplets. Accuracy and efficiency is achieved through a hybrid, two-stage meta-method, PHSVDS. In the first stage, PHSVDS solves the normal equations up to the best achievable accuracy. If further accuracy is required, the method switches automatically to an eigenvalue problem with the augmented matrix. Thus it combines the advantages of the two stages, faster convergence and accuracy, respectively. For the augmented matrix, solving the interior eigenvalue is facilitated by a proper use of the good initial guesses from the first stage and an efficient implementation of the refined projection method. We also discuss how to precondition PHSVDS and to cope with some issues that arise. Numerical experiments illustrate the efficiency and robustness of the method.
- Jul 17 2014 physics.soc-ph cs.SI arXiv:1407.4194v1Travel activities have been widely applied to quantify spatial interactions between places, regions and nations. In this paper, we model the spatial connectivities between 652 Traffic Analysis Zones (TAZs) in Beijing by a taxi OD dataset. First, we unveil the gravitational structure of intra-urban spatial connectivities of Beijing. On overall, the inter-TAZ interactions are well governed by the Gravity Model $G_{ij} = {\lambda}p_{i}p_{j}/d_{ij}$, where $p_{i}$, $p_{j}$ are degrees of TAZ $i$, $j$ and $d_{ij}$ the distance between them, with a goodness-of-fit around 0.8. Second, the network based analysis well reveals the polycentric form of Beijing. Last, we detect the semantics of inter-TAZ connectivities based on their spatiotemporal patterns. We further find that inter-TAZ connections deviating from the Gravity Model can be well explained by link semantics.
- Currently, connectomes (e.g., functional or structural brain graphs) can be estimated in humans at $\approx 1~mm^3$ scale using a combination of diffusion weighted magnetic resonance imaging, functional magnetic resonance imaging and structural magnetic resonance imaging scans. This manuscript summarizes a novel, scalable implementation of open-source algorithms to rapidly estimate magnetic resonance connectomes, using both anatomical regions of interest (ROIs) and voxel-size vertices. To assess the reliability of our pipeline, we develop a novel nonparametric non-Euclidean reliability metric. Here we provide an overview of the methods used, demonstrate our implementation, and discuss available user extensions. We conclude with results showing the efficacy and reliability of the pipeline over previous state-of-the-art.
- Recent studies uncovered important core/periphery network structures characterizing complex sets of cooperative and competitive interactions between network nodes, be they proteins, cells, species or humans. Better characterization of the structure, dynamics and function of core/periphery networks is a key step of our understanding cellular functions, species adaptation, social and market changes. Here we summarize the current knowledge of the structure and dynamics of "traditional" core/periphery networks, rich-clubs, nested, bow-tie and onion networks. Comparing core/periphery structures with network modules, we discriminate between global and local cores. The core/periphery network organization lies in the middle of several extreme properties, such as random/condensed structures, clique/star configurations, network symmetry/asymmetry, network assortativity/disassortativity, as well as network hierarchy/anti-hierarchy. These properties of high complexity together with the large degeneracy of core pathways ensuring cooperation and providing multiple options of network flow re-channelling greatly contribute to the high robustness of complex systems. Core processes enable a coordinated response to various stimuli, decrease noise, and evolve slowly. The integrative function of network cores is an important step in the development of a large variety of complex organisms and organizations. In addition to these important features and several decades of research interest, studies on core/periphery networks still have a number of unexplored areas.
- We view web forums as virtual living organisms feeding on user's attention and investigate how these organisms grow at the expense of collective attention. We find that the "body mass" ($PV$) and "energy consumption" ($UV$) of the studied forums exhibits the allometric growth property, i.e., $PV_t \sim UV_t ^ \theta$. This implies that within a forum, the network transporting attention flow between threads has a structure invariant of time, despite of the continuously changing of the nodes (threads) and edges (clickstreams). The observed time-invariant topology allows us to explain the dynamics of networks by the behavior of threads. In particular, we describe the clickstream dissipation on threads using the function $D_i \sim T_i ^ \gamma$, in which $T_i$ is the clickstreams to node $i$ and $D_i$ is the clickstream dissipated from $i$. It turns out that $\gamma$, an indicator for dissipation efficiency, is negatively correlated with $\theta$ and $1/\gamma$ sets the lower boundary for $\theta$. Our findings have practical consequences. For example, $\theta$ can be used as a measure of the "stickiness" of forums, because it quantifies the stable ability of forums to convert $UV$ into $PV$, i.e., to remain users "lock-in" the forum. Meanwhile, the correlation between $\gamma$ and $\theta$ provides a convenient method to evaluate the `stickiness" of forums. Finally, we discuss an optimized "body mass" of forums at around $10^5$ that minimizes $\gamma$ and maximizes $\theta$.
- In this paper, we investigate the possibility to use two tilings of the hyperbolic plane as basic frame for devising a way to input texts in Chinese characters into messages of cellphones, smartphones, ipads and tablets.
- Mar 12 2013 cs.CC cond-mat.dis-nn arXiv:1303.2413v1The random 3-satisfiability (3-SAT) problem is in the unsatisfiable (UNSAT) phase when the clause density $\alpha$ exceeds a critical value $\alpha_s \approx 4.267$. However, rigorously proving the unsatisfiability of a given large 3-SAT instance is extremely difficult. In this paper we apply the mean-field theory of statistical physics to the unsatisfiability problem, and show that a specific type of UNSAT witnesses (Feige-Kim-Ofek witnesses) can in principle be constructed when the clause density $\alpha > 19$. We then construct Feige-Kim-Ofek witnesses for single 3-SAT instances through a simple random sampling algorithm and a focused local search algorithm. The random sampling algorithm works only when $\alpha$ scales at least linearly with the variable number $N$, but the focused local search algorithm works for clause densty $\alpha > c N^{b}$ with $b \approx 0.59$ and prefactor $c \approx 8$. The exponent $b$ can be further decreased by enlarging the single parameter $S$ of the focused local search algorithm.
- In this paper, a novel carrier frequency offset estimation approach, including preamble structure, carrier frequency offset estimation algorithm, is proposed for hexagonal multi-carrier transmission (HMCT) system. The closed-form Cramer-Rao lower bound of the proposed carrier frequency offset estimation scheme is given. Theoretical analyses and simulation results show that the proposed preamble structure and carrier frequency offset estimation algorithm for HMCT system obtains an approximation to the Cramer-Rao lower bound mean square error (MSE) performance over the doubly dispersive (DD) propagation channel.
- The discovery of new and interesting patterns in large datasets, known as data mining, draws more and more interest as the quantities of available data are exploding. Data mining techniques may be applied to different domains and fields such as computer science, health sector, insurances, homeland security, banking and finance, etc. In this paper we are interested by the discovery of a specific category of patterns, known as rare and non-present patterns. We present a novel approach towards the discovery of non-present patterns using rare item-set mining.
- Feb 01 2012 cs.DB arXiv:1201.6564v1Computing the shortest path between two given locations in a road network is an important problem that finds applications in various map services and commercial navigation products. The state-of-the-art solutions for the problem can be divided into two categories: spatial-coherence-based methods and vertex-importance-based approaches. The two categories of techniques, however, have not been compared systematically under the same experimental framework, as they were developed from two independent lines of research that do not refer to each other. This renders it difficult for a practitioner to decide which technique should be adopted for a specific application. Furthermore, the experimental evaluation of the existing techniques, as presented in previous work, falls short in several aspects. Some methods were tested only on small road networks with up to one hundred thousand vertices; some approaches were evaluated using distance queries (instead of shortest path queries), namely, queries that ask only for the length of the shortest path; a state-of-the-art technique was examined based on a faulty implementation that led to incorrect query results. To address the above issues, this paper presents a comprehensive comparison of the most advanced spatial-coherence-based and vertex-importance-based approaches. Using a variety of real road networks with up to twenty million vertices, we evaluated each technique in terms of its preprocessing time, space consumption, and query efficiency (for both shortest path and distance queries). Our experimental results reveal the characteristics of different techniques, based on which we provide guidelines on selecting appropriate methods for various scenarios.
- The core of the Web is a hyperlink navigation system collaboratively set up by webmasters to help users find desired websites. But does this system really work as expected? We show that the answer seems to be negative: there is a substantial mismatch between hyperlinks and the pathways that users actually take. A closer look at empirical surfing activities reveals the reason of the mismatch: webmasters try to build a global virtual world without geographical or cultural boundaries, but users in fact prefer to navigate within more fragmented, language-based groups of websites. We call this type of behavior "preferential navigation" and find that it is driven by "local" search engines.
- Background: The collective browsing behavior of users gives rise to a flow network transporting attention between websites. By analyzing the structure of this network we uncovered a nontrivial scaling regularity concerning the impact of websites. Methodology: We constructed three clickstreams networks, whose nodes were websites and edges were formed by the users switching between sites. We developed an indicator Ci as a measure of the impact of site i and investigated its correlation with the traffic of the site Ai both on the three networks and across the language communities within the networks. Conclusions: We found that the impact of websites increased slower than their traffic. Specifically, there existed a scaling relationship between Ci and Ai with an exponent gamma smaller than 1. We suggested that this scaling relationship characterized the decentralized structure of the clickstream circulation: the World Wide Web is a system that favors small sites in reassigning the collective attention of users.
- Allometric growth is found in many tagging systems online. That is, the number of new tags (T) is a power law function of the active population (P), or T P^gamma (gamma!=1). According to previous studies, it is the heterogeneity in individual tagging behavior that gives rise to allometric growth. These studies consider the power-law distribution model with an exponent beta, regarding 1/beta as an index for heterogeneity. However, they did not discuss whether power-law is the only distribution that leads to allometric growth, or equivalently, whether the positive correlation between heterogeneity and allometric growth holds in systems of distributions other than power-law. In this paper, the authors systematically examine the growth pattern of systems of six different distributions, and find that both power-law distribution and log-normal distribution lead to allometric growth. Furthermore, by introducing Shannon entropy as an indicator for heterogeneity instead of 1/beta, the authors confirm that the positive relationship between heterogeneity and allometric growth exists in both cases of power-law and log-normal distributions.