results for au:Zhou_T in:cs

- Feb 20 2018 physics.soc-ph cs.SI arXiv:1802.06721v2Religious beliefs could facilitate human cooperation [1-6], promote civic engagement [7-10], improve life satisfaction [11-13] and even boom economic development [14-16]. On the other side, some aspects of religion may lead to regional violence, intergroup conflict and moral prejudice against atheists [17-23]. Analogous to the separation of races [24], the religious segregation is a major ingredient resulting in increasing alienation, misunderstanding, cultural conflict and even violence among believers of different faiths [18,19,25]. Thus far, quantitative understanding of religious segregation is rare. Here we analyze a directed social network extracted from weibo.com (the largest directed social network in China, similar to twitter.com), which is consisted of 6875 believers in Christianism, Buddhism, Islam and Taoism. This religion network is highly segregative, with only 1.6% of links connecting individuals in different religions. Comparative analysis shows that the extent of segregation for different religions is much higher than that for different races and slightly higher than that for different political parties. The few cross-religion links play a critical role in maintaining network connectivity, being remarkably more important than links with highest betweennesses [26] or bridgenesses [27]. Further content analysis shows that 46.7% of these cross-religion links are probably related to charitable issues. Our findings provide quantitative insights into religious segregation and valuable clues to encourage cross-religion communications.
- Feb 15 2018 physics.soc-ph cs.GT arXiv:1802.04966v2With remarkable significance in migration prediction, global disease mitigation, urban plan and many others, an arresting challenge is to predict human mobility fluxes between any two locations. A number of methods have been proposed against the above challenge, including the intervening opportunity model, the gravity model, the radiation model, the population-weighted opportunity model, and so on. Despite their theoretical elegance, all models ignored an intuitive and important ingredient in individual decision about where to go, that is, the possible congestion on the way and the possible crowding in the destination. Here we propose a microscopic mechanism underlying mobility decisions, named destination choice game (DCG), which takes into account the crowding effects resulted from spatial interactions among individuals. In comparison with the state-of-the-art models, the present one shows more accurate prediction on mobility fluxes across wide scales from intracity trips to intercity travels, and further to internal migrations. The well-known gravity model is happen to be the equilibrium solution of a degenerated DCG neglecting the crowding effects in the destinations.
- Feb 01 2018 cs.CL arXiv:1801.10296v1Many natural language processing tasks solely rely on sparse dependencies between a few tokens in a sentence. Soft attention mechanisms show promising performance in modeling local/global dependencies by soft probabilities between every two tokens, but they are not effective and efficient when applied to long sentences. By contrast, hard attention mechanisms directly select a subset of tokens but are difficult and inefficient to train due to their combinatorial nature. In this paper, we integrate both soft and hard attention into one context fusion model, "reinforced self-attention (ReSA)", for the mutual benefit of each other. In ReSA, a hard attention trims a sequence for a soft self-attention to process, while the soft attention feeds reward signals back to facilitate the training of the hard one. For this purpose, we develop a novel hard attention called "reinforced sequence sampling (RSS)", selecting tokens in parallel and trained via policy gradient. Using two RSS modules, ReSA efficiently extracts the sparse dependencies between each pair of selected tokens. We finally propose an RNN/CNN-free sentence-encoding model, "reinforced self-attention network (ReSAN)", solely based on ReSA. It achieves state-of-the-art performance on both Stanford Natural Language Inference (SNLI) and Sentences Involving Compositional Knowledge (SICK) datasets.
- Sep 28 2017 cs.RO arXiv:1709.09269v1To enable a natural and fluent human robot collaboration flow, it is critical for a robot to comprehend their human peers' on-going actions, predict their behaviors in the near future, and plan its actions correspondingly. Specifically, the capability of making early predictions is important, so that the robot can foresee the precise timing of a turn-taking event and start motion planning and execution early enough to smooth the turn-taking transition. Such proactive behavior would reduce human's waiting time, increase efficiency and enhance naturalness in collaborative task. To that end, this paper presents the design and implementation of an early turn-taking prediction algorithm, catered for physical human robot collaboration scenarios. Specifically, a Robotic Scrub Nurse (RSN) system which can comprehend surgeon's multimodal communication cues and perform turn-taking prediction is presented. The developed algorithm was tested on a collected data set of simulated surgical procedures in a surgeon-nurse tandem. The proposed turn-taking prediction algorithm is found to be significantly superior to its algorithmic counterparts, and is more accurate than human baseline when little partial input is given (less than 30% of full action). After observing more information, the algorithm can achieve comparable performances as humans with a F1 score of 0.90.
- Sep 28 2017 cs.RO arXiv:1709.09276v1Turn-taking is essential to the structure of human teamwork. Humans are typically aware of team members' intention to keep or relinquish their turn before a turn switch, where the responsibility of working on a shared task is shifted. Future co-robots are also expected to provide such competence. To that end, this paper proposes the Cognitive Turn-taking Model (CTTM), which leverages cognitive models (i.e., Spiking Neural Network) to achieve early turn-taking prediction. The CTTM framework can process multimodal human communication cues (both implicit and explicit) and predict human turn-taking intentions in an early stage. The proposed framework is tested on a simulated surgical procedure, where a robotic scrub nurse predicts the surgeon's turn-taking intention. It was found that the proposed CTTM framework outperforms the state-of-the-art turn-taking prediction algorithms by a large margin. It also outperforms humans when presented with partial observations of communication cues (i.e., less than 40% of full actions). This early prediction capability enables robots to initiate turn-taking actions at an early stage, which facilitates collaboration and increases overall efficiency.
- Recurrent neural nets (RNN) and convolutional neural nets (CNN) are widely used on NLP tasks to capture the long-term and local dependencies, respectively. Attention mechanisms have recently attracted enormous interest due to their highly parallelizable computation, significantly less training time, and flexibility in modeling dependencies. We propose a novel attention mechanism in which the attention between elements from input sequence(s) is directional and multi-dimensional (i.e., feature-wise). A light-weight neural net, "Directional Self-Attention Network (DiSAN)", is then proposed to learn sentence embedding, based solely on the proposed attention without any RNN/CNN structure. DiSAN is only composed of a directional self-attention with temporal order encoded, followed by a multi-dimensional attention that compresses the sequence into a vector representation. Despite its simple form, DiSAN outperforms complicated RNN models on both prediction quality and time efficiency. It achieves the best test accuracy among all sentence encoding methods and improves the most recent best result by 1.02% on the Stanford Natural Language Inference (SNLI) dataset, and shows state-of-the-art test accuracy on the Stanford Sentiment Treebank (SST), Multi-Genre natural language inference (MultiNLI), Sentences Involving Compositional Knowledge (SICK), Customer Review, MPQA, TREC question-type classification and Subjectivity (SUBJ) datasets.
- Drug-target interaction (DTI) prediction plays a very important role in drug development. Biochemical experiments or in vitro methods to identify such interactions are very expensive, laborious and time-consuming. Therefore, in silico approaches including docking simulation and machine learning have been proposed to solve this problem. In particular, machine learning approaches have attracted increasing attentions recently. However, in addition to the known drug-target interactions, most of the machine learning methods require extra information such as chemical structures, genome sequences, binding types and so on. Whenever such information is not available, they may perform poor. Very recently, the similarity-based link prediction methods were extended to bipartite networks, which can be applied to solve the DTI prediction problem by using topological information only. In this work, we propose a sparse learning method to solve the DTI prediction problem, which does not require extra information and performs much better than similarity-based methods. We compare the proposed method with similarity-based methods including common neighbor index, Katz index and Jaccard index on the DTI prediction problem over the four renowned and benchmark datasets. The proposed method performs remarkably better. The results suggest that although the proposed method utilizes only the known drug-target interactions, it performs very satisfactorily. The method is very suitable to predict the potential uses of the existing drugs, especially, when extra information about the drugs and targets is not available.
- May 26 2017 cs.CV arXiv:1705.08923v1Following the recent progress in image classification and captioning using deep learning, we develop a novel natural language person retrieval system based on an attention mechanism. More specifically, given the description of a person, the goal is to localize the person in an image. To this end, we first construct a benchmark dataset for natural language person retrieval. To do so, we generate bounding boxes for persons in a public image dataset from the segmentation masks, which are then annotated with descriptions and attributes using the Amazon Mechanical Turk. We then adopt a region proposal network in Faster R-CNN as a candidate region generator. The cropped images based on the region proposals as well as the whole images with attention weights are fed into Convolutional Neural Networks for visual feature extraction, while the natural language expression and attributes are input to Bidirectional Long Short- Term Memory (BLSTM) models for text feature extraction. The visual and text features are integrated to score region proposals, and the one with the highest score is retrieved as the output of our system. The experimental results show significant improvement over the state-of-the-art method for generic object retrieval and this line of research promises to benefit search in surveillance video footage.
- Apr 26 2017 cs.CV arXiv:1704.07813v2We present an unsupervised learning framework for the task of monocular depth and camera motion estimation from unstructured video sequences. We achieve this by simultaneously training depth and camera pose estimation networks using the task of view synthesis as the supervisory signal. The networks are thus coupled via the view synthesis objective during training, but can be applied independently at test time. Empirical evaluation on the KITTI dataset demonstrates the effectiveness of our approach: 1) monocular depth performing comparably with supervised methods that use either ground-truth pose or depth for training, and 2) pose estimation performing favorably with established SLAM systems under comparable input settings.
- Apr 21 2017 cs.CV arXiv:1704.06254v1We study the notion of consistency between a 3D shape and a 2D observation and propose a differentiable formulation which allows computing gradients of the 3D shape given an observation from an arbitrary view. We do so by reformulating view consistency using a differentiable ray consistency (DRC) term. We show that this formulation can be incorporated in a learning framework to leverage different types of multi-view observations e.g. foreground masks, depth, color images, semantics etc. as supervision for learning single-view 3D prediction. We present empirical analysis of our technique in a controlled setting. We also show that this approach allows us to improve over existing techniques for single-view reconstruction of objects from the PASCAL VOC dataset.
- Apr 19 2017 cs.RO arXiv:1704.05090v1This study tries to explain the connection between communication modalities and levels of supervision in teleoperation during a dexterous task, like surgery. This concept is applied to two surgical related tasks: incision and peg transfer. It was found that as the complexity of the task escalates, the combination linking human supervision with a more expressive modality shows better performance than other combinations of modalities and control. More specifically, in the peg transfer task, the combination of speech modality and action level supervision achieves shorter task completion time (77.1 +- 3.4 s) with fewer mistakes (0.20 +- 0.17 pegs dropped).
- Feb 14 2017 cs.GT arXiv:1702.03627v1This paper studies an auction design problem for a seller to sell a commodity in a social network, where each individual (the seller or a buyer) can only communicate with her neighbors. The challenge to the seller is to design a mechanism to incentivize the buyers, who are aware of the auction, to further propagate the information to their neighbors so that more buyers will participate in the auction and hence, the seller will be able to make a higher revenue. We propose a novel auction mechanism, called information diffusion mechanism (IDM), which incentivizes the buyers to not only truthfully report their valuations on the commodity to the seller, but also further propagate the auction information to all their neighbors. In comparison, the direct extension of the well-known Vickrey-Clarke-Groves (VCG) mechanism in social networks can also incentivize the information diffusion, but it will decrease the seller's revenue or even lead to a deficit sometimes. The formalization of the problem has not yet been addressed in the literature of mechanism design and our solution is very significant in the presence of large-scale online social networks.
- In this paper, the problems of simultaneously detecting and localizing multiple targets are considered for noncoherent multiple-input multiple-output (MIMO) radar with widely separated antennas. By assuming a prior knowledge of target number, an optimal solution to this problem is presented first. It is essentially a maximum-likelihood (ML) estimator searching parameters of interest in a high dimensional space. However, the complexity of this method increases exponentially with the number G of targets.Besides, without the prior information of the number of targets, a multi-hypothesis testing strategy to determine the number of targets is required, which further complicates this method. Therefore, we split the joint maximization into G disjoint optimization problems by clearing the interference from previously declared targets. In this way, we derive two fast and robust suboptimal solutions which allow trading performance for a much lower implementation complexity which is almost independent of the number of targets. In addition, the multi-hypothesis testing is no longer required when target number is unknown. Simulation results show the proposed algorithms can correctly detect and accurately localize multiple targets even when targets share common range bins in some paths.
- Dec 22 2016 physics.soc-ph cs.SI arXiv:1612.07277v1Identifying influential nodes in networks is a significant and challenging task. Among many centrality indices, the $k$-shell index performs very well in finding out influential spreaders. However, the traditional method for calculating the $k$-shell indices of nodes needs the global topological information, which limits its applications in large-scale dynamically growing networks. Recently, L\@ΓΌ \emphet al. [Nature Communications 7 (2016) 10168] proposed a novel asynchronous algorithm to calculate the $k$-shell indices, which is suitable to deal with large-scale growing networks. In this paper, we propose two algorithms to select nodes and update their intermediate values towards the $k$-shell indices, which can help in accelerating the convergence of the calculation of $k$-shell indices. The former algorithm takes into account the degrees of nodes while the latter algorithm prefers to choose the node whose neighbors' values have been changed recently. We test these two methods on four real networks and three artificial networks. The results suggest that the two algorithms can respectively reduce the convergence time up to 75.4\% and 92.9\% in average, compared with the original asynchronous updating algorithm.
- Nov 22 2016 cs.CV arXiv:1611.07004v2We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Indeed, since the release of the pix2pix software associated with this paper, a large number of internet users (many of them artists) have posted their own experiments with our system, further demonstrating its wide applicability and ease of adoption without the need for parameter tweaking. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.
- Alternating Direction Method of Multipliers (ADMM) is a popular method in solving Machine Learning problems. Stochastic ADMM was firstly proposed in order to reduce the per iteration computational complexity, which is more suitable for big data problems. Recently, variance reduction techniques have been integrated with stochastic ADMM in order to get a fast convergence rate, such as SAG-ADMM and SVRG-ADMM,but the convergence is still suboptimal w.r.t the smoothness constant. In this paper, we propose a new accelerated stochastic ADMM algorithm with variance reduction, which enjoys a faster convergence than all the other stochastic ADMM algorithms. We theoretically analyze its convergence rate and show its dependence on the smoothness constant is optimal. We also empirically validate its effectiveness and show its priority over other stochastic ADMM algorithms.
- By restricting the iterate on a nonlinear manifold, the recently proposed Riemannian optimization methods prove to be both efficient and effective in low rank tensor completion problems. However, existing methods fail to exploit the easily accessible side information, due to their format mismatch. Consequently, there is still room for improvement in such methods. To fill the gap, in this paper, a novel Riemannian model is proposed to organically integrate the original model and the side information by overcoming their inconsistency. For this particular model, an efficient Riemannian conjugate gradient descent solver is devised based on a new metric that captures the curvature of the objective.Numerical experiments suggest that our solver is more accurate than the state-of-the-art without compromising the efficiency.
- Oct 04 2016 cs.SY arXiv:1610.00063v3This paper investigates the minimal number of inputs/outputs required to guarantees the controllability/observability of a system, under the condition that its state transition matrix (STM) is prescribed. It has been proved that this minimal number is equal to the maximum geometric multiplicity of the system STM. The obtained conclusions are in sharp contrast to those established for the problems of finding the sparest input/output matrix under the restriction of system controllability/observabilty, which have been proved to be NP-hard, and even impossible to be approximated within a multiplicative factor. Moreover, a complete parametrization is also provided for the input/output matrix of a system with this minimal input/output number.
- Sep 12 2016 physics.soc-ph cs.SI arXiv:1609.02669v1Human behaviors exhibit ubiquitous correlations in many aspects, such as individual and collective levels, temporal and spatial dimensions, content, social and geographical layers. With rich Internet data of online behaviors becoming available, it attracts academic interests to explore human mobility similarity from the perspective of social network proximity. Existent analysis shows a strong correlation between online social proximity and offline mobility similari- ty, namely, mobile records between friends are significantly more similar than between strangers, and those between friends with common neighbors are even more similar. We argue the importance of the number and diversity of com- mon friends, with a counter intuitive finding that the number of common friends has no positive impact on mobility similarity while the diversity plays a key role, disagreeing with previous studies. Our analysis provides a novel view for better understanding the coupling between human online and offline behaviors, and will help model and predict human behaviors based on social proximity.
- Jul 06 2016 physics.soc-ph cs.SI arXiv:1607.01134v1Real networks exhibit heterogeneous nature with nodes playing far different roles in structure and function. To identify vital nodes is thus very significant, allowing us to control the outbreak of epidemics, to conduct advertisements for e-commercial products, to predict popular scientific publications, and so on. The vital nodes identification attracts increasing attentions from both computer science and physical societies, with algorithms ranging from simply counting the immediate neighbors to complicated machine learning and message passing approaches. In this review, we clarify the concepts and metrics, classify the problems and methods, as well as review the important progresses and describe the state of the art. Furthermore, we provide extensive empirical analyses to compare well-known methods on disparate real networks, and highlight the future directions. In despite of the emphasis on physics-rooted approaches, the unification of the language and comparison with cross-domain methods would trigger interdisciplinary solutions in the near future.
- In this paper, we design an association scheme to maximize the sum energy efficiency for massive multiple-input and multiple-output (MIMO) enabled heterogeneous cellular networks (HCNs). Considering that the final formulated problem is in a sum-of-ratio form, we first need to transform it into a parametric nonfractional form, by which we can achieve its solution through a two-layer iterative algorithm. The outer layer searches the energy efficiency parameters and multipliers associated with signal-interference-plus-noise-ratio (SINR) constraints using Newton-like method, and the inner layer optimizes the association indices using Lagrange multiplier method. In fact, the inner layer doesn't need iterative steps when the SINR constraints are not involved in the original problem, and then the whole algorithm should be a one-layer iterative one. As for the two-layer iterative algorithm, we also give the corresponding convergence proof. Numerical results show that the proposed scheme significantly outperforms the existing one in system throughput and network energy efficiency. In addition, we also investigate the impacts of the number of massive antennas and the transmit power of each pico base station on these association performances.
- Instead of achievable rate in the conventional association, we utilize the effective rate to design two association schemes for load balancing in heterogeneous cellular networks (HCNs), which are both formulated as such problems with maximizing the sum of effective rates. In these two schemes, the one just considers user association, but the other introduces power control to mitigate interference and reduce energy consumption while performing user association. Since the effective rate is closely related to the load of some BS and the achievable rate of some user, it can be used as a key factor of association schemes for load balancing in HCNs. To solve the association problem without power control, we design a one-layer iterative algorithm, which converts the sum-of-ratio form of original optimization problem into a parameterized polynomial form. By combining this algorithm with power control algorithm, we propose a two-layer iterative algorithm for the association problem with power control. Specially, the outer layer performs user association using the algorithm of problem without power control, and the inner layer updates the transmit power of each BS using a power update function (PUF). At last, we give some convergence and complexity analyses for the proposed algorithms. As shown in simulation results, the proposed schemes have superior performance than the conventional association, and the scheme with joint user association and power control achieves a higher load balancing gain and energy efficiency than conventional scheme and other offloading scheme.
- Inspired by practical importance of social networks, economic networks, biological networks and so on, studies on large and complex networks have attracted a surge of attentions in the recent years. Link prediction is a fundamental issue to understand the mechanisms by which new links are added to the networks. We introduce the method of robust principal component analysis (robust PCA) into link prediction, and estimate the missing entries of the adjacency matrix. On one hand, our algorithm is based on the sparsity and low rank property of the matrix, on the other hand, it also performs very well when the network is dense. This is because a relatively dense real network is also sparse in comparison to the complete graph. According to extensive experiments on real networks from disparate fields, when the target network is connected and sufficiently dense, whatever it is weighted or unweighted, our method is demonstrated to be very effective and with prediction accuracy being considerably improved comparing with many state-of-the-art algorithms.
- Jun 21 2016 cs.DL arXiv:1606.05752v1Predicting the fast-rising young researchers (Academic Rising Stars) in the future provides useful guidance to the research community, e.g., offering competitive candidates to university for young faculty hiring as they are expected to have success academic careers. In this work, given a set of young researchers who have published the first first-author paper recently, we solve the problem of how to effectively predict the top k% researchers who achieve the highest citation increment in βt years. We explore a series of factors that can drive an author to be fast-rising and design a novel impact increment ranking learning (IIRL) algorithm that leverages those factors to predict the academic rising stars. Experimental results on the large ArnetMiner dataset with over 1.7 million authors demonstrate the effectiveness of IIRL. Specifically, it outperforms all given benchmark methods, with over 8% average improvement. Further analysis demonstrates that the prediction models for different research topics follow the similar pattern. We also find that temporal features are the best indicators for rising stars prediction, while venue features are less relevant.
- Jun 20 2016 physics.soc-ph cs.SI arXiv:1606.05405v1In spite of the vast literature on spreading dynamics on complex networks, the role of local synergy, i.e., the interaction of elements that when combined produce a total effect greater than the sum of the individualelements, has been studied but only for irreversible spreading dynamics. Reversible spreading dynamics are ubiquitous but their interplay with synergy has remained unknown. To fill this knowledge gap, we articulate a model to incorporate local synergistic effect into the classical susceptible-infected-susceptible process, in which the probability for a susceptible node to become infected through an infected neighbor is enhanced when the neighborhood of the latter contains a number of infected nodes. We derive master equations incorporating the synergistic effect, with predictions that agree well with the numerical results. A striking finding is that, when a parameter characterizing the strength of the synergy reinforcement effect is above a critical value, the steady state density of the infected nodes versus the basic transmission rate exhibits an explosively increasing behavior and a hysteresis loop emerges. In fact, increasing the synergy strength can promote the spreading and reduce the invasion and persistence thresholds of the hysteresis loop. A physical understanding of the synergy promoting explosive spreading and the associated hysteresis behavior can be obtained through a mean-field analysis.
- Jun 14 2016 physics.soc-ph cs.SI arXiv:1606.03911v2With the help of information and communication technologies, studies on the overall social networks have been extensively reported recently. However, investigations on the directed Ego Communication Networks (ECNs) remain insufficient, where an ECN stands for a sub network composed of a centralized individual and his/her direct contacts. In this paper, the directed ECNs are built on the Call Detail Records (CDRs), which cover more than 7 million people of a provincial capital city in China for half a year. Results show that there is a critical size for ECN at about 150, above which the average emotional closeness between ego and alters drops, the balanced relationship between ego and network collapses, and the proportion of strong ties decreases. This paper not only demonstrate the significance of ECN size in affecting its properties, but also shows accordance with the "Dunbar's Number". These results can be viewed as a cross-culture supportive evidence to the well-known Social Brain Hypothesis (SBH).
- We propose a new random pruning method (called "submodular sparsification (SS)") to reduce the cost of submodular maximization. The pruning is applied via a "submodularity graph" over the $n$ ground elements, where each directed edge is associated with a pairwise dependency defined by the submodular function. In each step, SS prunes a $1-1/\sqrt{c}$ (for $c>1$) fraction of the nodes using weights on edges computed based on only a small number ($O(\log n)$) of randomly sampled nodes. The algorithm requires $\log_{\sqrt{c}}n$ steps with a small and highly parallelizable per-step computation. An accuracy-speed tradeoff parameter $c$, set as $c = 8$, leads to a fast shrink rate $\sqrt{2}/4$ and small iteration complexity $\log_{2\sqrt{2}}n$. Analysis shows that w.h.p., the greedy algorithm on the pruned set of size $O(\log^2 n)$ can achieve a guarantee similar to that of processing the original dataset. In news and video summarization tasks, SS is able to substantially reduce both computational costs and memory usage, while maintaining (or even slightly exceeding) the quality of the original (and much more costly) greedy algorithm.
- We propose a streaming submodular maximization algorithm "stream clipper" that performs as well as the offline greedy algorithm on document/video summarization in practice. It adds elements from a stream either to a solution set $S$ or to an extra buffer $B$ based on two adaptive thresholds, and improves $S$ by a final greedy step that starts from $S$ adding elements from $B$. During this process, swapping elements out of $S$ can occur if doing so yields improvements. The thresholds adapt based on if current memory utilization exceeds a budget, e.g., it increases the lower threshold, and removes from the buffer $B$ elements below the new lower threshold. We show that, while our approximation factor in the worst case is $1/2$ (like in previous work, and corresponding to the tight bound), we show that there are data-dependent conditions where our bound falls within the range $[1/2, 1-1/e]$. In news and video summarization experiments, the algorithm consistently outperforms other streaming methods, and, while using significantly less computation and memory, performs similarly to the offline greedy algorithm.
- May 12 2016 cs.CV arXiv:1605.03557v3We address the problem of novel view synthesis: given an input image, synthesizing new images of the same object or scene observed from arbitrary viewpoints. We approach this as a learning task but, critically, instead of learning to synthesize pixels from scratch, we learn to copy them from the input image. Our approach exploits the observation that the visual appearance of different views of the same instance is highly correlated, and such correlation could be explicitly learned by training a convolutional neural network (CNN) to predict appearance flows -- 2-D coordinate vectors specifying which pixels in the input view could be used to reconstruct the target view. Furthermore, the proposed framework easily generalizes to multiple input views by learning how to optimally combine single-view predictions. We show that for both objects and scenes, our approach is able to synthesize novel views of higher perceptual quality than previous CNN-based techniques.
- May 06 2016 cs.CV arXiv:1605.01576v1Image inpaiting is an important task in image processing and vision. In this paper, we develop a general method for patch-based image inpainting by synthesizing new textures from existing one. A novel framework is introduced to find several optimal candidate patches and generate a new texture patch in the process. We form it as an optimization problem that identifies the potential patches for synthesis from an coarse-to-fine manner. We use the texture descriptor as a clue in searching for matching patches from the known region. To ensure the structure faithful to the original image, a geometric constraint metric is formally defined that is applied directly to the patch synthesis procedure. We extensively conducted our experiments on a wide range of testing images on various scenarios and contents by arbitrarily specifying the target the regions for inference followed by using existing evaluation metrics to verify its texture coherency and structural consistency. Our results demonstrate the high accuracy and desirable output that can be potentially used for numerous applications: object removal, background subtraction, and image retrieval.
- Apr 20 2016 cs.CV arXiv:1604.05383v1Discriminative deep learning approaches have shown impressive results for problems where human-labeled ground truth is plentiful, but what about tasks where labels are difficult or impossible to obtain? This paper tackles one such problem: establishing dense visual correspondence across different object instances. For this task, although we do not know what the ground-truth is, we know it should be consistent across instances of that category. We exploit this consistency as a supervisory signal to train a convolutional neural network to predict cross-instance correspondences between pairs of images depicting objects of the same category. For each pair of training images we find an appropriate 3D CAD model and render two synthetic views to link in with the pair, establishing a correspondence flow 4-cycle. We use ground-truth synthetic-to-synthetic correspondences, provided by the rendering engine, to train a ConvNet to predict synthetic-to-real, real-to-real and real-to-synthetic correspondences that are cycle-consistent with the ground-truth. At test time, no CAD models are required. We demonstrate that our end-to-end trained ConvNet supervised by cycle-consistency outperforms state-of-the-art pairwise matching methods in correspondence-related tasks.
- Jan 21 2016 physics.soc-ph cs.SI arXiv:1601.05146v1An important fact in studying the link prediction is that the structural properties of networks have significant impacts on the performance of algorithms. Therefore, how to improve the performance of link prediction with the aid of structural properties of networks is an essential problem. By analyzing many real networks, we find a common structure property: nodes are preferentially linked to the nodes with the weak clique structure (abbreviated as PWCS to simplify descriptions). Based on this PWCS phenomenon, we propose a local friend recommendation (FR) index to facilitate link prediction. Our experiments show that the performance of FR index is generally better than some famous local similarity indices, such as Common Neighbor (CN) index, Adamic-Adar (AA) index and Resource Allocation (RA) index. We then explain why PWCS can give rise to the better performance of FR index in link prediction. Finally, a mixed friend recommendation index (labelled MFR) is proposed by utilizing the PWCS phenomenon, which further improves the accuracy of link prediction.
- Dec 18 2015 physics.soc-ph cs.SI arXiv:1512.05485v1Inspired by traditional link prediction and to solve the problem of recommending friends in social networks, we introduce the personalized link prediction in this paper, in which each individual will get equal number of diversiform predictions. While the performances of many classical algorithms are not satisfactory under this framework, thus new algorithms are in urgent need. Motivated by previous researches in other fields, we generalize heat conduction process to the framework of personalized link prediction and find that this method outperforms many classical similarity-based algorithms, especially in the performance of diversity. In addition, we demonstrate that adding one ground node who is supposed to connect all the nodes in the system will greatly benefit the performance of heat conduction. Finally, better hybrid algorithms composed of local random walk and heat conduction have been proposed. Numerical results show that the hybrid algorithms can outperform other algorithms simultaneously in all four adopted metrics: AUC, precision, recall and hamming distance. In a word, this work may shed some light on the in-depth understanding of the effect of physical processes in personalized link prediction.
- The heterogeneous cellular networks (HCNs) with device-to-device (D2D) communications have been a promising solution to cost-efficient delivery of high data rates. A key challenge in such D2D-enabled HCNs is how to design an effective association scheme with D2D model selection for load balancing. Moreover, the offloaded users and D2D receivers (RXs) would suffer strong interference from BSs, especially from high-power BSs. Evidently, a good association scheme should integrate with interference mitigation. Thus, we first propose an effective resource partitioning strategy that can mitigate the interference received by offloaded users from high-power BSs and the one received by D2D RXs from BSs. Based on this, we then design a user association scheme for load balancing, which jointly considers user association and D2D model selection to maximize network-wide utility. Considering that the formulated problem is in a nonlinear and mixted-integer form and hard to tackle, we adopt a dual decomposition method to develop an efficient distributed algorithm. Simulation results show that the proposed scheme provides a load balancing gain and a resource partitioning gain.
- Dec 07 2015 physics.soc-ph cs.SI arXiv:1512.01432v1Similarity is a fundamental measure in network analyses and machine learning algorithms, with wide applications ranging from personalized recommendation to socio-economic dynamics. We argue that an effective similarity measurement should guarantee the stability even under some information loss. With six bipartite networks, we investigate the stabilities of fifteen similarity measurements by comparing the similarity matrixes of two data samples which are randomly divided from original data sets. Results show that, the fifteen measurements can be well classified into three clusters according to their stabilities, and measurements in the same cluster have similar mathematical definitions. In addition, we develop a top-$n$-stability method for personalized recommendation, and find that the unstable similarities would recommend false information to users, and the performance of recommendation would be largely improved by using stable similarity measurements. This work provides a novel dimension to analyze and evaluate similarity measurements, which can further find applications in link prediction, personalized recommendation, clustering algorithms, community detection and so on.
- Energy reduction for wireless systems becomes more and more important due to its impact on the operation cost and global carbon footprint. In this paper, we investigate three kinds of energy-efficient association schemes under open loop power control for uplink heterogeneous cellular networks, which are formulated as a whole energy efficiency maximization problem, a sum energy efficiency maximization problem and a utility maximization problem respectively. The third case takes account of load balancing level and user's fairness in the energy-efficient association. Considering that the first problem is in a fractional mixed-integer form, we introduce an energy efficiency parameter to convert it into a parametric subtractive form, and then design an effective iterative algorithm to achieve the optimal solutions. As for the third problem, we first introduce a dual variable to decouple the constraint and then develop a distributed algorithm using dual decomposition. In addition, we also give the convergence proof for the proposed algorithms. In order to confirm the effectiveness of energy-efficient user association algorithms, we introduce other association rules for comparison, and investigate the influences of different parameters on the association performance of these association rules.
- Oct 09 2015 cs.CV arXiv:1510.02413v1We propose a data-driven approach for intrinsic image decomposition, which is the process of inferring the confounding factors of reflectance and shading in an image. We pose this as a two-stage learning problem. First, we train a model to predict relative reflectance ordering between image patches (`brighter', `darker', `same') from large-scale human annotations, producing a data-driven reflectance prior. Second, we show how to naturally integrate this learned prior into existing energy minimization frameworks for intrinsic image decomposition. We compare our method to the state-of-the-art approach of Bell et al. on both decomposition and image relighting tasks, demonstrating the benefits of the simple relative reflectance prior, especially for scenes under challenging lighting conditions.
- Oct 09 2015 cs.IR physics.data-an arXiv:1510.02348v2Recommender systems benefit us in tackling the problem of information overload by predicting our potential choices among diverse niche objects. So far, a variety of personalized recommendation algorithms have been proposed and most of them are based on similarities, such as collaborative filtering and mass diffusion. Here, we propose a novel vertex similarity index named CosRA, which combines advantages of both the cosine index and the resource-allocation (RA) index. By applying the CosRA index to real recommender systems including MovieLens, Netflix and RYM, we show that the CosRA-based method has better performance in accuracy, diversity and novelty than some benchmark methods. Moreover, the CosRA index is free of parameters, which is a significant advantage in real applications. Further experiments show that the introduction of two turnable parameters cannot remarkably improve the overall performance of the CosRA index.
- Oct 06 2015 cs.CV arXiv:1510.01148v3This paper presents a novel object tracking method based on approximated Locality-constrained Linear Coding (LLC). Rather than using a non-negativity constraint on encoding coefficients to guarantee these elements nonnegative, in this paper, the non-negativity constraint is substituted for a conventional $\ell_2$ norm regularization term in approximated LLC to obtain the similar nonnegative effect. And we provide a detailed and adequate explanation in theoretical analysis to clarify the rationality of this replacement. Instead of specifying fixed K nearest neighbors to construct the local dictionary, a series of different dictionaries with pre-defined numbers of nearest neighbors are selected. Weights of these various dictionaries are also learned from approximated LLC in the similar framework. In order to alleviate tracking drifts, we propose a simple and efficient occlusion detection method. The occlusion detection criterion mainly depends on whether negative templates are selected to represent the severe occluded target. Both qualitative and quantitative evaluations on several challenging sequences show that the proposed tracking algorithm achieves favorable performance compared with other state-of-the-art methods.
- To characterize economic development and diagnose the economic health condition, several popular indices such as gross domestic product (GDP), industrial structure and income growth are widely applied. However, computing these indices based on traditional economic census is usually costly and resources consuming, and more importantly, following a long time delay. In this paper, we analyzed nearly 200 million users' activities for four consecutive years in the largest social network (Sina Microblog) in China, aiming at exploring latent relationships between the online social activities and local economic status. Results indicate that online social activity has a strong correlation with local economic development and industrial structure, and more interestingly, allows revealing the macro-economic structure instantaneously with nearly no cost. Beyond, this work also provides a new venue to identify risky signal in local economic structure.
- Sep 22 2015 cs.CV arXiv:1509.06003v3The establishment of robust target appearance model over time is an overriding concern in visual tracking. In this paper, we propose an inverse nonnegative matrix factorization (NMF) method for robust appearance modeling. Rather than using a linear combination of nonnegative basis matrices for each target image patch in the conventional NMF, the proposed method is a reverse thought to conventional NMF tracker. It utilizes both the foreground and background information, and imposes a local coordinate constraint, where the basis matrix is sparse matrix from the linear combination of candidates with corresponding nonnegative coefficient vectors. Inverse NMF is used as a feature encoder, where the resulting coefficient vectors are fed into a SVM classifier for separating the target from the background. The proposed method is tested on several videos and compared with seven state-of-the-art methods. Our results have provided further support to the effectiveness and robustness of the proposed method.
- Reputation is a valuable asset in online social lives and it has drawn increased attention. How to evaluate user reputation in online rating systems is especially significant due to the existence of spamming attacks. To address this issue, so far, a variety of methods have been proposed, including network-based methods, quality-based methods and group-based ranking method. In this paper, we propose an iterative group-based ranking (IGR) method by introducing an iterative reputation-allocation process into the original group-based ranking (GR) method. More specifically, users with higher reputation have higher weights in dominating the corresponding group sizes. The reputation of users and the corresponding group sizes are iteratively updated until they become stable. Results on two real data sets suggest that the proposed IGR method has better performance and its robustness is considerably improved comparing with the original GR method. Our work highlights the positive role of users' grouping behavior towards a better reputation evaluation.
- Jul 21 2015 physics.soc-ph cs.SI arXiv:1507.05248v1Large-scale interacting human activities underlie all social and economic phenomena, but quantitative understanding of regular patterns and mechanism is very challenging and still rare. Self-organized online collaborative activities with precise record of event timing provide unprecedented opportunity. Our empirical analysis of the history of millions of updates in Wikipedia shows a universal double power-law distribution of time intervals between consecutive updates of an article. We then propose a generic model to unfold collaborative human activities into three modules: (i) individual behavior characterized by Poissonian initiation of an action, (ii) human interaction captured by a cascading response to others with a power-law waiting time, and (iii) population growth due to increasing number of interacting individuals. This unfolding allows us to obtain analytical formula that is fully supported by the universal patterns in empirical data. Our modeling approaches reveal "simplicity" beyond complex interacting human activities.
- A mass of traces of human activities show rich dynamic patterns. In this article, we comprehensively investigate the dynamic patterns of 50 thousands of researchers' activities in Sciencenet, the largest multi-disciplinary academic community in China. Through statistical analyses, we found that (i) there exists a power-law scaling between the frequency of visits to an academic forum and the number of corresponding visitors, with the exponent being about 1.33; (ii) the expansion process of academic forums obeys the Heaps' law, namely the number of distinct visited forums to the number of visits grows in a power-law form with exponent being about 0.54; (iii) the probability distributions of time intervals and the number of visits taken to revisit the same academic forum both follow power-laws, indicating the existence of memory effect in academic forum activities. On the basis of these empirical results, we propose a dynamic model that incorporates the exploration, preferential return and memory effect, which can well reproduce the observed scaling laws.
- May 28 2015 physics.soc-ph cs.SI arXiv:1505.07354v1Recent study shows that the accuracy of the k-shell method in determining node coreness in a spreading process is largely impacted due to the existence of core-like group, which has a large k-shell index but a low spreading efficiency. Based on analysis of the structure of core-like groups in real-world networks, we discover that nodes in the core-like group are mutually densely connected with very few out-leaving links from the group. By defining a measure of diffusion importance for each edge based on the number of out-leaving links of its both ends, we are able to identify redundant links in the spreading process, which have a relatively low diffusion importance but lead to form the locally densely connected core-like group. After filtering out the redundant links and applying the k-shell method to the residual network, we obtain a renewed coreness for each node which is a more accurate index to indicate its location importance and spreading influence in the original network. Moreover, we find that the performance of the ranking algorithms based on the renewed coreness are also greatly enhanced. Our findings help to more accurately decompose the network core structure and identify influential nodes in spreading processes.
- Display advertising normally charges advertisers for every single ad impression. Specifically, if an ad in a webpage has been loaded in the browser, an ad impression is counted. However, due to the position and size of the ad slot, lots of ads are actually not viewed but still measured as impressions and charged. These fraud ad impressions indeed undermine the efficacy of display advertising. A perfect ad impression viewability measurement should match what the user has really viewed with a short memory. In this paper, we conduct extensive investigations on display ad impression viewability measurements on dimensions of ad creative displayed pixel percentage and exposure time to find which measurement provides the most accurate ad impression counting. The empirical results show that the most accurate measurement counts one ad impression if more than 75% of the ad creative pixels have been exposed for at least 2 continuous seconds.
- With great theoretical and practical significance, locating influential nodes of complex networks is a promising issues. In this paper, we propose a dynamics-sensitive (DS) centrality that integrates topological features and dynamical properties. The DS centrality can be directly applied in locating influential spreaders. According to the empirical results on four real networks for both susceptible-infected-recovered (SIR) and susceptible-infected (SI) spreading models, the DS centrality is much more accurate than degree, $k$-shell index and eigenvector centrality.
- Feb 17 2015 physics.soc-ph cs.SI arXiv:1502.04184v2Enterprises have put more and more emphasis on data analysis so as to obtain effective management advices. Managers and researchers are trying to dig out the major factors that lead to employees' promotion and resignation. Most previous analyses were based on questionnaire survey, which usually consists of a small fraction of samples and contains biases caused by psychological defense. In this paper, we successfully collect a data set consisting of all the employees' work-related interactions (action network, AN for short) and online social connections (social network, SN for short) of a company, which inspires us to reveal the correlations between structural features and employees' career development, namely promotion and resignation. Through statistical analysis and prediction, we show that the structural features of both AN and SN are correlated and predictive to employees' promotion and resignation, and the AN has higher correlation and predictability. More specifically, the in-degree in AN is the most relevant indicator for promotion; while the k-shell index in AN and in-degree in SN are both very predictive to resignation. Our results provide a novel and actionable understanding of enterprise management and suggest that to enhance the interplays among employees, no matter work-related or social interplays, can largely improve the loyalty of employees.
- Jan 20 2015 cs.CV arXiv:1501.04378v4Multiple Instance Learning (MIL) recently provides an appealing way to alleviate the drifting problem in visual tracking. Following the tracking-by-detection framework, an online MILBoost approach is developed that sequentially chooses weak classifiers by maximizing the bag likelihood. In this paper, we extend this idea towards incorporating the instance significance estimation into the online MILBoost framework. First, instead of treating all instances equally, with each instance we associate a significance-coefficient that represents its contribution to the bag likelihood. The coefficients are estimated by a simple Bayesian formula that jointly considers the predictions from several standard MILBoost classifiers. Next, we follow the online boosting framework, and propose a new criterion for the selection of weak classifiers. Experiments with challenging public datasets show that the proposed method outperforms both existing MIL based and boosting based trackers.
- Jan 16 2015 cs.IR physics.data-an arXiv:1501.03577v1The explosive growth of information challenges people's capability in finding out items fitting to their own interests. Recommender systems provide an efficient solution by automatically push possibly relevant items to users according to their past preferences. Recommendation algorithms usually embody the causality from what having been collected to what should be recommended. In this article, we argue that in many cases, a user's interests are stable, and thus the previous and future preferences are highly consistent. The temporal order of collections then does not necessarily imply a causality relationship. We further propose a consistence-based algorithm that outperforms the state-of-the-art recommendation algorithms in disparate real data sets, including \textitNetflix, \textitMovieLens, \textitAmazon and \textitRate Your Music.
- Jan 06 2015 cs.IR physics.soc-ph arXiv:1501.00677v1Ranking problem has attracted much attention in real systems. How to design a robust ranking method is especially significant for online rating systems under the threat of spamming attacks. By building reputation systems for users, many well-performed ranking methods have been applied to address this issue. In this Letter, we propose a group-based ranking method that evaluates users' reputations based on their grouping behaviors. More specifically, users are assigned with high reputation scores if they always fall into large rating groups. Results on three real data sets indicate that the present method is more accurate and robust than correlation-based method in the presence of spamming attacks.
- Oct 29 2014 cs.CV arXiv:1410.7484v2Stochastic sampling based trackers have shown good performance for abrupt motion tracking so that they have gained popularity in recent years. However, conventional methods tend to use a two-stage sampling paradigm, in which the search space needs to be uniformly explored with an inefficient preliminary sampling phase. In this paper, we propose a novel sampling-based method in the Bayesian filtering framework to address the problem. Within the framework, nearest neighbor field estimation is utilized to compute the importance proposal probabilities, which guide the Markov chain search towards promising regions and thus enhance the sampling efficiency; given the motion priors, a smoothing stochastic sampling Monte Carlo algorithm is proposed to approximate the posterior distribution through a smoothing weight-updating scheme. Moreover, to track the abrupt and the smooth motions simultaneously, we develop an abrupt-motion detection scheme which can discover the presence of abrupt motions during online tracking. Extensive experiments on challenging image sequences demonstrate the effectiveness and the robustness of our algorithm in handling the abrupt motions.
- Oct 15 2014 physics.soc-ph cs.SI arXiv:1410.3519v2Numerous concise models such as preferential attachment have been put forward to reveal the evolution mechanisms of real-world networks, which show that real-world networks are usually jointly driven by a hybrid mechanism of multiplex features instead of a single pure mechanism. To get an accurate simulation for real networks, some researchers proposed a few hybrid models of mixing multiple evolution mechanisms. Nevertheless, how a hybrid mechanism of multiplex features jointly influence the network evolution is not very clear. In this study, we introduce two methods (link prediction and likelihood analysis) to measure multiple evolution mechanisms of complex networks. Through tremendous experiments on artificial networks, which can be controlled to follow multiple mechanisms with different weights, we find the method based on likelihood analysis performs much better and gives very accurate estimations. At last, we apply this method to some real-world networks which are from different domains (including technology networks and social networks) and different countries (e.g., USA and China), to see how popularity and clustering co-evolve. We find most of them are affected by both popularity and clustering, but with quite different weights.
- Sep 19 2014 physics.soc-ph cs.SI arXiv:1409.5187v2Identifying the most influential spreaders is an important issue in understanding and controlling spreading processes on complex networks. Recent studies showed that nodes located in the core of a network as identified by the k-shell decomposition are the most influential spreaders. However, through a great deal of numerical simulations, we observe that not in all real networks do nodes in high shells are very influential: in some networks the core nodes are the most influential which we call true core, while in others nodes in high shells, even the innermost core, are not good spreaders which we call core-like group. By analyzing the k-core structure of the networks, we find that the true core of a network links diversely to the shells of the network, while the core-like group links very locally within the group. For nodes in the core-like group, the k-shell index cannot reflect their location importance in the network. We further introduce a measure based on the link diversity of shells to effectively distinguish the true core and core-like group, and identify core-like groups throughout the networks. Our findings help to better understand the structural features of real networks and influential nodes.
- Repeated game theory has been one of the most prevailing tools for understanding the long-run relationships, which are footstones in building human society. Recent works have revealed a new set of "zero-determinant (ZD)" strategies, which is an important advance in repeated games. A ZD strategy player can exert a unilaterally control on two players' payoffs. In particular he can deterministically set the opponent's payoff, or enforce an unfair linear relationship between the players' payoffs, thereby always seizing an advantageous share of payoffs. One of the limitations of the original ZD strategy, however, is that it does not capture the notion of robustness when the game is subjected to stochastic errors. In this paper, we propose a general model of ZD strategies for noisy repeated games, and find that ZD strategies have high robustness against errors. We further derive the pinning strategy under noise, by which the ZD strategy player coercively set the opponent's expected payoff to his desired level, although his payoff control ability declines with the increase of noise strength. Due to the uncertainty caused by noise, the ZD strategy player cannot secure his payoff to be higher than the opponent's, which implies strong extortions do not exist even under low noise. While we show that the ZD strategy player can still establish a novel kind of extortions, named weak extortions, where any increase of his own payoff always exceeds that of the opponent's by a fixed percentage, and the conditions under which the weak extortions can be realized are more stringent as the noise becomes stronger.
- Link prediction aims to uncover missing links or predict the emergence of future relationships according to the current networks structure. Plenty of algorithms have been developed for link prediction in unweighted networks, with only a very few of them having been extended to weighted networks. Thus far, how to predict weights of links is important but rarely studied. In this Letter, we present a reliable-route-based method to extend unweighted local similarity indices to weighted indices and propose a method to predict both the link existence and link weights accordingly. Experiments on different real networks suggest that the weighted resource allocation index has the best performance to predict the existence of links, while the reliable-route-based weighted resource allocation index performs noticeably better on weight prediction. Further analysis shows a strong correlation for both link prediction and weight prediction: the larger the clustering coefficient, the higher the prediction accuracy.
- We reduce a broad class of machine learning problems, usually addressed by EM or sampling, to the problem of finding the $k$ extremal rays spanning the conical hull of a data point set. These $k$ "anchors" lead to a global solution and a more interpretable model that can even outperform EM and sampling on generalization error. To find the $k$ anchors, we propose a novel divide-and-conquer learning scheme "DCA" that distributes the problem to $\mathcal O(k\log k)$ same-type sub-problems on different low-D random hyperplanes, each can be solved by any solver. For the 2D sub-problem, we present a non-iterative solver that only needs to compute an array of cosine values and its max/min entries. DCA also provides a faster subroutine for other methods to check whether a point is covered in a conical hull, which improves algorithm design in multiple dimensions and brings significant speedup to learning. We apply our method to GMM, HMM, LDA, NMF and subspace clustering, then show its competitive performance and scalability over other methods on rich datasets.
- The identification of urban mobility patterns is very important for predicting and controlling spatial events. In this study, we analyzed millions of geographical check-ins crawled from a leading Chinese location-based social networking service (Jiepang.com), which contains demographic information that facilitates group-specific studies. We determined the distinct mobility patterns of natives and non-natives in all five large cities that we considered. We used a mixed method to assign different algorithms to natives and non-natives, which greatly improved the accuracy of location prediction compared with the basic algorithms. We also propose so-called indigenization coefficients to quantify the extent to which an individual behaves like a native, which depends only on their check-in behavior, rather than requiring demographic information. Surprisingly, the hybrid algorithm weighted using the indigenization coefficients outperformed a mixed algorithm that used additional demographic information, suggesting the advantage of behavioral data in characterizing individual mobility compared with the demographic information. The present location prediction algorithms can find applications in urban planning, traffic forecasting, mobile recommendation, and so on.
- As one of major challenges, cold-start problem plagues nearly all recommender systems. In particular, new items will be overlooked, impeding the development of new products online. Given limited resources, how to utilize the knowledge of recommender systems and design efficient marketing strategy for new items is extremely important. In this paper, we convert this ticklish issue into a clear mathematical problem based on a bipartite network representation. Under the most widely used algorithm in real e-commerce recommender systems, so-called the item-based collaborative filtering, we show that to simply push new items to active users is not a good strategy. To our surprise, experiments on real recommender systems indicate that to connect new items with some less active users will statistically yield better performance, namely these new items will have more chance to appear in other users' recommendation lists. Further analysis suggests that the disassortative nature of recommender systems contributes to such observation. In a word, getting in-depth understanding on recommender systems could pave the way for the owners to popularize their cold-start products with low costs.
- Link prediction plays an important role in understanding intrinsic evolving mechanisms of networks. With the belief that the likelihood of the existence of a link between two nodes is strongly related with their similarity, many methods have been proposed to calculate node similarity based on node attributes and/or topological structures. Among a large variety of methods that take into account paths connecting the target pair of nodes, most of which neglect the heterogeneity of those paths. Our hypothesis is that a path consisting of small-degree nodes provides a strong evidence of similarity between two ends, accordingly, we propose a so-called sig- nificant path index in this Letter to leverage intermediate nodes' degrees in similarity calculation. Empirical experiments on twelve disparate real networks demonstrate that the proposed index outperforms the mainstream link prediction baselines.
- Feb 26 2014 cs.IR arXiv:1402.6132v1With the rapid growth of the Internet and overwhelming amount of information that people are confronted with, recommender systems have been developed to effiectively support users' decision-making process in online systems. So far, much attention has been paid to designing new recommendation algorithms and improving existent ones. However, few works considered the different contributions from different users to the performance of a recommender system. Such studies can help us improve the recommendation efficiency by excluding irrelevant users. In this paper, we argue that in each online system there exists a group of core users who carry most of the information for recommendation. With them, the recommender systems can already generate satisfactory recommendation. Our core user extraction method enables the recommender systems to achieve 90% of the accuracy by taking only 20% of the data into account.
- Feb 25 2014 cs.IR arXiv:1402.5774v1Recent decade has witnessed the increasing popularity of recommender systems, which help users acquire relevant commodities and services from overwhelming resources on Internet. Some simple physical diffusion processes have been used to design effective recommendation algorithms for user-object bipartite networks, typically mass diffusion (MD) and heat conduction (HC) algorithms which have different advantages respectively on accuracy and diversity. In this paper, we investigate the effect of weight assignment in the hybrid of MD and HC, and find that a new hybrid algorithm of MD and HC with balanced weights will achieve the optimal recommendation results, we name it balanced diffusion (BD) algorithm. Numerical experiments on three benchmark data sets, MovieLens, Netflix and RateYourMusic (RYM), show that the performance of BD algorithm outperforms the existing diffusion-based methods on the three important recommendation metrics, accuracy, diversity and novelty. Specifically, it can not only provide accurately recommendation results, but also yield higher diversity and novelty in recommendations by accurately recommending unpopular objects.
- Recently, Press and Dyson have proposed a new class of probabilistic and conditional strategies for the two-player iterated Prisoner's Dilemma, so-called zero-determinant strategies. A player adopting zero-determinant strategies is able to pin the expected payoff of the opponents or to enforce a linear relationship between his own payoff and the opponents' payoff, in a unilateral way. This paper considers zero-determinant strategies in the iterated public goods game, a representative multi-player evolutionary game where in each round each player will choose whether or not put his tokens into a public pot, and the tokens in this pot are multiplied by a factor larger than one and then evenly divided among all players. The analytical and numerical results exhibit a similar yet different scenario to the case of two-player games: (i) with small number of players or a small multiplication factor, a player is able to unilaterally pin the expected total payoff of all other players; (ii) a player is able to set the ratio between his payoff and the total payoff of all other players, but this ratio is limited by an upper bound if the multiplication factor exceeds a threshold that depends on the number of players.
- Jan 20 2014 cs.SY arXiv:1401.4335v1Some necessary and sufficient conditions are obtained for the controllability and observability of a networked system with linear time invariant (LTI) dynamics. The topology of this system is fixed but arbitrary, and every subsystem is permitted to have different dynamic input-output relations. These conditions essentially depend only on transmission zeros of every subsystem and the connection matrix among subsystems, which makes them attractive in the analysis and synthesis of a large scale networked system. As an application, these conditions are utilized to characterize systems whose steady state estimation accuracy with the distributed predictor developed in (Zhou, 2013) is equal to that of the lumped Kalman filter. Some necessary and sufficient conditions on system matrices are derived for this equivalence. It has been made clear that to guarantee this equivalence, the steady state update gain matrix of the Kalman filter must be block diagonal.
- Ergodic properties and asymptotic stationarity are investigated in this paper for the pseudo-covariance matrix (PCM) of a recursive state estimator which is robust against parametric uncertainties and is based on plant output measurements that may be randomly dropped. When the measurement dropping process is described by a Markov chain and the modified plant is both controllable and observable, it is proved that if the dropping probability is less than 1, this PCM converges to a stationary distribution that is independent of its initial values. A convergence rate is also provided. In addition, it has also been made clear that when the initial value of the PCM is set to the stabilizing solution of the algebraic Riccati equation related to the robust state estimator without measurement dropping, this PCM converges to an ergodic process. Based on these results, two approximations are derived for the probability distribution function of the stationary PCM, as well as a bound of approximation errors. A numerical example is provided to illustrate the obtained theoretical results.
- Jan 17 2014 cs.SY arXiv:1401.4020v1A recursive state estimation procedure is derived for a linear time varying system with both parametric uncertainties and stochastic measurement droppings. This estimator has a similar form as that of the Kalman filter with intermittent observations, but its parameters should be adjusted when a plant output measurement arrives. A new recursive form is derived for the pseudo-covariance matrix of estimation errors, which plays important roles in analyzing its asymptotic properties. Based on a Riemannian metric for positive definite matrices, some necessary and sufficient conditions have been obtained for the strict contractiveness of an iteration of this recursion. It has also been proved that under some controllability and observability conditions, as well as some weak requirements on measurement arrival probability, the gain matrix of this recursive robust state estimator converges in probability one to a stationary distribution. Numerical simulation results show that estimation accuracy of the suggested procedure is more robust against parametric modelling errors than the Kalman filter.
- To solve the problem that the low capacity in hot-spots and coverage holes of conventional cellular networks, the base stations (BSs) having lower transmit power are deployed to form heterogeneous cellular networks (HetNets). However, because of these introduced disparate power BSs, the user distributions among them looked fairly unbalanced if an appropriate user association scheme hasn't been provided. For effectively tackling this problem, we jointly consider the load of each BS and user's achievable rate instead of only utilizing the latter when designing an association algorithm, and formulate it as a network-wide weighted utility maximization problem. Note that, the load mentioned above relates to the amount of required subbands decided by actual rate requirements, i.e., QoS, but the number of associated users, thus it can reflect user's actual load level. As for the proposed problem, we give a maximum probability (max-probability) algorithm by relaxing variables as well as a low-complexity distributed algorithm with a near-optimal solution that provides a theoretical performance guarantee. Experimental results show that, compared with the association strategy advocated by Ye, our strategy has a speeder convergence rate, a lower call blocking probability and a higher load balancing level.
- Nov 26 2013 physics.soc-ph cs.SI arXiv:1311.5932v1Understanding spreading dynamics will benefit society as a whole in better preventing and controlling diseases, as well as facilitating the socially responsible information while depressing destructive rumors. In network-based spreading dynamics, edges with different weights may play far different roles: a friend from afar usually brings novel stories, and an intimate relationship is highly risky for a flu epidemic. In this article, we propose a weighted susceptible-infected-susceptible model on complex networks, where the weight of an edge is defined by the topological proximity of the two associated nodes. Each infected individual is allowed to select limited number of neighbors to contact, and a tunable parameter is introduced to control the preference to contact through high-weight or low-weight edges. Experimental results on six real networks show that the epidemic prevalence can be largely promoted when strong ties are favored in the spreading process. By comparing with two statistical null models respectively with randomized topology and randomly redistributed weights, we show that the distribution pattern of weights, rather than the topology, mainly contributes to the experimental observations. Further analysis suggests that the weight-weight correlation strongly affects the results: high-weight edges are more significant in keeping high epidemic prevalence when the weight-weight correlation is present.
- Learning big data by matrix decomposition always suffers from expensive computation, mixing of complicated structures and noise. In this paper, we study more adaptive models and efficient algorithms that decompose a data matrix as the sum of semantic components with incoherent structures. We firstly introduce "GO decomposition (GoDec)", an alternating projection method estimating the low-rank part $L$ and the sparse part $S$ from data matrix $X=L+S+G$ corrupted by noise $G$. Two acceleration strategies are proposed to obtain scalable unmixing algorithm on big data: 1) Bilateral random projection (BRP) is developed to speed up the update of $L$ in GoDec by a closed-form built from left and right random projections of $X-S$ in lower dimensions; 2) Greedy bilateral (GreB) paradigm updates the left and right factors of $L$ in a mutually adaptive and greedy incremental manner, and achieve significant improvement in both time and sample complexities. Then we proposes three nontrivial variants of GoDec that generalizes GoDec to more general data type and whose fast algorithms can be derived from the two strategies......
- In this paper, we build up a framework for sparse interpolation. We first investigate the theoretical limit of the number of unisolvent points for sparse interpolation under a general setting and try to answer some basic questions of this topic. We also explore the relation between classical interpolation and sparse interpolation. We second consider the design of the interpolation points for the $s$-sparse functions in high dimensional Chebyshev bases, for which the possible applications include uncertainty quantification, numerically solving stochastic or parametric PDEs and compressed sensing. Unlike the traditional random sampling method, we present in this paper a deterministic method to produce the interpolation points, and show its performance with $\ell_1$ minimization by analyzing the mutual incoherence of the interpolation matrix. Numerical experiments show that the deterministic points have a similar performance with that of the random points.