results for au:Oh_S in:cs

- In this paper, we propose a novel maximum causal Tsallis entropy (MCTE) framework for imitation learning which can efficiently learn a sparse multi-modal policy distribution from demonstrations. We provide a full mathematical analysis of the proposed framework. First, the optimal solution of an MCTE problem is shown to be a sparsemax distribution, whose supporting set can be adjusted. The proposed method has an advantage over a softmax distribution in that it can exclude unnecessary actions by assigning them zero probability. Second, we prove that an MCTE problem is equivalent to robust Bayes estimation in the sense of the Brier score. Third, we propose a maximum causal Tsallis entropy imitation learning (MCTEIL) algorithm with a sparse mixture density network (sparse MDN), in which mixture weights are modeled with a sparsemax distribution. In particular, we show that the causal Tsallis entropy of an MDN encourages exploration and efficient mixture utilization, whereas Boltzmann-Gibbs entropy is less effective. We validate the proposed method in two simulation studies, where MCTEIL outperforms existing imitation learning methods in terms of average returns and in learning multi-modal policies.
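For reference, the sparsemax distribution central to this framework has a simple closed form (Martins & Astudillo, 2016): a Euclidean projection of the score vector onto the probability simplex via thresholding. A minimal NumPy sketch, not the authors' implementation:

```python
import numpy as np

def sparsemax(z):
    # Project scores z onto the probability simplex (Martins & Astudillo, 2016).
    # Unlike softmax, the result can assign exactly zero probability to
    # low-scoring actions, which is what yields sparse multi-modal policies.
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, len(z) + 1)
    support = 1 + k * z_sorted > cumsum    # largest k with 1 + k*z_(k) > top-k sum
    k_max = k[support][-1]
    tau = (cumsum[k_max - 1] - 1) / k_max  # threshold
    return np.maximum(z - tau, 0.0)

# One clearly dominated action receives exactly zero mass:
print(sparsemax([2.0, 1.9, 0.1]))  # [0.55 0.45 0.  ]
```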
- Machine learning techniques are widely used by online services (e.g. Google, Apple) to analyze and make predictions on user data. As many of the provided services are user-centric (e.g. personal photo collections, speech recognition, personal assistance), user data generated on personal devices is key to providing the service. In order to protect the data and the privacy of the user, federated learning techniques have been proposed, where the data never leaves the user's device and "only" model updates are communicated back to the server. In our work, we propose a new threat model that is not concerned with learning about the content, but rather with the linkability of users during such decentralized learning scenarios. We show that model updates are characteristic of users and therefore lend themselves to linkability attacks. We show identification and matching of users across devices in closed- and open-world scenarios. In our experiments, we find our attacks to be highly effective, achieving 20x-175x chance-level performance. In order to mitigate the risks of linkability attacks, we study various strategies. As adding random noise does not offer convincing operating points, we propose strategies based on using calibrated domain-specific data; we find that these strategies offer substantial protection against linkability threats with little effect on utility.
- Mar 23 2018 cs.CV arXiv:1803.08190v1: In this paper, we address the problem of estimating a 3D human pose from a single image, which is important but difficult to solve due to many factors, such as self-occlusions, wild appearance changes, and the inherent ambiguity of inferring 3D from a 2D cue. These difficulties make the problem ill-posed, which has led to increasingly complex estimators being required to enhance performance. However, most existing methods try to handle this problem with a single complex estimator, which might not be a good solution. In this paper, to resolve this issue, we propose a multiple-partial-hypothesis-based framework for estimating 3D human pose from a single image, which can be fine-tuned in an end-to-end fashion. We first select several joint groups from a human joint model using the proposed sampling scheme, and estimate the 3D pose of each joint group separately using deep neural networks. The estimates are then aggregated to obtain the final 3D pose using the proposed robust optimization formulation. The overall procedure can be fine-tuned in an end-to-end fashion, resulting in better performance. In the experiments, the proposed framework achieves state-of-the-art performance on popular benchmark datasets, namely Human3.6M and HumanEva, which demonstrates its effectiveness.
- Recently popularized graph neural networks achieve state-of-the-art accuracy on a number of standard benchmark datasets for graph-based semi-supervised learning, improving significantly over existing approaches. These architectures alternate between a propagation layer that aggregates the hidden states of the local neighborhood and a fully-connected layer. Perhaps surprisingly, we show that a linear model that removes all the intermediate fully-connected layers is still able to achieve performance comparable to the state-of-the-art models. This significantly reduces the number of parameters, which is critical for semi-supervised learning, where the number of labeled examples is small. This in turn leaves room for designing more innovative propagation layers. Based on this insight, we propose a novel graph neural network that removes all the intermediate fully-connected layers and replaces the propagation layers with attention mechanisms that respect the structure of the graph. The attention mechanism allows us to learn a dynamic and adaptive local summary of the neighborhood to achieve more accurate predictions. In a number of experiments on benchmark citation network datasets, we demonstrate that our approach outperforms competing methods. By examining the attention weights among neighbors, we show that our model provides some interesting insights into how neighbors influence each other.
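As a rough illustration of an attention-weighted propagation layer that respects graph structure, here is a sketch; the additive score parametrization and all shapes are illustrative assumptions, not the paper's exact architecture:

```python
import numpy as np

def attention_propagate(H, adj, a):
    """One attention-weighted propagation step over a graph.

    H: (n, d) node states; adj: (n, n) adjacency with self-loops;
    a: (d,) attention parameter (a hypothetical parametrization).
    Each node aggregates its neighbours, weighted by attention
    scores normalized over the neighbourhood.
    """
    scores = H @ a                                # per-node relevance
    logits = scores[:, None] + scores[None, :]    # additive pairwise scores
    logits = np.where(adj > 0, logits, -np.inf)   # attend to neighbours only
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    w = np.exp(logits)
    w /= w.sum(axis=1, keepdims=True)             # softmax per neighbourhood
    return w @ H                                  # weighted aggregation

rng = np.random.default_rng(0)
n, d = 5, 4
H = rng.normal(size=(n, d))
adj = np.eye(n) + np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
print(attention_propagate(H, adj, rng.normal(size=d)).shape)  # (5, 4)
```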
- The Split Packing algorithm \cite{splitpacking_ws, splitpacking} is an offline algorithm that packs a set of circles into shapes (triangles and squares) at an optimal packing density. In this paper, we develop an online alternative to Split Packing that handles an online sequence of insertions and deletions, where the algorithm is allowed to reallocate circles into new positions at a cost proportional to their areas. The algorithm can be used to pack circles into squares and right-angled triangles. If only insertions are considered, our algorithm is also able to achieve optimal packing density, with an amortised reallocation cost of $O(c\log \frac{1}{c})$ for squares, and $O(c(1+s^2)\log_{1+s^2}\frac{1}{c})$ for right-angled triangles, where $s$ is the ratio of the length of the second-shortest side to that of the shortest, when inserting a circle of area $c$. When both insertions and deletions are considered, we achieve a packing density of $(1-\epsilon)$ of the optimal, where $\epsilon>0$ can be made arbitrarily small, with an amortised reallocation cost of $O(c(1+s^2)\log_{1+s^2}\frac{1}{c} + c\frac{1}{\epsilon})$.
- Jan 15 2018 cs.CV arXiv:1801.04102v1: Single-image reflection separation is an ill-posed problem, since two scenes, a transmitted scene and a reflected scene, need to be inferred from a single observation. To make the problem tractable, in this work we assume that the categories of the two scenes are known. This allows us to address the problem by generating both scenes so that they belong to the given categories while their contents are constrained to match the observed image. A novel network architecture is proposed to render realistic images of both scenes based on adversarial learning. The network can be trained in a weakly supervised manner, i.e., it learns to separate an observed image without corresponding ground-truth images of the transmission and reflection scenes, which are difficult to collect in practice. Experimental results on real and synthetic datasets demonstrate that the proposed algorithm performs favorably against existing methods.
- Given multi-platform genome data with prior knowledge of functional gene sets, how can we extract interpretable latent relationships between patients and genes? More specifically, how can we devise a tensor factorization method that produces an interpretable gene factor matrix based on gene set information while maintaining decomposition quality and speed? We propose GIFT, a Guided and Interpretable Factorization for Tensors. GIFT provides interpretable factor matrices by encoding prior knowledge as a regularization term in its objective function. Experimental results demonstrate that GIFT produces interpretable factorizations with high scalability and accuracy, while other methods lack interpretability. We apply GIFT to the PanCan12 dataset, where it reveals significant relations between cancers, gene sets, and genes, such as influential gene sets for specific cancers (e.g., the interferon-gamma response gene set for ovarian cancer) or relations between cancers and genes (e.g., BRCA cancer - APOA1 gene, and OV and UCEC cancers - BST2 gene).
- Generative adversarial networks (GANs) are innovative techniques for learning generative models of complex data distributions from samples. Despite remarkable recent improvements in generating realistic images, one of their major shortcomings is the fact that in practice, they tend to produce samples with little diversity, even when trained on diverse datasets. This phenomenon, known as mode collapse, has been the main focus of several recent advances in GANs. Yet there is little understanding of why mode collapse happens and why existing approaches are able to mitigate it. We propose a principled approach to handling mode collapse, which we call packing. The main idea is to modify the discriminator to make decisions based on multiple samples from the same class, either real or artificially generated. We borrow analysis tools from binary hypothesis testing---in particular the seminal result of Blackwell [Bla53]---to prove a fundamental connection between packing and mode collapse. We show that packing naturally penalizes generators with mode collapse, thereby favoring generator distributions with less mode collapse during the training process. Numerical experiments on benchmark datasets suggest that packing provides significant improvements in practice as well.
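The packing operation itself is a small change on the data side: the discriminator classifies m samples jointly rather than one at a time. A framework-agnostic NumPy sketch of that reshaping:

```python
import numpy as np

def pack(samples, m):
    """Group a batch into packed discriminator inputs of m samples each.

    samples: (n, d) array drawn from a single source (all real or all fake).
    Returns (n // m, m * d). Judging m i.i.d. samples jointly exposes mode
    collapse: a collapsed generator yields suspiciously similar packs.
    """
    n, d = samples.shape
    n_packed = n // m
    return samples[: n_packed * m].reshape(n_packed, m * d)

rng = np.random.default_rng(1)
fake = rng.normal(size=(128, 16))   # a batch of generated samples
print(pack(fake, m=4).shape)        # (32, 64): rows go to the discriminator
```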
- Recently, there have been increasing demands to construct compact deep architectures to remove unnecessary redundancy and to improve the inference speed. While many recent works focus on reducing the redundancy by eliminating unneeded weight parameters, it is not possible to apply a single deep architecture for multiple devices with different resources. When a new device or circumstantial condition requires a new deep architecture, it is necessary to construct and train a new network from scratch. In this work, we propose a novel deep learning framework, called a nested sparse network, which exploits an n-in-1-type nested structure in a neural network. A nested sparse network consists of multiple levels of networks with a different sparsity ratio associated with each level, and higher level networks share parameters with lower level networks to enable stable nested learning. The proposed framework realizes a resource-aware versatile architecture as the same network can meet diverse resource requirements. Moreover, the proposed nested network can learn different forms of knowledge in its internal networks at different levels, enabling multiple tasks using a single network, such as coarse-to-fine hierarchical classification. In order to train the proposed nested sparse network, we propose efficient weight connection learning and channel and layer scheduling strategies. We evaluate our network in multiple tasks, including adaptive deep compression, knowledge distillation, and learning class hierarchy, and demonstrate that nested sparse networks perform competitively, but more efficiently, compared to existing methods.
- As more and more personal photos are shared online, being able to obfuscate identities in such photos is becoming a necessity for privacy protection. People have largely resorted to blacking out or blurring head regions, but these approaches result in a poor user experience while being surprisingly ineffective against state-of-the-art person recognizers. In this work, we propose a novel head inpainting obfuscation technique. Generating a realistic head inpainting in social media photos is challenging because subjects appear in diverse activities and head orientations. We thus split the task into two sub-tasks: (1) facial landmark generation from the image context (e.g. body pose) to seamlessly hypothesize a sensible head pose, and (2) facial-landmark-conditioned head inpainting. We verify that our inpainting method generates realistic person images while achieving superior obfuscation performance against automatic person recognizers.
- Recent advances in learning Deep Neural Network (DNN) architectures have received a great deal of attention due to their ability to outperform state-of-the-art classifiers across a wide range of applications, with little or no feature engineering. In this paper, we broadly study the applicability of deep learning to website fingerprinting. We show that unsupervised DNNs can be used to extract low-dimensional feature vectors that improve the performance of state-of-the-art website fingerprinting attacks. When used as classifiers, we show that they can match or exceed performance of existing attacks across a range of application scenarios, including fingerprinting Tor website traces, fingerprinting search engine queries over Tor, defeating fingerprinting defenses, and fingerprinting TLS-encrypted websites. Finally, we show that DNNs can be used to predict the fingerprintability of a website based on its contents, achieving 99% accuracy on a data set of 4500 website downloads.
- Many deployed learned models are black boxes: given an input, they return an output. Internal information about the model, such as the architecture, optimisation procedure, or training data, is not disclosed explicitly, as it might contain proprietary information or make the system more vulnerable. This work shows that such attributes of neural networks can be exposed from a sequence of queries. This has multiple implications. On the one hand, our work exposes the vulnerability of black-box neural networks to different types of attacks: we show that the revealed internal information helps generate more effective adversarial examples against the black-box model. On the other hand, this technique can be used for better protection of private content from automatic recognition models using adversarial examples. Our paper suggests that it is actually hard to draw a line between white-box and black-box models.
- Across a variety of scientific disciplines, sparse inverse covariance estimation is a popular tool for capturing the underlying dependency relationships in multivariate data. Unfortunately, most estimators are not scalable enough to handle the sizes of modern high-dimensional data sets (often on the order of terabytes), and assume Gaussian samples. To address these deficiencies, we introduce HP-CONCORD, a highly scalable optimization method for estimating a sparse inverse covariance matrix based on a regularized pseudolikelihood framework, without assuming Gaussianity. Our parallel proximal gradient method uses a novel communication-avoiding linear algebra algorithm and runs across a multi-node cluster with up to 1k nodes (24k cores), achieving parallel scalability on problems with up to ~819 billion parameters (1.28 million dimensions); even on a single node, HP-CONCORD demonstrates scalability, outperforming a state-of-the-art method. We also use HP-CONCORD to estimate the underlying dependency structure of the brain from fMRI data, and use the result to identify functional regions automatically. The results show good agreement with a clustering from the neuroscience literature.
- In this paper, we propose a generative model which learns the relationship between language and human action in order to generate a human action sequence given a sentence describing human behavior. The proposed generative model is a generative adversarial network (GAN), which is based on the sequence to sequence (SEQ2SEQ) model. Using the proposed generative network, we can synthesize various actions for a robot or a virtual agent using a text encoder recurrent neural network (RNN) and an action decoder RNN. The proposed generative network is trained from 29,770 pairs of actions and sentence annotations extracted from MSR-Video-to-Text (MSR-VTT), a large-scale video dataset. We demonstrate that the network can generate human-like actions which can be transferred to a Baxter robot, such that the robot performs an action based on a provided sentence. Results show that the proposed generative network correctly models the relationship between language and action and can generate a diverse set of actions from the same sentence.
- Oct 11 2017 cs.CV arXiv:1710.03224v1: People nowadays share large parts of their personal lives through social media. Being able to automatically recognise people in personal photos may greatly enhance user convenience by easing photo album organisation. For the human identification task, however, the traditional focus of computer vision has been face recognition and pedestrian re-identification. Person recognition in social media photos sets new challenges for computer vision, including non-cooperative subjects (e.g. backward viewpoints, unusual poses) and great changes in appearance. To tackle this problem, we build a simple person recognition framework that leverages convnet features from multiple image regions (head, body, etc.). We propose new recognition scenarios that focus on the time and appearance gap between training and testing samples. We present an in-depth analysis of the importance of different features according to time and viewpoint generalisability. In the process, we verify that our simple approach achieves the state-of-the-art result on the PIPA benchmark, arguably the largest social-media-based benchmark for person recognition to date, with diverse poses, viewpoints, social groups, and events. Compared to the conference version of the paper, this paper additionally presents (1) analysis of a face recogniser (DeepID2+), (2) a new method, naeil2, that combines the conference-version method naeil and DeepID2+ to achieve state-of-the-art results even compared to post-conference works, (3) a discussion of related work since the conference version, (4) additional analysis including the head-viewpoint-wise breakdown of performance, and (5) results on the open-world setup.
- Given sparse multi-dimensional data (e.g., (user, movie, time; rating) for movie recommendations), how can we discover latent concepts/relations and predict missing values? Tucker factorization has been widely used to solve such problems with multi-dimensional data, which are modeled as tensors. However, most Tucker factorization algorithms regard and estimate missing entries as zeros, which leads to a highly inaccurate decomposition. Moreover, the few methods that focus on accuracy exhibit limited scalability, since they require huge memory and heavy computational costs while updating factor matrices. In this paper, we propose P-Tucker, a scalable Tucker factorization method for sparse tensors. P-Tucker performs alternating least squares with a row-wise update rule in a fully parallel way, which significantly reduces the memory requirements for updating factor matrices. Furthermore, we offer two variants of P-Tucker: a caching algorithm, P-Tucker-Cache, and an approximation algorithm, P-Tucker-Approx, both of which accelerate the update process. Experimental results show that P-Tucker exhibits 1.7-14.1x speed-up and 1.4-4.8x lower error compared to the state-of-the-art. In addition, P-Tucker scales near-linearly with the number of observable entries in a tensor and the number of threads. Thanks to P-Tucker, we successfully discover hidden concepts and relations in a large-scale real-world tensor, while existing methods cannot reveal latent features due to their limited scalability or low accuracy.
- In this paper, a sparse Markov decision process (MDP) with a novel causal sparse Tsallis entropy regularization is proposed. The proposed policy regularization induces a sparse and multi-modal optimal policy distribution for a sparse MDP. A full mathematical analysis of the proposed sparse MDP is provided. We first analyze the optimality condition of a sparse MDP. We then propose a sparse value iteration method that solves a sparse MDP, and prove the convergence and optimality of sparse value iteration using the Banach fixed-point theorem. The proposed sparse MDP is compared to soft MDPs, which utilize causal entropy regularization. We show that the performance error of a sparse MDP has a constant bound, while the error of a soft MDP increases logarithmically with the number of actions, where this performance error is caused by the introduced regularization term. In experiments, we apply sparse MDPs to reinforcement learning problems, and the proposed method outperforms existing methods in terms of convergence speed and performance.
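For intuition, here is a toy sketch of sparse value iteration, where a sparsemax-style backup replaces the hard max (a soft MDP would use log-sum-exp instead). The operator below follows the standard sparsemax threshold construction at unit temperature; it is a simplification, not the paper's exact formulation:

```python
import numpy as np

def spmax(z):
    """Sparse analogue of log-sum-exp: equals max(z) when one action
    dominates, and smoothly interpolates over a sparse support otherwise."""
    z_sorted = np.sort(z)[::-1]
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, len(z) + 1)
    k_max = k[1 + k * z_sorted > cumsum][-1]
    tau = (cumsum[k_max - 1] - 1) / k_max       # sparsemax threshold
    sup = z > tau                               # support of the sparse policy
    return 0.5 * np.sum(z[sup] ** 2 - tau ** 2) + 0.5

def sparse_value_iteration(P, R, gamma=0.9, iters=200):
    """P: (A, S, S) transition tensor, R: (S, A) rewards."""
    A, S, _ = P.shape
    V = np.zeros(S)
    for _ in range(iters):
        Q = R + gamma * np.einsum("ast,t->sa", P, V)   # Bellman backup
        V = np.array([spmax(Q[s]) for s in range(S)])  # sparse max over actions
    return V, Q

rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(3), size=(2, 3))  # 2 actions, 3 states
V, Q = sparse_value_iteration(P, rng.normal(size=(3, 2)))
print(V)
```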
- Estimating mutual information from observed samples is a basic primitive, useful in several machine learning tasks including correlation mining, information bottleneck clustering, learning a Chow-Liu tree, and conditional independence testing in (causal) graphical models. While mutual information is a well-defined quantity in general probability spaces, existing estimators can only handle two special cases of purely discrete or purely continuous pairs of random variables. The main challenge is that these methods first estimate the (differential) entropies of X, Y and the pair (X;Y) and add them up with appropriate signs to get an estimate of the mutual information. These 3H-estimators cannot be applied in general mixture spaces, where entropy is not well-defined. In this paper, we design a novel estimator for mutual information of discrete-continuous mixtures. We prove that the proposed estimator is consistent. We provide numerical experiments suggesting superiority of the proposed estimator compared to other heuristics of adding small continuous noise to all the samples and applying standard estimators tailored for purely continuous variables, and quantizing the samples and applying standard estimators tailored for purely discrete variables. This significantly widens the applicability of mutual information estimation in real-world applications, where some variables are discrete, some continuous, and others are a mixture between continuous and discrete components.
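A compact sketch of a fixed-k nearest-neighbor estimator in this spirit, where the key departure from purely continuous estimators is the treatment of samples whose k-NN distance is zero (discrete mass); this follows our reading of the abstract and is not the paper's reference implementation:

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma

def mixed_mi(x, y, k=5):
    """MI estimate for discrete-continuous mixtures. x: (n, dx), y: (n, dy)."""
    n = len(x)
    xy = np.hstack([x, y])
    t_xy, t_x, t_y = cKDTree(xy), cKDTree(x), cKDTree(y)
    rho = t_xy.query(xy, k=k + 1, p=np.inf)[0][:, -1]  # k-NN distance (max-norm)
    est = 0.0
    for i in range(n):
        if rho[i] == 0.0:
            # Discrete point: replace k by the number of coinciding samples.
            k_i = len(t_xy.query_ball_point(xy[i], 0.0, p=np.inf)) - 1
        else:
            k_i = k
        nx = len(t_x.query_ball_point(x[i], rho[i], p=np.inf)) - 1
        ny = len(t_y.query_ball_point(y[i], rho[i], p=np.inf)) - 1
        est += digamma(k_i) + np.log(n) - np.log(nx + 1) - np.log(ny + 1)
    return est / n

rng = np.random.default_rng(0)
x = rng.integers(0, 3, size=1000).astype(float)[:, None]  # discrete X
y = x + rng.normal(size=(1000, 1))                        # continuous Y
print(mixed_mi(x, y))  # positive; 3H-estimators are undefined on such mixtures
```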
- Sep 12 2017 cs.RO arXiv:1709.03211v1: In this paper, we present a novel nonparametric motion flow model that effectively describes a motion trajectory of a human, and its application to human-robot cooperation. To this end, a motion flow similarity measure that considers both the spatial and temporal properties of a trajectory is proposed, utilizing the mean and variance functions of a Gaussian process. We also present a human-robot cooperation method using the proposed motion flow model. Given a set of interacting trajectories of two workers, the underlying reward function of the cooperating behaviors is optimized using the learned motion description as an input to the reward function, and a stochastic trajectory optimization method is used to control the robot. The presented human-robot cooperation method is compared with a state-of-the-art algorithm that utilizes a mixture of interaction primitives (MIP), in terms of the RMS error between generated and target trajectories. While the proposed method shows performance comparable to the MIP when a full observation of the human demonstrations is given, it shows superior performance when only partial trajectory information is given.
- In this paper, we propose an uncertainty-aware learning from demonstration method built on a novel uncertainty estimation method that utilizes a mixture density network, appropriate for modeling complex and noisy human behaviors. The proposed uncertainty acquisition can be done with a single forward pass, without Monte Carlo sampling, and is suitable for real-time robotics applications. The properties of the proposed uncertainty measure are analyzed through three different synthetic scenarios: absence of data, heavy measurement noise, and composition of functions. We show that each case can be distinguished using the proposed uncertainty measure, and present an uncertainty-aware learning from demonstration method for autonomous driving that exploits this property. The proposed uncertainty-aware learning from demonstration method outperforms the compared methods in terms of safety on a complex real-world driving dataset.
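The single-pass uncertainty extraction can be illustrated with the law of total variance applied to the mixture output of an MDN; whether this matches the paper's exact measure is an assumption of this sketch:

```python
import numpy as np

def mdn_uncertainty(pi, mu, sigma2):
    """Split an MDN's predictive variance in one forward pass.

    pi: (K,) mixture weights, mu: (K,) means, sigma2: (K,) variances.
    Law of total variance: Var[y] = E[Var] + Var[E], i.e. expected
    component variance (noise-like) plus disagreement between modes.
    """
    mean = np.dot(pi, mu)
    aleatoric = np.dot(pi, sigma2)            # E[Var]: measurement noise
    epistemic = np.dot(pi, (mu - mean) ** 2)  # Var[E]: mode disagreement
    return mean, aleatoric, epistemic

# Two well-separated modes: low noise term, high disagreement term.
print(mdn_uncertainty(np.array([0.5, 0.5]),
                      np.array([-1.0, 1.0]),
                      np.array([0.01, 0.01])))  # (0.0, 0.01, 1.0)
```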
- Many efficient algorithms with strong theoretical guarantees have been proposed for the contextual multi-armed bandit problem. However, applying these algorithms in practice can be difficult because they require domain expertise to build appropriate features and to tune their parameters. We propose a new method for the contextual bandit problem that is simple, practical, and can be applied with little or no domain expertise. Our algorithm relies on decision trees to model the context-reward relationship. Decision trees are non-parametric, interpretable, and work well without hand-crafted features. To guide the exploration-exploitation trade-off, we use a bootstrapping approach which abstracts Thompson sampling to non-Bayesian settings. We also discuss several computational heuristics and demonstrate the performance of our method on several datasets.
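A minimal sketch of the bootstrapped, tree-based exploration scheme described here; the class structure and hyperparameters are illustrative assumptions:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

class TreeBootstrapBandit:
    """Per-arm decision trees with bootstrap resampling for
    Thompson-style exploration (a sketch, not the paper's exact method)."""

    def __init__(self, n_arms):
        self.data = [([], []) for _ in range(n_arms)]  # (contexts, rewards)

    def select(self, x, rng):
        scores = []
        for contexts, rewards in self.data:
            if len(rewards) < 2:
                scores.append(np.inf)  # force initial exploration
                continue
            idx = rng.integers(0, len(rewards), len(rewards))  # bootstrap resample
            tree = DecisionTreeRegressor(max_depth=4)
            tree.fit(np.asarray(contexts)[idx], np.asarray(rewards)[idx])
            scores.append(tree.predict(x[None, :])[0])
        return int(np.argmax(scores))

    def update(self, arm, x, r):
        self.data[arm][0].append(x)
        self.data[arm][1].append(r)

rng = np.random.default_rng(0)
bandit = TreeBootstrapBandit(n_arms=2)
for t in range(200):
    x = rng.normal(size=3)
    arm = bandit.select(x, rng)
    r = float(x[0] > 0) if arm == 0 else float(x[0] <= 0)  # toy rewards
    bandit.update(arm, x, r)
```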
- A great deal of effort has gone into trying to model social influence --- including the spread of behavior, norms, and ideas --- on networks. Most models of social influence tend to assume that individuals react to changes in the states of their neighbors without any time delay, but this is often not true in social contexts, where (for various reasons) different agents can have different response times. To examine such situations, we introduce the idea of a timer into threshold models of social influence. The presence of timers on nodes delays the adoption --- i.e., change of state --- of each agent, which in turn delays the adoptions of its neighbors. With homogeneously-distributed timers, in which all nodes exhibit the same amount of delay, adoption delays are also homogeneous, so the adoption order of nodes remains the same. However, heterogeneously-distributed timers can change the adoption order of nodes and hence the "adoption paths" through which state changes spread in a network. Using a threshold model of social contagions, we illustrate that heterogeneous timers can either accelerate or decelerate the spread of adoptions compared to an analogous situation with homogeneous timers, and we investigate how such acceleration or deceleration depends on the timer distribution and the network structure. We derive an analytical approximation for the temporal evolution of the fraction of adopters by modifying a pair approximation of the Watts threshold model, and we find good agreement with numerical computations. We also examine our new timer model on networks constructed from empirical data.
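A small simulation sketch of a Watts-style threshold model with per-node timers; all parameter choices are illustrative:

```python
import networkx as nx
import numpy as np

def threshold_spread(G, theta, timers, seeds, T=50):
    """Threshold contagion where a node that crosses its adoption
    threshold waits out its timer before actually adopting."""
    state = {v: v in seeds for v in G}
    countdown = {}                      # nodes whose timer is running
    history = [sum(state.values())]
    for _ in range(T):
        for v in G:
            if state[v] or v in countdown or G.degree(v) == 0:
                continue
            if np.mean([state[u] for u in G[v]]) >= theta:
                countdown[v] = timers[v]        # threshold met: start timer
        for v in list(countdown):
            countdown[v] -= 1
            if countdown[v] <= 0:               # timer expired: adopt
                state[v] = True
                del countdown[v]
        history.append(sum(state.values()))
    return history

rng = np.random.default_rng(0)
G = nx.erdos_renyi_graph(200, 0.04, seed=0)
homog = threshold_spread(G, 0.1, {v: 3 for v in G}, {0, 1, 2})
heter = threshold_spread(G, 0.1, {v: int(rng.integers(1, 6)) for v in G}, {0, 1, 2})
half = G.number_of_nodes() // 2
print(next((t for t, c in enumerate(homog) if c >= half), -1),  # steps until half
      next((t for t, c in enumerate(heter) if c >= half), -1))  # the nodes adopt
```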
- When tracking user-specific online activities, each user's preference is revealed in the form of choices and comparisons. For example, a user's purchase history tracks her choices, i.e. which item was chosen among a subset of offerings. A user's comparisons are observed either explicitly, as in movie ratings, or implicitly, as in viewing times of news articles. Given such individualized ordinal data, we address the problem of collaboratively learning representations of the users and the items. The learned features can be used to predict a user's preference for an unseen item, for use in recommendation systems. They also allow one to compute similarities among users and items, for use in categorization and search. Motivated by the empirical successes of the MultiNomial Logit (MNL) model in marketing and transportation, and also by more recent successes in word embedding and crowdsourced image embedding, we pose this problem as learning the MNL model parameters that best explain the data. We propose a convex optimization for learning the MNL model, and show that it is minimax optimal up to a logarithmic factor by comparing its performance to a fundamental lower bound. This characterizes the minimax sample complexity of the problem, and proves that the proposed estimator cannot be improved upon other than by a logarithmic factor. Further, the analysis identifies how the accuracy depends on the topology of sampling via the spectrum of the sampling graph. This provides a guideline for designing surveys when one can choose which items are to be compared. This is accompanied by numerical simulations on synthetic and real datasets confirming our theoretical predictions.
- Apr 06 2017 cs.CV arXiv:1704.01518v1: Learning how to generate descriptions of images or videos has received major interest in both the Computer Vision and Natural Language Processing communities. While a few works have proposed to learn a grounding during the generation process in an unsupervised way (via an attention mechanism), it remains unclear how good the quality of the grounding is and whether it benefits the description quality. In this work we propose a movie description model which learns to generate descriptions and jointly ground (localize) the mentioned characters as well as perform visual co-reference resolution between pairs of consecutive sentences/clips. We also propose to use weak localization supervision through character mentions provided in movie descriptions to learn the character grounding. At training time, we first learn how to localize characters by relating their visual appearance to mentions in the descriptions via a semi-supervised approach. We then provide this (noisy) supervision to our description model, which greatly improves its performance. Our proposed description model improves over prior work w.r.t. generated description quality and additionally provides grounding and local co-reference resolution. We evaluate it on the MPII Movie Description dataset using automatic and human evaluation measures and using our newly collected grounding and co-reference data for characters.
- Apr 03 2017 cs.CV arXiv:1703.10730v2: We introduce a new problem of generating an image based on a small number of key local patches without any geometric prior. In this work, key local patches are defined as informative regions of the target object or scene. This is a challenging problem since it requires generating realistic images and predicting locations of parts at the same time. We construct adversarial networks to tackle this problem. A generator network generates a fake image as well as a mask based on the encoder-decoder framework. On the other hand, a discriminator network aims to detect fake images. The network is trained with three losses to consider spatial, appearance, and adversarial information. The spatial loss determines whether the locations of predicted parts are correct. Input patches are restored in the output image without much modification due to the appearance loss. The adversarial loss ensures output images are realistic. The proposed network is trained without supervisory signals since no labels of key parts are required. Experimental results on six datasets demonstrate that the proposed algorithm performs favorably on challenging objects and scenes.
- Users like sharing personal photos with others through social media. At the same time, they might want to make automatic identification in such photos difficult or even impossible. Classic obfuscation methods such as blurring are not only unpleasant but also not as effective as one would expect. Recent studies on adversarial image perturbations (AIP) suggest that it is possible to confuse recognition systems effectively without unpleasant artifacts. However, in the presence of countermeasures against AIPs, it is unclear how effective AIPs would be, in particular when the choice of countermeasure is unknown. Game theory provides tools for studying the interaction between agents with uncertainties in their strategies. We introduce a general game-theoretical framework for the user-recogniser dynamics, and present a case study that involves current state-of-the-art AIP and person recognition techniques. We derive the optimal strategy for the user that assures an upper bound on the recognition rate independent of the recogniser's countermeasure. Code is available at https://goo.gl/hgvbNK.
- The singular values of data in matrix form provide insight into the structure of the data, its effective dimensionality, and the choice of hyper-parameters for higher-level data analysis tools. However, in many practical applications, such as collaborative filtering and network analysis, we only get a partial observation. Under such scenarios, we consider the fundamental problem of recovering spectral properties of the underlying matrix from a sampling of its entries. We are particularly interested in directly recovering the spectrum, which is the set of singular values, and also in sample-efficient approaches for recovering a spectral sum function, which is an aggregate sum of the same function applied to each of the singular values. We propose first estimating the Schatten $k$-norms of a matrix, and then applying Chebyshev approximation to the spectral sum function, or applying moment matching in Wasserstein distance to recover the singular values. The main technical challenge is in accurately estimating the Schatten norms from a sampling of a matrix. We introduce a novel unbiased estimator based on counting small structures in a graph, and provide guarantees that match its empirical performance. Our theoretical analysis shows that Schatten norms can be recovered accurately from a strictly smaller number of samples than is needed to recover the underlying low-rank matrix. Numerical experiments suggest that we significantly improve upon a competing approach based on matrix completion methods.
- Crowdsourcing platforms have emerged as popular venues for purchasing human intelligence at low cost for large volumes of tasks. As many low-paid workers are prone to giving noisy answers, one of the fundamental questions is how to identify the more reliable workers and exploit this heterogeneity to infer the true answers accurately. Despite significant research efforts on classification tasks with discrete answers, little attention has been paid to regression tasks with continuous answers. The popular Dawid-Skene model for discrete answers has algorithmic and mathematical simplicity stemming from its relation to low-rank structures, but it does not generalize to continuous-valued answers. To this end, we introduce a new probabilistic model for crowdsourced regression that captures the heterogeneity of the workers, generalizing the Dawid-Skene model to the continuous domain. We design a message-passing algorithm for Bayesian inference inspired by the popular belief propagation algorithm. We showcase its performance first by proving that it achieves near-optimal mean squared error, comparing it to an oracle estimator. Asymptotically, we provide a tighter analysis showing that the proposed algorithm achieves the exact optimal performance. We then show synthetic experiments confirming our theoretical predictions. As a practical application, we further emulate a crowdsourcing system reproducing the PASCAL visual object classes datasets and show that de-noising the crowdsourced data using the proposed scheme can significantly improve performance on the vision task.
- Estimating expected polynomials of density functions from samples is a basic problem with numerous applications in statistics and information theory. Although kernel density estimators are widely used in practice for such functional estimation problems, practitioners are left on their own to choose an appropriate bandwidth for each application at hand. Further, kernel density estimators suffer from boundary biases, which are prevalent in real-world data with lower-dimensional structures. We propose using the fixed-k nearest neighbor distances as the bandwidth, which adaptively adjusts to local geometry. Further, we propose a novel estimator based on local likelihood density estimators that mitigates the boundary biases. Although such a choice of fixed-k nearest neighbor distances as bandwidths results in inconsistent estimators, we provide a simple debiasing scheme that precomputes the asymptotic bias and divides off this term. With this novel correction, we show consistency of the debiased estimator. We provide numerical experiments suggesting that it improves upon competing state-of-the-art methods.
- Feb 07 2017 cs.CG arXiv:1702.01524v1: In the Any-Angle Pathfinding problem, the goal is to find the shortest path between a pair of vertices on a uniform square grid, where the path is not constrained to any fixed set of possible directions over the grid. Visibility Graphs are a known optimal algorithm for solving the problem with the use of pre-processing. However, Visibility Graphs are known to perform poorly in terms of running time, especially on large, complex maps. In this paper, we introduce two improvements over the Visibility Graph algorithm to compute optimal paths. Sparse Visibility Graphs (SVGs) are constructed by pruning unnecessary edges from the original Visibility Graph. The Edge N-Level Sparse Visibility Graph (ENLSVG) is a hierarchical SVG built by iteratively pruning non-taut paths. We also introduce Line-of-Sight Scans, a faster algorithm for building Visibility Graphs over a grid. SVGs run much faster than Visibility Graphs by reducing the average vertex degree. ENLSVGs, being hierarchical, improve on this further, especially on larger maps. On large maps, with the use of pre-processing, these algorithms are orders of magnitude faster than existing algorithms like Visibility Graphs and Theta*.
- Jan 31 2017 cs.CV arXiv:1701.08261v2: There have been remarkable improvements in the semantic labelling task in recent years. However, state-of-the-art methods rely on large-scale pixel-level annotations. This paper studies the problem of training a pixel-wise semantic labeller network from image-level annotations of the present object classes. Recently, it has been shown that high-quality seeds indicating discriminative object regions can be obtained from image-level labels. Without additional information, obtaining the full extent of the object is an inherently ill-posed problem due to co-occurrences. We propose using a saliency model as additional information and hereby exploit prior knowledge on the object extent and image statistics. We show how to combine both information sources in order to recover 80% of the fully supervised performance, which is the new state of the art in weakly supervised training for pixel-wise semantic labelling. The code is available at https://goo.gl/KygSeb.
- Jan 13 2017 cs.NI arXiv:1701.03416v1: In this paper, we consider the utilization of TV White Spaces (TVWS) by small Cognitive Radio (CR) network operators to support the communication needs of various smart grid applications. We first propose a multi-tier communication network architecture for smart metering applications in dense urban environments. Our measurement campaign, conducted without any competition from other CR operators, reveals that the communication architecture can achieve data rates of more than 1 Mbps using the free unlicensed TVWS spectrum. However, anticipating stiff competition for the unlicensed TVWS spectrum among CR operators, and to support smart grid applications with stringent Quality of Service (QoS) requirements, we further exploit the novel idea of high-priority channels (HPCs) that a CR operator can temporarily lease for a small fee. This poses several new challenges for CR operators, such as maintaining their economic viability while providing QoS guarantees. We develop a real-time decision support framework with several adjustable parameters that enables CR operators to trade off HPC leasing cost and QoS. The developed algorithms are simple rules that provide significant opportunities for CR operators to maintain a balance between spectrum cost and QoS, depending on dynamic spectrum availability and smart grid application requirements.
- Estimators of information-theoretic measures such as entropy and mutual information are a basic workhorse for many downstream applications in modern data science. State-of-the-art approaches have been either geometric (nearest neighbor (NN) based) or kernel based (with a globally chosen bandwidth). In this paper, we combine both approaches to design new estimators of entropy and mutual information that outperform state-of-the-art methods. Our estimator uses local bandwidth choices of $k$-NN distances with a finite $k$, independent of the sample size. Such a local and data-dependent choice improves performance in practice, but the bandwidth vanishes at a fast rate, leading to a non-vanishing bias. We show that the asymptotic bias of the proposed estimator is universal; it is independent of the underlying distribution. Hence, it can be pre-computed and subtracted from the estimate. As a byproduct, we obtain a unified way of obtaining both kernel and NN estimators. The corresponding theoretical contribution relating the asymptotic geometry of nearest neighbors to order statistics is of independent mathematical interest.
- Aug 30 2016 cs.CV arXiv:1608.07951v1: Computational color constancy refers to the problem of computing the illuminant color so that images of a scene under varying illumination can be normalized to an image under the canonical illumination. In this paper, we adopt a deep learning framework for the illumination estimation problem. The proposed method works under the assumption of uniform illumination over the scene and aims for accurate illuminant color computation. Specifically, we train a convolutional neural network to solve the problem by casting the color constancy problem as an illumination classification problem. We design the deep learning architecture so that the output of the network can be directly used to compute the color of the illumination. Experimental results show that our deep network is able to extract useful features for illumination estimation, and our method outperforms all previous color constancy methods on multiple test datasets.
- For massive and heterogeneous modern datasets, it is of fundamental interest to provide guarantees on the accuracy of estimation when computational resources are limited. In the application of learning to rank, we provide a hierarchy of rank-breaking mechanisms ordered by the complexity of the sketch of the data they generate. This allows the number of data points collected to be gracefully traded off against the computational resources available, while guaranteeing the desired level of accuracy. Theoretical guarantees on the proposed generalized rank-breaking implicitly provide such trade-offs, which can be explicitly characterized under certain canonical scenarios on the structure of the data.
- In this paper, we focus on the problem of inferring the underlying reward function of an expert given demonstrations, which is often referred to as inverse reinforcement learning (IRL). In particular, we propose a model-free density-based IRL algorithm, named density matching reward learning (DMRL), which does not require model dynamics. The performance of DMRL is analyzed theoretically and the sample complexity is derived. Furthermore, the proposed DMRL is extended to handle nonlinear IRL problems by assuming that the reward function is in the reproducing kernel Hilbert space (RKHS) and kernel DMRL (KDMRL) is proposed. The parameters for KDMRL can be computed analytically, which greatly reduces the computation time. The performance of KDMRL is extensively evaluated in two sets of experiments: grid world and track driving experiments. In grid world experiments, the proposed KDMRL method is compared with both model-based and model-free IRL methods and shows superior performance on a nonlinear reward setting and competitive performance on a linear reward setting in terms of expected value differences. Then we move on to more realistic experiments of learning different driving styles for autonomous navigation in complex and dynamic tracks using KDMRL and receding horizon control.
- As we shift more of our lives into the virtual domain, the volume of data shared on the web keeps increasing and presents a threat to our privacy. This work contributes to the understanding of the privacy implications of such data sharing by analysing how recognisable people are in social media data. To facilitate a systematic study, we define a number of scenarios considering factors such as how many heads of a person are tagged and whether those heads are obfuscated or not. We propose a robust person recognition system that can handle large variations in pose and clothing, and can be trained with few training samples. Our results indicate that a handful of images is enough to threaten users' privacy, even in the presence of obfuscation. We show detailed experimental results, and discuss their implications.
- We propose a method for building an interpretable recommender system for personalizing online content and promotions. Historical data available for the system consists of customer features, provided content (promotions), and user responses. Unlike in a standard multi-class classification setting, misclassification costs depend on both recommended actions and customers. Our method transforms such a data set to a new set which can be used with standard interpretable multi-class classification algorithms. The transformation has the desirable property that minimizing the standard misclassification penalty in this new space is equivalent to minimizing the custom cost function.
- Convolutional neural networks (CNNs) with convolutional and pooling operations along the frequency axis have been proposed to attain invariance to frequency shifts of features. However, this is inappropriate given that acoustic features vary in frequency. In this paper, we contend that convolution along the time axis is more effective. We also propose the addition of an intermap pooling (IMP) layer to deep CNNs. In this layer, filters in each group extract common but spectrally variant features, and the layer then pools the feature maps of each group. As a result, the proposed IMP CNN can achieve insensitivity to the spectral variations characteristic of different speakers and utterances. The effectiveness of the IMP CNN architecture is demonstrated on several LVCSR tasks. Even without speaker adaptation techniques, the architecture achieved a WER of 12.7% on the SWB part of the Hub5'2000 evaluation test set, which is competitive with other state-of-the-art methods.
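The IMP layer reduces to a maximum over the feature maps within each filter group; a NumPy sketch of the pooling step in isolation:

```python
import numpy as np

def intermap_pool(feature_maps, group_size):
    """Max-pool across feature maps within each group.

    feature_maps: (C, T, F) conv maps over time T and frequency F,
    with C divisible by group_size; returns (C // group_size, T, F).
    Filters in a group can learn spectrally shifted variants of one
    feature; pooling makes the output insensitive to which one fired.
    """
    C, T, F = feature_maps.shape
    grouped = feature_maps.reshape(C // group_size, group_size, T, F)
    return grouped.max(axis=1)

maps = np.random.default_rng(0).normal(size=(64, 40, 20))
print(intermap_pool(maps, group_size=4).shape)  # (16, 40, 20)
```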
- Apr 12 2016 cs.NI arXiv:1604.02747v1: Heterogeneous networks (HetNets) have been introduced as an emerging technology to meet the increasing demand for mobile data. HetNets are a combination of multi-layer networks such as macrocells and small cells. In such networks, users may suffer significant cross-layer interference. To manage this interference, the 3rd Generation Partnership Project (3GPP) has introduced enhanced Inter-Cell Interference Coordination (eICIC) techniques. Almost Blank SubFrame (ABSF) is one of the time-domain techniques used in eICIC solutions. We propose a dynamically optimal Signal-to-Interference-and-Noise Ratio (SINR)-based ABSF framework to ensure macro-user performance while maintaining small-cell user performance. We also study cooperative mechanisms that help small cells collaborate efficiently in order to reduce mutual interference. Simulations show that our proposed scheme achieves good performance and outperforms existing ABSF frameworks.
- Estimating mutual information from i.i.d. samples drawn from an unknown joint density function is a basic statistical problem of broad interest with multitudinous applications. The most popular estimator is one proposed by Kraskov, Stögbauer, and Grassberger (KSG) in 2004; it is nonparametric and based on the distances of each sample to its $k^{\rm th}$ nearest neighboring sample, where $k$ is a fixed small integer. Despite its widespread use (it is part of scientific software packages), the theoretical properties of this estimator have been largely unexplored. In this paper we demonstrate that the estimator is consistent and also identify an upper bound on the rate of convergence of its bias as a function of the number of samples. We argue that the superior performance of the KSG estimator stems from a curious "correlation boosting" effect, and build on this intuition to modify the KSG estimator in novel ways to construct a superior estimator. As a byproduct of our investigations, we obtain nearly tight rates of convergence of the $\ell_2$ error of the well-known fixed-$k$ nearest neighbor estimator of differential entropy by Kozachenko and Leonenko.
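For concreteness, a compact implementation of the (type-1) KSG estimator under analysis:

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma

def ksg_mi(x, y, k=5):
    """KSG estimator of I(X;Y). x: (n, dx), y: (n, dy).

    Uses the max-norm distance to the k-th neighbor in the joint space,
    then counts marginal neighbors strictly inside that radius.
    """
    n = len(x)
    xy = np.hstack([x, y])
    rho = cKDTree(xy).query(xy, k=k + 1, p=np.inf)[0][:, -1]
    tx, ty = cKDTree(x), cKDTree(y)
    est = digamma(k) + digamma(n)
    for i in range(n):
        r = rho[i] * (1 - 1e-12)  # shrink to approximate a strict inequality
        nx = len(tx.query_ball_point(x[i], r, p=np.inf)) - 1
        ny = len(ty.query_ball_point(y[i], r, p=np.inf)) - 1
        est -= (digamma(nx + 1) + digamma(ny + 1)) / n
    return est

rng = np.random.default_rng(0)
x = rng.normal(size=(2000, 1))
y = x + rng.normal(size=(2000, 1))
print(ksg_mi(x, y))  # true value: 0.5 * log(2) ~= 0.347
```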
- We explore the top-$K$ rank aggregation problem. Suppose a collection of items is compared in pairs repeatedly, and we aim to recover a consistent ordering that focuses on the top-$K$ ranked items, based on partially revealed preference information. We investigate the Bradley-Terry-Luce model, in which one ranks items according to their perceived utilities, modeled as noisy observations of their underlying true utilities. Our main contributions are two-fold. First, in a general comparison model where item pairs to compare are given a priori, we attain upper and lower bounds on the sample size for reliable recovery of the top-$K$ ranked items. Second, more importantly, extending the result to a random comparison model where item pairs to compare are chosen independently with some probability, we show that in slightly restricted regimes, the gap between the derived bounds reduces to a constant factor, revealing that a spectral method can achieve the minimax optimality on the (order-wise) sample size required for top-$K$ ranking. That is to say, we demonstrate that a spectral method alone is sufficient to achieve the optimality and advantageous in terms of computational complexity, as it does not require an additional stage of maximum likelihood estimation that a state-of-the-art scheme employs to achieve the optimality. We corroborate our main results with numerical experiments.
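The spectral method in question builds a Markov chain over items from the pairwise outcomes and ranks items by its stationary distribution. A Rank-Centrality-style sketch, under the simplifying assumption that all pairs are compared equally often:

```python
import numpy as np

def spectral_topk(wins, counts, K):
    """wins[i, j]: times j beat i; counts[i, j]: comparisons of (i, j).
    Probability mass flows from losers toward winners; the stationary
    distribution orders items by (estimated) BTL utility."""
    n = wins.shape[0]
    d = counts.sum(axis=1).max()              # normalizer >= any node's comparisons
    P = wins / d
    np.fill_diagonal(P, 1.0 - P.sum(axis=1))  # lazy self-loops
    pi = np.full(n, 1.0 / n)
    for _ in range(1000):                     # power iteration to stationarity
        pi = pi @ P
    return np.argsort(pi)[::-1][:K]

# Toy BTL data: true utilities decrease with item index.
rng = np.random.default_rng(0)
w = np.array([4.0, 3.0, 2.0, 1.0, 0.5])
n = len(w)
counts = np.full((n, n), 50)
np.fill_diagonal(counts, 0)
wins = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        if i != j:
            wins[i, j] = rng.binomial(counts[i, j], w[j] / (w[i] + w[j]))
print(spectral_topk(wins, counts, K=2))  # expected: items 0 and 1
```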
- Crowdsourcing systems are popular for solving large-scale labelling tasks with low-paid workers. We study the problem of recovering the true labels from the possibly erroneous crowdsourced labels under the popular Dawid-Skene model. To address this inference problem, several algorithms have recently been proposed, but the best known guarantee is still significantly larger than the fundamental limit. We close this gap by introducing a tighter lower bound on the fundamental limit and proving that Belief Propagation (BP) exactly matches this lower bound. The guaranteed optimality of BP is the strongest in the sense that it is information-theoretically impossible for any other algorithm to correctly label a larger fraction of the tasks. Experimental results suggest that BP is close to optimal for all regimes considered and improves upon competing state-of-the-art algorithms.
- We conduct an axiomatic study of the problem of estimating the strength of a known causal relationship between a pair of variables. We propose that an estimate of causal strength should be based on the conditional distribution of the effect given the cause (and not on the driving distribution of the cause), and study dependence measures on conditional distributions. Shannon capacity, appropriately regularized, emerges as a natural measure under these axioms. We examine the problem of calculating Shannon capacity from the observed samples and propose a novel fixed-$k$ nearest neighbor estimator, and demonstrate its consistency. Finally, we demonstrate an application to single-cell flow-cytometry, where the proposed estimators significantly reduce sample complexity.
- Crowdsourcing platforms provide marketplaces where task requesters can pay to get labels on their data. Such markets have emerged recently as popular venues for collecting annotations that are crucial in training machine learning models in various applications. However, as jobs are tedious and payments are low, errors are common in such crowdsourced labels. A common strategy to overcome such noise in the answers is to add redundancy by getting multiple answers for each task and aggregating them using methods such as majority voting. For such a system, there is a fundamental question of interest: how can we maximize the accuracy given a fixed budget on how many responses we can collect on the crowdsourcing system? We characterize this fundamental trade-off between the budget (how many answers the requester can collect in total) and the accuracy of the estimated labels. In particular, we ask whether adaptive task assignment schemes lead to a more efficient trade-off between the accuracy and the budget. Adaptive schemes, where tasks are assigned adaptively based on the data collected thus far, are widely used in practical crowdsourcing systems to efficiently use a given fixed budget. However, existing theoretical analyses of crowdsourcing systems suggest that the gain of adaptive task assignments is minimal. To bridge this gap, we investigate this question under a strictly more general probabilistic model, which has been recently introduced to model practical crowdsourced annotations. Under this generalized Dawid-Skene model, we characterize the fundamental trade-off between budget and accuracy. We introduce a novel adaptive scheme that matches this fundamental limit. We further quantify the fundamental gap between adaptive and non-adaptive schemes by comparing the trade-off with the one for non-adaptive schemes. Our analyses confirm that the gap is significant.
- Experiments in particle physics produce enormous quantities of data that must be analyzed and interpreted by teams of physicists. This analysis is often exploratory, where scientists are unable to enumerate the possible types of signal prior to performing the experiment. Thus, tools for summarizing, clustering, visualizing and classifying high-dimensional data are essential. In this work, we show that meaningful physical content can be revealed by transforming the raw data into a learned high-level representation using deep neural networks, with measurements taken at the Daya Bay Neutrino Experiment as a case study. We further show how convolutional deep neural networks can provide an effective classification filter with greater than 97% accuracy across different classes of physics events, significantly better than other machine learning approaches.
- Rank aggregation systems collect ordinal preferences from individuals to produce a global ranking that represents the social preference. Rank-breaking is a common practice to reduce the computational complexity of learning the global ranking. The individual preferences are broken into pairwise comparisons and applied to efficient algorithms tailored for independent paired comparisons. However, due to the ignored dependencies in the data, naive rank-breaking approaches can result in inconsistent estimates. The key idea to produce accurate and consistent estimates is to treat the pairwise comparisons unequally, depending on the topology of the collected data. In this paper, we provide the optimal rank-breaking estimator, which not only achieves consistency but also achieves the best error bound. This allows us to characterize the fundamental tradeoff between accuracy and complexity. Further, the analysis identifies how the accuracy depends on the spectral gap of a corresponding comparison graph.
- Sep 14 2015 cs.CV arXiv:1509.03502v2: Recognising persons in everyday photos presents major challenges (occluded faces, different clothing, locations, etc.) for machine vision. We propose a convnet-based person recognition system on which we provide an in-depth analysis of the informativeness of different body cues, the impact of training data, and the common failure modes of the system. In addition, we discuss the limitations of existing benchmarks and propose more challenging ones. Our method is simple and is built on open source and open data, yet it improves the state-of-the-art results on a large dataset of social media photos (PIPA).
- Anonymous social media platforms like Secret, Yik Yak, and Whisper have emerged as important tools for sharing ideas without the fear of judgment. Such anonymous platforms are also important in nations under authoritarian rule, where freedom of expression and the personal safety of message authors may depend on anonymity. Whether for fear of judgment or retribution, it is sometimes crucial to hide the identities of users who post sensitive messages. In this paper, we consider a global adversary who wishes to identify the author of a message; it observes either a snapshot of the spread of a message at a certain time, sampled timestamp metadata, or both. Recent advances in rumor source detection show that existing messaging protocols are vulnerable against such an adversary. We introduce a novel messaging protocol, which we call adaptive diffusion, and show that under the snapshot adversarial model, adaptive diffusion spreads content fast and achieves perfect obfuscation of the source when the underlying contact network is an infinite regular tree. That is, all users with the message are nearly equally likely to have been the origin of the message. When the contact network is an irregular tree, we characterize the probability of maximum likelihood detection by proving a concentration result over Galton-Watson trees. Experiments on a sampled Facebook network demonstrate that adaptive diffusion effectively hides the location of the source even when the graph is finite, irregular and has cycles.
- In applications such as recommendation systems and revenue management, it is important to predict preferences on items that have not been seen by a user or predict outcomes of comparisons among those that have never been compared. The multinomial logit model, a popular discrete choice model, captures the structure of the hidden preferences with a low-rank matrix. In order to predict the preferences, we want to learn the underlying model from noisy observations of the low-rank matrix, collected as revealed preferences in various forms of ordinal data. A natural approach to learn such a model is to solve a convex relaxation based on nuclear norm minimization. We present the convex relaxation approach in two contexts of interest: collaborative ranking and bundled choice modeling. In both cases, we show that the convex relaxation is minimax optimal. We prove an upper bound on the resulting error with finite samples, and provide a matching information-theoretic lower bound.
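  As a generic illustration of the nuclear-norm relaxation (not the paper's exact estimator or its ordinal-data likelihood), a soft-impute-style solver alternates between imputing unobserved entries and shrinking singular values:

  ```python
  # Generic soft-impute-style solver for nuclear-norm-regularized matrix
  # completion: alternate between filling unobserved entries with the
  # current estimate and shrinking singular values. lam and iters are
  # illustrative choices.
  import numpy as np

  def soft_impute(Y, mask, lam=1.0, iters=100):
      X = np.zeros_like(Y, dtype=float)
      for _ in range(iters):
          Z = np.where(mask, Y, X)               # impute unobserved entries
          U, s, Vt = np.linalg.svd(Z, full_matrices=False)
          X = (U * np.maximum(s - lam, 0.0)) @ Vt    # shrink the spectrum
      return X
  ```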
- Apr 16 2015 cs.IR arXiv:1504.03713v1 With a vast number of items, web pages, and news articles to choose from, both online services and their customers benefit tremendously from personalized recommender systems. Such systems, however, provide great opportunities for targeted advertisement, by displaying ads alongside genuine recommendations. We consider a biased recommendation system where such ads are displayed without any tags (disguised as genuine recommendations), rendering them indistinguishable to a single user. We ask whether it is possible for a small subset of collaborating users to detect such a bias. We propose an algorithm that can detect such a bias through statistical analysis of the collaborating users' feedback. The algorithm requires only binary information indicating whether a user was satisfied with each recommended item or not. This makes the algorithm widely applicable to real-world issues such as identification of search engine bias and pharmaceutical lobbying. We prove that the proposed algorithm detects the bias with high probability for a broad class of recommendation systems when a sufficient number of users provide feedback on a sufficient number of recommendations. We provide extensive simulations with real data sets and practical recommender systems, which confirm the tradeoffs in the theoretical guarantees.
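  A toy sketch of the underlying intuition, not the paper's algorithm: collaborating users compare their satisfaction rate on recommended items against a baseline of randomly probed items, and a recommendation stream that does markedly worse than random probing is flagged as suspicious. The crude normal-approximation score below is an assumption for illustration only.

  ```python
  # Toy bias score: compare satisfaction on recommended items against
  # randomly probed items; a recommendation stream that does markedly
  # worse than random probing is suspicious. The normal-approximation
  # z-score below is a crude stand-in for the paper's test.
  import numpy as np

  def bias_score(sat_recommended, sat_random):
      """Binary arrays: 1 = the user reported being satisfied."""
      p_rec, p_rnd = np.mean(sat_recommended), np.mean(sat_random)
      n = min(len(sat_recommended), len(sat_random))
      se = np.sqrt((p_rec * (1 - p_rec) + p_rnd * (1 - p_rnd)) / n)
      return (p_rnd - p_rec) / max(se, 1e-12)   # large value => likely bias
  ```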
- Anonymous messaging platforms, such as Secret and Whisper, have emerged as important social media for sharing one's thoughts without the fear of being judged by friends, family, or the public. Further, such anonymous platforms are crucial in nations with authoritarian governments; the right to free expression and sometimes the personal safety of the author of the message depend on anonymity. Whether for fear of judgment or personal endangerment, it is crucial to keep the identity of the user who initially posted a sensitive message anonymous. In this paper, we consider an adversary who observes a snapshot of the spread of a message at a certain time. Recent advances in rumor source detection show that existing messaging protocols are vulnerable to such an adversary. We introduce a novel messaging protocol, which we call adaptive diffusion, and show that it spreads messages fast and achieves perfect obfuscation of the source when the underlying contact network is an infinite regular tree: all users with the message are nearly equally likely to have been its origin. Experiments on a sampled Facebook network show that it effectively hides the location of the source even when the graph is finite, irregular, and has cycles. We further consider a stronger adversarial model in which a subset of colluding users track the reception of messages. We show that adaptive diffusion provides strong protection of the anonymity of the source even under this scenario.
- Sparse high-dimensional graphical model selection is a popular topic in contemporary machine learning. To this end, various useful approaches have been proposed in the context of $\ell_1$-penalized estimation in the Gaussian framework. Though many of these inverse covariance estimation approaches are demonstrably scalable and have leveraged recent advances in convex optimization, they still depend on the Gaussian functional form. To address this gap, a convex pseudo-likelihood based partial correlation graph estimation method (CONCORD) has recently been proposed. This method uses coordinate-wise minimization of a regression-based pseudo-likelihood and has been shown to have robust model selection properties in comparison with the Gaussian approach. In direct contrast to the parallel work in the Gaussian setting, however, this new convex pseudo-likelihood framework has not leveraged the extensive array of methods that have been proposed in the machine learning literature for convex optimization. In this paper, we address this gap by proposing two proximal gradient methods (CONCORD-ISTA and CONCORD-FISTA) for performing $\ell_1$-regularized inverse covariance matrix estimation in the pseudo-likelihood framework. We present timing comparisons with coordinate-wise minimization and demonstrate that our approach yields tremendous payoffs for $\ell_1$-penalized partial correlation graph estimation outside the Gaussian setting, yielding the fastest and most scalable approach for such problems. We undertake a theoretical analysis of our approach, rigorously demonstrate convergence, and derive convergence rates.
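  For readers unfamiliar with the machinery, here is a generic ISTA sketch on a simple lasso objective; the CONCORD algorithms apply the same gradient-then-soft-threshold step to the pseudo-likelihood loss, so this is illustrative only.

  ```python
  # Generic ISTA on a lasso objective 0.5*||Ax - b||^2 + lam*||x||_1.
  # The CONCORD algorithms apply the same gradient-then-soft-threshold
  # step to the pseudo-likelihood loss instead of this quadratic.
  import numpy as np

  def soft_threshold(x, t):
      return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

  def ista(A, b, lam, step, iters=500):
      # step should be below 1 / ||A||_2^2 for convergence
      x = np.zeros(A.shape[1])
      for _ in range(iters):
          grad = A.T @ (A @ x - b)                 # gradient of smooth part
          x = soft_threshold(x - step * grad, step * lam)   # prox step
      return x
  ```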
- Jul 08 2014 cs.CR arXiv:1407.1546v2 We study the problem of interactive function computation by multiple parties possessing a single bit each in a differential privacy setting (i.e., there remains an uncertainty in any specific party's bit even when given the transcript of the interactions and all the other parties' bits). Each party is interested in computing a function, which could differ from party to party, and there could be a central observer interested in computing a separate function. Performance at each party and the central observer is measured via the accuracy of the function computed. We allow for an arbitrary cost function to measure the distortion between the true and the computed function value. Our main result is the exact optimality of a simple non-interactive protocol: each party randomizes (sufficiently) and publishes its own bit. In other words, non-interactive randomized response is exactly optimal. Each party and the central observer then separately compute their respective function to maximize the appropriate notion of their accuracy measure. The optimality is very general: it holds for all types of functions, heterogeneous privacy conditions on the parties, all types of cost metrics, and both average and worst-case (over the inputs) measures of accuracy. Finally, the optimality result is simultaneous, in terms of maximizing accuracy at each of the parties and the central observer.
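  The optimal non-interactive protocol is easy to state concretely. A minimal sketch of binary randomized response, together with the standard debiasing step for frequency estimation (the debiased estimator is the textbook one, included here for illustration):

  ```python
  # Binary randomized response at privacy level eps, plus the standard
  # debiasing step for estimating the population frequency of 1-bits.
  import numpy as np

  def randomized_response(bit, eps, rng=np.random.default_rng()):
      p_keep = np.exp(eps) / (1.0 + np.exp(eps))   # probability of truth
      return bit if rng.random() < p_keep else 1 - bit

  def debiased_frequency(reports, eps):
      p = np.exp(eps) / (1.0 + np.exp(eps))
      return (np.mean(reports) - (1 - p)) / (2 * p - 1)
  ```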
- Local differential privacy has recently surfaced as a strong measure of privacy in contexts where personal information remains private even from data analysts. Working in a setting where both the data providers and data analysts want to maximize the utility of statistical analyses performed on the released data, we study the fundamental trade-off between local differential privacy and utility. This trade-off is formulated as a constrained optimization problem: maximize utility subject to local differential privacy constraints. We introduce a combinatorial family of extremal privatization mechanisms, which we call staircase mechanisms, and show that it contains the optimal privatization mechanisms for a broad class of information theoretic utilities such as mutual information and $f$-divergences. We further prove that for any utility function and any privacy level, solving the privacy-utility maximization problem is equivalent to solving a finite-dimensional linear program, the outcome of which is the optimal staircase mechanism. However, solving this linear program can be computationally expensive since it has a number of variables that is exponential in the size of the alphabet the data lives in. To account for this, we show that two simple privatization mechanisms, the binary and randomized response mechanisms, are universally optimal in the low and high privacy regimes, and well approximate the intermediate regime.
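  A sketch of the k-ary randomized response mechanism, which is a staircase mechanism in the paper's sense since its likelihood ratios take only two values; the integer alphabet 0..k-1 is an illustrative encoding, and the binary mechanism would instead reduce the input to one bit first.

  ```python
  # k-ary randomized response: report the truth with probability
  # e^eps / (e^eps + k - 1), otherwise a uniformly random other symbol.
  # Its likelihood ratios take only two values, i.e. it is a staircase
  # mechanism; the integer alphabet 0..k-1 is an illustrative encoding.
  import numpy as np

  def k_randomized_response(x, k, eps, rng=np.random.default_rng()):
      p_truth = np.exp(eps) / (np.exp(eps) + k - 1)
      if rng.random() < p_truth:
          return x
      return rng.choice([v for v in range(k) if v != x])
  ```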
- We study the problem of learning a distribution from samples, when the underlying distribution is a mixture of product distributions over discrete domains. This problem is motivated by several practical applications such as crowdsourcing, recommendation systems, and learning Boolean functions. The existing solutions either heavily rely on the fact that the number of mixture components is finite or have sample/time complexity that is exponential in the number of components. In this paper, we introduce a polynomial time/sample complexity method for learning a mixture of $r$ discrete product distributions over $\{1, 2, \dots, \ell\}^n$, for general $\ell$ and $r$. We show that our approach is statistically consistent and further provide finite sample guarantees. We use techniques from the recent work on tensor decompositions for higher-order moment matching. A crucial step in these moment-matching methods is to construct a certain matrix and a certain tensor with low-rank spectral decompositions, which are typically estimated directly from the samples. The main challenge in learning mixtures of discrete product distributions is that these low-rank matrices and tensors cannot be obtained directly from the sample moments. Instead, we reduce the tensor estimation problem to (a) estimating a low-rank matrix using only off-diagonal block elements, and (b) estimating a tensor using a small number of linear measurements. Leveraging recent developments in matrix completion, we give an alternating minimization based method to estimate the low-rank matrix, and formulate the tensor completion problem as a least-squares problem.
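  The masked low-rank estimation step can be illustrated with generic alternating least squares over observed entries; this is in the spirit of the paper's alternating-minimization subroutine, not its exact procedure.

  ```python
  # Generic masked alternating least squares: fit a rank-r factorization
  # U @ V.T to the entries of M selected by a boolean mask (e.g. the
  # off-diagonal blocks). Illustrative subroutine, not the paper's exact
  # procedure.
  import numpy as np

  def als_complete(M, mask, r, iters=50, ridge=1e-6):
      m, n = M.shape
      rng = np.random.default_rng(0)
      U, V = rng.normal(size=(m, r)), rng.normal(size=(n, r))
      for _ in range(iters):
          for i in range(m):                      # row-wise least squares
              w = mask[i]
              G = V[w].T @ V[w] + ridge * np.eye(r)
              U[i] = np.linalg.solve(G, V[w].T @ M[i, w])
          for j in range(n):                      # column-wise least squares
              w = mask[:, j]
              G = U[w].T @ U[w] + ridge * np.eye(r)
              V[j] = np.linalg.solve(G, U[w].T @ M[w, j])
      return U @ V.T
  ```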
- Nov 06 2013 cs.RO arXiv:1311.0959v1 This technical report gives an overview of our work on control algorithms for redundant robot systems that achieve human-like motion characteristics. Previously, we developed a novel control law that exhibits human-motion characteristics in redundant robot-arm systems as well as arm-trunk systems for reaching tasks [1], [2]. This method removes the need to compute the pseudo-inverse of the Jacobian, and no artificial performance index needs to be formulated or optimized. The time-varying properties of muscle stiffness and damping, as well as the low-pass filter characteristics of human muscles, are modeled by the proposed control law to generate human-motion characteristics for reaching, such as a quasi-straight-line end-effector trajectory and a symmetric bell-shaped velocity profile. This report focuses on experiments performed using a 7-DOF redundant robot-arm system, which demonstrate the effectiveness of this algorithm in imitating human-like motion characteristics. In addition, we extended this algorithm to a 19-DOF hand-arm system for a reach-to-grasp task. Simulations using the 19-DOF hand-arm system show the effectiveness of the proposed scheme for human-like hand-arm coordination in reach-to-grasp tasks, for pinch and envelope grasps on objects of different shapes such as a box, a cylinder, and a sphere.
- Sequential querying of differentially private mechanisms degrades the overall privacy level. In this paper, we answer the fundamental question of characterizing the level of overall privacy degradation as a function of the number of queries and the privacy levels maintained by each privatization mechanism. Our solution is complete: we prove an upper bound on the overall privacy level and construct a sequence of privatization mechanisms that achieves this bound. The key innovation is the introduction of an operational interpretation of differential privacy (involving hypothesis testing) and the use of new data processing inequalities. Our result improves over the state-of-the-art, and has immediate applications in several problems studied in the literature including differentially private multi-party computation.
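  For context, the two standard bounds the paper improves upon can be computed directly; the sketch below implements basic and advanced (Dwork-Rothblum-Vadhan-style) composition, not the paper's exact optimal bound.

  ```python
  # Two standard composition bounds for k queries, each eps-DP. The
  # paper's optimal composition theorem is strictly tighter; it is not
  # implemented here.
  import numpy as np

  def basic_composition(eps, k):
      return k * eps                                # (k*eps, 0)-DP

  def advanced_composition(eps, k, delta_slack):
      # Dwork-Rothblum-Vadhan-style bound, valid with an added
      # delta_slack in the failure probability.
      return (np.sqrt(2 * k * np.log(1 / delta_slack)) * eps
              + k * eps * (np.exp(eps) - 1.0))

  print(basic_composition(0.1, 100), advanced_composition(0.1, 100, 1e-6))
  ```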
- Nov 05 2013 cs.RO arXiv:1311.0388v1 Many day-to-day activities require the dexterous manipulation of a redundant humanoid arm in complex 3D environments. However, position regulation of such robot-arm systems becomes very difficult in the presence of non-linear uncertainties in the system. Perturbations also arise from unwanted interactions with obstacles in cluttered environments where obstacle avoidance is not possible, which makes position regulation even more difficult. This report proposes a non-linear task-space disturbance observer by virtue of which position regulation of such robotic systems can be achieved in spite of such perturbations and uncertainties. Simulations are conducted using a 7-DOF redundant robot-arm system to show the effectiveness of the proposed method. The results are compared with those of a conventional mass-damper-based task-space disturbance observer to show the performance gained by the developed concept. The proposed method is then applied to a controller that exhibits human-like motion characteristics when reaching for a target, with arbitrary perturbations introduced along its path in the form of interactions with obstacles. Results show that the robot end-effector continues to follow a human-like quasi-straight trajectory even when the joint trajectories deviate considerably due to the perturbations. These results are also compared with those of the unperturbed motion of the robot, which further demonstrates the significance of the developed scheme.
- The question of aggregating pairwise comparisons to obtain a global ranking over a collection of objects has been of interest for a very long time: be it the ranking of online gamers (e.g. MSR's TrueSkill system) and chess players, the aggregation of social opinions, or deciding which product to sell based on transactions. In most settings, in addition to obtaining a ranking, finding `scores' for each object (e.g. a player's rating) is of interest for understanding the intensity of the preferences. In this paper, we propose Rank Centrality, an iterative rank aggregation algorithm for discovering scores for objects (or items) from pairwise comparisons. The algorithm has a natural random walk interpretation over the graph of objects, with an edge present between a pair of objects if they are compared; the score of an object, which we call its Rank Centrality, turns out to be its stationary probability under this random walk. To study the efficacy of the algorithm, we consider the popular Bradley-Terry-Luce (BTL) model (equivalent to the Multinomial Logit (MNL) model for pairwise comparisons), in which each object has an associated score that determines the probabilistic outcomes of pairwise comparisons between objects. In terms of the pairwise marginal probabilities, which are the main subject of this paper, the MNL and BTL models are identical. We bound the finite sample error rates between the scores assumed by the BTL model and those estimated by our algorithm. In particular, the number of samples required to learn the scores well with high probability depends on the structure of the comparison graph. When the Laplacian of the comparison graph has a strictly positive spectral gap, e.g. when each item is compared to a subset of randomly chosen items, the required number of samples is nearly order-optimal.
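  A compact sketch of the algorithm as described: build a random walk whose transition from i to j is proportional to the fraction of comparisons j won against i, and read the scores off the stationary distribution. Normalizing by n in place of the true maximum degree is a simplifying assumption.

  ```python
  # Rank Centrality: transition i -> j with probability proportional to
  # the fraction of i-vs-j comparisons won by j; scores are the stationary
  # distribution. Normalizing by n instead of the true maximum degree is a
  # simplification.
  import numpy as np

  def rank_centrality(A):
      """A[i, j]: fraction of i-vs-j comparisons won by j (0 if none)."""
      n = A.shape[0]
      P = A / n
      np.fill_diagonal(P, 0.0)
      P += np.diag(1.0 - P.sum(axis=1))           # lazy self-loops
      vals, vecs = np.linalg.eig(P.T)
      pi = np.real(vecs[:, np.argmax(np.real(vals))])
      return pi / pi.sum()                         # scores sum to one
  ```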
- Crowdsourcing systems, in which numerous tasks are electronically distributed to numerous "information piece-workers", have emerged as an effective paradigm for human-powered solving of large scale problems in domains such as image classification, data entry, optical character recognition, recommendation, and proofreading. Because these low-paid workers can be unreliable, nearly all such systems must devise schemes to increase confidence in their answers, typically by assigning each task multiple times and combining the answers in an appropriate manner, e.g. majority voting. In this paper, we consider a general model of such crowdsourcing tasks and pose the problem of minimizing the total price (i.e., number of task assignments) that must be paid to achieve a target overall reliability. We give a new algorithm for deciding which tasks to assign to which workers and for inferring correct answers from the workers' answers. We show that our algorithm, inspired by belief propagation and low-rank matrix approximation, significantly outperforms majority voting and, in fact, is optimal through comparison to an oracle that knows the reliability of every worker. Further, we compare our approach with a more general class of algorithms which can dynamically assign tasks. By adaptively deciding which questions to ask to the next arriving worker, one might hope to reduce uncertainty more efficiently. We show that, perhaps surprisingly, the minimum price necessary to achieve a target reliability scales in the same manner under both adaptive and non-adaptive scenarios. Hence, our non-adaptive approach is order-optimal under both scenarios. This strongly relies on the fact that workers are fleeting and cannot be exploited. Therefore, architecturally, our results suggest that building a reliable worker-reputation system is essential to fully harnessing the potential of adaptive designs.
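  A simplified sketch of the belief-propagation-inspired iteration: task estimates and worker reliability weights reinforce each other through the answer matrix. Proper message passing excludes the receiving edge from each message; the dense power-iteration-style variant below is an approximation for illustration.

  ```python
  # Simplified BP-inspired inference for binary crowdsourcing tasks:
  # A[i, j] in {-1, 0, +1} is worker j's answer on task i (0 = not asked).
  # Proper message passing excludes the receiving edge from each message;
  # this dense variant is an approximation for illustration.
  import numpy as np

  def infer_answers(A, iters=20, rng=np.random.default_rng(0)):
      y = rng.normal(1.0, 1.0, A.shape[1])   # worker reliability messages
      for _ in range(iters):
          x = A @ y                           # task messages
          y = A.T @ x                         # worker messages
          y /= np.linalg.norm(y) + 1e-12      # keep iterates bounded
      return np.sign(A @ y)                   # +/- 1 estimates per task
  ```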
- We consider the problem of localizing wireless devices in an ad-hoc network embedded in a d-dimensional Euclidean space. Obtaining a good estimate of where wireless devices are located is crucial in wireless network applications including environment monitoring, geographic routing and topology control. When the positions of the devices are unknown and only local distance information is given, we need to infer the positions from these local distance measurements. This problem is particularly challenging when we only have access to measurements that have limited accuracy and are incomplete. We consider the extreme case of this limitation on the available information, namely only the connectivity information is available, i.e., we only know whether a pair of nodes is within a fixed detection range of each other or not, and no information is known about how far apart they are. Further, to account for detection failures, we assume that even if a pair of devices is within the detection range, it fails to detect the presence of one another with some probability, and this probability of failure depends on how far apart those devices are. Given this limited information, we investigate the performance of a centralized positioning algorithm MDS-MAP introduced by Shang et al., and a distributed positioning algorithm, introduced by Savarese et al., called HOP-TERRAIN. In particular, for a network consisting of n devices positioned randomly, we provide a bound on the resulting error for both algorithms. We show that the error is bounded and decreases at a rate proportional to R/Rc, where Rc is the critical detection range at which the resulting random network starts to be connected, and R is the detection range of each device.
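  A sketch of the MDS-MAP pipeline under the connectivity-only model: shortest-path hop counts stand in for distances, then classical MDS embeds the nodes up to a rigid transform. The O(n^3) Floyd-Warshall step and the hop-to-distance scaling by R are illustrative simplifications, and a connected graph is assumed.

  ```python
  # MDS-MAP sketch from connectivity only: hop counts approximate
  # distances (scaled by the detection range R), then classical MDS
  # embeds the nodes up to a rigid transform. Assumes a connected graph;
  # the O(n^3) Floyd-Warshall step is for clarity, not efficiency.
  import numpy as np

  def mds_map(adj, R, d=2):
      n = adj.shape[0]
      D = np.where(adj, 1.0, np.inf)
      np.fill_diagonal(D, 0.0)
      for k in range(n):                            # all-pairs hop counts
          D = np.minimum(D, D[:, [k]] + D[[k], :])
      D *= R                                        # hops -> rough distances
      J = np.eye(n) - np.ones((n, n)) / n
      B = -0.5 * J @ (D ** 2) @ J                   # double centering
      vals, vecs = np.linalg.eigh(B)
      top = np.argsort(vals)[::-1][:d]
      return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))
  ```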
- Eigenvectors of data matrices play an important role in many computational problems, ranging from signal processing to machine learning and control. For instance, algorithms that compute positions of the nodes of a wireless network on the basis of pairwise distance measurements require a few leading eigenvectors of the distance matrix. While eigenvector calculation is a standard topic in numerical linear algebra, it becomes challenging under severe communication or computation constraints, or in the absence of central scheduling. In this paper we investigate the possibility of computing the leading eigenvectors of a large data matrix through gossip algorithms. The proposed algorithm amounts to iteratively multiplying a vector by independent random sparsifications of the original matrix and averaging the resulting normalized vectors. This can be viewed as a generalization of gossip algorithms for consensus, but the resulting dynamics are significantly more intricate. Our analysis is based on controlling the convergence to stationarity of the associated Kesten-Furstenberg Markov chain.
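  A sketch of the iteration described above: multiply by an independent, unbiased random sparsification of the matrix at each step and average the normalized iterates. The Bernoulli sparsification scheme and the sign alignment below are illustrative choices, not the paper's analysis setting.

  ```python
  # Gossip-style power iteration: multiply by an independent unbiased
  # Bernoulli sparsification of M each step and average the normalized
  # iterates. The sparsification scheme and sign alignment are
  # illustrative choices.
  import numpy as np

  def gossip_leading_eigvec(M, p=0.3, iters=2000, rng=np.random.default_rng(0)):
      v = rng.normal(size=M.shape[0])
      avg = np.zeros_like(v)
      for t in range(1, iters + 1):
          M_t = np.where(rng.random(M.shape) < p, M / p, 0.0)  # E[M_t] = M
          v = M_t @ v
          v /= np.linalg.norm(v) + 1e-12
          if avg @ v < 0:                      # align sign before averaging
              v = -v
          avg += (v - avg) / t
      return avg / (np.linalg.norm(avg) + 1e-12)
  ```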
- Jan 31 2011 cs.GR arXiv:1101.5490v1 We present a novel method of simulating wave effects in graphics using ray-based renderers with a new function: the Wave BSDF (Bidirectional Scattering Distribution Function). Reflections from neighboring surface patches represented by local BSDFs are mutually independent. However, in many surfaces with wavelength-scale microstructures, interference and diffraction require a joint analysis of reflected wavefronts from neighboring patches. We demonstrate a simple method to compute the BSDF for the entire microstructure, which can be used independently for each patch. This allows us to use traditional ray-based rendering pipelines to synthesize wave effects of light and sound. We exploit the Wigner Distribution Function (WDF) to create transmissive, reflective, and emissive BSDFs for various diffraction phenomena in a physically accurate way. In contrast to previous methods for computing interference, we circumvent the need to explicitly keep track of the phase of the wave by using BSDFs that include positive as well as negative coefficients. We describe and compare the theory in relation to well-understood concepts in rendering and demonstrate a straightforward implementation. In conjunction with standard raytracers, such as PBRT, we demonstrate wave effects for a range of scenarios such as multi-bounce diffraction materials, holograms, and reflection off high-frequency surfaces.
- We study the calibration process in circular ultrasound tomography devices where the sensor positions deviate from the circumference of a perfect circle. This problem arises in a variety of applications in signal processing, ranging from breast imaging to sensor network localization. We introduce a novel method of calibration/localization based on the time-of-flight (ToF) measurements between sensors when the enclosed medium is homogeneous. Given all the pairwise ToFs, one can easily estimate the sensor positions using the multi-dimensional scaling (MDS) method. In practice, however, due to the transitional behaviour of the sensors and the beam form of the transducers, the ToF measurements for close-by sensors are unavailable. Further, random malfunctioning of the sensors leads to randomly missing ToF measurements. On top of the missing entries, in practice an unknown time delay is also added to the measurements. In this work, we exploit the fact that a matrix defined from all the ToF measurements has rank at most four. In order to estimate the missing ToFs, we apply a state-of-the-art low-rank matrix completion algorithm, OptSpace. To find the correct positions of the sensors (our ultimate goal), we then apply MDS. We derive analytic bounds on the overall error of the whole process in the presence of noise and hence establish its robustness. Finally, we confirm the functionality of our method in practice by simulations mimicking the measurements of a circular ultrasound tomography device.
- Dec 16 2010 cs.DC arXiv:1012.3347v2 Advances in web technologies have driven massive content uploads and requests, reflected in the increased usage of multimedia web and social web services. This situation forces content providers to scale their infrastructure in order to cope with the extra provisioning of network traffic, storage, and other resources. Given the complexity and cost of scaling the infrastructure, we propose a novel solution for providing and delivering content to clients by introducing a brokered collaborative content delivery system. The architectural design of this system leverages content redundancy and content distribution mechanisms in other content providers to deliver content to clients. With the recent emergence of cloud computing, we show that this system can also be adopted to run on the cloud. In this paper, we focus on a brokering scheme that mediates user requests to the most appropriate content provider based on a ranking system. The architecture introduces a novel Global Rank Value (GRV) concept for estimating content-provider capability and transforming the QoS requirement of a content request. A fairness model that makes this design attractive to the current content delivery regime is also introduced. Through simulation, we show that with fair provider selection, content can be provisioned by a better pool of qualified providers, thus leveraging the collaboration and preventing potential QoS violations that may occur when the pool is smaller.
- Oct 26 2010 cs.DC arXiv:1010.4952v1 Cloud computing is penetrating various domains and environments, from theoretical computer science to economics, from marketing hype to educational curricula, and from the R&D lab to enterprise IT infrastructure. Yet the current, still-developing state of cloud computing leaves several issues to address and also affects cloud computing adoption by organizations. In this paper, we explain how the transition into the cloud can occur in an organization and describe a mechanism for transforming legacy infrastructure into a virtual-infrastructure-based cloud. We describe the state of the art of the infrastructural cloud, which is essential for decision making on cloud adoption, and highlight the challenges that can limit the scale and speed of adoption. We then suggest a strategic framework for designing a high-performance cloud system. This framework is applicable when the transition to a cloud-based deployment model collides with constraints. We give an example of the implementation of the framework in the design of a budget-constrained high-availability cloud system.
- We consider the problem of reconstructing a low-rank matrix from a small subset of its entries. In this paper, we describe the implementation of an efficient algorithm called OptSpace, based on singular value decomposition followed by local manifold optimization, for solving the low-rank matrix completion problem. It has been shown that if the number of revealed entries is large enough, the output of singular value decomposition gives a good estimate for the original matrix, so that local optimization reconstructs the correct matrix with high probability. We present numerical results which show that this algorithm can reconstruct the low-rank matrix exactly from a very small subset of its entries. We further study the robustness of the algorithm with respect to noise, and its performance on actual collaborative filtering datasets.
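  A sketch of the first stage only: the rescaled rank-r SVD of the zero-filled observation matrix serves as the initial estimate. OptSpace's trimming of over-represented rows/columns and the subsequent manifold optimization are omitted here.

  ```python
  # First stage of OptSpace only: a rescaled rank-r SVD of the zero-filled
  # observation matrix as the initial estimate. The trimming of
  # over-represented rows/columns and the manifold optimization stage are
  # omitted here.
  import numpy as np

  def spectral_init(M_obs, mask, r):
      frac = mask.mean()                       # fraction of revealed entries
      U, s, Vt = np.linalg.svd(np.where(mask, M_obs, 0.0), full_matrices=False)
      return (U[:, :r] * s[:r]) @ Vt[:r] / frac   # undo the zero-fill bias
  ```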
- We consider a problem of significant practical importance, namely, the reconstruction of a low-rank data matrix from a small subset of its entries. This problem appears in many areas such as collaborative filtering, computer vision and wireless sensor networks. In this paper, we focus on the matrix completion problem in the case when the observed samples are corrupted by noise. We compare the performance of three state-of-the-art matrix completion algorithms (OptSpace, ADMiRA and FPCA) on a single simulation platform and present numerical results. We show that in practice these efficient algorithms can be used to reconstruct real data matrices, as well as randomly generated matrices, accurately.
- Jul 10 2009 cs.CV arXiv:0907.1545v1 The ray-based 4D light field representation cannot be directly used to analyze diffractive or phase-sensitive optical elements. In this paper, we exploit tools from wave optics and extend the light field representation via a novel "light field transform". We introduce a key modification to the ray-based model to support the transform. We insert a "virtual light source", with potentially negative valued radiance for certain emitted rays. We create a look-up table of light field transformers of canonical optical elements. The two key conclusions are that (i) in free space, the 4D light field completely represents wavefront propagation via rays with real (positive as well as negative) valued radiance and (ii) at occluders, a light field composed of light field transformers plus insertion of (ray-based) virtual light sources represents the resultant phase and amplitude of wavefronts. For free-space propagation, we analyze different wavefronts and coherence possibilities. For occluders, we show that the light field transform is simply based on a convolution followed by a multiplication operation. This formulation brings powerful concepts from wave optics to computer vision and graphics. We show applications in cubic-phase plate imaging and holographic displays.