results for au:Singh_A in:cs

- We present two fully mechanized proofs of Dilworth's and Mirsky's theorems in the Coq proof assistant. Dilworth's Theorem states that in any finite partially ordered set (poset), the size of a smallest chain cover and the size of a largest antichain are the same. Mirsky's Theorem is a dual of Dilworth's Theorem. We formalize the proofs by Perles [2] (for Dilworth's Theorem) and by Mirsky [5] (for the dual theorem). We also develop a library of definitions and facts that can be used as a framework for formalizing other theorems on finite posets.
- Mar 17 2017 cs.HC arXiv:1703.05462v1 As virtual reality (VR) emerges as a mainstream platform, designers have started to experiment with new interaction techniques to enhance the user experience. This is a challenging task because designers not only strive to provide designs with good performance but also must take care not to disrupt users' immersive experience. There is a dire need for a new evaluation tool that extends beyond traditional quantitative measurements to assist designers in the design process. We propose an EEG-based experiment framework that evaluates interaction techniques in VR by measuring intentionally elicited cognitive conflict. Through the analysis of the feedback-related negativity (FRN) as well as other quantitative measurements, this framework allows designers to evaluate the effect of the variables of interest. We studied the framework by applying it to the fundamental task of 3D object selection using direct 3D input, i.e., a tracked hand in VR. The cognitive conflict is intentionally elicited by manipulating the selection radius of the target object. Our first behavioral experiment validated the framework in line with the findings of conflict-induced behavior adjustments like those reported in other classical psychology experiment paradigms. Our second EEG-based experiment examines the effect of the appearance of virtual hands. We found that the amplitude of FRN correlates with the level of realism of the virtual hands, which concurs with the Uncanny Valley theory.
- Mar 07 2017 cs.CV arXiv:1703.01698v1 This paper presents two visual trackers from the different paradigms of learning-based and registration-based tracking and evaluates their application in image-based visual servoing. They can track object motion with four degrees of freedom (DOF) which, as we will show here, is sufficient for many fine manipulation tasks. One of these is a newly developed learning-based tracker that relies on learning discriminative correlation filters, while the other is a refinement of a recent 8 DOF RANSAC-based tracker adapted with a new appearance model for 4 DOF motion. Both trackers are shown to have superior performance to other state-of-the-art trackers on an existing dataset for manipulation tasks. Further, a new dataset with challenging sequences for fine manipulation tasks captured from robot-mounted eye-in-hand (EIH) cameras is also presented. These sequences have a variety of challenges encountered during real tasks, including jittery camera movement, motion blur, drastic scale changes and partial occlusions. Quantitative and qualitative results of both trackers in comparison to eight recent state-of-the-art trackers are shown on these sequences. The results show that these two trackers are robust to failures while maintaining the high precision that makes them suitable for such fine manipulation tasks.
- Many conventional statistical procedures are extremely sensitive to seemingly minor deviations from modeling assumptions. This problem is exacerbated in modern high-dimensional settings, where the problem dimension can grow with and possibly exceed the sample size. We consider the problem of robust estimation of sparse functionals, and provide a computationally and statistically efficient algorithm in the high-dimensional setting. Our theory identifies a unified set of deterministic conditions under which our algorithm guarantees accurate recovery. By further establishing that these deterministic conditions hold with high probability for a wide range of statistical models, our theory applies to many problems of considerable interest, including sparse mean and covariance estimation, sparse linear regression, and sparse generalized linear models.
- We show that given an estimate $\widehat{A}$ that is close to a general high-rank positive semi-definite (PSD) matrix $A$ in spectral norm (i.e., $\|\widehat{A}-A\|_2 \leq \delta$), the simple truncated SVD of $\widehat{A}$ produces a multiplicative approximation of $A$ in Frobenius norm. This observation leads to many interesting results on general high-rank matrix estimation problems, which we briefly summarize below ($A$ is an $n\times n$ high-rank PSD matrix and $A_k$ is the best rank-$k$ approximation of $A$): (1) High-rank matrix completion: By observing $\Omega(\frac{n\max\{\epsilon^{-4},k^2\}\mu_0^2\|A\|_F^2\log n}{\sigma_{k+1}(A)^2})$ elements of $A$ where $\sigma_{k+1}\left(A\right)$ is the $\left(k+1\right)$-th singular value of $A$ and $\mu_0$ is the incoherence, the truncated SVD on a zero-filled matrix satisfies $\|\widehat{A}_k-A\|_F \leq (1+O(\epsilon))\|A-A_k\|_F$ with high probability. (2) High-rank matrix de-noising: Let $\widehat{A}=A+E$ where $E$ is a Gaussian random noise matrix with zero mean and $\nu^2/n$ variance on each entry. Then the truncated SVD of $\widehat{A}$ satisfies $\|\widehat{A}_k-A\|_F \leq (1+O(\sqrt{\nu/\sigma_{k+1}(A)}))\|A-A_k\|_F + O(\sqrt{k}\nu)$. (3) Low-rank estimation of high-dimensional covariance: Given $N$ i.i.d. samples $X_1,\cdots,X_N\sim\mathcal N_n(0,A)$, can we estimate $A$ with a relative-error Frobenius norm bound? We show that if $N = \Omega\left(n\max\{\epsilon^{-4},k^2\}\gamma_k(A)^2\log N\right)$ for $\gamma_k(A)=\sigma_1(A)/\sigma_{k+1}(A)$, then $\|\widehat{A}_k-A\|_F \leq (1+O(\epsilon))\|A-A_k\|_F$ with high probability, where $\widehat{A}=\frac{1}{N}\sum_{i=1}^N{X_iX_i^\top}$ is the sample covariance.
- The sparsest cut problem consists of identifying a small set of edges that breaks the graph into balanced sets of vertices. The normalized cut problem balances the total degree, instead of the size, of the resulting sets. Applications of graph cuts include community detection and computer vision. However, cut problems were originally proposed for static graphs, an assumption that does not hold in many modern applications where graphs are highly dynamic. In this paper, we introduce the sparsest and normalized cut problems in temporal graphs, which generalize their standard definitions by enforcing the smoothness of cuts over time. We propose novel formulations and algorithms for computing temporal cuts using spectral graph theory, multiplex graphs, divide-and-conquer and low-rank matrix approximation. Furthermore, we extend our formulation to dynamic graph signals, where cuts also capture node values, as graph wavelets. Experiments show that our solutions are accurate and scalable, enabling the discovery of dynamic communities and the analysis of dynamic graph processes.
- Feb 15 2017 cs.SI arXiv:1702.04082v2 Network centrality plays an important role in many applications. Central nodes in social networks can be influential, driving opinions and spreading news or rumors. In hyperlinked environments, such as the Web, where users navigate via clicks, central content receives high traffic, becoming targets for advertising campaigns. While there is an extensive amount of work on centrality measures and their efficient computation, controlling nodes' centrality via network updates is a more recent and challenging problem. Performing minimal modifications to a network to achieve a desired property falls under the umbrella of network design problems. This paper is focused on improving the coverage centrality of a set of nodes, which is the number of pairs of nodes that have a shortest path passing through the set, by adding edges to the network. We prove strong inapproximability results and propose a greedy algorithm for maximizing coverage centrality. To ensure scalability to large networks, we also design an efficient sampling algorithm for the problem. In addition to providing an extensive empirical evaluation of our algorithms, we also show that, under some realistic constraints, the proposed solutions achieve almost-optimal approximation for coverage centrality maximization.
- Feb 14 2017 cs.CV arXiv:1702.03345v1 This paper introduces a Deep Scattering network that utilizes Dual-Tree complex wavelets to extract translation invariant representations from an input signal. The computationally efficient Dual-Tree wavelets decompose the input signal into densely spaced representations over scales. Translation invariance is introduced in the representations by applying a non-linearity over a region followed by averaging. The discriminatory information in the densely spaced, locally smooth, signal representations aids the learning of the classifier. The proposed network is shown to outperform Mallat's ScatterNet on four datasets with different modalities on classification accuracy.
- Feb 13 2017 cs.CV arXiv:1702.03267v1 We introduce a ScatterNet that uses a parametric log transformation with Dual-Tree complex wavelets to extract translation invariant representations from a multi-resolution image. The parametric transformation aids the OLS pruning algorithm by converting the skewed distributions into relatively mean-symmetric distributions while the Dual-Tree wavelets improve the computational efficiency of the network. The proposed network is shown to outperform Mallat's ScatterNet on two image datasets, both for classification accuracy and computational efficiency. The advantages of the proposed network over other supervised and some unsupervised methods are also presented using experiments performed on different training dataset sizes.
- We consider the problem of estimating and constructing component-wise confidence intervals of a sparse high-dimensional linear regression model when some covariates of the design matrix are missing completely at random. A variant of the Dantzig selector (Candes & Tao, 2007) is analyzed for estimating the regression model and a de-biasing argument is employed to construct component-wise confidence intervals under additional assumptions on the covariance of the design matrix. We also derive rates of convergence of the mean-square estimation error and the average confidence interval length, and show that the dependence on several model parameters (e.g., sparsity $s$, portion of observed covariates $\rho_*$, signal level $\|\beta_0\|_2$) is optimal in a minimax sense.
- All existing real-world networks are evolving; hence, the study of traffic dynamics in these enlarged networks is a challenging task. The critical issue is to optimize the network structure to improve network capacity and avoid traffic congestion. We are interested in selecting user routes that are least congested while keeping network capacity optimal. Network capacity may be improved either by optimizing the network topology or by enhancing the routing approach. In this context, we propose and design a model of time-varying data communication networks (TVCN) based on the dynamics of in-flowing links. A newly appearing node prefers to attach to the most influential node present in the network. In this paper, influence is termed reputation and is applied for computing overall congestion at any node. The user path with the least betweenness centrality and the most reputation is preferred for routing. Kelly's optimization formulation for a rate allocation problem is used to obtain optimal rates of distinct users at different time instants, and it is found that the user's path with the lowest betweenness centrality and highest reputation will always give the maximum rate at the stable point.
- Jan 17 2017 cs.RO arXiv:1701.04350v1 This paper aims to implement Object-Oriented Markov Decision Processes (OO-MDPs) for goal planning and navigation of a robot in an indoor environment. We use the OO-MDP representation of the environment, which is a natural way of modeling the environment based on objects and their interactions. The paper aims to extend the well-known Taxi domain example, which has been tested on a grid world environment, to the robotics domain with larger state spaces. For the purpose of this project we have created a simulation of the environment and robot in ROS, with Gazebo and Rviz as visualization tools. The mobile robot uses a 2D LIDAR module to perform SLAM in the unknown environment. The goal of this project is to make an autonomous agent capable of performing planning and navigation in an indoor environment to deliver boxes (passengers in the Taxi domain) placed at random locations to a particular location (warehouse). The approach can be extended to a wide variety of mobile and manipulative robots.
- Jan 09 2017 cs.RO arXiv:1701.01547v1 In many human-in-the-loop robotic applications such as robot-assisted surgery and remote teleoperation, predicting the intended motion of the human operator may be useful for successful implementation of shared control, guidance virtual fixtures, and predictive control. Developing computational models of human movements is a critical foundation for such motion prediction frameworks. With this motivation, we present a computational framework for modeling reaching movements in the presence of obstacles. We propose a stochastic optimal control framework that consists of probabilistic collision avoidance constraints and a cost function that trades off between effort and end-state variance in the presence of signal-dependent noise. First, we present a series of reformulations to convert the original non-linear and non-convex optimal control problem into a parametric quadratic programming problem. We show that the parameters can be tuned to model various collision avoidance strategies, thereby capturing the quintessential variability associated with human motion. Then, we present a simulation study that demonstrates the complex interaction between avoidance strategies, control cost, and the probability of collision avoidance. The proposed framework can benefit a variety of applications that require teleoperation in cluttered spaces, including robot-assisted surgery. In addition, it can also be viewed as a new optimizer which produces smooth and probabilistically safe trajectories under signal-dependent noise.
- User-given tags or labels are valuable resources for semantic understanding of visual media such as images and videos. Recently, a new type of labeling mechanism known as hash-tags has become increasingly popular on social media sites. In this paper, we study the problem of generating relevant and useful hash-tags for short video clips. Traditional data-driven approaches for tag enrichment and recommendation use direct visual similarity for label transfer and propagation. We attempt to learn a direct low-cost mapping from video to hash-tags using a two-step training process. We first employ a natural language processing (NLP) technique, skip-gram models with neural network training, to learn a low-dimensional vector representation of hash-tags (Tag2Vec) using a corpus of 10 million hash-tags. We then train an embedding function to map video features to the low-dimensional Tag2Vec space. We learn this embedding for 29 categories of short video clips with hash-tags. A query video without any tag information can then be directly mapped to the vector space of tags using the learned embedding, and relevant tags can be found by performing a simple nearest-neighbor retrieval in the Tag2Vec space. We validate the relevance of the tags suggested by our system qualitatively and quantitatively with a user study.
- We consider the Hypothesis Transfer Learning (HTL) problem where one incorporates a hypothesis trained on the source domain into the learning procedure of the target domain. Existing theoretical analysis either only studies specific algorithms or only presents upper bounds on the generalization error but not on the excess risk. In this paper, we propose a unified algorithm-dependent framework for HTL through a novel notion of transformation functions, which characterizes the relation between the source and the target domains. We conduct a general risk analysis of this framework and, in particular, we show for the first time that, if two domains are related, HTL enjoys faster convergence rates of excess risks for Kernel Smoothing and Kernel Ridge Regression than those of the classical non-transfer learning settings. We accompany this framework with an analysis of cross-validation for HTL to search for the best transfer technique and gracefully reduce to non-transfer learning when HTL is not helpful. Experiments on robotics and neural imaging data demonstrate the effectiveness of our framework.
- Dec 02 2016 cs.CV arXiv:1612.00144v2 Deep learning based landcover classification algorithms have recently been proposed in the literature. In hyperspectral images (HSI) they face the challenges of large dimensionality, spatial variability of spectral signatures and scarcity of labeled data. In this article we propose an end-to-end deep learning architecture that extracts band-specific spectral-spatial features and performs landcover classification. The architecture has fewer independent connection weights and thus requires fewer training samples. The method is found to outperform the highest reported accuracies on popular hyperspectral image data sets.
- We introduce the task of Visual Dialog, which requires an AI agent to hold a meaningful dialog with humans in natural, conversational language about visual content. Specifically, given an image, a dialog history, and a question about the image, the agent has to ground the question in the image, infer context from the history, and answer the question accurately. Visual Dialog is disentangled enough from a specific downstream task so as to serve as a general test of machine intelligence, while being grounded in vision enough to allow objective evaluation of individual responses and benchmark progress. We develop a novel two-person chat data-collection protocol to curate a large-scale Visual Dialog dataset (VisDial). Data collection is underway and on completion, VisDial will contain 1 dialog with 10 question-answer pairs on all ~200k images from COCO, with a total of 2M dialog question-answer pairs. We introduce a family of neural encoder-decoder models for Visual Dialog with 3 encoders (Late Fusion, Hierarchical Recurrent Encoder and Memory Network) and 2 decoders (generative and discriminative), which outperform a number of sophisticated baselines. We propose a retrieval-based evaluation protocol for Visual Dialog where the AI agent is asked to sort a set of candidate answers and is evaluated on metrics such as the mean reciprocal rank of the human response. We quantify the gap between machine and human performance on the Visual Dialog task via human studies. Our dataset, code, and trained models will be released publicly. Putting it all together, we demonstrate the first 'visual chatbot'!
- Nov 04 2016 cs.CL arXiv:1611.01083v1 In this paper, we find the meaning of words based on distinct situations. Word Sense Disambiguation is used to find the meaning of words based on live contexts using supervised and unsupervised approaches. Unsupervised approaches use an online dictionary for learning, and supervised approaches use manually built learning sets. Hand-tagged data are populated, which might not be effective or sufficient for the learning procedure. This limitation of information is the main flaw of the supervised approach. Our proposed approach aims to overcome this limitation using a learning set that is enriched dynamically with new data. A trivial filtering method is used to obtain appropriate training data. We introduce a mixed methodology combining the Modified Lesk approach and a Bag-of-Words approach whose bags are enriched using learning methods. Experiments show that our approach outperforms the individual Modified Lesk and Bag-of-Words approaches.
- Subspace clustering is the problem of partitioning unlabeled data points into a number of clusters so that data points within one cluster lie approximately on a low-dimensional linear subspace. In many practical scenarios, the dimensionality of the data points to be clustered is compressed due to constraints of measurement, computation or privacy. In this paper, we study the theoretical properties of a popular subspace clustering algorithm named sparse subspace clustering (SSC) and establish formal success conditions of SSC on dimensionality-reduced data. Our analysis applies to the most general fully deterministic model, where both the underlying subspaces and the data points within each subspace are deterministically positioned, and also to a wide range of dimensionality reduction techniques (e.g., Gaussian random projection, uniform subsampling, sketching) that fall into a subspace embedding framework (Meng & Mahoney, 2013; Avron et al., 2014). Finally, we apply our analysis to a differentially private SSC algorithm and establish both privacy and utility guarantees of the proposed method.
- From Traditional to Modern: Domain Adaptation for Action Classification in Short Social Video Clips. Oct 19 2016 cs.CV arXiv:1610.05613v1 Short internet video clips like vines present a significantly wild distribution compared to traditional video datasets. In this paper, we focus on the problem of unsupervised action classification in wild vines using traditional labeled datasets. To this end, we use a simple domain adaptation strategy based on data augmentation. We utilise the semantic word2vec space as a common subspace to embed video features from both the labeled source domain and the unlabelled target domain. Our method incrementally augments the labeled source with target samples and iteratively modifies the embedding function to bring the source and target distributions together. Additionally, we utilise a multi-modal representation that incorporates noisy semantic information available in the form of hash-tags. We show the effectiveness of this simple adaptation technique on a test set of vines and achieve notable improvements in performance.
- Detecting a small number of outliers from a set of data observations is always challenging. This problem is more difficult in the setting of multiple network samples, where computing the anomalous degree of a network sample is generally not sufficient. In fact, explaining why the network is exceptional, expressed in the form of a subnetwork, is equally important. In this paper, we develop a novel algorithm to address these two key problems. We treat each network sample as a potential outlier and identify the subnetworks that most discriminate it from nearby regular samples. The algorithm is developed in the framework of network regression combined with constraints on both network topology and L1-norm shrinkage to perform subnetwork discovery. Our method thus goes beyond subspace/subgraph discovery, and we show that it converges to a global optimum. Evaluation on various real-world network datasets demonstrates that our algorithm not only outperforms baselines in both network and high-dimensional settings, but also discovers highly relevant and interpretable local subnetworks, further enhancing our understanding of anomalous networks.
- Sep 28 2016 cs.DB arXiv:1609.08228v1 Reduction of end-to-end network delays is an optimization task with applications in multiple domains. Low delays enable improved information flow in social networks, quick spread of ideas in collaboration networks, low travel times for vehicles on road networks and increased rate of packets in the case of communication networks. Delay reduction can be achieved by both improving the propagation capabilities of individual nodes and adding additional edges in the network. One of the main challenges in such design problems is that the effects of local changes are not independent, and as a consequence, there is a combinatorial search-space of possible improvements. Thus, minimizing the cumulative propagation delay requires novel scalable and data-driven approaches. In this paper, we consider the problem of network delay minimization via node upgrades. Although the problem is NP-hard, we show that probabilistic approximation for a restricted version can be obtained. We design scalable and high-quality techniques for the general setting based on sampling and targeted to different models of delay distribution. Our methods scale almost linearly with the graph size and consistently outperform competitors in quality.
- Sep 20 2016 cs.CV arXiv:1609.05296v1 Liveliness detection acts as a safeguard against spoofing attacks. Most researchers have used vision-based techniques to detect the liveliness of the user, but these are highly sensitive to illumination effects. It is therefore very hard to design a system that works robustly under all circumstances. The literature shows that most research uses eye blinks or mouth movement to detect liveliness, while another group uses face texture to distinguish between real users and imposters. The classification performance of all these approaches decreases drastically in variable light conditions. Hence, in this paper we introduce a fuzzy expert system that is able to handle most of the cases that arise in real time. To evaluate the performance of the system, we used two testing conditions: (a) bad illumination and (b) little movement of the eyes and mouth of a real user. The system behaves well in both: in the first case its False Rejection Rate (FRR) is 0.28, and in the second case its FRR is 0.4.
- Recommendation and collaborative filtering systems are important in modern information and e-commerce applications. As these systems are becoming increasingly popular in the industry, their outputs could affect business decision making, introducing incentives for an adversarial party to compromise the availability or integrity of such systems. We introduce a data poisoning attack on collaborative filtering systems. We demonstrate how a powerful attacker with full knowledge of the learner can generate malicious data so as to maximize his/her malicious objectives, while at the same time mimicking normal user behavior to avoid being detected. While the complete knowledge assumption seems extreme, it enables a robust assessment of the vulnerability of collaborative filtering schemes to highly motivated attacks. We present efficient solutions for two popular factorization-based collaborative filtering algorithms: the alternative minimization formulation and the nuclear norm minimization method. Finally, we test the effectiveness of our proposed algorithms on real-world data and discuss potential defensive strategies.
- Aug 23 2016 cs.RO arXiv:1608.05829v1 We present Probabilistic Reciprocal Velocity Obstacle, or PRVO, as a general algorithm for navigating multiple robots under perception and motion uncertainty. PRVO is defined as the space of velocities that ensures dynamic collision avoidance between a pair of robots with a specified probability. Our approach is based on defining chance constraints over the inequalities defined by the deterministic Reciprocal Velocity Obstacle (RVO). The computational complexity of the proposed probabilistic RVO is comparable to that of its deterministic counterpart. This is achieved by a series of reformulations where we first substitute the computationally intractable chance constraints with a family of surrogate constraints and then adopt a time-scaling-based solution methodology to efficiently characterize their solution space. Further, we also show that the solution space of each member of the family of surrogate constraints can be mapped in closed form to the probability with which the original chance constraints are satisfied, and consequently to the probability of collision avoidance. We validate our formulations through numerical simulations where we highlight the importance of incorporating the effect of motion uncertainty and the advantages of PRVO over existing formulations, which handle the effect of uncertainty by using conservative bounding volumes.
- Aug 16 2016 cs.CE arXiv:1608.03990v3 A modeling paradigm is developed to augment predictive models of turbulence by effectively utilizing limited data generated from physical experiments. The key components of our approach involve inverse modeling to infer the spatial distribution of model discrepancies, and machine learning to reconstruct discrepancy information from a large number of inverse problems into corrective model forms. We apply the methodology to turbulent flows over airfoils involving flow separation. Model augmentations are developed for the Spalart-Allmaras (SA) model using adjoint-based full-field inference on experimentally measured lift coefficient data. When these model forms are reconstructed using neural networks (NN) and embedded within a standard solver, we show that much improved predictions in lift can be obtained for geometries and flow conditions that were not used to train the model. The NN-augmented SA model also predicts surface pressures extremely well. Portability of this approach is demonstrated by confirming that predictive improvements are preserved when the augmentation is embedded in a different commercial finite-element solver. The broader vision is that by incorporating data that can reveal the form of the innate model discrepancy, the applicability of data-driven turbulence models can be extended to more general flows.
- Jul 19 2016 cs.CV arXiv:1607.04673v4 This paper adapts a popular image quality measure called structural similarity for high precision registration based tracking while also introducing a simpler and faster variant of the same. Further, these are evaluated comprehensively against existing measures using a unified approach to study registration based trackers that decomposes them into three constituent sub modules - appearance model, state space model and search method. Several popular trackers in literature are broken down using this method so that their contributions - as of this paper - are shown to be limited to only one or two of these sub modules. An open source tracking framework is made available that follows this decomposition closely through extensive use of generic programming. It is used to perform all experiments on four publicly available datasets so the results are easily reproducible. This framework provides a convenient interface to plug in a new method for any sub module and combine it with existing methods for the other two. It can also serve as a fast and flexible solution for practical tracking needs due to its highly efficient implementation.
- Apr 05 2016 cs.SI arXiv:1604.00657v2 Do users from Carnegie Mellon University form social communities on Facebook? Do signal processing researchers from tightly collaborate with each other? Do Chinese restaurants in Manhattan cluster together? These seemingly different problems share a common structure: an attribute that may be localized on a graph. In other words, nodes activated by an attribute form a subgraph that can be easily separated from other nodes. In this paper, we thus focus on the task of detecting localized attributes on a graph. We are particularly interested in categorical attributes such as attributes in online social networks, ratings in recommender systems and viruses in cyber-physical systems because they are widely used in numerous data mining applications. To solve the task, we formulate a statistical hypothesis testing problem to decide whether a given attribute is localized or not. We propose two statistics: graph wavelet statistic and graph scan statistic, both of which are provably effective in detecting localized attributes. We validate the robustness of the proposed statistics on both simulated data and two real-world applications: high air-pollution detection and keyword ranking in a co-authorship network collected from IEEE Xplore. Experimental results show that the proposed graph wavelet statistic and graph scan statistic are effective and efficient.
- This paper proposes a novel selective autoencoder approach within the framework of deep convolutional networks. The crux of the idea is to train a deep convolutional autoencoder to suppress undesired parts of an image frame while allowing the desired parts resulting in efficient object detection. The efficacy of the framework is demonstrated on a critical plant science problem. In the United States, approximately $1 billion is lost per annum due to a nematode infection on soybean plants. Currently, plant-pathologists rely on labor-intensive and time-consuming identification of Soybean Cyst Nematode (SCN) eggs in soil samples via manual microscopy. The proposed framework attempts to significantly expedite the process by using a series of manually labeled microscopic images for training followed by automated high-throughput egg detection. The problem is particularly difficult due to the presence of a large population of non-egg particles (disturbances) in the image frames that are very similar to SCN eggs in shape, pose and illumination. Therefore, the selective autoencoder is trained to learn unique features related to the invariant shapes and sizes of the SCN eggs without handcrafting. After that, a composite non-maximum suppression and differencing is applied at the post-processing stage.
- In this paper, we develop the theory for constructing DNA cyclic codes of odd length over $R=\mathbb{Z}_4[u]/\langle u^2-1 \rangle$ based on the deletion distance. First, we relate DNA pairs to a special set of 16 elements of the ring $R$. Cyclic codes of odd length over $R$ that satisfy the reverse constraint and the reverse-complement constraint are discussed. We also study the $GC$-content of these codes and their deletion distance. The paper concludes with some examples of cyclic DNA codes, with their $GC$-content and respective deletion distances.
- Mar 07 2016 cs.CV arXiv:1603.01292v2 This paper presents a new way to study registration based trackers by decomposing them into three constituent sub modules: appearance model, state space model and search method. It is often the case that when a new tracker is introduced in literature, it only contributes to one or two of these sub modules while using existing methods for the rest. Since these are often selected arbitrarily by the authors, they may not be optimal for the new method. In such cases, our breakdown can help to experimentally find the best combination of methods for these sub modules while also providing a framework within which the contributions of the new tracker can be clearly demarcated and thus studied better. We show how existing trackers can be broken down using the suggested methodology and compare the performance of the default configuration chosen by the authors against other possible combinations to demonstrate the new insights that can be gained by such an approach. We also present an open source system that provides a convenient interface to plug in a new method for any sub module and test it against all possible combinations of methods for the other two sub modules while also serving as a fast and efficient solution for practical tracking requirements.
- This paper presents a modular, extensible and highly efficient open source framework for registration based tracking targeted at robotics applications. It is implemented entirely in C++ and is designed from the ground up to easily integrate with systems that support any of several major vision and robotics libraries including OpenCV, ROS, ViSP and Eigen. It is also faster and more precise than other existing systems. To establish the theoretical basis for its design, a new way to conceptualize registration based trackers is introduced that decomposes them into three constituent sub modules - Search Method, Appearance Model and State Space Model. In the process, the seminal work by Baker & Matthews is extended with several important advances since its publication. In addition to being a practical solution for fast and high precision tracking, this system can also serve as a useful research tool by allowing existing and new methods for any of the sub modules to be studied better. When a new method is introduced for one of these, the breakdown can help to experimentally find the combination of methods for the others that is optimum for it. By extensive use of generic programming, the system makes it easy to plug in a new method for any of the sub modules so that it can not only be tested comprehensively with existing methods but also become immediately available for deployment in any project that uses the framework.
- Most data for evaluating and training recommender systems is subject to selection biases, either through self-selection by the users or through the actions of the recommendation system itself. In this paper, we provide a principled approach to handling selection biases, adapting models and estimation techniques from causal inference. The approach leads to unbiased performance estimators despite biased data, and to a matrix factorization method that provides substantially improved prediction performance on real-world data. We theoretically and empirically characterize the robustness of the approach, finding that it is highly practical and scalable.
- Feb 16 2016 cs.NI arXiv:1602.04400v1 New data intensive applications, which are continuously emerging in daily routines of mobile devices, significantly increase the demand for data, and pose a challenge for current wireless networks due to scarce resources. Although bandwidth is traditionally considered as the primary scarce resource in wireless networks, developments in communication theory shift the focus from bandwidth to other scarce resources including processing power and energy. In particular, in device-to-device networks, where data rates are increasing rapidly, processing power and energy are becoming the primary bottlenecks of the network. Thus, it is crucial to develop new networking mechanisms by taking into account the processing power and energy as bottlenecks. In this paper, we develop an energy-aware cooperative computation framework for mobile devices. In this setup, a group of cooperative mobile devices, within proximity of each other, (i) use their cellular or Wi-Fi (802.11) links as their primary networking interfaces, and (ii) exploit their device-to-device connections (e.g., Wi-Fi Direct) to overcome processing power and energy bottlenecks. We evaluate our energy-aware cooperative computation framework on a testbed consisting of smartphones and tablets, and we show that it brings significant performance benefits.
- Modeling information that resides on vertices of large graphs is a key problem in several real-life applications, ranging from social networks to the Internet-of-things. Signal Processing on Graphs and, in particular, graph wavelets can exploit the intrinsic smoothness of these datasets in order to represent them in a both compact and accurate manner. However, how to discover wavelet bases that capture the geometry of the data with respect to the signal as well as the graph structure remains an open question. In this paper, we study the problem of computing graph wavelet bases via sparse cuts in order to produce low-dimensional encodings of data-driven bases. This problem is connected to known hard problems in graph theory (e.g. multiway cuts) and thus requires an efficient heuristic. We formulate the basis discovery task as a relaxation of a vector optimization problem, which leads to an elegant solution as a regularized eigenvalue computation. Moreover, we propose several strategies in order to scale our algorithm to large graphs. Experimental results show that the proposed algorithm can effectively encode both the graph structure and signal, producing compressed and accurate representations for vertex values in a wide range of datasets (e.g. sensor and gene networks) and significantly outperforming the best baseline.
- When data analysts train a classifier and check if its accuracy is significantly different from random guessing, they are implicitly and indirectly performing a hypothesis test (two sample testing) and it is of importance to ask whether this indirect method for testing is statistically optimal or not. Given that hypothesis tests attempt to maximize statistical power subject to a bound on the allowable false positive rate, while prediction attempts to minimize statistical risk on future predictions on unseen data, we wish to study whether a predictive approach for an ultimate aim of testing is prudent. We formalize this problem by considering the two-sample mean-testing setting where one must determine if the means of two Gaussians (with known and equal covariance) are the same or not, but the analyst indirectly does so by checking whether the accuracy achieved by Fisher's LDA classifier is significantly different from chance or not. Unexpectedly, we find that the asymptotic power of LDA's sample-splitting classification accuracy is actually minimax rate-optimal in terms of problem-dependent parameters. Since prediction is commonly thought to be harder than testing, it might come as a surprise to some that solving a harder problem does not create an information-theoretic bottleneck for the easier one. On the flip side, even though the power is rate-optimal, our derivation suggests that it may be worse by a small constant factor; hence practitioners must be wary of using (admittedly flexible) prediction methods on disguised testing problems.
- The problem of learning the structure of a high dimensional graphical model from data has received considerable attention in recent years. In many applications such as sensor networks and proteomics it is often expensive to obtain samples from all the variables involved simultaneously. For instance, this might involve the synchronization of a large number of sensors or the tagging of a large number of proteins. To address this important issue, we initiate the study of a novel graphical model selection problem, where the goal is to optimize the total number of scalar samples obtained by allowing the collection of samples from only subsets of the variables. We propose a general paradigm for graphical model selection where feedback is used to guide the sampling to high degree vertices, while obtaining only a few samples from the ones with low degree. We instantiate this framework with two specific active learning algorithms, one of which makes mild assumptions but is computationally expensive, while the other is more computationally efficient but requires stronger (nevertheless standard) assumptions. Whereas the sample complexity of passive algorithms is typically a function of the maximum degree of the graph, we show that the sample complexity of our algorithms is provably smaller and that it depends on a novel local complexity measure that is akin to the average degree of the graph. We finally demonstrate the efficacy of our framework via simulations.
- Jan 27 2016 cs.SI physics.soc-ph arXiv:1601.07108v1 Understanding network structure and finding influential nodes are challenging problems in large networks. Identifying the most influential nodes in a network can be useful in many applications, such as immunization of nodes in the case of epidemic spreading or during intentional attacks on complex networks. Much research has gone into devising centrality measures that can efficiently identify the most influential nodes in a network. There are two major approaches to the problem: on one hand, deterministic strategies exploit knowledge of the overall network topology to find the influential nodes; on the other, random strategies are completely agnostic about the network structure. Centrality measures that can deal with limited knowledge of the network structure are required. Indeed, in practice, information about the global structure of the overall network is rarely available or is hard to acquire. Even when it is available, the network may be so large that calculating global centrality measures is too computationally expensive. To that end, a centrality measure is proposed that requires information only at the community level to identify the influential nodes in the network. Indeed, most real-world networks exhibit a community structure that can be exploited efficiently to discover the influential nodes. We compared prominent global deterministic strategies and stochastic strategies with an existing deterministic community-based strategy and the proposed one. The effectiveness of the proposed method is evaluated through experiments on synthetic and real-world networks with community structure, for the case of immunization of nodes for epidemic control.
- Jan 26 2016 cs.DB arXiv:1601.06316v2 Recent spatio-temporal data applications, such as car-sharing and smart cities, impose new challenges regarding the scalability and timeliness of data processing systems. Trajectory compression is a promising approach for scaling up spatio-temporal databases. However, existing techniques fail to address the online setting, in which a compressed version of a trajectory stream has to be maintained over time. In this paper, we introduce ONTRAC, a new framework for map-matched online trajectory compression. ONTRAC learns prediction models for suppressing updates to a trajectory database using training data. Two prediction schemes are proposed, one for road segments via a Markov model and another for travel-times by combining Quadratic Programming and Expectation Maximization. Experiments show that ONTRAC outperforms the state-of-the-art offline technique even when long update delays (4 minutes) are allowed and achieves a compression ratio up to 21 times higher for travel-times. Moreover, our approach increases database scalability by up to one order of magnitude.
- Linear independence testing is a fundamental information-theoretic and statistical problem that can be posed as follows: given $n$ points $\{(X_i,Y_i)\}^n_{i=1}$ from a $p+q$ dimensional multivariate distribution where $X_i \in \mathbb{R}^p$ and $Y_i \in\mathbb{R}^q$, determine whether $a^T X$ and $b^T Y$ are uncorrelated for every $a \in \mathbb{R}^p, b\in \mathbb{R}^q$ or not. We give a minimax lower bound for this problem (when $p+q,n \to \infty$, $(p+q)/n \leq \kappa < \infty$, without sparsity assumptions). In summary, our results imply that $n$ must be at least as large as $\sqrt {pq}/\|\Sigma_{XY}\|_F^2$ for any procedure (test) to have non-trivial power, where $\Sigma_{XY}$ is the cross-covariance matrix of $X,Y$. We also provide some evidence that the lower bound is tight, by connections to two-sample testing and regression in specific settings.
- Jan 19 2016 cs.DC arXiv:1601.04675v1 This volume represents the proceedings of the 2nd International Workshop on Dynamic Resource Allocation and Management in Embedded, High Performance and Cloud Computing (DREAMCloud 2016), co-located with HiPEAC 2016 on 19th January 2016 in Prague, Czech Republic.
- In this paper, the use of automotive use cases as benchmarks for real-time system design is proposed. The use cases are described in a format supported by the AMALTHEA platform, a model-based open source development environment for automotive multi-core systems. An example of a simple Electronic Control Unit is analysed and presented in enough detail to reconstruct the system in any format. For researchers willing to use the AMALTHEA file format directly, an appropriate parser has been developed and made available. An example of applying this parser and benchmark to optimise makespan without violating timing constraints, by allocating functionality to different Network on Chip resources, is demonstrated.
- Let $R=\mathbb{F}_2+u\mathbb{F}_2+u^2\mathbb{F}_2$ be a non-chain finite commutative ring, where $u^3=u$. In this paper, we mainly study the construction of quantum codes from cyclic codes over $R$. We obtain self-orthogonal codes over $\mathbb{F}_2$ as Gray images of linear and cyclic codes over $R$. The parameters of the quantum codes obtained from cyclic codes over $R$ are discussed.
- We derive computationally tractable methods to select a small subset of experiment settings from a large pool of given design points. The primary focus is on linear regression models, while the technique extends to generalized linear models and Delta's method (estimating functions of linear regression models) as well. The algorithms are based on a continuous relaxation of an otherwise intractable combinatorial optimization problem, with sampling or greedy procedures as post-processing steps. Formal approximation guarantees are established for both algorithms, and numerical results on both synthetic and real-world data confirm the effectiveness of the proposed methods.
- Advanced Driver Assistance Systems (ADAS) have made driving safer over the last decade. They prepare vehicles for unsafe road conditions and alert drivers if they perform a dangerous maneuver. However, many accidents are unavoidable because by the time drivers are alerted, it is already too late. Anticipating maneuvers beforehand can alert drivers before they perform the maneuver and also give ADAS more time to avoid or prepare for the danger. In this work we propose a vehicular sensor-rich platform and learning algorithms for maneuver anticipation. For this purpose we equip a car with cameras, Global Positioning System (GPS), and a computing device to capture the driving context from both inside and outside of the car. In order to anticipate maneuvers, we propose a sensory-fusion deep learning architecture which jointly learns to anticipate and fuse multiple sensory streams. Our architecture consists of Recurrent Neural Networks (RNNs) that use Long Short-Term Memory (LSTM) units to capture long temporal dependencies. We propose a novel training procedure which allows the network to predict the future given only a partial temporal context. We introduce a diverse data set with 1180 miles of natural freeway and city driving, and show that we can anticipate maneuvers 3.5 seconds before they occur in real-time with a precision and recall of 90.5\% and 87.4\% respectively.
- Dec 22 2015 cs.LG arXiv:1512.06173v1 Data mining practitioners are facing challenges from data with network structure. In this paper, we address a specific class of global-state networks which comprises a set of network instances sharing a similar structure yet having different values at local nodes. Each instance is associated with a global state which indicates the occurrence of an event. The objective is to uncover a small set of discriminative subnetworks that can optimally classify global network values. Unlike most existing studies which explore an exponential subnetwork space, we address this difficult problem by adopting a space transformation approach. Specifically, we present an algorithm that optimizes a constrained dual-objective function to learn a low-dimensional subspace that is capable of discriminating networks labelled by different global states, while reconciling with common network topology shared across instances. Our algorithm takes an appealing approach from spectral graph learning and we show that the globally optimum solution can be achieved via matrix eigen-decomposition.
- Dec 22 2015 cs.LG arXiv:1512.06430v1 Churn prediction, or the task of identifying customers who are likely to discontinue use of a service, is an important and lucrative concern of firms in many different industries. As these firms collect an increasing amount of large-scale, heterogeneous data on the characteristics and behaviors of customers, new methods become possible for predicting churn. In this paper, we present a unified analytic framework for detecting the early warning signs of churn, and assigning a "Churn Score" to each customer that indicates the likelihood that the particular individual will churn within a predefined amount of time. This framework employs a brute force approach to feature engineering, then winnows the set of relevant attributes via feature selection, before feeding the final feature-set into a suite of supervised learning algorithms. Using several terabytes of data from a large mobile phone network, our method identifies several intuitive - and a few surprising - early warning signs of churn, and our best model predicts whether a subscriber will churn with 89.4% accuracy.
- We present a framework for representing and modeling data on graphs. Based on this framework, we study three typical classes of graph signals: smooth graph signals, piecewise-constant graph signals, and piecewise-smooth graph signals. For each class, we provide an explicit definition of the graph signals and construct a corresponding graph dictionary with desirable properties. We then study how such a graph dictionary works in two standard tasks: approximation and sampling followed by recovery, both from theoretical as well as algorithmic perspectives. Finally, for each class, we present a case study of a real-world problem by using the proposed methodology.
- This paper builds theoretical foundations for the recovery of a newly proposed class of smooth graph signals, approximately bandlimited graph signals, under three sampling strategies: uniform sampling, experimentally designed sampling and active sampling. We then state minimax lower bounds on the maximum risk for the approximately bandlimited class under these three sampling strategies and show that active sampling cannot fundamentally outperform experimentally designed sampling. We propose a recovery strategy to compare uniform sampling with experimentally designed sampling. As the proposed recovery strategy lends itself well to statistical analysis, we derive the exact mean square error for each sampling strategy. To study convergence rates, we introduce two types of graphs and find that (1) the proposed recovery strategy achieves the optimal rates; and (2) the experimentally designed sampling fundamentally outperforms uniform sampling for Type-2 class of graphs. To validate our proposed recovery strategy, we test it on five specific graphs: a ring graph with $k$ nearest neighbors, an Erdős-Rényi graph, a random geometric graph, a small-world graph and a power-law graph and find that experimental results match the proposed theory well. This work also presents a comprehensive explanation for when and why sampling for semi-supervised learning with graphs works.
- Computer system monitoring generates huge amounts of logs that record the interaction of system entities. How to query such data to better understand system behaviors and identify potential system risks and malicious behaviors becomes a challenging task for system administrators due to the dynamics and heterogeneity of the data. System monitoring data are essentially heterogeneous temporal graphs with nodes being system entities and edges being their interactions over time. Given the complexity of such graphs, it becomes time-consuming for system administrators to manually formulate useful queries in order to examine abnormal activities, attacks, and vulnerabilities in computer systems. In this work, we investigate how to query temporal graphs and treat query formulation as a discriminative temporal graph pattern mining problem. We introduce TGMiner to mine discriminative patterns from system logs, and these patterns can be taken as templates for building more complex queries. TGMiner leverages temporal information in graphs to prune graph patterns that share similar growth trend without compromising pattern quality. Experimental results on real system data show that TGMiner is 6-32 times faster than baseline methods. The discovered patterns were verified by system experts; they achieved high precision (97%) and recall (91%).
- In this paper, we mainly study some structural properties of cyclic DNA codes of odd length over the ring $R = \mathbb{F}_2[u,v]/\langle u^2-1,v^3-v,uv-vu \rangle$, which play an important role in DNA computing. We establish a direct link between the elements of the ring $R$ and the 64 codons by introducing a Gray map from $R$ to $R_1 = \mathbb{F}_2 + u\mathbb{F}_2$, $u^2 = 1$, where $R_1$ is a ring with four elements. Codes over $R$ and $R_1$ satisfying the reverse constraint and the reverse-complement constraint are studied. The binary images of the cyclic codes over $R$ are also studied. The paper concludes with some examples of DNA codes obtained via the Gray map.
- Nov 11 2015 cs.LG arXiv:1511.02900v1 In this paper, we investigate whether "big-data" is more valuable than "precise" data for the problem of energy disaggregation: the process of breaking down aggregate energy usage on a per-appliance basis. Existing techniques for disaggregation rely on energy metering at a resolution of 1 minute or higher, but most power meters today only provide a reading once per month, and at most once every 15 minutes. In this paper, we propose a new technique called Neighbourhood NILM that leverages data from 'neighbouring' homes to disaggregate energy given only a single energy reading per month. The key intuition behind our approach is that 'similar' homes have 'similar' energy consumption on a per-appliance basis. Neighbourhood NILM matches every home with a set of 'neighbours' that have direct submetering infrastructure, i.e. power meters on individual circuits or loads. Many such homes already exist. Then, it estimates the appliance-level energy consumption of the target home to be the average of its K neighbours. We evaluate this approach using 25 homes and results show that our approach gives comparable or better disaggregation in comparison to state-of-the-art accuracy reported in the literature that depend on manual model training, high frequency power metering, or both. Results show that Neighbourhood NILM can achieve 83% and 79% accuracy disaggregating fridge and heating/cooling loads, compared to 74% and 73% for a technique called FHMM. Furthermore, it achieves up to 64% accuracy on washing machine, dryer, dishwasher, and lighting loads, which is higher than previously reported results. Many existing techniques are not able to disaggregate these loads at all. These results indicate a potentially substantial advantage to installing submetering infrastructure in a select few homes rather than installing new high-frequency smart metering infrastructure in all homes.
- The rapidly increasing number of mobile devices, voluminous data, and higher data rates are pushing us to rethink the current generation of cellular mobile communication. The next or fifth generation (5G) cellular networks are expected to meet high-end requirements. The 5G networks are broadly characterized by three unique features: ubiquitous connectivity, extremely low latency, and very high-speed data transfer. The 5G networks would provide novel architectures and technologies beyond state-of-the-art architectures and technologies. In this paper, our intent is to find an answer to the question: "what will be done by 5G and how?" We investigate and discuss serious limitations of the fourth generation (4G) cellular networks and corresponding new features of 5G networks. We identify challenges in 5G networks, new technologies for 5G networks, and present a comparative study of the proposed architectures that can be categorized on the basis of energy-efficiency, network hierarchy, and network types. Interestingly, the implementation issues, e.g., interference, QoS, handoff, security-privacy, channel access, and load balancing, greatly affect the realization of 5G networks. Furthermore, our illustrations highlight the feasibility of these models through an evaluation of existing real experiments and testbeds.
- Oct 30 2015 cs.LG arXiv:1510.08713v1 Since the early 1980s, the research community has developed ever more sophisticated algorithms for the problem of energy disaggregation, but despite decades of research, there is still a dearth of applications with demonstrated value. In this work, we explore a question that is highly pertinent to this research community: how good does energy disaggregation need to be in order to infer characteristics of a household? We present novel techniques that use unsupervised energy disaggregation to predict both household occupancy and static properties of the household such as size of the home and number of occupants. Results show that basic disaggregation approaches perform up to 30% better at occupancy estimation than using aggregate power data alone, and are up to 10% better at estimating static household characteristics. These results show that even rudimentary energy disaggregation techniques are sufficient for improved inference of household characteristics. To conclude, we re-evaluate the bar set by the community for energy disaggregation accuracy and try to answer the question "how good is good enough?"
- In this paper we present a distributed clustering protocol for mobile wireless sensor networks. A large majority of research in clustering and routing algorithms for WSNs assumes a static network and hence is rendered inefficient in cases of highly mobile sensor networks, which is an aspect addressed here. MECP is an energy efficient, mobility aware protocol and utilizes information about movement of sensor nodes and residual energy as attributes in network formation. It also provides a mechanism for fault tolerance to decrease packet data loss in case of cluster head failures.
- Analysis of opinion dynamics in social networks plays an important role in today's life. For applications such as predicting users' political preference, it is particularly important to be able to analyze the dynamics of competing opinions. While observing the evolution of polar opinions of a social network's users over time, can we tell when the network "behaved" abnormally? Furthermore, can we predict how the opinions of the users will change in the future? Do opinions evolve according to existing network opinion dynamics models? To answer such questions, it is not sufficient to study individual user behavior, since opinions can spread far beyond users' egonets. We need a method to analyze opinion dynamics of all network users simultaneously and capture the effect of individuals' behavior on the global evolution pattern of the social network. In this work, we introduce Social Network Distance (SND) - a distance measure that quantifies the "cost" of evolution of one snapshot of a social network into another snapshot under various models of polar opinion propagation. SND has a rich semantics of a transportation problem, yet, is computable in time linear in the number of users, which makes SND applicable to the analysis of large-scale online social networks. In our experiments with synthetic and real-world Twitter data, we demonstrate the utility of our distance measure for anomalous event detection. It achieves a true positive rate of 0.83, twice as high as that of alternatives. When employed for opinion prediction in Twitter, our method's accuracy is 75.63%, which is 7.5% higher than that of the next best method. Source Code: https://cs.ucsb.edu/~victor/pub/ucsb/dbl/snd/
- Sep 23 2015 physics.soc-ph cs.SI arXiv:1509.06633v1 Spectral algorithms based on matrix representations of networks are often used to detect communities but classic spectral methods based on the adjacency matrix and its variants fail to detect communities in sparse networks. New spectral methods based on non-backtracking random walks have recently been introduced that successfully detect communities in many sparse networks. However, the spectrum of non-backtracking random walks ignores hanging trees in networks that can contain information about the community structure of networks. We introduce the reluctant backtracking operators that explicitly account for hanging trees as they admit a small probability of returning to the immediately previous node unlike the non-backtracking operators that forbid an immediate return. We show that the reluctant backtracking operators can detect communities in certain sparse networks where the non-backtracking operators cannot while performing comparably on benchmark stochastic block model networks and real world networks. We also show that the spectrum of the reluctant backtracking operator approximately optimises the standard modularity function similar to the flow matrix. Interestingly, for this family of non- and reluctant-backtracking operators the main determinant of performance on real-world networks is whether or not they are normalised to conserve probability at each node.
- Anticipating the future actions of a human is a widely studied problem in robotics that requires spatio-temporal reasoning. In this work we propose a deep learning approach for anticipation in sensory-rich robotics applications. We introduce a sensory-fusion architecture which jointly learns to anticipate and fuse information from multiple sensory streams. Our architecture consists of Recurrent Neural Networks (RNNs) that use Long Short-Term Memory (LSTM) units to capture long temporal dependencies. We train our architecture in a sequence-to-sequence prediction manner, and it explicitly learns to predict the future given only a partial temporal context. We further introduce a novel loss layer for anticipation which prevents over-fitting and encourages early anticipation. We use our architecture to anticipate driving maneuvers several seconds before they happen on a natural driving data set of 1180 miles. The context for maneuver anticipation comes from multiple sensors installed on the vehicle. Our approach shows significant improvement over the state-of-the-art in maneuver anticipation by increasing the precision from 77.4% to 90.5% and recall from 71.2% to 87.4%.
- In this paper, we study the theory for constructing DNA cyclic codes of odd length over $\Z_4[u]/\langle u^2 \rangle$, which play an important role in DNA computing. Cyclic codes of odd length over $\Z_4 + u \Z_4$ satisfying the reverse constraint and the reverse-complement constraint are studied. The structure and existence of such codes are also studied. The paper concludes with some DNA examples obtained via the family of cyclic codes.
- Nonparametric two sample testing is a decision theoretic problem that involves identifying differences between two random variables without making parametric assumptions about their underlying distributions. We refer to the most common settings as mean difference alternatives (MDA), for testing differences only in first moments, and general difference alternatives (GDA), which is about testing for any difference in distributions. A large number of test statistics have been proposed for both these settings. This paper connects three classes of statistics - high dimensional variants of Hotelling's t-test, statistics based on Reproducing Kernel Hilbert Spaces, and energy statistics based on pairwise distances. We ask the question: how much statistical power do popular kernel and distance based tests for GDA have when the unknown distributions differ in their means, compared to specialized tests for MDA? We formally characterize the power of popular tests for GDA like the Maximum Mean Discrepancy with the Gaussian kernel (gMMD) and bandwidth-dependent variants of the Energy Distance with the Euclidean norm (eED) in the high-dimensional MDA regime. Some practically important properties include (a) eED and gMMD have asymptotically equal power; furthermore they enjoy a free lunch because, while they are additionally consistent for GDA, they also have the same power as specialized high-dimensional t-test variants for MDA. All these tests are asymptotically optimal (including matching constants) under MDA for spherical covariances, according to simple lower bounds, (b) The power of gMMD is independent of the kernel bandwidth, as long as it is larger than the choice made by the median heuristic, (c) There is a clear and smooth computation-statistics tradeoff for linear-time, subquadratic-time and quadratic-time versions of these tests, with more computation resulting in higher power.
- Cyclic codes over R have been introduced recently. In this paper, we study the cyclic codes over R and their $\Z_2$ image. Making use of algebraic structure, we find some good $\Z_2$ codes of length 28.
- Cognitive radio networks are an emerging wireless communication and computing paradigm. The cognitive radio nodes execute computations on multiple heterogeneous channels in the absence of licensed users (a.k.a. primary users) of those bands. Termination detection is a fundamental and non-trivial problem in distributed systems. In this paper, we propose a termination detection protocol for multi-hop cognitive radio networks where the cognitive radio nodes are allowed to tune to channels that are not currently occupied by primary users and to move to different locations during the protocol execution. The proposed protocol applies a credit distribution and aggregation approach and maintains a new kind of logical structure, called the virtual tree-like structure. The virtual tree-like structure helps in decreasing the latency involved in announcing termination. Unlike conventional tree structures, the virtual tree-like structure does not require a specific node to act as the root node that has to stay involved in the computation until termination announcement; hence, the root node may become idle soon after finishing its computation. Also, the protocol is able to detect the presence of licensed users and announce strong or weak termination, whichever is possible.
- The purpose of this paper is to study the cyclic self orthogonal codes over $\mathbb{Z}_{p^m}$. After providing the generator polynomial of cyclic self orthogonal codes over $\mathbb{Z}_{p^m}$, we give the necessary and sufficient condition for the existence of non-trivial self orthogonal codes over $\mathbb{Z}_{p^m}$. We have also provided the number of such codes of length $n$ over $\mathbb{Z}_{p^m}$ for any $(p,n) = 1$.
- This paper studies the problem of estimating the covariance of a collection of vectors using only extremely compressed measurements of each vector. An estimator based on back-projections of these compressive samples is proposed and analyzed. A distribution-free analysis shows that by observing just a single compressive measurement of each vector, one can consistently estimate the covariance matrix, in both infinity and spectral norm, and this same analysis leads to precise rates of convergence in both norms. Via information-theoretic techniques, lower bounds showing that this estimator is minimax-optimal for both infinity and spectral norm estimation problems are established. These results are also specialized to give matching upper and lower bounds for estimating the population covariance of a collection of Gaussian vectors, again in the compressive measurement model. The analysis conducted in this paper shows that the effective sample complexity for this problem is scaled by a factor of $m^2/d^2$ where $m$ is the compression dimension and $d$ is the ambient dimension. Applications to subspace learning (Principal Components Analysis) and learning over distributed sensor networks are also discussed.
- This work explores the rate-reliability-complexity limits of the quasi-static K-user multiple access channel (MAC), with or without feedback. Using high-SNR asymptotics, the work first derives bounds on the computational resources required to achieve near-optimal (ML-based) decoding performance. It then bounds the (reduced) complexity needed to achieve any (including suboptimal) diversity-multiplexing tradeoff (DMT) performance, and finally bounds the same complexity in the presence of feedback-aided user selection. This latter effort reveals the ability of a few bits of feedback not only to improve performance, but also to reduce complexity. In this context, the analysis reveals the interesting finding that proper calibration of user selection can allow for near-optimal ML-based decoding, with complexity that need not scale exponentially in the total number of codeword bits. The derived bounds constitute the best known performance-vs-complexity behavior to date for ML-based MAC decoding, as well as a first exploration of the complexity-feedback-performance interdependencies in multiuser settings.
- We consider the problem of matrix column subset selection, which selects a subset of columns from an input matrix such that the input can be well approximated by the span of the selected columns. Column subset selection has been applied to numerous real-world data applications such as population genetics summarization, electronic circuits testing and recommendation systems. In many applications the complete data matrix is unavailable and one needs to select representative columns by inspecting only a small portion of the input matrix. In this paper we propose the first provably correct column subset selection algorithms for partially observed data matrices. Our proposed algorithms exhibit different merits and drawbacks in terms of statistical accuracy, computational efficiency, sample complexity and sampling schemes, which provides a nice exploration of the tradeoff between these desired properties for column subset selection. The proposed methods employ the idea of feedback driven sampling and are inspired by several sampling schemes previously introduced for low-rank matrix approximation tasks [DMM08, FKV04, DV06, KS14]. Our analysis shows that, under the assumption that the input data matrix has incoherent rows but possibly coherent columns, all algorithms provably converge to the best low-rank approximation of the original data as the number of selected columns increases. Furthermore, two of the proposed algorithms enjoy a relative error bound, which is preferred for column subset selection and matrix approximation purposes. We also demonstrate through both theoretical and empirical analysis the power of feedback driven sampling compared to uniform random sampling on input matrices with highly correlated columns.
- In active learning, the user sequentially chooses values for feature $X$ and an oracle returns the corresponding label $Y$. In this paper, we consider the effect of feature noise in active learning, which could arise either because $X$ itself is being measured, or it is corrupted in transmission to the oracle, or the oracle returns the label of a noisy version of the query point. In statistics, feature noise is known as "errors in variables" and has been studied extensively in non-active settings. However, the effect of feature noise in active learning has not been studied before. We consider the well-known Berkson errors-in-variables model with additive uniform noise of width $\sigma$. Our simple but revealing setting is that of one-dimensional binary classification, where the goal is to learn a threshold (the point where the probability of a $+$ label crosses half). We deal with regression functions that are antisymmetric in a region of size $\sigma$ around the threshold and also satisfy Tsybakov's margin condition around the threshold. We prove minimax lower and upper bounds which demonstrate that when $\sigma$ is smaller than the minimax active/passive noiseless error derived in \cite{CN07}, then noise has no effect on the rates and one achieves the same noiseless rates. For larger $\sigma$, the \textit{unflattening} of the regression function on convolution with uniform noise, along with its local antisymmetry around the threshold, together yield a behaviour where noise \textit{appears} to be beneficial. Our key result is that active learning can buy significant improvement over a passive strategy even in the presence of feature noise.
- Interesting theoretical associations have been established by recent papers between the fields of active learning and stochastic convex optimization due to the common role of feedback in sequential querying mechanisms. In this paper, we continue this thread in two parts by exploiting these relations for the first time to yield novel algorithms in both fields, further motivating the study of their intersection. First, inspired by a recent optimization algorithm that was adaptive to unknown uniform convexity parameters, we present a new active learning algorithm for one-dimensional thresholds that can yield minimax rates by adapting to unknown noise parameters. Next, we show that one can perform $d$-dimensional stochastic minimization of smooth uniformly convex functions when only granted oracle access to noisy gradient signs along any coordinate instead of real-valued gradients, by using a simple randomized coordinate descent procedure where each line search can be solved by $1$-dimensional active learning, provably achieving the same error convergence rate as having the entire real-valued gradient. Combining these two parts yields an algorithm that solves stochastic convex optimization of uniformly convex and smooth functions using only noisy gradient signs by repeatedly performing active learning, achieves optimal rates and is adaptive to all unknown convexity and smoothness parameters.
- Density mode clustering is a nonparametric clustering method. The clusters are the basins of attraction of the modes of a density estimator. We study the risk of mode-based clustering. We show that the clustering risk over the cluster cores --- the regions where the density is high --- is very small even in high dimensions. And under a low noise condition, the overall cluster risk is small even beyond the cores, in high dimensions.
- Apr 30 2015 cs.SE arXiv:1504.07713v1 Code cloning is an important software engineering aspect. It is a common software reuse principle that consists of duplicating source code within a program or across different systems owned or maintained by the same entity. There are several contradictory claims concerning the impact of cloning on software stability and maintenance effort. Some papers state that cloning is desired since it speeds up the development process and helps stakeholders meet tight schedules and deliver on time. Other papers argue that code cloning leads to code bloat and increases software maintenance costs due to copied defects and dead code. In this paper, we replicate a previous study on cloning done by the original author. We repeat his work using the same methods and metrics but with different subjects and experimenters. The paper we are addressing evaluates the impact of code cloning on code stability using three different stability-measuring methods. Our team will apply the same stability measurement techniques to a different software system developed in the C programming language to determine generalizability, assure that the results are reliable, validate their outcomes, and inspire new research by combining previous findings from related studies.