results for au:Singh_A in:cs

- Rankings are ubiquitous in the online world today. As we have transitioned from finding books in libraries to ranking products, jobs, job applicants, opinions and potential romantic partners, there is a substantial precedent that ranking systems have a responsibility not only to their users but also to the items being ranked. To address these often conflicting responsibilities, we propose a conceptual and computational framework that allows the formulation of fairness constraints on rankings. As part of this framework, we develop efficient algorithms for finding rankings that maximize the utility for the user while satisfying fairness constraints for the items. Since fairness goals can be application specific, we show how a broad range of fairness constraints can be implemented in our framework, including forms of demographic parity, disparate treatment, and disparate impact constraints. We illustrate the effect of these constraints by providing empirical results on two ranking problems.
- We present a method to discover differences between populations with respect to the spatial coherence of their oriented white matter microstructure in arbitrarily shaped white matter regions. This method is applied to diffusion MRI scans of a subset of the Human Connectome Project dataset: 57 pairs of monozygotic and 52 pairs of dizygotic twins. After controlling for morphological similarity between twins, we identify 3.7% of all white matter as being associated with genetic similarity (35.1k voxels, $p < 10^{-4}$, false discovery rate 1.5%), 75% of which spatially clusters into twenty-two contiguous white matter regions. Furthermore, we show that the orientation similarity within these regions generalizes to a subset of 47 pairs of non-twin siblings, and show that these siblings are on average as similar as dizygotic twins. The regions are located in deep white matter including the superior longitudinal fasciculus, the optic radiations, the middle cerebellar peduncle, the corticospinal tract, and within the anterior temporal lobe, as well as the cerebellum, brain stem, and amygdalae. These results extend previous work using undirected fractional anisotrophy for measuring putative heritable influences in white matter. Our multidirectional extension better accounts for crossing fiber connections within voxels. This bottom up approach has at its basis a novel measurement of coherence within neighboring voxel dyads between subjects, and avoids some of the fundamental ambiguities encountered with tractographic approaches to white matter analysis that estimate global connectivity.
- A major challenge in understanding the generalization of deep learning is to explain why (stochastic) gradient descent can exploit the network architecture to find solutions that have good generalization performance when using high capacity models. We find simple but realistic examples showing that this phenomenon exists even when learning linear classifiers --- between two linear networks with the same capacity, the one with a convolutional layer can generalize better than the other when the data distribution has some underlying spatial structure. We argue that this difference results from a combination of the convolution architecture, data distribution and gradient descent, all of which are necessary to be included in a meaningful analysis. We provide a general analysis of the generalization performance as a function of data distribution and convolutional filter size, given gradient descent as the optimization algorithm, then interpret the results using concrete examples. Experimental results show that our analysis is able to explain what happens in our introduced examples.
- Feb 12 2018 cs.CV arXiv:1802.03374v2This paper proposes a generative ScatterNet hybrid deep learning (G-SHDL) network for semantic image segmentation. The proposed generative architecture is able to train rapidly from relatively small labeled datasets using the introduced structural priors. In addition, the number of filters in each layer of the architecture is optimized resulting in a computationally efficient architecture. The G-SHDL network produces state-of-the-art classification performance against unsupervised and semi-supervised learning on two image datasets. Advantages of the G-SHDL network over supervised methods are demonstrated with experiments performed on training datasets of reduced size.
- Feb 08 2018 cs.AI arXiv:1802.02511v1We train and validate a semi-supervised, multi-task LSTM on 57,675 person-weeks of data from off-the-shelf wearable heart rate sensors, showing high accuracy at detecting multiple medical conditions, including diabetes (0.8451), high cholesterol (0.7441), high blood pressure (0.8086), and sleep apnea (0.8298). We compare two semi-supervised train- ing methods, semi-supervised sequence learning and heuristic pretraining, and show they outperform hand-engineered biomarkers from the medical literature. We believe our work suggests a new approach to patient risk stratification based on cardiovascular risk scores derived from popular wearables such as Fitbit, Apple Watch, or Android Wear.
- Ideas from forensic linguistics are now being used frequently in Natural Language Processing (NLP), using machine learning techniques. While the role of forensic linguistics was more benign earlier, it is now being used for purposes which are questionable. Certain methods from forensic linguistics are employed, without considering their scientific limitations and ethical concerns. While we take the specific case of forensic linguistics as an example of such trends in NLP and machine learning, the issue is a larger one and present in many other scientific and data-driven domains. We suggest that such trends indicate that some of the applied sciences are exceeding their legal and scientific briefs. We highlight how carelessly implemented practices are serving to short-circuit the due processes of law as well breach ethical codes.
- Dec 15 2017 cs.RO arXiv:1712.04978v1In this paper, we propose a trajectory optimization for computing smooth collision free trajectories for nonholonomic curvature bounded vehicles among static and dynamic obstacles. One of the key novelties of our formulation is a hierarchal optimization routine which alternately operates in the space of angular accelerations and linear velocities. That is, the optimization has a two layer structure wherein angular accelerations are optimized keeping the linear velocities fixed and vice versa. If the vehicle/obstacles are modeled as circles than the velocity optimization layer can be shown to have the computationally efficient difference of convex structure commonly observed for linear systems. This leads to a less conservative approximation as compared to that obtained by approximating each polygon individually through its circumscribing circle. On the other hand, it leads to optimization problem with less number of constraints as compared to that obtained when approximating polygons as multiple overlapping circles. We use the proposed trajectory optimization as the basis for constructing a Model Predictive Control framework for navigating an autonomous car in complex scenarios like overtaking, lane changing and merging. Moreover, we also highlight the advantages provided by the alternating optimization routine. Specifically, we show it produces trajectories which have comparable arc lengths and smoothness as compared to those produced with joint simultaneous optimization in the space of angular accelerations and linear velocities. However, importantly, the alternating optimization provides some gain in computational time.
- Dec 15 2017 cs.DC arXiv:1712.04989v1The advent of non-volatile memory (NVM) technologies like PCM, STT, memristors and Fe-RAM is believed to enhance the system performance by getting rid of the traditional memory hierarchy by reducing the gap between memory and storage. This memory technology is considered to have the performance like that of DRAM and persistence like that of disks. Thus, it would also provide significant performance benefits for big data applications by allowing in-memory processing of large data with the lowest latency to persistence. Leveraging the performance benefits of this memory-centric computing technology through traditional memory programming is not trivial and the challenges aggravate for parallel/concurrent applications. To this end, several programming abstractions have been proposed like NVthreads, Mnemosyne and intel's NVML. However, deciding upon a programming abstraction which is easier to program and at the same time ensures the consistency and balances various software and architectural trade-offs is openly debatable and active area of research for NVM community. We study the NVthreads, Mnemosyne and NVML libraries by building a concurrent and persistent set and open addressed hash-table data structure application. In this process, we explore and report various tradeoffs and hidden costs involved in building concurrent applications for persistence in terms of achieving efficiency, consistency and ease of programming with these NVM programming abstractions. Eventually, we evaluate the performance of the set and hash-table data structure applications. We observe that NVML is easiest to program with but is least efficient and Mnemosyne is most performance friendly but involves significant programming efforts to build concurrent and persistent applications.
- Dec 15 2017 cs.RO arXiv:1712.04965v1In this paper, we present a Model Predictive Control (MPC) framework based on path velocity decomposition paradigm for autonomous driving. The optimization underlying the MPC has a two layer structure wherein first, an appropriate path is computed for the vehicle followed by the computation of optimal forward velocity along it. The very nature of the proposed path velocity decomposition allows for seamless compatibility between the two layers of the optimization. A key feature of the proposed work is that it offloads most of the responsibility of collision avoidance to velocity optimization layer for which computationally efficient formulations can be derived. In particular, we extend our previously developed concept of time scaled collision cone (TSCC) constraints and formulate the forward velocity optimization layer as a convex quadratic programming problem. We perform validation on autonomous driving scenarios wherein proposed MPC repeatedly solves both the optimization layers in receding horizon manner to compute lane change, overtaking and merging maneuvers among multiple dynamic obstacles.
- We consider the problem of learning a one-hidden-layer neural network with non-overlapping convolutional layer and ReLU activation function, i.e., $f(\mathbf{Z}; \mathbf{w}, \mathbf{a}) = \sum_j a_j\sigma(\mathbf{w}^\top\mathbf{Z}_j)$, in which both the convolutional weights $\mathbf{w}$ and the output weights $\mathbf{a}$ are parameters to be learned. We prove that with Gaussian input $\mathbf{Z}$, there is a spurious local minimum that is not a global mininum. Surprisingly, in the presence of local minimum, starting from randomly initialized weights, gradient descent with weight normalization can still be proven to recover the true parameters with constant probability (which can be boosted to arbitrarily high accuracy with multiple restarts). We also show that with constant probability, the same procedure could also converge to the spurious local minimum, showing that the local minimum plays a non-trivial role in the dynamics of gradient descent. Furthermore, a quantitative analysis shows that the gradient descent dynamics has two phases: it starts off slow, but converges much faster after several iterations.
- Standard model-free deep reinforcement learning (RL) algorithms sample a new initial state for each trial, allowing them to optimize policies that can perform well even in highly stochastic environments. However, problems that exhibit considerable initial state variation typically produce high-variance gradient estimates for model-free RL, making direct policy or value function optimization challenging. In this paper, we develop a novel algorithm that instead optimizes an ensemble of policies, each on a different "slice" of the initial state space, and gradually unifies them into a single policy that can succeed on the whole state space. This approach, which we term divide-and-conquer RL, is able to solve complex tasks where conventional deep RL methods are ineffective. Our results show that divide-and-conquer RL greatly outperforms conventional policy gradient methods on challenging grasping, manipulation, and locomotion tasks, and exceeds the performance of a variety of prior methods.
- Workflows provide an expressive programming model for fine-grained control of large-scale applications in distributed computing environments. Accurate estimates of complex workflow execution metrics on large-scale machines have several key advantages. The performance of scheduling algorithms that rely on estimates of execution metrics degrades when the accuracy of predicted execution metrics decreases. This in-progress paper presents a technique being developed to improve the accuracy of predicted performance metrics of large-scale workflows on distributed platforms. The central idea of this work is to train resource-centric machine learning agents to capture complex relationships between a set of program instructions and their performance metrics when executed on a specific resource. This resource-centric view of a workflow exploits the fact that predicting execution times of sub-modules of a workflow requires monitoring and modeling of a few dynamic and static features. We transform the input workflow that is essentially a directed acyclic graph of actions into a Physical Resource Execution Plan (PREP). This transformation enables us to model an arbitrarily complex workflow as a set of simpler programs running on physical nodes. We delegate a machine learning model to capture performance metrics for each resource type when it executes different program instructions under varying degrees of resource contention. Our algorithm takes the prediction metrics from each resource agent and composes the overall workflow performance metrics by utilizing the structure of the corresponding Physical Resource Execution Plan.
- Nov 16 2017 cs.SY arXiv:1711.04884v1We consider a class of piecewise-deterministic Markov processes where the state evolves according to a linear dynamical system. This continuous time evolution is interspersed by discrete events that occur at random times and change (reset) the state based on a linear affine map. In particular, we consider two families of discrete events, with the first family of resets occurring at exponentially-distributed times. The second family of resets is generally-distributed, in the sense that, the time intervals between events are independent and identically distributed random variables that follow an arbitrary continuous positively-valued probability density function. For this class of stochastic systems, we provide explicit conditions that lead to finite stationary moments, and the corresponding exact closed-form moment formulas. These results are illustrated on an example drawn from systems biology, where a protein is expressed in bursts at exponentially-distributed time intervals, decays within the cell-cycle, and is randomly divided among daughter cells when generally-distributed cell-division events occur. Our analysis leads to novel results for the mean and noise levels in protein copy numbers, and we decompose the noise levels into components arising from stochastic expression, random cell-cycle times, and partitioning. Interestingly, these individual noise contributions behave differently as cell division times become more random. In summary, the paper expands the class of stochastic hybrid systems for which statistical moments can be derived exactly without any approximations, and these results have applications for studying random phenomena in diverse areas.
- The experimental design problem concerns the selection of k points from a potentially large design pool of p-dimensional vectors, so as to maximize the statistical efficiency regressed on the selected k design points. Statistical efficiency is measured by optimality criteria, including A(verage), D(eterminant), T(race), E(igen), V(ariance) and G-optimality. Except for the T-optimality, exact optimization is NP-hard. We propose a polynomial-time regret minimization framework to achieve a $(1+\varepsilon)$ approximation with only $O(p/\varepsilon^2)$ design points, for all the optimality criteria above. In contrast, to the best of our knowledge, before our work, no polynomial-time algorithm achieves $(1+\varepsilon)$ approximations for D/E/G-optimality, and the best poly-time algorithm achieving $(1+\varepsilon)$-approximation for A/V-optimality requires $k = \Omega(p^2/\varepsilon)$ design points.
- We consider the problem of optimizing a high-dimensional convex function using stochastic zeroth-order query oracles. Such problems arise naturally in a variety of practical applications, including optimizing experimental or simulation parameters with many variables. Under sparsity assumptions on the gradients or function values, we present a successive component/feature selection algorithm and a noisy mirror descent algorithm with Lasso gradient estimates and show that both algorithms have convergence rates depending only logarithmically on the ambient problem dimension. Empirical results verify our theoretical findings and suggest that our designed algorithms outperform classical zeroth-order optimization methods in the high-dimensional setting.
- Availability of an explainable deep learning model that can be applied to practical real world scenarios and in turn, can consistently, rapidly and accurately identify specific and minute traits in applicable fields of biological sciences, is scarce. Here we consider one such real world example viz., accurate identification, classification and quantification of biotic and abiotic stresses in crop research and production. Up until now, this has been predominantly done manually by visual inspection and require specialized training. However, such techniques are hindered by subjectivity resulting from inter- and intra-rater cognitive variability. Here, we demonstrate the ability of a machine learning framework to identify and classify a diverse set of foliar stresses in the soybean plant with remarkable accuracy. We also present an explanation mechanism using gradient-weighted class activation mapping that isolates the visual symptoms used by the model to make predictions. This unsupervised identification of unique visual symptoms for each stress provides a quantitative measure of stress severity, allowing for identification, classification and quantification in one framework. The learnt model appears to be agnostic to species and make good predictions for other (non-soybean) species, demonstrating an ability of transfer learning.
- Oct 17 2017 cs.CV arXiv:1710.05152v1Iris serves as one of the best biometric modality owing to its complex, unique and stable structure. However, it can still be spoofed using fabricated eyeballs and contact lens. Accurate identification of contact lens is must for reliable performance of any biometric authentication system based on this modality. In this paper, we present a novel approach for detecting contact lens using a Generalized Hierarchically tuned Contact Lens detection Network (GHCLNet) . We have proposed hierarchical architecture for three class oculus classification namely: no lens, soft lens and cosmetic lens. Our network architecture is inspired by ResNet-50 model. This network works on raw input iris images without any pre-processing and segmentation requirement and this is one of its prodigious strength. We have performed extensive experimentation on two publicly available data-sets namely: 1)IIIT-D 2)ND and on IIT-K data-set (not publicly available) to ensure the generalizability of our network. The proposed architecture results are quite promising and outperforms the available state-of-the-art lens detection algorithms.
- Oct 16 2017 cs.CV arXiv:1710.04681v1Charcoal rot is a fungal disease that thrives in warm dry conditions and affects the yield of soybeans and other important agronomic crops worldwide. There is a need for robust, automatic and consistent early detection and quantification of disease symptoms which are important in breeding programs for the development of improved cultivars and in crop production for the implementation of disease control measures for yield protection. Current methods of plant disease phenotyping are predominantly visual and hence are slow and prone to human error and variation. There has been increasing interest in hyperspectral imaging applications for early detection of disease symptoms. However, the high dimensionality of hyperspectral data makes it very important to have an efficient analysis pipeline in place for the identification of disease so that effective crop management decisions can be made. The focus of this work is to determine the minimal number of most effective hyperspectral bands that can distinguish between healthy and diseased specimens early in the growing season. Healthy and diseased hyperspectral data cubes were captured at 3, 6, 9, 12, and 15 days after inoculation. We utilized inoculated and control specimens from 4 different genotypes. Each hyperspectral image was captured at 240 different wavelengths in the range of 383 to 1032 nm. We used a combination of genetic algorithm as an optimizer and support vector machines as a classifier for identification of maximally effective band combinations. A binary classification between healthy and infected samples using six selected band combinations obtained a classification accuracy of 97% and a F1 score of 0.97 for the infected class. The results demonstrated that these carefully chosen bands are more informative than RGB images, and could be used in a multispectral camera for remote identification of charcoal rot infection in soybean.
- Oct 10 2017 cs.RO arXiv:1710.02610v1Studying snake robot locomotion in a cluttered environment has been a complicated task because the motion model is discontinuous due to the physical contact with obstacles, and the contact force cannot be determined solely by contact positions. We present a unique mathematical model of the robot interacting with obstacles in which the contact forces are mapped on the basis of a viscous friction model. Also a motion planning strategy has been introduced which helps deriving the simplest path that ensures sufficient number of contacts of the robot with the obstacles required to reach a goal position. Numerical simulations and experimental results are presented to validate the theoretical approach.
- Finding patterns in data and being able to retrieve information from those patterns is an important task in Information retrieval. Complex search requirements which are not fulfilled by simple string matching and require exploring certain patterns in data demand a better query engine that can support searching via structured queries. In this article, we built a structured query engine which supports searching data through structured queries on the lines of ElasticSearch. We will show how we achieved real time indexing and retrieving of data through a RESTful API and how complex queries can be created and processed using efficient data structures we created for storing the data in structured way. Finally, we will conclude with an example of movie recommendation system built on top of this query engine.
- Oct 03 2017 cs.SY arXiv:1710.00289v1In this paper, trajectory prediction and control design for a desired hit point of a projectile is studied. Projectiles are subject to environment noise such as wind effect and measurement noise. In addition, mathematical models of projectiles contain a large number of important states that should be taken into account for having a realistic prediction. Furthermore, dynamics of projectiles contain nonlinear functions such as monomials and sine functions. To address all these issues we formulate a stochastic model for the projectile. We showed that with a set of transformations projectile dynamics only contains nonlinearities of the form of monomials. In the next step we derived approximate moment dynamics of this system using mean-field approximation. Our method still suffers from size of the system. To address this problem we selected a subset of first- and second-order statistical moments and we showed that they give reliable approximations of the mean and standard deviation of the impact point for a real projectile. Finally we used these selected moments to derive a control law that reduces error to hit a desired point.
- Oct 02 2017 cs.RO arXiv:1709.10452v1This paper presents a novel design of an Omnidirectional bendable Omnicrawler module. Compared to conventional crawlers, OmniCrawler module possesses an extra degree of freedom for sideways rolling motion, and the circular crosssection of the module enables holonomic motion of the robot. These advantages are further enhanced by the introduction of Omnidirectional joint with-in the module, which is the key contribution of this paper. This achieves high maneuverability and adaptability of the OmniCrawler module on an uneven surface. The Omni-directional bending is realized by means of two mechanisms: a Telescopic screw drive mechanism, and an arrangement of two independent 1-DOF joints aligned at 90? with respect to each other. The hybrid soft-rigid structure of the module provides compliant pathways for lug-chain assembly which allows crawling motion even in the bent configuration of the module. We show that the unique modular design unveils its versatility in terms of achieving compliance on an uneven surface, demonstrating its applications in different robotic platforms, such as an in-pipeline robot, Quadruped, snake robot, and exhibiting hybrid locomotive traits in various configurations of the robots with the help of simulations and experiments results on real robot prototype.
- Existing socio-psychological studies suggest that users of a social network form their opinions relying on the opinions of their neighbors. According to DeGroot opinion formation model, one value of particular importance is the asymptotic consensus value---the sum of user opinions weighted by the users' eigenvector centralities. This value plays the role of an attractor for the opinions in the network and is a lucrative target for external influence. However, since any potentially malicious control of the opinion distribution in a social network is clearly undesirable, it is important to design methods to prevent the external attempts to strategically change the asymptotic consensus value. In this work, we assume that the adversary wants to maximize the asymptotic consensus value by altering the opinions of some users in a network; we, then, state DIVER---an NP-hard problem of disabling such external influence attempts by strategically adding a limited number of edges to the network. Relying on the theory of Markov chains, we provide perturbation analysis that shows how eigenvector centrality and, hence, DIVER's objective function change in response to an edge's addition to the network. The latter leads to the design of a pseudo-linear-time heuristic for DIVER, whose computation relies on efficient estimation of mean first passage times in a Markov chain. We confirm our theoretical findings in experiments.
- Sep 05 2017 cs.DC arXiv:1709.00681v1A major focus of software transaction memory systems (STMs) has been to felicitate the multiprocessor programming and provide parallel programmers an abstraction for speedy and efficient development of parallel applications. To this end, different models for incorporating object/higher level semantics into STM have recently been proposed in transactional boosting, transactional data structure library, open nested transactions and abstract nested transactions. We build an alternative object model STM (OSTM) by adopting the transactional tree model of Weikum et al. originally given for databases and extend the current work by proposing comprehensive legality definitions and conflict notion which allows efficient composability of \otm. We first time show the proposed OSTM to be co-opacity. We build OSTM using chained hash table data structure. Lazyskip-list is used to implement chaining using lazy approach. We notice that major concurrency hotspot is the chaining data structure within the hash table. Lazyskip-list is time efficient compared to lists in terms of traversal overhead by average case O(log(n)). We optimise lookups as they are validated at the instant they execute and they have not validated again in commit phase. This allows lookup dominated transactions to be more efficient and at the same time co-opacity.
- Aug 31 2017 cs.CV arXiv:1708.09212v1The paper proposes the ScatterNet Hybrid Deep Learning (SHDL) network that extracts invariant and discriminative image representations for object recognition. SHDL framework is constructed with a multi-layer ScatterNet front-end, an unsupervised learning middle, and a supervised learning back-end module. Each layer of the SHDL network is automatically designed as an explicit optimization problem leading to an optimal deep learning architecture with improved computational performance as compared to the more usual deep network architectures. SHDL network produces the state-of-the-art classification performance against unsupervised and semi-supervised learning (GANs) on two image datasets. Advantages of the SHDL network over supervised methods (NIN, VGG) are also demonstrated with experiments performed on training datasets of reduced size.
- Disguised Face Identification (DFI) with Facial KeyPoints using Spatial Fusion Convolutional NetworkAug 31 2017 cs.CV arXiv:1708.09317v1Disguised face identification (DFI) is an extremely challenging problem due to the numerous variations that can be introduced using different disguises. This paper introduces a deep learning framework to first detect 14 facial key-points which are then utilized to perform disguised face identification. Since the training of deep learning architectures relies on large annotated datasets, two annotated facial key-points datasets are introduced. The effectiveness of the facial keypoint detection framework is presented for each keypoint. The superiority of the key-point detection framework is also demonstrated by a comparison with other deep networks. The effectiveness of classification performance is also demonstrated by comparison with the state-of-the-art face disguise classification methods.
- Aug 31 2017 cs.CV arXiv:1708.09300v1Automation of brain matter segmentation from MR images is a challenging task due to the irregular boundaries between the grey and white matter regions. In addition, the presence of intensity inhomogeneity in the MR images further complicates the problem. In this paper, we propose a texture and vesselness incorporated version of the ScatterNet Hybrid Deep Learning Network (TS-SHDL) that extracts hierarchical invariant mid-level features, used by fisher vector encoding and a conditional random field (CRF) to perform the desired segmentation. The performance of the proposed network is evaluated by extensive experimentation and comparison with the state-of-the-art methods on several 2D MRI scans taken from the synthetic McGill Brain Web as well as on the MRBrainS dataset of real 3D MRI scans. The advantages of the TS-SHDL network over supervised deep learning networks is also presented in addition to its superior performance over the state-of-the-art.
- We propose a DTCWT ScatterNet Convolutional Neural Network (DTSCNN) formed by replacing the first few layers of a CNN network with a parametric log based DTCWT ScatterNet. The ScatterNet extracts edge based invariant representations that are used by the later layers of the CNN to learn high-level features. This improves the training of the network as the later layers can learn more complex patterns from the start of learning because the edge representations are already present. The efficient learning of the DTSCNN network is demonstrated on CIFAR-10 and Caltech-101 datasets. The generic nature of the ScatterNet front-end is shown by an equivalent performance to pre-trained CNN front-ends. A comparison with the state-of-the-art on CIFAR-10 and Caltech-101 datasets is also presented.
- Over 150,000 new people in the United States are diagnosed with colorectal cancer each year. Nearly a third die from it (American Cancer Society). The only approved noninvasive diagnosis tools currently involve fecal blood count tests (FOBTs) or stool DNA tests. Fecal blood count tests take only five minutes and are available over the counter for as low as \$15. They are highly specific, yet not nearly as sensitive, yielding a high percentage (25%) of false negatives (Colon Cancer Alliance). Moreover, FOBT results are far too generalized, meaning that a positive result could mean much more than just colorectal cancer, and could just as easily mean hemorrhoids, anal fissure, proctitis, Crohn's disease, diverticulosis, ulcerative colitis, rectal ulcer, rectal prolapse, ischemic colitis, angiodysplasia, rectal trauma, proctitis from radiation therapy, and others. Stool DNA tests, the modern benchmark for CRC screening, have a much higher sensitivity and specificity, but also cost \$600, take two weeks to process, and are not for high-risk individuals or people with a history of polyps. To yield a cheap and effective CRC screening alternative, a unique ensemble-based classification algorithm is put in place that considers the FIT result, BMI, smoking history, and diabetic status of patients. This method is tested under ten-fold cross validation to have a .95 AUC, 92% specificity, 89% sensitivity, .88 F1, and 90% precision. Once clinically validated, this test promises to be cheaper, faster, and potentially more accurate when compared to a stool DNA test.
- We tackle the problem of learning robotic sensorimotor control policies that can generalize to visually diverse and unseen environments. Achieving broad generalization typically requires large datasets, which are difficult to obtain for task-specific interactive processes such as reinforcement learning or learning from demonstration. However, much of the visual diversity in the world can be captured through passively collected datasets of images or videos. In our method, which we refer to as GPLAC (Generalized Policy Learning with Attentional Classifier), we use both interaction data and weakly labeled image data to augment the generalization capacity of sensorimotor policies. Our method combines multitask learning on action selection and an auxiliary binary classification objective, together with a convolutional neural network architecture that uses an attentional mechanism to avoid distractors. We show that pairing interaction data from just a single environment with a diverse dataset of weakly labeled data results in greatly improved generalization to unseen environments, and show that this generalization depends on both the auxiliary objective and the attentional architecture that we propose. We demonstrate our results in both simulation and on a real robotic manipulator, and demonstrate substantial improvement over standard convolutional architectures and domain adaptation methods.
- Aug 02 2017 cs.CV arXiv:1708.00163v2One in twenty-five patients admitted to a hospital will suffer from a hospital acquired infection. If we can intelligently track healthcare staff, patients, and visitors, we can better understand the sources of such infections. We envision a smart hospital capable of increasing operational efficiency and improving patient care with less spending. In this paper, we propose a non-intrusive vision-based system for tracking people's activity in hospitals. We evaluate our method for the problem of measuring hand hygiene compliance. Empirically, our method outperforms existing solutions such as proximity-based techniques and covert in-person observational studies. We present intuitive, qualitative results that analyze human movement patterns and conduct spatial analytics which convey our method's interpretability. This work is a first step towards a computer-vision based smart hospital and demonstrates promising results for reducing hospital acquired infections.
- Quadcopters, as unmanned aerial vehicles (UAVs), have great potential in civil applications such as surveying, building monitoring, and infrastructure condition assessment. Quadcopters, however, are relatively sensitive to noises and disturbances so that their performance may be quickly downgraded in the case of inadequate control, system uncertainties and/or external disturbances. In this study, we deal with the quadrotor low-level control by proposing a robust scheme named the adaptive second-order quasi-continuous sliding mode control (adaptive 2-QCSM). The ultimate objective is for robust attitude control of the UAV in monitoring and inspection of built infrastructure. First, the mathematical model of the quadcopter is derived considering nonlinearity, strong coupling, uncertain dynamics and external disturbances. The control design includes the selection of the sliding manifold and the development of quasi-continuous second-order sliding mode controller with an adaptive gain. Stability of the overall control system is analysed by using a global Lyapunov function for convergence of both the sliding dynamics and adaptation scheme. Extensive simulations have been carried out for evaluation. Results show that the proposed controller can achieve robustness against disturbances or parameter variations and has better tracking performance in comparison with experimental responses of a UAV in a real-time monitoring task.
- Jul 27 2017 cs.DC arXiv:1707.08357v1Writing concurrent programs for shared memory multiprocessor systems is a nightmare. This hinders users to exploit the full potential of multiprocessors. STM (Software Transactional Memory) is a promising concurrent programming paradigm which addresses woes of programming for multiprocessor systems. In this paper, we implement BTO (Basic Timestamp Ordering), SGT (Serialization Graph Testing) and MVTO(Multi-Version Time-Stamp Ordering) concurrency control protocols and build an STM(Software Transactional Memory) library to evaluate the performance of these protocols. The deferred write approach is followed to implement the STM. A SET data structure is implemented using the transactions of our STM library. And this transactional SET is used as a test application to evaluate the STM. The performance of the protocols is rigorously compared against the linked-list module of the Synchrobench benchmark. Linked list module implements SET data structure using lazy-list, lock-free list, lock-coupling list and ESTM (Elastic Software Transactional Memory). Our analysis shows that for a number of threads greater than 60 and update rate 70%, BTO takes (17% to 29%) and (6% to 24%) less CPU time per thread when compared against lazy-list and lock-coupling list respectively. MVTO takes (13% to 24%) and (3% to 24%) less CPU time per thread when compared against lazy-list and lock-coupling list respectively. BTO and MVTO have similar per thread CPU time. BTO and MVTO outperform SGT by 9% to 36%.
- Jul 11 2017 cs.CR arXiv:1707.02691v1Malwares are becoming persistent by creating full- edged variants of the same or different family. Malwares belonging to same family share same characteristics in their functionality of spreading infections into the victim computer. These similar characteristics among malware families can be taken as a measure for creating a solution that can help in the detection of the malware belonging to particular family. In our approach we have taken the advantage of detecting these malware families by creating the database of these characteristics in the form of n-grams of API sequences. We use various similarity score methods and also extract multiple API sequences to analyze malware effectively.
- Jun 21 2017 cs.RO arXiv:1706.06418v1This paper discusses the design of a novel compliant in-pipe climbing modular robot for small diameter pipes. The robot consists of a kinematic chain of 3 OmniCrawler modules with a link connected in between 2 adjacent modules via compliant joints. While the tank-like crawler mechanism provides good traction on low friction surfaces, its circular cross-section makes it holonomic. The holonomic motion assists it to re-align in a direction to avoid obstacles during motion as well as overcome turns with a minimal energy posture. Additionally, the modularity enables it to negotiate T-junction without motion singularity. The compliance is realized using 4 torsion springs incorporated in joints joining 3 modules with 2 links. For a desirable pipe diameter (\textØ 75mm), the springs' stiffness values are obtained by formulating a constraint optimization problem which has been simulated in ADAMS MSC and further validated on a real robot prototype. In order to negotiate smooth vertical bends and friction coefficient variations in pipes, the design was later modified by replacing springs with series elastic actuators (SEA) at 2 of the 4 joints.
- Although gradient descent (GD) almost always escapes saddle points asymptotically [Lee et al., 2016], this paper shows that even with fairly natural random initialization schemes and non-pathological functions, GD can be significantly slowed down by saddle points, taking exponential time to escape. On the other hand, gradient descent with perturbations [Ge et al., 2015, Jin et al., 2017] is not slowed down by saddle points - it can find an approximate local minimizer in polynomial time. This result implies that GD is inherently slower than perturbed GD, and justifies the importance of adding perturbations for efficient non-convex optimization. While our focus is theoretical, we also present experiments that illustrate our theoretical findings.
- May 02 2017 cs.CV arXiv:1705.00360v2This paper presents a novel real-time method for tracking salient closed boundaries from video image sequences. This method operates on a set of straight line segments that are produced by line detection. The tracking scheme is coherently integrated into a perceptual grouping framework in which the visual tracking problem is tackled by identifying a subset of these line segments and connecting them sequentially to form a closed boundary with the largest saliency and a certain similarity to the previous one. Specifically, we define a new tracking criterion which combines a grouping cost and an area similarity constraint. The proposed criterion makes the resulting boundary tracking more robust to local minima. To achieve real-time tracking performance, we use Delaunay Triangulation to build a graph model with the detected line segments and then reduce the tracking problem to finding the optimal cycle in this graph. This is solved by our newly proposed closed boundary candidates searching algorithm called "Bidirectional Shortest Path (BDSP)". The efficiency and robustness of the proposed method are tested on real video sequences as well as during a robot arm pouring experiment.
- Apr 25 2017 cs.RO arXiv:1704.06817v1This paper presents a modular in-pipeline climbing robot with a novel compliant foldable OmniCrawler mechanism. The circular cross-section of the OmniCrawler module enables a holonomic motion to facilitate the alignment of the robot in the direction of bends. Additionally, the crawler mechanism provides a fair amount of traction, even on slippery surfaces. These advantages of crawler modules have been further supplemented by incorporating active compliance in the module itself which helps to negotiate sharp bends in small diameter pipes. The robot has a series of 3 such compliant foldable modules interconnected by the links via passive joints. For the desirable pipe diameter and curvature of the bends, the spring stiffness value for each passive joint is determined by formulating a constrained optimization problem using the quasi-static model of the robot. Moreover, a minimum friction coefficient value between the module-pipe surface which can be vertically climbed by the robot without slipping is estimated. The numerical simulation results have further been validated by experiments on real robot prototype.
- Browsers can detect malicious websites that are provisioned with forged or fake TLS/SSL certificates. However, they are not so good at detecting malicious websites if they are provisioned with mistakenly issued certificates or certificates that have been issued by a compromised certificate authority. Google proposed certificate transparency which is an open framework to monitor and audit certificates in real time. Thereafter, a few other certificate transparency schemes have been proposed which can even handle revocation. All currently known constructions use Merkle hash trees and have proof size logarithmic in the number of certificates/domain owners. We present a new certificate transparency scheme with short (constant size) proofs. Our construction makes use of dynamic bilinear-map accumulators. The scheme has many desirable properties like efficient revocation, low verification cost and update costs comparable to the existing schemes. We provide proofs of security and evaluate the performance of our scheme.
- In this work we introduce Qumin, a novel quantum programming language with a focus on providing an easy to use, minimalist, high-level, and easily extensible platform for quantum programming. Qumin's design concentrates on encompassing the various interactions between classical and quantum computation via the use of two sublanguages: an untyped one that handles classical preparation and control, and one linearly typed that explicitly handles quantum routines. This allows the classical part of the language to be freely used for general programming while placing restrictions on the quantum part that enforce rules of quantum computing like the no-cloning of qubits. We describe both the language's theoretical foundations in terms of lambda calculi and linear type systems, and more practical matters such as implementations of algorithms and useful programming tools like matrix and oracle generators that streamline the interaction of the classical and quantum fragments of a program. Finally, we provide an experimental open-source implementation of an interpreter, typechecker and related tools for the language (which can be found in \urlhttps://github.com/wintershammer/QImp).
- Apr 10 2017 cs.LG arXiv:1704.02197v2Clustering is a useful data exploratory method with its wide applicability in multiple fields. However, data clustering greatly relies on initialization of cluster centers that can result in large intra-cluster variance and dead centers, therefore leading to sub-optimal solutions. This paper proposes a novel variance based version of the conventional Moving K-Means (MKM) algorithm called Variance Based Moving K-Means (VMKM) that can partition data into optimal homogeneous clusters, irrespective of cluster initialization. The algorithm utilizes a novel distance metric and a unique data element selection criteria to transfer the selected elements between clusters to achieve low intra-cluster variance and subsequently avoid dead centers. Quantitative and qualitative comparison with various clustering techniques is performed on four datasets selected from image processing, bioinformatics, remote sensing and the stock market respectively. An extensive analysis highlights the superior performance of the proposed method over other techniques.
- Apr 04 2017 cs.SY arXiv:1704.00236v1A Network Control System (NCS) consists of control components that interact with the plant over a shared network. The system dynamics of a NCS could be subject to noise arising from randomness in the times at which the data is transmitted over the network, corruption of the transmitted data by the communication network, and external disturbances that might affect the plant. A question of interest is to understand how the statistics of the data transmission times affects the system dynamics, and under what conditions the system is stable. Another related issue is designing a controller that meets desired performance specifications (e.g., a specific mean and variance of the system state). Here, we consider a minimal NCS that consists of a plant and a controller, and it is subject to random transmission times, channel corruption and external disturbances. We derive exact dynamics of the first two moments of the system, and use them to derive the stability conditions of the system. We further design a control law that steers the system to a desired mean and variance. Finally, we demonstrate our results using different examples, and show that under some specific conditions, randomness in the data transmission times can even reduce the variability contributed from disturbance.
- Apr 03 2017 cs.LO arXiv:1703.10994v1Separation Logic is an effective Program Logic for proving programs that involve pointers. Reasoning with pointers becomes difficult especially when there is aliasing arising due to several pointers to a given cell location. In this paper, we try to explore the problems with aliasing through some simple examples and introduce the notion of separating conjunction as a tool to deal with it. We introduce Separation Logic as an extension of the standard Hoare Logic with the help pf a programming language that has four pointer manipulating commands. These commands perform the usual heap operations such as lookup, update, allocation and deallocation. The new set of assertions and axioms of Separation Logic is presented in a semi-formal style. Examples are given to illustrate the unique features of the new assertions and axioms. Finally the paper concludes with the proofs of some real programs using the axioms of Separation Logic.
- Apr 03 2017 cs.LO arXiv:1703.10977v1We present fully formalized proofs of some central theorems from combinatorics. These are Dilworth's decomposition theorem, Mirsky's theorem, Hall's marriage theorem and the Erdős-Szekeres theorem. Dilworth's decomposition theorem is the key result among these. It states that in any finite partially ordered set (poset), the size of a smallest chain cover and a largest antichain are the same. Mirsky's theorem is a dual of Dilworth's decomposition theorem, which states that in any finite poset, the size of a smallest antichain cover and a largest chain are the same. We use Dilworth's theorem in the proofs of Hall's Marriage theorem and the Erdős-Szekeres theorem. The combinatorial objects involved in these theorems are sets and sequences. All the proofs are formalized in the Coq proof assistant. We develop a library of definitions and facts that can be used as a framework for formalizing other theorems on finite posets.
- We present two fully mechanized proofs of Dilworths and Mirskys theorems in the Coq proof assistant. Dilworths Theorem states that in any finite partially ordered set (poset), the size of a smallest chain cover and a largest antichain are the same. Mirskys Theorem is a dual of Dilworths Theorem. We formalize the proofs by Perles [2] (for Dilworths Theorem) and by Mirsky [5] (for the dual theorem). We also come up with a library of definitions and facts that can be used as a framework for formalizing other theorems on finite posets.
- Mar 17 2017 cs.HC arXiv:1703.05462v1As virtual reality (VR) emerges as a mainstream platform, designers have started to experiment new interaction techniques to enhance the user experience. This is a challenging task because designers not only strive to provide designs with good performance but also carefully ensure not to disrupt users' immersive experience. There is a dire need for a new evaluation tool that extends beyond traditional quantitative measurements to assist designers in the design process. We propose an EEG-based experiment framework that evaluates interaction techniques in VR by measuring intentionally elicited cognitive conflict. Through the analysis of the feedback-related negativity (FRN) as well as other quantitative measurements, this framework allows designers to evaluate the effect of the variables of interest. We studied the framework by applying it to the fundamental task of 3D object selection using direct 3D input, i.e. tracked hand in VR. The cognitive conflict is intentionally elicited by manipulating the selection radius of the target object. Our first behavior experiment validated the framework in line with the findings of conflict-induced behavior adjustments like those reported in other classical psychology experiment paradigms. Our second EEG-based experiment examines the effect of the appearance of virtual hands. We found that the amplitude of FRN correlates with the level of realism of the virtual hands, which concurs with the Uncanny Valley theory.
- Mar 07 2017 cs.CV arXiv:1703.01698v2This paper presents two visual trackers from the different paradigms of learning and registration based tracking and evaluates their application in image based visual servoing. They can track object motion with four degrees of freedom (DoF) which, as we will show here, is sufficient for many fine manipulation tasks. One of these trackers is a newly developed learning based tracker that relies on learning discriminative correlation filters while the other is a refinement of a recent 8 DoF RANSAC based tracker adapted with a new appearance model for tracking 4 DoF motion. Both trackers are shown to provide superior performance to several state of the art trackers on an existing dataset for manipulation tasks. Further, a new dataset with challenging sequences for fine manipulation tasks captured from robot mounted eye-in-hand (EIH) cameras is also presented. These sequences have a variety of challenges encountered during real tasks including jittery camera movement, motion blur, drastic scale changes and partial occlusions. Quantitative and qualitative results on these sequences are used to show that these two trackers are robust to failures while providing high precision that makes them suitable for such fine manipulation tasks.
- Many conventional statistical procedures are extremely sensitive to seemingly minor deviations from modeling assumptions. This problem is exacerbated in modern high-dimensional settings, where the problem dimension can grow with and possibly exceed the sample size. We consider the problem of robust estimation of sparse functionals, and provide a computationally and statistically efficient algorithm in the high-dimensional setting. Our theory identifies a unified set of deterministic conditions under which our algorithm guarantees accurate recovery. By further establishing that these deterministic conditions hold with high-probability for a wide range of statistical models, our theory applies to many problems of considerable interest including sparse mean and covariance estimation; sparse linear regression; and sparse generalized linear models.
- We show that given an estimate $\widehat{A}$ that is close to a general high-rank positive semi-definite (PSD) matrix $A$ in spectral norm (i.e., $\|\widehat{A}-A\|_2 \leq \delta$), the simple truncated SVD of $\widehat{A}$ produces a multiplicative approximation of $A$ in Frobenius norm. This observation leads to many interesting results on general high-rank matrix estimation problems, which we briefly summarize below ($A$ is an $n\times n$ high-rank PSD matrix and $A_k$ is the best rank-$k$ approximation of $A$): (1) High-rank matrix completion: By observing $\Omega(\frac{n\max\{\epsilon^{-4},k^2\}\mu_0^2\|A\|_F^2\log n}{\sigma_{k+1}(A)^2})$ elements of $A$ where $\sigma_{k+1}\left(A\right)$ is the $\left(k+1\right)$-th singular value of $A$ and $\mu_0$ is the incoherence, the truncated SVD on a zero-filled matrix satisfies $\|\widehat{A}_k-A\|_F \leq (1+O(\epsilon))\|A-A_k\|_F$ with high probability. (2)High-rank matrix de-noising: Let $\widehat{A}=A+E$ where $E$ is a Gaussian random noise matrix with zero mean and $\nu^2/n$ variance on each entry. Then the truncated SVD of $\widehat{A}$ satisfies $\|\widehat{A}_k-A\|_F \leq (1+O(\sqrt{\nu/\sigma_{k+1}(A)}))\|A-A_k\|_F + O(\sqrt{k}\nu)$. (3) Low-rank Estimation of high-dimensional covariance: Given $N$ i.i.d.~samples $X_1,\cdots,X_N\sim\mathcal N_n(0,A)$, can we estimate $A$ with a relative-error Frobenius norm bound? We show that if $N = \Omega\left(n\max\{\epsilon^{-4},k^2\}\gamma_k(A)^2\log N\right)$ for $\gamma_k(A)=\sigma_1(A)/\sigma_{k+1}(A)$, then $\|\widehat{A}_k-A\|_F \leq (1+O(\epsilon))\|A-A_k\|_F$ with high probability, where $\widehat{A}=\frac{1}{N}\sum_{i=1}^N{X_iX_i^\top}$ is the sample covariance.
- The sparsest cut problem consists of identifying a small set of edges that breaks the graph into balanced sets of vertices. The normalized cut problem balances the total degree, instead of the size, of the resulting sets. Applications of graph cuts include community detection and computer vision. However, cut problems were originally proposed for static graphs, an assumption that does not hold in many modern applications where graphs are highly dynamic. In this paper, we introduce the sparsest and normalized cut problems in temporal graphs, which generalize their standard definitions by enforcing the smoothness of cuts over time. We propose novel formulations and algorithms for computing temporal cuts using spectral graph theory, multiplex graphs, divide-and-conquer and low-rank matrix approximation. Furthermore, we extend our formulation to dynamic graph signals, where cuts also capture node values, as graph wavelets. Experiments show that our solutions are accurate and scalable, enabling the discovery of dynamic communities and the analysis of dynamic graph processes.
- Feb 15 2017 cs.SI arXiv:1702.04082v3Network centrality plays an important role in many applications. Central nodes in social networks can be influential, driving opinions and spreading news or rumors.In hyperlinked environments, such as the Web, where users navigate via clicks, central content receives high traffic, becoming targets for advertising campaigns. While there is an extensive amount of work on centrality measures and their efficient computation, controlling nodes' centrality via network updates is a more recent and challenging problem. Performing minimal modifications to a network to achieve a desired property falls under the umbrella of network design problems. This paper is focused on improving the coverage centrality of a set of nodes, which is the number of pairs of nodes that have a shortest path passing through the set, by adding edges to the network. We prove strong inapproximability results and propose a greedy algorithm for maximizing coverage centrality. To ensure scalability to large networks, we also design an efficient sampling algorithm for the problem. In addition to providing an extensive empirical evaluation of our algorithms, we also show that, under some realistic constraints, the proposed solutions achieve almost-optimal approximation for coverage centrality maximization.
- Feb 14 2017 cs.CV arXiv:1702.03345v1This paper introduces a Deep Scattering network that utilizes Dual-Tree complex wavelets to extract translation invariant representations from an input signal. The computationally efficient Dual-Tree wavelets decompose the input signal into densely spaced representations over scales. Translation invariance is introduced in the representations by applying a non-linearity over a region followed by averaging. The discriminatory information in the densely spaced, locally smooth, signal representations aids the learning of the classifier. The proposed network is shown to outperform Mallat's ScatterNet on four datasets with different modalities on classification accuracy.
- Feb 13 2017 cs.CV arXiv:1702.03267v1We introduce a ScatterNet that uses a parametric log transformation with Dual-Tree complex wavelets to extract translation invariant representations from a multi-resolution image. The parametric transformation aids the OLS pruning algorithm by converting the skewed distributions into relatively mean-symmetric distributions while the Dual-Tree wavelets improve the computational efficiency of the network. The proposed network is shown to outperform Mallat's ScatterNet on two image datasets, both for classification accuracy and computational efficiency. The advantages of the proposed network over other supervised and some unsupervised methods are also presented using experiments performed on different training dataset sizes.
- Although a majority of the theoretical literature in high-dimensional statistics has focused on settings which involve fully-observed data, settings with missing values and corruptions are common in practice. We consider the problems of estimation and of constructing component-wise confidence intervals in a sparse high-dimensional linear regression model when some covariates of the design matrix are missing completely at random. We analyze a variant of the Dantzig selector [9] for estimating the regression model and we use a de-biasing argument to construct component-wise confidence intervals. Our first main result is to establish upper bounds on the estimation error as a function of the model parameters (the sparsity level s, the expected fraction of observed covariates $\rho_*$, and a measure of the signal strength $\|\beta^*\|_2$). We find that even in an idealized setting where the covariates are assumed to be missing completely at random, somewhat surprisingly and in contrast to the fully-observed setting, there is a dichotomy in the dependence on model parameters and much faster rates are obtained if the covariance matrix of the random design is known. To study this issue further, our second main contribution is to provide lower bounds on the estimation error showing that this discrepancy in rates is unavoidable in a minimax sense. We then consider the problem of high-dimensional inference in the presence of missing data. We construct and analyze confidence intervals using a de-biased estimator. In the presence of missing data, inference is complicated by the fact that the de-biasing matrix is correlated with the pilot estimator and this necessitates the design of a new estimator and a novel analysis. We also complement our mathematical study with extensive simulations on synthetic and semi-synthetic data that show the accuracy of our asymptotic predictions for finite sample sizes.
- All the existing real world networks are evolving, hence, study of traffic dynamics in these enlarged networks is a challenging task. The critical issue is to optimize the network structure to improve network capacity and avoid traffic congestion. We are interested in taking user's routes such that it is least congested with optimal network capacity. Network capacity may be improved either by optimizing network topology or enhancing in routing approach. In this context, we propose and design a model of the time varying data communication networks (TVCN) based on the dynamics of in-flowing links. Newly appeared node prefers to attach with most influential node present in the network. In this paper, influence is termed as \textitreputation and is applied for computing overall congestion at any node. User path with least betweenness centrality and most reputation is preferred for routing. Kelly's optimization formulation for a rate allocation problem is used for obtaining optimal rates of distinct users at different time instants and it is found that the user's path with lowest betweenness centrality and highest reputation will always give maximum rate at stable point.
- Jan 17 2017 cs.RO arXiv:1701.04350v1This paper aims to implement Object-Oriented Markov Decision Process (OO-MDPs) for goal planning and navigation of robot in an indoor environment. We use the OO-MDP representation of the environment which is a natural way of modeling the environment based on objects and their interactions. The paper aims to extend the well known Taxi domain example which has been tested on grid world environment to robotics domain with larger state-spaces. For the purpose of this project we have created simulation of the environment and robot in ROS with Gazebo and Rviz as visualization tools.The mobile robot uses a 2D LIDAR module to perform SLAM in the unknown environment. The goal of this project is to be able to make an autonomous agent capable of performing planning and navigation in an indoor environment to deliver boxes (passengers in Taxi domain) placed at random locations to a particular location (warehouse). The approach can be extended to a wide variety of mobile and manipulative robots
- Jan 09 2017 cs.RO arXiv:1701.01547v1In many human-in-the-loop robotic applications such as robot-assisted surgery and remote teleoperation, predicting the intended motion of the human operator may be useful for successful implementation of shared control, guidance virtual fixtures, and predictive control. Developing computational models of human movements is a critical foundation for such motion prediction frameworks. With this motivation, we present a computational framework for modeling reaching movements in the presence of obstacles. We propose a stochastic optimal control framework that consists of probabilistic collision avoidance constraints and a cost function that trades-off between effort and end-state variance in the presence of a signal-dependent noise. First, we present a series of reformulations to convert the original non-linear and non-convex optimal control into a parametric quadratic programming problem. We show that the parameters can be tuned to model various collision avoidance strategies, thereby capturing the quintessential variability associated with human motion. Then, we present a simulation study that demonstrates the complex interaction between avoidance strategies, control cost, and the probability of collision avoidance. The proposed framework can benefit a variety of applications that require teleoperation in cluttered spaces, including robot-assisted surgery. In addition, it can also be viewed as a new optimizer which produces smooth and probabilistically-safe trajectories under signal dependent noise.
- User-given tags or labels are valuable resources for semantic understanding of visual media such as images and videos. Recently, a new type of labeling mechanism known as hash-tags have become increasingly popular on social media sites. In this paper, we study the problem of generating relevant and useful hash-tags for short video clips. Traditional data-driven approaches for tag enrichment and recommendation use direct visual similarity for label transfer and propagation. We attempt to learn a direct low-cost mapping from video to hash-tags using a two step training process. We first employ a natural language processing (NLP) technique, skip-gram models with neural network training to learn a low-dimensional vector representation of hash-tags (Tag2Vec) using a corpus of 10 million hash-tags. We then train an embedding function to map video features to the low-dimensional Tag2vec space. We learn this embedding for 29 categories of short video clips with hash-tags. A query video without any tag-information can then be directly mapped to the vector space of tags using the learned embedding and relevant tags can be found by performing a simple nearest-neighbor retrieval in the Tag2Vec space. We validate the relevance of the tags suggested by our system qualitatively and quantitatively with a user study.
- We consider the Hypothesis Transfer Learning (HTL) problem where one incorporates a hypothesis trained on the source domain into the learning procedure of the target domain. Existing theoretical analysis either only studies specific algorithms or only presents upper bounds on the generalization error but not on the excess risk. In this paper, we propose a unified algorithm-dependent framework for HTL through a novel notion of transformation function, which characterizes the relation between the source and the target domains. We conduct a general risk analysis of this framework and in particular, we show for the first time, if two domains are related, HTL enjoys faster convergence rates of excess risks for Kernel Smoothing and Kernel Ridge Regression than those of the classical non-transfer learning settings. Experiments on real world data demonstrate the effectiveness of our framework.
- Dec 02 2016 cs.CV arXiv:1612.00144v2Deep learning based landcover classification algorithms have recently been proposed in literature. In hyperspectral images (HSI) they face the challenges of large dimensionality, spatial variability of spectral signatures and scarcity of labeled data. In this article we propose an end-to-end deep learning architecture that extracts band specific spectral-spatial features and performs landcover classification. The architecture has fewer independent connection weights and thus requires lesser number of training data. The method is found to outperform the highest reported accuracies on popular hyperspectral image data sets.
- We introduce the task of Visual Dialog, which requires an AI agent to hold a meaningful dialog with humans in natural, conversational language about visual content. Specifically, given an image, a dialog history, and a question about the image, the agent has to ground the question in image, infer context from history, and answer the question accurately. Visual Dialog is disentangled enough from a specific downstream task so as to serve as a general test of machine intelligence, while being grounded in vision enough to allow objective evaluation of individual responses and benchmark progress. We develop a novel two-person chat data-collection protocol to curate a large-scale Visual Dialog dataset (VisDial). VisDial v0.9 has been released and contains 1 dialog with 10 question-answer pairs on ~120k images from COCO, with a total of ~1.2M dialog question-answer pairs. We introduce a family of neural encoder-decoder models for Visual Dialog with 3 encoders -- Late Fusion, Hierarchical Recurrent Encoder and Memory Network -- and 2 decoders (generative and discriminative), which outperform a number of sophisticated baselines. We propose a retrieval-based evaluation protocol for Visual Dialog where the AI agent is asked to sort a set of candidate answers and evaluated on metrics such as mean-reciprocal-rank of human response. We quantify gap between machine and human performance on the Visual Dialog task via human studies. Putting it all together, we demonstrate the first 'visual chatbot'! Our dataset, code, trained models and visual chatbot are available on https://visualdialog.org
- Nov 04 2016 cs.CL arXiv:1611.01083v1In this paper, we are going to find meaning of words based on distinct situations. Word Sense Disambiguation is used to find meaning of words based on live contexts using supervised and unsupervised approaches. Unsupervised approaches use online dictionary for learning, and supervised approaches use manual learning sets. Hand tagged data are populated which might not be effective and sufficient for learning procedure. This limitation of information is main flaw of the supervised approach. Our proposed approach focuses to overcome the limitation using learning set which is enriched in dynamic way maintaining new data. Trivial filtering method is utilized to achieve appropriate training data. We introduce a mixed methodology having Modified Lesk approach and Bag-of-Words having enriched bags using learning methods. Our approach establishes the superiority over individual Modified Lesk and Bag-of-Words approaches based on experimentation.
- Subspace clustering is the problem of partitioning unlabeled data points into a number of clusters so that data points within one cluster lie approximately on a low-dimensional linear subspace. In many practical scenarios, the dimensionality of data points to be clustered are compressed due to constraints of measurement, computation or privacy. In this paper, we study the theoretical properties of a popular subspace clustering algorithm named sparse subspace clustering (SSC) and establish formal success conditions of SSC on dimensionality-reduced data. Our analysis applies to the most general fully deterministic model where both underlying subspaces and data points within each subspace are deterministically positioned, and also a wide range of dimensionality reduction techniques (e.g., Gaussian random projection, uniform subsampling, sketching) that fall into a subspace embedding framework (Meng & Mahoney, 2013; Avron et al., 2014). Finally, we apply our analysis to a differentially private SSC algorithm and established both privacy and utility guarantees of the proposed method.
- From Traditional to Modern : Domain Adaptation for Action Classification in Short Social Video ClipsOct 19 2016 cs.CV arXiv:1610.05613v1Short internet video clips like vines present a significantly wild distribution compared to traditional video datasets. In this paper, we focus on the problem of unsupervised action classification in wild vines using traditional labeled datasets. To this end, we use a data augmentation based simple domain adaptation strategy. We utilise semantic word2vec space as a common subspace to embed video features from both, labeled source domain and unlablled target domain. Our method incrementally augments the labeled source with target samples and iteratively modifies the embedding function to bring the source and target distributions together. Additionally, we utilise a multi-modal representation that incorporates noisy semantic information available in form of hash-tags. We show the effectiveness of this simple adaptation technique on a test set of vines and achieve notable improvements in performance.
- Detecting a small number of outliers from a set of data observations is always challenging. This problem is more difficult in the setting of multiple network samples, where computing the anomalous degree of a network sample is generally not sufficient. In fact, explaining why the network is exceptional, expressed in the form of subnetwork, is also equally important. In this paper, we develop a novel algorithm to address these two key problems. We treat each network sample as a potential outlier and identify subnetworks that mostly discriminate it from nearby regular samples. The algorithm is developed in the framework of network regression combined with the constraints on both network topology and L1-norm shrinkage to perform subnetwork discovery. Our method thus goes beyond subspace/subgraph discovery and we show that it converges to a global optimum. Evaluation on various real-world network datasets demonstrates that our algorithm not only outperforms baselines in both network and high dimensional setting, but also discovers highly relevant and interpretable local subnetworks, further enhancing our understanding of anomalous networks.
- Sep 28 2016 cs.DB arXiv:1609.08228v1Reduction of end-to-end network delays is an optimization task with applications in multiple domains. Low delays enable improved information flow in social networks, quick spread of ideas in collaboration networks, low travel times for vehicles on road networks and increased rate of packets in the case of communication networks. Delay reduction can be achieved by both improving the propagation capabilities of individual nodes and adding additional edges in the network. One of the main challenges in such design problems is that the effects of local changes are not independent, and as a consequence, there is a combinatorial search-space of possible improvements. Thus, minimizing the cumulative propagation delay requires novel scalable and data-driven approaches. In this paper, we consider the problem of network delay minimization via node upgrades. Although the problem is NP-hard, we show that probabilistic approximation for a restricted version can be obtained. We design scalable and high-quality techniques for the general setting based on sampling and targeted to different models of delay distribution. Our methods scale almost linearly with the graph size and consistently outperform competitors in quality.
- Sep 20 2016 cs.CV arXiv:1609.05296v1Liveliness detection acts as a safe guard against spoofing attacks. Most of the researchers used vision based techniques to detect liveliness of the user, but they are highly sensitive to illumination effects. Therefore it is very hard to design a system, which will work robustly under all circumstances. Literature shows that most of the research utilize eye blink or mouth movement to detect the liveliness, while the other group used face texture to distinguish between real and imposter. The classification results of all these approaches decreases drastically in variable light conditions. Hence in this paper we are introducing fuzzy expert system which is sufficient enough to handle most of the cases comes in real time. We have used two testing parameters, (a) under bad illumination and (b) less movement in eyes and mouth in case of real user to evaluate the performance of the system. The system is behaving well in all, while in first case its False Rejection Rate (FRR) is 0.28, and in second case its FRR is 0.4.
- Recommendation and collaborative filtering systems are important in modern information and e-commerce applications. As these systems are becoming increasingly popular in the industry, their outputs could affect business decision making, introducing incentives for an adversarial party to compromise the availability or integrity of such systems. We introduce a data poisoning attack on collaborative filtering systems. We demonstrate how a powerful attacker with full knowledge of the learner can generate malicious data so as to maximize his/her malicious objectives, while at the same time mimicking normal user behavior to avoid being detected. While the complete knowledge assumption seems extreme, it enables a robust assessment of the vulnerability of collaborative filtering schemes to highly motivated attacks. We present efficient solutions for two popular factorization-based collaborative filtering algorithms: the \emphalternative minimization formulation and the \emphnuclear norm minimization method. Finally, we test the effectiveness of our proposed algorithms on real-world data and discuss potential defensive strategies.
- Aug 23 2016 cs.RO arXiv:1608.05829v1We present Probabilistic Reciprocal Velocity Obstacle or PRVO as a general algorithm for navigating multiple robots under perception and motion uncertainty. PRVO is defined as the space of velocities that ensures dynamic collision avoidance between a pair of robots with a specified probability. Our approach is based on defining chance constraints over the inequalities defined by the deterministic Reciprocal Velocity Obstacle (RVO). The computational complexity of the proposed probabilistic RVO is comparable to the deterministic counterpart. This is achieved by a series of reformulations where we first substitute the computationally intractable chance constraints with a family of surrogate constraints and then adopt a time scaling based solution methodology to efficiently characterize their solution space. Further, we also show that the solution space of each member of the family of surrogate constraints can be mapped in closed form to the probability with which the original chance constraints are satisfied and thus consequently to probability of collision avoidance. We validate our formulations through numerical simulations where we highlight the importance of incorporating the effect of motion uncertainty and the advantages of PRVO over existing formulations which handles the effect of uncertainty by using conservative bounding volumes.
- Aug 16 2016 cs.CE arXiv:1608.03990v3A modeling paradigm is developed to augment predictive models of turbulence by effectively utilizing limited data generated from physical experiments. The key components of our approach involve inverse modeling to infer the spatial distribution of model discrepancies, and, machine learning to reconstruct discrepancy information from a large number of inverse problems into corrective model forms. We apply the methodology to turbulent flows over airfoils involving flow separation. Model augmentations are developed for the Spalart Allmaras (SA) model using adjoint-based full field inference on experimentally measured lift coefficient data. When these model forms are reconstructed using neural networks (NN) and embedded within a standard solver, we show that much improved predictions in lift can be obtained for geometries and flow conditions that were not used to train the model. The NN-augmented SA model also predicts surface pressures extremely well. Portability of this approach is demonstrated by confirming that predictive improvements are preserved when the augmentation is embedded in a different commercial finite-element solver. The broader vision is that by incorporating data that can reveal the form of the innate model discrepancy, the applicability of data-driven turbulence models can be extended to more general flows.