Apr 25 2018
cs.CV arXiv:1804.08831v1
Our overarching goal is to develop an accurate and explainable model for plant disease identification using hyperspectral data. Charcoal rot is a soil-borne fungal disease that affects the yield of soybean crops worldwide. Hyperspectral images were captured at 240 different wavelengths in the range of 383 - 1032 nm. We developed a 3D Convolutional Neural Network model for soybean charcoal rot disease identification. Our model has a classification accuracy of 95.73% and an infected-class F1 score of 0.87. We interpret the trained model using saliency maps and visualize the pixel locations most sensitive for classification. The sensitivity of individual wavelengths for classification was also determined using the saliency map visualization, which identifies 733 nm as the most sensitive wavelength. Since this wavelength lies in the near-infrared region (700 - 1000 nm) of the electromagnetic spectrum, the region commonly used for assessing vegetation health, it increases our confidence in the model's predictions.
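A saliency map of this kind is the magnitude of the class score's gradient with respect to each input entry. The sketch below illustrates the idea on a tiny hypothetical score function using finite differences (a real 3D CNN would use autodiff); the toy model, cube shape, and band indices are all assumptions for illustration.

```python
import numpy as np

def saliency_map(score_fn, cube, eps=1e-4):
    """Numerical saliency: |d score / d input| for every entry of the
    hyperspectral cube. score_fn maps a (H, W, bands) array to a scalar
    class score. Finite differences keep this sketch dependency-free."""
    sal = np.zeros_like(cube)
    base = score_fn(cube)
    it = np.nditer(cube, flags=["multi_index"])
    for _ in it:
        idx = it.multi_index
        bumped = cube.copy()
        bumped[idx] += eps
        sal[idx] = abs(score_fn(bumped) - base) / eps
    return sal

# Toy "model": the score only depends on band 2, so band-wise saliency
# should concentrate there.
rng = np.random.default_rng(0)
cube = rng.random((2, 2, 4))              # 2x2 pixels, 4 wavelengths
score = lambda x: float(x[..., 2].sum())

sal = saliency_map(score, cube)
band_sensitivity = sal.reshape(-1, 4).sum(axis=0)
most_sensitive_band = int(np.argmax(band_sensitivity))
```

Summing the saliency over pixels per band, as in the last two lines, is one way to rank wavelength sensitivity as described in the abstract.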
Apr 23 2018
cs.CL arXiv:1804.07461v1
For natural language understanding (NLU) technology to be maximally useful, both practically and as a scientific object of study, it must be general: it must be able to process language in a way that is not exclusively tailored to any one specific task or dataset. In pursuit of this objective, we introduce the General Language Understanding Evaluation benchmark (GLUE), a tool for evaluating and analyzing the performance of models across a diverse range of existing NLU tasks. GLUE is model-agnostic, but it incentivizes sharing knowledge across tasks because certain tasks have very limited training data. We further provide a hand-crafted diagnostic test suite that enables detailed linguistic analysis of NLU models. We evaluate baselines based on current methods for multi-task and transfer learning and find that they do not immediately give substantial improvements over the aggregate performance of training a separate model per task, indicating room for improvement in developing general and robust NLU systems.
Distributed computing platforms provide a robust mechanism to perform large-scale computations by splitting the task and data among multiple locations, possibly located thousands of miles apart geographically. Although such distribution of resources can lead to benefits, it also comes with associated problems such as rampant duplication of file transfers that increases congestion, long job completion times, unexpected site crashes, suboptimal data transfer rates, unpredictable reliability over time, and suboptimal usage of storage elements. In addition, each sub-system becomes a potential failure node that can trigger system-wide disruptions. In this vision paper, we outline our approach to leveraging Deep Learning algorithms to discover solutions to unique problems that arise in a system with computational infrastructure that is spread over a wide area. The presented vision, motivated by a real scientific use case from the Belle II experiments, is to develop multilayer neural networks to tackle forecasting, anomaly detection and optimization challenges in a complex and distributed data movement environment. Through this vision based on Deep Learning principles, we aim to achieve reduced congestion events, faster file transfer rates, and enhanced site reliability.
Apr 12 2018
cs.CL arXiv:1804.04003v1
Expressing oneself in language is subjective. Everyone has a different style of reading and writing; ultimately it all boils down to the way their mind understands things (in a specific format). Language style transfer is a way to preserve the meaning of a text while changing the way it is expressed. Progress in language style transfer has lagged behind other domains, such as computer vision, mainly because of the lack of parallel data, use cases, and reliable evaluation metrics. In response to the challenge of lacking parallel data, we explore learning style transfer from non-parallel data. We propose a model combining seq2seq, autoencoders, and adversarial loss to achieve this goal. The key idea behind the proposed models is to learn separate content and style representations using adversarial networks. To address the problem of evaluating style transfer tasks, we frame the task as sentiment transfer and use a sentiment classifier to measure how many sentiments the model was able to transfer. We report our results on several kinds of models.
We consider the problem of global optimization of an unknown non-convex smooth function with zeroth-order feedback. In this setup, an algorithm is allowed to adaptively query the underlying function at different locations and receives noisy evaluations of function values at the queried points (i.e. the algorithm has access to zeroth-order information). Optimization performance is evaluated by the expected difference of function values at the estimated optimum and the true optimum. In contrast to the classical optimization setup, first-order information like gradients is not directly accessible to the optimization algorithm. We show that the classical minimax framework of analysis, which roughly characterizes the worst-case query complexity of an optimization algorithm in this setting, leads to excessively pessimistic results. We propose a local minimax framework to study the fundamental difficulty of optimizing smooth functions with adaptive function evaluations, which provides a refined picture of the intrinsic difficulty of zeroth-order optimization. We show that for functions with fast level set growth around the global minimum, carefully designed optimization algorithms can identify a near global minimizer with many fewer queries. For the special case of strongly convex and smooth functions, our implied convergence rates match the ones developed for zeroth-order convex optimization problems. At the other end of the spectrum, for worst-case smooth functions no algorithm can converge faster than the minimax rate of estimating the entire unknown function in the $\ell_\infty$-norm. We provide an intuitive and efficient algorithm that attains the derived upper error bounds.
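The zeroth-order setting can be made concrete with a minimal sketch: the optimizer only sees noisy function values, so it averages repeated queries to denoise and localizes the minimum with a shrinking search box. This is purely illustrative, not the paper's locally minimax-optimal procedure; the schedule constants and the quadratic test function are assumptions.

```python
import numpy as np

def zeroth_order_search(f_noisy, dim, n_rounds=60, reps=40, seed=0):
    """Toy zeroth-order optimizer: keep a shrinking search box around the
    current incumbent, denoise each candidate by averaging repeated noisy
    queries, and move to the best one."""
    rng = np.random.default_rng(seed)
    center, radius = np.zeros(dim), 2.0
    for _ in range(n_rounds):
        cands = np.vstack([center,                     # keep the incumbent
                           center + radius * rng.uniform(-1, 1, (8, dim))])
        vals = [np.mean([f_noisy(c) for _ in range(reps)]) for c in cands]
        center = cands[int(np.argmin(vals))]
        radius *= 0.9                                  # contract the box
    return center

rng = np.random.default_rng(1)
x_star = np.array([0.5, -0.3])
# Strongly convex objective with additive evaluation noise.
f = lambda x: float(np.sum((x - x_star) ** 2) + 0.01 * rng.normal())
x_hat = zeroth_order_search(f, dim=2)
```

Averaging `reps` queries per point trades query budget for noise reduction, which is exactly the tension the paper's query-complexity analysis quantifies.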
Conversational agents have become ubiquitous, ranging from goal-oriented systems for helping with reservations to chit-chat models found in modern virtual assistants. In this survey paper, we explore this fascinating field. We look at some of the pioneering work that defined the field and gradually move to the current state-of-the-art models. We look at statistical, neural, generative adversarial network based and reinforcement learning based approaches and how they evolved. Along the way we discuss various challenges that the field faces: the lack of context in utterances, the absence of a good quantitative metric for comparing models, and the lack of trust in agents because they do not have a consistent persona. We structure this paper in a way that answers these pertinent questions and discusses competing approaches to solve them.
In recent years, advances in the design of convolutional neural networks have resulted in significant improvements on the image classification and object detection problems. One of the advances is networks built by stacking complex cells, seen in such networks as InceptionNet and NasNet. These cells are either constructed by hand, generated by generative networks or discovered by search. Unlike conventional networks (where layers consist of a convolution block, sampling and a non-linear unit), the new cells feature more complex designs consisting of several filters and other operators connected in series and parallel. Recently, several cells have been proposed or generated that are supersets of previously proposed custom or generated cells. Influenced by this, we introduce a network construction method based on EnvelopeNets. An EnvelopeNet is a deep convolutional neural network of stacked EnvelopeCells. EnvelopeCells are supersets (or envelopes) of previously proposed handcrafted and generated cells. We propose a method to construct improved network architectures by restructuring EnvelopeNets. The algorithm restructures an EnvelopeNet by rearranging blocks in the network. It identifies blocks to be restructured using metrics derived from the featuremaps collected during a partial training run of the EnvelopeNet. The method requires fewer computational resources to generate an architecture than an optimized architecture search over the entire search space of blocks. The restructured networks have higher accuracy on the image classification problem on a representative dataset than both the generating EnvelopeNet and an equivalent arbitrary network.
Mar 13 2018
cs.RO arXiv:1803.03784v1
Motion planning for manipulators under task space constraints is difficult as it constrains the joint configurations to always lie on an implicitly defined manifold. It is possible to view task constrained motion planning as an optimization problem with non-linear equality constraints which can be solved by general non-linear optimization techniques. In this paper, we present a novel custom optimizer which exploits the underlying structure present in many task constraints. At the core of our approach are some simple reformulations, which when coupled with the method of alternating projection, lead to an efficient convex optimization based routine for computing a feasible solution to the task constraints. We subsequently build on this result and use the concept of Augmented Lagrangian to guide the feasible solutions towards those which also minimize the user defined cost function. We show that the proposed optimizer is fully distributive and thus, can be easily parallelized. We validate our formulation on some common robotic benchmark problems. In particular, we show that the proposed optimizer achieves cyclic motion in the joint space corresponding to a trajectory of similar nature in the task space. Furthermore, as a baseline, we compare the proposed optimizer with an off-the-shelf non-linear solver provided in the open-source package SciPy. We show that for similar task constraint residuals and smoothness cost, it can be up to more than three times faster than the SciPy alternative.
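The method of alternating projection referenced above can be sketched in a few lines: repeatedly project onto each constraint set in turn, which converges to a point in the intersection when the sets are convex and intersect. The two example sets below (a hyperplane standing in for a linearized task constraint, a box standing in for joint limits) are assumptions for illustration, not the paper's actual reformulations.

```python
import numpy as np

def alternating_projection(x0, proj_a, proj_b, iters=200):
    """POCS-style alternating projection: project onto constraint set A,
    then set B, and repeat until the iterate settles in A ∩ B."""
    x = x0.copy()
    for _ in range(iters):
        x = proj_a(x)
        x = proj_b(x)
    return x

# Set A: hyperplane a·x = 1 (closed-form Euclidean projection).
a = np.array([1.0, 2.0])
proj_plane = lambda x: x - (a @ x - 1.0) / (a @ a) * a
# Set B: box [0, 1]^2 (projection is coordinate-wise clipping).
proj_box = lambda x: np.clip(x, 0.0, 1.0)

x = alternating_projection(np.array([5.0, -3.0]), proj_plane, proj_box)
```

Each individual projection here is cheap and closed-form, which is the property that makes the alternating scheme an attractive core for a fast feasibility routine.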
Mar 12 2018
cs.RO arXiv:1803.03478v1
In this paper, we propose a new model predictive control (MPC) formulation for autonomous driving. The novelty of our formulation stems from the following new results. Firstly, we adopt an alternating minimization approach wherein linear velocities and angular accelerations are alternately optimized. We show that in contrast to the joint formulation, the alternating minimization better exploits the structure of the problem. This in turn translates to a reduction in computation time. Secondly, our MPC formulation incorporates the time dependent non-linear actuator dynamics which relates the commanded velocity input to the actual body velocity of the vehicle. This added complexity improves the predictive component of MPC, resulting in an improved margin of inter-vehicle distance during maneuvers like overtaking and lane changes. Although past works have also incorporated actuator dynamics within MPC, there have been very few attempts towards coupling actuator dynamics to collision avoidance constraints through the non-holonomic motion model of the vehicle and analyzing the resulting behavior. We test our formulation in a simulated environment and use metrics like inter-vehicle distance and velocity overshoot to demonstrate its usefulness.
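The alternating minimization idea is generic: fix one block of variables, minimize over the other, and swap. A minimal sketch on a hypothetical coupled quadratic cost (standing in for the MPC objective over velocities and accelerations; the cost and its closed-form block minimizers are assumptions, not the paper's formulation):

```python
def alternating_minimization(argmin_u, argmin_v, u0, v0, iters=50):
    """Block-coordinate scheme: optimize one block of variables while
    holding the other fixed, then swap, until convergence."""
    u, v = u0, v0
    for _ in range(iters):
        u = argmin_u(v)   # best u for the current v
        v = argmin_v(u)   # best v for the current u
    return u, v

# Hypothetical coupled cost: f(u, v) = (u-1)^2 + (v+2)^2 + 0.5*(u-v)^2.
# Setting each partial derivative to zero gives the per-block minimizers:
argmin_u = lambda v: (2 + v) / 3        # from 3u - 2 - v = 0
argmin_v = lambda u: (u - 4) / 3        # from 3v + 4 - u = 0

u, v = alternating_minimization(argmin_u, argmin_v, 0.0, 0.0)
# Joint optimum of f solves 3u - v = 2 and -u + 3v = -4, i.e. (0.25, -1.25).
```

When each block subproblem is convex with a cheap solution, as in this toy cost, the alternating scheme converges quickly, which mirrors the computation-time argument made in the abstract.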
What is a mathematically rigorous way to describe the taxi-pickup distribution in Manhattan, or the profile information in online social networks? A deep understanding of how to represent such data not only provides insights into the data properties, but also benefits many subsequent processing procedures, such as denoising, sampling, recovery and localization. In this paper, we model those complex and irregular data as piecewise-smooth graph signals and propose a graph dictionary to effectively represent those graph signals. We first propose the graph multiresolution analysis, which provides a principle to design good representations. We then propose a coarse-to-fine approach, which iteratively partitions a graph into two subgraphs until we reach individual nodes. This approach efficiently implements the graph multiresolution analysis, and the induced graph dictionary promotes sparse representations of piecewise-smooth graph signals. Finally, we validate the proposed graph dictionary on two tasks: approximation and localization. The empirical results show that the proposed graph dictionary outperforms eight other representation methods on six datasets, including traffic networks, social networks and point cloud meshes.
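One plausible way to realize the coarse-to-fine partitioning is spectral bipartition: split by the sign of the Fiedler vector (the Laplacian eigenvector of the second-smallest eigenvalue) and recurse down to single nodes. The partitioning criterion below is an assumption for illustration; the paper may use a different cut rule.

```python
import numpy as np

def fiedler_bipartition(W):
    """Split a graph (weighted adjacency W) into two subgraphs by the
    sign of the Fiedler vector of the combinatorial Laplacian."""
    L = np.diag(W.sum(axis=1)) - W
    _, vecs = np.linalg.eigh(L)
    fiedler = vecs[:, 1]                      # 2nd-smallest eigenvalue
    return np.where(fiedler >= 0)[0], np.where(fiedler < 0)[0]

def coarse_to_fine(W, nodes=None):
    """Recursively bipartition until single nodes; the returned node
    groups are the supports that would index the dictionary atoms."""
    if nodes is None:
        nodes = np.arange(W.shape[0])
    if len(nodes) <= 1:
        return [nodes]
    a, b = fiedler_bipartition(W[np.ix_(nodes, nodes)])
    return [nodes] + coarse_to_fine(W, nodes[a]) + coarse_to_fine(W, nodes[b])

# Two triangles joined by one weak edge: the first cut separates them.
W = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]:
    W[i, j] = W[j, i] = 1.0
W[2, 3] = W[3, 2] = 0.1

a, b = fiedler_bipartition(W)
parts = {frozenset(a.tolist()), frozenset(b.tolist())}
groups = coarse_to_fine(W)
```

Each recursion level halves the scale, so the resulting hierarchy of node groups mirrors a multiresolution analysis on the graph.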
Feb 27 2018
cs.CR arXiv:1802.09096v1
Modern high-performance as well as power-constrained System-on-Chips (SoC) are increasingly using hardware accelerated encryption engines to secure computation, memory access, and communication operations. The electromagnetic (EM) emission from a chip leaks information about the underlying logical operations and can be collected using low-cost non-invasive measurements. EM based side-channel attacks (EMSCA) have emerged as a major threat to the security of encryption engines in a SoC. This paper presents the concept of Blindsight, where a high-frequency inductive voltage regulator (IVR) integrated on the same chip with an encryption engine is used to increase resistance against EMSCA. High-frequency (~100 MHz) IVRs are present in modern microprocessors to improve energy-efficiency. We show that an IVR with a randomized control loop (R-IVR) can reduce EMSCA as the integrated inductance acts as a strong EM emitter and blinds an adversary from the EM emission of the encryption engine. The EM measurements are performed on a test-chip containing two architectures of a 128-bit Advanced Encryption Standard (AES) engine powered by a high-frequency R-IVR and under two attack scenarios: one where an adversary gains complete physical access to the target device, and the other where the adversary is only in proximity to the device. In both attack modes, an adversary can observe information leakage in the Test Vector Leakage Assessment (TVLA) test in a baseline IVR (B-IVR, without control loop randomization). However, we show that EM emission from the R-IVR blinds the attacker and significantly reduces SCA vulnerability of the AES engine. A range of practical side-channel analyses, including TVLA, Correlation Electromagnetic Analysis (CEMA), and a template based CEMA, shows that R-IVR can reduce information leakage and prevent key extraction even against a skilled adversary.
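The TVLA test mentioned above is, at its core, Welch's t-test between traces captured for a fixed input and traces captured for random inputs; |t| exceeding 4.5 at any sample point is the conventional leakage threshold. A minimal sketch (the toy trace values are fabricated for illustration, not measurements from the paper's test-chip):

```python
import math

def tvla_t_statistic(traces_fixed, traces_random):
    """Welch's t-statistic at one sample point: compare EM/power samples
    recorded for a fixed plaintext vs random plaintexts."""
    def mean_var(xs):
        n = len(xs)
        m = sum(xs) / n
        v = sum((x - m) ** 2 for x in xs) / (n - 1)  # sample variance
        return m, v, n
    m1, v1, n1 = mean_var(traces_fixed)
    m2, v2, n2 = mean_var(traces_random)
    return (m1 - m2) / math.sqrt(v1 / n1 + v2 / n2)

# Toy data for a clearly leaking sample point: the fixed-input mean
# differs from the random-input mean by far more than the spread.
leaky_fixed  = [1.0 + 0.01 * (i % 3) for i in range(100)]
leaky_random = [1.5 + 0.01 * (i % 5) for i in range(100)]
t_leaky = tvla_t_statistic(leaky_fixed, leaky_random)
```

A countermeasure like the R-IVR aims to push |t| below the 4.5 threshold at every sample point by burying the engine's emission in the regulator's randomized emission.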
Rankings are ubiquitous in the online world today. As we have transitioned from finding books in libraries to ranking products, jobs, job applicants, opinions and potential romantic partners, there is a substantial precedent that ranking systems have a responsibility not only to their users but also to the items being ranked. To address these often conflicting responsibilities, we propose a conceptual and computational framework that allows the formulation of fairness constraints on rankings. As part of this framework, we develop efficient algorithms for finding rankings that maximize the utility for the user while satisfying fairness constraints for the items. Since fairness goals can be application specific, we show how a broad range of fairness constraints can be implemented in our framework, including forms of demographic parity, disparate treatment, and disparate impact constraints. We illustrate the effect of these constraints by providing empirical results on two ranking problems.
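One simple parity-style constraint expressible in such a framework: in every ranking prefix, at least a given fraction of slots must go to a protected group. The greedy construction below is an illustrative baseline, not the paper's utility-maximizing algorithm; the item list, group labels, and proportion are assumptions.

```python
def fair_ranking(items, min_prop):
    """Greedy ranking: take items by descending relevance, but whenever a
    prefix would fall below floor(min_prop * prefix_len) protected items,
    force the best remaining protected item instead.
    items: list of (relevance, group)."""
    pool = sorted(items, reverse=True)          # best relevance first
    ranking, protected_seen = [], 0
    while pool:
        k = len(ranking) + 1
        need_protected = protected_seen < int(min_prop * k)
        pick = next((it for it in pool
                     if not need_protected or it[1] == "protected"),
                    pool[0])                    # fall back if none left
        pool.remove(pick)
        ranking.append(pick)
        protected_seen += pick[1] == "protected"
    return ranking

items = [(0.9, "majority"), (0.8, "majority"), (0.7, "protected"),
         (0.6, "majority"), (0.5, "protected")]
ranking = fair_ranking(items, min_prop=0.5)
```

Here the constraint forces the 0.7-relevance protected item above the 0.8-relevance majority item, trading a small amount of user utility for prefix-level parity, which is the tension the paper's framework makes explicit.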
We present a method to discover differences between populations with respect to the spatial coherence of their oriented white matter microstructure in arbitrarily shaped white matter regions. This method is applied to diffusion MRI scans of a subset of the Human Connectome Project dataset: 57 pairs of monozygotic and 52 pairs of dizygotic twins. After controlling for morphological similarity between twins, we identify 3.7% of all white matter as being associated with genetic similarity (35.1k voxels, $p < 10^{-4}$, false discovery rate 1.5%), 75% of which spatially clusters into twenty-two contiguous white matter regions. Furthermore, we show that the orientation similarity within these regions generalizes to a subset of 47 pairs of non-twin siblings, and show that these siblings are on average as similar as dizygotic twins. The regions are located in deep white matter including the superior longitudinal fasciculus, the optic radiations, the middle cerebellar peduncle, the corticospinal tract, and within the anterior temporal lobe, as well as the cerebellum, brain stem, and amygdalae. These results extend previous work using undirected fractional anisotropy for measuring putative heritable influences in white matter. Our multidirectional extension better accounts for crossing fiber connections within voxels. This bottom-up approach has at its basis a novel measurement of coherence within neighboring voxel dyads between subjects, and avoids some of the fundamental ambiguities encountered with tractographic approaches to white matter analysis that estimate global connectivity.
A major challenge in understanding the generalization of deep learning is to explain why (stochastic) gradient descent can exploit the network architecture to find solutions that have good generalization performance when using high capacity models. We find simple but realistic examples showing that this phenomenon exists even when learning linear classifiers: between two linear networks with the same capacity, the one with a convolutional layer can generalize better than the other when the data distribution has some underlying spatial structure. We argue that this difference results from a combination of the convolution architecture, data distribution and gradient descent, all of which must be included in a meaningful analysis. We provide a general analysis of the generalization performance as a function of data distribution and convolutional filter size, given gradient descent as the optimization algorithm, then interpret the results using concrete examples. Experimental results show that our analysis is able to explain what happens in our introduced examples.
Feb 12 2018
cs.CV arXiv:1802.03374v2
This paper proposes a generative ScatterNet hybrid deep learning (G-SHDL) network for semantic image segmentation. The proposed generative architecture is able to train rapidly from relatively small labeled datasets using the introduced structural priors. In addition, the number of filters in each layer of the architecture is optimized resulting in a computationally efficient architecture. The G-SHDL network produces state-of-the-art classification performance against unsupervised and semi-supervised learning on two image datasets. Advantages of the G-SHDL network over supervised methods are demonstrated with experiments performed on training datasets of reduced size.
Brandon Ballinger, Johnson Hsieh, Avesh Singh, Nimit Sohoni, Jack Wang, Geoffrey H. Tison, Gregory M. Marcus, Jose M. Sanchez, Carol Maguire, Jeffrey E. Olgin, Mark J. Pletcher Feb 08 2018
cs.AI arXiv:1802.02511v1
We train and validate a semi-supervised, multi-task LSTM on 57,675 person-weeks of data from off-the-shelf wearable heart rate sensors, showing high accuracy at detecting multiple medical conditions, including diabetes (0.8451), high cholesterol (0.7441), high blood pressure (0.8086), and sleep apnea (0.8298). We compare two semi-supervised training methods, semi-supervised sequence learning and heuristic pretraining, and show they outperform hand-engineered biomarkers from the medical literature. We believe our work suggests a new approach to patient risk stratification based on cardiovascular risk scores derived from popular wearables such as Fitbit, Apple Watch, or Android Wear.
Ideas from forensic linguistics are now being used frequently in Natural Language Processing (NLP), using machine learning techniques. While the role of forensic linguistics was more benign earlier, it is now being used for purposes which are questionable. Certain methods from forensic linguistics are employed without considering their scientific limitations and ethical concerns. While we take the specific case of forensic linguistics as an example of such trends in NLP and machine learning, the issue is a larger one and present in many other scientific and data-driven domains. We suggest that such trends indicate that some of the applied sciences are exceeding their legal and scientific briefs. We highlight how carelessly implemented practices are serving to short-circuit the due processes of law as well as breach ethical codes.
Dec 15 2017
cs.RO arXiv:1712.04978v2
In this paper, we propose a trajectory optimization for computing smooth collision free trajectories for nonholonomic curvature bounded vehicles among static and dynamic obstacles. One of the key novelties of our formulation is a hierarchical optimization routine which alternately operates in the space of angular accelerations and linear velocities. That is, the optimization has a two layer structure wherein angular accelerations are optimized keeping the linear velocities fixed and vice versa. If the vehicle/obstacles are modeled as circles, then the velocity optimization layer can be shown to have the computationally efficient difference of convex structure commonly observed for linear systems. This leads to a less conservative approximation as compared to that obtained by approximating each polygon individually through its circumscribing circle. On the other hand, it leads to an optimization problem with fewer constraints as compared to that obtained when approximating polygons as multiple overlapping circles. We use the proposed trajectory optimization as the basis for constructing a Model Predictive Control framework for navigating an autonomous car in complex scenarios like overtaking, lane changing and merging. We also highlight the advantages provided by the alternating optimization routine. Specifically, we show it produces trajectories which have comparable arc lengths and smoothness as compared to those produced with joint simultaneous optimization in the space of angular accelerations and linear velocities. However, importantly, the alternating optimization provides some gain in computational time.
Dec 15 2017
cs.DC arXiv:1712.04989v1
The advent of non-volatile memory (NVM) technologies like PCM, STT, memristors and Fe-RAM is believed to enhance system performance by reducing the gap between memory and storage and thereby doing away with the traditional memory hierarchy. This memory technology is considered to have the performance of DRAM and the persistence of disks. Thus, it would also provide significant performance benefits for big data applications by allowing in-memory processing of large data with the lowest latency to persistence. Leveraging the performance benefits of this memory-centric computing technology through traditional memory programming is not trivial, and the challenges aggravate for parallel/concurrent applications. To this end, several programming abstractions have been proposed, like NVthreads, Mnemosyne and Intel's NVML. However, deciding upon a programming abstraction which is easier to program with and at the same time ensures consistency and balances various software and architectural trade-offs is openly debated and an active area of research for the NVM community. We study the NVthreads, Mnemosyne and NVML libraries by building a concurrent and persistent set and an open-addressed hash-table data structure application. In this process, we explore and report various tradeoffs and hidden costs involved in building concurrent applications for persistence in terms of achieving efficiency, consistency and ease of programming with these NVM programming abstractions. Eventually, we evaluate the performance of the set and hash-table data structure applications. We observe that NVML is the easiest to program with but is the least efficient, and Mnemosyne is the most performance-friendly but involves significant programming effort to build concurrent and persistent applications.
Dec 15 2017
cs.RO arXiv:1712.04965v2
In this paper, we present a Model Predictive Control (MPC) framework based on the path velocity decomposition paradigm for autonomous driving. The optimization underlying the MPC has a two layer structure wherein first, an appropriate path is computed for the vehicle, followed by the computation of the optimal forward velocity along it. The very nature of the proposed path velocity decomposition allows for seamless compatibility between the two layers of the optimization. A key feature of the proposed work is that it offloads most of the responsibility for collision avoidance to the velocity optimization layer, for which computationally efficient formulations can be derived. In particular, we extend our previously developed concept of time scaled collision cone (TSCC) constraints and formulate the forward velocity optimization layer as a convex quadratic programming problem. We perform validation on autonomous driving scenarios wherein the proposed MPC repeatedly solves both optimization layers in a receding-horizon manner to compute lane change, overtaking and merging maneuvers among multiple dynamic obstacles.
We consider the problem of learning a one-hidden-layer neural network with non-overlapping convolutional layer and ReLU activation function, i.e., $f(\mathbf{Z}; \mathbf{w}, \mathbf{a}) = \sum_j a_j\sigma(\mathbf{w}^\top\mathbf{Z}_j)$, in which both the convolutional weights $\mathbf{w}$ and the output weights $\mathbf{a}$ are parameters to be learned. We prove that with Gaussian input $\mathbf{Z}$, there is a spurious local minimum that is not a global minimum. Surprisingly, in the presence of this local minimum, gradient descent with weight normalization, starting from randomly initialized weights, can still be proven to recover the true parameters with constant probability (which can be boosted to arbitrarily high accuracy with multiple restarts). We also show that with constant probability, the same procedure could also converge to the spurious local minimum, showing that the local minimum plays a non-trivial role in the dynamics of gradient descent. Furthermore, a quantitative analysis shows that the gradient descent dynamics has two phases: it starts off slow, but converges much faster after several iterations.
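The model and the weight-normalization reparametrization can be sketched concretely in a student-teacher setup: data is labeled by a ground-truth $(\mathbf{w}^*, \mathbf{a}^*)$ and a student with $\mathbf{w} = g\,\mathbf{v}/\|\mathbf{v}\|$ is trained by gradient descent. The numerical gradient, step size, and problem sizes below are illustrative assumptions, not the paper's analyzed dynamics.

```python
import numpy as np

rng = np.random.default_rng(0)
relu = lambda x: np.maximum(x, 0.0)
d, k, n = 4, 3, 200          # patch dim, number of patches, samples

# Teacher network: f(Z; w*, a*) = sum_j a*_j relu(w*^T Z_j).
w_star, a_star = rng.normal(size=d), rng.normal(size=k)
Zs = rng.normal(size=(n, k, d))              # Gaussian non-overlapping patches
ys = relu(Zs @ w_star) @ a_star

def loss(params):
    """Squared loss of the weight-normalized student w = g * v / ||v||."""
    v, g, a = params[:d], params[d], params[d + 1:]
    w = g * v / np.linalg.norm(v)
    preds = relu(Zs @ w) @ a
    return 0.5 * np.mean((preds - ys) ** 2)

def num_grad(fun, p, eps=1e-5):
    """Central-difference gradient (a real setup would use autodiff)."""
    grad = np.zeros_like(p)
    for i in range(len(p)):
        e = np.zeros_like(p); e[i] = eps
        grad[i] = (fun(p + e) - fun(p - e)) / (2 * eps)
    return grad

params = rng.normal(size=d + 1 + k)          # random init: (v, g, a)
loss0 = loss(params)
for _ in range(500):
    params -= 0.01 * num_grad(loss, params)  # plain gradient descent
loss_final = loss(params)
```

Depending on the random initialization, such a run can descend either toward the teacher parameters or toward the spurious local minimum, which is exactly the constant-probability dichotomy the abstract describes.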
Standard model-free deep reinforcement learning (RL) algorithms sample a new initial state for each trial, allowing them to optimize policies that can perform well even in highly stochastic environments. However, problems that exhibit considerable initial state variation typically produce high-variance gradient estimates for model-free RL, making direct policy or value function optimization challenging. In this paper, we develop a novel algorithm that instead optimizes an ensemble of policies, each on a different "slice" of the initial state space, and gradually unifies them into a single policy that can succeed on the whole state space. This approach, which we term divide-and-conquer RL, is able to solve complex tasks where conventional deep RL methods are ineffective. Our results show that divide-and-conquer RL greatly outperforms conventional policy gradient methods on challenging grasping, manipulation, and locomotion tasks, and exceeds the performance of a variety of prior methods.
Workflows provide an expressive programming model for fine-grained control of large-scale applications in distributed computing environments. Accurate estimates of complex workflow execution metrics on large-scale machines have several key advantages. The performance of scheduling algorithms that rely on estimates of execution metrics degrades when the accuracy of predicted execution metrics decreases. This in-progress paper presents a technique being developed to improve the accuracy of predicted performance metrics of large-scale workflows on distributed platforms. The central idea of this work is to train resource-centric machine learning agents to capture complex relationships between a set of program instructions and their performance metrics when executed on a specific resource. This resource-centric view of a workflow exploits the fact that predicting execution times of sub-modules of a workflow requires monitoring and modeling of a few dynamic and static features. We transform the input workflow that is essentially a directed acyclic graph of actions into a Physical Resource Execution Plan (PREP). This transformation enables us to model an arbitrarily complex workflow as a set of simpler programs running on physical nodes. We delegate a machine learning model to capture performance metrics for each resource type when it executes different program instructions under varying degrees of resource contention. Our algorithm takes the prediction metrics from each resource agent and composes the overall workflow performance metrics by utilizing the structure of the corresponding Physical Resource Execution Plan.
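Composing per-resource predictions along the plan can be sketched as a critical-path computation over the DAG: a task starts when all its parents finish, and the workflow estimate is the latest finish time. The lookup-table "agents", task names, and timings below are hypothetical stand-ins for the learned resource-centric models.

```python
# Hypothetical per-resource agents: (resource, op) -> predicted seconds.
# In the described approach these would be trained ML models.
predict = {("cpu", "decode"): 2.0, ("cpu", "filter"): 1.0,
           ("gpu", "train"): 5.0, ("disk", "write"): 0.5}

def workflow_makespan(dag, placement):
    """Compose per-task predictions along the DAG.
    dag: {task: [parent tasks]}; placement: {task: (resource, op)}."""
    finish = {}                      # memoized finish times
    def finish_time(task):
        if task not in finish:
            start = max((finish_time(p) for p in dag[task]), default=0.0)
            finish[task] = start + predict[placement[task]]
        return finish[task]
    return max(finish_time(t) for t in dag)

# A toy linear plan: decode -> filter -> train -> write.
dag = {"decode": [], "filter": ["decode"],
       "train": ["filter"], "write": ["train"]}
placement = {"decode": ("cpu", "decode"), "filter": ("cpu", "filter"),
             "train": ("gpu", "train"), "write": ("disk", "write")}
total = workflow_makespan(dag, placement)
```

Because the composition only needs per-(resource, op) estimates, improving any single resource agent directly improves the end-to-end workflow prediction, which is the appeal of the resource-centric view.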
Nov 16 2017
cs.SY arXiv:1711.04884v1
We consider a class of piecewise-deterministic Markov processes where the state evolves according to a linear dynamical system. This continuous time evolution is interspersed by discrete events that occur at random times and change (reset) the state based on a linear affine map. In particular, we consider two families of discrete events, with the first family of resets occurring at exponentially-distributed times. The second family of resets is generally distributed, in the sense that the time intervals between events are independent and identically distributed random variables that follow an arbitrary continuous positively-valued probability density function. For this class of stochastic systems, we provide explicit conditions that lead to finite stationary moments, and the corresponding exact closed-form moment formulas. These results are illustrated on an example drawn from systems biology, where a protein is expressed in bursts at exponentially-distributed time intervals, decays within the cell-cycle, and is randomly divided among daughter cells when generally-distributed cell-division events occur. Our analysis leads to novel results for the mean and noise levels in protein copy numbers, and we decompose the noise levels into components arising from stochastic expression, random cell-cycle times, and partitioning. Interestingly, these individual noise contributions behave differently as cell division times become more random. In summary, the paper expands the class of stochastic hybrid systems for which statistical moments can be derived exactly without any approximations, and these results have applications for studying random phenomena in diverse areas.
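A stripped-down instance of this hybrid-system class (bursty production at exponential times, continuous exponential decay, no cell-division resets) has the classical stationary first moment $E[x] = k_{\text{burst}} \cdot \bar{b} / \gamma$, which a short Monte-Carlo simulation can check. The parameter values are arbitrary illustrative choices.

```python
import math, random

def simulate_mean_protein(k_burst, mean_burst, gamma, t_end=20000.0, seed=0):
    """Time-average of x(t) for bursty expression with decay rate gamma:
    between Poisson(k_burst) bursts, x decays as x * exp(-gamma * t);
    each burst adds an exponential jump with mean mean_burst."""
    rng = random.Random(seed)
    x, t, area = 0.0, 0.0, 0.0
    while t < t_end:
        tau = min(rng.expovariate(k_burst), t_end - t)  # time to next burst
        # exact integral of x * exp(-gamma*s) over the inter-burst interval
        area += x * (1.0 - math.exp(-gamma * tau)) / gamma
        x *= math.exp(-gamma * tau)                     # deterministic decay
        t += tau
        if t < t_end:
            x += rng.expovariate(1.0 / mean_burst)      # burst reset
    return area / t_end

k_burst, mean_burst, gamma = 2.0, 3.0, 0.5
mc_mean = simulate_mean_protein(k_burst, mean_burst, gamma)
exact_mean = k_burst * mean_burst / gamma               # = 12.0
```

The paper's contribution is precisely that such moments (and higher ones, with the general cell-division resets included) come out in closed form without needing simulation.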
The experimental design problem concerns the selection of k points from a potentially large design pool of p-dimensional vectors, so as to maximize the statistical efficiency regressed on the selected k design points. Statistical efficiency is measured by optimality criteria, including A(verage), D(eterminant), T(race), E(igen), V(ariance) and G-optimality. Except for the T-optimality, exact optimization is NP-hard. We propose a polynomial-time regret minimization framework to achieve a $(1+\varepsilon)$ approximation with only $O(p/\varepsilon^2)$ design points, for all the optimality criteria above. In contrast, to the best of our knowledge, before our work, no polynomial-time algorithm achieves $(1+\varepsilon)$ approximations for D/E/G-optimality, and the best poly-time algorithm achieving $(1+\varepsilon)$-approximation for A/V-optimality requires $k = \Omega(p^2/\varepsilon)$ design points.
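For intuition about what these criteria optimize, a common reference heuristic for D-optimality greedily adds the design point that most increases $\log\det(X_S^\top X_S)$, scored cheaply via the matrix-determinant lemma. This is NOT the paper's regret-minimization algorithm (and carries no $(1+\varepsilon)$ guarantee); the data below is synthetic.

```python
import numpy as np

def greedy_d_optimal(X, k, ridge=1e-6):
    """Greedy D-optimality baseline: repeatedly add the row maximizing
    det(A + x x^T) / det(A) = 1 + x^T A^{-1} x, with A = X_S^T X_S."""
    n, p = X.shape
    chosen, A = [], ridge * np.eye(p)      # small ridge keeps A invertible
    for _ in range(k):
        A_inv = np.linalg.inv(A)
        best, best_gain = None, -np.inf
        for i in range(n):
            if i in chosen:
                continue
            gain = 1.0 + X[i] @ A_inv @ X[i]   # matrix-determinant lemma
            if gain > best_gain:
                best, best_gain = i, gain
        chosen.append(best)
        A += np.outer(X[best], X[best])
    return chosen

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
X[7] = np.array([50.0, 0.0, 0.0])      # one far more informative point
sel = greedy_d_optimal(X, k=5)
```

The greedy rule picks the dominant point first and then favors directions not yet covered, which is the qualitative behavior any determinant-based criterion rewards.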
We consider the problem of optimizing a high-dimensional convex function using stochastic zeroth-order queries. Under sparsity assumptions on the gradients or function values, we present two algorithms: a successive component/feature selection algorithm and a noisy mirror descent algorithm using Lasso gradient estimates, and show that both algorithms have convergence rates that depend only logarithmically on the ambient dimension of the problem. Empirical results confirm our theoretical findings and show that the algorithms we design outperform classical zeroth-order optimization methods in the high-dimensional setting.
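The successive component selection idea can be illustrated with a toy sketch (ours, not the paper's algorithm): screen each coordinate by the magnitude of a finite-difference gradient estimate, keep the s largest, and run zeroth-order descent restricted to that support. The function name `sparse_zo_minimize` and all parameters are illustrative assumptions, and queries here are noiseless for simplicity, whereas the paper handles stochastic queries.

```python
def sparse_zo_minimize(f, dim, s, iters=300, lr=0.2, delta=1e-3):
    # Toy successive component selection: screen each coordinate by the
    # magnitude of a finite-difference gradient estimate at the origin,
    # keep the s largest, then run zeroth-order descent on that support.
    # (Noiseless queries for simplicity; the paper handles noisy ones.)
    def fd_grad(x, i):
        e = list(x)
        e[i] += delta
        return (f(e) - f(x)) / delta

    x = [0.0] * dim
    scores = sorted(((abs(fd_grad(x, i)), i) for i in range(dim)), reverse=True)
    support = [i for _, i in scores[:s]]
    for _ in range(iters):
        for i in support:
            x[i] -= lr * fd_grad(x, i)
    return x

# Sparse quadratic in 50 dimensions: only coordinates 3 and 7 matter,
# so the query cost after screening is independent of the ambient dimension.
f = lambda x: (x[3] - 1.0) ** 2 + (x[7] + 2.0) ** 2
x = sparse_zo_minimize(f, dim=50, s=2)
```

After screening, each iteration touches only the s selected coordinates, which is the source of the logarithmic (rather than linear) dimension dependence the abstract describes.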
Explainable deep learning models that can be applied to practical real-world scenarios and, in turn, consistently, rapidly and accurately identify specific and minute traits in applicable fields of the biological sciences are scarce. Here we consider one such real-world example, viz., accurate identification, classification and quantification of biotic and abiotic stresses in crop research and production. Until now, this has been done predominantly by manual visual inspection, which requires specialized training. However, such techniques are hindered by subjectivity resulting from inter- and intra-rater cognitive variability. Here, we demonstrate the ability of a machine learning framework to identify and classify a diverse set of foliar stresses in the soybean plant with remarkable accuracy. We also present an explanation mechanism using gradient-weighted class activation mapping that isolates the visual symptoms used by the model to make predictions. This unsupervised identification of unique visual symptoms for each stress provides a quantitative measure of stress severity, allowing for identification, classification and quantification in one framework. The learnt model appears to be agnostic to species and makes good predictions for other (non-soybean) species, demonstrating an ability for transfer learning.
Oct 17 2017
cs.CV arXiv:1710.05152v1
The iris serves as one of the best biometric modalities owing to its complex, unique and stable structure. However, it can still be spoofed using fabricated eyeballs and contact lenses. Accurate identification of contact lenses is a must for the reliable performance of any biometric authentication system based on this modality. In this paper, we present a novel approach for detecting contact lenses using a Generalized Hierarchically tuned Contact Lens detection Network (GHCLNet). We propose a hierarchical architecture for three-class ocular classification, namely: no lens, soft lens and cosmetic lens. Our network architecture is inspired by the ResNet-50 model. This network works on raw input iris images without any pre-processing or segmentation requirement, which is one of its prodigious strengths. We have performed extensive experimentation on two publicly available data-sets, namely: 1) IIIT-D and 2) ND, and on the IIT-K data-set (not publicly available) to ensure the generalizability of our network. The proposed architecture's results are quite promising and outperform the available state-of-the-art lens detection algorithms.
Oct 16 2017
cs.CV arXiv:1710.04681v1
Charcoal rot is a fungal disease that thrives in warm dry conditions and affects the yield of soybeans and other important agronomic crops worldwide. There is a need for robust, automatic and consistent early detection and quantification of disease symptoms, which are important in breeding programs for the development of improved cultivars and in crop production for the implementation of disease control measures for yield protection. Current methods of plant disease phenotyping are predominantly visual and hence are slow and prone to human error and variation. There has been increasing interest in hyperspectral imaging applications for early detection of disease symptoms. However, the high dimensionality of hyperspectral data makes it very important to have an efficient analysis pipeline in place for the identification of disease so that effective crop management decisions can be made. The focus of this work is to determine the minimal number of most effective hyperspectral bands that can distinguish between healthy and diseased specimens early in the growing season. Healthy and diseased hyperspectral data cubes were captured at 3, 6, 9, 12, and 15 days after inoculation. We utilized inoculated and control specimens from 4 different genotypes. Each hyperspectral image was captured at 240 different wavelengths in the range of 383 to 1032 nm. We used a combination of a genetic algorithm as an optimizer and support vector machines as a classifier for the identification of maximally effective band combinations. A binary classification between healthy and infected samples using six selected band combinations obtained a classification accuracy of 97% and an F1 score of 0.97 for the infected class. The results demonstrated that these carefully chosen bands are more informative than RGB images, and could be used in a multispectral camera for remote identification of charcoal rot infection in soybean.
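As a rough sketch of the band-selection idea, the toy below evolves k-subsets of band indices with a genetic algorithm; a simple between-class separation score stands in for the SVM cross-validation fitness used in the paper, and all names, parameters and data are illustrative assumptions.

```python
import random

def ga_select_bands(healthy, infected, n_bands, k=6, pop=40, gens=60, seed=0):
    # Toy genetic algorithm over k-subsets of band indices. The fitness is
    # a simple between-class separation score per band, standing in for
    # the SVM cross-validation accuracy used in the paper.
    rng = random.Random(seed)

    def fitness(bands):
        score = 0.0
        for b in bands:
            h = [s[b] for s in healthy]
            i = [s[b] for s in infected]
            mh, mi = sum(h) / len(h), sum(i) / len(i)
            var = sum((v - mh) ** 2 for v in h) + sum((v - mi) ** 2 for v in i)
            score += abs(mh - mi) / ((var / (len(h) + len(i))) ** 0.5 + 1e-9)
        return score

    population = [rng.sample(range(n_bands), k) for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop // 2]               # elitist selection
        children = []
        while len(children) < pop - len(survivors):
            a, b = rng.sample(survivors, 2)
            child = rng.sample(a + b, k)                 # crossover
            if rng.random() < 0.3:                       # mutation
                child[rng.randrange(k)] = rng.randrange(n_bands)
            child = list(set(child))
            while len(child) < k:                        # repair duplicates
                child.append(rng.randrange(n_bands))
                child = list(set(child))
            children.append(child)
        population = survivors + children
    return max(population, key=fitness)

# Synthetic spectra: the two classes differ only at bands 10 and 20,
# so a good band subset should contain both.
healthy = [[0.0] * 50 for _ in range(5)]
infected = [[0.0] * 50 for _ in range(5)]
for s in infected:
    s[10], s[20] = 5.0, 5.0
best = ga_select_bands(healthy, infected, n_bands=50, k=6)
```

The GA only searches over which bands to keep; any classifier-based fitness (such as the paper's SVM accuracy) can be dropped in place of the separation score.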
Oct 10 2017
cs.RO arXiv:1710.02610v1
Studying snake robot locomotion in a cluttered environment has been a complicated task because the motion model is discontinuous due to physical contact with obstacles, and the contact force cannot be determined solely from contact positions. We present a unique mathematical model of the robot interacting with obstacles in which the contact forces are mapped on the basis of a viscous friction model. We also introduce a motion planning strategy that helps derive the simplest path ensuring a sufficient number of contacts between the robot and the obstacles to reach a goal position. Numerical simulations and experimental results are presented to validate the theoretical approach.
Finding patterns in data and being able to retrieve information from those patterns is an important task in information retrieval. Complex search requirements that are not fulfilled by simple string matching and require exploring certain patterns in data demand a better query engine that can support searching via structured queries. In this article, we build a structured query engine that supports searching data through structured queries, along the lines of Elasticsearch. We show how we achieved real-time indexing and retrieval of data through a RESTful API, and how complex queries can be created and processed using the efficient data structures we designed for storing the data in a structured way. Finally, we conclude with an example of a movie recommendation system built on top of this query engine.
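A minimal illustration of the structured-query idea: an inverted index maps (field, token) pairs to document ids, so boolean structured queries reduce to set operations. `TinyQueryEngine` and its `must`/`should` parameters are invented for this sketch and are not Elasticsearch's actual API or the article's implementation.

```python
from collections import defaultdict

class TinyQueryEngine:
    # Minimal structured query engine sketch: an inverted index from
    # (field, token) pairs to document ids; boolean queries become
    # set intersections and unions over posting sets.
    def __init__(self):
        self.docs = {}
        self.index = defaultdict(set)

    def add(self, doc_id, doc):
        self.docs[doc_id] = doc
        for field, value in doc.items():
            for token in str(value).lower().split():
                self.index[(field, token)].add(doc_id)

    def search(self, must=(), should=()):
        # 'must' terms are intersected; 'should' terms are unioned in.
        ids = None
        for field, token in must:
            hits = self.index[(field, token.lower())]
            ids = hits if ids is None else ids & hits
        if ids is None:
            ids = set()
        for field, token in should:
            ids |= self.index[(field, token.lower())]
        return sorted(ids)

engine = TinyQueryEngine()
engine.add(1, {"title": "The Matrix", "genre": "sci-fi action"})
engine.add(2, {"title": "Matrix Reloaded", "genre": "sci-fi"})
engine.add(3, {"title": "Heat", "genre": "action"})
```

A movie recommender like the one the article mentions could sit on top of `search`, e.g. retrieving candidates by shared genre tokens before ranking them.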
Oct 03 2017
cs.SY arXiv:1710.00289v1
In this paper, trajectory prediction and control design for a desired hit point of a projectile is studied. Projectiles are subject to environmental noise such as wind effects and measurement noise. In addition, mathematical models of projectiles contain a large number of important states that should be taken into account to obtain a realistic prediction. Furthermore, the dynamics of projectiles contain nonlinear functions such as monomials and sine functions. To address all these issues, we formulate a stochastic model for the projectile. We show that, with a set of transformations, the projectile dynamics contain only nonlinearities in the form of monomials. In the next step, we derive approximate moment dynamics of this system using a mean-field approximation. Our method still suffers from the size of the system. To address this problem, we select a subset of first- and second-order statistical moments and show that they give reliable approximations of the mean and standard deviation of the impact point for a real projectile. Finally, we use these selected moments to derive a control law that reduces the error in hitting a desired point.
Oct 02 2017
cs.RO arXiv:1709.10452v2
This paper presents a novel design of an omnidirectional bendable OmniCrawler module, CObRaSO. Along with longitudinal crawling and sideways rolling motion, the performance of the OmniCrawler is further enhanced by the introduction of omnidirectional bending within the module, which is the key contribution of this paper. The omnidirectional bending is achieved by an arrangement of two independent 1-DOF joints aligned at 90° w.r.t. each other. The unique characteristic of this module is its ability to crawl in an omnidirectionally bent configuration, which is achieved by a novel design of a 2-DOF roller chain and a backbone with a hybrid soft-rigid material structure. This hybrid structure provides compliant pathways for the lug-chain assembly to passively conform to the orientation of the module and crawl in an omnidirectionally bent configuration, which makes this module one of a kind. Furthermore, we show that the unique modular design of CObRaSO unveils its versatility by achieving active compliance on an uneven surface, demonstrating its applications in different robotic platforms (an in-pipeline robot, a quadruped and a snake robot) and exhibiting hybrid locomotion modes in various configurations of the robots. The mechanism and mobility characteristics of the proposed module have been verified with the aid of simulations and experiments on a real robot prototype.
Existing socio-psychological studies suggest that users of a social network form their opinions relying on the opinions of their neighbors. According to the DeGroot opinion formation model, one value of particular importance is the asymptotic consensus value: the sum of user opinions weighted by the users' eigenvector centralities. This value plays the role of an attractor for the opinions in the network and is a lucrative target for external influence. However, since any potentially malicious control of the opinion distribution in a social network is clearly undesirable, it is important to design methods to prevent external attempts to strategically change the asymptotic consensus value. In this work, we assume that the adversary wants to maximize the asymptotic consensus value by altering the opinions of some users in a network; we then state DIVER, an NP-hard problem of disabling such external influence attempts by strategically adding a limited number of edges to the network. Relying on the theory of Markov chains, we provide a perturbation analysis that shows how eigenvector centrality and, hence, DIVER's objective function change in response to an edge's addition to the network. The latter leads to the design of a pseudo-linear-time heuristic for DIVER, whose computation relies on efficient estimation of mean first passage times in a Markov chain. We confirm our theoretical findings in experiments.
Sep 05 2017
cs.DC arXiv:1709.00681v4
A major focus of software transactional memory systems (STMs) has been to facilitate multiprocessor programming and provide parallel programmers with an abstraction for the speedy and efficient development of parallel applications. To this end, different models for incorporating object/higher-level semantics into STM have recently been proposed in transactional boosting, transactional data structure libraries, open nested transactions and abstract nested transactions. We build an alternative object model STM (OSTM) by adopting the transactional tree model of Weikum et al., originally given for databases, and extend the current work by proposing comprehensive legality definitions and a conflict notion which allows efficient composability of OSTM. We show for the first time that the proposed OSTM satisfies co-opacity. We build OSTM using a chained hash table data structure, with a lazy skip-list used to implement the chaining. We observe that the major concurrency hotspot is the chaining data structure within the hash table. The lazy skip-list is time-efficient compared to lists in terms of traversal overhead, with an average case of O(log(n)). We optimise lookups, as they are validated at the instant they execute and are not validated again in the commit phase. This allows lookup-dominated transactions to be more efficient while preserving co-opacity.
Aug 31 2017
cs.CV arXiv:1708.09212v1
The paper proposes the ScatterNet Hybrid Deep Learning (SHDL) network that extracts invariant and discriminative image representations for object recognition. The SHDL framework is constructed with a multi-layer ScatterNet front-end, an unsupervised learning middle module, and a supervised learning back-end module. Each layer of the SHDL network is automatically designed as an explicit optimization problem, leading to an optimal deep learning architecture with improved computational performance as compared to more usual deep network architectures. The SHDL network produces state-of-the-art classification performance against unsupervised and semi-supervised learning (GANs) on two image datasets. Advantages of the SHDL network over supervised methods (NIN, VGG) are also demonstrated with experiments performed on training datasets of reduced size.
Aug 31 2017
cs.CV arXiv:1708.09317v1
Disguised face identification (DFI) is an extremely challenging problem due to the numerous variations that can be introduced using different disguises. This paper introduces a deep learning framework to first detect 14 facial key-points, which are then utilized to perform disguised face identification. Since the training of deep learning architectures relies on large annotated datasets, two annotated facial key-point datasets are introduced. The effectiveness of the facial key-point detection framework is presented for each key-point. The superiority of the key-point detection framework is also demonstrated by a comparison with other deep networks. The effectiveness of the classification is likewise demonstrated by comparison with state-of-the-art face disguise classification methods.
Aug 31 2017
cs.CV arXiv:1708.09300v1
Automation of brain matter segmentation from MR images is a challenging task due to the irregular boundaries between the grey and white matter regions. In addition, the presence of intensity inhomogeneity in the MR images further complicates the problem. In this paper, we propose a texture and vesselness incorporated version of the ScatterNet Hybrid Deep Learning Network (TS-SHDL) that extracts hierarchical invariant mid-level features, used by Fisher vector encoding and a conditional random field (CRF) to perform the desired segmentation. The performance of the proposed network is evaluated by extensive experimentation and comparison with the state-of-the-art methods on several 2D MRI scans taken from the synthetic McGill Brain Web as well as on the MRBrainS dataset of real 3D MRI scans. The advantages of the TS-SHDL network over supervised deep learning networks are also presented, in addition to its superior performance over the state-of-the-art.
We propose a DTCWT ScatterNet Convolutional Neural Network (DTSCNN) formed by replacing the first few layers of a CNN with a parametric log-based DTCWT ScatterNet. The ScatterNet extracts edge-based invariant representations that are used by the later layers of the CNN to learn high-level features. This improves the training of the network, as the later layers can learn more complex patterns from the start of learning because the edge representations are already present. The efficient learning of the DTSCNN network is demonstrated on the CIFAR-10 and Caltech-101 datasets. The generic nature of the ScatterNet front-end is shown by an equivalent performance to pre-trained CNN front-ends. A comparison with the state-of-the-art on the CIFAR-10 and Caltech-101 datasets is also presented.
Over 150,000 people in the United States are diagnosed with colorectal cancer each year, and nearly a third die from it (American Cancer Society). The only approved noninvasive diagnostic tools currently available are fecal occult blood tests (FOBTs) and stool DNA tests. Fecal blood tests take only five minutes and are available over the counter for as low as \$15. They are highly specific, yet not nearly as sensitive, yielding a high percentage (25%) of false negatives (Colon Cancer Alliance). Moreover, FOBT results are far too generalized: a positive result could indicate much more than just colorectal cancer, and could just as easily mean hemorrhoids, anal fissure, proctitis, Crohn's disease, diverticulosis, ulcerative colitis, rectal ulcer, rectal prolapse, ischemic colitis, angiodysplasia, rectal trauma, proctitis from radiation therapy, and others. Stool DNA tests, the modern benchmark for CRC screening, have much higher sensitivity and specificity, but also cost \$600, take two weeks to process, and are not intended for high-risk individuals or people with a history of polyps. To yield a cheap and effective CRC screening alternative, a unique ensemble-based classification algorithm is developed that considers the FIT result, BMI, smoking history, and diabetic status of patients. Under ten-fold cross-validation, this method achieves a 0.95 AUC, 92% specificity, 89% sensitivity, a 0.88 F1 score, and 90% precision. Once clinically validated, this test promises to be cheaper, faster, and potentially more accurate than a stool DNA test.
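To make the ensemble idea concrete, here is a toy soft-voting sketch over the four features the abstract lists. The function name, weights and cutoffs are invented placeholders, not the paper's fitted model.

```python
def risk_ensemble(fit_positive, bmi, smoker, diabetic):
    # Toy soft-voting ensemble over weak rules on the four listed features.
    # Weights and cutoffs are illustrative placeholders, not fitted values.
    votes = [
        (0.5, 1.0 if fit_positive else 0.0),   # FIT result carries most weight
        (0.2, 1.0 if bmi >= 30 else 0.0),      # obesity rule (cutoff assumed)
        (0.2, 1.0 if smoker else 0.0),
        (0.1, 1.0 if diabetic else 0.0),
    ]
    score = sum(w * v for w, v in votes)
    return score, score >= 0.5                 # flag for follow-up screening

high = risk_ensemble(True, 32, True, False)
low = risk_ensemble(False, 24, False, False)
```

In a real system the weights would be learned and validated (as the paper does with ten-fold cross-validation) rather than hand-set.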
We tackle the problem of learning robotic sensorimotor control policies that can generalize to visually diverse and unseen environments. Achieving broad generalization typically requires large datasets, which are difficult to obtain for task-specific interactive processes such as reinforcement learning or learning from demonstration. However, much of the visual diversity in the world can be captured through passively collected datasets of images or videos. In our method, which we refer to as GPLAC (Generalized Policy Learning with Attentional Classifier), we use both interaction data and weakly labeled image data to augment the generalization capacity of sensorimotor policies. Our method combines multitask learning on action selection and an auxiliary binary classification objective, together with a convolutional neural network architecture that uses an attentional mechanism to avoid distractors. We show that pairing interaction data from just a single environment with a diverse dataset of weakly labeled data results in greatly improved generalization to unseen environments, and show that this generalization depends on both the auxiliary objective and the attentional architecture that we propose. We demonstrate our results in both simulation and on a real robotic manipulator, and demonstrate substantial improvement over standard convolutional architectures and domain adaptation methods.
Albert Haque, Michelle Guo, Alexandre Alahi, Serena Yeung, Zelun Luo, Alisha Rege, Jeffrey Jopling, Lance Downing, William Beninati, Amit Singh, Terry Platchek, Arnold Milstein, Li Fei-Fei Aug 02 2017
cs.CV arXiv:1708.00163v3
One in twenty-five patients admitted to a hospital will suffer from a hospital acquired infection. If we can intelligently track healthcare staff, patients, and visitors, we can better understand the sources of such infections. We envision a smart hospital capable of increasing operational efficiency and improving patient care with less spending. In this paper, we propose a non-intrusive vision-based system for tracking people's activity in hospitals. We evaluate our method for the problem of measuring hand hygiene compliance. Empirically, our method outperforms existing solutions such as proximity-based techniques and covert in-person observational studies. We present intuitive, qualitative results that analyze human movement patterns and conduct spatial analytics which convey our method's interpretability. This work is a step towards a computer-vision based smart hospital and demonstrates promising results for reducing hospital acquired infections.
Quadcopters, as unmanned aerial vehicles (UAVs), have great potential in civil applications such as surveying, building monitoring, and infrastructure condition assessment. Quadcopters, however, are relatively sensitive to noise and disturbances, so their performance may quickly degrade in the case of inadequate control, system uncertainties and/or external disturbances. In this study, we deal with low-level control of the quadrotor by proposing a robust scheme named the adaptive second-order quasi-continuous sliding mode control (adaptive 2-QCSM). The ultimate objective is robust attitude control of the UAV in monitoring and inspection of built infrastructure. First, the mathematical model of the quadcopter is derived considering nonlinearity, strong coupling, uncertain dynamics and external disturbances. The control design includes the selection of the sliding manifold and the development of a quasi-continuous second-order sliding mode controller with an adaptive gain. Stability of the overall control system is analysed using a global Lyapunov function for convergence of both the sliding dynamics and the adaptation scheme. Extensive simulations have been carried out for evaluation. Results show that the proposed controller can achieve robustness against disturbances or parameter variations and has better tracking performance in comparison with experimental responses of a UAV in a real-time monitoring task.
Jul 27 2017
cs.DC arXiv:1707.08357v1
Writing concurrent programs for shared memory multiprocessor systems is a nightmare, which hinders users from exploiting the full potential of multiprocessors. STM (Software Transactional Memory) is a promising concurrent programming paradigm that addresses the woes of programming for multiprocessor systems. In this paper, we implement the BTO (Basic Timestamp Ordering), SGT (Serialization Graph Testing) and MVTO (Multi-Version Timestamp Ordering) concurrency control protocols and build an STM library to evaluate the performance of these protocols. The deferred write approach is followed to implement the STM. A SET data structure is implemented using the transactions of our STM library, and this transactional SET is used as a test application to evaluate the STM. The performance of the protocols is rigorously compared against the linked-list module of the Synchrobench benchmark, which implements the SET data structure using lazy-list, lock-free list, lock-coupling list and ESTM (Elastic Software Transactional Memory). Our analysis shows that for more than 60 threads and an update rate of 70%, BTO takes 17% to 29% and 6% to 24% less CPU time per thread when compared against the lazy-list and lock-coupling list respectively. MVTO takes 13% to 24% and 3% to 24% less CPU time per thread when compared against the lazy-list and lock-coupling list respectively. BTO and MVTO have similar per-thread CPU times, and both outperform SGT by 9% to 36%.
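For intuition, a minimal single-object sketch of the BTO rule (illustrative only; the paper's library handles full transactions with deferred writes): an operation is admitted only if it does not arrive behind a younger transaction's already-recorded timestamp, otherwise the transaction must abort.

```python
class BTOObject:
    # Minimal Basic Timestamp Ordering sketch for one shared object.
    # Each object remembers the largest read and write timestamps seen;
    # an operation arriving "too late" forces an abort.
    def __init__(self, value=None):
        self.value = value
        self.read_ts = 0
        self.write_ts = 0

    def read(self, ts):
        if ts < self.write_ts:
            return False, None           # a younger txn already wrote: abort
        self.read_ts = max(self.read_ts, ts)
        return True, self.value

    def write(self, ts, value):
        if ts < self.read_ts or ts < self.write_ts:
            return False                 # conflicts with a younger txn: abort
        self.write_ts = ts
        self.value = value
        return True

x = BTOObject(0)
ok_w1 = x.write(1, 10)                   # txn 1 writes
ok_r2, v2 = x.read(2)                    # txn 2 reads the new value
ok_w1_late = x.write(1, 99)              # txn 1 writes again, too late: abort
ok_w3 = x.write(3, 30)                   # txn 3 is youngest, so it succeeds
```

MVTO extends this idea by keeping multiple versions per object so late readers can be served an older version instead of aborting.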
Jul 11 2017
cs.CR arXiv:1707.02691v1
Malware is becoming persistent by creating full-fledged variants of the same or a different family. Malware belonging to the same family shares the same characteristics in its functionality of spreading infections to the victim computer. These similar characteristics among malware families can be taken as a measure for creating a solution that can help in the detection of malware belonging to a particular family. In our approach, we take advantage of these characteristics to detect malware families by creating a database of the characteristics in the form of n-grams of API sequences. We use various similarity score methods and also extract multiple API sequences to analyze malware effectively.
Jun 21 2017
cs.RO arXiv:1706.06418v1
This paper discusses the design of a novel compliant in-pipe climbing modular robot for small diameter pipes. The robot consists of a kinematic chain of 3 OmniCrawler modules with a link connected between 2 adjacent modules via compliant joints. While the tank-like crawler mechanism provides good traction on low friction surfaces, its circular cross-section makes it holonomic. The holonomic motion assists it to re-align in a direction to avoid obstacles during motion as well as overcome turns with a minimal energy posture. Additionally, the modularity enables it to negotiate T-junctions without motion singularity. The compliance is realized using 4 torsion springs incorporated in the joints joining the 3 modules with the 2 links. For a desirable pipe diameter (Ø 75 mm), the springs' stiffness values are obtained by formulating a constrained optimization problem, which has been simulated in MSC ADAMS and further validated on a real robot prototype. In order to negotiate smooth vertical bends and friction coefficient variations in pipes, the design was later modified by replacing the springs with series elastic actuators (SEAs) at 2 of the 4 joints.
Although gradient descent (GD) almost always escapes saddle points asymptotically [Lee et al., 2016], this paper shows that even with fairly natural random initialization schemes and non-pathological functions, GD can be significantly slowed down by saddle points, taking exponential time to escape. On the other hand, gradient descent with perturbations [Ge et al., 2015, Jin et al., 2017] is not slowed down by saddle points - it can find an approximate local minimizer in polynomial time. This result implies that GD is inherently slower than perturbed GD, and justifies the importance of adding perturbations for efficient non-convex optimization. While our focus is theoretical, we also present experiments that illustrate our theoretical findings.
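The phenomenon is easy to reproduce on a toy saddle (this sketch is ours, not the paper's construction): on f(x, y) = x² − y², plain GD initialized on the stable manifold (y = 0) converges to the saddle and never escapes, while tiny random perturbations seed the unstable direction, which then grows.

```python
import random

def grad(x, y):
    # Gradient of the toy saddle f(x, y) = x**2 - y**2.
    return 2.0 * x, -2.0 * y

def gd(x, y, steps=200, lr=0.1, noise=0.0, seed=1):
    # Gradient descent with an optional uniform perturbation each step.
    rng = random.Random(seed)
    for _ in range(steps):
        gx, gy = grad(x, y)
        x -= lr * gx
        y -= lr * gy
        if noise:
            x += rng.uniform(-noise, noise)
            y += rng.uniform(-noise, noise)
    return x, y

# Plain GD started at (1, 0): the y-gradient is identically zero on this
# initialization, so the iterate converges to the saddle at the origin.
x0, y0 = gd(1.0, 0.0)
# Perturbed GD: noise pushes the iterate off y = 0 and it escapes.
x1, y1 = gd(1.0, 0.0, noise=1e-3)
```

The paper's point is quantitative rather than this binary picture: even off the stable manifold, unperturbed GD can take exponentially long to leave a chain of saddles, while perturbed GD escapes in polynomial time.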
May 09 2017
cs.DC arXiv:1705.02884v4
Concurrent data structures (CDSs) such as concurrent stacks, queues, sets etc. have become very popular in the past few years, partly due to the rise of multi-core systems. But one of the greatest challenges with CDSs has been developing correct structures and then proving the correctness of these structures. We believe that techniques that help prove the correctness of these CDSs can also guide the development of new CDSs. An intuitive technique for proving the correctness of CDSs is using Linearization Points or LPs. An LP is an atomic event in the execution interval of each method such that the execution of the entire method seems to have taken place in the instant of that event. One of the main challenges with the LP-based approach is to identify the correct LPs of a CDS. Identifying the correct LPs can be deceptively difficult; in fact, in many cases, the LP identified, or even worse the CDS itself, could be wrong. To address these issues, several automatic tools for verifying linearizability have been developed. But we believe that these tools don't provide insight to a programmer into developing correct concurrent programs or identifying the LPs. Considering the complexity of developing a CDS and verifying its correctness, we address the most basic problem of this domain in this paper: given the set of LPs of a CDS, how can its correctness be shown? We assume that we are given a CDS and its LPs, and we have developed a hand-crafted technique for proving the correctness of the CDS by validating its LPs. As observed earlier, identifying the correct LPs is very tricky and error-prone. But since our technique is hand-crafted, we believe that the process of proving correctness might provide insight to identify the correct LPs, if the currently chosen LP is incorrect. We also believe that this technique might offer the programmer some insight to develop more efficient variants of the CDS.
May 02 2017
cs.CV arXiv:1705.00360v2
This paper presents a novel real-time method for tracking salient closed boundaries from video image sequences. This method operates on a set of straight line segments that are produced by line detection. The tracking scheme is coherently integrated into a perceptual grouping framework in which the visual tracking problem is tackled by identifying a subset of these line segments and connecting them sequentially to form a closed boundary with the largest saliency and a certain similarity to the previous one. Specifically, we define a new tracking criterion which combines a grouping cost and an area similarity constraint. The proposed criterion makes the resulting boundary tracking more robust to local minima. To achieve real-time tracking performance, we use Delaunay triangulation to build a graph model from the detected line segments and then reduce the tracking problem to finding the optimal cycle in this graph. This is solved by our newly proposed closed-boundary candidate search algorithm, called "Bidirectional Shortest Path (BDSP)". The efficiency and robustness of the proposed method are tested on real video sequences as well as during a robot arm pouring experiment.
Apr 25 2017
cs.RO arXiv:1704.06817v1
This paper presents a modular in-pipeline climbing robot with a novel compliant foldable OmniCrawler mechanism. The circular cross-section of the OmniCrawler module enables a holonomic motion to facilitate the alignment of the robot in the direction of bends. Additionally, the crawler mechanism provides a fair amount of traction, even on slippery surfaces. These advantages of crawler modules have been further supplemented by incorporating active compliance in the module itself, which helps to negotiate sharp bends in small diameter pipes. The robot has a series of 3 such compliant foldable modules interconnected by links via passive joints. For the desirable pipe diameter and curvature of the bends, the spring stiffness value for each passive joint is determined by formulating a constrained optimization problem using the quasi-static model of the robot. Moreover, a minimum value of the friction coefficient between the module and the pipe surface that allows the robot to climb vertically without slipping is estimated. The numerical simulation results have further been validated by experiments on a real robot prototype.
Browsers can detect malicious websites that are provisioned with forged or fake TLS/SSL certificates. However, they are not so good at detecting malicious websites if they are provisioned with mistakenly issued certificates or certificates that have been issued by a compromised certificate authority. Google proposed certificate transparency which is an open framework to monitor and audit certificates in real time. Thereafter, a few other certificate transparency schemes have been proposed which can even handle revocation. All currently known constructions use Merkle hash trees and have proof size logarithmic in the number of certificates/domain owners. We present a new certificate transparency scheme with short (constant size) proofs. Our construction makes use of dynamic bilinear-map accumulators. The scheme has many desirable properties like efficient revocation, low verification cost and update costs comparable to the existing schemes. We provide proofs of security and evaluate the performance of our scheme.
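For contrast with the constant-size proofs above, here is a minimal sketch of the Merkle audit path used by existing certificate transparency schemes, whose proofs are O(log n) hashes. This is a generic textbook construction (with last-node duplication on odd levels as an assumed padding choice), not the paper's accumulator scheme.

```python
import hashlib

def h(data):
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    # Build the tree bottom-up and return the root digest.
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])      # duplicate last node on odd levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def merkle_proof(leaves, index):
    # Audit path for leaves[index]: one sibling hash per level,
    # so the proof has O(log n) entries.
    level = [h(leaf) for leaf in leaves]
    path = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        path.append((level[index ^ 1], index % 2 == 0))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return path

def verify(leaf, path, root):
    # Recompute the root by hashing up the audit path.
    node = h(leaf)
    for sibling, node_is_left in path:
        node = h(node + sibling) if node_is_left else h(sibling + node)
    return node == root

leaves = [("cert%d" % i).encode() for i in range(5)]
root = merkle_root(leaves)
proof = merkle_proof(leaves, 3)
```

A bilinear-map accumulator replaces the whole `path` with a single constant-size witness, which is exactly the saving the abstract claims.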
In this work we introduce Qumin, a novel quantum programming language with a focus on providing an easy to use, minimalist, high-level, and easily extensible platform for quantum programming. Qumin's design concentrates on encompassing the various interactions between classical and quantum computation via the use of two sublanguages: an untyped one that handles classical preparation and control, and a linearly typed one that explicitly handles quantum routines. This allows the classical part of the language to be freely used for general programming while placing restrictions on the quantum part that enforce rules of quantum computing like the no-cloning of qubits. We describe both the language's theoretical foundations in terms of lambda calculi and linear type systems, and more practical matters such as implementations of algorithms and useful programming tools like matrix and oracle generators that streamline the interaction of the classical and quantum fragments of a program. Finally, we provide an experimental open-source implementation of an interpreter, typechecker and related tools for the language (which can be found at https://github.com/wintershammer/QImp).
Apr 10 2017
cs.LG arXiv:1704.02197v2
Clustering is a useful data exploration method with wide applicability in multiple fields. However, data clustering relies heavily on the initialization of cluster centers, which can result in large intra-cluster variance and dead centers, leading to sub-optimal solutions. This paper proposes a novel variance-based version of the conventional Moving K-Means (MKM) algorithm, called Variance Based Moving K-Means (VMKM), that can partition data into optimal homogeneous clusters irrespective of cluster initialization. The algorithm utilizes a novel distance metric and a unique data element selection criterion to transfer selected elements between clusters, achieving low intra-cluster variance and subsequently avoiding dead centers. Quantitative and qualitative comparisons with various clustering techniques are performed on four datasets selected from image processing, bioinformatics, remote sensing and the stock market. An extensive analysis highlights the superior performance of the proposed method over other techniques.
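As a baseline for what VMKM improves on, here is a plain Lloyd's k-means sketch in one dimension, together with the total intra-cluster variance that VMKM's variance-based transfer step targets (an illustration only; the paper's distance metric and element selection criterion are not reproduced):

```python
import random

def kmeans(points, k, iters=50, seed=0):
    # Plain Lloyd's k-means on 1-D points; VMKM replaces the plain update
    # with a variance-based element-transfer step, not shown here.
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda i: (p - centers[i]) ** 2)
            clusters[j].append(p)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

def intra_cluster_variance(centers, clusters):
    # The quantity VMKM seeks to keep low and homogeneous across clusters.
    return sum((p - centers[i]) ** 2
               for i, c in enumerate(clusters) for p in c)
```

On well-separated data any initialization converges to the same partition; the paper's point is that on harder data the plain update gets stuck, motivating the variance-based transfers.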
Apr 04 2017
cs.SY arXiv:1704.00236v1
A Network Control System (NCS) consists of control components that interact with the plant over a shared network. The system dynamics of an NCS can be subject to noise arising from randomness in the times at which data is transmitted over the network, corruption of the transmitted data by the communication network, and external disturbances that might affect the plant. A question of interest is to understand how the statistics of the data transmission times affect the system dynamics, and under what conditions the system is stable. A related issue is designing a controller that meets desired performance specifications (e.g., a specific mean and variance of the system state). Here, we consider a minimal NCS that consists of a plant and a controller, and is subject to random transmission times, channel corruption and external disturbances. We derive the exact dynamics of the first two moments of the system, and use them to derive stability conditions. We further design a control law that steers the system to a desired mean and variance. Finally, we demonstrate our results using different examples, and show that under some specific conditions, randomness in the data transmission times can even reduce the variability contributed by the disturbance.
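A Monte-Carlo sketch of a scalar version of such a system makes the setting concrete (an illustrative toy model with invented parameters, not the paper's exact moment dynamics): the control input reaches the plant only when a Bernoulli transmission succeeds, and the stationary mean and variance of the state are estimated empirically.

```python
import random

def simulate(a=0.5, steps=2000, p_tx=0.7, noise=0.1, seed=1):
    """Scalar toy NCS: x_{k+1} = a*x_k + u_k + w_k, where the deadbeat
    control u_k = -a*x_k reaches the plant only when a Bernoulli(p_tx)
    transmission succeeds, and w_k is Gaussian process noise.
    Returns the empirical mean and variance of the state trajectory."""
    rng = random.Random(seed)
    x, xs = 1.0, []
    for _ in range(steps):
        u = -a * x if rng.random() < p_tx else 0.0
        x = a * x + u + rng.gauss(0.0, noise)
        xs.append(x)
    mean = sum(xs) / len(xs)
    var = sum((v - mean) ** 2 for v in xs) / len(xs)
    return mean, var
```

For this toy model the stationary variance solves Var = a^2 (1 - p_tx) Var + noise^2, about 0.011 for the defaults, which the simulation reproduces; the paper derives such moments exactly for general transmission-time statistics.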
Apr 03 2017
cs.LO arXiv:1703.10994v1
Separation Logic is an effective program logic for proving the correctness of programs that involve pointers. Reasoning about pointers becomes difficult especially when there is aliasing, arising due to several pointers referencing the same cell location. In this paper, we explore the problems caused by aliasing through some simple examples and introduce the notion of separating conjunction as a tool to deal with them. We introduce Separation Logic as an extension of standard Hoare Logic with the help of a programming language that has four pointer-manipulating commands. These commands perform the usual heap operations: lookup, update, allocation and deallocation. The new set of assertions and axioms of Separation Logic is presented in a semi-formal style. Examples are given to illustrate the unique features of the new assertions and axioms. Finally, the paper concludes with proofs of some real programs using the axioms of Separation Logic.
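Two representative rules convey the flavor in standard separation-logic notation (a sketch of the well-known rules, not the paper's exact presentation): the "small axiom" for heap update, which mentions only the single cell touched, and the frame rule, which lifts such local proofs to larger heaps via the separating conjunction.

```latex
% Small axiom for heap update: [E] := E' overwrites exactly the cell E.
\{E \mapsto -\}\; [E] := E' \;\{E \mapsto E'\}

% Frame rule: a local proof extends to any disjoint heap portion R,
% provided C does not modify variables free in R.
\frac{\{P\}\; C \;\{Q\}}
     {\{P * R\}\; C \;\{Q * R\}}
```

Because $P * R$ asserts that $P$ and $R$ hold on disjoint parts of the heap, the frame rule is sound even in the presence of aliasing elsewhere in the program.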
Apr 03 2017
cs.LO arXiv:1703.10977v1
We present fully formalized proofs of some central theorems from combinatorics. These are Dilworth's decomposition theorem, Mirsky's theorem, Hall's marriage theorem and the Erdős-Szekeres theorem. Dilworth's decomposition theorem is the key result among these. It states that in any finite partially ordered set (poset), the size of a smallest chain cover and a largest antichain are the same. Mirsky's theorem is a dual of Dilworth's decomposition theorem, which states that in any finite poset, the size of a smallest antichain cover and a largest chain are the same. We use Dilworth's theorem in the proofs of Hall's Marriage theorem and the Erdős-Szekeres theorem. The combinatorial objects involved in these theorems are sets and sequences. All the proofs are formalized in the Coq proof assistant. We develop a library of definitions and facts that can be used as a framework for formalizing other theorems on finite posets.
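Mirsky's theorem has a short constructive proof that doubles as an algorithm: assign each element the length of the longest chain ending at it; the resulting levels are antichains, and their number equals the length of a longest chain. A small brute-force Python sketch of this idea (an illustration only, not the Coq formalization):

```python
def mirsky_partition(elems, leq):
    """Partition a finite poset into antichains by height (Mirsky's proof):
    element x goes to level h(x) = length of a longest chain ending at x.
    The number of levels equals the length of a longest chain."""
    # Visiting elements in a linear extension makes heights well-defined;
    # sorting by the number of elements below-or-equal gives one.
    order = sorted(elems, key=lambda e: sum(leq(y, e) for y in elems))
    height = {}
    for x in order:
        below = [height[y] for y in order
                 if y in height and y != x and leq(y, x)]
        height[x] = 1 + max(below, default=0)
    levels = {}
    for x, hx in height.items():
        levels.setdefault(hx, []).append(x)
    return list(levels.values())
```

On the divisibility poset over {1, 2, 3, 4, 6, 12}, a longest chain such as 1 | 2 | 4 | 12 has length 4, and the partition produces exactly 4 antichains.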
We present two fully mechanized proofs of Dilworth's and Mirsky's theorems in the Coq proof assistant. Dilworth's Theorem states that in any finite partially ordered set (poset), the size of a smallest chain cover and a largest antichain are the same. Mirsky's Theorem is a dual of Dilworth's Theorem. We formalize the proofs by Perles [2] (for Dilworth's Theorem) and by Mirsky [5] (for the dual theorem). We also develop a library of definitions and facts that can be used as a framework for formalizing other theorems on finite posets.
Mar 17 2017
cs.HC arXiv:1703.05462v1
As virtual reality (VR) emerges as a mainstream platform, designers have started to experiment with new interaction techniques to enhance the user experience. This is a challenging task because designers not only strive to provide designs with good performance but also must be careful not to disrupt users' immersive experience. There is a dire need for a new evaluation tool that extends beyond traditional quantitative measurements to assist designers in the design process. We propose an EEG-based experiment framework that evaluates interaction techniques in VR by measuring intentionally elicited cognitive conflict. Through the analysis of the feedback-related negativity (FRN) as well as other quantitative measurements, this framework allows designers to evaluate the effect of the variables of interest. We studied the framework by applying it to the fundamental task of 3D object selection using direct 3D input, i.e., a tracked hand in VR. The cognitive conflict is intentionally elicited by manipulating the selection radius of the target object. Our first behavioral experiment validated the framework, showing conflict-induced behavior adjustments in line with those reported in classical psychology experiment paradigms. Our second EEG-based experiment examines the effect of the appearance of virtual hands. We found that the amplitude of the FRN correlates with the level of realism of the virtual hands, which concurs with the Uncanny Valley theory.
Mar 07 2017
cs.CV arXiv:1703.01698v2
This paper presents two visual trackers from the complementary paradigms of learning-based and registration-based tracking, and evaluates their application in image-based visual servoing. They can track object motion with four degrees of freedom (DoF), which, as we show here, is sufficient for many fine manipulation tasks. One of these trackers is a newly developed learning-based tracker that relies on learning discriminative correlation filters, while the other is a refinement of a recent 8 DoF RANSAC-based tracker, adapted with a new appearance model for tracking 4 DoF motion. Both trackers are shown to provide superior performance to several state-of-the-art trackers on an existing dataset for manipulation tasks. Further, a new dataset with challenging sequences for fine manipulation tasks, captured from robot-mounted eye-in-hand (EIH) cameras, is also presented. These sequences exhibit a variety of challenges encountered during real tasks, including jittery camera movement, motion blur, drastic scale changes and partial occlusions. Quantitative and qualitative results on these sequences show that these two trackers are robust to failures while providing high precision, making them suitable for such fine manipulation tasks.
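At its core, correlation-filter tracking scores candidate offsets by cross-correlating a learned template with the search region and picking the peak. A bare-bones 1-D stand-in for that matching step (not the paper's discriminative filter learning, which is done in the Fourier domain):

```python
def best_shift(signal, template):
    # Slide the template over the signal and return the offset with the
    # highest cross-correlation score -- the "response peak" that a
    # correlation-filter tracker localizes each frame.
    n, m = len(signal), len(template)
    scores = [sum(signal[i + j] * template[j] for j in range(m))
              for i in range(n - m + 1)]
    return max(range(len(scores)), key=scores.__getitem__)
```

Real trackers learn the template discriminatively and update it online; the localization principle is the same.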
Many conventional statistical procedures are extremely sensitive to seemingly minor deviations from modeling assumptions. This problem is exacerbated in modern high-dimensional settings, where the problem dimension can grow with and possibly exceed the sample size. We consider the problem of robust estimation of sparse functionals, and provide a computationally and statistically efficient algorithm in the high-dimensional setting. Our theory identifies a unified set of deterministic conditions under which our algorithm guarantees accurate recovery. By further establishing that these deterministic conditions hold with high-probability for a wide range of statistical models, our theory applies to many problems of considerable interest including sparse mean and covariance estimation; sparse linear regression; and sparse generalized linear models.
We show that given an estimate $\widehat{A}$ that is close to a general high-rank positive semi-definite (PSD) matrix $A$ in spectral norm (i.e., $\|\widehat{A}-A\|_2 \leq \delta$), the simple truncated SVD of $\widehat{A}$ produces a multiplicative approximation of $A$ in Frobenius norm. This observation leads to many interesting results on general high-rank matrix estimation problems, which we briefly summarize below ($A$ is an $n\times n$ high-rank PSD matrix and $A_k$ is the best rank-$k$ approximation of $A$): (1) High-rank matrix completion: By observing $\Omega(\frac{n\max\{\epsilon^{-4},k^2\}\mu_0^2\|A\|_F^2\log n}{\sigma_{k+1}(A)^2})$ elements of $A$ where $\sigma_{k+1}\left(A\right)$ is the $\left(k+1\right)$-th singular value of $A$ and $\mu_0$ is the incoherence, the truncated SVD on a zero-filled matrix satisfies $\|\widehat{A}_k-A\|_F \leq (1+O(\epsilon))\|A-A_k\|_F$ with high probability. (2) High-rank matrix de-noising: Let $\widehat{A}=A+E$ where $E$ is a Gaussian random noise matrix with zero mean and $\nu^2/n$ variance on each entry. Then the truncated SVD of $\widehat{A}$ satisfies $\|\widehat{A}_k-A\|_F \leq (1+O(\sqrt{\nu/\sigma_{k+1}(A)}))\|A-A_k\|_F + O(\sqrt{k}\nu)$. (3) Low-rank estimation of high-dimensional covariance: Given $N$ i.i.d.~samples $X_1,\cdots,X_N\sim\mathcal N_n(0,A)$, can we estimate $A$ with a relative-error Frobenius norm bound? We show that if $N = \Omega\left(n\max\{\epsilon^{-4},k^2\}\gamma_k(A)^2\log N\right)$ for $\gamma_k(A)=\sigma_1(A)/\sigma_{k+1}(A)$, then $\|\widehat{A}_k-A\|_F \leq (1+O(\epsilon))\|A-A_k\|_F$ with high probability, where $\widehat{A}=\frac{1}{N}\sum_{i=1}^N{X_iX_i^\top}$ is the sample covariance.
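For a symmetric PSD matrix, truncated SVD coincides with truncated eigendecomposition, so the $k=1$ truncation step can be sketched in pure Python with power iteration (an illustration of the truncation operation only, not the paper's error analysis):

```python
def power_iteration(A, iters=200):
    # Leading eigenpair of a symmetric PSD matrix via power iteration.
    n = len(A)
    v = [1.0] * n
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue estimate.
    lam = sum(v[i] * sum(A[i][j] * v[j] for j in range(n)) for i in range(n))
    return lam, v

def rank1_truncation(A):
    # Truncated SVD with k = 1 for a symmetric PSD matrix: keep only the
    # top eigenpair. The paper's guarantee is that this stays accurate in
    # Frobenius norm whenever the input is spectrally close to the target.
    lam, v = power_iteration(A)
    n = len(A)
    return [[lam * v[i] * v[j] for j in range(n)] for i in range(n)]
```

On $A = \begin{pmatrix}2&1\\1&2\end{pmatrix}$ (eigenvalues 3 and 1), the rank-1 truncation is the all-1.5 matrix, and the residual Frobenius norm equals the discarded eigenvalue, 1.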
The sparsest cut problem consists of identifying a small set of edges that breaks the graph into balanced sets of vertices. The normalized cut problem balances the total degree, instead of the size, of the resulting sets. Applications of graph cuts include community detection and computer vision. However, cut problems were originally proposed for static graphs, an assumption that does not hold in many modern applications where graphs are highly dynamic. In this paper, we introduce the sparsest and normalized cut problems in temporal graphs, which generalize their standard definitions by enforcing the smoothness of cuts over time. We propose novel formulations and algorithms for computing temporal cuts using spectral graph theory, multiplex graphs, divide-and-conquer and low-rank matrix approximation. Furthermore, we extend our formulation to dynamic graph signals, where cuts also capture node values, as graph wavelets. Experiments show that our solutions are accurate and scalable, enabling the discovery of dynamic communities and the analysis of dynamic graph processes.
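The static objective being generalized can be stated compactly: minimize (edges crossing $S$) / min(|S|, |V \ S|). A brute-force sketch on a tiny graph (exponential time, for illustration only; the paper's temporal formulation adds a smoothness term across snapshots and uses spectral methods instead):

```python
from itertools import combinations

def sparsity(edges, nodes, S):
    # Sparsity of a cut S: crossing edges divided by the smaller side's size.
    S = set(S)
    cut = sum(1 for u, v in edges if (u in S) != (v in S))
    return cut / min(len(S), len(nodes) - len(S))

def sparsest_cut(edges, nodes):
    # Exhaustive search over all subsets up to half the vertex set.
    best, best_S = float("inf"), None
    for r in range(1, len(nodes) // 2 + 1):
        for S in combinations(nodes, r):
            val = sparsity(edges, nodes, S)
            if val < best:
                best, best_S = val, set(S)
    return best, best_S
```

On two triangles joined by a single bridge edge, the sparsest cut severs the bridge, with sparsity 1/3.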
Feb 15 2017
cs.SI arXiv:1702.04082v3
Network centrality plays an important role in many applications. Central nodes in social networks can be influential, driving opinions and spreading news or rumors. In hyperlinked environments, such as the Web, where users navigate via clicks, central content receives high traffic, becoming a target for advertising campaigns. While there is an extensive amount of work on centrality measures and their efficient computation, controlling nodes' centrality via network updates is a more recent and challenging problem. Performing minimal modifications to a network to achieve a desired property falls under the umbrella of network design problems. This paper focuses on improving the coverage centrality of a set of nodes, which is the number of pairs of nodes that have a shortest path passing through the set, by adding edges to the network. We prove strong inapproximability results and propose a greedy algorithm for maximizing coverage centrality. To ensure scalability to large networks, we also design an efficient sampling algorithm for the problem. In addition to providing an extensive empirical evaluation of our algorithms, we also show that, under some realistic constraints, the proposed solutions achieve almost-optimal approximation for coverage centrality maximization.
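On small graphs the coverage centrality of a set S can be computed exactly by all-pairs BFS: count the pairs outside S for which some shortest path passes through a node of S (counted here over ordered pairs; a sketch of the objective only, not the paper's greedy or sampling algorithms):

```python
from collections import deque

def bfs_dist(adj, s):
    # Unweighted single-source shortest-path distances.
    dist = {s: 0}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def coverage_centrality(adj, S):
    # Ordered pairs (u, v) outside S such that some shortest u-v path
    # passes through S, i.e. d(u, x) + d(x, v) = d(u, v) for some x in S.
    nodes = list(adj)
    dist = {u: bfs_dist(adj, u) for u in nodes}
    count = 0
    for u in nodes:
        for v in nodes:
            if u == v or u in S or v in S:
                continue
            if any(dist[u][x] + dist[x][v] == dist[u][v] for x in S):
                count += 1
    return count
```

On the path 0-1-2-3-4 with S = {2}, exactly the pairs that span the middle vertex are covered: 8 ordered pairs.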
Feb 14 2017
cs.CV arXiv:1702.03345v1
This paper introduces a Deep Scattering network that utilizes Dual-Tree complex wavelets to extract translation invariant representations from an input signal. The computationally efficient Dual-Tree wavelets decompose the input signal into densely spaced representations over scales. Translation invariance is introduced in the representations by applying a non-linearity over a region followed by averaging. The discriminatory information in the densely spaced, locally smooth signal representations aids the learning of the classifier. The proposed network is shown to outperform Mallat's ScatterNet in classification accuracy on four datasets with different modalities.
Feb 13 2017
cs.CV arXiv:1702.03267v1
We introduce a ScatterNet that uses a parametric log transformation with Dual-Tree complex wavelets to extract translation invariant representations from a multi-resolution image. The parametric transformation aids the OLS pruning algorithm by converting the skewed distributions into relatively mean-symmetric distributions while the Dual-Tree wavelets improve the computational efficiency of the network. The proposed network is shown to outperform Mallat's ScatterNet on two image datasets, both for classification accuracy and computational efficiency. The advantages of the proposed network over other supervised and some unsupervised methods are also presented using experiments performed on different training dataset sizes.
Although a majority of the theoretical literature in high-dimensional statistics has focused on settings which involve fully-observed data, settings with missing values and corruptions are common in practice. We consider the problems of estimation and of constructing component-wise confidence intervals in a sparse high-dimensional linear regression model when some covariates of the design matrix are missing completely at random. We analyze a variant of the Dantzig selector [9] for estimating the regression model and we use a de-biasing argument to construct component-wise confidence intervals. Our first main result is to establish upper bounds on the estimation error as a function of the model parameters (the sparsity level s, the expected fraction of observed covariates $\rho_*$, and a measure of the signal strength $\|\beta^*\|_2$). We find that even in an idealized setting where the covariates are assumed to be missing completely at random, somewhat surprisingly and in contrast to the fully-observed setting, there is a dichotomy in the dependence on model parameters and much faster rates are obtained if the covariance matrix of the random design is known. To study this issue further, our second main contribution is to provide lower bounds on the estimation error showing that this discrepancy in rates is unavoidable in a minimax sense. We then consider the problem of high-dimensional inference in the presence of missing data. We construct and analyze confidence intervals using a de-biased estimator. In the presence of missing data, inference is complicated by the fact that the de-biasing matrix is correlated with the pilot estimator and this necessitates the design of a new estimator and a novel analysis. We also complement our mathematical study with extensive simulations on synthetic and semi-synthetic data that show the accuracy of our asymptotic predictions for finite sample sizes.
Real-world networks are constantly evolving, which makes the study of traffic dynamics in such growing networks a challenging task. The critical issue is to optimize the network structure so as to improve network capacity and avoid traffic congestion. We are interested in routing users along paths that are least congested while the network operates at optimal capacity. Network capacity may be improved either by optimizing the network topology or by enhancing the routing approach. In this context, we propose and design a model of time-varying data communication networks (TVCN) based on the dynamics of in-flowing links. A newly arrived node prefers to attach to the most influential node present in the network. In this paper, influence is termed "reputation" and is used for computing the overall congestion at any node. A user path with the lowest betweenness centrality and the highest reputation is preferred for routing. Kelly's optimization formulation for the rate allocation problem is used to obtain the optimal rates of distinct users at different time instants, and we find that the user path with the lowest betweenness centrality and highest reputation always gives the maximum rate at the stable point.
Jan 17 2017
cs.RO arXiv:1701.04350v1
This paper aims to implement Object-Oriented Markov Decision Processes (OO-MDPs) for goal planning and navigation of a robot in an indoor environment. We use the OO-MDP representation of the environment, which is a natural way of modeling the environment based on objects and their interactions. The paper aims to extend the well-known Taxi domain example, which has been tested on grid-world environments, to the robotics domain with larger state spaces. For the purpose of this project we have created a simulation of the environment and robot in ROS, with Gazebo and Rviz as visualization tools. The mobile robot uses a 2D LIDAR module to perform SLAM in the unknown environment. The goal of this project is to build an autonomous agent capable of performing planning and navigation in an indoor environment to deliver boxes (passengers in the Taxi domain) placed at random locations to a particular location (warehouse). The approach can be extended to a wide variety of mobile and manipulator robots.
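The planning core can be illustrated with generic value iteration on a toy corridor (a flat-MDP sketch with an invented reward; an OO-MDP would instead derive states and transitions from object attributes and relations):

```python
def value_iteration(states, actions, step, reward, gamma=0.95, tol=1e-9):
    # Standard value iteration with deterministic transitions.
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(reward(s, a) + gamma * V[step(s, a)] for a in actions)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V

# Toy corridor: move left/right on cells 0..4; entering cell 4 pays 1 once,
# and cell 4 is absorbing (a stand-in for delivering a box to the warehouse).
states = range(5)
step = lambda s, a: s if s == 4 else min(4, max(0, s + a))
reward = lambda s, a: 1.0 if s != 4 and min(4, max(0, s + a)) == 4 else 0.0
```

The values decay geometrically with distance to the goal: V[3] = 1 and V[0] = 0.95^3.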
Jan 09 2017
cs.RO arXiv:1701.01547v2
In many human-in-the-loop robotic applications such as robot-assisted surgery and remote teleoperation, predicting the intended motion of the human operator may be useful for successful implementation of shared control, guidance virtual fixtures, and predictive control. Developing computational models of human movements is a critical foundation for such motion prediction frameworks. With this motivation, we present a computational framework for modeling reaching movements in the presence of obstacles. We propose a stochastic optimal control framework that consists of probabilistic collision avoidance constraints and a cost function that trades off effort against end-state variance in the presence of signal-dependent noise. First, we present a series of reformulations to convert the original non-linear and non-convex optimal control problem into a parametric quadratic programming problem. We show that the parameters can be tuned to model various collision avoidance strategies, thereby capturing the quintessential variability associated with human motion. Then, we present a simulation study that demonstrates the complex interaction between avoidance strategies, control cost, and the probability of collision avoidance. The proposed framework can benefit a variety of applications that require teleoperation in cluttered spaces, including robot-assisted surgery. It can also be viewed as a new optimizer that produces smooth and probabilistically safe trajectories under signal-dependent noise.
User-given tags or labels are valuable resources for semantic understanding of visual media such as images and videos. Recently, a new type of labeling mechanism known as hash-tags has become increasingly popular on social media sites. In this paper, we study the problem of generating relevant and useful hash-tags for short video clips. Traditional data-driven approaches for tag enrichment and recommendation use direct visual similarity for label transfer and propagation. We instead learn a direct low-cost mapping from video to hash-tags using a two-step training process. We first employ a natural language processing (NLP) technique, skip-gram models with neural network training, to learn a low-dimensional vector representation of hash-tags (Tag2Vec) using a corpus of 10 million hash-tags. We then train an embedding function to map video features to the low-dimensional Tag2Vec space. We learn this embedding for 29 categories of short video clips with hash-tags. A query video without any tag information can then be directly mapped to the vector space of tags using the learned embedding, and relevant tags can be found by performing a simple nearest-neighbor retrieval in the Tag2Vec space. We validate the relevance of the tags suggested by our system qualitatively and quantitatively with a user study.
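Once a video is embedded into the Tag2Vec space, the retrieval step is plain nearest-neighbor search by cosine similarity. A minimal sketch with made-up 2-D tag vectors (real Tag2Vec embeddings are higher-dimensional and learned from data):

```python
def cosine(u, v):
    # Cosine similarity between two dense vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv)

def nearest_tags(video_vec, tag_vecs, k=2):
    # Rank all tags by cosine similarity to the embedded video and return
    # the top-k as recommendations.
    ranked = sorted(tag_vecs,
                    key=lambda t: cosine(video_vec, tag_vecs[t]),
                    reverse=True)
    return ranked[:k]
```

A video embedded near the "skate"/"surf" region of the space retrieves those tags ahead of unrelated ones.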