Training deep neural networks on images represented as grids of pixels has brought to light an interesting phenomenon known as adversarial examples. Inspired by how humans reconstruct abstract concepts, we attempt to codify the input bitmap image into a set of compact, interpretable elements to avoid being fooled by the adversarial structures. We take the first step in this direction by experimenting with image vectorization as an input transformation step to map the adversarial examples back into the natural manifold of MNIST handwritten digits. We compare our method vs. state-of-the-art input transformations and further discuss the trade-offs between a hand-designed and a learned transformation defense.
Apr 05 2018 cs.SI
Content polluters, or bots that hijack a conversation for political or advertising purposes, are a known problem for event prediction, election forecasting, and distinguishing real news from fake news in social media data. Identifying this type of bot is particularly challenging, with state-of-the-art methods utilising large volumes of network data as features for machine learning models. Such datasets are generally not readily available in typical applications which stream social media data for real-time event prediction. In this work we develop a methodology to detect content polluters in social media datasets that are streamed in real-time. Applying our method to the problem of civil unrest event prediction in Australia, we identify content polluters from individual tweets, without collecting social network or historical data from individual accounts. We identify some peculiar characteristics of these bots in our dataset and propose metrics for identification of such accounts. We then pose some research questions around this type of bot detection, including how good Twitter is at detecting content polluters and how well state-of-the-art methods perform in detecting bots in our dataset.
Mar 19 2018 cs.CV
We address the problem of jointly learning vision and language to understand the object in a fine-grained manner. The key idea of our approach is the use of object descriptions to provide the detailed understanding of an object. Based on this idea, we propose two new architectures to solve two related problems: object captioning and natural language-based object retrieval. The goal of the object captioning task is to simultaneously detect the object and generate its associated description, while in the object retrieval task, the goal is to localize an object given an input query. We demonstrate that both problems can be solved effectively using hybrid end-to-end CNN-LSTM networks. The experimental results on our new challenging dataset show that our methods outperform recent methods by a fair margin, while providing a detailed understanding of the object and having fast inference time. The source code will be made available.
Mar 12 2018 cs.NE
Biological evolution provides a creative fount of complex and subtle adaptations, often surprising the scientists who discover them. However, because evolution is an algorithmic process that transcends the substrate in which it occurs, evolution's creativity is not limited to nature. Indeed, many researchers in the field of digital evolution have observed their evolving algorithms and organisms subverting their intentions, exposing unrecognized bugs in their code, producing unexpected adaptations, or exhibiting outcomes uncannily convergent with ones in nature. Such stories routinely reveal creativity by evolution in these digital worlds, but they rarely fit into the standard scientific narrative. Instead they are often treated as mere obstacles to be overcome, rather than results that warrant study in their own right. The stories themselves are traded among researchers through oral tradition, but that mode of information transmission is inefficient and prone to error and outright loss. Moreover, the fact that these stories tend to be shared only among practitioners means that many natural scientists do not realize how interesting and lifelike digital organisms are and how natural their evolution can be. To our knowledge, no collection of such anecdotes has been published before. This paper is the crowd-sourced product of researchers in the fields of artificial life and evolutionary computation who have provided first-hand accounts of such cases. It thus serves as a written, fact-checked collection of scientifically important and even entertaining stories. In doing so we also present here substantial evidence that the existence and importance of evolutionary surprises extends beyond the natural world, and may indeed be a universal property of all complex evolving systems.
Redundancy is a fundamental characteristic of many biological processes such as those in the genetic, visual, muscular and nervous systems; yet its function has not been fully understood. The conventional interpretation of redundancy is that it serves as a fault-tolerance mechanism, which leads to redundancy's de facto application in man-made systems for reliability enhancement. On the contrary, our previous works have demonstrated an example where redundancy can be engineered solely for enhancing other aspects of the system, namely accuracy and precision. This design was inspired by the binocular structure of human vision, which we believe may share a similar operation. In this paper, we present a unified theory describing how such utilization of redundancy is feasible through two complementary mechanisms: representational redundancy (RPR) and entangled redundancy (ETR). Besides the previous works, we point out two additional examples where our new understanding of redundancy can be applied to justify a system's superior performance. One is the human musculoskeletal system (HMS) - a biological instance, and one is the deep residual neural network (ResNet) - an artificial counterpart. We envision that our theory would provide a framework for the future development of bio-inspired redundant artificial systems as well as assist the studies of the fundamental mechanisms governing various biological processes.
Sensing is the process of deriving signals from the environment that allows artificial systems to interact with the physical world. The Shannon theorem specifies the maximum rate at which information can be acquired. However, this upper bound is hard to achieve in many man-made systems. Biological visual systems, on the other hand, have highly efficient signal representation and processing mechanisms that allow precise sensing. In this work, we argue that redundancy is one of the critical characteristics for such superior performance. We show architectural advantages of utilizing redundant sensing, including correction of mismatch error and significant precision enhancement. For a proof-of-concept demonstration, we have designed a heuristic-based analog-to-digital converter - a zero-dimensional quantizer. Monte Carlo simulation with the error probability distribution known a priori shows that performance approaching the Shannon limit is feasible. In actual measurements without knowing the error distribution, we observe at least 2-bit extra precision. The results may also help explain biological processes including the dominance of binocular vision, the functional roles of the fixational eye movements, and the structural mechanisms allowing hyperacuity.
We present MILABOT: a deep reinforcement learning chatbot developed by the Montreal Institute for Learning Algorithms (MILA) for the Amazon Alexa Prize competition. MILABOT is capable of conversing with humans on popular small talk topics through both speech and text. The system consists of an ensemble of natural language generation and retrieval models, including neural network and template-based models. By applying reinforcement learning to crowdsourced data and real-world user interactions, the system has been trained to select an appropriate response from the models in its ensemble. The system has been evaluated through A/B testing with real-world users, where it performed significantly better than other systems. The results highlight the potential of coupling ensemble systems with deep reinforcement learning as a fruitful path for developing real-world, open-domain conversational agents.
Deep Learning can significantly benefit cancer proteomics and genomics. In this study, we attempt to determine a set of critical proteins that are associated with the FLT3-ITD mutation in newly-diagnosed acute myeloid leukemia patients. A Deep Learning network consisting of autoencoders forms a hierarchical model from which high-level features are extracted without labeled training data. Dimensionality reduction reduced the number of critical proteins from 231 to 20. Deep Learning found an excellent correlation between the FLT3-ITD mutation and the levels of these 20 critical proteins (accuracy 97%, sensitivity 90%, specificity 100%). Our Deep Learning network could home in on the 20 proteins with the strongest association with FLT3-ITD. The results of this study allow a novel approach to determine critical protein pathways in the FLT3-ITD mutation, and provide proof-of-concept for an accurate approach to model big data in cancer proteomics and genomics.
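The three figures quoted above (accuracy, sensitivity, specificity) follow from the counts of a binary confusion matrix. The sketch below shows the standard definitions; the function name `binary_metrics` and the counts are hypothetical and chosen for illustration only, not taken from the study.

```python
def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Return accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,    # fraction of all cases classified correctly
        "sensitivity": tp / (tp + fn),    # true-positive rate (mutation detected)
        "specificity": tn / (tn + fp),    # true-negative rate (no false alarm)
    }


# Hypothetical counts for illustration only; not data from the study.
metrics = binary_metrics(tp=8, fn=2, tn=9, fp=1)
```

With these made-up counts, 8 of 10 positive cases and 9 of 10 negative cases are classified correctly.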
In this paper we propose Spatial PixelCNN, a conditional autoregressive model that generates images from small patches. By conditioning on a grid of pixel coordinates and global features extracted from a Variational Autoencoder (VAE), we are able to train on patches of images, and reproduce the full-sized image. We show that it not only allows for generating high quality samples at the same resolution as the underlying dataset, but is also capable of upscaling images to arbitrary resolutions (tested at resolutions up to $50\times$) on the MNIST dataset. Compared to a PixelCNN++ baseline, Spatial PixelCNN quantitatively and qualitatively achieves similar performance on the MNIST dataset.
We present a new method to translate videos to commands for robotic manipulation using Deep Recurrent Neural Networks (RNN). Our framework first extracts deep features from the input video frames with a deep Convolutional Neural Network (CNN). Two RNN layers with an encoder-decoder architecture are then used to encode the visual features and sequentially generate the output words as the command. We demonstrate that the translation accuracy can be improved by allowing a smooth transition between the two RNN layers and using a state-of-the-art feature extractor. The experimental results on our new challenging dataset show that our approach outperforms recent methods by a fair margin. Furthermore, we combine the proposed translation module with the vision and planning system to let a robot perform various manipulation tasks. Finally, we demonstrate the effectiveness of our framework on a full-size humanoid robot WALK-MAN.
We propose AffordanceNet, a new deep learning approach to simultaneously detect multiple objects and their affordances from RGB images. Our AffordanceNet has two branches: an object detection branch to localize and classify the object, and an affordance detection branch to assign each pixel in the object to its most probable affordance label. The proposed framework employs three key components for effectively handling the multiclass problem in the affordance mask: a sequence of deconvolutional layers, a robust resizing strategy, and a multi-task loss function. The experimental results on the public datasets show that our AffordanceNet outperforms recent state-of-the-art methods by a fair margin, while its end-to-end architecture allows inference at a speed of 150 ms per image. This makes our AffordanceNet well suited for real-time robotic applications. Furthermore, we demonstrate the effectiveness of AffordanceNet in different testing environments and in real robotic applications. The source code is available at https://github.com/nqanh/affordance-net
We present MILABOT: a deep reinforcement learning chatbot developed by the Montreal Institute for Learning Algorithms (MILA) for the Amazon Alexa Prize competition. MILABOT is capable of conversing with humans on popular small talk topics through both speech and text. The system consists of an ensemble of natural language generation and retrieval models, including template-based models, bag-of-words models, sequence-to-sequence neural network and latent variable neural network models. By applying reinforcement learning to crowdsourced data and real-world user interactions, the system has been trained to select an appropriate response from the models in its ensemble. The system has been evaluated through A/B testing with real-world users, where it performed significantly better than many competing systems. Due to its machine learning architecture, the system is likely to improve with additional data.
We present a new method to relocalize the 6DOF pose of an event camera solely based on the event stream. Our method first creates the event image from a list of events that occur in a very short time interval, then a Stacked Spatial LSTM Network (SP-LSTM) is used to learn the camera pose. Our SP-LSTM is composed of a CNN to learn deep features from the event images and a stack of LSTMs to learn spatial dependencies in the image feature space. We show that the spatial dependency plays an important role in the relocalization task and the SP-LSTM can effectively learn this information. The experimental results on a publicly available dataset show that our approach generalizes well and outperforms recent methods by a substantial margin. Overall, our proposed method reduces the position error by approximately 6 times and the orientation error by 3 times compared to the current state of the art. The source code and trained models will be released.
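The first step described above, turning a short burst of events into an image, can be sketched as follows. This is a minimal illustration under stated assumptions: the event tuple layout `(timestamp, x, y, polarity)`, the binary pixel encoding, and the name `events_to_image` are hypothetical and may differ from the encoding used in the paper.

```python
from typing import List, Tuple

# Assumed event layout: (timestamp in seconds, x, y, polarity).
Event = Tuple[float, int, int, int]


def events_to_image(events: List[Event], t_start: float, t_end: float,
                    width: int, height: int) -> List[List[int]]:
    """Mark every pixel that fired at least one event in [t_start, t_end)."""
    image = [[0] * width for _ in range(height)]
    for t, x, y, _polarity in events:
        # Keep only events inside the time window and inside the sensor area.
        if t_start <= t < t_end and 0 <= x < width and 0 <= y < height:
            image[y][x] = 1
    return image


events = [(0.001, 1, 0, 1), (0.002, 2, 1, -1), (0.050, 0, 0, 1)]
img = events_to_image(events, t_start=0.0, t_end=0.010, width=3, height=2)
```

The third event falls outside the 10 ms window, so only two pixels are set. An image built this way could then be passed to a CNN like any other single-channel frame.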
Jul 10 2017 cs.CV
Seizure prediction has attracted growing attention as one of the most challenging predictive data analysis efforts aimed at improving the life of patients living with drug-resistant epilepsy and tonic seizures. Many outstanding works have reported great results in providing sensible indirect (warning systems) or direct (interactive neural stimulation) control over refractory seizures, some of which achieved high performance. However, many works rely heavily on handcrafted feature extraction and/or feature engineering carefully tailored to each patient to achieve very high sensitivity and a low false prediction rate for a particular dataset. This limits the benefit of their approaches if a different dataset is used. In this paper we apply Convolutional Neural Networks (CNNs) to different intracranial and scalp electroencephalogram (EEG) datasets and propose a generalized retrospective and patient-specific seizure prediction method. We use the Short-Time Fourier Transform (STFT) on 30-second EEG windows with 50% overlap to extract information in both the frequency and time domains. A standardization step is then applied to the STFT components across the whole frequency range to prevent high-frequency features from being influenced by those at lower frequencies. A convolutional neural network model is used for both feature extraction and classification to separate preictal segments from interictal ones. The proposed approach achieves sensitivity of 81.4%, 81.2%, 82.3% and false prediction rate (FPR) of 0.06/h, 0.16/h, 0.22/h on the Freiburg Hospital intracranial EEG (iEEG) dataset, the Children's Hospital of Boston-MIT scalp EEG (sEEG) dataset, and the Kaggle American Epilepsy Society Seizure Prediction Challenge dataset, respectively. Our prediction method is also statistically better than an unspecific random predictor for most patients in all three datasets.
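The windowing step can be illustrated with a short sketch that computes the sample ranges of 30-second windows. Reading "50% overlap" as consecutive windows sharing half their samples is an assumption, as are the function name `window_bounds` and the 256 Hz sampling rate used in the example. The STFT itself is not shown.

```python
from typing import List, Tuple


def window_bounds(n_samples: int, fs: float, window_s: float = 30.0,
                  overlap: float = 0.5) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of overlapping fixed-length windows."""
    win = int(round(window_s * fs))            # samples per window
    step = int(round(win * (1.0 - overlap)))   # hop between window starts
    if win <= 0 or step <= 0:
        raise ValueError("window length and step must be positive")
    # Only full windows are kept; a trailing partial window is dropped.
    return [(s, s + win) for s in range(0, n_samples - win + 1, step)]


# Hypothetical recording: 90 seconds sampled at 256 Hz.
bounds = window_bounds(n_samples=90 * 256, fs=256)
```

For this 90-second example the windows start every 15 seconds, giving five full windows. Each range would then be transformed with an STFT and standardized per frequency bin before being passed to the network.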
Jun 02 2017 cs.HC
Three-dimensional (3D) applications have come to every corner of life. We present 3DTouch, a novel 3D wearable input device worn on the fingertip for interacting with 3D applications. 3DTouch is self-contained, and designed to work universally on various 3D platforms. The device employs touch input for the benefits of passive haptic feedback and movement stability. Moreover, with touch interaction, 3DTouch is conceptually less fatiguing to use over many hours than 3D spatial input devices such as Kinect. Our approach relies on a relative positioning technique using an optical laser sensor and a 9-DOF inertial measurement unit. We implemented a set of 3D interaction techniques including selection, translation, and rotation using 3DTouch. An evaluation also demonstrates the device's tracking accuracy of 1.10 mm and 2.33 degrees for subtle touch interaction in 3D space. With the 3DTouch project, we would like to provide an input device that reduces the gap between 3D applications and users.
Having accurate, detailed, and up-to-date information about the location and behavior of animals in the wild would revolutionize our ability to study and conserve ecosystems. We investigate the ability to automatically, accurately, and inexpensively collect such data, which could transform many fields of biology, ecology, and zoology into "big data" sciences. Motion sensor "camera traps" enable collecting wildlife pictures inexpensively, unobtrusively, and frequently. However, extracting information from these pictures remains an expensive, time-consuming, manual task. We demonstrate that such information can be automatically extracted by deep learning, a cutting-edge type of artificial intelligence. We train deep convolutional neural networks to identify, count, and describe the behaviors of 48 species in the 3.2-million-image Snapshot Serengeti dataset. Our deep neural networks automatically identify animals with over 93.8% accuracy, and we expect that number to improve rapidly in years to come. More importantly, if our system classifies only images it is confident about, our system can automate animal identification for 99.3% of the data while still performing at the same 96.6% accuracy as that of crowdsourced teams of human volunteers, saving more than 8.4 years (at 40 hours per week) of human labeling effort (i.e. over 17,000 hours) on this 3.2-million-image dataset. Those efficiency gains immediately highlight the importance of using deep neural networks to automate data extraction from camera-trap images. Our results suggest that this technology could enable the inexpensive, unobtrusive, high-volume, and even real-time collection of a wealth of information about vast numbers of animals in the wild.
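The idea of classifying only the images the network is confident about can be sketched as a simple threshold on the top predicted probability. The function name `selective_accuracy`, the threshold value, and the toy predictions below are hypothetical and are not results from the Snapshot Serengeti experiments.

```python
from typing import List, Tuple


def selective_accuracy(predictions: List[Tuple[float, bool]],
                       threshold: float) -> Tuple[float, float]:
    """Return (coverage, accuracy) when only confident predictions are kept.

    Each prediction is (confidence of the top class, whether it was correct).
    """
    kept = [correct for confidence, correct in predictions
            if confidence >= threshold]
    coverage = len(kept) / len(predictions)                  # share handled automatically
    accuracy = sum(kept) / len(kept) if kept else float("nan")
    return coverage, accuracy


# Toy predictions for illustration only.
preds = [(0.99, True), (0.95, True), (0.90, True), (0.60, False), (0.55, True)]
coverage, accuracy = selective_accuracy(preds, threshold=0.9)
```

Raising the threshold trades coverage for accuracy: fewer images are labeled automatically, and the rest would be left to human volunteers.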
Dec 30 2016 cs.SI
This paper studies marijuana-related tweets on the social network Twitter. We collected more than 300,000 marijuana-related tweets during November 2016 in our study. Our text-mining based algorithms and data analysis unveil some interesting patterns including: (i) users' attitudes (e.g., positive or negative) can be characterized by the existence of outer links in a tweet; (ii) 67% of users use their mobile phones to post their messages while many users publish their messages using third-party automatic posting services; and (iii) the number of tweets during weekends is much higher than during weekdays. Our data also showed the impact of political events such as the U.S. presidential election or state marijuana legalization votes on marijuana-related tweeting frequencies.
Dec 30 2016 cs.NI
Reliably broadcasting data to multiple receivers over lossy wireless channels is challenging due to the heterogeneity of the wireless link conditions. Automatic Repeat-reQuest (ARQ) based retransmission schemes are bandwidth inefficient due to data duplication at receivers. Network coding (NC) has been shown to be a promising technique for improving network bandwidth efficiency by combining multiple lost data packets for retransmission. However, it is challenging to accurately determine which lost packets should be combined together due to disrupted feedback channels. This paper proposes an adaptive data encoding scheme at the transmitter by joining network coding and machine learning (NCML) for retransmission of lost packets. Our proposed NCML extracts the important features from historical feedback signals received by the transmitter to train a classifier. The constructed classifier is then used to predict states of transmitted data packets at different receivers based on their corrupted feedback signals for effective data mixing. We have conducted extensive simulations to corroborate the efficiency of our proposed approach. The simulation results show that our machine learning algorithm can be trained efficiently and accurately, and that on average the proposed NCML can correctly classify 90% of the states of transmitted data packets at different receivers. It achieves significant bandwidth gain compared with the ARQ and NC based schemes in different transmission terrains, power levels, and distances between the transmitter and receivers.
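The bandwidth gain of network coding over plain retransmission comes from combining lost packets so that one broadcast repairs several receivers. The sketch below shows the textbook XOR case with two receivers that each lost a different packet. It does not model the classifier or the corrupted feedback channel, and the names `xor_bytes`, `p1` and `p2` are hypothetical.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length packets."""
    if len(a) != len(b):
        raise ValueError("packets must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


# Receiver R1 lost p1 but holds p2; receiver R2 lost p2 but holds p1.
p1, p2 = b"DATA-ONE", b"DATA-TWO"

# One coded retransmission replaces two separate ARQ retransmissions.
coded = xor_bytes(p1, p2)

recovered_at_r1 = xor_bytes(coded, p2)  # R1 cancels the packet it already has
recovered_at_r2 = xor_bytes(coded, p1)  # R2 does the same with p1
```

The combination only helps if the transmitter knows which receiver holds which packet, which is why predicting packet states from unreliable feedback matters.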
Recent studies have shown that information mined from Craigslist can be used for informing public health policy or monitoring risk behavior. This paper presents a text-mining method for conducting public health surveillance of marijuana use concerns in the U.S. using online classified ads in Craigslist. We collected more than 200 thousand rental ads in the housing categories in Craigslist and devised text-mining methods for efficiently and accurately extracting rental ads associated with concerns about the use of marijuana in different states across the U.S. We linked the extracted ads to their geographic locations and computed summary statistics of the ads having marijuana use concerns. Our data is then compared with the State Marijuana Laws Map published by the U.S. government and with marijuana-related keyword searches on Google to verify our collected data with respect to the demographics of marijuana use concerns. Our data not only indicates strong correlations between Craigslist ads, Google search and the State Marijuana Laws Map in states where marijuana use is legal, but also reveals a hidden world of marijuana use concerns in other states where marijuana use is illegal. Our approach can be utilized as a marijuana surveillance tool for policy makers to develop public health policy and regulations.
Dec 02 2016 cs.CV
Generating high-resolution, photo-realistic images has been a long-standing goal in machine learning. Recently, Nguyen et al. (2016) showed one interesting way to synthesize novel images by performing gradient ascent in the latent space of a generator network to maximize the activations of one or multiple neurons in a separate classifier network. In this paper we extend this method by introducing an additional prior on the latent code, improving both sample quality and sample diversity, leading to a state-of-the-art generative model that produces high quality images at higher resolutions (227x227) than previous generative models, and does so for all 1000 ImageNet categories. In addition, we provide a unified probabilistic interpretation of related activation maximization methods and call the general class of models "Plug and Play Generative Networks". PPGNs are composed of 1) a generator network G that is capable of drawing a wide range of image types and 2) a replaceable "condition" network C that tells the generator what to draw. We demonstrate the generation of images conditioned on a class (when C is an ImageNet or MIT Places classification network) and also conditioned on a caption (when C is an image captioning network). Our method also improves the state of the art of Multifaceted Feature Visualization, which generates the set of synthetic inputs that activate a neuron in order to better understand how deep neural networks operate. Finally, we show that our model performs reasonably well at the task of image inpainting. While image models are used in this paper, the approach is modality-agnostic and can be applied to many types of data.
Nov 23 2016 cs.RO
While autonomous multirotor micro aerial vehicles (MAVs) are uniquely well suited for certain types of missions benefiting from stationary flight capabilities, their more widespread usage still faces many hurdles, due in particular to their limited range and the difficulty of fully automating their deployment and retrieval. In this paper we address these issues by solving the problem of the automated landing of a quadcopter on a ground vehicle moving at relatively high speed. We present our system architecture, including the structure of our Kalman filter for the estimation of the relative position and velocity between the quadcopter and the landing pad, as well as our controller design for the full rendezvous and landing maneuvers. The system is experimentally validated by successfully landing in multiple trials a commercial quadcopter on the roof of a car moving at speeds of up to 50 km/h.
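A Kalman filter for relative position and velocity can be illustrated in one dimension. This is a toy sketch, not the filter structure used in the paper: the constant-velocity motion model, the position-only measurement, the noise parameters `q` and `r`, and the name `kalman_cv_1d` are all assumptions made for the example.

```python
from typing import Iterable, Tuple


def kalman_cv_1d(measurements: Iterable[float], dt: float,
                 q: float = 1e-3, r: float = 1e-2) -> Tuple[float, float]:
    """Estimate 1-D relative position and velocity from position measurements."""
    p, v = 0.0, 0.0                       # state estimate [position, velocity]
    p00, p01, p11 = 100.0, 0.0, 100.0     # symmetric covariance [[p00, p01], [p01, p11]]
    for z in measurements:
        # Predict with F = [[1, dt], [0, 1]] and process noise Q = diag(q, q).
        p = p + dt * v
        p00 = p00 + 2.0 * dt * p01 + dt * dt * p11 + q
        p01 = p01 + dt * p11
        p11 = p11 + q
        # Update with H = [1, 0]: only the relative position is measured.
        s = p00 + r                       # innovation variance
        k0, k1 = p00 / s, p01 / s         # Kalman gain
        innovation = z - p
        p += k0 * innovation
        v += k1 * innovation
        # Covariance update P = (I - K H) P.
        p00, p01, p11 = (1.0 - k0) * p00, (1.0 - k0) * p01, p11 - k1 * p01
    return p, v


# Noise-free toy data: the pad moves away at 2 m/s, sampled every 0.1 s.
dt = 0.1
zs = [2.0 * dt * k for k in range(1, 51)]
pos_est, vel_est = kalman_cv_1d(zs, dt)
```

Although only position is measured, the filter recovers the relative velocity through the motion model, which is the quantity a rendezvous controller would need.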
Nov 22 2016 cs.IR
A recent "third wave" of Neural Network (NN) approaches now delivers state-of-the-art performance in many machine learning tasks, spanning speech recognition, computer vision, and natural language processing. Because these modern NNs often comprise multiple interconnected layers, this new NN research is often referred to as deep learning. Stemming from this tide of NN work, a number of researchers have recently begun to investigate NN approaches to Information Retrieval (IR). While deep NNs have yet to achieve the same level of success in IR as seen in other areas, the recent surge of interest and work in NNs for IR suggests that this state of affairs may be quickly changing. In this work, we survey the current landscape of Neural IR research, paying special attention to the use of learned representations of queries and documents (i.e., neural embeddings). We highlight the successes of neural IR thus far, catalog obstacles to its wider adoption, and suggest potentially promising directions for future research.
Jul 12 2016 cs.CL
This study investigates the use of unsupervised word embeddings and sequence features for sample representation in an active learning framework built to extract clinical concepts from clinical free text. The objective is to further reduce the manual annotation effort while achieving higher effectiveness compared to a set of baseline features. Unsupervised features are derived from skip-gram word embeddings and a sequence representation approach. The comparative performance of unsupervised features and baseline hand-crafted features in an active learning framework is investigated using a wide range of selection criteria including least confidence, information diversity, information density and diversity, and domain knowledge informativeness. Two clinical datasets are used for evaluation: the i2b2/VA 2010 NLP challenge and the ShARe/CLEF 2013 eHealth Evaluation Lab. Our results demonstrate significant improvements in terms of effectiveness as well as annotation effort savings across both datasets. Using unsupervised features along with baseline features for sample representation leads to further savings of up to 9% and 10% of the token and concept annotation rates, respectively.
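Of the selection criteria listed above, least confidence is the simplest to illustrate: the samples whose most probable label has the lowest probability are sent for annotation first. The sketch below uses per-sample class distributions as a simplification of the sequence-labeling setting in the study. The name `least_confidence_batch` and the toy pool are hypothetical.

```python
from typing import List, Sequence


def least_confidence_batch(probabilities: Sequence[Sequence[float]],
                           batch_size: int) -> List[int]:
    """Return indices of the samples whose top predicted probability is lowest."""
    confidence = [max(p) for p in probabilities]
    # Sort sample indices from least to most confident.
    ranked = sorted(range(len(confidence)), key=lambda i: confidence[i])
    return ranked[:batch_size]


# Toy model outputs for four unlabeled samples (illustration only).
pool = [[0.9, 0.1], [0.55, 0.45], [0.7, 0.3], [0.51, 0.49]]
selected = least_confidence_batch(pool, batch_size=2)
```

Here the two most ambiguous samples (indices 3 and 1) are chosen, so annotation effort goes where the current model is least certain.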
Deep neural networks (DNNs) have demonstrated state-of-the-art results on many pattern recognition tasks, especially vision classification problems. Understanding the inner workings of such computational brains is both fascinating basic science that is interesting in its own right - similar to why we study the human brain - and will enable researchers to further improve DNNs. One path to understanding how a neural network functions internally is to study what each of its neurons has learned to detect. One such method is called activation maximization (AM), which synthesizes an input (e.g. an image) that highly activates a neuron. Here we dramatically improve the qualitative state of the art of activation maximization by harnessing a powerful, learned prior: a deep generator network (DGN). The algorithm (1) generates qualitatively state-of-the-art synthetic images that look almost real, (2) reveals the features learned by each neuron in an interpretable way, (3) generalizes well to new datasets and somewhat well to different network architectures without requiring the prior to be relearned, and (4) can be considered as a high-quality generative method (in this case, by generating novel, creative, interesting, recognizable images).
This paper presents a strategy to guide a mobile ground robot equipped with a camera or depth sensor, in order to autonomously map the visible part of a bounded three-dimensional structure. We describe motion planning algorithms that determine appropriate successive viewpoints and attempt to fill holes automatically in a point cloud produced by the sensing and perception layer. The emphasis is on accurately reconstructing a 3D model of a structure of moderate size rather than mapping large open environments, with applications for example in architecture, construction and inspection. The proposed algorithms do not require any initialization in the form of a mesh model or a bounding box, and the paths generated are well adapted to situations where the vision sensor is used simultaneously for mapping and for localizing the robot, in the absence of additional absolute positioning system. We analyze the coverage properties of our policy, and compare its performance to the classic frontier based exploration algorithm. We illustrate its efficacy for different structure sizes, levels of localization accuracy and range of the depth sensor, and validate our design on a real-world experiment.
We can better understand deep neural networks by identifying which features each of their neurons has learned to detect. To do so, researchers have created Deep Visualization techniques including activation maximization, which synthetically generates inputs (e.g. images) that maximally activate each neuron. A limitation of current techniques is that they assume each neuron detects only one type of feature, but we know that neurons can be multifaceted, in that they fire in response to many different types of features: for example, a grocery store class neuron must activate either for rows of produce or for a storefront. Previous activation maximization techniques constructed images without regard for the multiple different facets of a neuron, creating inappropriate mixes of colors, parts of objects, scales, orientations, etc. Here, we introduce an algorithm that explicitly uncovers the multiple facets of each neuron by producing a synthetic visualization of each of the types of images that activate a neuron. We also introduce regularization methods that produce state-of-the-art results in terms of the interpretability of images obtained by activation maximization. By separately synthesizing each type of image a neuron fires in response to, the visualizations have more appropriate colors and coherent global structure. Multifaceted feature visualization thus provides a clearer and more comprehensive description of the role of each neuron.
Recent years have produced great advances in training large, deep neural networks (DNNs), including notable successes in training convolutional neural networks (convnets) to recognize natural images. However, our understanding of how these models work, especially what computations they perform at intermediate layers, has lagged behind. Progress in the field will be further accelerated by the development of better tools for visualizing and interpreting neural nets. We introduce two such tools here. The first is a tool that visualizes the activations produced on each layer of a trained convnet as it processes an image or video (e.g. a live webcam stream). We have found that looking at live activations that change in response to user input helps build valuable intuitions about how convnets work. The second tool enables visualizing features at each layer of a DNN via regularized optimization in image space. Because previous versions of this idea produced less recognizable images, here we introduce several new regularization methods that combine to produce qualitatively clearer, more interpretable visualizations. Both tools are open source and work on a pre-trained convnet with minimal setup.
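Regularized optimization in input space can be shown on a toy problem: gradient ascent on a neuron's activation with a penalty that keeps the input from growing without bound. In this sketch a single linear unit stands in for a convnet neuron and L2 decay is used as an assumed example of a regularizer; the abstract does not name the regularizers it introduces. The function name `activation_maximization` and all parameter values are hypothetical.

```python
from typing import List


def activation_maximization(weights: List[float], steps: int = 200,
                            lr: float = 0.1, l2_decay: float = 0.5) -> List[float]:
    """Gradient ascent on a(x) = w.x - l2_decay * ||x||^2 for a linear 'neuron'."""
    x = [0.0] * len(weights)  # start from a blank input
    for _ in range(steps):
        # Gradient of the regularized objective: w - 2 * l2_decay * x.
        grad = [w - 2.0 * l2_decay * xi for w, xi in zip(weights, x)]
        x = [xi + lr * g for xi, g in zip(x, grad)]
    return x


w = [1.0, -2.0, 0.5]  # toy neuron weights (illustration only)
x_star = activation_maximization(w)
```

For this toy objective the optimum is `w / (2 * l2_decay)`, so with `l2_decay=0.5` the synthesized input converges to the weight vector itself. With a real network the gradient would come from backpropagation through the layers instead of a closed form.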
Providing feedback, both assessing final work and giving hints to stuck students, is difficult for open-ended assignments in massive online classes which can range from thousands to millions of students. We introduce a neural network method to encode programs as a linear mapping from an embedded precondition space to an embedded postcondition space and propose an algorithm for feedback at scale using these linear maps as features. We apply our algorithm to assessments from the Code.org Hour of Code and Stanford University's CS1 course, where we propagate human comments on student assignments to orders of magnitude more submissions.
Deep neural networks (DNNs) have recently been achieving state-of-the-art performance on a variety of pattern-recognition tasks, most notably visual classification problems. Given that DNNs are now able to classify objects in images with near-human-level performance, questions naturally arise as to what differences remain between computer and human vision. A recent study revealed that changing an image (e.g. of a lion) in a way imperceptible to humans can cause a DNN to label the image as something else entirely (e.g. mislabeling a lion as a library). Here we show a related result: it is easy to produce images that are completely unrecognizable to humans, but that state-of-the-art DNNs believe to be recognizable objects with 99.99% confidence (e.g. labeling with certainty that white noise static is a lion). Specifically, we take convolutional neural networks trained to perform well on either the ImageNet or MNIST datasets and then find images with evolutionary algorithms or gradient ascent that DNNs label with high confidence as belonging to each dataset class. It is possible to produce images totally unrecognizable to human eyes that DNNs believe with near certainty are familiar objects, which we call "fooling images" (more generally, fooling examples). Our results shed light on interesting differences between human vision and current DNNs, and raise questions about the generality of DNN computer vision.
Dec 01 2014 cs.NI
A reliable and scalable mechanism to provide protection against a link or node failure has additional requirements in the context of SDN and OpenFlow. Not only must it minimize the load on the controller, but it must also be able to react even when the controller is unreachable. In this paper we present a protection scheme, based on precomputed backup paths and inspired by MPLS crankback routing, that guarantees instantaneous recovery times and aims at zero packet loss after failure detection, regardless of controller reachability, even when OpenFlow's "fast-failover" feature cannot be used. The proposed mechanism is based on OpenState, an OpenFlow extension that allows a programmer to specify how forwarding rules should autonomously adapt in a stateful fashion, reducing the need to rely on remote controllers. We present the scheme as well as two different formulations for the computation of backup paths.
We present 3DTouch, a novel 3D wearable input device worn on the fingertip for 3D manipulation tasks. 3DTouch is designed to fill the need for a 3D input device that is self-contained, mobile, and works universally across various 3D platforms. This paper presents a low-cost solution to designing and implementing such a device. Our approach relies on a relative positioning technique using an optical laser sensor and a 9-DOF inertial measurement unit. The device employs touch input for the benefits of passive haptic feedback and movement stability. In addition, with touch interaction, 3DTouch is conceptually less fatiguing to use over many hours than 3D spatial input devices. We propose a set of 3D interaction techniques including selection, translation, and rotation using 3DTouch. An evaluation also demonstrates the device's tracking accuracy of 1.10 mm and 2.33 degrees for subtle touch interaction in 3D space. Modular solutions like 3DTouch open up a whole new design space for interaction techniques to build on.
With the evolution of mobile devices, and smartphones in particular, comes the ability to create new experiences that enhance the way we see, interact with, and manipulate objects within the world that surrounds us. Using Augmented Reality technology, it is now possible to blend data from our senses and our devices in numerous ways that simply were not possible before. In the near future, when all office devices as well as personal electronic gadgets are on a common wireless network, operating them with a universal remote controller will be possible. This paper presents an off-the-shelf, low-cost prototype that leverages Augmented Reality technology to deliver a novel and interactive way of operating the office network devices around us using a mobile device. We believe this type of system may offer benefits for controlling multiple integrated devices, visualizing interconnectivity, and using visual elements to pass information from one device to another. It may be especially beneficial for controlling devices when interacting with them physically is difficult or poses danger or harm.
Feb 05 2013 cs.CY
Online retailing (a model of B2C e-commerce) is growing worldwide, with companies in many countries showing increased sales and productivity as a result. It has great potential within the global economy. This paper looks at the current status of online retailing in Saudi Arabia, with particular focus on what inhibits or enables both customers and retailers. It also analyses the status of Government involvement and proposes a layered model, known as the Wheel of Online Retailing, which illustrates how Government intervention can benefit e-commerce in Saudi Arabia.
This paper looks at the present standing of ecommerce in Saudi Arabia, as well as the challenges and strengths of Business to Customers (B2C) electronic commerce. Many studies have been conducted around the world in order to gain a better understanding of the demands, needs and effectiveness of online commerce. A study was undertaken to review the literature identifying the factors influencing the adoption and diffusion of B2C e-commerce. It found four distinct categories: businesses, customers, environmental and governmental support, which must all be considered when creating an e-commerce infrastructure. A concept matrix was used to provide a comparison of important factors in different parts of the world. The study found that e-commerce in Saudi Arabia was lacking in Governmental support as well as relevant involvement by both customers and retailers.
Jan 05 2013 cs.CY
The IS-Impact Measurement Model, developed by Gable, Sedera and Chan in 2008, represents the to-date and expected stream of net profits from a given information system (IS), as perceived by all major user classes. Although this model has been stringently validated in previous studies, its generalizability and verified effectiveness are enhanced through this new application in e-learning. This paper focuses on the re-validation of the findings of the IS-Impact Model in two universities in the Kingdom of Saudi Arabia (KSA). Among the users of the two universities' e-learning systems, 528 students were recruited. A formative validation measurement with SmartPLS, a graphical structural equation modeling tool, was used to analyse the collected data. On the basis of the SmartPLS results, as well as with the aid of data-supported IS impact measurements and dimensions, we confirmed the validity of the IS-Impact Model for assessing the effect of e-learning systems in KSA universities. The newly constructed model is more understandable, and its use proved robust and applicable to various circumstances.
Nov 13 2012 cs.CY
This paper presents the preliminary findings of a study researching the diffusion and adoption of online retailing in Saudi Arabia. It reports new research that identifies and explores the key issues that positively and negatively influence the decision of Saudi customers to buy from online retailers in Saudi Arabia. Although Saudi Arabia has the largest and fastest growing ICT marketplace in the Arab region, e-commerce activities are not progressing at the same speed. While the overall research project involves exploratory research using mixed methods, the focus of this paper is on a quantitative analysis of responses obtained from a survey of Saudi customers, with the design of the questionnaire instrument being based on the findings of a qualitative analysis reported in a previous paper. The main findings of the current analysis include a list of key factors that affect Saudi customers' purchases from Saudi online retailers, and quantitative indications of the relative strengths of the various relationships.
Nov 13 2012 cs.CY
This paper presents some findings from a study researching the diffusion and adoption of online retailing in Saudi Arabia. Although the country has the largest and fastest growing ICT marketplace in the Arab region, e-commerce activities have not progressed at a similar speed. In general, Saudi retailers have not responded actively to the global growth of online retailing. Accordingly, new research has been conducted to identify and explore key issues that positively and negatively influence Saudi retailers in deciding whether to adopt the online channel. While the overall research project uses mixed methods, the focus of this paper is on a quantitative analysis of responses obtained from a survey of retailers in Saudi Arabia, with the design of the questionnaire instrument being based on the findings of a qualitative analysis reported in a previous paper. The main findings of the current analysis include a list of key factors that affect retailers' decision to adopt e-commerce, and quantitative indications of the relative strengths of the various relationships.
Aug 03 2012 cs.DS
We present a practical algorithm for the cyclic longest common subsequence (CLCS) problem that runs in O(mn) time, where m and n are the lengths of the two input strings. While this is not necessarily an asymptotic improvement over the existing record, it is far simpler to understand and to implement.
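To make the problem statement concrete, the sketch below computes the CLCS length by brute force: it runs the classic LCS dynamic program against every rotation of the shorter string. This is a naive baseline, not the O(mn) algorithm the abstract describes; the function names are hypothetical, and the code relies on the observation that rotating only one of the two strings is enough, because a cut in the other string can be compensated by moving the cut in the first.

```python
def lcs_length(a: str, b: str) -> int:
    """Classic LCS length in O(len(a) * len(b)) time, keeping two rows."""
    prev = [0] * (len(b) + 1)
    for ch in a:
        curr = [0]
        for j, bj in enumerate(b, start=1):
            if ch == bj:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def naive_clcs_length(a: str, b: str) -> int:
    """Brute-force cyclic LCS: best LCS over all rotations of the shorter string.

    Runs in O(min(m, n) * m * n) time, so it is only a reference baseline.
    """
    if len(a) > len(b):
        a, b = b, a  # rotate the shorter string to try fewer rotations
    best = 0
    for shift in range(max(1, len(a))):
        rotated = a[shift:] + a[:shift]
        best = max(best, lcs_length(rotated, b))
    return best
```

For example, `"abcd"` and `"cdab"` share only a length-2 ordinary common subsequence, but as cyclic strings they are identical, so the cyclic variant reports 4. A baseline like this is mainly useful for cross-checking a faster implementation on small inputs.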