# Top arXiv papers

• A low power wide area network (LPWAN) is a wireless telecommunication network designed to interconnect low-bitrate devices, with a focus on long range and power efficiency. In this paper, we study two recent technologies built from existing Long-Term Evolution (LTE) functionalities: enhanced machine type communications (eMTC) and narrowband internet of things (NB-IoT). These technologies are designed to coexist with existing LTE infrastructure, spectrum, and devices. We first briefly introduce both systems and then compare their performance in terms of energy consumption, latency, and scalability. We introduce a model for calculating the energy consumption, study the effect of clock drift, and propose a method to overcome it. We also propose a model for analytically evaluating the latency and the maximum number of devices in a network. Furthermore, we implement the main functionality of both technologies and simulate the end-to-end latency and maximum number of devices in the discrete-event network simulator NS-3. Numerical results show that an 8-year battery lifetime can be achieved by both technologies in a poor coverage scenario and that, depending on the coverage conditions and data length, one technology consumes less energy than the other. The results also show that eMTC can serve more devices in a network than NB-IoT, while providing a latency that is 10 times lower.
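The kind of energy-budget arithmetic behind such lifetime claims can be sketched as follows. All numbers here (battery capacity, current draws, report rate) are illustrative assumptions for a duty-cycled LPWAN node, not values from the paper:

```python
# Illustrative battery-lifetime estimate for a duty-cycled LPWAN node.
# All numbers are assumed placeholders, not the paper's measurements.

def battery_lifetime_years(capacity_mah, tx_ma, tx_s, sleep_ua, reports_per_day):
    """Convert an average daily charge draw into a lifetime, ignoring self-discharge."""
    seconds_per_day = 86400.0
    active_s = tx_s * reports_per_day          # seconds per day spent transmitting
    sleep_s = seconds_per_day - active_s       # remainder spent in deep sleep
    # Daily charge in mAh: current (mA) * time (hours)
    daily_mah = tx_ma * active_s / 3600.0 + (sleep_ua / 1000.0) * sleep_s / 3600.0
    return capacity_mah / daily_mah / 365.0

# 5000 mAh cell, 100 mA for 10 s per report, 5 uA sleep, 24 reports/day
print(round(battery_lifetime_years(5000, 100, 10, 5, 24), 1))
```

With these toy numbers the transmit bursts dominate the budget, which is why coverage conditions (longer transmissions in poor coverage) and payload length shift the eMTC/NB-IoT comparison.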
• In this paper, we first investigate why typical two-stage methods are not as fast as single-stage, fast detectors like YOLO and SSD. We find that Faster R-CNN and R-FCN perform intensive computation after or before RoI warping: Faster R-CNN involves two fully connected layers for RoI recognition, while R-FCN produces large score maps. The speed of these networks is thus slow due to the heavy-head design of the architecture. Even if we significantly reduce the base model, the computation cost cannot be substantially decreased. We propose a new two-stage detector, Light-Head R-CNN, to address this shortcoming of current two-stage approaches. In our design, we make the head of the network as light as possible, using a thin feature map and a cheap R-CNN subnet (pooling and a single fully connected layer). Our ResNet-101-based Light-Head R-CNN outperforms state-of-the-art object detectors on COCO while remaining time-efficient. More importantly, by simply replacing the backbone with a tiny network (e.g., Xception), our Light-Head R-CNN achieves 30.7 mmAP at 102 FPS on COCO, significantly outperforming single-stage, fast detectors like YOLO and SSD on both speed and accuracy. Code will be made publicly available.
• This note provides a counterexample to a proposition stated in [J. Differ. Equ. 261.4 (2016) 2528--2551] regarding the neighborhood of certain $4\times 4$ symplectic matrices.
• The performance of face detection has been largely improved with the development of convolutional neural networks. However, occlusion due to masks and sunglasses is still a challenging problem, and improving the recall of these occluded cases usually brings the risk of high false positives. In this paper, we present a novel face detector called Face Attention Network (FAN), which can significantly improve the recall of face detection in the occluded case without compromising speed. More specifically, we propose a new anchor-level attention, which highlights the features from the face region. Integrated with our anchor assignment strategy and data augmentation techniques, we obtain state-of-the-art results on public face detection benchmarks like WiderFace and MAFA.
• Telugu is a Dravidian language spoken by more than 80 million people worldwide. Optical character recognition (OCR) of the Telugu script has wide-ranging applications, including education, health care, and administration. The beautiful Telugu script, however, is very different from the scripts used for Germanic languages like English and German, which makes transfer learning from Germanic OCR solutions to Telugu a non-trivial task. To address the challenge of OCR for Telugu, we make three contributions in this work: (i) a database of Telugu characters, (ii) a deep learning based OCR algorithm, and (iii) a client-server solution for the online deployment of the algorithm. For the benefit of the Telugu people and the research community, we will make our code freely available at https://gayamtrishal.github.io/OCR_Telugu.github.io/
• (Nov 21 2017, cs.CV, arXiv:1711.07240v1) The improvements in recent CNN-based object detection works, from R-CNN [11] and Fast/Faster R-CNN [10, 29] to the recent Mask R-CNN [14] and RetinaNet [22], mainly come from new network, framework, or loss designs. But mini-batch size, a key factor in training, has not been well studied. In this paper, we propose a Large Mini-Batch Object Detector (MegDet) to enable training with a much larger mini-batch size than before (e.g., from 16 to 256), so that we can effectively utilize multiple GPUs (up to 128 in our experiments) to significantly shorten the training time. Technically, we suggest a learning rate policy and Cross-GPU Batch Normalization, which together allow us to successfully train a large mini-batch detector in much less time (e.g., from 33 hours to 4 hours) and achieve even better accuracy. MegDet is the backbone of our submission (mmAP 52.5%) to the COCO 2017 Challenge, where we won 1st place in the Detection task.
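A common way to pair a larger mini-batch with a learning-rate policy is linear scaling plus warmup. The sketch below illustrates that general recipe; the base rate and warmup length are assumed constants for illustration, not the schedule MegDet actually uses:

```python
# Linear-scaling + warmup learning-rate policy for large mini-batch training.
# base_lr and warmup_steps are assumed illustrative values, not the paper's.

def learning_rate(step, base_lr=0.02, base_batch=16, batch=256, warmup_steps=500):
    """Scale the base LR linearly with batch size; ramp up over warmup_steps."""
    target_lr = base_lr * batch / base_batch
    if step < warmup_steps:
        # Linear warmup from base_lr to the scaled target avoids early divergence
        return base_lr + (target_lr - base_lr) * step / warmup_steps
    return target_lr

print(learning_rate(0))      # starts at the small-batch base rate
print(learning_rate(1000))   # settles at the batch-scaled rate
```

The warmup phase matters because jumping straight to a 16x-scaled rate at step 0 tends to destabilize training before batch statistics settle.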
• Hyperspectral imaging holds enormous potential to improve the state-of-the-art in aerial vehicle tracking at low spatial and temporal resolutions. Recently, adaptive multi-modal hyperspectral sensors, controlled by the Dynamic Data Driven Applications Systems (DDDAS) methodology, have attracted growing interest due to their ability to record extended data quickly from aerial platforms. In this study, we apply popular concepts from traditional object tracking, (1) Kernelized Correlation Filters (KCF) and (2) deep convolutional neural network (CNN) features, to the hyperspectral aerial tracking domain. Specifically, we propose the Deep Hyperspectral Kernelized Correlation Filter based tracker (DeepHKCF) to efficiently track aerial vehicles using an adaptive multi-modal hyperspectral sensor. We address low temporal resolution by designing a single-KCF-in-multiple-regions-of-interest (ROIs) approach to cover a reasonably large area. To speed up the extraction of deep convolutional features from multiple ROIs, we design an effective ROI mapping strategy. The proposed tracker is also flexible enough to be coupled with more advanced correlation filter trackers. The DeepHKCF tracker performs exceptionally well with deep features on a synthetic hyperspectral video generated by the Digital Imaging and Remote Sensing Image Generation (DIRSIG) software. Additionally, we generate a large, synthetic, single-channel dataset using DIRSIG to perform vehicle classification in the Wide Area Motion Imagery (WAMI) platform. In this way, the high fidelity of the DIRSIG software is demonstrated, and a large-scale aerial vehicle classification dataset is released to support studies on vehicle detection and tracking in the WAMI platform.
• We consider the classical problem of control of linear systems with quadratic cost. When the true system dynamics are unknown, an adaptive policy is required for learning the model parameters and planning a control policy simultaneously. Addressing this trade-off between accurate estimation and good control represents the main challenge in the area of adaptive control. Another important issue is to prevent the system from becoming destabilized due to lack of knowledge of its dynamics. Asymptotically optimal approaches have been extensively studied in the literature, but the few existing non-asymptotic results do not provide a comprehensive treatment of the problem. In this work, we establish finite-time high-probability regret bounds that are optimal up to logarithmic factors. We also provide high-probability guarantees for a stabilization algorithm based on random linear feedbacks. The results are obtained under very mild assumptions, requiring (i) stabilizability of the matrices encoding the system's dynamics and (ii) a condition on the heaviness of the noise distribution. To derive our results, we also introduce a number of new concepts and technical tools.
• The amount of unstructured text-based data is growing every day. Querying, clustering, and classifying this big data requires similarity computations across large sets of documents. Whereas low-complexity similarity metrics are available, attention has been shifting towards more complex methods that achieve a higher accuracy. In particular, the Word Mover's Distance (WMD) method proposed by Kusner et al. is a promising new approach, but its time complexity grows cubically with the number of unique words in the documents. The Relaxed Word Mover's Distance (RWMD) method, again proposed by Kusner et al., reduces the time complexity from cubic to quadratic and results in a limited loss of accuracy compared with WMD. Our work contributes a low-complexity implementation of the RWMD that reduces the average time complexity to linear when operating on large sets of documents. Our linear-complexity RWMD implementation, henceforth referred to as LC-RWMD, maps well onto GPUs and can be efficiently distributed across a cluster of GPUs. Our experiments on real-life datasets demonstrate 1) a performance improvement of two orders of magnitude with respect to our GPU-based distributed implementation of the quadratic RWMD, and 2) a performance improvement of three to four orders of magnitude with respect to our distributed WMD implementation that uses GPU-based RWMD for pruning.
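For orientation, the quadratic RWMD that Kusner et al. define relaxes WMD by letting every word travel to its nearest neighbour in the other document. A minimal sketch (toy embeddings and weights, nothing from the paper's GPU implementation):

```python
import numpy as np

# Minimal Relaxed Word Mover's Distance (RWMD) sketch.
# weights_* are normalized bag-of-words weights; emb_* are word embeddings.
def rwmd_one_sided(weights_a, emb_a, emb_b):
    """sum_i w_i * min_j ||x_i - y_j||: each word moves to its nearest neighbour."""
    # Pairwise Euclidean distances between the two documents' vocabularies
    d = np.linalg.norm(emb_a[:, None, :] - emb_b[None, :, :], axis=2)
    return float(np.sum(weights_a * d.min(axis=1)))

def rwmd(wa, ea, wb, eb):
    # RWMD takes the tighter (larger) of the two one-sided relaxations,
    # giving a lower bound on the full WMD transport cost.
    return max(rwmd_one_sided(wa, ea, eb), rwmd_one_sided(wb, eb, ea))
```

The nearest-neighbour minimum in `d.min(axis=1)` is the step that LC-RWMD reorganizes: amortized over a large document collection, it can be computed in linear average time per document pair.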
• (Nov 21 2017, math.FA, arXiv:1711.07225v1) We study domination of quadratic forms in the abstract setting of ordered Hilbert spaces. Our main result gives a characterization in terms of the associated forms. This generalizes and unifies various earlier works. Along the way, we present several examples.
• Cloud vendors are increasingly offering machine learning services as part of their platform and services portfolios. These services enable the deployment of machine learning models on the cloud, offered on a pay-per-query basis to application developers and end users. However, recent work has shown that the hosted models are susceptible to extraction attacks: adversaries may launch queries to steal the model and compromise future query payments or the privacy of the training data. In this work, we present a cloud-based extraction monitor that can quantify the extraction status of models by observing the query and response streams of both individual and colluding adversarial users. We present a novel technique that uses information gain to measure the model learning rate of users with an increasing number of queries. Additionally, we present an alternate technique that maintains intelligent query summaries to measure the learning rate relative to the coverage of the input feature space in the presence of collusion. Both approaches have low computational overhead and can easily be offered as services to model owners to warn them of possible extraction attacks. We present performance results for these approaches for decision tree models deployed on the BigML MLaaS platform, using open-source datasets and different adversarial attack strategies.
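The information-gain quantity underlying the first monitoring technique is the standard entropy-reduction measure used to grow decision trees. A self-contained sketch (the class distributions are toy values, not the paper's experiments):

```python
import math

# Information gain = entropy of the prior label distribution minus the
# weighted entropy of the distributions after a split. Toy numbers only.
def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def information_gain(prior, posteriors_with_weights):
    """H(prior) - sum_k w_k * H(posterior_k)."""
    return entropy(prior) - sum(w * entropy(p) for w, p in posteriors_with_weights)

# Queries that separate the classes well extract a lot of information...
gain = information_gain([0.5, 0.5], [(0.5, [0.9, 0.1]), (0.5, [0.1, 0.9])])
print(round(gain, 3))
# ...while uninformative queries extract none.
print(information_gain([0.5, 0.5], [(1.0, [0.5, 0.5])]))
```

In the monitoring setting, a rising cumulative gain across a user's query stream signals that the user is rapidly learning the hosted model's decision boundaries.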
• The AN.ON-Next project aims to integrate privacy-enhancing technologies into the internet's infrastructure and establish them in the consumer mass market. The technologies in focus include basic protection at the internet service provider level, improved overlay network-based protection, and a concept for privacy protection in the emerging 5G mobile network. A crucial success factor will be the viable adjustment and development of standards, business models, and pricing strategies for these new technologies.
• Central Compact Objects (CCOs) are a handful of sources located close to the geometrical center of young supernova remnants. They show only thermal-like, soft X-ray emission and have no counterparts at any other wavelength. While the first observed CCO turned out to be a very peculiar magnetar, the discovery that three members of the family are weakly magnetised isolated neutron stars (INSs) set the basis for an interpretation of the class. However, the phenomenology of CCOs and their relationship with other classes of INSs, possibly ruled by supernova fall-back accretion, are still far from well understood.
• We introduce a novel data-driven approach to discover and decode features in the neural code coming from large population neural recordings with minimal assumptions, using cohomological learning. We apply our approach to neural recordings of mice moving freely in a box, where we find a circular feature. We then observe that the decoded value corresponds well to the head direction of the mouse. Thus we capture head direction cells and decode the head direction from the neural population activity without having to process the behaviour of the mouse. Interestingly, the decoded values convey more information about the neural activity than the tracked head direction does, with differences that have some spatial organization. Finally, we note that the residual population activity, after the head direction has been accounted for, retains some low-dimensional structure but with no discernible shape.
• All existing image steganography methods use manually crafted features to hide binary payloads in cover images. This leads to small payload capacity and image distortion. Here we propose a convolutional neural network based encoder-decoder architecture for embedding images as payload. To this end, we make the following three major contributions: (i) we propose a deep learning based generic encoder-decoder architecture for image steganography; (ii) we introduce a new loss function that ensures joint end-to-end training of the encoder-decoder networks; (iii) we perform an extensive empirical evaluation of the proposed architecture on a range of challenging publicly available datasets (MNIST, CIFAR10, PASCAL-VOC12, ImageNet, LFW) and report state-of-the-art payload capacity at high PSNR and SSIM values.
• We present a stochastic first-order optimization algorithm, named BCSC, that adds a cyclic constraint to stochastic block-coordinate descent. It uses different subsets of the data to update different subsets of the parameters, thus limiting the detrimental effect of outliers in the training set. Empirical tests on benchmark datasets show that our algorithm outperforms state-of-the-art optimization methods in both accuracy and convergence speed. The improvements are consistent across different architectures and can be combined with other training techniques and regularization methods.
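The "different data subsets update different parameter subsets" idea can be sketched on a least-squares toy problem. The cyclic pairing rule below (shard `(b + t) % n_blocks` updates block `b`) is an assumption for illustration, not BCSC's exact scheme:

```python
import numpy as np

# Toy cyclic block-coordinate SGD: at epoch t, data shard (b + t) % n_blocks
# updates parameter block b, so each block cycles through all data shards.
# The pairing rule and least-squares objective are illustrative assumptions.
def bcsc_epoch(w, X, y, lr, n_blocks, t):
    param_blocks = np.array_split(np.arange(w.size), n_blocks)
    data_shards = np.array_split(np.arange(X.shape[0]), n_blocks)
    for b in range(n_blocks):
        shard = data_shards[(b + t) % n_blocks]   # cyclic data/parameter pairing
        idx = param_blocks[b]
        resid = X[shard] @ w - y[shard]
        # Gradient of 0.5 * ||X w - y||^2 restricted to parameter block idx
        w[idx] -= lr * X[shard][:, idx].T @ resid / len(shard)
    return w
```

Because no single data shard (and hence no small set of outliers) ever updates all parameters in one epoch, a corrupted shard's influence stays confined to one block at a time.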
• This paper surveys various precise (long-time) asymptotic results for the solutions of the Navier-Stokes equations with potential forces in bounded domains. It turns out that the asymptotic expansion leads, surprisingly, to a Poincaré-Dulac normal form of the Navier-Stokes equations. We also discuss some related results and a few open issues.
• (Nov 21 2017, cs.CV, arXiv:1711.07183v1) Generating adversarial examples is an intriguing problem and an important way of understanding the working mechanism of deep neural networks. Recently, it has attracted a lot of attention in the computer vision community. Most existing approaches generate perturbations in image space, i.e., each pixel can be modified independently. However, it remains unclear whether these adversarial examples are authentic, in the sense that they correspond to actual changes in physical properties. This paper explores this topic in the contexts of object classification and visual question answering. The baselines are several state-of-the-art deep neural networks that receive 2D input images. We augment these networks with a differentiable 3D rendering layer in front, so that a 3D scene (in physical space) is rendered into a 2D image (in image space) and then mapped to a prediction (in output space). There are two ways, direct and indirect, of attacking the physical parameters. The former back-propagates the gradients of error signals from output space to physical space directly, while the latter first constructs an adversary in image space and then attempts to find the best solution in physical space that renders into this image. An important finding is that attacking physical space is much more difficult: the direct method, compared with that used in image space, produces a much lower success rate and requires heavier perturbations. On the other hand, the indirect method does not work out, suggesting that adversaries generated in image space are inauthentic. By interpreting them in physical space, most of these adversaries can be filtered out, showing promise in defending against adversaries.
• We construct a bounded $C^{1}$ domain $\Omega$ in $R^{n}$ for which the $H^{3/2}$ regularity for the Dirichlet and Neumann problems for the Laplacian cannot be improved, that is, there exists $f$ in $C^{\infty}(\overline\Omega)$ such that the solution of $\Delta u=f$ in $\Omega$ and either $u=0$ on $\partial\Omega$ or $\partial_{n} u=0$ on $\partial\Omega$ is contained in $H^{3/2}(\Omega)$ but not in $H^{3/2+\varepsilon}(\Omega)$ for any $\varepsilon>0$. An analogous result holds for $L^{p}$ Sobolev spaces with $p\in(1,\infty)$.
• The success of deep learning in computer vision is mainly attributed to an abundance of data. However, collecting large-scale data is not always possible, especially for the supervised labels. Unsupervised domain adaptation (UDA) aims to utilize labeled data from a source domain to learn a model that generalizes to a target domain of unlabeled data. A large amount of existing work uses Siamese network-based models, where two streams of neural networks process the source and the target domain data respectively. Nevertheless, most of these approaches focus on minimizing the domain discrepancy, overlooking the importance of preserving the discriminative ability for target domain features. Another important problem in UDA research is how to evaluate the methods properly. Common evaluation procedures require target domain labels for hyper-parameter tuning and model selection, contradicting the definition of the UDA task. Hence we propose a more reasonable evaluation principle that avoids this contradiction by simply adopting the latest snapshot of a model for evaluation. This adds an extra requirement for UDA methods besides the main performance criteria: the stability during training. We design a novel method that connects the target domain stream to the source domain stream with a Parameter Reference Loss (PRL) to solve these problems simultaneously. Experiments on various datasets show that the proposed PRL not only improves the performance on the target domain, but also stabilizes the training procedure. As a result, PRL based models do not need the contradictory model selection, and thus are more suitable for practical applications.
• We propose a novel distributed inference algorithm for continuous graphical models by extending Stein variational gradient descent (SVGD) to leverage the Markov dependency structure of the distribution of interest. The idea is to use a set of local kernel functions over the Markov blanket of each node, which alleviates the problem of the curse of high dimensionality and simultaneously yields a distributed algorithm for decentralized inference tasks. We justify our method with theoretical analysis and show that the use of local kernels can be viewed as a new type of localized approximation that matches the target distribution on the conditional distributions of each node over its Markov blanket. Our empirical results demonstrate that our method outperforms a variety of baselines including standard MCMC and particle message passing methods.
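As a baseline for the localized variant described above, plain SVGD with a single global RBF kernel looks as follows. This sketch implements only the standard update of Liu and Wang; the paper's contribution, local kernels over each node's Markov blanket, is not implemented here, and the bandwidth is a fixed assumed value:

```python
import numpy as np

# Plain SVGD update with one global RBF kernel over all dimensions.
# The paper replaces this global kernel with local kernels on each node's
# Markov blanket; that variant is NOT implemented in this sketch.
def svgd_step(particles, grad_logp, h=1.0, stepsize=0.1):
    diffs = particles[:, None, :] - particles[None, :, :]      # x_i - x_j, shape (n, n, d)
    k = np.exp(-np.sum(diffs ** 2, axis=2) / (2 * h ** 2))     # RBF kernel matrix
    drift = k @ grad_logp(particles)                           # sum_j k(x_j, x_i) grad log p(x_j)
    repulse = np.sum(k[:, :, None] * diffs, axis=1) / h ** 2   # sum_j grad_{x_j} k(x_j, x_i)
    return particles + stepsize * (drift + repulse) / len(particles)
```

The drift term pulls particles toward high-density regions while the repulsion term keeps them spread out; the curse of dimensionality enters through the single global kernel `k`, which is exactly what the local-kernel construction is designed to avoid.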
• Neural program embeddings have shown much promise recently for a variety of program analysis tasks, including program synthesis, program repair, and fault localization. However, most existing program embeddings are based on syntactic features of programs, such as raw token sequences or abstract syntax trees. Unlike images and text, a program has an unambiguous semantic meaning that can be difficult to capture by considering only its syntax (i.e., syntactically similar programs can exhibit vastly different run-time behavior), which makes syntax-based program embeddings fundamentally limited. This paper proposes a novel semantic program embedding that is learned from program execution traces. Our key insight is that program states expressed as sequential tuples of live variable values not only capture program semantics more precisely, but also offer a more natural fit for recurrent neural networks to model. We evaluate different syntactic and semantic program embeddings on predicting the types of errors that students make in their submissions to an introductory programming class and two exercises on the CodeHunt education platform. Evaluation results show that our new semantic program embedding significantly outperforms syntactic program embeddings based on token sequences and abstract syntax trees. In addition, we augment a search-based program repair system with the predictions obtained from our semantic embedding and show that search efficiency is also significantly improved.
• Person re-identification aims at establishing the identity of a pedestrian from a gallery that contains images of multiple people obtained from a multi-camera system. Many factors, such as occlusions, drastic lighting and pose variations across camera views, indiscriminate visual appearances, cluttered backgrounds, imperfect detections, motion blur, and noise, make this task highly challenging. While most approaches focus on learning features and metrics to derive better representations, we hypothesize that both local and global contextual cues are crucial for accurate identity matching. To this end, we propose a Feature Mask Network (FMN) that takes advantage of ResNet high-level features to predict a feature map mask and then imposes it on the low-level features to dynamically reweight different object parts for a locally aware feature representation. This serves as an effective attention mechanism by allowing the network to focus on local details selectively. Given the resemblance of person re-identification to classification and retrieval tasks, we frame the network training as a multi-task objective optimization, which further improves the learned feature descriptions. We conduct experiments on the Market-1501, DukeMTMC-reID, and CUHK03 datasets, where the proposed approach respectively achieves significant improvements of $5.3\%$, $9.1\%$ and $10.7\%$ in the mAP measure relative to the state-of-the-art.
• Geometry theorem proving forms a major and challenging component of the K-12 mathematics curriculum. A particularly difficult task is adding auxiliary constructions (i.e., additional lines or points) to aid proof discovery. Although many intelligent tutoring systems have been proposed for geometry proofs, few teach students how to find auxiliary constructions, and the few exceptions are all limited by their underlying reasoning processes for supporting auxiliary constructions. This paper tackles these weaknesses of prior systems by introducing an interactive geometry tutor, the Advanced Geometry Proof Tutor (AGPT). It leverages a recent automated geometry prover to provide combined benefits that no geometry theorem prover or intelligent tutoring system alone can accomplish. In particular, AGPT not only automatically processes images of geometry problems directly, but also interactively trains and guides students toward discovering auxiliary constructions on their own. We have evaluated AGPT via a pilot study with 78 high school students. The study results show that, in training students to find auxiliary constructions, there is no significant perceived difference between AGPT and human tutors, and AGPT is significantly more effective than the state-of-the-art geometry solver that produces human-readable proofs.
• Boson sampling photonic networks can be used to obtain Heisenberg-limited measurements of optical phase gradients through quantum Fourier transform interferometry. Here, we use phase-space techniques and the complex P-distribution to simulate a $100$-qubit Fourier transform interferometer with additional random phases that model decoherence. This is far larger than is possible with conventional calculations of matrix permanents, the standard technique for such problems. Our results show that this quantum metrology technique is robust against phase decoherence, and one can also measure lower-order correlations without substantial degradation.
• This paper introduces the "Search, Align, and Repair" data-driven program repair framework to automate feedback generation for introductory programming exercises. Distinct from existing techniques, our goal is to develop an efficient, fully automated, and problem-agnostic technique for large or MOOC-scale introductory programming courses. We leverage the large amount of available student submissions in such settings and develop new algorithms for identifying similar programs, aligning correct and incorrect programs, and repairing incorrect programs by finding minimal fixes. We have implemented our technique in the SARFGEN system and evaluated it on thousands of real student attempts from the Microsoft-DEV204.1X edX course and the Microsoft CodeHunt platform. Our results show that SARFGEN can, within two seconds on average, generate concise, useful feedback for 89.7% of the incorrect student submissions. It has been integrated with the Microsoft-DEV204.1X edX class and deployed for production use.
• In this paper, we propose a spectral-spatial feature extraction and classification framework based on an artificial neural network (ANN) in the context of hyperspectral imagery. With limited labeled samples, only spectral information is exploited for training, and spatial context is integrated later, at the testing stage. Taking advantage of recent advances in face recognition, a joint supervision signal that combines softmax loss and center loss is adopted to train the proposed network, by which intra-class features are gathered while inter-class variations are enlarged. Based on the learned architecture, the extracted spectrum-based features are classified by a center classifier. Moreover, to fuse the spectral and spatial information, an adaptive spectral-spatial center classifier is developed, where multiscale neighborhoods are considered simultaneously and the final label is determined using an adaptive voting strategy. Finally, experimental results on three well-known datasets validate the effectiveness of the proposed methods compared with state-of-the-art approaches.
• Imaging objects that are obscured by scattering and occlusion is an important challenge for many applications. For example, navigation and mapping capabilities of autonomous vehicles could be improved, vision in harsh weather conditions or under water could be facilitated, or search and rescue scenarios could become more effective. Unfortunately, conventional cameras cannot see around corners. Emerging, time-resolved computational imaging systems, however, have demonstrated first steps towards non-line-of-sight (NLOS) imaging. In this paper, we develop an algorithmic framework for NLOS imaging that is robust to partial occlusions within the hidden scenes. This is a common light transport effect, but not adequately handled by existing NLOS reconstruction algorithms, resulting in fundamental limitations in what types of scenes can be recovered. We demonstrate state-of-the-art NLOS reconstructions in simulation and with a prototype single photon avalanche diode (SPAD) based acquisition system.
• In this paper, we study the problem of learning image classification models with label noise. Existing approaches depending on human supervision are generally not scalable, as manually identifying correct or incorrect labels is time-consuming, whereas approaches not relying on human supervision are scalable but less effective. To reduce the amount of human supervision needed for label noise cleaning, we introduce CleanNet, a joint neural embedding network that requires only a fraction of the classes to be manually verified in order to provide knowledge of label noise that can be transferred to other classes. We further integrate CleanNet and a conventional convolutional neural network classifier into one framework for image classification learning. We demonstrate the effectiveness of the proposed algorithm on both the label noise detection task and the image classification task on noisy data, on several large-scale datasets. Experimental results show that CleanNet can reduce the label noise detection error rate on held-out classes, where no human supervision is available, by 41.5% compared to current weakly supervised methods. It also achieves 47% of the performance gain of verifying all images, with only 3.2% of images verified, on an image classification task.
• Keyword spotting (KWS) is a critical component for enabling speech-based user interactions on smart devices. It requires real-time response and high accuracy for a good user experience. Recently, neural networks have become an attractive choice for KWS architectures because of their superior accuracy compared to traditional speech processing algorithms. Due to its always-on nature, a KWS application has a highly constrained power budget and typically runs on tiny microcontrollers with limited memory and compute capability. The design of neural network architectures for KWS must consider these constraints. In this work, we perform neural network architecture evaluation and exploration for running KWS on resource-constrained microcontrollers. We train various neural network architectures for keyword spotting published in the literature to compare their accuracy and memory/compute requirements. We show that it is possible to optimize these neural network architectures to fit within the memory and compute constraints of microcontrollers without sacrificing accuracy. We further explore the depthwise separable convolutional neural network (DS-CNN) and compare it against other neural network architectures. DS-CNN achieves an accuracy of 95.4%, which is ~10% higher than a DNN model with a similar number of parameters.
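The parameter-count arithmetic below shows why depthwise separable convolutions fit microcontroller memory budgets; the layer sizes are illustrative assumptions, not the exact DS-CNN configuration:

```python
# Why depthwise separable convolutions save memory: parameter-count arithmetic.
# Layer sizes here are illustrative, not the paper's exact DS-CNN model.

def standard_conv_params(c_in, c_out, k):
    """A standard k x k conv mixes channels and space in one weight tensor."""
    return c_in * c_out * k * k

def ds_conv_params(c_in, c_out, k):
    """Depthwise k x k per input channel, then a 1 x 1 pointwise channel mix."""
    return c_in * k * k + c_in * c_out

std = standard_conv_params(64, 64, 3)
ds = ds_conv_params(64, 64, 3)
print(std, ds, round(std / ds, 1))   # roughly an 8x parameter reduction here
```

The same factorization also cuts multiply-accumulate operations by a similar factor, which is what makes always-on inference feasible within a tight power budget.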
• We present exact analytical results for the Caputo fractional derivative of a wide class of elementary functions, including trigonometric and inverse trigonometric, hyperbolic and inverse hyperbolic, Gaussian, quartic Gaussian, and Lorentzian functions. These results are especially important for multi-scale physical systems, such as porous materials, disordered media, and turbulent fluids, in which transport is described by fractional partial differential equations. The exact results for the Caputo fractional derivative are obtained from a single generalized Euler integral transform of the generalized hypergeometric function with a power-law argument. We present a proof of the generalized Euler integral transform and directly apply it to the exact evaluation of the Caputo fractional derivative of a broad spectrum of functions, provided that these functions can be expressed in terms of a generalized hypergeometric function with a power-law argument. We determine that the Caputo fractional derivative of elementary functions is given by the generalized hypergeometric function. Moreover, we show that in the most general case the final result cannot be reduced to elementary functions, in contrast to both the Liouville-Caputo and Fourier fractional derivatives. However, we establish that in the infinite limit of the argument of elementary functions, all three definitions of a fractional derivative (the Caputo, Liouville-Caputo, and Fourier) converge to the same result given by the elementary functions. Finally, we prove the equivalence of the Liouville-Caputo and Fourier fractional derivatives.
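For reference, the standard definition of the Caputo fractional derivative of order $0 < \alpha < 1$ with lower terminal $0$ (the paper may use a different terminal or order range) is:

```latex
{}^{C}\!D^{\alpha}_{t} f(t)
  = \frac{1}{\Gamma(1-\alpha)} \int_{0}^{t} \frac{f'(\tau)}{(t-\tau)^{\alpha}}\, d\tau,
  \qquad 0 < \alpha < 1.
```

Because the kernel $(t-\tau)^{-\alpha}$ has a power-law form, functions expressible as generalized hypergeometric functions with power-law arguments are exactly the class for which this integral admits closed-form evaluation, which is the mechanism the abstract describes.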
• Based on operators borrowed from scattering theory, several concrete realizations of index theorems are proposed. The corresponding operators belong to some C*-algebras of pseudo-differential operators with coefficients which either have limits at plus and minus infinity, or which are periodic or asymptotically periodic, or which are uniformly almost periodic. These various situations can be deduced from a single partial isometry which depends on several parameters. All computations are explicitly performed.
• Artificial Intelligence (AI) has been used extensively in automatic decision making in a broad variety of scenarios, ranging from credit ratings for loans to recommendations of movies. Traditional design guidelines for AI models focus essentially on accuracy maximization, but recent work has shown that economically irrational and socially unacceptable scenarios of discrimination and unfairness are likely to arise unless these issues are explicitly addressed. This undesirable behavior has several possible sources, such as biased datasets used for training that may not be detected in black-box models. After pointing out connections between such bias of AI and the problem of induction, we focus on Popper's contributions after Hume's, which offer a logical theory of preferences. An AI model can be preferred over others on purely rational grounds after one or more attempts at refutation based on accuracy and fairness. Inspired by such epistemological principles, this paper proposes a structured approach to mitigate discrimination and unfairness caused by bias in AI systems. In the proposed computational framework, models are selected and enhanced after attempts at refutation. To illustrate our discussion, we focus on hiring decision scenarios where an AI system filters in which job applicants should go to the interview phase.
• Objective: 3D printed medical models can be derived from virtual digital resources, such as CT scans. Nevertheless, the accuracy of CT scanning technology is limited to about 1 mm. As a result, the collected data are not exactly the same as the real structure, and errors may cause the print to fail. This study presents a common and practical way to process skull data so that the structures are reconstructed correctly. We then produce a skull model through 3D printing technology, which is useful for medical students in understanding the complex structure of the skull. Materials and Methods: The skull data are collected by CT scan. To obtain a corrected medical model, computer-assisted image processing combines five 3D manipulation tools (Mimics, 3ds Max, Geomagic, Mudbox and Meshmixer) to reconstruct the digital model and repair it. Subsequently, we utilize a low-cost desktop 3D printer, an Ultimaker 2, with polylactide (PLA) filament to print the model, and paint it based on the atlas. Results: After restoration and repair, we eliminate the errors and add the missing parts of the uploaded data within 6 hours. We then print the model and compare it with a cadaveric skull from the frontal, left, right and anterior views respectively. The printed model clearly shows the same structures and details as the skull, and is a good alternative to the cadaveric skull.
• This paper investigates, from information theoretic grounds, a learning problem based on the principle that any regularity in a given dataset can be exploited to extract compact features from data, i.e., using fewer bits than needed to fully describe the data itself, in order to build meaningful representations of a relevant content (multiple labels). We begin by introducing the noisy lossy source coding paradigm with the log-loss fidelity criterion, which provides the fundamental tradeoffs between the cross-entropy loss (average risk) and the information rate of the features (model complexity). Our approach allows an information theoretic formulation of the multi-task learning (MTL) problem, which is a supervised learning framework in which the prediction models for several related tasks are learned jointly from common representations to achieve better generalization performance. Then, we present an iterative algorithm for computing the optimal tradeoffs, and its global convergence is proven provided that some conditions hold. An important property of this algorithm is that it provides a natural safeguard against overfitting, because it minimizes the average risk while taking into account a penalization induced by the model complexity. Remarkably, empirical results illustrate that there exists an optimal information rate minimizing the excess risk, which depends on the nature and the amount of available training data. An application to hierarchical text categorization is also investigated, extending previous works.
• An equiangular tight frame (ETF) is a type of optimal packing of lines in Euclidean space. A regular simplex is a special type of ETF in which the number of vectors is one more than the dimension of the space they span. In this paper, we consider ETFs that contain a regular simplex, that is, those with the property that a subset of their vectors forms a regular simplex. As we explain, such ETFs are characterized as those that achieve equality in a certain well-known bound from the theory of compressed sensing. We then consider the so-called binder of such an ETF, namely the set of all regular simplices that it contains. We provide a new algorithm for computing this binder in terms of products of entries of the ETF's Gram matrix. In certain circumstances, we show this binder can be used to produce a particularly elegant Naimark complement of the corresponding ETF. In other cases, an ETF is a disjoint union of regular simplices, and we show this leads to a certain type of optimal packing of subspaces known as an equichordal tight fusion frame. We conclude by considering the extent to which these ideas can be applied to numerous known constructions of ETFs, including harmonic ETFs.
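A regular simplex of d+1 unit vectors in a d-dimensional space has every pairwise inner product equal to −1/d, which is what makes it equiangular. The sketch below builds one by centering the standard basis at its centroid and checks this numerically; it is a generic textbook construction, not one of the paper's:

```python
import numpy as np

# Build d+1 unit vectors (embedded in R^{d+1}, spanning a d-dimensional
# subspace) whose pairwise inner products are all -1/d: the vertices of a
# regular simplex. Generic illustration, not the paper's construction.
d = 4
n = d + 1
V = np.eye(n) - np.ones((n, n)) / n        # subtract the centroid from each basis vector
V /= np.linalg.norm(V, axis=1, keepdims=True)
G = V @ V.T                                 # Gram matrix of the frame
off_diag = G[~np.eye(n, dtype=bool)]
print(np.allclose(off_diag, -1.0 / d))      # equiangular: every inner product is -1/d
```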
• Nov 21 2017 cs.CY arXiv:1711.07078v1
The main principle of the Lean Startup movement is that static business planning should be replaced by dynamic development, where products, services, business model elements, business objectives and activities are frequently changed based on constant customer feedback. Our ambition is to empirically measure whether such changes of the business idea, the business model elements, the project management and close interaction with customers really increase the success rate of entrepreneurs, and in what way. Our first paper, "Does Lean Startup really work? - Foundation for an empirical study", presented a first attempt to model the relations we want to measure. This paper focuses on how to build and set up a test harness (from now on called the Entrepreneurship Platform, or EP) to gather empirical data from companies, and how to store these data together with demographic and financial data from the PROFF portal in the Entrepreneurial Data Warehouse (from now on called the EDW). We end the paper by discussing the potential methodological problems with our method, before documenting a test run of our set-up to verify that we are actually able to populate the Data Warehouse with time series data.
• Following related work in law and policy, two notions of prejudice have come to shape the study of fairness in algorithmic decision-making. Algorithms exhibit disparate treatment if they formally treat people differently according to a protected characteristic, like race, or if they intentionally discriminate (even if via proxy variables). Algorithms exhibit disparate impact if they affect subgroups differently. Disparate impact can arise unintentionally and absent disparate treatment. The natural way to reduce disparate impact would be to apply disparate treatment in favor of the disadvantaged group, i.e. to apply affirmative action. However, owing to the practice's contested legal status, several papers have proposed trying to eliminate both forms of unfairness simultaneously, introducing a family of algorithms that we denote disparate learning processes (DLPs). These processes incorporate the protected characteristic as an input to the learning algorithm (e.g.~via a regularizer) but produce a model that cannot directly access the protected characteristic as an input. In this paper, we make the following arguments: (i) DLPs can be functionally equivalent to disparate treatment, and thus should carry the same legal status; (ii) when the protected characteristic is redundantly encoded in the nonsensitive features, DLPs can exactly apply any disparate treatment protocol; (iii) when the characteristic is only partially encoded, DLPs may induce within-class discrimination. Finally, we argue the normative point that rather than masking efforts towards proportional representation, it is preferable to undertake them transparently.
• This paper explores image caption generation using conditional variational auto-encoders (CVAEs). Standard CVAEs with a fixed Gaussian prior yield descriptions with too little variability. Instead, we propose two models that explicitly structure the latent space around $K$ components corresponding to different types of image content, and combine components to create priors for images that contain multiple types of content simultaneously (e.g., several kinds of objects). Our first model uses a Gaussian Mixture model (GMM) prior, while the second one defines a novel Additive Gaussian (AG) prior that linearly combines component means. We show that both models produce captions that are more diverse and more accurate than a strong LSTM baseline or a "vanilla" CVAE with a fixed Gaussian prior, with AG-CVAE showing particular promise.
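As described, the Additive Gaussian prior forms an image's prior mean by linearly combining component means according to the content present in the image. A small numpy sketch of that combination step (all shapes, weights and variable names here are mine, for illustration only):

```python
import numpy as np

# Additive Gaussian (AG) prior sketch: the prior mean for an image is a
# weighted linear combination of K component means, one per content type.
# All shapes and weights below are illustrative, not the paper's.
rng = np.random.default_rng(0)
K, latent_dim = 5, 8
mu = rng.normal(size=(K, latent_dim))    # one learned mean vector per content type

c = np.array([0.5, 0.5, 0.0, 0.0, 0.0]) # image contains content types 0 and 1
prior_mean = c @ mu                      # linear combination of component means
z = prior_mean + rng.normal(scale=0.1, size=latent_dim)  # sample near the combined mean
print(prior_mean.shape)
```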
• We present an end-to-end learning approach for motion deblurring based on a conditional GAN and a content loss. It improves the state of the art in terms of peak signal-to-noise ratio and structural similarity measure, as well as visual appearance. The quality of the deblurring model is also evaluated in a novel way, on a real-world problem: object detection on (de-)blurred images. The method is 5 times faster than the closest competitor. We also present a novel method for generating synthetic motion-blurred images from sharp ones, which allows realistic dataset augmentation. The model, training code and dataset are available at https://github.com/KupynOrest/DeblurGAN
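Peak signal-to-noise ratio, one of the metrics reported above, is a simple function of the mean squared error between the restored image and the ground truth. A minimal sketch (not the paper's evaluation code), assuming 8-bit images with values in [0, 255]:

```python
import numpy as np

# PSNR between a restored image and its ground truth, for pixel values
# in [0, 255]. Minimal sketch, not the paper's evaluation code.
def psnr(ref, test, max_val=255.0):
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")          # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

a = np.zeros((4, 4))
b = np.full((4, 4), 10.0)            # constant error of 10 -> MSE = 100
print(psnr(a, b))                    # 10 * log10(255^2 / 100) ≈ 28.13 dB
```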
• Nov 21 2017 cs.GT math.CT arXiv:1711.07059v1
We define a notion of morphisms between open games, exploiting a surprising connection between lenses in computer science and compositional game theory. This extends the more intuitively obvious definition of globular morphisms as mappings between strategy profiles that preserve best responses, and hence in particular preserve Nash equilibria. We construct a symmetric monoidal double category in which the horizontal 1-cells are open games, vertical 1-morphisms are lenses, and 2-cells are morphisms of open games. States (morphisms out of the monoidal unit) in the vertical category give a flexible solution concept that includes both Nash and subgame perfect equilibria. Products in the vertical category give an external choice operator that is reminiscent of products in game semantics, and is useful in practical examples. We illustrate the above two features with a simple worked example from microeconomics, the market entry game.
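The lenses referred to here are the computer-science notion: a view/update pair of functions that compose sequentially. A minimal Python sketch of that idea (a hypothetical encoding for illustration, not the paper's double-categorical definition):

```python
# A lens as a (view, update) pair, with sequential composition.
# Hypothetical minimal encoding of the computer-science notion the paper
# builds on; not the paper's categorical formulation.

def lens(view, update):
    return {"view": view, "update": update}

def compose(outer, inner):
    """Compose two lenses: focus with `outer` first, then `inner`."""
    return lens(
        view=lambda s: inner["view"](outer["view"](s)),
        update=lambda s, b: outer["update"](
            s, inner["update"](outer["view"](s), b)
        ),
    )

# Lenses into nested pairs: first component, then second component.
fst = lens(view=lambda s: s[0], update=lambda s, a: (a, s[1]))
snd = lens(view=lambda s: s[1], update=lambda s, a: (s[0], a))

both = compose(fst, snd)           # focuses the second element of the first pair
state = ((1, 2), 3)
print(both["view"](state))         # 2
print(both["update"](state, 9))    # ((1, 9), 3)
```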
• The Lagrangian of the causal action principle is computed in Minkowski space for Dirac wave functions interacting with classical electromagnetism and linearized gravity in the limiting case when the ultraviolet cutoff is removed. Various surface layer integrals are computed in this limiting case.
• Daily operation of a large-scale experiment is a resource-consuming task, particularly from the perspective of routine data quality monitoring. Typically, data come from different sub-detectors, and the global quality of the data depends on the combined performance of each of them. In this paper, the problem of identifying channels in which anomalies occurred is considered. We introduce a generic deep learning model and prove that, under reasonable assumptions, the model learns to identify 'channels' which are affected by an anomaly. Such a model could be used for data quality manager cross-checks and assistance, and for identifying good channels in anomalous data samples. The main novelty of the method is that the model does not require ground truth labels for each channel; only a global flag is used. This effectively distinguishes the model from classical classification methods. Applied to CMS data collected in 2010, this approach proves its ability to decompose anomalies by channel.
• The variational autoencoder (VAE) is a popular probabilistic generative model. However, one shortcoming of VAEs is that the latent variables cannot be discrete, which makes it difficult to generate data from different modes of a distribution. Here, we propose an extension of the VAE framework that incorporates a classifier to infer the discrete class of the modeled data. To model sequential data, we can combine our Classifying VAE with a recurrent neural network such as an LSTM. We apply this model to algorithmic music generation, where our model learns to generate musical sequences in different keys. Most previous work in this area avoids modeling key by transposing data into only one or two keys, as opposed to the 10+ different keys in the original music. We show that our Classifying VAE and Classifying VAE+LSTM models outperform the corresponding non-classifying models in generating musical samples that stay in key. This benefit is especially apparent when trained on untransposed music data in the original keys.
• Person re-identification (re-ID) models trained on one domain often fail to generalize well to another. In this work, we present a "learning via translation" framework. In the baseline, we translate the labeled images from the source to the target domain in an unsupervised manner. We then train re-ID models with the translated images by supervised methods. Yet, as an essential part of this framework, unsupervised image-image translation suffers from the loss of source-domain label information during translation. Our motivation is two-fold. First, for each image, the discriminative cues contained in its ID label should be maintained after translation. Second, given that the two domains contain entirely different persons, a translated image should be dissimilar to any of the target IDs. To this end, we propose to preserve two types of unsupervised similarities: 1) the self-similarity of an image before and after translation, and 2) the domain-dissimilarity of a translated source image and a target image. Both constraints are implemented in the similarity-preserving generative adversarial network (SPGAN), which consists of a Siamese network and a CycleGAN. Through domain adaptation experiments, we show that images generated by SPGAN are more suitable for domain adaptation and yield consistent and competitive re-ID accuracy on two large-scale datasets.
• Incorporating syntactic information in neural machine translation models is a practical way to compensate for their requirement for a large amount of parallel training text, especially for low-resource language pairs. Previous work on using syntactic information provided by (inevitably error-prone) parsers has been promising. In this paper, we propose a forest-to-sequence attentional neural machine translation model that makes use of exponentially many parse trees of the source sentence to compensate for parser errors. Our method represents the collection of parse trees as a packed forest and learns a neural attentional transduction model from the forest to the target sentence. Experiments on English-to-German, Chinese and Persian datasets show the superiority of our method over tree-to-sequence and vanilla sequence-to-sequence attentional neural machine translation models.
• This paper is aimed at creating extremely small and fast convolutional neural networks (CNNs) for the problem of facial expression recognition (FER) from frontal face images. We show that, for this problem, translation invariance (achieved through max-pooling layers) degrades performance, especially when the network is small, and that the knowledge distillation method can be used to obtain extremely compressed CNNs. Extensive comparisons on two widely-used FER datasets, CK+ and Oulu-CASIA, demonstrate that our largest model sets the new state of the art, yielding 1.8% and 12.7% relative improvements over the previous best results on the CK+ and Oulu-CASIA datasets, respectively. In addition, our smallest model (MicroExpNet), obtained using knowledge distillation, is less than 1 MB in size and runs at 1408 frames per second on an Intel i7 CPU. While slightly less accurate than our largest model, MicroExpNet still achieves an 8.3% relative improvement on the Oulu-CASIA dataset over the previous state-of-the-art, much larger network; on the CK+ dataset, it performs on par with a previous state-of-the-art network but with 154x fewer parameters.
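Knowledge distillation, in the standard Hinton-style formulation, trains the small student network on a mix of the hard-label cross-entropy and the divergence to the teacher's temperature-softened outputs. A numpy sketch of that loss (the temperature, weighting and logit values below are illustrative, not the paper's):

```python
import numpy as np

# Hinton-style knowledge-distillation loss: mix the hard-label cross-entropy
# with cross-entropy against the teacher's temperature-softened outputs.
# All hyperparameters and logit values are illustrative, not the paper's.

def softmax(z, T=1.0):
    z = z / T
    e = np.exp(z - z.max())          # shift for numerical stability
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, label, T=4.0, alpha=0.9):
    p_t = softmax(teacher_logits, T)           # soft targets from the teacher
    p_s = softmax(student_logits, T)
    soft = -np.sum(p_t * np.log(p_s)) * T**2   # T^2 keeps the gradient scale stable
    hard = -np.log(softmax(student_logits)[label])
    return alpha * soft + (1 - alpha) * hard

student = np.array([1.0, 2.0, 0.5])
teacher = np.array([1.2, 2.5, 0.1])
print(distillation_loss(student, teacher, label=1))
```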
• We develop the hierarchical cluster coherence (HCC) method for brain signals, a procedure for characterizing connectivity in a network by clustering nodes or groups of channels that display a high level of coordination as measured by "cluster-coherence". While the most common approach to measuring dependence between clusters is through pairs of single time series, our cluster coherence measures dependence between whole clusters rather than between single elements. It thus takes into account both the dependence between clusters and that within channels in a cluster. Using our method, the identified clusters contain time series that exhibit high cross-dependence in the spectral domain; that is, these clusters correspond to connected brain regions with synchronized oscillatory activity. In simulation studies, we show that the proposed HCC outperforms commonly used clustering algorithms, such as average-coherence and minimum-coherence based methods. To study clustering in a network of multichannel electroencephalograms (EEG) during an epileptic seizure, we applied the HCC method and identified connectivity in the alpha (8-12 Hz) and beta (16-30 Hz) bands at different phases of the recording: before the epileptic seizure, and during the early and middle phases of the seizure episode. To increase the potential impact of this work in neuroscience, we also developed HCC-Vis, an R Shiny app (RStudio), which can be downloaded from https://carolinaeuan.shinyapps.io/hcc-vis/.
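For reference, the standard (squared) coherence between two channels X and Y at frequency ω, which cluster-coherence generalizes from pairs of series to whole clusters, is defined via their spectral densities:

```latex
C_{XY}(\omega) \;=\;
\frac{\lvert S_{XY}(\omega) \rvert^{2}}{S_{XX}(\omega)\, S_{YY}(\omega)},
\qquad 0 \le C_{XY}(\omega) \le 1,
```

where \(S_{XY}\) is the cross-spectral density and \(S_{XX}, S_{YY}\) are the auto-spectral densities; values near 1 indicate synchronized oscillatory activity at that frequency.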
• The number of social images has exploded with the wide adoption of social networks, and people like to share comments about them. These comments can be a description of the image, or of objects, attributes or scenes in it, and are normally used as user-provided tags. However, it is well known that user-provided tags are to some extent incomplete and imprecise. Directly using them can damage the performance of related applications, such as image annotation and retrieval. In this paper, we propose to learn an image annotation model and refine the user-provided tags simultaneously in a weakly-supervised manner. A deep neural network is utilized as the image feature learner and backbone annotation model, while visual consistency, semantic dependency, and user-error sparsity are introduced as constraints at the batch level to alleviate tag noise. Our model is therefore highly flexible and stable in handling large-scale image sets. Experimental results on two benchmark datasets indicate that our proposed model achieves the best performance compared to state-of-the-art methods.
• In this paper, I give an overview of some selected results in quantum many body theory, lying at the interface between mathematical quantum statistical mechanics and condensed matter theory. In particular, I discuss some recent results on the universality of transport coefficients in lattice models of interacting electrons, with specific focus on the independence of the quantum Hall conductivity from the electron-electron interaction. In this context, the exchange of ideas between mathematical and theoretical physics proved particularly fruitful, and helped in clarifying the role played by quantum conservation laws (Ward Identities) together with the decay properties of the Euclidean current-current correlation functions, on the interaction-independence of the conductivity coefficients.
