- Apr 20 2018 math.RT arXiv:1804.07100v1In this paper we explicitly construct $G_1$-intertwining operators between holomorphic discrete series representations $\mathcal{H}$ of a Lie group $G$ and those $\mathcal{H}_1$ of a subgroup $G_1\subset G$ when $(G,G_1)$ is a symmetric pair of holomorphic type. More precisely, we construct $G_1$-intertwining projection operators from $\mathcal{H}$ onto $\mathcal{H}_1$ as differential operators, in the case $(G,G_1)=(G_0\times G_0,\Delta G_0)$ and both $\mathcal{H}$, $\mathcal{H}_1$ are of scalar type, and also construct $G_1$-intertwining embedding operators from $\mathcal{H}_1$ into $\mathcal{H}$ as infinite-order differential operators, in the case $G$ is simple, $\mathcal{H}$ is of scalar type, and $\mathcal{H}_1$ is multiplicity-free under a maximal compact subgroup $K_1\subset K$. In the actual computation we make use of series expansions of integral kernels and the result of Faraut-Korányi (1990) or the author's previous result (2016) on norm computation. As an application, we observe the behavior of residues of the intertwining operators, which define the maps from some subquotient modules, when the parameters are at poles.
- In ontology-based data access (OBDA), the classical database is enhanced with an ontology in the form of logical assertions generating new intensional knowledge. A powerful form of such logical assertions is the tuple-generating dependencies (TGDs), also called existential rules, where Horn rules are extended by allowing existential quantifiers to appear in the rule heads. In this paper we introduce a new language called loop restricted (LR) TGDs (existential rules), which are TGDs with certain restrictions on the loops embedded in the underlying rule set. We study the complexity of this new language. We show that the conjunctive query answering (CQA) under the LR TGDs is decid- able. In particular, we prove that this language satisfies the so-called bounded derivation-depth prop- erty (BDDP), which implies that the CQA is first-order rewritable, and its data complexity is in AC0 . We also prove that the combined complexity of the CQA is EXPTIME complete, while the language membership is PSPACE complete. Then we extend the LR TGDs language to the generalised loop restricted (GLR) TGDs language, and prove that this class of TGDs still remains to be first-order rewritable and properly contains most of other first-order rewritable TGDs classes discovered in the literature so far.
- We propose an unsupervised method using self-clustering convolutional adversarial autoencoders to classify prostate tissue as tumor or non-tumor without any labeled training data. The clustering method is integrated into the training of the autoencoder and requires only little post-processing. Our network trains on hematoxylin and eosin (H&E) input patches and we tested two different reconstruction targets, H&E and immunohistochemistry (IHC). We show that antibody-driven feature learning using IHC helps the network to learn relevant features for the clustering task. Our network achieves a F1 score of 0.62 using only a small set of validation labels to assign classes to clusters.
- Apr 20 2018 cs.CL arXiv:1804.07097v1Traditional information retrieval (such as that offered by web search engines) impedes users with information overload from extensive result pages and the need to manually locate the desired information therein. Conversely, question-answering systems change how humans interact with information systems: users can now ask specific questions and obtain a tailored answer - both conveniently in natural language. Despite obvious benefits, their use is often limited to an academic context, largely because of expensive domain customizations, which means that the performance in domain-specific applications often fails to meet expectations. This paper presents cost-efficient remedies: a selection mechanism increases the precision of document retrieval and a fused approach to transfer learning is proposed in order to improve the performance of answer extraction. Here knowledge is inductively transferred from a related, yet different, tasks to the domain-specific application, while accounting for potential differences in the sample sizes across both tasks. The resulting performance is demonstrated with an actual use case from a finance company, where fewer than 400 question-answer pairs had to be annotated in order to yield significant performance gains. As a direct implication to management, this presents a promising path to better leveraging of knowledge stored in information systems.
- Apr 20 2018 cs.CV arXiv:1804.07094v1We propose a novel network that learns a part-aligned representation for person re-identification. It handles the body part misalignment problem, that is, body parts are misaligned across human detections due to pose/viewpoint change and unreliable detection. Our model consists of a two-stream network (one stream for appearance map extraction and the other one for body part map extraction) and a bilinear-pooling layer that generates and spatially pools a part-aligned map. Each local feature of the part-aligned map is obtained by a bilinear mapping of the corresponding local appearance and body part descriptors. Our new representation leads to a robust image matching similarity, which is equivalent to an aggregation of the local similarities of the corresponding body parts combined with the weighted appearance similarity. This part-aligned representation reduces the part misalignment problem significantly. Our approach is also advantageous over other pose-guided representations (e.g., extracting representations over the bounding box of each body part) by learning part descriptors optimal for person re-identification. For training the network, our approach does not require any part annotation on the person re-identification dataset. Instead, we simply initialize the part sub-stream using a pre-trained sub-network of an existing pose estimation network, and train the whole network to minimize the re-identification loss. We validate the effectiveness of our approach by demonstrating its superiority over the state-of-the-art methods on the standard benchmark datasets, including Market-1501, CUHK03, CUHK01 and DukeMTMC, and standard video dataset MARS.
- Automatic detection of anomalies in space- and time-varying measurements is an important tool in several fields, e.g., fraud detection, climate analysis, or healthcare monitoring. We present an algorithm for detecting anomalous regions in multivariate spatio-temporal time-series, which allows for spotting the interesting parts in large amounts of data, including video and text data. In opposition to existing techniques for detecting isolated anomalous data points, we propose the "Maximally Divergent Intervals" (MDI) framework for unsupervised detection of coherent spatial regions and time intervals characterized by a high Kullback-Leibler divergence compared with all other data given. In this regard, we define an unbiased Kullback-Leibler divergence that allows for ranking regions of different size and show how to enable the algorithm to run on large-scale data sets in reasonable time using an interval proposal technique. Experiments on both synthetic and real data from various domains, such as climate analysis, video surveillance, and text forensics, demonstrate that our method is widely applicable and a valuable tool for finding interesting events in different types of data.
- Apr 20 2018 cs.AI arXiv:1804.07088v1Spatial information is often expressed using qualitative terms such as natural language expressions instead of coordinates; reasoning over such terms has several practical applications, such as bus routes planning. Representing and reasoning on trajectories is a specific case of qualitative spatial reasoning that focuses on moving objects and their paths. In this work, we propose two versions of a trajectory calculus based on the allowed properties over trajectories, where trajectories are defined as a sequence of non-overlapping regions of a partitioned map. More specifically, if a given trajectory is allowed to start and finish at the same region, 6 base relations are defined (TC-6). If a given trajectory should have different start and finish regions but cycles are allowed within, 10 base relations are defined (TC-10). Both versions of the calculus are implemented as ASP programs; we propose several different encodings, including a generalised program capable of encoding any qualitative calculus in ASP. All proposed encodings are experimentally evaluated using a real-world dataset. Experiment results show that the best performing implementation can scale up to an input of 250 trajectories for TC-6 and 150 trajectories for TC-10 for the problem of discovering a consistent configuration, a significant improvement compared to previous ASP implementations for similar qualitative spatial and temporal calculi. This manuscript is under consideration for acceptance in TPLP.
- We present a method of backward induction for computing approximate subgame perfect Nash equilibria of infinitely repeated games with discounted payoffs. This uses the selection monad transformer, combined with the searchable set monad viewed as a notion of 'topologically compact' nondeterminism, and a simple model of computable real numbers. This is the first application of Escardó and Oliva's theory of higher-order sequential games to games of imperfect information, in which (as well as its mathematical elegance) lazy evaluation does nontrivial work for us compared with a traditional game-theoretic analysis. Since a full theoretical understanding of this method is lacking (and appears to be very hard), we consider this an 'experimental' paper heavily inspired by theoretical ideas. We use the famous Iterated Prisoner's Dilemma as a worked example.
- Apr 20 2018 cs.CL arXiv:1804.07068v1In formal logic-based approaches to Recognizing Textual Entailment (RTE), a Combinatory Categorial Grammar (CCG) parser is used to parse input premises and hypotheses to obtain their logical formulas. Here, it is important that the parser processes the sentences consistently; failing to recognize a similar syntactic structure results in inconsistent predicate argument structures among them, in which case the succeeding theorem proving is doomed to failure. In this work, we present a simple method to extend an existing CCG parser to parse a set of sentences consistently, which is achieved with an inter-sentence modeling with Markov Random Fields (MRF). When combined with existing logic-based systems, our method always shows improvement in the RTE experiments on English and Japanese languages.
- Apr 20 2018 cs.CV arXiv:1804.07062v1The output of Convolutional Neural Networks (CNN) has been shown to be discontinuous which can make the CNN image classifier vulnerable to small well-tuned artificial perturbations. That is, images modified by adding such perturbations(i.e. adversarial perturbations) that make little difference to human eyes, can completely alter the CNN classification results. In this paper, we propose a practical attack using differential evolution(DE) for generating effective adversarial perturbations. We comprehensively evaluate the effectiveness of different types of DEs for conducting the attack on different network structures. The proposed method is a black-box attack which only requires the miracle feedback of the target CNN systems. The results show that under strict constraints which simultaneously control the number of pixels changed and overall perturbation strength, attacking can achieve 72.29%, 78.24% and 61.28% non-targeted attack success rates, with 88.68%, 99.85% and 73.07% confidence on average, on three common types of CNNs. The attack only requires modifying 5 pixels with 20.44, 14.76 and 22.98 pixel values distortion. Thus, the result shows that the current DNNs are also vulnerable to such simpler black-box attacks even under very limited attack conditions.
- Apr 20 2018 cs.CV arXiv:1804.07056v1We propose a new long-term tracking performance evaluation methodology and present a new challenging dataset of carefully selected sequences with many target disappearances. We perform an extensive evaluation of six long-term and nine short-term state-of-the-art trackers, using new performance measures, suitable for evaluating long-term tracking - tracking precision, recall and F-score. The evaluation shows that a good model update strategy and the capability of image-wide re-detection are critical for long-term tracking performance. We integrated the methodology in the VOT toolkit to automate experimental analysis and benchmarking and to facilitate the development of long-term trackers.
- Apr 20 2018 cs.SY arXiv:1804.07051v1Network Function Virtualization (NFV) can cost-efficiently provide network services by running different virtual network functions (VNFs) at different virtual machines (VMs) in a correct order. This can result in strong couplings between the decisions of the VMs on the placement and operations of VNFs. This paper presents a new fully decentralized online approach for optimal placement and operations of VNFs. Building on a new stochastic dual gradient method, our approach decouples the real-time decisions of VMs, asymptotically minimizes the time-average cost of NFV, and stabilizes the backlogs of network services with a cost-backlog tradeoff of $[\epsilon,1/\epsilon]$, for any $\epsilon > 0$. Our approach can be relaxed into multiple timescales to have VNFs (re)placed at a larger timescale and hence alleviate service interruptions. While proved to preserve the asymptotic optimality, the larger timescale can slow down the optimal placement of VNFs. A learn-and-adapt strategy is further designed to speed the placement up with an improved tradeoff $[\epsilon,\log^2(\epsilon)/{\sqrt{\epsilon}}]$. Numerical results show that the proposed method is able to reduce the time-average cost of NFV by 30\% and reduce the queue length (or delay) by 83\%, as compared to existing benchmarks.
- Sparse Mobile CrowdSensing (MCS) is a novel MCS paradigm where data inference is incorporated into the MCS process for reducing sensing costs while its quality is guaranteed. Since the sensed data from different cells (sub-areas) of the target sensing area will probably lead to diverse levels of inference data quality, cell selection (i.e., choose which cells of the target area to collect sensed data from participants) is a critical issue that will impact the total amount of data that requires to be collected (i.e., data collection costs) for ensuring a certain level of quality. To address this issue, this paper proposes a Deep Reinforcement learning based Cell selection mechanism for Sparse MCS, called DR-Cell. First, we properly model the key concepts in reinforcement learning including state, action, and reward, and then propose to use a deep recurrent Q-network for learning the Q-function that can help decide which cell is a better choice under a certain state during cell selection. Furthermore, we leverage the transfer learning techniques to reduce the amount of data required for training the Q-function if there are multiple correlated MCS tasks that need to be conducted in the same target area. Experiments on various real-life sensing datasets verify the effectiveness of DR-Cell over the state-of-the-art cell selection mechanisms in Sparse MCS by reducing up to 15% of sensed cells with the same data inference quality guarantee.
- Fueled by massive amounts of data, models produced by machine-learning (ML) algorithms, especially deep neural networks, are being used in diverse domains where trustworthiness is a concern, including automotive systems, finance, health care, natural language processing, and malware detection. Of particular concern is the use of ML algorithms in cyber-physical systems (CPS), such as self-driving cars and aviation, where an adversary can cause serious consequences. However, existing approaches to generating adversarial examples and devising robust ML algorithms mostly ignore the semantics and context of the overall system containing the ML component. For example, in an autonomous vehicle using deep learning for perception, not every adversarial example for the neural network might lead to a harmful consequence. Moreover, one may want to prioritize the search for adversarial examples towards those that significantly modify the desired semantics of the overall system. Along the same lines, existing algorithms for constructing robust ML algorithms ignore the specification of the overall system. In this paper, we argue that the semantics and specification of the overall system has a crucial role to play in this line of research. We present preliminary research results that support this claim.
- We consider a renormalizable extension of the minimal supersymmetric standard model endowed by an R and a gauged B - L symmetry. The model incorporates chaotic inflation driven by a quartic potential, associated with the Higgs field which leads to a spontaneous breaking of U(1)B-L, and yields possibly detectable gravitational waves. We employ semi-logarithmic Kahler potentials with an enhanced symmetry which include only quadratic terms and integer prefactor for the logarithms. An explanation of the mu term of the MSSM is also provided, consistently with the low energy phenomenology, under the condition that one related parameter in the superpotential is somewhat small. Baryogenesis occurs via non-thermal leptogenesis which is realized by the inflaton's decay to the lightest or next-to-lightest right-handed neutrinos.
- We consider planning problems for graphs, Markov decision processes (MDPs), and games on graphs. While graphs represent the most basic planning model, MDPs represent interaction with nature and games on graphs represent interaction with an adversarial environment. We consider two planning problems where there are k different target sets, and the problems are as follows: (a) the coverage problem asks whether there is a plan for each individual target set, and (b) the sequential target reachability problem asks whether the targets can be reached in sequence. For the coverage problem, we present a linear-time algorithm for graphs and quadratic conditional lower bound for MDPs and games on graphs. For the sequential target problem, we present a linear-time algorithm for graphs, a sub-quadratic algorithm for MDPs, and a quadratic conditional lower bound for games on graphs. Our results with conditional lower bounds establish (i) model-separation results showing that for the coverage problem MDPs and games on graphs are harder than graphs and for the sequential reachability problem games on graphs are harder than MDPs and graphs; (ii) objective-separation results showing that for MDPs the coverage problem is harder than the sequential target problem.
- Apr 20 2018 cs.SY arXiv:1804.07020v1With an increasing degree of automation, automated vehicle systems become more complex in terms of functional components as well as interconnected hardware and software components. Thus, holistic systems engineering becomes a severe challenge. Emergent properties like system safety are not solely arguable in singular viewpoints such as structural representations of software or electrical (e.g. fault tolerant) wiring. This states the need to get several viewpoints on a system and describe correspondences between these views in order to enable traceability of emergent system properties. Today, the most abstract view found in architecture frameworks is a logical description of system functions which structures the system in terms of information flow and functional components. In this article we extend established system viewpoints towards an ability-based assessment of an automated vehicle and conduct an exemplary safety analysis to derive behavioral safety requirements. These requirements can afterwards be attributed to different viewpoints in an architecture frameworks and thus be integrated in a development process for automated vehicles.
- Apr 20 2018 cs.MM arXiv:1804.07019v1Given a collection of videos, how to detect content-based copies efficiently with high accuracy? Detecting copies in large video collections still remains one of the major challenges of multimedia retrieval. While many video copy detection approaches show high computation times and insufficient quality, we propose a new efficient content-based video copy detection algorithm improving both aspects. The idea of our approach consists in utilizing self-similarity matrices as video descriptors in order to capture different visual properties. We benchmark our algorithm on the MuscleVCD ST1 benchmark dataset and show that our approach is able to achieve a score of 100\% and a score of at least 93\% in a wide range of parameters.
- Apr 20 2018 cs.CV arXiv:1804.07014v1Given an untrimmed video and a sentence description, temporal sentence localization aims to automatically determine the start and end points of the described sentence within the video. The problem is challenging as it needs the understanding of both video and sentence. Existing research predominantly employs a costly "scan and localize" framework, neglecting the global video context and the specific details within sentences which play as critical issues for this problem. In this paper, we propose a novel Attention Based Location Regression (ABLR) approach to solve the temporal sentence localization from a global perspective. Specifically, to preserve the context information, ABLR first encodes both video and sentence via Bidirectional LSTM networks. Then, a multi-modal co-attention mechanism is introduced to generate not only video attention which reflects the global video structure, but also sentence attention which highlights the crucial details for temporal localization. Finally, a novel attention based location regression network is designed to predict the temporal coordinates of sentence query from the previous attention. ABLR is jointly trained in an end-to-end manner. Comprehensive experiments on ActivityNet Captions and TACoS datasets demonstrate both the effectiveness and the efficiency of the proposed ABLR approach.
- Apr 20 2018 cs.AI arXiv:1804.07013v1In order to make the task, description of planning domains and problems, more comprehensive for non-experts in planning, the visual representation has been used in planning domain modeling in recent years. However, current knowledge engineering tools with visual modeling, like itSIMPLE (Vaquero et al. 2012) and VIZ (Vodrážka and Chrpa 2010), are less efficient than the traditional method of hand-coding by a PDDL expert using a text editor, and rarely involved in finetuning planning domains depending on the plan validation. Aim at this, we present an integrated development environment KAVI for planning domain modeling inspired by itSIMPLE and VIZ. KAVI using an abstract domain knowledge base to improve the efficiency of planning domain visual modeling. By integrating planners and a plan validator, KAVI proposes a method to fine-tune planning domains based on the plan validation.
- Classical numerical methods for solving partial differential equations suffer from the curse dimensionality mainly due to their reliance on meticulously generated spatio-temporal grids. Inspired by modern deep learning based techniques for solving forward and inverse problems associated with partial differential equations, we circumvent the tyranny of numerical discretization by devising an algorithm that is scalable to high-dimensions. In particular, we approximate the unknown solution by a deep neural network which essentially enables us to benefit from the merits of automatic differentiation. To train the aforementioned neural network we leverage the well-known connection between high-dimensional partial differential equations and forward-backward stochastic differential equations. In fact, independent realizations of a standard Brownian motion will act as training data. We test the effectiveness of our approach for a couple of benchmark problems spanning a number of scientific domains including Black-Scholes-Barenblatt and Hamilton-Jacobi-Bellman equations, both in 100-dimensions.
- Compared with visible object tracking, thermal infrared (TIR) object tracking can track an arbitrary target in total darkness since it cannot be influenced by illumination variations. However, there are many unwanted attributes that constrain the potentials of TIR tracking, such as the absence of visual color patterns and low resolutions. Recently, structured output support vector machine (SOSVM) and discriminative correlation filter (DCF) have been successfully applied to visible object tracking, respectively. Motivated by these, in this paper, we propose a large margin structured convolution operator (LMSCO) to achieve efficient TIR object tracking. To improve the tracking performance, we employ the spatial regularization and implicit interpolation to obtain continuous deep feature maps, including deep appearance features and deep motion features, of the TIR targets. Finally, a collaborative optimization strategy is exploited to significantly update the operators. Our approach not only inherits the advantage of the strong discriminative capability of SOSVM but also achieves accurate and robust tracking with higher-dimensional features and more dense samples. To the best of our knowledge, we are the first to incorporate the advantages of DCF and SOSVM for TIR object tracking. Comprehensive evaluations on two thermal infrared tracking benchmarks, i.e. VOT-TIR2015 and VOT-TIR2016, clearly demonstrate that our LMSCO tracker achieves impressive results and outperforms most state-of-the-art trackers in terms of accuracy and robustness with sufficient frame rate.
- In the description of quantum key distribution systems, much attention is paid to the operation of quantum cryptography protocols. The main problem is the insufficient study of the synchronization process of quantum key distribution systems. This paper contains a general description of quantum cryptography principles. A two-line fiber-optic quantum key distribution system with phase coding of photon states in transceiver and coding station synchronization mode was examined. A quantum key distribution system was built on the basis of the scheme with automatic compensation of polarization mode distortions. Single-photon avalanche diodes were used as optical radiation detecting devices. It was estimated how the parameters used in quantum key distribution systems of optical detectors affect the detection of the time frame with attenuated optical pulse in synchronization mode with respect to its probabilistic and time-domain characteristics. A design method was given for the process that detects the time frame that includes an optical pulse during synchronization. This paper describes the main quantum communication channel attack methods by removing a portion of optical emission. This paper describes the developed synchronization algorithm that takes into account the time required to restore the photodetectors operation state after the photon has been registered during synchronization. The computer simulation results of the developed synchronization algorithm were analyzed...
- Depression is ranked as the largest contributor to global disability and is also a major reason for suicide. Still, many individuals suffering from forms of depression are not treated for various reasons. Previous studies have shown that depression also has an effect on language usage and that many depressed individuals use social media platforms or the internet in general to get information or discuss their problems. This paper addresses the early detection of depression using machine learning models based on messages on a social platform. In particular, a convolutional neural network based on different word embeddings is evaluated and compared to a classification based on user-level linguistic metadata. An ensemble of both approaches is shown to achieve state-of-the-art results in a current early detection task. Furthermore, the currently popular ERDE score as metric for early detection systems is examined in detail and its drawbacks in the context of shared tasks are illustrated. A slightly modified metric is proposed and compared to the original score. Finally, a new word embedding was trained on a large corpus of the same domain as the described task and is evaluated as well.
- Apr 20 2018 cs.CV arXiv:1804.06992v1In recent years, deep learning has become a very active research tool which is used in many image processing fields. In this paper, we propose an effective image fusion method using a deep learning framework to generate a single image which contains all the features from infrared and visible images. First, the source images are decomposed into base parts and detail content. Then the base parts are fused by weighted-averaging. For the detail content, we use a deep learning network to extract multi-layer features. Using these features, we use l_1-norm and weighted-average strategy to generate several candidates of the fused detail content. Once we get these candidates, the max selection strategy is used to get final fused detail content. Finally, the fused image will be reconstructed by combining the fused base part and detail content. The experimental results demonstrate that our proposed method achieves state-of-the-art performance in both objective assessment and visual quality. The Code of our fusion method is available at https://github.com/exceptionLi/imagefusion_deeplearning.
- Apr 20 2018 cs.CL arXiv:1804.06987v1Relation extraction is the problem of classifying the relationship between two entities in a given sentence. Distant Supervision (DS) is a popular technique for developing relation extractors starting with limited supervision. We note that most of the sentences in the distant supervision relation extraction setting are very long and may benefit from word attention for better sentence representation. Our contributions in this paper are threefold. Firstly, we propose two novel word attention models for distantly- supervised relation extraction: (1) a Bi-directional Gated Recurrent Unit (Bi-GRU) based word attention model (BGWA), (2) an entity-centric attention model (EA), and (3) a combination model which combines multiple complementary models using weighted voting method for improved relation extraction. Secondly, we introduce GDS, a new distant supervision dataset for relation extraction. GDS removes test data noise present in all previous distant- supervision benchmark datasets, making credible automatic evaluation possible. Thirdly, through extensive experiments on multiple real-world datasets, we demonstrate the effectiveness of the proposed methods.
- A new solution of four-dimensional vacuum General Relativity is presented. It describes the near horizon region of the extreme (maximally spinning) binary black hole system with two identical extreme Kerr black holes held in equilibrium by a massless strut. This is the first example of a non-supersymmetric, asymptotically flat near horizon extreme binary black hole geometry of two uncharged black holes. The black holes are co-rotating, and the solution is uniquely specified by the mass. The binary extreme system has finite entropy. The distance between the black holes is fixed, but there is a zero-distance limit where the objects collapse into one. This limiting geometry corresponds to the near horizon extreme Kerr (NHEK) black hole.
- Apr 20 2018 q-bio.PE physics.soc-ph arXiv:1804.06984v1Cooperation among self-interested players in a social dilemma is fragile and easily interrupted by mistakes. In this work, we study the repeated $n$-person public-goods game and search for a strategy that forms a cooperative Nash equilibrium in the presence of implementation error with a guarantee that the resulting payoff will be no less than any of the co-players'. By enumerating strategic possibilities for $n=3$, we show that such a strategy indeed exists when its memory length $m$ equals three. It means that a deterministic strategy can be publicly employed to stabilize cooperation against error with avoiding the risk of being exploited. We furthermore show that, for general $n$-person public-goods game, $m \geq n$ is necessary to satisfy the above criteria.
- Apr 20 2018 quant-ph physics.optics arXiv:1804.06976v1Photodetection is a process in which an incident field induces a polarization current in the detector. The interaction of the field with this induced current excites an electron in the detector from a localized bound state to a state in which the electron freely propagates and can be classically amplified and detected. The induced current can interact not only with the applied field, but also with all of the initially unpopulated vacuum modes. This interaction with the vacuum modes is assumed to be small and is neglected in conventional photodetection theory. We show that this interaction contributes to the quantum efficiency of the detector. We also show that in the Purcell enhancement regime, shot noise in the photocurrent depends on the bandwidth of the the vacuum modes interacting with the detector. Our theory allows design of sensitive detectors to probe the properties of the vacuum modes.
- A key problem in deep multi-attribute learning is to effectively discover the inter-attribute correlation structures. Typically, the conventional deep multi-attribute learning approaches follow the pipeline of manually designing the network architectures based on task-specific expertise prior knowledge and careful network tunings, leading to the inflexibility for various complicated scenarios in practice. Motivated by addressing this problem, we propose an efficient greedy neural architecture search approach (GNAS) to automatically discover the optimal tree-like deep architecture for multi-attribute learning. In a greedy manner, GNAS divides the optimization of global architecture into the optimizations of individual connections step by step. By iteratively updating the local architectures, the global tree-like architecture gets converged where the bottom layers are shared across relevant attributes and the branches in top layers more encode attribute-specific features. Experiments on three benchmark multi-attribute datasets show the effectiveness and compactness of neural architectures derived by GNAS, and also demonstrate the efficiency of GNAS in searching neural architectures.
- Apr 20 2018 cs.CV arXiv:1804.06962v1In this work, we propose Adversarial Complementary Learning (ACoL) to automatically localize integral objects of semantic interest with weak supervision. We first mathematically prove that class localization maps can be obtained by directly selecting the class-specific feature maps of the last convolutional layer, which paves a simple way to identify object regions. We then present a simple network architecture including two parallel-classifiers for object localization. Specifically, we leverage one classification branch to dynamically localize some discriminative object regions during the forward pass. Although it is usually responsive to sparse parts of the target objects, this classifier can drive the counterpart classifier to discover new and complementary object regions by erasing its discovered regions from the feature maps. With such an adversarial learning, the two parallel-classifiers are forced to leverage complementary object regions for classification and can finally generate integral object localization together. The merits of ACoL are mainly two-fold: 1) it can be trained in an end-to-end manner; 2) dynamically erasing enables the counterpart classifier to discover complementary object regions more effectively. We demonstrate the superiority of our ACoL approach in a variety of experiments. In particular, the Top-1 localization error rate on the ILSVRC dataset is 45.14%, which is the new state-of-the-art.
- Apr 20 2018 cs.CV arXiv:1804.06958v1Crowd counting, for estimating the number of people in a crowd using vision-based computer techniques, has attracted much interest in the research community. Although many attempts have been reported, real-world problems, such as huge variation in subjects' sizes in images and serious occlusion among people, make it still a challenging problem. In this paper, we propose an Adaptive Counting Convolutional Neural Network (A-CCNN) and consider the scale variation of objects in a frame adaptively so as to improve the accuracy of counting. Our method takes advantages of contextual information to provide more accurate and adaptive density maps and crowd counting in a scene. Extensively experimental evaluation is conducted using different benchmark datasets for object-counting and shows that the proposed approach is effective and outperforms state-of-the-art approaches.
- Current exergaming sensors and inertial systems attached to sports equipment or the human body can provide quantitative information about the movement or impact e.g. with the ball. However, the scope of these technologies is not to qualitatively assess sports technique at a personalised level, similar to a coach during training or replay analysis. The aim of this paper is to demonstrate a novel approach to automate identification of tennis swings executed with erroneous technique without recorded ball impact. The presented spatiotemporal transformations relying on motion gradient vector flow and polynomial regression with RBF classifier, can identify previously unseen erroneous swings (84.5-94.6%). The presented solution is able to learn from a small dataset and capture two subjective swing-technique assessment criteria from a coach. Personalised and flexible assessment criteria required for players of diverse skill levels and various coaching scenarios were demonstrated by assigning different labelling criteria for identifying similar spatiotemporal patterns of tennis swings.
- Dynamic Ensemble Selection (DES) techniques aim to select locally competent classifiers for the classification of each new test sample. Most DES techniques estimate the competence of classifiers using a given criterion over the region of competence of the test sample (its the nearest neighbors in the validation set). The K-Nearest Oracles Eliminate (KNORA-E) DES selects all classifiers that correctly classify all samples in the region of competence of the test sample, if such classifier exists, otherwise, it removes from the region of competence the sample that is furthest from the test sample, and the process repeats. When the region of competence has samples of different classes, KNORA-E can reduce the region of competence in such a way that only samples of a single class remain in the region of competence, leading to the selection of locally incompetent classifiers that classify all samples in the region of competence as being from the same class. In this paper, we propose two DES techniques: K-Nearest Oracles Borderline (KNORA-B) and K-Nearest Oracles Borderline Imbalanced (KNORA-BI). KNORA-B is a DES technique based on KNORA-E that reduces the region of competence but maintains at least one sample from each class that is in the original region of competence. KNORA-BI is a variation of KNORA-B for imbalance datasets that reduces the region of competence but maintains at least one minority class sample if there is any in the original region of competence. Experiments are conducted comparing the proposed techniques with 19 DES techniques from the literature using 40 datasets. The results show that the proposed techniques achieved interesting results, with KNORA-BI outperforming state-of-art techniques.
- Apr 20 2018 astro-ph.SR arXiv:1804.06930v1The measurement of the Sun's diameter has been first tackled by the Greek astronomers from a geometric point of view. Their estimation of ~1800", although incorrect, was not truly called into question for several centuries. The first pioneer works for measuring the Sun's diameter with an astrometric precision were made around the year 1660 by Gabriel Mouton, then by Picard and La Hire. A canonical value of the solar radius of 959".63 was adopted by Auwers in 1891. Despite considerable efforts during the second half of the XXth century, involving dedicated space instruments, no consensus was reached on this issue. However, with the advent of high sensitivity instruments on board satellites, such as the Michelson Doppler Imager (MDI) on Solar and Heliospheric Observatory (SoHO) and the Helioseismic and Magnetic Imager (HMI) aboard NASA's Solar Dynamics Observatory (SDO), it was possible to extract with an unprecedented accuracy the surface gravity oscillation f modes, over nearly two solar cycles, from 1996 to 2017. Their analysis in the range of angular degree l=140-300 shows that the so-called "seismic radius" exhibits a temporal variability in anti-phase with the solar activity. Even if the link between the two radii (photospheric and seismic) can be made only through modeling, such measurements provide an interesting alternative which led to a revision of the standard solar radius by the International Astronomical Union in 2015. This new look on such modern measurements of the Sun's global changes from 1996 to 2017 gives a new way for peering into the solar interior, mainly to better understand the subsurface fields which play an important role in the implementation of the solar cycles.
- Apr 20 2018 cs.DC arXiv:1804.06926v1We implement exact triangle counting in graphs on the GPU using three different methodologies: subgraph matching to a triangle pattern; programmable graph analytics, with a set-intersection approach; and a matrix formulation based on sparse matrix-matrix multiplies. All three deliver best-of-class performance over CPU implementations and over comparable GPU implementations, with the graph-analytic approach achieving the best performance due to its ability to exploit efficient filtering steps to remove unnecessary work and its high-performance set-intersection core.
- Apr 20 2018 cs.CL arXiv:1804.06922v1Sentences with gapping, such as Paul likes coffee and Mary tea, lack an overt predicate to indicate the relation between two or more arguments. Surface syntax representations of such sentences are often produced poorly by parsers, and even if correct, not well suited to downstream natural language understanding tasks such as relation extraction that are typically designed to extract information from sentences with canonical clause structure. In this paper, we present two methods for parsing to a Universal Dependencies graph representation that explicitly encodes the elided material with additional nodes and edges. We find that both methods can reconstruct elided material from dependency trees with high accuracy when the parser correctly predicts the existence of a gap. We further demonstrate that one of our methods can be applied to other languages based on a case study on Swedish.
- Apr 20 2018 quant-ph physics.optics arXiv:1804.06921v1Microresonator-based nonlinear processes are fundamental to applications including microcomb generation, parametric frequency conversion, and harmonics generation. While nonlinear processes involving either second- ($\chi^{(2)}$) or third- $\chi^{(3)}$) order nonlinearity have been extensively studied, the interaction between these two basic nonlinear processes has seldom been reported. In this letter, we demonstrate a coherent interplay between second- and third- order nonlinear processes. The parametric ($\chi^{(2)})$ coupling to a lossy ancillary mode shortens the lifetime of the target photonic mode and suppresses its density of states, preventing the photon emissions into the target photonic mode via Zeno effect. Such effect is then used to control the stimulated four-wave mixing process and realize a suppression ratio of $34.5$.
- Apr 20 2018 cs.CV arXiv:1804.06919v1An ever increasing amount of our digital communication, media consumption, and content creation revolves around videos. We share, watch, and archive many aspects of our lives through them, all of which are powered by strong video compression. Traditional video compression is laboriously hand designed and hand optimized. This paper presents an alternative in an end-to-end deep learning codec. Our codec builds on one simple idea: Video compression is repeated image interpolation. It thus benefits from recent advances in deep image interpolation and generation. Our deep video codec outperforms today's prevailing codecs, such as H.261, MPEG-4 Part 2, and performs on par with H.264.
- In real world systems, the predictions of deployed Machine Learned models affect the training data available to build subsequent models. This introduces a bias in the training data that needs to be addressed. Existing solutions to this problem attempt to resolve the problem by either casting this in the reinforcement learning framework or by quantifying the bias and re-weighting the loss functions. In this work, we develop a novel Adversarial Neural Network (ANN) model, an alternative approach which creates a representation of the data that is invariant to the bias. We take the Paid Search auction as our working example and ad display position features as the confounding features for this setting. We show the success of this approach empirically on both synthetic data as well as real world paid search auction data from a major search engine.
- We demonstrate that current state-of-the-art approaches to Automated Essay Scoring (AES) are not well-suited to capturing adversarially crafted input of grammatical but incoherent sequences of sentences. We develop a neural model of local coherence that can effectively learn connectedness features between sentences, and propose a framework for integrating and jointly training the local coherence model with a state-of-the-art AES model. We evaluate our approach against a number of baselines and experimentally demonstrate its effectiveness on both the AES task and the task of flagging adversarial input, further contributing to the development of an approach that strengthens the validity of neural essay scoring models.
- Recent years have witnessed significant progresses in deep Reinforcement Learning (RL). Empowered with large scale neural networks, carefully designed architectures, novel training algorithms and massively parallel computing devices, researchers are able to attack many challenging RL problems. However, in machine learning, more training power comes with a potential risk of more overfitting. As deep RL techniques are being applied to critical problems such as healthcare and finance, it is important to understand the generalization behaviors of the trained agents. In this paper, we conduct a systematic study of standard RL agents and find that they could overfit in various ways. Moreover, overfitting could happen ``robustly'': commonly used techniques in RL that add stochasticity do not necessarily prevent or detect overfitting. In particular, the same agents and learning algorithms could have drastically different test performance, even when all of them achieve optimal rewards during training. The observations call for more principled and careful evaluation protocols in RL. We conclude with a general discussion on overfitting in RL and a study of the generalization behaviors from the perspective of inductive bias.
- Apr 20 2018 math.SP arXiv:1804.06887v1After recalling a fundamental identity relating traces and modified Fredholm determinants, we apply it to a class of half-line Schrödinger operators $(- d^2/dx^2) + q$ on $(0,\infty)$ with purely discrete spectra. Roughly speaking, the class considered is generated by potentials $q$ that, for some fixed $C_0 > 0$, $\varepsilon > 0$, $x_0 \in (0, \infty)$, diverge at infinity in the manner that $q(x) \geq C_0 x^{(2/3) + \varepsilon_0}$ for all $x \geq x_0$. We treat all self-adjoint boundary conditions at the left endpoint $0$.
- Quantitative analysis of holographic microscopy images yields the three-dimensional positions of micrometer-scale colloidal particles with nanometer precision, while simultaneously measuring the particles' sizes and refractive indexes. Extracting this information begins by detecting and localizing features of interest within individual holograms. Conventionally approached with heuristic algorithms, this image analysis problem can be solved faster and more generally with machine-learning techniques. We demonstrate that two popular machine-learning algorithms, cascade classifiers and deep convolutional neural networks (CNN), can solve the particle-tracking problem orders of magnitude faster than current state-of-the-art techniques. Our CNN implementation localizes holographic features precisely enough to bootstrap more detailed analyses based on the Lorenz-Mie theory of light scattering. The wavelet-based Haar cascade performs less well at detecting and localizing particles, but is so computationally efficient that it creates new opportunities for applications that emphasize speed and low cost. We demonstrate its use as a real-time targeting system for holographic optical trapping.
- Apr 20 2018 cs.CV arXiv:1804.06882v1An increasing need of running Convolutional Neural Network (CNN) models on mobile devices with limited computing power and memory resource encourages studies on efficient model design. A number of efficient architectures have been proposed in recent years, for example, MobileNet, ShuffleNet, and NASNet-A. However, all these models are heavily dependent on depthwise separable convolution which lacks efficient implementation in most deep learning frameworks. In this study, we propose an efficient architecture named PeleeNet, which is built with conventional convolution instead. On ImageNet ILSVRC 2012 dataset, our proposed PeleeNet achieves a higher accuracy by 0.6% (71.3% vs. 70.7%) and 11% lower computational cost than MobileNet, the state-of-the-art efficient architecture. Meanwhile, PeleeNet is only 66% of the model size of MobileNet. We then propose a real-time object detection system by combining PeleeNet with Single Shot MultiBox Detector (SSD) method and optimizing the architecture for fast speed. Our proposed detection system, named Pelee, achieves 76.4% mAP (mean average precision) on PASCAL VOC2007 and 22.4 mAP on MS COCO dataset at the speed of 17.1 FPS on iPhone 6s and 23.6 FPS on iPhone 8. The result on COCO outperforms YOLOv2 in consideration of a higher precision, 13.6 times lower computational cost and 11.3 times smaller model size. The code and models are open sourced.
- We introduce a new benchmark, WinoBias, for coreference resolution focused on gender bias. Our corpus contains Winograd-schema style sentences with entities corresponding to people referred by their occupation (e.g. the nurse, the doctor, the carpenter). We demonstrate that a rule-based, a feature-rich, and a neural coreference system all link gendered pronouns to pro-stereotypical entities with higher accuracy than anti-stereotypical entities, by an average difference of 21.1 in F1 score. Finally, we demonstrate a data-augmentation approach that, in combination with existing word-embedding debiasing techniques, removes the bias demonstrated by these systems in WinoBias without significantly affecting their performance on existing coreference benchmark datasets. Our dataset and code are available at http://winobias.org.
- Training robust deep networks is challenging under noisy labels. Current methodologies focus on estimating the noise transition matrix. However, this matrix is not easy to be estimated exactly. In this paper, free of the matrix estimation, we present a simple but robust learning paradigm called "Co-sampling", which can train deep networks robustly under extremely noisy labels. Briefly, our paradigm trains two networks simultaneously. In each mini-batch data, each network samples its small-loss instances, and cross-trains on such instances from its peer network. We conduct experiments on several simulated noisy datasets. Empirical results demonstrate that, under extremely noisy labels, the Co-sampling approach trains deep learning models robustly.
- Visual reasoning with compositional natural language instructions, e.g., based on the newly-released Cornell Natural Language Visual Reasoning (NLVR) dataset, is a challenging task, where the model needs to have the ability to create an accurate mapping between the diverse phrases and the several objects placed in complex arrangements in the image. Further, this mapping needs to be processed to answer the question in the statement given the ordering and relationship of the objects across three similar images. In this paper, we propose a novel end-to-end neural model for the NLVR task, where we first use joint bidirectional attention to build a two-way conditioning between the visual information and the language phrases. Next, we use an RL-based pointer network to sort and process the varying number of unordered objects (so as to match the order of the statement phrases) in each of the three images and then pool over the three decisions. Our model achieves strong improvements (of 4-6% absolute) over the state-of-the-art on both the structured representation and raw image versions of the dataset.
- Apr 20 2018 cs.CL arXiv:1804.06868v1We propose a context-dependent model to map utterances within an interaction to executable formal queries. To incorporate interaction history, the model maintains an interaction-level encoder that updates after each turn, and can copy sub-sequences of previously predicted queries during generation. Our approach combines implicit and explicit modeling of references between utterances. We evaluate our model on the ATIS flight planning interactions, and demonstrate the benefits of modeling context and explicit references.
- Apr 20 2018 cs.GT arXiv:1804.06867v1We study revenue maximization by deterministic mechanisms for the simplest case for which Myerson's characterization does not hold: a single seller selling two items, with independently distributed values, to a single additive buyer. We prove that optimal mechanisms are submodular and hence monotone. Furthermore, we show that in the IID case, optimal mechanisms are symmetric. Our characterizations are surprisingly non-trivial, and we show that they fail to extend in several natural ways, e.g. for correlated distributions or more than two items. In particular, this shows that the optimality of symmetric mechanisms does not follow from the symmetry of the IID distribution.