# Top arXiv papers

• A concept of measuring the quantum distance between two different quantum states which is called quantum information metric is presented. The holographic principle (AdS/CFT) suggests that the quantum information metric $G_{\lambda\lambda}$ between perturbed state and unperturbed state in field theory has a dual description in the classical gravity. In this work we calculate the quantum information metric of a theory which is dual to a conical defect geometry and we show that it is $n$ times the one of its covering space. We also give a holographic check for our result in the gravity side. Meanwhile, it was argued that $G_{\lambda\lambda}$ is dual to a codimension-one surface in spacetime and satisfies $G_{\lambda\lambda}=n_{d}\cdot\mbox{Vol}(\Sigma_{max})/L^{d}$. We show that the coefficient $n_d$ for conical defect should be rescaled by $n^2$ from the one for AdS. A limit case of conical defect --- the massless BTZ black hole--- is also considered. We show that the quantum information metric of the massless BTZ black hole disagrees with the one obtained by taking the vanishing temperature limit in BTZ black hole. This provides a new arena in differiating the different phases between BTZ spacetime and its massless cousin.
• The orbifold construction $A\mapsto A^G$ for a finite group $G$ is fundamental in rational conformal field theory. The construction of $Rep(A^G)$ from $Rep(A)$ on the categorical level, often called gauging, is also prominent in the study of topological phases of matter. The key step in this construction is to find a braided $G$-crossed extension of $\mathcal{C}$ compatible with a given action of $G$ on a non-degenerate braided fusion category $\mathcal{C}$. The extension theory of Etingof-Nikshych-Ostrik gives two obstructions for this problem, $o_3\in H^3(G)$ and $o_4\in H^4(G)$ for certain coefficients, the latter of which depends on a categorical lifting of the action and is notoriously difficult to compute. We show that in the case where $G\le S_n$ acts by permutations on $\mathcal{C}^{\boxtimes n}$, both of these obstructions vanish. This verifies a conjecture of Müger, and constitutes a nontrivial test of the conjecture that all modular tensor categories come from vertex operator algebras or conformal nets.
• Defined by a single axiom, finite abstract simplicial complexes belong to the simplest constructs of mathematics. We look at a a few theorems.
• We give a simple distributed algorithm for computing adjacency matrix eigenvectors for the communication graph in an asynchronous gossip model. We show how to use this algorithm to give state-of-the-art asynchronous community detection algorithms when the communication graph is drawn from the well-studied \emphstochastic block model. Our methods also apply to a natural alternative model of randomized communication, where nodes within a community communicate more frequently than nodes in different communities. Our analysis simplifies and generalizes prior work by forging a connection between asynchronous eigenvector computation and Oja's algorithm for streaming principal component analysis. We hope that our work serves as a starting point for building further connections between the analysis of stochastic iterative methods, like Oja's algorithm, and work on asynchronous and gossip-type algorithms for distributed computation.
• We design new algorithms for maximizing a monotone non-negative submodular function under various constraints, which improve the state-of-the-art in time complexity and/or performance guarantee. We first investigate the cardinality constrained submodular maximization problem that has been widely studied for about four decades. We design an $(1-\frac{1}{e}-\varepsilon)$-approximation algorithm that makes $O(n\cdot \max \{\varepsilon^{-1},\log\log k \})$ queries. To the best of our knowledge, this is the fastest known algorithm. We further answer the open problem on finding a lower bound on the number of queries. We show that, no (randomized) algorithm can achieve a ratio better than $(\frac{1}{2}+\Theta(1))$ with $o(\frac{n}{\log n})$ queries. The acceleration above is achieved by our \emphAdaptive Decreasing Threshold (ADT) algorithm. Based on ADT, we study the $p$-system and $d$ knapsack constrained maximization problem. We show that an $(1/(p+\frac{7}{4}d+1)-\varepsilon)$-approximate solution can be computed via $O(\frac{n}{\varepsilon}\log \frac{n}{\varepsilon}\max\{\log \frac{1}{\varepsilon},\log\log n\})$ queries. Note that it improves the state of the art in both time complexity and approximation ratio. We also show how to improve the ratio for a single knapsack constraint via $O(n\cdot \max \{\varepsilon^{-1},\log\log k \})$ queries. For maximizing a submodular function with curvature $\kappa$ under matroid constraint, we show an $(1-\frac{\kappa}{e}-\varepsilon)$-approximate algorithm that uses $\tilde{O}(nk)$ value oracle queries. Our ADT could be utilized to obtain faster algorithms in other problems. To prove our results, we introduce a general characterization between randomized complexity and deterministic complexity of approximation algorithms that could be used in other problems and may be interesting in its own right.
• We propose an algebraic approach to proving circuit lower bounds for ACC0 by defining and studying the notion of torus polynomials. We show how currently known polynomial-based approximation results for AC0 and ACC0 can be reformulated in this framework, implying that ACC0 can be approximated by low-degree torus polynomials. Furthermore, as a step towards proving ACC0 lower bounds for the majority function via our approach, we show that MAJORITY cannot be approximated by low-degree symmetric torus polynomials. We also pose several open problems related to our framework.
• We initiate the study of Boolean function analysis on high-dimensional expanders. We describe an analog of the Fourier expansion and of the Fourier levels on simplicial complexes, and generalize the FKN theorem to high-dimensional expanders. Our results demonstrate that a high-dimensional expanding complex $X$ can sometimes serve as a sparse model for the Boolean slice or hypercube, and quite possibly additional results from Boolean function analysis can be carried over to this sparse model. Therefore, this model can be viewed as a derandomization of the Boolean slice, containing $|X(k)|=O(n)$ points in comparison to $\binom{n}{k+1}$ points in the $(k+1)$-slice (which consists of all $n$-bit strings with exactly $k+1$ ones).
• Gao's quantum union bound is a generalization of the union bound from probability theory and finds a range of applications in quantum communication theory, quantum algorithms, and quantum complexity theory [Phys. Rev. A, 92(5):052331, 2015]. It is relevant when performing a sequence of binary-outcome quantum measurements on a quantum state, giving the same bound that the classical union bound would, except with a scaling factor of four. In this paper, we improve upon Gao's quantum union bound, by proving a quantum union bound that involves a tunable parameter that can be optimized. This tunable parameter plays a similar role to a parameter involved in the Hayashi-Nagaoka inequality [IEEE Trans. Inf. Theory, 49(7):1753 (2003)], used often in quantum information theory when analyzing the error probability of a square-root measurement. An advantage of the proof delivered here is that it is elementary, relying only on basic properties of projectors, the Pythagorean theorem, and the Cauchy--Schwarz inequality. As a non-trivial application of our quantum union bound, we prove that a sequential decoding strategy for classical communication over a quantum channel achieves a lower bound on the channel's second-order coding rate. This demonstrates the advantage of our quantum union bound in the non-asymptotic regime, in which a communication channel is called a finite number of times.
• Grover's algorithm provides quantum computers a quadratic advantage over classical computers for searching in an arbitrary data-set. Bitcoin mining falls into this category of a search problem. It has been previously argued that the only side-effect of quantum mining would be an increased difficulty, due to this quadratic speed-up which can be applied to Bitcoin mining. In this work we argue that a crucial argument in the analysis of Bitcoin's security breaks down when quantum mining is performed. Classically, a Bitcoin fork occurs rarely, when two miners find a block almost at the same time: only if both miners are unaware of the other's block, due to propagation time effects. The situation differs dramatically when quantum miners use Grover's algorithm. Grover's algorithm repeatedly applies a procedure called a Grover iteration. More iterations provide a quadratically higher chance of finding a block. Crucially, a miner does not have to choose how many iterations to apply in advance. Suppose Alice receives Bob's new block. To maximize her revenue, she should stop applying Grover iterations and measure her state. Her hope is that her block (rather than Bob's) would become part of the longest chain. This strong correlation between the miners' actions, and the fact that they all measure their state at the same time, may lead to more forks. This is known as a security risk for Bitcoin. We propose a mechanism which, we conjecture, prohibits this form of quantum mining, and circumvents the high rate of forks.
• We consider the problem of sampling from the Potts model on random regular graphs. It is conjectured that sampling is possible when the temperature of the model is in the uniqueness regime of the regular tree, but positive algorithmic results have been for the most part elusive. In this paper, for all integers $q\geq 3$ and $\Delta\geq 3$, we develop algorithms that produce samples within error $o(1)$ from the $q$-state Potts model on random $\Delta$-regular graphs, whenever the temperature is in uniqueness, for both the ferromagnetic and antiferromagnetic cases. The algorithm for the antiferromagnetic Potts model is based on iteratively adding the edges of the graph and resampling a bichromatic class that contains the endpoints of the newly added edge. Key to the algorithm is how to perform the resampling step efficiently since bichromatic classes may induce linear-sized components. To this end, we exploit the tree uniqueness to show that the average growth of bichromatic components is typically small, which allows us to use correlation decay algorithms for the resampling step. While the precise uniqueness threshold on the tree is not known for general values of $q$ and $\Delta$ in the antiferromagnetic case, our algorithm works throughout uniqueness regardless of its value. In the case of the ferromagnetic Potts model, we simplify the algorithm significantly by utilising the random-cluster representation of the model. In particular, we show that a percolation-type algorithm succeeds in sampling from the random-cluster model with parameters $p,q$ on random $\Delta$-regular graphs for all values of $q\geq 1$ and $p<p_c(q,\Delta)$, where $p_c(q,\Delta)$ corresponds to a uniqueness threshold for the model on the $\Delta$-regular tree. When restricted to integer values of $q$, this yields a simplified algorithm for the ferromagnetic Potts model on random $\Delta$-regular graphs.
• Online matching problems have garnered significant attention in recent years due to numerous applications in e-commerce, online advertisements, ride-sharing, etc. Many of them capture the uncertainty in the real world by including stochasticity in both the arrival process and the matching process. The Online Stochastic Matching with Timeouts problem introduced by Bansal, et al., (Algorithmica, 2012) models matching markets (e.g., E-Bay, Amazon). Buyers arrive from an independent and identically distributed (i.i.d.) known distribution on buyer profiles and can be shown a list of items one at a time. Each buyer has some probability of purchasing each item and a limit (timeout) on the number of items they can be shown. Bansal et al., (Algorithmica, 2012) gave a 0.12-competitive algorithm which was improved by Adamczyk, et al., (ESA, 2015) to 0.24. We present an online attenuation framework that uses an algorithm for offline stochastic matching as a black box. On the upper bound side, we show that this framework, combined with a black-box adapted from Bansal et al., (Algorithmica, 2012), yields an online algorithm which nearly doubles the ratio to 0.46. On the lower bound side, we show that no algorithm can achieve a ratio better than 0.632 using the standard LP for this problem. This framework has a high potential for further improvements since new algorithms for offline stochastic matching can directly improve the ratio for the online problem. Our online framework also has the potential for a variety of extensions. For example, we introduce a natural generalization: Online Stochastic Matching with Two-sided Timeouts in which both online and offline vertices have timeouts. Our framework provides the first algorithm for this problem achieving a ratio of 0.30. We once again use the algorithm of Adamczyk et al., (ESA, 2015) as a black-box and plug-it into our framework.
• We design new differentially private algorithms for the Euclidean k-means problem, both in the centralized model and in the local model of differential privacy. In both models, our algorithms achieve significantly improved error rates over the previous state-of-the-art. In addition, in the local model, our algorithm significantly reduces the number of needed interactions. Although the problem has been widely studied in the context of differential privacy, all of the existing constructions achieve only super constant approximation factors. We present, for the first time, efficient private algorithms for the problem with constant multiplicative error.
• A key problem in research on adversarial examples is that vulnerability to adversarial examples is usually measured by running attack algorithms. Because the attack algorithms are not optimal, the attack algorithms are prone to overestimating the size of perturbation needed to fool the target model. In other words, the attack-based methodology provides an upper-bound on the size of a perturbation that will fool the model, but security guarantees require a lower bound. CLEVER is a proposed scoring method to estimate a lower bound. Unfortunately, an estimate of a bound is not a bound. In this report, we show that gradient masking, a common problem that causes attack methodologies to provide only a very loose upper bound, causes CLEVER to overestimate the size of perturbation needed to fool the model. In other words, CLEVER does not resolve the key problem with the attack-based methodology, because it fails to provide a lower bound.
• We revisit the question of reducing online learning to approximate optimization of the offline problem. In this setting, we give two algorithms with near-optimal performance in the full information setting: they guarantee optimal regret and require only poly-logarithmically many calls to the approximation oracle per iteration. Furthermore, these algorithms apply to the more general improper learning problems. In the bandit setting, our algorithm also significantly improves the best previously known oracle complexity while maintaining the same regret.
• This work considers the question: what convergence guarantees does the stochastic subgradient method have in the absence of smoothness and convexity? We prove that the stochastic subgradient method, on any semialgebraic locally Lipschitz function, produces limit points that are all first-order stationary. More generally, our result applies to any function with a Whitney stratifiable graph. In particular, this work endows the stochastic subgradient method with rigorous convergence guarantees for a wide class of problems arising in data science---including all popular deep learning architectures.
• The current dominant paradigm for imitation learning relies on strong supervision of expert actions to learn both 'what' and 'how' to imitate. We pursue an alternative paradigm wherein an agent first explores the world without any expert supervision and then distills its experience into a goal-conditioned skill policy with a novel forward consistency loss. In our framework, the role of the expert is only to communicate the goals (i.e., what to imitate) during inference. The learned policy is then employed to mimic the expert (i.e., how to imitate) after seeing just a sequence of images demonstrating the desired task. Our method is 'zero-shot' in the sense that the agent never has access to expert actions during training or for the task demonstration at inference. We evaluate our zero-shot imitator in two real-world settings: complex rope manipulation with a Baxter robot and navigation in previously unseen office environments with a TurtleBot. Through further experiments in VizDoom simulation, we provide evidence that better mechanisms for exploration lead to learning a more capable policy which in turn improves end task performance. Videos, models, and more details are available at https://pathak22.github.io/zeroshot-imitation/
• Dictionary learning is a popular approach for inferring a hidden basis or dictionary in which data has a sparse representation. Data generated from the dictionary A (an n by m matrix, with m > n in the over-complete setting) is given by Y = AX where X is a matrix whose columns have supports chosen from a distribution over k-sparse vectors, and the non-zero values chosen from a symmetric distribution. Given Y, the goal is to recover A and X in polynomial time. Existing algorithms give polytime guarantees for recovering incoherent dictionaries, under strong distributional assumptions both on the supports of the columns of X, and on the values of the non-zero entries. In this work, we study the following question: Can we design efficient algorithms for recovering dictionaries when the supports of the columns of X are arbitrary? To address this question while circumventing the issue of non-identifiability, we study a natural semirandom model for dictionary learning where there are a large number of samples $y=Ax$ with arbitrary k-sparse supports for x, along with a few samples where the sparse supports are chosen uniformly at random. While the few samples with random supports ensures identifiability, the support distribution can look almost arbitrary in aggregate. Hence existing algorithmic techniques seem to break down as they make strong assumptions on the supports. Our main contribution is a new polynomial time algorithm for learning incoherent over-complete dictionaries that works under the semirandom model. Additionally the same algorithm provides polynomial time guarantees in new parameter regimes when the supports are fully random. Finally using these techniques, we also identify a minimal set of conditions on the supports under which the dictionary can be (information theoretically) recovered from polynomial samples for almost linear sparsity, i.e., $k=\tilde{O}(n)$.
• Current neural network-based classifiers are susceptible to adversarial examples even in the black-box setting, where the attacker only has query access to the model. In practice, the threat model for real-world systems is often more restrictive than the typical black-box model of full query access. We define three realistic threat models that more accurately characterize many real-world classifiers: the query-limited setting, the partial-information setting, and the label-only setting. We develop new attacks that fool classifiers under these more restrictive threat models, where previous methods would be impractical or ineffective. We demonstrate that our methods are effective against an ImageNet classifier under our proposed threat models. We also demonstrate a targeted black-box attack against a commercial classifier, overcoming the challenges of limited query access, partial information, and other practical issues to attack the Google Cloud Vision API.
• Deep Reinforcement Learning (deep RL) has made several breakthroughs in recent years in applications ranging from complex control tasks in unmanned vehicles to game playing. Despite their success, deep RL still lacks several important capacities of human intelligence, such as transfer learning, abstraction and interpretability. Deep Symbolic Reinforcement Learning (DSRL) seeks to incorporate such capacities to deep Q-networks (DQN) by learning a relevant symbolic representation prior to using Q-learning. In this paper, we propose a novel extension of DSRL, which we call Symbolic Reinforcement Learning with Common Sense (SRL+CS), offering a better balance between generalization and specialization, inspired by principles of common sense when assigning rewards and aggregating Q-values. Experiments reported in this paper show that SRL+CS learns consistently faster than Q-learning and DSRL, achieving also a higher accuracy. In the hardest case, where agents were trained in a deterministic environment and tested in a random environment, SRL+CS achieves nearly 100% average accuracy compared to DSRL's 70% and DQN's 50% accuracy. To the best of our knowledge, this is the first case of near perfect zero-shot transfer learning using Reinforcement Learning.
• Many tasks are related to determining if a particular text string exists in an image. In this work, we propose a new framework that learns this task in an end-to-end way. The framework takes an image and a text string as input and then outputs the probability of the text string being present in the image. This is the first end-to-end framework that learns such relationships between text and images in scene text area. The framework does not require explicit scene text detection or recognition and thus no bounding box annotations are needed for it. It is also the first work in scene text area that tackles suh a weakly labeled problem. Based on this framework, we developed a model called Guided Attention. Our designed model achieves much better results than several state-of-the-art scene text reading based solutions for a challenging Street View Business Matching task. The task tries to find correct business names for storefront images and the dataset we collected for it is substantially larger, and more challenging than existing scene text dataset. This new real-world task provides a new perspective for studying scene text related problems. We also demonstrate the uniqueness of our task via a comparison between our problem and a typical Visual Question Answering problem.
• Apr 24 2018 astro-ph.CO arXiv:1804.08581v1
The recent detection of the "cosmic dawn" redshifted 21 cm signal at 78 MHz by the EDGES experiment differs significantly from theoretical predictions. In particular, the absorption trough is roughly a factor of two stronger than the most optimistic theoretical models. The early interpretations of the origin of this discrepancy fall into two categories. The first is that there is increased cooling of the gas due to interactions with dark matter, while the second is that the background radiation field includes a contribution from a component in addition to the cosmic microwave background. In this paper we examine the feasibility of the second idea using new data from the first station of the Long Wavelength Array. The data span 40 to 80 MHz and provide important constraints on the present-day background in a frequency range where there are few surveys with absolute temperature calibration suitable for measuring the strength of the radio monopole. We find support for a strong, diffuse radio background that was suggested by the ARCARDE 2 results in the 3 to 10 GHz range. We find that this background is well modeled by a power law with a spectral index of $-$2.58$\pm$0.05 and a temperature at the rest frame 21 cm frequency of 603$^{+102}_{-92}$ mK.
• Unconstrained remote gaze tracking using off-the-shelf cameras is a challenging problem. Recently, promising algorithms for appearance-based gaze estimation using convolutional neural networks (CNN) have been proposed. Improving their robustness to various confounding factors including variable head pose, subject identity, illumination and image quality remain open problems. In this work, we study the effect of variable head pose on machine learning regressors trained to estimate gaze direction. We propose a novel branched CNN architecture that improves the robustness of gaze classifiers to variable head pose, without increasing computational cost. We also present various procedures to effectively train our gaze network including transfer learning from the more closely related task of object viewpoint estimation and from a large high-fidelity synthetic gaze dataset, which enable our ten times faster gaze network to achieve competitive accuracy to its current state-of-the-art direct competitor.
• We introduce and study the permanence properties of the class of linear transfers between probability measures. This class contains all cost minimizing mass transports, but also martingale mass transports, the Schr?odinger bridge associated to a reversible Markov process, and the weak mass transports of Tala- grand, Marton, Gozlan and others. The class also includes various stochastic mass transports to which Monge-Kantorovich theory does not apply. We also introduce the cone of convex transfers, which include any p-power (p > 1) of a linear transfer, but also the logarithmic entropy, the Donsker-Varadhan infor- mation and certain free energy functionals. This first paper is mostly about exhibiting examples that point to the pervasiveness of the concept in the important work on correlating probability distributions. Duality formulae for general transfer inequalities follow in a very natural way. We also study the infinite self-convolution of a linear transfer in order to establish the existence of generalized weak KAM solutions that could be applied to the stochastic counterpart of Fathi-Mather theory.
• We introduce a dynamical spatio-temporal model formalized as a recurrent neural network for forecasting time series of spatial processes, i.e. series of observations sharing temporal and spatial dependencies. The model learns these dependencies through a structured latent dynamical component, while a decoder predicts the observations from the latent representations. We consider several variants of this model, corresponding to different prior hypothesis about the spatial relations between the series. The model is evaluated and compared to state-of-the-art baselines, on a variety of forecasting problems representative of different application areas: epidemiology, geo-spatial statistics and car-traffic prediction. Besides these evaluations, we also describe experiments showing the ability of this approach to extract relevant spatial relations.
• Machine learning shows promise for improving our understanding of many-body problems. Tackling an unsolved problem, or classifying intricate phases, remains however a daunting task. Building on a recently introduced interpretable supervised learning scheme, we introduce a generic protocol to probe and identify nonlinear orientational spin order. We extract the tensorial order parameter up to rank 6. Moreover, we find that our approach yields reliable results already for a modest amount of training data and without knowledge of the exact transition temperature. Our approach may prove useful for identifying novel spin order and ruling out spurious spin liquid candidates.
• Training deep neural networks on images represented as grids of pixels has brought to light an interesting phenomenon known as adversarial examples. Inspired by how humans reconstruct abstract concepts, we attempt to codify the input bitmap image into a set of compact, interpretable elements to avoid being fooled by the adversarial structures. We take the first step in this direction by experimenting with image vectorization as an input transformation step to map the adversarial examples back into the natural manifold of MNIST handwritten digits. We compare our method vs. state-of-the-art input transformations and further discuss the trade-offs between a hand-designed and a learned transformation defense.
• We present a novel architectural enhancement of Channel Boosting in deep CNN. This idea of Channel Boosting exploits both the channel dimension of CNN (learning from multiple channels) and Transfer learning (TL). TL is utilized at two different stages, channel generation and channel exploitation. A deep CNN is boosted by various channels available through TL from already trained Deep NN, in addition to its own original channel. The deep architecture of CNN then exploits the original and boosted channels down the stream for learning discriminative patterns. Churn prediction in telecom is a challenging task due to high dimensionality and imbalance nature of the data and it is therefore used to evaluate the performance of the proposed Channel Boosted CNN. In the first phase, discriminative informative features are being extracted using a staked auto encoder, and then in the second phase, these features are combined with the original features to form Channel Boosted images. Finally, a CNN is employed for exploiting of transfer learning and performing classification. The results are promising and show the ability of the Channel Boosting concept in learning complex classification problem by discerning even minute differences in churners and non churners The evolution of recent architectures provides the concept that the restructuring of the architecture can make learning simplify with substantial improvement in performance.
• In this lecture the basic concepts of electromagnetic waves in accelerating structures are discussed. After a short introduction on the propagation of electromagnetic waves and on the concept of travelling wave and standing wave structures, the most important parameters and figures of merit used to describe the acceleration efficiency and the performance of accelerating structures in terms of power consumption is presented. The process of RF design optimization is also discussed. Finally a review of several type of structures using different accelerating modes is given, with examples of interest for medical accelerators.
• Gait as a biometric property for person identification plays a key role in video surveillance and security applications. In gait recognition, normally, gait feature such as Gait Energy Image (GEI) is extracted from one full gait cycle. However in many circumstances, such a full gait cycle might not be available due to occlusion. Thus, the GEI is not complete giving rise to a degrading in gait-based person identification rate. In this paper, we address this issue by proposing a novel method to identify individuals from gait feature when a few (or even single) frame(s) is available. To do so, we propose a deep learning-based approach to transform incomplete GEI to the corresponding complete GEI obtained from a full gait cycle. More precisely, this transformation is done gradually by training several auto encoders independently and then combining these as a uniform model. Experimental results on two public gait datasets, namely OULP and Casia-B demonstrate the validity of the proposed method in dealing with very incomplete gait cycles.
• In natural language understanding, many challenges require learning relationships between two sequences for various tasks such as similarity, relatedness, paraphrasing and question matching. Some of these challenges are inherently closer in nature, hence the knowledge acquired from one task to another is easier acquired and adapted. However, transferring all knowledge might be undesired and can lead to sub-optimal results due to \textitnegative transfer. Hence, this paper focuses on the transferability of both instances and parameters across natural language understanding tasks using an ensemble-based transfer learning method to circumvent such issues. The primary contribution of this paper is the combination of both \textitDropout and \textitBagging for improved transferability in neural networks, referred to as \textitDropping herein. Secondly, we present a straightforward yet novel approach to incorporating source \textitDropping Networks to a target task for few-shot learning that mitigates \textitnegative transfer. This is achieved by using a decaying parameter chosen according to the slope changes of a smoothed spline error curve at sub-intervals during training. We compare the approach over the hard parameter sharing, soft parameter sharing and single-task learning to compare its effectiveness. The aforementioned adjustment leads to improved transfer learning performance and comparable results to the current state of the art only using few instances from the target task.
• The process of aligning a pair of shapes is a fundamental operation in computer graphics. Traditional approaches rely heavily on matching corresponding points or features to guide the alignment, a paradigm that falters when significant shape portions are missing. These techniques generally do not incorporate prior knowledge about expected shape characteristics, which can help compensate for any misleading cues left by inaccuracies exhibited in the input shapes. We present an approach based on a deep neural network, leveraging shape datasets to learn a shape-aware prior for source-to-target alignment that is robust to shape incompleteness. In the absence of ground truth alignments for supervision, we train a network on the task of shape alignment using incomplete shapes generated from full shapes for self-supervision. Our network, called ALIGNet, is trained to warp complete source shapes to incomplete targets, as if the target shapes were complete, thus essentially rendering the alignment partial-shape agnostic. We aim for the network to develop specialized expertise over the common characteristics of the shapes in each dataset, thereby achieving a higher-level understanding of the expected shape space to which a local approach would be oblivious. We constrain ALIGNet through an anisotropic total variation identity regularization to promote piecewise smooth deformation fields, facilitating both partial-shape agnosticism and post-deformation applications. We demonstrate that ALIGNet learns to align geometrically distinct shapes, and is able to infer plausible mappings even when the target shape is significantly incomplete. We show that our network learns the common expected characteristics of shape collections, without over-fitting or memorization, enabling it to produce plausible deformations on unseen data during test time.
• In this paper we discuss how we can design Hamiltonians to implement quantum algorithms, in particular we focus in Deutsch and Grover algorithms. As main result of this paper, we show how Hamiltonian inverse quantum engineering method allow us to obtain feasible and time-independent Hamiltonians for implementing such algorithms. From our approach for the Deutsch algorithm, different from others techniques, we can provide an alternative approach for implementing such algorithm where no auxiliary qubit and additional resources are required. In addition, by using a single quantum evolution, the Grover algorithm can be achieved with high probability $1-\epsilon^2$, where $\epsilon$ is a very small arbitrary parameter.
• Internet users today are constantly giving away their personal information and privacy through social media, tracking cookies, 'free' email, and single sign-on authentication in order to access convenient online services. Unfortunately, the elected officials who are supposed to be regulating these technologies often know less about informed consent and data ownership than the users themselves. This is why without changes, internet users may continue to be exploited by companies offering free and convenient online services.
• In this paper, we address a relatively new task: prediction of ASR performance on unseen broadcast programs. We first propose an heterogenous French corpus dedicated to this task. Two prediction approaches are compared: a state-of-the-art performance prediction based on regression (engineered features) and a new strategy based on convolutional neural networks (learnt features). We particularly focus on the combination of both textual (ASR transcription) and signal inputs. While the joint use of textual and signal features did not work for the regression baseline, the combination of inputs for CNNs leads to the best WER prediction performance. We also show that our CNN prediction remarkably predicts the WER distribution on a collection of speech recordings.
• Heap layout manipulation is integral to exploiting heap-based memory corruption vulnerabilities. In this paper we present the first automatic approach to the problem, based on pseudo-random black-box search. Our approach searches for the inputs required to place the source of a heap-based buffer overflow or underflow next to heap-allocated objects that an exploit developer, or automatic exploit generation system, wishes to read or corrupt. We present a framework for benchmarking heap layout manipulation algorithms, and use it to evaluate our approach on several real-world allocators, showing that pseudo-random black-box search can be highly effective. We then present SHRIKE, a novel system that can perform automatic heap layout manipulation on the PHP interpreter and can be used in the construction of control-flow hijacking exploits. Starting from PHP's regression tests, SHRIKE discovers fragments of PHP code that interact with the interpreter's heap in useful ways, such as making allocations and deallocations of particular sizes, or allocating objects containing sensitive data, such as pointers. SHRIKE then uses our search algorithm to piece together these fragments into programs, searching for one that achieves a desired heap layout. SHRIKE allows an exploit developer to focus on the higher level concepts in an exploit, and to defer the resolution of heap layout constraints to SHRIKE. We demonstrate this by using SHRIKE in the construction of a control-flow hijacking exploit for the PHP interpreter.
• Many low-light enhancement methods ignore intensive noise in original images. As a result, they often simultaneously enhance the noise as well. Furthermore, extra denoising procedures adopted by most methods ruin the details. In this paper, we introduce a joint low-light enhancement and denoising strategy, aimed at obtaining well-enhanced low-light images while getting rid of the inherent noise issue simultaneously. The proposed method performs Retinex model based decomposition in a successive sequence, which sequentially estimates a piece-wise smoothed illumination and a noise-suppressed reflectance. After getting the illumination and reflectance map, we adjust the illumination layer and generate our enhancement result. In this noise-suppressed sequential decomposition process we enforce the spatial smoothness on each component and skillfully make use of weight matrices to suppress the noise and improve the contrast. Results of extensive experiments demonstrate the effectiveness and practicability of our method. It performs well for a wide variety of images, and achieves better or comparable quality compared with the state-of-the-art methods.
• Drones are being used in many industries for a variety of applications, including inspecting bridges, surveying farm land, and delivering cargo. Automating these kinds of scenarios requires more than following a sequence of GPS waypoints; they require integrating on-device hardware with real-time analysis to provide feedback and control to the drone. Currently, implementing these kinds of advanced scenarios is a complex task, requiring skilled software engineers programming with drone APIs. We envision an alternate model to enable drone operators to orchestrate advanced behaviors using a card-based approach. We describe the design of our card-based programming model, position it relative to other visual programming metaphors, share results from our paper prototype user study, and discuss our learnings from its implementation. Results suggest that a wide range of scenarios can be implemented with moderate mental effort and learning, balanced by intuitiveness and engagement.
• In this work, we focus on the problem of grounding language by training an agent to follow a set of natural language instructions and navigate to a target object in an environment. The agent receives visual information through raw pixels and a natural language instruction telling what task needs to be achieved. Other than these two sources of information, our model does not have any prior information of both the visual and textual modalities and is end-to-end trainable. We develop an attention mechanism for multi-modal fusion of visual and textual modalities that allows the agent to learn to complete the task and also achieve language grounding. Our experimental results show that our attention mechanism outperforms the existing multi-modal fusion mechanisms proposed for both 2D and 3D environments in order to solve the above mentioned task. We show that the learnt textual representations are semantically meaningful as they follow vector arithmetic and are also consistent enough to induce translation between instructions in different natural languages. We also show that our model generalizes effectively to unseen scenarios and exhibit \textitzero-shot generalization capabilities both in 2D and 3D environments. The code for our 2D environment as well as the models that we developed for both 2D and 3D are available at \hrefhttps://github.com/rl-lang-grounding/rl-lang-groundhttps://github.com/rl-lang-grounding/rl-lang-ground
• Apr 24 2018 cs.CV cs.LG stat.ML arXiv:1804.08450v1
Batch Normalization (BN) is capable of accelerating the training of deep models by centering and scaling activations within mini-batches. In this work, we propose Decorrelated Batch Normalization (DBN), which not just centers and scales activations but whitens them. We explore multiple whitening techniques, and find that PCA whitening causes a problem we call stochastic axis swapping, which is detrimental to learning. We show that ZCA whitening does not suffer from this problem, permitting successful learning. DBN retains the desirable qualities of BN and further improves BN's optimization efficiency and generalization ability. We design comprehensive experiments to show that DBN can improve the performance of BN on multilayer perceptrons and convolutional neural networks. Furthermore, we consistently improve the accuracy of residual networks on CIFAR-10, CIFAR-100, and ImageNet.
• In the paper some conditions for finite-time stability of the solution to differential inclusion with Carathéodory right-hand side are given. The main tools are a new Gronwall's like inequality and some corollaries from the Viability Theorem. Example of neural network which is finite-time stable is given.
• Voice conversion (VC) aims at conversion of speaker characteristic without altering content. Due to training data limitations and modeling imperfections, it is difficult to achieve believable speaker mimicry without introducing processing artifacts; performance assessment of VC, therefore, usually involves both speaker similarity and quality evaluation by a human panel. As a time-consuming, expensive, and non-reproducible process, it hinders rapid prototyping of new VC technology. We address artifact assessment using an alternative, objective approach leveraging from prior work on spoofing countermeasures (CMs) for automatic speaker verification. Therein, CMs are used for rejecting fake' inputs such as replayed, synthetic or converted speech but their potential for automatic speech artifact assessment remains unknown. This study serves to fill that gap. As a supplement to subjective results for the 2018 Voice Conversion Challenge (VCC'18) data, we configure a standard constant-Q cepstral coefficient CM to quantify the extent of processing artifacts. Equal error rate (EER) of the CM, a confusability index of VC samples with real human speech, serves as our artifact measure. Two clusters of VCC'18 entries are identified: low-quality ones with detectable artifacts (low EERs), and higher quality ones with less artifacts. None of the VCC'18 systems, however, is perfect: all EERs are < 30 % (the ideal' value would be 50 %). Our preliminary findings suggest potential of CMs outside of their original application, as a supplemental optimization and benchmarking tool to enhance VC technology.
• Computer Vision-based natural feature tracking is at the core of modern Augmented Reality applications. Still, Web-based Augmented Reality typically relies on location-based sensing (using GPS and orientation sensors) or marker-based approaches to solve the pose estimation problem. We present an implementation and evaluation of an efficient natural feature tracking pipeline for standard Web browsers using HTML5 and WebAssembly. Our system can track image targets at real-time frame rates tablet PCs (up to 60 Hz) and smartphones (up to 25 Hz).
• This paper describes and evaluates the use of Generative Adversarial Networks (GANs) for path planning in support of smart mobility applications such as indoor and outdoor navigation applications, individualized wayfinding for people with disabilities (e.g., vision impairments, physical disabilities, etc.), path planning for evacuations, robotic navigations, and path planning for autonomous vehicles. We propose an architecture based on GANs to recommend accurate and reliable paths for navigation applications. The proposed system can use crowd-sourced data to learn the trajectories and infer new ones. The system provides users with generated paths that help them navigate from their local environment to reach a desired location. As a use case, we experimented with the proposed method in support of a wayfinding application in an indoor environment. Our experiments assert that the generated paths are correct and reliable. The accuracy of the classification task for the generated paths is up to 99% and the quality of the generated paths has a mean opinion score of 89%.
• In this paper, we propose a novel abnormal event detection method with spatio-temporal adversarial networks (STAN). We devise a spatio-temporal generator which synthesizes an inter-frame by considering spatio-temporal characteristics with bidirectional ConvLSTM. A proposed spatio-temporal discriminator determines whether an input sequence is real-normal or not with 3D convolutional layers. These two networks are trained in an adversarial way to effectively encode spatio-temporal features of normal patterns. After the learning, the generator and the discriminator can be independently used as detectors, and deviations from the learned normal patterns are detected as abnormalities. Experimental results show that the proposed method achieved competitive performance compared to the state-of-the-art methods. Further, for the interpretation, we visualize the location of abnormal events detected by the proposed networks using a generator loss and discriminator gradients.
• Neural network frameworks such as PyTorch and TensorFlow are the workhorses of numerous machine learning applications ranging from object recognition to machine translation. While these frameworks are versatile and straightforward to use, the training of and inference in deep neural networks is resource (energy, compute, and memory) intensive. In contrast to recent works focusing on algorithmic enhancements, we introduce BrainSlug, a framework that transparently accelerates neural network workloads by changing the default layer-by-layer processing to a depth-first approach, reducing the amount of data required by the computations and thus improving the performance of the available hardware caches. BrainSlug achieves performance improvements of up to 41.1% on CPUs and 35.7% on GPUs. These optimizations come at zero cost to the user as they do not require hardware changes and only need tiny adjustments to the software.
• Automatization of the diagnosis of any kind of disease is of great importance and it's gaining speed as more and more deep learning solutions are applied to different problems. One of such computer aided systems could be a decision support too able to accurately differentiate between different types of breast cancer histological images - normal tissue or carcinoma. In this paper authors present a deep learning solution, based on convolutional capsule network for classification of four types of images of breast tissue biopsy when hematoxylin and eusin staining is applied. The cross-validation accuracy was achieved to be 0.87 with equaly high sensitivity.
• Apr 24 2018 cs.LG stat.ML arXiv:1804.08369v1
We present a learning-based system for rapid mass-scale material synthesis that is useful for novice and expert users alike. The user preferences are learned via Gaussian Process Regression and can be easily sampled for new recommendations. Typically, each recommendation takes 40-60 seconds to render with global illumination, which makes this process impracticable for real-world workflows. Our neural network eliminates this bottleneck by providing high-quality image predictions in real time, after which it is possible to pick the desired materials from a gallery and assign them to a scene in an intuitive manner. Workflow timings against Disney's "principled" shader reveal that our system scales well with the number of sought materials, thus empowering even novice users to generate hundreds of high-quality material models without any expertise in material modeling. Similarly, expert users experience a significant decrease in the total modeling time when populating a scene with materials. Furthermore, our proposed solution also offers controllable recommendations and a novel latent space variant generation step to enable the real-time fine-tuning of materials without requiring any domain expertise.
• Visual localization is one of the fundamental enablers of robot autonomy which has been mostly tackled using local feature-based pipelines that efficiently encode knowledge about the environment and the underlying geometrical constraints. Although deep learning based approaches have shown considerable robustness in the context of significant perceptual changes, repeating structures and textureless regions, their performance has been subpar in comparison to local feature-based pipelines. In this paper, we propose the novel VLocNet++ architecture that attempts to overcome this limitation by simultaneously embedding geometric and semantic knowledge of the world into the pose regression network. We adopt a multitask learning approach that exploits the inter-task relationship between learning semantics, regressing 6-DoF global pose and odometry, for the mutual benefit of each of these tasks. VLocNet++ incorporates the Geometric Consistency Loss function that utilizes the predicted motion from the odometry stream to enforce global consistency during pose regression. Furthermore, we propose a self-supervised warping technique that uses the relative motion to warp intermediate network representations in the segmentation stream for learning consistent semantics. In addition, we propose a novel adaptive weighted fusion layer to leverage inter and intra task dependencies based on region activations. Finally, we introduce a first-of-a-kind urban outdoor localization dataset with pixel-level semantic labels and multiple loops for training deep networks. Extensive experiments on the challenging indoor Microsoft 7-Scenes benchmark and our outdoor DeepLoc dataset demonstrate that our approach exceeds the state-of-the-art, outperforming local feature-based methods while exhibiting substantial robustness in challenging scenarios.
• In this paper, we present a novel deep learning architecture for infrared and visible images fusion problem. In contrast to conventional convolutional networks, our encoding network is combined by convolutional neural network layer and dense block in which the output of each layer is connected to every other layer. We attempt to use this architecture to get more useful features from source images in encoding process. Two fusion strategies are designed to fuse these features. Finally, the fused image is reconstructed by decoder. Compared with existing fusion methods, the proposed fusion method achieves state-of-the-art performance in objective and subjective assessment.Code and pre-trained models are available at https://github.com/exceptionLi/imagefusion densefuse

Stefano Pirandola Apr 23 2018 12:23 UTC

The most important reading here is Sam Braunstein's foundational paper: https://authors.library.caltech.edu/3827/1/BRAprl98.pdf published in January 98, already containing the key results for the strong convergence of the CV protocol. This is a must-read for those interested in CV quantum informatio

...(continued)
Mark M. Wilde Apr 23 2018 12:09 UTC

One should also consult my paper "Strong and uniform convergence in the teleportation simulation of bosonic Gaussian channels" https://arxiv.org/abs/1712.00145v4 posted in January 2018, in this context.

Stefano Pirandola Apr 23 2018 11:46 UTC

Some quick clarifications on the Braunstein-Kimble (BK) protocol for CV teleportation
and the associated teleportation simulation of bosonic channels.
(Disclaimer: the following is rather technical and CVs might not be so popular on this blog...so I guess this post will get a lot of dislikes :)

1)

...(continued)
NJBouman Apr 22 2018 18:26 UTC

[Fredrik Johansson][1] has pointed out to me (the author) the following about the multiplication benchmark w.r.t. GMP. This will be taken into account in the upcoming revision.

Fredrik Johansson wrote:
> You shouldn't be comparing your code to mpn_mul, because this function is not actually th

...(continued)
Joel Wallman Apr 18 2018 13:34 UTC

A very nice approach! Could you clarify the conclusion a little bit though? The aspirational goal for a quantum benchmark is to test how well we approximate a *specific* representation of a group (up to similarity transforms), whereas what your approach demonstrates is that without additional knowle

...(continued)
serfati philippe Mar 29 2018 14:07 UTC

see my 2 papers on direction of vorticity (nov1996 + feb1999) = https://www.researchgate.net/profile/Philippe_Serfati (published author, see also mendeley, academia.edu, orcid etc)

serfati philippe Mar 29 2018 13:34 UTC

see my 4 papers, 1998-1999, on contact and superposed vortex patches, cusps (and eg splashs), corners, generalized ones on lR^n and (ir/)regular ones =. http://www.researchgate.net/profile/Philippe_Serfati/ (published author).

Luis Cruz Mar 16 2018 15:34 UTC

Related Work:

- [Performance-Based Guidelines for Energy Efficient Mobile Applications](http://ieeexplore.ieee.org/document/7972717/)
- [Leafactor: Improving Energy Efficiency of Android Apps via Automatic Refactoring](http://ieeexplore.ieee.org/document/7972807/)

Dan Elton Mar 16 2018 04:36 UTC