The main SciRate homepage is down (when not logged in). We are working to fix it. See https://github.com/scirate/scirate/issues/337 for updates.

Top arXiv papers

  • PDF
    Robust Stable Marriage (RSM) is a variant of the classical Stable Marriage problem, where the robustness of a given stable matching is measured by the number of modifications required to repair it in case an unforeseen event occurs. We focus on the complexity of finding an (a,b)-supermatch. An (a,b)-supermatch is defined as a stable matching in which, if any 'a' (non-fixed) men/women break up, it is possible to find another stable matching by changing the partners of those 'a' men/women and the partners of at most 'b' other couples. In order to show that deciding whether there exists an (a,b)-supermatch is NP-complete, we first introduce a SAT formulation that is NP-complete by Schaefer's Dichotomy Theorem. Then, we show the equivalence between this SAT formulation and finding a (1,1)-supermatch on a specific family of instances.
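    For background, the stable matchings that (a,b)-supermatches refine can be computed with the classical Gale-Shapley algorithm. A minimal Python sketch of that baseline (illustrative only; the paper's SAT reduction is not reproduced here):

        def gale_shapley(men_prefs, women_prefs):
            """Compute a stable matching. Preferences are dicts mapping a
            person to a list of partners in decreasing order of preference."""
            free_men = list(men_prefs)
            next_choice = {m: 0 for m in men_prefs}  # next woman to propose to
            rank = {w: {m: i for i, m in enumerate(prefs)}
                    for w, prefs in women_prefs.items()}
            engaged = {}  # woman -> man
            while free_men:
                m = free_men.pop()
                w = men_prefs[m][next_choice[m]]
                next_choice[m] += 1
                if w not in engaged:
                    engaged[w] = m
                elif rank[w][m] < rank[w][engaged[w]]:  # w prefers m to her fiance
                    free_men.append(engaged[w])
                    engaged[w] = m
                else:
                    free_men.append(m)
            return {m: w for w, m in engaged.items()}

        print(gale_shapley({"a": ["x", "y"], "b": ["x", "y"]},
                           {"x": ["b", "a"], "y": ["a", "b"]}))  # {'b': 'x', 'a': 'y'}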
  • PDF
    This paper investigates the theoretical foundations of metric learning, focused on four key questions that are not fully addressed in prior work: 1) we consider learning general low-dimensional (low-rank) metrics as well as sparse metrics; 2) we develop upper and lower (minimax) bounds on the generalization error; 3) we quantify the sample complexity of metric learning in terms of the dimension of the feature space and the dimension/rank of the underlying metric; 4) we also bound the accuracy of the learned metric relative to the underlying true generative metric. All the results involve novel mathematical approaches to the metric learning problem, and also shed new light on the special case of ordinal embedding (aka non-metric multidimensional scaling).
  • PDF
    The Spliced Alignment Problem (SAP), which consists in finding an optimal semi-global alignment of a spliced RNA sequence on an unspliced genomic sequence, has been widely studied for the prediction and annotation of gene structures in genomes. Here, we revisit it for the purpose of identifying CDS ortholog groups within a set of CDS from homologous genes and for computing multiple CDS alignments. We introduce a new constrained version of the spliced alignment problem together with an algorithm that exploits full information on the exon-intron structure of the input RNA and gene sequences in order to compute high-coverage, accurate alignments. We show how pairwise spliced alignments between the CDS and the gene sequences of a gene family can be directly used to cluster the set of CDS of the gene family into ortholog groups. We also introduce an extension of the spliced alignment problem, called the Multiple Spliced Alignment Problem (MSAP), which consists in simultaneously aligning several RNA sequences on several genes from the same gene family, and we develop a heuristic algorithmic solution for it. We show how to exploit multiple spliced alignments for the clustering of homologous CDS into ortholog and close-paralog groups, and for the construction of multiple CDS alignments. An implementation of the method in Python is available on demand from SFA@USherbrooke.ca. Keywords: Spliced alignment, CDS ortholog groups, Multiple CDS alignment, Gene structure, Gene family.
  • PDF
    In recent work, we considered the frequencies of patterns of consecutive primes $\pmod{q}$ and numerically found biases toward certain patterns and against others. We made a conjecture explaining these biases, the dominant factor in which permits an easy description but fails to distinguish many patterns that have seemingly very different frequencies. There was a secondary factor in our conjecture accounting for this additional variation, but it was given only by a complicated expression whose distribution was not easily understood. Here, we study this term, which proves to be connected to both the Fourier transform of classical Dedekind sums and the error term in the asymptotic formula for the sum of $\phi(n)$.
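    For reference, the asymptotic formula alluded to at the end is the classical estimate
    $$\sum_{n \le x} \phi(n) = \frac{3}{\pi^2}\, x^2 + E(x), \qquad E(x) = O(x \log x),$$
    and it is the error term $E(x)$ that enters the connection described above.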
  • PDF
    We report the first measurement of the fusion excitation functions for $^{39,47}$K + $^{28}$Si at near-barrier energies. Evaporation residues resulting from the fusion process were identified by direct measurement of their energy and time-of-flight with high geometric efficiency. At the lowest incident energy, the cross-section measured for the neutron-rich $^{47}$K induced reaction is ~6 times larger than that of the $\beta$-stable system. The experimental data are compared with both a dynamical deformation model and coupled channels calculations (CCFULL).
  • PDF
    While imitation learning is becoming common practice in robotics, this approach often suffers from data mismatch and compounding errors. DAgger is an iterative algorithm that addresses these issues by continually aggregating training data from both the expert and novice policies, but does not consider the impact of safety. We present a probabilistic extension to DAgger, which uses the distribution over actions provided by the novice policy for a given observation. Our method, which we call DropoutDAgger, uses dropout to train the novice as a Bayesian neural network that provides insight into its confidence. Using the distribution over the novice's actions, we estimate a probabilistic measure of safety with respect to the expert action, tuned to balance exploration and exploitation. The utility of this approach is evaluated on the MuJoCo HalfCheetah and in a simple driving experiment, demonstrating improved performance and safety compared to other DAgger variants and classic imitation learning.
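    A minimal sketch of this gating idea, assuming an MC-dropout novice whose repeated stochastic forward passes approximate a distribution over actions (the function names and the z-score threshold are illustrative assumptions, not the authors' implementation):

        import numpy as np

        def choose_action(novice_stochastic, expert_action, obs,
                          n_samples=50, z_threshold=2.0):
            """novice_stochastic(obs) must sample a fresh dropout mask per call."""
            samples = np.array([novice_stochastic(obs) for _ in range(n_samples)])
            mu = samples.mean(axis=0)
            sigma = samples.std(axis=0) + 1e-8
            # Trust the novice only if the expert's action is plausible under
            # the novice's predictive distribution in every action dimension.
            if np.all(np.abs(expert_action - mu) / sigma < z_threshold):
                return mu            # novice acts (exploration)
            return expert_action     # expert acts (safety)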
  • PDF
    In this paper the universal enveloping algebra of color hom-Lie algebras is studied. A construction of the free involutive hom-associative color algebra on a hom-module is described and applied to obtain the universal enveloping algebra of an involutive hom-Lie color algebra. Finally, the construction is applied to extend the well-known Poincaré-Birkhoff-Witt theorem for Lie algebras to the enveloping algebra of an involutive color hom-Lie algebra.
  • PDF
    Extremal problems concerning the number of independent sets or complete subgraphs in a graph have been well studied in recent years. Cutler and Radcliffe proved that among graphs with $n$ vertices and maximum degree at most $r$, where $n = a(r+1)+b$ and $0 \le b \le r$, $aK_{r+1}\cup K_b$ has the maximum number of complete subgraphs, answering a question of Galvin. Gan, Loh, and Sudakov conjectured that $aK_{r+1}\cup K_b$ also maximizes the number of complete subgraphs $K_t$ for each fixed size $t \ge 3$, and proved this for $a = 1$. Cutler and Radcliffe proved this conjecture for $r \le 6$. We investigate a variant of this problem where we fix the number of edges instead of the number of vertices. We prove that $aK_{r+1}\cup \mathcal{C}(b)$, where $\mathcal{C}(b)$ is the colex graph on $b$ edges, maximizes the number of triangles among graphs with $m$ edges and any fixed maximum degree $r\le 8$, where $m = a \binom{r+1}{2} + b$ and $0 \le b < \binom{r+1}{2}$.
  • PDF
    Verbal metonymy has received relatively scarce attention in the field of computational linguistics despite the fact that a model to accurately paraphrase metonymy has applications both in academia and the technology sector. The method described in this paper makes use of data from the British National Corpus in order to create word vectors, find instances of verbal metonymy and generate potential paraphrases. Two different ways of creating word vectors are evaluated in this study: Continuous bag of words and Skip-grams. Skip-grams are found to outperform the Continuous bag of words approach. Furthermore, the Skip-gram model is found to operate with better-than-chance accuracy and there is a strong positive relationship (phi coefficient = 0.61) between the model's classification and human judgement of the ranked paraphrases. This study lends credence to the viability of modelling verbal metonymy through computational methods based on distributional semantics.
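    The two word-vector variants compared here can be reproduced with, e.g., gensim, where sg=0 selects continuous bag of words and sg=1 selects Skip-gram. The toy corpus below stands in for the British National Corpus, and the parameter name vector_size assumes gensim 4.x (older releases call it size):

        from gensim.models import Word2Vec

        # Toy corpus; the study used sentences from the British National Corpus.
        sentences = [["the", "newspaper", "fired", "the", "editor"],
                     ["she", "began", "the", "book", "last", "night"]]

        cbow = Word2Vec(sentences, vector_size=50, window=3, min_count=1, sg=0)
        skipgram = Word2Vec(sentences, vector_size=50, window=3, min_count=1, sg=1)

        # Nearest neighbours in the Skip-gram space (tiny corpus, so only
        # the mechanics are meaningful here).
        print(skipgram.wv.most_similar("book", topn=2))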
  • PDF
    Massive data exist among user local platforms that usually cannot support deep neural network (DNN) training due to computation and storage resource constraints. Cloud-based training schemes can provide beneficial services, but rely on excessive user data collection, which can lead to potential privacy risks and violations. In this paper, we propose PrivyNet, a flexible framework to enable DNN training on the cloud while simultaneously protecting data privacy. We propose to split the DNNs into two parts and deploy them separately onto the local platforms and the cloud. The local neural network (NN) is used for feature extraction. To avoid local training, we rely on the idea of transfer learning and derive the local NNs by extracting the initial layers from pre-trained NNs. We identify and compare three factors that determine the topology of the local NN, including the number of layers, the depth of output channels, and the subset of selected channels. We also propose a hierarchical strategy to determine the local NN topology, which is flexible to optimize the accuracy of the target learning task under the constraints on privacy loss, local computation, and storage. To validate PrivyNet, we use the convolutional NN (CNN) based image classification task as an example and characterize the dependency of privacy loss and accuracy on the local NN topology in detail. We also demonstrate that PrivyNet is efficient and can help explore and optimize the trade-off between privacy loss and accuracy.
  • PDF
    Based on the observation that application phases exhibit varying degrees of sensitivity to noise (i.e., accuracy loss) in computation during execution, this paper explores how Dynamic Precision Scaling (DPS) can maximize power efficiency by tailoring the precision of computation adaptively to temporal changes in algorithmic noise tolerance. DPS can decrease the arithmetic precision of noise-tolerant phases to result in power savings at the same operating speed (or faster execution within the same power budget), while keeping the overall loss in accuracy due to precision reduction bounded.
  • PDF
    We develop a framework for certifying randomness from Bell-test trials based on directly estimating the probability of the measurement outcomes with adaptive test supermartingales. The number of trials need not be predetermined, and one can stop performing trials early, as soon as the desired amount of randomness is extractable. It can be used with arbitrary, partially known and time-dependent probabilities for the random settings choices. Furthermore, it is suitable for application to experimental configurations with low Bell violation per trial, such as current optical loophole-free Bell tests. It is possible to adapt to time-varying experimental parameters. We formulate the framework for the general situation where the trial probability distributions are constrained to a known set. Randomness expansion with logarithmic settings entropy is possible for many relevant configurations. We implement probability estimation numerically and apply it to a representative settings-conditional outcome probability distribution from an atomic loophole-free Bell test [Rosenfeld et al., Phys. Rev. Lett. 119:010402 (2017), arXiv:1611.04604 (2016)] to illustrate trade-offs between the amount of randomness, error, settings entropy, unknown settings biases, and number of trials. We then show that probability estimation yields more randomness from the loophole-free Bell-test data analyzed in [Bierhorst et al., arXiv:1702.05178 (2017)] and tolerates adversarial settings probability biases.
  • PDF
    Access to large, diverse RGB-D datasets is critical for training RGB-D scene understanding algorithms. However, existing datasets still cover only a limited number of views or a restricted scale of spaces. In this paper, we introduce Matterport3D, a large-scale RGB-D dataset containing 10,800 panoramic views from 194,400 RGB-D images of 90 building-scale scenes. Annotations are provided with surface reconstructions, camera poses, and 2D and 3D semantic segmentations. The precise global alignment and comprehensive, diverse panoramic set of views over entire buildings enable a variety of supervised and self-supervised computer vision tasks, including keypoint matching, view overlap prediction, normal prediction from color, semantic segmentation, and region classification.
  • PDF
    This paper derives a posteriori error estimators for the nonlinear first-order optimality conditions associated with the Frank-Oseen elastic free-energy model of nematic and cholesteric liquid crystals, where the required unit-length constraint is imposed via either a Lagrange multiplier or penalty method. Furthermore, theory establishing the reliability of the proposed error estimator for the penalty method is presented, yielding a concrete upper bound on the approximation error of discrete solutions. The error estimators herein are composed of readily computable quantities on each element of a finite-element mesh, allowing the formulation of an efficient adaptive mesh refinement strategy. Four elastic equilibrium problems are considered to examine the performance of the error estimators and corresponding adaptive mesh refinements against that of a simple uniform refinement scheme. The adapted grids successfully provide significant reductions in computational work while producing solutions that are highly competitive with those of uniform mesh in terms of constraint conformance and computed free energies.
  • PDF
    This paper studies multi-agent distributed estimation under sensor attacks. Individual agents make sensor measurements of an unknown parameter belonging to a compact set, and, at every time step, a fraction of the agents' sensor measurements may fall under attack and take arbitrary values. The goal of this paper is to develop a distributed estimation algorithm to ensure that all agents, even those whose sensors are under attack, correctly estimate the parameter of interest. We assume a fully distributed setup; there is no fusion center and inter-agent collaboration is restricted to message exchanges between neighboring agents, the neighborhoods corresponding to a pre-specified, possibly sparse, communication graph. We present two versions of the Saturated Innovation Update (SIU) algorithm for distributed estimation resilient to sensor attacks. The SIU algorithm is a consensus + innovations type estimator: agents iteratively estimate the parameter of interest by combining estimates of neighbors (consensus) and local sensing information (innovations). In the SIU algorithm, each agent applies a local, time-varying scalar gain to its innovation term to ensure that the energy of the scaled innovation lies below a threshold. Under the first version of SIU, which uses constant combination weights, if less than three tenths of agent sensors fall under attack, then all of the agents' estimates converge exponentially fast to the true parameter. Under the second version, which uses time-decaying weights, if less than one half of the agent sensors fall under attack, then all of the agents' estimates converge at a polynomial rate to the true parameter. We show that the resilience of SIU to sensor attacks does not depend on the topology of the inter-agent communication network, as long as it remains connected. Finally, we demonstrate the performance of SIU with numerical examples.
  • PDF
    We revisit the classical dissipativity theorem of linear-quadratic theory in a generalized framework where the quadratic storage is negative definite in a p-dimensional subspace and positive definite in a complementary subspace. The classical theory assumes p = 0 and provides an interconnection theory for stability analysis, i.e. convergence to a zero-dimensional attractor. The generalized theory is shown to provide an interconnection theory for p-dominance analysis, i.e. convergence to a p-dimensional dominant subspace. In turn, this property is the differential characterization of a generalized contraction property for nonlinear systems. The proposed generalization opens a novel avenue for the analysis of interconnected systems with low-dimensional attractors.
  • PDF
    Direct measurements of the spin glass correlation function $G(R)$ for Gaussian and bimodal Ising spin glasses in dimension two have been carried out in the temperature region $T \sim 1$. In the Gaussian case the data are consistent with the known anomalous dimension value $\eta \equiv 0$. For the bimodal spin glass in this temperature region $T > T^{*}(L)$, well above the crossover $T^{*}(L)$ to the ground-state dominated regime, the effective exponent $\eta$ is certainly non-zero and the data are consistent with the estimate $\eta \sim 0.28(4)$ given by McMillan in 1983 from similar measurements. The 2D bimodal and Gaussian interaction distribution Ising spin glasses are not in the same universality class.
  • PDF
    We present the development of a genetic algorithm for fitting potential energy curves of diatomic molecules to experimental data. Our approach does not assume any functional form for the fit, which makes it a general fitting procedure. In particular, it takes in a guess potential, perhaps from an $ab \ initio$ calculation, along with experimental measurements of vibrational binding energies, rotational constants, and their experimental uncertainties. The fitting procedure is able to modify the guess potential until it converges to better than 1% uncertainty, as measured by $\bar{\chi}^2$. We present the details of this technique along with a comparison of potentials calculated by our genetic algorithm and by state-of-the-art fitting techniques based on the inverted perturbation approach for the $X \ ^1\Sigma^+$ and $C \ ^1\Sigma^+$ potentials of lithium-rubidium.
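    A toy sketch of a functional-form-free genetic fit in this spirit: individuals are pointwise corrections to a guess potential on a grid, and fitness is a chi-squared against synthetic data. Everything below (the Morse-like stand-in potential, the operators, the rates) is an illustrative assumption showing the mechanics, not the authors' operators or convergence criterion:

        import numpy as np

        rng = np.random.default_rng(0)
        r = np.linspace(0.5, 5.0, 80)                        # internuclear grid
        true_V = (1 - np.exp(-(r - 1.5))) ** 2 - 1           # stand-in "experiment"
        guess_V = true_V + 0.05 * np.sin(3 * r)              # guess with systematic error
        sigma = 0.01                                         # "experimental" uncertainty

        def chi2(V):
            return np.mean(((V - true_V) / sigma) ** 2)

        pop = [guess_V + rng.normal(0, 0.01, r.size) for _ in range(40)]
        for _ in range(200):
            pop.sort(key=chi2)
            parents = pop[:10]                               # selection: keep the fittest
            children = []
            for _ in range(30):
                i, j = rng.choice(10, size=2, replace=False)
                cut = int(rng.integers(1, r.size))           # one-point crossover
                child = np.concatenate([parents[i][:cut], parents[j][cut:]])
                child += rng.normal(0, 0.005, r.size)        # mutation
                children.append(child)
            pop = parents + children
        print(f"chi-squared: start {chi2(guess_V):.1f}, best {chi2(min(pop, key=chi2)):.1f}")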
  • PDF
    So far, fingerprinting studies have focused on identifying features from single-modality MRI data, which capture individual characteristics in terms of brain structure, function, or white matter microstructure. However, due to the lack of a framework for comparing across multiple modalities, studies based on multi-modal data remain elusive. This paper presents a multi-modal analysis of genetically-related subjects to compare and contrast the information provided by various MRI modalities. The proposed framework represents MRI scans as bags of SIFT features, and uses these features in a nearest-neighbor graph to measure subject similarity. Experiments using the T1/T2-weighted MRI and diffusion MRI data of 861 Human Connectome Project subjects demonstrate strong links between the proposed similarity measure and genetic proximity.
  • PDF
    We theoretically study the scattering processes and superconducting triplet correlations in a graphene ferromagnet-RSO-superconductor junction, in which RSO stands for a region with Rashba spin-orbit interaction. Our results reveal spin-polarized subgap transport through the system due to an anomalous equal-spin Andreev reflection in addition to conventional backscattering. We calculate equal- and opposite-spin pair correlations near the F-RSO interface and demonstrate a direct link between the anomalous Andreev reflection and the equal-spin pairings arising from the proximity effect in the presence of the RSO interaction. Moreover, we show that the amplitude of the anomalous Andreev reflection, and thus the triplet pairings, is experimentally controllable when incorporating the influences of both tunable strain and Fermi level in the nonsuperconducting region. Our findings can be confirmed by conductance spectroscopy experiments and provide better insights into the proximity-induced RSO coupling in graphene layers reported in recent experiments.
  • PDF
    Recently, the authors of the present work (together with M. N. Kolountzakis) introduced a new version of the non-commutative Delsarte scheme and applied it to the problem of mutually unbiased bases. Here we use this method to investigate the existence of a finite projective plane of a given order d. In particular, a short new proof is obtained for the nonexistence of a projective plane of order 6. For higher orders like 10 and 12, the method is not decisive, but it could turn out to give important supplementary information.
  • PDF
    We report a comprehensive $^{139}$La and $^{63}$Cu nuclear magnetic resonance study on La$_{2-x}$Sr$_x$CuO$_4$ ($0.07\leq x \leq 0.2$) single crystals. The $^{139}$La spin-lattice relaxation rate $^{139}T_1^{-1}$ is drastically influenced by Sr doping $x$ at low temperatures. A detailed field dependence of $^{139}T_1^{-1}$ at $x=1/8$ suggests that charge ordering induces the critical slowing down of spin fluctuations toward glassy spin order and competes with superconductivity. On the other hand, the $^{63}$Cu relaxation rate $^{63}T_1^{-1}$ is well described by a Curie-Weiss law at high temperatures, yielding the Curie-Weiss temperature $\Theta$ as a function of doping. $\Theta$ changes sharply through a critical hole concentration $x_c\sim 0.09$. $x_c$ appears to correspond to the delocalization limit of doped holes, above which the bulk nature of superconductivity is established.
  • PDF
    In this paper we analyze a family of fully analytically solvable one-dimensional models, named the n-cluster models in a transverse magnetic field, in which a many-body cluster interaction competes with a uniform transverse magnetic field. These models, independently of the cluster size n + 2, exhibit a quantum phase transition that separates a paramagnetic phase from a cluster phase; the latter corresponds to a nematic ordered phase for even n and to a symmetry-protected topological ordered phase for odd n. Due to the symmetries of the spin correlation functions, we prove that these models have no genuine (n+2)-partite entanglement. On the contrary, a nonvanishing concurrence arises between spins at the endpoints of the cluster for a sufficiently strong magnetic field. Due to their analyticity and peculiar entanglement properties, the n-cluster models in a transverse magnetic field serve as a prototype for studying nontrivial spin orderings and as a potential reference system for applications of quantum information tasks.
  • PDF
    We use a slave-rotor approach within a mean-field theory to study the competition of metallic, Mott-insulating, and superconducting phases of spin-3/2 fermions subjected to a periodic optical lattice potential. In addition to the metal, the Mott insulator, and the superconducting phase associated with gauge-symmetry breaking of the spinon field, we identify a novel emergent superconducting phase that breaks both rotor and spinon field gauge symmetries. This novel superconducting phase emerges as a result of the competition between the spin-0 singlet and spin-2 quintet interaction channels naturally available in spin-3/2 systems. The two superconducting phases can be distinguished from each other by their quasiparticle weight. We further discuss the properties of these phases for both two-dimensional square and three-dimensional cubic lattices at zero and finite temperatures.
  • PDF
    Vascular plants rely on differences of osmotic pressure to export sugars from regions of synthesis (mature leaves) to sugar sinks (roots, fruits). In this process, known as Münch pressure flow, the loading of sugars from photosynthetic cells to the export conduit (the phloem) is crucial, as it sets the pressure head necessary to power long-distance transport. Whereas most herbaceous plants use active mechanisms to increase phloem concentration above that of the photosynthetic cells, in most tree species, for which transport distances are largest, loading seems to occur via passive symplastic diffusion from the mesophyll to the phloem. Here, we use a synthetic microfluidic model of a passive loader to explore the nonlinear dynamics that arise during export and determine the ability of passive loading to drive long-distance transport. We first demonstrate that in our device, phloem concentration is set by the balance between the resistances to diffusive loading from the source and convective export through the phloem. Convection-limited export corresponds to classical models of Münch transport, where phloem concentration is close to that of the source; in contrast, diffusion-limited export leads to small phloem concentrations and weak scaling of flow rates with the hydraulic resistance. We then show that the effective regime of convection-limited export is predominant in plants with large transport resistances and low xylem pressures. Moreover, hydrostatic pressures developed in our synthetic passive loader can reach botanically relevant values as high as 10 bars. We conclude that passive loading is sufficient to drive long-distance transport in large plants, and that trees are well suited to take full advantage of passive phloem loading strategies.
  • PDF
    The extraction of fibers from dMRI data typically produces a large number of fibers, so it is common to group fibers into bundles. To this end, many specialized distance measures, such as MCP, have been used to quantify fiber similarity. However, these distance-based approaches require point-wise correspondence and focus only on the geometry of the fibers. Recent publications have highlighted that using microstructure measures along fibers improves tractography analysis. Moreover, many neurodegenerative diseases impacting white matter require the study of microstructure measures as well as the white matter geometry. Motivated by these observations, we propose a novel computational model for fibers, called functional varifolds, characterized by a metric that considers both the geometry and a microstructure measure (e.g. GFA) along the fiber pathway. We use it to cluster fibers with a dictionary learning and sparse coding-based framework, and present a preliminary analysis using HCP data.
  • PDF
    In this paper we study the fully nonlinear stochastic Hamilton-Jacobi-Bellman (HJB) equation for the optimal stochastic control problem of stochastic differential equations with random coefficients. The notion of viscosity solution is introduced, and we prove that the value function of the optimal stochastic control problem is the maximal viscosity solution of the associated stochastic HJB equation. For the superparabolic cases, the uniqueness is addressed as well.
  • PDF
    We study a set $\mathcal{M}_{K,N}$ parameterizing filtered $SL(K)$-Higgs bundles over $\mathbb{CP}^1$ with an irregular singularity at $z = \infty$, such that the eigenvalues of the Higgs field grow like $\lvert \lambda \rvert \sim \lvert z^{N/K} \mathrm{d} z \rvert$, where $K$ and $N$ are coprime. $\mathcal{M}_{K,N}$ carries a $\mathbb{C}^\times$-action analogous to the famous $\mathbb{C}^\times$-action introduced by Hitchin on the moduli spaces of Higgs bundles over compact curves. The construction of this $\mathbb{C}^\times$-action on $\mathcal{M}_{K,N}$ involves the rotation automorphism of the base $\mathbb{CP}^1$. We classify the fixed points of this $\mathbb{C}^\times$-action, and exhibit a curious $1$-$1$ correspondence between these fixed points and certain representations of the vertex algebra $\mathcal{W}_K$; in particular we have the relation $\mu = \frac{1}{12} \left(K - 1 - c_{\mathrm{eff}} \right)$, where $\mu$ is a regulated version of the $L^2$ norm of the Higgs field, and $c_{\mathrm{eff}}$ is the effective Virasoro central charge of the corresponding $W$-algebra representation. We also discuss a Bialynicki-Birula-type stratification of $\mathcal{M}_{K,N}$, where the strata are labeled by isomorphism classes of the underlying filtered vector bundles.
  • PDF
    The Large Ultraviolet / Optical / Infrared Surveyor (LUVOIR) is one of four large mission concepts currently undergoing community study for consideration by the 2020 Astronomy and Astrophysics Decadal Survey. The LUVOIR Ultraviolet Multi-Object Spectrograph, LUMOS, is being designed to support all of the UV science requirements of LUVOIR, from exoplanet host star characterization to tomography of circumgalactic halos to water plumes on outer solar system satellites. LUMOS offers point source and multi-object spectroscopy across the UV bandpass, with multiple resolution modes to support different science goals. The instrument will provide low (R = 8,000-18,000) and medium (R = 30,000-65,000) resolution modes across the far-ultraviolet (FUV: 100-200 nm) and near-ultraviolet (NUV: 200-400 nm) windows, and a very low resolution mode (R = 500) for spectroscopic investigations of extremely faint objects in the FUV. Imaging spectroscopy will be accomplished over a 3 x 1.6 arcminute field-of-view by employing holographically-ruled diffraction gratings to control optical aberrations, microshutter arrays (MSA), advanced optical coatings for high-throughput in the FUV, and next generation large-format photon-counting detectors. The spectroscopic capabilities of LUMOS are augmented by an FUV imaging channel (100-200nm, 13 milliarcsecond angular resolution, 2 x 2 arcminute field-of-view) that will employ a complement of narrow and medium-band filters. We present an overview of LUMOS' observing modes and estimated performance curves for effective area, spectral resolution, and imaging performance. Example "LUMOS 100-hour Highlights" observing programs are presented to demonstrate the potential power of LUVOIR's ultraviolet spectroscopic capabilities.
  • PDF
    Optimal training design and maximum likelihood channel estimation for one-way amplify-and-forward full duplex relay systems are proposed. The destination estimates the residual self-interference (RSI) channel as well as the end-to-end channel of the relay system, aiming to cancel the RSI through equalization. The log-likelihood function is maximized through a quasi-Newton method, with an MMSE-based initialization. The Cramer-Rao bounds are derived to evaluate the accuracy of the estimates. By using Szegö's theorems about Toeplitz matrices, we minimize the Cramer-Rao bounds and find the corresponding optimal training sequences. Extensions of our method to frequency selective channels and multiple relays are also presented.
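    A generic sketch of this estimation recipe: maximize a log-likelihood with a quasi-Newton method from a simple initial estimate. The scalar Gaussian channel below is a placeholder, not the paper's full-duplex relay model, and the least-squares start stands in for the MMSE-based initialization:

        import numpy as np
        from scipy.optimize import minimize

        rng = np.random.default_rng(1)
        x = rng.standard_normal(200)                  # known training sequence
        h_true, noise_var = 0.8, 0.1
        y = h_true * x + rng.normal(0, np.sqrt(noise_var), x.size)

        def neg_log_likelihood(theta):
            h, log_var = theta
            var = np.exp(log_var)                     # parametrize to keep var > 0
            resid = y - h * x
            return 0.5 * (resid @ resid / var + y.size * np.log(var))

        h0 = (x @ y) / (x @ x)                        # least-squares initialization
        res = minimize(neg_log_likelihood, x0=[h0, np.log(0.5)], method="L-BFGS-B")
        print("ML estimate of h:", res.x[0])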
  • PDF
    We consider the problem of non-parametric Conditional Independence testing (CI testing) for continuous random variables. Given i.i.d. samples from the joint distribution $f(x,y,z)$ of continuous random vectors $X,Y$ and $Z$, we determine whether $X \perp Y | Z$. We approach this by converting the conditional independence test into a classification problem. This allows us to harness very powerful classifiers like gradient-boosted trees and deep neural networks. These models can handle complex probability distributions and allow us to perform significantly better than the prior state of the art for high-dimensional CI testing. The main technical challenge in the classification problem is the need for samples from the conditional product distribution $f^{CI}(x,y,z) = f(x|z)f(y|z)f(z)$ -- which equals the joint distribution if and only if $X \perp Y | Z$ -- when given access only to i.i.d. samples from the true joint distribution $f(x,y,z)$. To tackle this problem we propose a novel nearest neighbor bootstrap procedure and theoretically show that our generated samples are indeed close to $f^{CI}$ in terms of total variational distance. We then develop theoretical results regarding the generalization bounds for classification for our problem, which translate into error bounds for CI testing. We provide a novel analysis of Rademacher-type classification bounds in the presence of non-i.i.d. near-independent samples. We empirically validate the performance of our algorithm on simulated and real datasets and show performance gains over previous methods.
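    A simplified sketch of a nearest-neighbor bootstrap in this spirit: each sample borrows the $y$ of the sample whose $z$ is closest, mimicking draws from $f(x|z)f(y|z)f(z)$. The exact partitioning scheme and the downstream classifier from the paper are omitted:

        import numpy as np
        from sklearn.neighbors import NearestNeighbors

        def nn_bootstrap(x, y, z, rng):
            """Return samples approximately drawn from f(x|z) f(y|z) f(z)."""
            n = len(z)
            perm = rng.permutation(n)
            a, b = perm[: n // 2], perm[n // 2:]
            nn = NearestNeighbors(n_neighbors=1).fit(z[b])
            _, idx = nn.kneighbors(z[a])
            y_new = y[b][idx[:, 0]]      # borrow y from the nearest z-neighbor
            return x[a], y_new, z[a]

        rng = np.random.default_rng(0)
        z = rng.standard_normal((500, 1))
        x = z + 0.1 * rng.standard_normal((500, 1))
        y = z + 0.1 * rng.standard_normal((500, 1))   # here X and Y are CI given Z
        x_ci, y_ci, z_ci = nn_bootstrap(x, y, z, rng)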
  • PDF
    Primary Frequency Control (PFC) is a fast acting mechanism used to ensure high-quality power for the grid that is becoming an increasingly attractive option for load participation. Because of the speed requirement, PFC requires distributed control laws to be used instead of a more centralized design. Current PFC designs assume that costs at each geographic location are independent. Unfortunately for many networked systems such as cloud computing, the decisions made among locations are interdependent and therefore require geographic coordination. In this paper, distributed control laws are designed for geo-distributed loads such as data centers in PFC. The controlled frequencies are provably stable, and the final equilibrium point is proven to strike an optimal balance between load participation and the frequency's deviation from its nominal set point. We evaluate the proposed control laws with realistic numerical simulations. Results highlight significant cost savings over existing approaches under a variety of settings.
  • PDF
    In this paper, we present a deep reinforcement learning (RL) framework for iterative dialog policy optimization in end-to-end task-oriented dialog systems. A popular approach to learning dialog policy with RL is to let a dialog agent learn against a user simulator. Building a reliable user simulator, however, is not trivial, often as difficult as building a good dialog agent. We address this challenge by jointly optimizing the dialog agent and the user simulator with deep RL by simulating dialogs between the two agents. We first bootstrap a basic dialog agent and a basic user simulator by learning directly from dialog corpora with supervised training. We then improve them further by letting the two agents conduct task-oriented dialogs and iteratively optimizing their policies with deep RL. Both the dialog agent and the user simulator are designed with neural network models that can be trained end-to-end. Our experiment results show that the proposed method leads to promising improvements in task success rate and total task reward compared to supervised training and single-agent RL training baseline models.
  • PDF
    The security of conventional cryptography systems is threatened in the forthcoming era of quantum computers. Quantum key distribution (QKD) features fundamentally proven security and offers a promising option for quantum-proof cryptography. Although prototype QKD systems over optical fiber have been demonstrated over the years, the key generation rates remain several orders of magnitude lower than those of current classical communication systems. In an effort towards a commercially viable QKD system with improved key generation rates, we developed a discrete-variable QKD system based on time-bin quantum photonic states that is capable of generating provably-secure cryptographic keys at megabit-per-second (Mbps) rates over metropolitan distances. We use high-dimensional quantum states that transmit more than one secret bit per received photon, alleviating detector saturation effects in the superconducting nanowire single-photon detectors (SNSPDs) employed in our system, which feature very high detection efficiency (over 70%) and low timing jitter (less than 40 ps). Our system is constructed using commercial off-the-shelf components, and the adopted protocol can readily be extended to free-space quantum channels. The security analysis adopted to distill the keys ensures that the demonstrated protocol is robust against coherent attacks, finite-size effects, and a broad class of experimental imperfections identified in our system.
  • PDF
    This paper explores discrete Dynamic Causal Modeling (DDCM) and its relationship with Directed Information (DI). We prove the conditional equivalence between DDCM and DI in characterizing the causal relationship between two brain regions. The theoretical results are demonstrated using fMRI data obtained under both resting state and stimulus-based conditions. Our numerical analysis is consistent with that reported in previous studies.
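    For reference, the directed information used in this comparison is Massey's quantity
    $$I(X^n \to Y^n) = \sum_{i=1}^{n} I(X^i; Y_i \mid Y^{i-1}),$$
    which, unlike mutual information, is asymmetric between $X$ and $Y$ and is therefore suited to characterizing a causal direction between the two regions.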
  • PDF
    We consider developable surfaces along the singular set of a swallowtail which are considered to be flat approximations of the swallowtail. For the study of singularities of such developable surfaces, we introduce the notion of Darboux frames along swallowtails and their invariants. As a by-product, we give a new example of a frontal which is locally homeomorphic to a swallowtail.
  • PDF
    From an economic point of view, a common criterion for assessing the merits of a reserve investment is its impact on social welfare. The underlying assumption in using this criterion is that side payments may be used to distribute the social gains among all market players. In reality, however, since the impacts of an electricity reserve project on different players may vary, such side payments are rather difficult to implement. We present a three-stage scenario-based programming model for committing reserves in power markets with large amounts of wind power. We describe wind power generation in terms of a representative set of appropriately weighted scenarios, and we present a dual decomposition algorithm for solving the resulting stochastic program. We test our scenario generation methodology on a model of the 118-bus system, and we show that the stochastic programming unit commitment policy outperforms common reserve rules.
  • PDF
    For tensors in $\mathbb{C}^d \otimes \mathbb{C}^d \otimes \mathbb{C}^d$, non-trivial equations for tensors of border rank $2d-3$ for $d$ even and $2d-5$ for $d$ odd were found by Landsberg. In previous work, we observed that Landsberg's method can be interpreted in the language of tensor blow-ups of matrix spaces, and using concavity of blow-ups we improved the bound for odd $d$ from $2d-5$ to $2d-4$. The purpose of this paper is to show that the aforementioned results extend to tensors in $K^d \otimes K^d \otimes K^d$ for any field $K$.
  • PDF
    We study Ramsey-type problems in Gallai-colorings. Given a graph $G$ and an integer $k\ge1$, the Gallai-Ramsey number $gr_k(K_3,G)$ is the least positive integer $n$ such that every $k$-coloring of the edges of the complete graph on $n$ vertices contains either a rainbow triangle or a monochromatic copy of $G$. It turns out that $gr_k(K_3, G)$ behaves more nicely than the classical Ramsey number $r_k(G)$. However, finding exact values of $gr_k (K_3, G)$ is far from trivial. In this paper, we prove that $gr_k(K_3, C_9)= 4\cdot 2^k+1$ for all $k\ge1$. This new result provides partial evidence for the first open case of the Triple Odd Cycle Conjecture of Bondy and Erdős from 1973. Our technique relies heavily on the structural result of Gallai on edge-colorings of complete graphs without rainbow triangles. We believe the method we developed can be used to determine the exact values of $gr_k(K_3, C_n)$ for odd integers $n\ge11$.
  • PDF
    We analyze the convergence of (stochastic) gradient descent algorithm for learning a convolutional filter with Rectified Linear Unit (ReLU) activation function. Our analysis does not rely on any specific form of the input distribution and our proofs only use the definition of ReLU, in contrast with previous works that are restricted to standard Gaussian input. We show that (stochastic) gradient descent with random initialization can learn the convolutional filter in polynomial time and the convergence rate depends on the smoothness of the input distribution and the closeness of patches. To the best of our knowledge, this is the first recovery guarantee of gradient-based algorithms for convolutional filter on non-Gaussian input distributions. Our theory also justifies the two-stage learning rate strategy in deep neural networks. While our focus is theoretical, we also present experiments that illustrate our theoretical findings.
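    A toy instance of the analyzed setting, reduced to a single patch: recover a teacher filter under ReLU by SGD from random initialization. The Gaussian patches below are one convenient choice of input distribution; the paper's point is that convergence does not require them:

        import numpy as np

        rng = np.random.default_rng(0)
        k = 8                                  # filter size
        w_star = rng.standard_normal(k)        # teacher filter

        relu = lambda u: np.maximum(u, 0.0)
        w = 0.1 * rng.standard_normal(k)       # random initialization
        lr = 0.05
        for _ in range(5000):
            x = rng.standard_normal(k)         # one random input patch
            err = relu(w @ x) - relu(w_star @ x)
            grad = err * float(w @ x > 0) * x  # subgradient of the squared loss
            w -= lr * grad
        print("distance to teacher filter:", np.linalg.norm(w - w_star))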
  • PDF
    Traditional data centers are designed with a rigid architecture of fit-for-purpose servers that provision resources beyond the average workload in order to deal with occasional peaks of data. Heterogeneous data centers are pushing towards more cost-efficient architectures with better resource provisioning. In this paper we study the feasibility of using disaggregated architectures for intensive data applications, in contrast to the monolithic approach of server-oriented architectures. Particularly, we have tested a proactive network analysis system in which the workload demands are highly variable. In the context of the dReDBox disaggregated architecture, the results show that the overhead caused by using remote memory resources is significant, between 66% and 80%, but we have also observed that the memory usage is one order of magnitude higher for the stress case with respect to average workloads. Therefore, dimensioning memory for the worst case in conventional systems will result in a notable waste of resources. Finally, we found that, for the selected use case, parallelism is limited by memory. Therefore, using a disaggregated architecture will allow for increased parallelism, which, at the same time, will mitigate the overhead caused by remote memory.
  • PDF
    It has been proposed that the pseudogap state of underdoped cuprate superconductors may be due to a transition to a phase which has circulating currents within each unit cell. Here, we use polarized neutron diffraction to search for the corresponding orbital moments in two samples of underdoped YBa$_2$Cu$_3$O$_{6+x}$ with doping levels $p=0.104$ and 0.123. In contrast to some other reports using polarized neutrons, but in agreement with nuclear magnetic resonance and muon spin rotation measurements, we find no evidence for the appearance of magnetic order below 300 K. Thus, our experiment suggests that such order is either not an intrinsic property of cuprate superconductors or the signal is eliminated using the experimental method described here. The experiment provides an upper bound for a possible orbital moment which depends on the pattern of currents within the unit cell. For example, for the CC-$\theta_{II}$ pattern proposed by Varma, we find that the ordered moment per current loop is less than 0.021 $\mu_B$.
  • PDF
    Motivated by the Gestalt pattern theory, and the Winograd Challenge for language understanding, we design synthetic experiments to investigate a deep learning algorithm's ability to infer simple (at least for humans) visual concepts, such as symmetry, from examples. A visual concept is represented by randomly generated, positive as well as negative, example images. We then test the ability and speed of algorithms (and humans) to learn the concept from these images. The training and testing are performed progressively in multiple rounds, with each subsequent round deliberately designed to be more complex and confusing than the previous round(s), especially if the concept was not grasped by the learner. However, if the concept was understood, all the deliberate tests would become trivially easy. Our experiments show that humans can often infer a semantic concept quickly after looking at only a very small number of examples (this is often referred to as an "aha moment": a moment of sudden realization), and perform perfectly during all testing rounds (except for careless mistakes). On the contrary, deep convolutional neural networks (DCNN) could approximate some concepts statistically, but only after seeing many (roughly $10^4\times$) more examples. Even then, they still make obvious mistakes, especially during deliberate testing rounds or on samples outside the training distributions. This signals a lack of true "understanding", or a failure to reach the right "formula" for the semantics. We did find that some concepts are easier for DCNN than others. For example, simple "counting" is more learnable than "symmetry", while "uniformity" or "conformance" are much more difficult for DCNN to learn. To conclude, we propose an "Aha Challenge" for visual perception, calling for focused and quantitative research on Gestalt-style machine intelligence using limited training examples.
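    A sketch of how a visual concept such as 'symmetry' can be posed as a binary task with randomly generated positive and negative examples, as the setup describes (image size and binary encoding are illustrative assumptions):

        import numpy as np

        def make_example(rng, size=16, symmetric=True):
            """Positive examples are left-right mirror images; negatives are
            unconstrained random images (almost never symmetric by chance)."""
            if symmetric:
                half = rng.integers(0, 2, (size, size // 2))
                return np.hstack([half, half[:, ::-1]])
            return rng.integers(0, 2, (size, size))

        rng = np.random.default_rng(0)
        images = [make_example(rng, symmetric=(i % 2 == 0)) for i in range(1000)]
        labels = [int(i % 2 == 0) for i in range(1000)]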
  • PDF
    We present measurements of low frequency charge noise and conductance drift in modulation doped GaAs/AlGaAs heterostructures grown by molecular beam epitaxy in which the silicon doping density has been varied from $2.4\times 10^{18} cm^{-3}$ (critically doped) to $6.0\times 10^{18} cm^{-3}$ (overdoped). Quantum point contacts were used to detect charge fluctuations. A clear reduction of both short time scale telegraphic noise and long time scale conductance drift with decreased doping density was observed. These measurements indicate that the neutral doping region plays a significant role in charge noise and conductance drift.
  • PDF
    Context. Transit events of extrasolar planets offer the opportunity to study the composition of their atmospheres. Previous work on transmission spectroscopy of the close-in gas giant TrES-3 b revealed an increase in absorption towards blue wavelengths of very large amplitude in terms of atmospheric pressure scale heights, too large to be explained by Rayleigh-scattering in the planetary atmosphere. Aims. We present a follow-up study of the optical transmission spectrum of the hot Jupiter TrES-3 b to investigate the strong increase in opacity towards short wavelengths found by a previous study. Furthermore, we aim to estimate the effect of stellar spots on the transmission spectrum. Methods. This work uses previously published long slit spectroscopy transit data of the Gran Telescopio Canarias (GTC) and published broad band observations as well as new observations in different bands from the near-UV to the near-IR, for a homogeneous transit light curve analysis. Additionally, a long-term photometric monitoring of the TrES-3 host star was performed. Results. Our newly analysed GTC spectroscopic transit observations show a slope of much lower amplitude than previous studies. We conclude from our results that the previously reported increase in signal towards short wavelengths is not intrinsic to the TrES-3 system. Furthermore, the broad band spectrum favours a flat spectrum. Long-term photometric monitoring rules out a significant modification of the transmission spectrum by unocculted star spots.
  • PDF
    We present a probabilistic framework for nonlinearities, based on doubly truncated Gaussian distributions. By setting the truncation points appropriately, we are able to generate various types of nonlinearities within a unified framework, including sigmoid, tanh and ReLU, the most commonly used nonlinearities in neural networks. The framework readily integrates into existing stochastic neural networks (with hidden units characterized as random variables), allowing one for the first time to learn the nonlinearities alongside model weights in these networks. Extensive experiments demonstrate the performance improvements brought about by the proposed framework when integrated with the restricted Boltzmann machine (RBM), temporal RBM and the truncated Gaussian graphical model (TGGM).
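    A small sketch of the central idea, taking the mean of a doubly truncated Gaussian as the deterministic view of a hidden unit: truncating to $[0, \infty)$ yields a smoothed ReLU, while a bounded symmetric interval yields tanh-like saturation (the interval choices below are illustrative):

        import numpy as np
        from scipy.stats import truncnorm

        def truncated_mean(pre_activation, lo, hi, scale=1.0):
            """Mean of a Gaussian centered at the pre-activation, truncated to
            [lo, hi]; scipy expects the bounds in standardized form."""
            a = (lo - pre_activation) / scale
            b = (hi - pre_activation) / scale
            return truncnorm.mean(a, b, loc=pre_activation, scale=scale)

        u = np.linspace(-4, 4, 9)
        print([round(truncated_mean(v, 0.0, np.inf), 2) for v in u])   # ReLU-like
        print([round(truncated_mean(v, -1.0, 1.0), 2) for v in u])     # tanh-like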
  • PDF
    We present the concept of fiber-flux density for locally quantifying white matter (WM) fiber bundles. By combining scalar diffusivity measures (e.g., fractional anisotropy) with fiber-flux measurements, we define new local descriptors called Fiber-Flux Diffusion Density (FFDD) vectors. Applying each descriptor throughout fiber bundles allows along-tract coupling of a specific diffusion measure with geometrical properties, such as fiber orientation and coherence. A key step in the proposed framework is the construction of an FFDD dissimilarity measure for sub-voxel alignment of fiber bundles, based on the fast marching method (FMM). The obtained aligned WM tract-profiles enable meaningful inter-subject comparisons and group-wise statistical analysis. We demonstrate our method using two different datasets of contact sports players. Along-tract pairwise comparison as well as group-wise analysis, with respect to non-player healthy controls, reveal significant and spatially-consistent FFDD anomalies. Comparing our method with along-tract FA analysis shows improved sensitivity to subtle structural anomalies in football players over standard FA measurements.
  • PDF
    Using numerical simulations, we examine the dynamics of active matter run-and-tumble disks moving in a disordered array of obstacles. As a function of increasing active disk density and activity, we find a transition from a completely clogged state to a continuous flowing phase, and in the large activity limit, we observe an intermittent state where the motion occurs in avalanches that are power law distributed in size with an exponent of $\beta = 1.46$. In contrast, in the thermal or low activity limit we find bursts of motion that are not broadly distributed in size. We argue that in the highly active regime, the system reaches a self-jamming state due to the activity-induced self-clustering, and that the intermittent dynamics is similar to that found in the yielding of amorphous solids. Our results show that activity is another route by which particulate systems can be tuned to a nonequilibrium critical state.
  • PDF
    We first establish a family of sharp Caffarelli-Kohn-Nirenberg type inequalities on the Euclidean spaces and then extend them to the setting of Cartan-Hadamard manifolds with the same best constant. Quantitative versions of these inequalities are also proved by adding a non-negative remainder term in terms of the sectional curvature of the manifolds. We next prove several rigidity results for complete Riemannian manifolds supporting the Caffarelli-Kohn-Nirenberg type inequalities with the same sharp constant as in $\mathbb R^n$ (shortly, sharp CKN inequalities). Our results illustrate the influence of curvature on the sharp CKN inequalities on Riemannian manifolds. They extend recent results of Kristály to a larger class of sharp CKN inequalities.
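    For orientation, a classical member of this family of inequalities on $\mathbb{R}^n$ ($n \ge 3$) is the sharp Hardy inequality
    $$\int_{\mathbb{R}^n} \frac{|u|^2}{|x|^2}\,dx \;\le\; \frac{4}{(n-2)^2} \int_{\mathbb{R}^n} |\nabla u|^2\,dx,$$
    whose best constant $4/(n-2)^2$ is the kind of sharp constant whose attainment the rigidity results above are concerned with.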
  • PDF
    Searches for resonances in final states with leptons and photons have always been a powerful tool for discovery in high energy physics. We present here the latest results from the ATLAS and CMS experiments, based on up to 36.1 fb$^{-1}$ of 13 TeV proton-proton collisions produced at the Large Hadron Collider. Detailed results on single lepton, dilepton, diphoton and Z$\gamma$ resonances are included.

Recent comments

Ben Criger Sep 08 2017 08:09 UTC

Oh look, there's another technique for decoding surface codes subject to X/Z correlated errors: https://scirate.com/arxiv/1709.02154

Aram Harrow Sep 06 2017 07:54 UTC

The paper only applies to conformal field theories, and such a result cannot hold for more general 1-D systems by 0705.4077 and other papers (assuming standard complexity theory conjectures).

Felix Leditzky Sep 05 2017 21:27 UTC

Thanks for the clarification, Philippe!

Philippe Faist Sep 05 2017 21:09 UTC

Hi Felix, thanks for the good question.

We've found it more convenient to consider trace-nonincreasing and $\Gamma$-sub-preserving maps (and this is justified by the fact that they can be dilated to fully trace-preserving and $\Gamma$-preserving maps on a larger system). The issue arises because

...(continued)
Felix Leditzky Sep 05 2017 19:02 UTC

What is the reason/motivation to consider trace-non-increasing maps instead of trace-preserving maps in your framework and the definition of the coherent relative entropy?

Steve Flammia Aug 30 2017 22:30 UTC

Thanks for the reference Ashley. If I understand your paper, you are still measuring stabilizers of X- and Z-type at the top layer of the code. So it might be that we can improve on the factor of 2 that you found if we tailor the stabilizers to the noise bias at the base level.

Ashley Aug 30 2017 22:09 UTC

We followed Aliferis and Preskill's approach in [https://arxiv.org/abs/1308.4776][1] and found that the fault-tolerant threshold for the surface code was increased by approximately a factor of two, from around 0.75 per cent to 1.5 per cent for a bias of 10 to 100.

[1]: https://arxiv.org/abs/1308.4776

...(continued)
Stephen Bartlett Aug 30 2017 21:55 UTC

Following on from Steve's comments, it's possible to use the bias-preserving gate set in Aliferis and Preskill directly to do the syndrome extraction, as you build up a CNOT gadget, but such a direct application of your methods would be very complicated and involve a lot of gate teleportation. If y

...(continued)
Steve Flammia Aug 30 2017 21:38 UTC

We agree that finding good syndrome extraction circuits is an important question. At the moment we do not have such circuits, though we have started to think about them. We are optimistic that this can be done in principle, but it remains to be seen if the circuits can be made sufficiently simple to

...(continued)
John Preskill Aug 30 2017 14:48 UTC

Hi Steves and David. When we wrote https://arxiv.org/abs/0710.1301 our viewpoint was that a gate with highly biased (primarily Z) noise would need to commute with Z. So we built our fault-tolerant gadgets from such gates, along with preparations and measurements in the X basis.

Can you easily ext

...(continued)