Data Analysis, Statistics and Probability

  • PDF
    We introduce swordfish, a Monte-Carlo-free Python package to predict expected exclusion limits, the discovery reach and expected confidence contours for a large class of experiments relevant for particle- and astrophysics. The tool is applicable to any counting experiment, supports general correlated background uncertainties, and gives exact results in both the signal- and systematics-limited regimes. Instead of time-intensive Monte Carlo simulations and likelihood maximization, it internally uses new approximation methods that are built on information geometry. Out of the box, swordfish provides straightforward methods for accurately deriving many of the common sensitivity measures. In addition, it allows one to examine experimental abilities in great detail by employing the notion of information flux. This new concept generalizes signal-to-noise ratios to situations where background uncertainties and component mixing cannot be neglected. The user interface of swordfish is designed with ease of use in mind, which we demonstrate by providing typical examples from indirect and direct dark matter searches as Jupyter notebooks.
  • PDF
    Community detection is one of the pivotal tools for discovering the structure of complex networks. The majority of community detection methods rely on optimizing a quality function that characterizes the proposed community structure. Perhaps the most commonly used of these quality functions is modularity. Many heuristics are claimed to be efficient at modularity maximization, a claim usually justified in relative terms by comparing their outcomes with those of other known algorithms. However, since all of these approaches are heuristics and a complete brute-force search is not feasible, there is no known way to tell whether an obtained partitioning is really the optimal one. In this article we address the modularity maximization problem from the other side: we find an upper-bound estimate for the possible modularity values within a given network, which allows suggested community structures to be evaluated better. Moreover, in cases where the upper-bound estimate meets the modularity score actually obtained, it provides a proof that the suggested community structure is indeed optimal. We propose an efficient algorithm for building such an upper-bound estimate and illustrate its use on well-known classical and synthetic networks, proving the optimality of the existing partitioning for some of them, including the well-known Zachary's Karate Club.
  • PDF
    We present a Bayesian and frequentist analysis of the DAMPE charged cosmic ray spectrum. The spectrum, by eye, contained a spectral break at about 1 TeV and a monochromatic excess at about 1.4 TeV. The break was supported by a Bayes factor of about $10^{10}$ and we argue that the statistical significance was resounding. We investigated whether we should attribute the excess to dark matter annihilation into electrons in a nearby subhalo. We found a local significance of about $3.6\sigma$ and a global significance of about $2.3\sigma$, including a two-dimensional look-elsewhere effect by simulating 1000 pseudo-experiments. The Bayes factor was sensitive to our choices of priors, but favoured the excess by about 2 for our choices. Thus, whilst intriguing, the evidence for a signal is not currently compelling.
  • PDF
    The characterization of intermittent, multiscale and transient dynamics using data-driven analysis remains an open challenge. We demonstrate an application of the Dynamic Mode Decomposition (DMD) with sparse sampling for the diagnostic analysis of multiscale physics. The DMD method is an ideal spatiotemporal matrix decomposition that correlates spatial features of computational or experimental data to periodic temporal behavior. DMD can be modified into a multiresolution analysis to separate complex dynamics into a hierarchy of multiresolution timescale components, where each level of the hierarchy divides dynamics into distinct background (slow) and foreground (fast) timescales. The multiresolution DMD is capable of characterizing nonlinear dynamical systems in an equation-free manner by recursively decomposing the state of the system into low-rank spatial modes and their temporal Fourier dynamics. Moreover, these multiresolution DMD modes can be used to determine sparse sampling locations that are nearly optimal for dynamic regime classification and full state reconstruction. Specifically, optimized sensors are efficiently chosen using QR column pivots of the DMD library, thus avoiding an NP-hard selection process. We demonstrate the efficacy of the method on several examples, including global sea-surface temperature data, and show that only a small number of sensors are needed for accurate global reconstructions and classification of El Niño events.
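
The sensor-selection step mentioned in the DMD abstract above (QR column pivots of a mode library) can be illustrated with a small sketch. This is not the authors' code: the function name `qr_pivot_sensors`, the toy mode library, and the pure-Python greedy Gram-Schmidt pivoting are all assumptions made for illustration. In practice one would more likely call a library routine such as `scipy.linalg.qr(..., pivoting=True)`.

```python
import math

def qr_pivot_sensors(modes, n_sensors):
    """Pick sensor locations by greedy column pivoting (illustrative sketch).

    modes: list of r modes, each a list of n spatial values (one row per mode).
    Each spatial location is a column of this r x n matrix. At every step the
    column with the largest residual norm is chosen as a pivot, and its
    direction is projected out of all columns (a Gram-Schmidt step).
    """
    n = len(modes[0])
    cols = [[mode[j] for mode in modes] for j in range(n)]
    chosen = []
    for _ in range(n_sensors):
        norms = [sum(x * x for x in c) for c in cols]
        for j in chosen:
            norms[j] = -1.0  # never pick the same location twice
        p = max(range(n), key=lambda j: norms[j])
        if norms[p] <= 1e-12:
            break  # library rank exhausted: no informative location left
        chosen.append(p)
        scale = math.sqrt(norms[p])
        q = [x / scale for x in cols[p]]
        for j in range(n):
            dot = sum(a * b for a, b in zip(q, cols[j]))
            cols[j] = [a - dot * b for a, b in zip(cols[j], q)]
    return chosen

# Hypothetical library: two modes sampled at five spatial locations.
toy_modes = [
    [0.1, 0.9, 0.2, 0.0, 0.1],
    [0.0, 0.1, 0.1, 0.8, 0.2],
]
sensors = qr_pivot_sensors(toy_modes, 2)
print(sensors)
```

In this toy library each mode is dominated by a single location, so the greedy pivoting picks those two locations. The number of useful sensors is bounded by the rank of the library, which is why the loop stops early once the residual norms vanish.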

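For the community detection abstract above, the quality function being maximized can be made concrete with a short sketch. It computes the standard Newman modularity of a given partition of an undirected, unweighted graph; it does not implement the upper-bound algorithm that the abstract proposes. The function name `modularity` and the toy graph are assumptions for illustration.

```python
from collections import defaultdict

def modularity(edges, community_of):
    """Newman modularity Q = sum_c [ L_c / m - (d_c / (2 m))^2 ].

    edges: list of undirected edges (u, v), each listed once, no self-loops.
    community_of: dict mapping every node to its community label.
    L_c is the number of edges inside community c, d_c the sum of the
    degrees of its nodes, and m the total number of edges.
    """
    m = len(edges)
    internal = defaultdict(int)
    degree_sum = defaultdict(int)
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum
    )

# Toy graph (hypothetical): two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_triangles = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_block = {node: "a" for node in range(6)}

q_split = modularity(edges, two_triangles)
q_single = modularity(edges, one_block)
print(q_split, q_single)
```

Splitting the toy graph into its two triangles gives Q = 5/14, while putting every node in one community gives Q = 0. A heuristic can only report scores like these; an upper bound of the kind described in the abstract is what would certify that no partition scores higher.
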
Recent comments

Noon van der Silk Jan 27 2016 03:39 UTC

Great institute name ...

Chris Granade Sep 22 2015 19:15 UTC

Thank you for the kind comments, I'm glad that our paper, source code, and tutorial are useful!

Travis Scholten Sep 21 2015 17:05 UTC

This was a really well-written paper! Am very glad to see this kind of work being done.

In addition, the openness about source code is refreshing. By explicitly relating the work to QInfer, this paper makes it easier to check the authors' work. Furthe

Chris Granade Sep 15 2015 02:40 UTC

As a quick addendum, please note that the supplementary video for this work is available on YouTube. Thank you!