Methodology (stat.ME)

  • PDF
    In this paper we propose a new approach for sequential monitoring of a parameter of a $d$-dimensional time series. We consider a closed-end method, which is motivated by the likelihood ratio test principle, and compare the new method with two alternative procedures. We also incorporate self-normalization so that estimation of the long-run variance is not necessary. We prove that for a large class of testing problems the new detection scheme has asymptotic level $\alpha$ and is consistent. The asymptotic theory is illustrated for the important cases of monitoring a change in the mean, variance and correlation. By means of a simulation study it is demonstrated that the new test performs better than the currently available procedures for these problems.
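    To make the closed-end monitoring setting concrete, here is a minimal sketch of a classical CUSUM-type detector for a change in the mean. It is not the likelihood-ratio-motivated scheme of the abstract and it does not use self-normalization. The function name, the boundary $\sqrt{m}(1 + k/m)$, the plain sample standard deviation (which assumes independent observations) and the caller-supplied threshold are all assumptions made for illustration.

```python
import math
import statistics


def cusum_closed_end_monitor(historical, incoming, threshold):
    """Return the 1-based monitoring time of the first alarm, or None.

    historical: stable training sample of size m (assumed change-free).
    incoming:   finite monitoring sample; its length is the closed-end horizon.
    threshold:  hypothetical critical value supplied by the caller.
    """
    m = len(historical)
    mean_hist = statistics.mean(historical)
    # Assumption: independent data, so the sample standard deviation
    # stands in for the long-run standard deviation.
    sd_hist = statistics.stdev(historical)
    partial_sum = 0.0
    for k, x in enumerate(incoming, start=1):
        # Cumulative deviation of new observations from the historical mean.
        partial_sum += x - mean_hist
        detector = abs(partial_sum) / (sd_hist * math.sqrt(m) * (1.0 + k / m))
        if detector > threshold:
            return k
    # Horizon reached without an alarm.
    return None
```

    In practice the threshold would come from the limiting distribution of the detector under the null hypothesis, which this sketch does not compute.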
  • PDF
    Conditional Kendall's tau is a measure of dependence between two random variables, conditional on some covariates. We study nonparametric estimators of such quantities using kernel smoothing techniques. Then, we assume a regression-type relationship between conditional Kendall's tau and covariates, in a parametric setting with possibly a large number of regressors. This model may be sparse, and the underlying parameter is estimated through a penalized criterion. The theoretical properties of all these estimators are stated. We prove non-asymptotic bounds with explicit constants that hold with high probability. We derive their consistency, their asymptotic law and some oracle properties. Some simulations and applications to real data conclude the paper.
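    A kernel-smoothed estimator of conditional Kendall's tau can be sketched as a weighted sum of concordance signs over pairs of observations. The version below assumes a single scalar covariate, a Gaussian kernel, Nadaraya-Watson weights and the normalisation $1 - \sum_i w_i^2$; these choices and all names are illustrative and may differ from the estimators studied in the paper.

```python
import math


def gaussian_kernel(u):
    """Standard Gaussian density, used here as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def conditional_kendall_tau(x1, x2, z, z0, h):
    """Kernel estimate of Kendall's tau between x1 and x2 given Z = z0.

    h is the bandwidth (a tuning parameter chosen by the caller).
    """
    raw = [gaussian_kernel((zi - z0) / h) for zi in z]
    total = sum(raw)
    if total == 0.0:
        raise ValueError("no observations carry weight near z0; increase h")
    w = [r / total for r in raw]  # Nadaraya-Watson weights, summing to one
    n = len(w)
    num = 0.0
    for i in range(n):
        for j in range(n):
            # Concordant pairs contribute +1, discordant pairs -1, ties 0.
            num += w[i] * w[j] * sign((x1[i] - x1[j]) * (x2[i] - x2[j]))
    # Rescale so that perfectly concordant data without ties give tau = 1.
    return num / (1.0 - sum(wi * wi for wi in w))
```

    The double loop costs $O(n^2)$ per evaluation point, which is acceptable for a sketch but would need attention for large samples.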
  • PDF
    A nonparametric Bayesian sparse graph linear dynamical system (SGLDS) is proposed to model sequentially observed multivariate data. SGLDS uses the Bernoulli-Poisson link together with a gamma process to generate an infinite dimensional sparse random graph to model state transitions. Depending on the sparsity pattern of the corresponding row and column of the graph affinity matrix, a latent state of SGLDS can be categorized as either a non-dynamic state or a dynamic one. A normal-gamma construction is used to shrink the energy captured by the non-dynamic states, while the dynamic states can be further categorized into live, absorbing, or noise-injection states, which capture different types of dynamical components of the underlying time series. The state-of-the-art performance of SGLDS is demonstrated with experiments on both synthetic and real data.
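    The Bernoulli-Poisson link mentioned above turns a non-negative rate into an edge probability: an edge is present when a latent Poisson count with that rate is at least one, so the probability is $1 - e^{-\lambda}$. The sketch below illustrates only this link. The rate matrix is a hypothetical input, and the gamma-process prior that would generate the rates in SGLDS is not modelled here.

```python
import math
import random


def bernoulli_poisson_prob(rate):
    """P(edge = 1) when edge = 1(count >= 1) and count ~ Poisson(rate)."""
    if rate < 0:
        raise ValueError("rate must be non-negative")
    # -expm1(-rate) equals 1 - exp(-rate) and stays accurate for small rates.
    return -math.expm1(-rate)


def sample_sparse_graph(rates, rng):
    """Draw a binary adjacency matrix from a matrix of non-negative rates.

    rates: hypothetical affinity matrix (list of lists); small rates
           make the corresponding edges rare, which yields sparsity.
    rng:   a random.Random instance, passed in for reproducibility.
    """
    return [
        [1 if rng.random() < bernoulli_poisson_prob(r) else 0 for r in row]
        for row in rates
    ]
```

    A call such as `sample_sparse_graph(rates, random.Random(0))` returns one draw; a zero rate can never produce an edge.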
  • PDF
    Calcium imaging data promises to transform the field of neuroscience by making it possible to record from large populations of neurons simultaneously. However, determining the exact moment in time at which a neuron spikes, from a calcium imaging data set, amounts to a non-trivial deconvolution problem which is of critical importance for downstream analyses. While a number of formulations have been proposed for this task in the recent literature, in this paper we focus on a formulation recently proposed in Jewell and Witten (2017) which has shown promising initial results. However, this proposal is slow to run on fluorescence traces of hundreds of thousands of timesteps. Here we develop a much faster online algorithm for solving the optimization problem of Jewell and Witten (2017) that can be used to deconvolve a fluorescence trace of 100,000 timesteps in less than a second. Furthermore, this algorithm overcomes a technical challenge of Jewell and Witten (2017) by avoiding the occurrence of so-called "negative" spikes. We demonstrate that this algorithm has superior performance relative to existing methods for spike deconvolution on calcium imaging datasets that were recently released as part of the spikefinder challenge (http://spikefinder.codeneuro.org/). Our C++ implementation, along with R and Python wrappers, is publicly available on GitHub at https://github.com/jewellsean/FastLZeroSpikeInference.
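    For intuition about the underlying optimization problem, the sketch below assumes the commonly used formulation in which calcium decays as $c_t = \gamma c_{t-1}$ between spikes and each spike incurs a fixed penalty $\lambda$, and solves it with a naive quadratic-time dynamic program. This is not the fast online algorithm described in the abstract, it does not rule out negative spikes, and the function name and argument names are illustrative.

```python
import math


def l0_spike_times(y, gamma, lam):
    """Estimate spike times by an O(n^2) dynamic program.

    Assumed objective: 0.5 * sum (y_t - c_t)^2 + lam * (number of spikes),
    with c_t = gamma * c_{t-1} on every stretch between spikes.
    Returns the 0-based indices at which a new stretch (a spike) starts.
    """
    n = len(y)
    best = [0.0] * (n + 1)   # best[s]: optimal cost of fitting y[0:s]
    best[0] = -lam           # cancels the penalty charged to the first stretch
    last_start = [0] * (n + 1)
    for s in range(1, n + 1):
        best_cost, best_start = math.inf, 0
        a_sum = g_sum = sq_sum = 0.0
        # Grow the final stretch y[a:s] backwards, one observation at a time.
        for a in range(s - 1, -1, -1):
            a_sum = y[a] + gamma * a_sum         # sum of y_t * gamma^(t - a)
            g_sum = 1.0 + gamma * gamma * g_sum  # sum of gamma^(2 (t - a))
            sq_sum += y[a] * y[a]
            # Residual of the best exponentially decaying fit on y[a:s].
            fit = 0.5 * (sq_sum - a_sum * a_sum / g_sum)
            cost = best[a] + fit + lam
            if cost < best_cost:
                best_cost, best_start = cost, a
        best[s], last_start[s] = best_cost, best_start
    # Backtrack through the stored stretch starts to recover the spikes.
    spikes, s = [], n
    while s > 0:
        a = last_start[s]
        if a > 0:
            spikes.append(a)
        s = a
    return sorted(spikes)
```

    The quadratic cost makes this sketch impractical for traces of 100,000 timesteps, which is the regime the paper's algorithm targets.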
  • PDF
    A folded-type model is developed for analyzing compositional data. The proposed model, which is based upon the $\alpha$-transformation for compositional data, provides a new and flexible class of distributions for modeling data defined on the simplex sample space. Despite the model's seemingly complex structure, use of the EM algorithm guarantees efficient parameter estimation. The model is validated through simulation studies and examples which illustrate that the proposed model performs better in terms of capturing the data structure, when compared to the popular logistic normal distribution.
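    As background on the $\alpha$-transformation, the sketch below assumes its usual power-transform form: each part of a $D$-part composition is raised to the power $\alpha$, the result is renormalised to sum to one, and the vector is then centred and scaled as $(D u - 1)/\alpha$. The Helmert-matrix projection that the full definition typically applies afterwards is omitted, and the folded model itself is not implemented; the function name is illustrative.

```python
import math


def alpha_transform(x, alpha):
    """Centred power transform of a composition x (positive parts summing to 1).

    For alpha = 0 the limiting case, the centred log-ratio, is returned.
    Assumption: the final Helmert projection to D - 1 coordinates is skipped.
    """
    d = len(x)
    if alpha == 0:
        logs = [math.log(xi) for xi in x]
        mean_log = sum(logs) / d
        return [v - mean_log for v in logs]
    powered = [xi ** alpha for xi in x]
    total = sum(powered)
    # u_i = x_i^alpha / sum_j x_j^alpha is again a composition.
    return [(d * p / total - 1.0) / alpha for p in powered]
```

    With $\alpha = 1$ the transform is a linear recentring of the composition, while small $\alpha$ approaches the log-ratio analysis that underlies the logistic normal distribution.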