Statistics Theory (math.ST)

  • PDF
    We investigate the problem of inferring the causal variables of a response $Y$ from a set of $d$ predictors $(X^1,\dots,X^d)$. Classical ordinary least squares regression includes all predictors that reduce the variance of $Y$. Using only the causal parents instead leads to models that have the advantage of remaining invariant under interventions, i.e., loosely speaking they lead to invariance across different "environments" or "heterogeneity patterns". More precisely, the conditional distribution of $Y$ given its causal variables remains constant for all observations. Recent work exploits this stability to infer causal relations from data with different but known environments. We show here that even without knowledge of the environments or heterogeneity pattern, inferring causal relations is possible for time-ordered (or any other type of sequentially ordered) data. In particular, this makes it possible to detect instantaneous causal relations in multivariate linear time series, in contrast to the concept of Granger causality. Besides novel methodology, we provide statistical confidence bounds and asymptotic detection results for inferring causal variables, and we present an application to monetary policy in macroeconomics.
  • PDF
    In the last few years, an extensive literature has focused on the $\ell_1$ penalized least squares (Lasso) estimators of high-dimensional linear regression when the number of covariates $p$ is considerably larger than the sample size $n$. However, limited attention has been paid to the properties of the estimators when the errors and/or the covariates are serially dependent. In this study, we investigate the theoretical properties of the Lasso estimators for linear regression with random design under serially dependent and/or non-sub-Gaussian errors and covariates. In contrast to the traditional case in which the errors are i.i.d. and have finite exponential moments, we show that $p$ can at most be a power of $n$ if the errors have only polynomial moments. In addition, the rate of convergence becomes slower due to the serial dependencies in the errors and the covariates. We also consider sign consistency for model selection via Lasso when there are serial correlations in the errors or the covariates or both. Adopting the framework of functional dependence measure, we provide a detailed description of how the rates of convergence and the selection consistencies of the estimators depend on the dependence measures and moment conditions of the errors and the covariates. Simulation results show that Lasso regression can be substantially more powerful than mixed-frequency data sampling (MIDAS) regression in the presence of irrelevant variables. We apply the results obtained for the Lasso method to nowcasting with mixed-frequency data, in which serially correlated errors and a large number of covariates are common. In real examples, the Lasso procedure outperforms MIDAS in both forecasting and nowcasting.
  • PDF
    We consider high-dimensional binary classification by sparse logistic regression. We propose a model/feature selection procedure based on penalized maximum likelihood with a complexity penalty on the model size and derive non-asymptotic bounds for the resulting misclassification excess risk. The bounds can be reduced under the additional low-noise condition. The proposed complexity penalty is remarkably related to the VC-dimension of a set of sparse linear classifiers. Implementation of any complexity penalty-based criterion, however, requires a combinatorial search over all possible models. To obtain a model selection procedure that is computationally feasible for high-dimensional data, we extend the Slope estimator to logistic regression and show that under an additional weighted restricted eigenvalue condition it is rate-optimal in the minimax sense.
  • PDF
    This paper considers the problem of estimating the emission densities of a nonparametric finite state space hidden Markov model in a way that is state-by-state adaptive and leads to minimax rates for each emission density, as opposed to globally minimax estimators, which adapt to the worst regularity among the emission densities. We propose a model selection procedure based on the Goldenshluger-Lepski method. Our method is computationally efficient and only requires a family of preliminary estimators, without any restriction on the type of estimators considered. We present two such estimators that attain minimax rates up to a logarithmic term: a spectral estimator and a least squares estimator. Finally, numerical experiments assess the performance of the method and illustrate how to calibrate it in practice. Our method is not specific to hidden Markov models and can be applied to nonparametric multiview mixture models.
  • PDF
    We consider the problem of simultaneously testing many null hypotheses when the test statistics have a discrete distribution. We present new modifications of the Benjamini-Hochberg procedure that incorporate the discrete structure of the data in an appropriate way. These new procedures are theoretically proved to control the false discovery rate (FDR) for any fixed number of null hypotheses. A strong point of our FDR controlling methodology is that it incorporates both the discreteness and the quantity of signal of the data at once (so-called "$\pi_0$-adaptation"). Finally, the power advantage of the new methods is demonstrated using both numerical experiments and real data sets.
  • PDF
    In this paper, we consider isotropic and stationary max-stable, inverse max-stable and max-mixture processes $X=(X(s))_{s\in\mathbb{R}^2}$ and the damage function $\mathcal{D}_X^{\nu}= |X|^\nu$ with $0<\nu<1/2$. We study the quantitative behavior of a risk measure which is the variance of the average of $\mathcal{D}_X^{\nu}$ over a region $\mathcal{A}\subset \mathbb{R}^2$. This kind of risk measure has already been introduced and studied for some max-stable processes in \cite{koch2015spatial}. In this study, we generalised this risk measure to be applicable to several models: asymptotic dependence represented by max-stable processes, asymptotic independence represented by inverse max-stable processes, and mixtures of the two. We evaluated the proposed risk measure in a simulation study.
  • PDF
    An image is here defined to be a set which is either open or closed, and an image transformation is structure preserving in the following sense: it corresponds to an algebra homomorphism for each singly generated algebra. The results extend parts of results of J.F. Aarnes on quasi-measures, -states, -homomorphisms, and image-transformations from the setting of compact Hausdorff spaces to locally compact Hausdorff spaces.
  • PDF
    This paper considers the problem of phase retrieval, where the goal is to recover a signal $z\in \mathbb{C}^n$ from the observations $y_i=|a_i^* z|$, $i=1,2,\cdots,m$. While many algorithms have been proposed, the alternating minimization algorithm has been one of the most commonly used methods, and it is very simple to implement. Existing work has proved that when the observation vectors $\{a_i\}_{i=1}^m$ are sampled from a complex Gaussian distribution $N(0, I)$, it recovers the underlying signal with a good initialization when $m=O(n)$, or with random initialization when $m=O(n^2)$, and has conjectured that random initialization succeeds with $m=O(n)$. This work proposes a modified alternating minimization method in a batch setting, and proves that when $m=O(n\log^{3}n)$, the proposed algorithm with random initialization recovers the underlying signal with high probability. The proof is based on the observation that after each iteration of alternating minimization, with high probability, the angle between the estimated signal and the underlying signal is reduced.
  • PDF
    Estimating a high-dimensional sparse covariance matrix from a limited number of samples is a fundamental problem in contemporary data analysis. Most proposals to date, however, are not robust to outliers or heavy tails. Towards bridging this gap, in this work we consider estimating a sparse shape matrix from $n$ samples following a possibly heavy-tailed elliptical distribution. We propose estimators based on thresholding either Tyler's M-estimator or its regularized variant. We derive bounds on the difference in spectral norm between our estimators and the shape matrix in the joint limit as the dimension $p$ and sample size $n$ tend to infinity with $p/n\to\gamma>0$. These bounds are minimax rate-optimal. Results on simulated data support our theoretical analysis.
  • PDF
    We define causal estimands for experiments on single time series, extending the potential outcome framework to deal with temporal data. Our approach allows the estimation of some of these estimands and exact randomization-based p-values for testing causal effects, without imposing stringent assumptions. We test our methodology on simulated "potential autoregressions," which have a causal interpretation. Our methodology is partially inspired by data from a large number of experiments carried out by a financial company that compared the impact of two different ways of trading equity futures contracts. We use our methodology to make causal statements about their trading methods.
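
For orientation, the classical Benjamini-Hochberg step-up procedure, which the discrete-testing abstract above takes as its starting point, can be sketched as follows. This is the standard textbook procedure only, not the discrete or $\pi_0$-adaptive modifications proposed in that paper; the function name `benjamini_hochberg` and the default level `alpha=0.05` are illustrative assumptions.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Classical BH step-up procedure (illustrative sketch).

    Returns a list of booleans, True where the null hypothesis is rejected.
    The name and default alpha are assumptions made for this example.
    """
    m = len(p_values)
    if m == 0:
        return []

    # Indices of the hypotheses, ordered by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Largest 1-based rank k such that p_(k) <= alpha * k / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            k_max = rank

    # Reject every hypothesis whose p-value has rank at most k_max.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(pvals, alpha=0.05))
```

Because the procedure is step-up, a p-value that exceeds its own threshold is still rejected when a larger p-value meets its threshold; with discrete test statistics the thresholds are typically not attained exactly, which is the conservativeness that discreteness-aware modifications aim to reduce.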

Recent comments

Alessandro Dec 09 2015 01:12 UTC

Hey, I've already seen this title! http://arxiv.org/abs/1307.0401

Richard Kueng Mar 08 2015 22:02 UTC

Neither, Frédéric! Replacing fidelity by superfidelity still requires optimizing over all density matrices. However, the Birkhoff-von Neumann Theorem (see Lemma 1) allows for further restricting this optimization to n scalar variables w.l.o.g.---Theorem 2. Arguably, this greatly simplifies the geome

...(continued)
Frédéric Grosshans Mar 05 2015 11:31 UTC

I fell for that clickbait title and read the paper. I still don’t get why von Neumann didn't want us to know about this weird trick? And which weird trick? The use of superfidelity or the use of non-physical density matrices like $\sigma^\sharp$?

Noon van der Silk Mar 03 2015 03:20 UTC

I took the liberty of uploading the IPython notebook as a github [gist](https://gist.github.com), so it's viewable [here](http://nbviewer.ipython.org/urls/gist.githubusercontent.com/silky/b14fa42c6d5475a3a724/raw/887c19fb04581f1a33f9d03370e4b7b3a33c2ea8/ferrie_kueng_bayes_est_fid.ipynb).