Statistics Theory (math.ST)

  • PDF
    In many biological, agricultural, and military activity problems, and in some quality control problems, it is almost impossible to have a fixed sample size, because some observations are always lost for various reasons. Therefore, the sample size itself is frequently considered to be a random variable (rv). The class of limit distribution functions (df's) of the random bivariate extreme generalized order statistics (GOS) from independent and identically distributed rv's is fully characterized. When the random sample size is assumed to be independent of the basic variables and its df is assumed to converge weakly to a non-degenerate limit, the necessary and sufficient conditions for the weak convergence of the random bivariate extreme GOS are obtained. Furthermore, when the interrelation of the random size and the basic rv's is not restricted, sufficient conditions for the convergence and the forms of the limit df's are deduced. Illustrative examples are given which lend further support to our theoretical results.
  • PDF
    We propose to estimate a metamodel and the sensitivity indices of a complex model m in the Gaussian regression framework. Our approach combines methods for sensitivity analysis of complex models and statistical tools for sparse non-parametric estimation in the multivariate Gaussian regression model. It rests on the construction of a metamodel for approximating the Hoeffding-Sobol decomposition of m. This metamodel belongs to a reproducing kernel Hilbert space constructed as a direct sum of Hilbert spaces leading to a functional ANOVA decomposition. The estimation of the metamodel is carried out via a penalized least-squares minimization that selects the subsets of variables contributing to the prediction of the output. It also yields estimates of the sensitivity indices of m. We establish an oracle-type inequality for the risk of the estimator, describe the procedure for estimating the metamodel and the sensitivity indices, and assess the performance of the procedure via a simulation study.
  • PDF
    In the present paper we propose and study estimators for a wide class of bivariate measures of concordance for copulas. These measures of concordance are generated by a copula and generalize Spearman's rho and Gini's gamma. In the case of Spearman's rho and Gini's gamma the estimators turn out to be the usual sample versions of these measures of concordance.
  • PDF
    We consider a sparse linear regression model $Y=X\beta^*+W$ where $X$ has Gaussian entries, $W$ is the noise vector with mean zero Gaussian entries, and $\beta^*$ is a binary vector with support size (sparsity) $k$. Using a novel conditional second moment method we obtain an approximation, tight up to a multiplicative constant, of the optimal squared error $\min_\beta\|Y-X\beta\|_2$, where the minimization is over all $k$-sparse binary vectors $\beta$. The approximation reveals interesting structural properties of the underlying regression problem. In particular, a) We establish that $n^*=2k\log p/\log(2k/\sigma^2+1)$ is a phase transition point with the following "all-or-nothing" property. When $n$ exceeds $n^*$, $(2k)^{-1}\|\beta_2-\beta^*\|_0\approx 0$, and when $n$ is below $n^*$, $(2k)^{-1}\|\beta_2-\beta^*\|_0\approx 1$, where $\beta_2$ is the optimal solution achieving the smallest squared error. With this we prove that $n^*$ is the asymptotic threshold for recovering $\beta^*$ information theoretically. b) We compute the squared error for an intermediate problem $\min_\beta\|Y-X\beta\|_2$ where the minimization is restricted to vectors $\beta$ with $\|\beta-\beta^*\|_0=2k\zeta$, for $\zeta\in[0,1]$. We show that a lower bound part $\Gamma(\zeta)$ of the estimate, which corresponds to the estimate based on the first moment method, undergoes a phase transition at three different thresholds, namely $n_{\text{inf},1}=\sigma^2\log p$, which is the information theoretic bound for recovering $\beta^*$ when $k=1$ and $\sigma$ is large, then at $n^*$ and finally at $n_{\text{LASSO/CS}}$. c) We establish a certain Overlap Gap Property (OGP) on the space of all binary vectors $\beta$ when $n\le ck\log p$ for a sufficiently small constant $c$. We conjecture that OGP is the source of algorithmic hardness of solving the minimization problem $\min_\beta\|Y-X\beta\|_2$ in the regime $n<n_{\text{LASSO/CS}}$.
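
The two closed-form thresholds quoted in the last abstract above, $n^*=2k\log p/\log(2k/\sigma^2+1)$ and $n_{\text{inf},1}=\sigma^2\log p$, can be evaluated directly. The sketch below is only an illustration of those formulas, not code from the paper: the function names `n_star` and `n_inf_1` and the parameter values are hypothetical, and the natural logarithm is assumed (this matters only for $n_{\text{inf},1}$, since $n^*$ is a ratio of logarithms and does not depend on the base).

```python
import math


def n_star(k: int, p: int, sigma2: float) -> float:
    """Evaluate n^* = 2k log p / log(2k / sigma^2 + 1) as quoted in the abstract.

    k is the sparsity, p the number of features, sigma2 the noise variance.
    """
    return 2 * k * math.log(p) / math.log(2 * k / sigma2 + 1)


def n_inf_1(p: int, sigma2: float) -> float:
    """Evaluate n_{inf,1} = sigma^2 log p (natural log is an assumption)."""
    return sigma2 * math.log(p)


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    k, p, sigma2 = 10, 10_000, 1.0
    print("n^*       =", n_star(k, p, sigma2))
    print("n_{inf,1} =", n_inf_1(p, sigma2))
```

The third threshold, $n_{\text{LASSO/CS}}$, is not given in closed form in the abstract, so it is left out of the sketch.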

Recent comments

Alessandro Dec 09 2015 01:12 UTC

Hey, I've already seen this title!

Richard Kueng Mar 08 2015 22:02 UTC

Neither, Frédéric! Replacing fidelity by superfidelity still requires optimizing over all density matrices. However, the Birkhoff-von Neumann Theorem (see Lemma 1) allows for further restricting this optimization to n scalar variables w.l.o.g.---Theorem 2. Arguably, this greatly simplifies the geome

Frédéric Grosshans Mar 05 2015 11:31 UTC

I fell for that clickbait title and read the paper. I still don’t get why von Neumann didn't want us to know about this weird trick? And which weird trick? The use of superfidelity or the use of non-physical density matrices like $\sigma^\sharp$?

Noon van der Silk Mar 03 2015 03:20 UTC

I took the liberty of uploading the IPython notebook as a github [gist](, so it's viewable [here](