- This paper proposes a novel entropy encoding technique for lossless data compression. Representing a message string by its lexicographic index in the permutations of its symbols results in a compressed version matching Shannon entropy of the message. Commercial data compression standards make use of Huffman or arithmetic coding at some stage of the compression process. In the proposed method, like arithmetic coding entire string is mapped to an integer but is not based on fractional numbers. Unlike both arithmetic and Huffman coding no prior entropy model of the source is required. Simple intuitive algorithm based on multinomial coefficients is developed for entropy encoding that adoptively uses low number of bits for more frequent symbols. Correctness of the algorithm is demonstrated by an example.
- Mar 24 2017 math.OC arXiv:1703.08044v1We study a deep matrix factorization problem. It takes as input a matrix $X$ obtained by multiplying $K$ matrices (called factors). Each factor is obtained by applying a fixed linear operator to a short vector of parameters satisfying a model (for instance sparsity, grouped sparsity, non-negativity, constraints defining a convolution network\ldots). We call the problem deep or multi-layer because the number of factors is not limited. In the practical situations we have in mind, we can typically have $K=10$ or $100$. This work aims at identifying conditions on the structure of the model that guarantees the stable recovery of the factors from the knowledge of $X$ and the model for the factors.We provide necessary and sufficient conditions for the identifiability of the factors (up to a scale rearrangement). We also provide a necessary and sufficient condition called Deep Null Space Property (because of the analogy with the usual Null Space Property in the compressed sensing framework) which guarantees that even an inaccurate optimization algorithm for the factorization stably recovers the factors. We illustrate the theory with a practical example where the deep factorization is a convolutional network.
- This work addresses the recovery and demixing problem of signals that are sparse in some general dictionary. Involved applications include source separation, image inpainting, super-resolution, and restoration of signals corrupted by clipping, saturation, impulsive noise, or narrowband interference. We employ the $\ell_q$-norm ($0 \le q < 1$) for sparsity inducing and propose a constrained $\ell_q$-minimization formulation for the recovery and demixing problem. This nonconvex formulation is approximately solved by two efficient first-order algorithms based on proximal coordinate descent and alternative direction method of multipliers (ADMM), respectively. The new algorithms are convergent in the nonconvex case under some mild conditions and scale well for high-dimensional problems. A convergence condition of the new ADMM algorithm has been derived. Furthermore, extension of the two algorithms for multi-channels joint recovery has been presented, which can further exploit the joint sparsity pattern among multi-channel signals. Various numerical experiments showed that the new algorithms can achieve considerable performance gain over the $\ell_1$-regularized algorithms.
- Recently, research on accelerated stochastic gradient descent methods (e.g., SVRG) has made exciting progress (e.g., linear convergence for strongly convex problems). However, the best-known methods (e.g., Katyusha) requires at least two auxiliary variables and two momentum parameters. In this paper, we propose a fast stochastic variance reduction gradient (FSVRG) method, in which we design a novel update rule with the Nesterov's momentum and incorporate the technique of growing epoch size. FSVRG has only one auxiliary variable and one momentum weight, and thus it is much simpler and has much lower per-iteration complexity. We prove that FSVRG achieves linear convergence for strongly convex problems and the optimal $\mathcal{O}(1/T^2)$ convergence rate for non-strongly convex problems, where $T$ is the number of outer-iterations. We also extend FSVRG to directly solve the problems with non-smooth component functions, such as SVM. Finally, we empirically study the performance of FSVRG for solving various machine learning problems such as logistic regression, ridge regression, Lasso and SVM. Our results show that FSVRG outperforms the state-of-the-art stochastic methods, including Katyusha.
- Mar 24 2017 math.OC arXiv:1703.08167v1
- Mar 24 2017 math.CV arXiv:1703.08165v1
- Mar 24 2017 math.PR arXiv:1703.08163v1
- Mar 24 2017 math.NA arXiv:1703.08158v1
- Mar 24 2017 math.AP arXiv:1703.08154v1
- Mar 24 2017 math.NT arXiv:1703.08145v1
- Mar 24 2017 math.DG arXiv:1703.08143v1
- Mar 24 2017 math.AP arXiv:1703.08140v1
- Mar 24 2017 math.OA arXiv:1703.08134v1
- Mar 24 2017 math.CA arXiv:1703.08129v1
- Mar 24 2017 math.FA arXiv:1703.08128v1
- Mar 24 2017 math.CO arXiv:1703.08123v1
- Mar 24 2017 math.GR arXiv:1703.08121v1
- Mar 24 2017 math.AP arXiv:1703.08103v1
- Mar 24 2017 math.AG arXiv:1703.08086v1
- Mar 24 2017 math.NA arXiv:1703.08079v1
- Mar 24 2017 math.PR arXiv:1703.08078v1
- Mar 24 2017 math.AP arXiv:1703.08064v1
- Mar 24 2017 math.PR arXiv:1703.08058v1
- Mar 24 2017 math.PR arXiv:1703.08057v1
- Mar 24 2017 math.RT arXiv:1703.08048v1
- Mar 24 2017 math.PR arXiv:1703.08035v1
- Mar 24 2017 math.NT arXiv:1703.08032v1
- Mar 24 2017 math.AP arXiv:1703.08028v1