This paper explains why deep learning can generalize well, despite large capacity and possible algorithmic instability, nonrobustness, and sharp minima, effectively addressing an open problem in the literature. Based on our theoretical insight, this paper also proposes a family of new regularization methods. Its simplest member was empirically shown to improve base models and achieve state-of-the-art performance on MNIST and CIFAR-10 benchmarks. Moreover, this paper presents both data-dependent and data-independent generalization guarantees with improved convergence rates. Our results suggest several new open areas of research.
In this study, we systematically investigate the impact of class imbalance on classification performance of convolutional neural networks (CNNs) and compare frequently used methods to address the issue. Class imbalance is a common problem that has been comprehensively studied in classical machine learning, yet very limited systematic research is available in the context of deep learning. In our study, we use three benchmark datasets of increasing complexity, MNIST, CIFAR-10 and ImageNet, to investigate the effects of imbalance on classification and perform an extensive comparison of several methods to address the issue: oversampling, undersampling, two-phase training, and thresholding that compensates for prior class probabilities. Our main evaluation metric is area under the receiver operating characteristic curve (ROC AUC) adjusted to multi-class tasks since overall accuracy metric is associated with notable difficulties in the context of imbalanced data. Based on results from our experiments we conclude that (i) the effect of class imbalance on classification performance is detrimental; (ii) the method of addressing class imbalance that emerged as dominant in almost all analyzed scenarios was oversampling; (iii) oversampling should be applied to the level that totally eliminates the imbalance, whereas undersampling can perform better when the imbalance is only removed to some extent; (iv) as opposed to some classical machine learning models, oversampling does not necessarily cause overfitting of CNNs; (v) thresholding should be applied to compensate for prior class probabilities when overall number of properly classified cases is of interest.
Vector Quantization, VQ is a popular image compression technique with a simple decoding architecture and high compression ratio. Codebook designing is the most essential part in Vector Quantization. LindeBuzoGray, LBG is a traditional method of generation of VQ Codebook which results in lower PSNR value. A Codebook affects the quality of image compression, so the choice of an appropriate codebook is a must. Several optimization techniques have been proposed for global codebook generation to enhance the quality of image compression. In this paper, a novel algorithm called IDE-LBG is proposed which uses Improved Differential Evolution Algorithm coupled with LBG for generating optimum VQ Codebooks. The proposed IDE works better than the traditional DE with modifications in the scaling factor and the boundary control mechanism. The IDE generates better solutions by efficient exploration and exploitation of the search space. Then the best optimal solution obtained by the IDE is provided as the initial Codebook for the LBG. This approach produces an efficient Codebook with less computational time and the consequences include excellent PSNR values and superior quality reconstructed images. It is observed that the proposed IDE-LBG find better VQ Codebooks as compared to IPSO-LBG, BA-LBG and FA-LBG.
We introduce a graphical method originating from the computer graphics domain that is used for the arbitrary and intuitive placement of cells over a two-dimensional manifold. Using a bitmap image as input, where the color indicates the identity of the different structures and the alpha channel indicates the local cell density, this method guarantees a discrete distribution of cell position respecting the local density function. This method scales to any number of cells, allows to specify several different structures at once with arbitrary shapes and provides a scalable and versatile alternative to the more classical assumption of a uniform non-spatial distribution. Furthermore, several connection schemes can be derived from the paired distances between cells using either an automatic mapping or a user-defined local reference frame, providing new computational properties for the underlying model. The method is illustrated on a discrete homogeneous neural field, on the distribution of cones and rods in the retina and on a coronal view of the basal ganglia.