results for au:Jin_L in:cs

- Jul 14 2017 cs.CV arXiv:1707.03993v1Human action recognition in videos is one of the most challenging tasks in computer vision. One important issue is how to design discriminative features for representing spatial context and temporal dynamics. Here, we introduce a path signature feature to encode information from intra-frame and inter-frame contexts. A key step towards leveraging this feature is to construct the proper trajectories (paths) for the data steam. In each frame, the correlated constraints of human joints are treated as small paths, then the spatial path signature features are extracted from them. In video data, the evolution of these spatial features over time can also be regarded as paths from which the temporal path signature features are extracted. Eventually, all these features are concatenated to constitute the input vector of a fully connected neural network for action classification. Experimental results on four standard benchmark action datasets, J-HMDB, SBU Dataset, Berkeley MHAD, and NTURGB+D demonstrate that the proposed approach achieves state-of-the-art accuracy even in comparison with recent deep learning based models.
- May 22 2017 cs.CV arXiv:1705.06849v1Inspired by the great success of recurrent neural networks (RNNs) in sequential modeling, we introduce a novel RNN system to improve the performance of online signature verification. The training objective is to directly minimize intra-class variations and to push the distances between skilled forgeries and genuine samples above a given threshold. By back-propagating the training signals, our RNN network produced discriminative features with desired metrics. Additionally, we propose a novel descriptor, called the length-normalized path signature (LNPS), and apply it to online signature verification. LNPS has interesting properties, such as scale invariance and rotation invariance after linear combination, and shows promising results in online signature verification. Experiments on the publicly available SVC-2004 dataset yielded state-of-the-art performance of 2.37% equal error rate (EER).
- May 16 2017 cs.CV arXiv:1705.05207v1Currently, owing to the ubiquity of mobile devices, online handwritten Chinese character recognition (HCCR) has become one of the suitable choice for feeding input to cell phones and tablet devices. Over the past few years, larger and deeper convolutional neural networks (CNNs) have extensively been employed for improving character recognition performance. However, its substantial storage requirement is a significant obstacle in deploying such networks into portable electronic devices. To circumvent this problem, we propose a novel technique called DropWeight for pruning redundant connections in the CNN architecture. It is revealed that the proposed method not only treats streamlined architectures such as AlexNet and VGGNet well but also exhibits remarkable performance for deep residual network and inception network. We also demonstrate that global pooling is a better choice for building very compact online HCCR systems. Experiments were performed on the ICDAR-2013 online HCCR competition dataset using our proposed network, and it is found that the proposed approach requires only 0.57 MB for storage, whereas state-of-the-art CNN-based methods require up to 135 MB; meanwhile the performance is decreased only by 0.91%.
- We study unsupervised learning by developing introspective generative modeling (IGM) that attains a generator using progressively learned deep convolutional neural networks. The generator is itself a discriminator, capable of introspection: being able to self-evaluate the difference between its generated samples and the given training data. When followed by repeated discriminative learning, desirable properties of modern discriminative classifiers are directly inherited by the generator. IGM learns a cascade of CNN classifiers using a synthesis-by-classification algorithm. In the experiments, we observe encouraging results on a number of applications including texture modeling, artistic style transferring, face modeling, and semi-supervised learning.
- In this paper we propose introspective classifier learning (ICL) that emphasizes the importance of having a discriminative classifier empowered with generative capabilities. We develop a reclassification-by-synthesis algorithm to perform training using a formulation stemmed from the Bayes theory. Our classifier is able to iteratively: (1) synthesize pseudo-negative samples in the synthesis step; and (2) enhance itself by improving the classification in the reclassification step. The single classifier learned is at the same time generative --- being able to directly synthesize new samples within its own discriminative model. We conduct experiments on standard benchmark datasets including MNIST, CIFAR, and SVHN using state-of-the-art CNN architectures, and observe improved classification results.
- Mar 20 2017 cs.CV arXiv:1703.05870v2Chinese font recognition (CFR) has gained significant attention in recent years. However, due to the sparsity of labeled font samples and the structural complexity of Chinese characters, CFR is still a challenging task. In this paper, a DropRegion method is proposed to generate a large number of stochastic variant font samples whose local regions are selectively disrupted and an inception font network (IFN) with two additional convolutional neural network (CNN) structure elements, i.e., a cascaded cross-channel parametric pooling (CCCP) and global average pooling, is designed. Because the distribution of strokes in a font image is non-stationary, an elastic meshing technique that adaptively constructs a set of local regions with equalized information is developed. Thus, DropRegion is seamlessly embedded in the IFN, which enables end-to-end training; the proposed DropRegion-IFN can be used for high performance CFR. Experimental results have confirmed the effectiveness of our new approach for CFR.
- It was shown by Massey that linear complementary dual (LCD for short) codes are asymptotically good. In 2004, Sendrier proved that LCD codes meet the asymptotic Gilbert-Varshamov (GV for short) bound. Until now, the GV bound still remains to be the best asymptotical lower bound for LCD codes. In this paper, we show that an algebraic geometry code over a finite field of even characteristic is equivalent to an LCD code and consequently there exists a family of LCD codes that are equivalent to algebraic geometry codes and exceed the asymptotical GV bound.
- Mar 07 2017 cs.CV arXiv:1703.01425v1Detecting incidental scene text is a challenging task because of multi-orientation, perspective distortion, and variation of text size, color and scale. Retrospective research has only focused on using rectangular bounding box or horizontal sliding window to localize text, which may result in redundant background noise, unnecessary overlap or even information loss. To address these issues, we propose a new Convolutional Neural Networks (CNNs) based method, named Deep Matching Prior Network (DMPNet), to detect text with tighter quadrangle. First, we use quadrilateral sliding windows in several specific intermediate convolutional layers to roughly recall the text with higher overlapping area and then a shared Monte-Carlo method is proposed for fast and accurate computing of the polygonal areas. After that, we designed a sequential protocol for relative regression which can exactly predict text with compact quadrangle. Moreover, a auxiliary smooth Ln loss is also proposed for further regressing the position of text, which has better overall performance than L2 loss and smooth L1 loss in terms of robustness and stability. The effectiveness of our approach is evaluated on a public word-level, multi-oriented scene text database, ICDAR 2015 Robust Reading Competition Challenge 4 "Incidental scene text localization". The performance of our method is evaluated by using F-measure and found to be 70.64%, outperforming the existing state-of-the-art method with F-measure 63.76%.
- Feb 28 2017 cs.CV arXiv:1702.07975v1Like other problems in computer vision, offline handwritten Chinese character recognition (HCCR) has achieved impressive results using convolutional neural network (CNN)-based methods. However, larger and deeper networks are needed to deliver state-of-the-art results in this domain. Such networks intuitively appear to incur high computational cost, and require the storage of a large number of parameters, which renders them unfeasible for deployment in portable devices. To solve this problem, we propose a Global Supervised Low-rank Expansion (GSLRE) method and an Adaptive Drop-weight (ADW) technique to solve the problems of speed and storage capacity. We design a nine-layer CNN for HCCR consisting of 3,755 classes, and devise an algorithm that can reduce the networks computational cost by nine times and compress the network to 1/18 of the original size of the baseline model, with only a 0.21% drop in accuracy. In tests, the proposed algorithm surpassed the best single-network performance reported thus far in the literature while requiring only 2.3 MB for storage. Furthermore, when integrated with our effective forward implementation, the recognition of an offline character image took only 9.7 ms on a CPU. Compared with the state-of-the-art CNN model for HCCR, our approach is approximately 30 times faster, yet 10 times more cost efficient.
- Feb 27 2017 cs.CV arXiv:1702.07508v1This paper presents an investigation of several techniques that increase the accuracy of online handwritten Chinese character recognition (HCCR). We propose a new training strategy named DropDistortion to train a deep convolutional neural network (DCNN) with distorted samples. DropDistortion gradually lowers the degree of character distortion during training, which allows the DCNN to better generalize. Path signature is used to extract effective features for online characters. Further improvement is achieved by employing spatial stochastic max-pooling as a method of feature map distortion and model averaging. Experiments were carried out on three publicly available datasets, namely CASIA-OLHWDB 1.0, CASIA-OLHWDB 1.1, and the ICDAR2013 online HCCR competition dataset. The proposed techniques yield state-of-the-art recognition accuracies of 97.67%, 97.30%, and 97.99%, respectively.
- Nov 29 2016 cs.CV arXiv:1611.08991v2Instance segmentation has attracted recent attention in computer vision and existing methods in this domain mostly have an object detection stage. In this paper, we study the intrinsic challenge of the instance segmentation problem, the presence of a quotient space (swapping the labels of different instances leads to the same result), and propose new methods that are object proposal- and object detection- free. We propose three alternative methods, namely pixel-based affinity mapping, superpixel-based affinity learning, and boundary-based component segmentation, all focusing on performing labeling transformations to cope with the quotient space problem. By adopting fully convolutional neural networks (FCN) like models, our framework attains competitive results on both the PASCAL dataset (object-centric) and the Gland dataset (texture-centric), which the existing methods are not able to do. Our work also has the advantages in its transparency, simplicity, and being all segmentation based.
- In 2011, Guruswami-HÃ¥stad-Kopparty \citeGru showed that the list-decodability of random linear codes is as good as that of general random codes. In the present paper, we further strengthen the result by showing that the list-decodability of random \it Euclidean self-orthogonal codes is as good as that of general random codes as well, i.e., achieves the classical Gilbert-Varshamov bound. Specifically, we show that, for any fixed finite field $\F_q$, error fraction $\delta\in (0,1-1/q)$ satisfying $1-H_q(\delta)\le \frac12$ and small $\epsilon>0$, with high probability a random Euclidean self-orthogonal code over $\F_q$ of rate $1-H_q(\delta)-\epsilon$ is $(\delta, O(1/\epsilon))$-list-decodable. This generalizes the result of linear codes to Euclidean self-orthogonal codes. In addition, we extend the result to list decoding \it symplectic dual-containing codes by showing that the list-decodability of random symplectic dual-containing codes achieves the quantum Gilbert-Varshamov bound as well. This implies that list-decodability of quantum stabilizer codes can achieve the quantum Gilbert-Varshamov bound. The counting argument on self-orthogonal codes is an important ingredient to prove our result.
- Oct 11 2016 cs.CV arXiv:1610.02616v2Online handwritten Chinese text recognition (OHCTR) is a challenging problem as it involves a large-scale character set, ambiguous segmentation, and variable-length input sequences. In this paper, we exploit the outstanding capability of path signature to translate online pen-tip trajectories into informative signature feature maps using a sliding window-based method, successfully capturing the analytic and geometric properties of pen strokes with strong local invariance and robustness. A multi-spatial-context fully convolutional recurrent network (MCFCRN) is proposed to exploit the multiple spatial contexts from the signature feature maps and generate a prediction sequence while completely avoiding the difficult segmentation problem. Furthermore, an implicit language model is developed to make predictions based on semantic context within a predicting feature sequence, providing a new perspective for incorporating lexicon constraints and prior knowledge about a certain language in the recognition procedure. Experiments on two standard benchmarks, Dataset-CASIA and Dataset-ICDAR, yielded outstanding results, with correct rates of 97.10% and 97.15%, respectively, which are significantly better than the best result reported thus far in the literature.
- May 25 2016 cs.CV arXiv:1605.07314v1In this paper, we develop a novel unified framework called DeepText for text region proposal generation and text detection in natural images via a fully convolutional neural network (CNN). First, we propose the inception region proposal network (Inception-RPN) and design a set of text characteristic prior bounding boxes to achieve high word recall with only hundred level candidate proposals. Next, we present a powerful textdetection network that embeds ambiguous text category (ATC) information and multilevel region-of-interest pooling (MLRP) for text and non-text classification and accurate localization. Finally, we apply an iterative bounding box voting scheme to pursue high recall in a complementary manner and introduce a filtering algorithm to retain the most suitable bounding box, while removing redundant inner and outer boxes for each text instance. Our approach achieves an F-measure of 0.83 and 0.85 on the ICDAR 2011 and 2013 robust text detection benchmarks, outperforming previous state-of-the-art results.
- Apr 19 2016 cs.CV arXiv:1604.04953v1This paper proposes an end-to-end framework, namely fully convolutional recurrent network (FCRN) for handwritten Chinese text recognition (HCTR). Unlike traditional methods that rely heavily on segmentation, our FCRN is trained with online text data directly and learns to associate the pen-tip trajectory with a sequence of characters. FCRN consists of four parts: a path-signature layer to extract signature features from the input pen-tip trajectory, a fully convolutional network to learn informative representation, a sequence modeling layer to make per-frame predictions on the input sequence and a transcription layer to translate the predictions into a label sequence. The FCRN is end-to-end trainable in contrast to conventional methods whose components are separately trained and tuned. We also present a refined beam search method that efficiently integrates the language model to decode the FCRN and significantly improve the recognition results. We evaluate the performance of the proposed method on the test sets from the databases CASIA-OLHWDB and ICDAR 2013 Chinese handwriting recognition competition, and both achieve state-of-the-art performance with correct rates of 96.40% and 95.00%, respectively.
- We consider a piecewise-deterministic queueing (PDQ) model to study traffic queues due to stochastic capacity fluctuations in transportation facilities. The saturation rate (capacity) of the PDQ model switches between a finite set of values (modes) according to a Markov chain. The inflow to the PDQ is controlled by a state-feedback policy. The main results of this article are stability conditions of PDQs, i.e. conditions under which the distribution of the queue length converges to a unique invariant probability measure. On one hand, a necessary condition for stability is that the average inflow does not exceed the average saturation rate. On the other hand, based on the Foster-Lyapunov criteria, we derive a sufficient condition that requires a bilinear matrix inequality to admit positive solutions and the invariant probability measure to be unique. We also study the rate of convergence for stable PDQs. Furthermore, for PDQs with two modes, a necessary and sufficient condition for stability is available. In addition, we present examples for the stability analysis of feedback control policies for single PDQs as well as a network of two PDQ links in parallel.
- Feb 16 2016 cs.CV arXiv:1602.04348v1Maximally stable extremal regions (MSER), which is a popular method to generate character proposals/candidates, has shown superior performance in scene text detection. However, the pixel-level operation limits its capability for handling some challenging cases (e.g., multiple connected characters, separated parts of one character and non-uniform illumination). To better tackle these cases, we design a character proposal network (CPN) by taking advantage of the high capacity and fast computing of fully convolutional network (FCN). Specifically, the network simultaneously predicts characterness scores and refines the corresponding locations. The characterness scores can be used for proposal ranking to reject non-character proposals and the refining process aims to obtain the more accurate locations. Furthermore, considering the situation that different characters have different aspect ratios, we propose a multi-template strategy, designing a refiner for each aspect ratio. The extensive experiments indicate our method achieves recall rates of 93.88%, 93.60% and 96.46% on ICDAR 2013, SVT and Chinese2k datasets respectively using less than 1000 proposals, demonstrating promising performance of our character proposal network.
- Both MDS and Euclidean self-dual codes have theoretical and practical importance and the study of MDS self-dual codes has attracted lots of attention in recent years. In particular, determining existence of $q$-ary MDS self-dual codes for various lengths has been investigated extensively. The problem is completely solved for the case where $q$ is even. The current paper focuses on the case where $q$ is odd. We construct a few classes of new MDS self-dual code through generalized Reed-Solomon codes. More precisely, we show that for any given even length $n$ we have a $q$-ary MDS code as long as $q\equiv1\bmod{4}$ and $q$ is sufficiently large (say $q\ge 2^n\times n^2)$. Furthermore, we prove that there exists a $q$-ary MDS self-dual code of length $n$ if $q=r^2$ and $n$ satisfies one of the three conditions: (i) $n\le r$ and $n$ is even; (ii) $q$ is odd and $n-1$ is an odd divisor of $q-1$; (iii) $r\equiv3\mod{4}$ and $n=2tr$ for any $t\le (r-1)/2$.
- A pure quantum state is called $k$-uniform if all its reductions to $k$-qudit are maximally mixed. We investigate the general constructions of $k$-uniform pure quantum states of $n$ subsystems with $d$ levels. We provide one construction via symmetric matrices and the second one through classical error-correcting codes. There are three main results arising from our constructions. Firstly, we show that for any given even $n\ge 2$, there always exists an $n/2$-uniform $n$-qudit quantum state of level $p$ for sufficiently large prime $p$. Secondly, both constructions show that their exist $k$-uniform $n$-qudit pure quantum states such that $k$ is proportional to $n$, i.e., $k=\Omega(n)$ although the construction from symmetric matrices outperforms the one by error-correcting codes. Thirdly, our symmetric matrix construction provides a positive answer to the open question in \citeDA on whether there exists $3$-uniform $n$-qudit pure quantum state for all $n\ge 8$. In fact, we can further prove that, for every $k$, there exists a constant $M_k$ such that there exists a $k$-uniform $n$-qudit quantum state for all $n\ge M_k$. In addition, by using concatenation of algebraic geometry codes, we give an explicit construction of $k$-uniform quantum state when $k$ tends to infinity.
- Nov 10 2015 cs.CV arXiv:1511.02465v1This paper proposes a deep leaning method to address the challenging facial attractiveness prediction problem. The method constructs a convolutional neural network of facial beauty prediction using a new deep cascaded fine-turning scheme with various face inputting channels, such as the original RGB face image, the detail layer image, and the lighting layer image. With a carefully designed CNN model of deep structure, large input size and small convolutional kernels, we have achieved a high prediction correlation of 0.88. This result convinces us that the problem of facial attractiveness prediction can be solved by deep learning approach, and it also shows the important roles of the facial smoothness, lightness, and color information that were involved in facial beauty perception, which is consistent with the result of recent psychology studies. Furthermore, we analyze the high-level features learnt by CNN through visualization of its hidden layers, and some interesting phenomena were observed. It is found that the contours and appearance of facial features, especially eyes and moth, are the most significant facial attributes for facial attractiveness prediction, which is also consistent with the visual perception intuition of human.
- Nov 10 2015 cs.CV arXiv:1511.02282v1We introduce a new pipeline for hand localization and fingertip detection. For RGB images captured from an egocentric vision mobile camera, hand and fingertip detection remains a challenging problem due to factors like background complexity and hand shape variety. To address these issues accurately and robustly, we build a large scale dataset named Ego-Fingertip and propose a bi-level cascaded pipeline of convolutional neural networks, namely, Attention-based Hand Detector as well as Multi-point Fingertip Detector. The proposed method significantly tackles challenges and achieves satisfactorily accurate prediction and real-time performance compared to previous hand and fingertip detection methods.
- Nov 10 2015 cs.CV arXiv:1511.02459v1In this paper, a novel face dataset with attractiveness ratings, namely, the SCUT-FBP dataset, is developed for automatic facial beauty perception. This dataset provides a benchmark to evaluate the performance of different methods for facial attractiveness prediction, including the state-of-the-art deep learning method. The SCUT-FBP dataset contains face portraits of 500 Asian female subjects with attractiveness ratings, all of which have been verified in terms of rating distribution, standard deviation, consistency, and self-consistency. Benchmark evaluations for facial attractiveness prediction were performed with different combinations of facial geometrical features and texture features using classical statistical learning methods and the deep learning method. The best Pearson correlation (0.8187) was achieved by the CNN model. Thus, the results of our experiments indicate that the SCUT-FBP dataset provides a reliable benchmark for facial beauty perception.
- Owing to the rapid growth of touchscreen mobile terminals and pen-based interfaces, handwriting-based writer identification systems are attracting increasing attention for personal authentication, digital forensics, and other applications. However, most studies on writer identification have not been satisfying because of the insufficiency of data and difficulty of designing good features under various conditions of handwritings. Hence, we introduce an end-to-end system, namely DeepWriterID, employed a deep convolutional neural network (CNN) to address these problems. A key feature of DeepWriterID is a new method we are proposing, called DropSegment. It designs to achieve data augmentation and improve the generalized applicability of CNN. For sufficient feature representation, we further introduce path signature feature maps to improve performance. Experiments were conducted on the NLPR handwriting database. Even though we only use pen-position information in the pen-down state of the given handwriting samples, we achieved new state-of-the-art identification rates of 95.72% for Chinese text and 98.51% for English text.
- The Reed-Muller (RM) code encoding $n$-variate degree-$d$ polynomials over ${\mathbb F}_q$ for $d < q$, with its evaluation on ${\mathbb F}_q^n$, has relative distance $1-d/q$ and can be list decoded from a $1-O(\sqrt{d/q})$ fraction of errors. In this work, for $d \ll q$, we give a length-efficient puncturing of such codes which (almost) retains the distance and list decodability properties of the Reed-Muller code, but has much better rate. Specificially, when $q =\Omega( d^2/\epsilon^2)$, we given an explicit rate $\Omega\left(\frac{\epsilon}{d!}\right)$ puncturing of Reed-Muller codes which have relative distance at least $(1-\epsilon)$ and efficient list decoding up to $(1-\sqrt{\epsilon})$ error fraction. This almost matches the performance of random puncturings which work with the weaker field size requirement $q= \Omega( d/\epsilon^2)$. We can also improve the field size requirement to the optimal (up to constant factors) $q =\Omega( d/\epsilon)$, at the expense of a worse list decoding radius of $1-\epsilon^{1/3}$ and rate $\Omega\left(\frac{\epsilon^2}{d!}\right)$. The first of the above trade-offs is obtained by substituting for the variables functions with carefully chosen pole orders from an algebraic function field; this leads to a puncturing for which the RM code is a subcode of a certain algebraic-geometric code (which is known to be efficiently list decodable). The second trade-off is obtained by concatenating this construction with a Reed-Solomon based multiplication friendly pair, and using the list recovery property of algebraic-geometric codes.
- This paper presents an open source tool for testing the recognition accuracy of Chinese handwriting input methods. The tool consists of two modules, namely the PC and Android mobile client. The PC client reads handwritten samples in the computer, and transfers them individually to the Android client in accordance with the socket communication protocol. After the Android client receives the data, it simulates the handwriting on screen of client device, and triggers the corresponding handwriting recognition method. The recognition accuracy is recorded by the Android client. We present the design principles and describe the implementation of the test platform. We construct several test datasets for evaluating different handwriting recognition systems, and conduct an objective and comprehensive test using six Chinese handwriting input methods with five datasets. The test results for the recognition accuracy are then compared and analyzed.
- May 29 2015 cs.CV arXiv:1505.07675v1Deep convolutional neural networks (DCNNs) have achieved great success in various computer vision and pattern recognition applications, including those for handwritten Chinese character recognition (HCCR). However, most current DCNN-based HCCR approaches treat the handwritten sample simply as an image bitmap, ignoring some vital domain-specific information that may be useful but that cannot be learnt by traditional networks. In this paper, we propose an enhancement of the DCNN approach to online HCCR by incorporating a variety of domain-specific knowledge, including deformation, non-linear normalization, imaginary strokes, path signature, and 8-directional features. Our contribution is twofold. First, these domain-specific technologies are investigated and integrated with a DCNN to form a composite network to achieve improved performance. Second, the resulting DCNNs with diversity in their domain knowledge are combined using a hybrid serial-parallel (HSP) strategy. Consequently, we achieve a promising accuracy of 97.20% and 96.87% on CASIA-OLHWDB1.0 and CASIA-OLHWDB1.1, respectively, outperforming the best results previously reported in the literature.
- May 26 2015 cs.CV arXiv:1505.06623v1In this paper, we present an effective method to analyze the recognition confidence of handwritten Chinese character, based on the softmax regression score of a high performance convolutional neural networks (CNN). Through careful and thorough statistics of 827,685 testing samples that randomly selected from total 8836 different classes of Chinese characters, we find that the confidence measurement based on CNN is an useful metric to know how reliable the recognition results are. Furthermore, we find by experiments that the recognition confidence can be used to find out similar and confusable character-pairs, to check wrongly or cursively written samples, and even to discover and correct mis-labelled samples. Many interesting observations and statistics are given and analyzed in this study.
- May 21 2015 cs.CV arXiv:1505.05354v1Inspired by the theory of Leitners learning box from the field of psychology, we propose DropSample, a new method for training deep convolutional neural networks (DCNNs), and apply it to large-scale online handwritten Chinese character recognition (HCCR). According to the principle of DropSample, each training sample is associated with a quota function that is dynamically adjusted on the basis of the classification confidence given by the DCNN softmax output. After a learning iteration, samples with low confidence will have a higher probability of being selected as training data in the next iteration; in contrast, well-trained and well-recognized samples with very high confidence will have a lower probability of being involved in the next training iteration and can be gradually eliminated. As a result, the learning process becomes more efficient as it progresses. Furthermore, we investigate the use of domain-specific knowledge to enhance the performance of DCNN by adding a domain knowledge layer before the traditional CNN. By adopting DropSample together with different types of domain-specific knowledge, the accuracy of HCCR can be improved efficiently. Experiments on the CASIA-OLHDWB 1.0, CASIA-OLHWDB 1.1, and ICDAR 2013 online HCCR competition datasets yield outstanding recognition rates of 97.33%, 97.06%, and 97.51% respectively, all of which are significantly better than the previous best results reported in the literature.
- May 21 2015 cs.CV arXiv:1505.04925v1Just like its great success in solving many computer vision problems, the convolutional neural networks (CNN) provided new end-to-end approach to handwritten Chinese character recognition (HCCR) with very promising results in recent years. However, previous CNNs so far proposed for HCCR were neither deep enough nor slim enough. We show in this paper that, a deeper architecture can benefit HCCR a lot to achieve higher performance, meanwhile can be designed with less parameters. We also show that the traditional feature extraction methods, such as Gabor or gradient feature maps, are still useful for enhancing the performance of CNN. We design a streamlined version of GoogLeNet [13], which was original proposed for image classification in recent years with very deep architecture, for HCCR (denoted as HCCR-GoogLeNet). The HCCR-GoogLeNet we used is 19 layers deep but involves with only 7.26 million parameters. Experiments were conducted using the ICDAR 2013 offline HCCR competition dataset. It has been shown that with the proper incorporation with traditional directional feature maps, the proposed single and ensemble HCCR-GoogLeNet models achieve new state of the art recognition accuracy of 96.35% and 96.74%, respectively, outperforming previous best result with significant gap.
- May 21 2015 cs.CV arXiv:1505.04922v1Most existing online writer-identification systems require that the text content is supplied in advance and rely on separately designed features and classifiers. The identifications are based on lines of text, entire paragraphs, or entire documents; however, these materials are not always available. In this paper, we introduce a path-signature feature to an end-to-end text-independent writer-identification system with a deep convolutional neural network (DCNN). Because deep models require a considerable amount of data to achieve good performance, we propose a data-augmentation method named DropStroke to enrich personal handwriting. Experiments were conducted on online handwritten Chinese characters from the CASIA-OLHWDB1.0 dataset, which consists of 3,866 classes from 420 writers. For each writer, we only used 200 samples for training and the remaining 3,666. The results reveal that the path-signature feature is useful for writer identification, and the proposed DropStroke technique enhances the generalization and significantly improves performance.
- To improve accuracy and speed of regressions and classifications, we present a data-based prediction method, Random Bits Regression (RBR). This method first generates a large number of random binary intermediate/derived features based on the original input matrix, and then performs regularized linear/logistic regression on those intermediate/derived features to predict the outcome. Benchmark analyses on a simulated dataset, UCI machine learning repository datasets and a GWAS dataset showed that RBR outperforms other popular methods in accuracy and robustness. RBR (available on https://sourceforge.net/projects/rbr/) is very fast and requires reasonable memories, therefore, provides a strong, robust and fast predictor in the big data era.
- Bayesian optimization is a powerful global optimization technique for expensive black-box functions. One of its shortcomings is that it requires auxiliary optimization of an acquisition function at each iteration. This auxiliary optimization can be costly and very hard to carry out in practice. Moreover, it creates serious theoretical concerns, as most of the convergence results assume that the exact optimum of the acquisition function can be found. In this paper, we introduce a new technique for efficient global optimization that combines Gaussian process confidence bounds and treed simultaneous optimistic optimization to eliminate the need for auxiliary optimization of acquisition functions. The experiments with global optimization benchmarks and a novel application to automatic information extraction demonstrate that the resulting technique is more efficient than the two approaches from which it draws inspiration. Unlike most theoretical analyses of Bayesian optimization with Gaussian processes, our finite-time convergence rate proofs do not require exact optimization of an acquisition function. That is, our approach eliminates the unsatisfactory assumption that a difficult, potentially NP-hard, problem has to be solved in order to obtain vanishing regret rates.
- Erasure list decoding was introduced to correct a larger number of erasures with output of a list of possible candidates. In the present paper, we consider both random linear codes and algebraic geometry codes for list decoding erasure errors. The contributions of this paper are two-fold. Firstly, we show that, for arbitrary $0<R<1$ and $\epsilon>0$ ($R$ and $\epsilon$ are independent), with high probability a random linear code is an erasure list decodable code with constant list size $2^{O(1/\epsilon)}$ that can correct a fraction $1-R-\epsilon$ of erasures, i.e., a random linear code achieves the information-theoretic optimal trade-off between information rate and fraction of erasure errors. Secondly, we show that algebraic geometry codes are good erasure list-decodable codes. Precisely speaking, for any $0<R<1$ and $\epsilon>0$, a $q$-ary algebraic geometry code of rate $R$ from the Garcia-Stichtenoth tower can correct $1-R-\frac{1}{\sqrt{q}-1}+\frac{1}{q}-\epsilon$ fraction of erasure errors with list size $O(1/\epsilon)$. This improves the Johnson bound applied to algebraic geometry codes. Furthermore, list decoding of these algebraic geometry codes can be implemented in polynomial time.
- It has been a great challenge to construct new quantum MDS codes. In particular, it is very hard to construct quantum MDS codes with relatively large minimum distance. So far, except for some sparse lengths, all known $q$-ary quantum MDS codes have minimum distance less than or equal to $q/2+1$. In the present paper, we provide a construction of quantum MDS codes with minimum distance bigger than $q/2+1$. In particular, we show existence of $q$-ary quantum MDS codes with length $n=q^2+1$ and minimum distance $d$ for any $d\le q+1$ (this result extends those given in \citeGu11,Jin1,KZ12); and with length $(q^2+2)/3$ and minimum distance $d$ for any $d\le (2q+2)/3$ if $3|(q+1)$. Our method is through Hermitian self-orthogonal codes. The main idea of constructing Hermitian self-orthogonal codes is based on the solvability in $\F_q$ of a system of homogenous equations over $\F_{q^2}$.
- A curve attaining the Hasse-Weil bound is called a maximal curve. Usually classical error-correcting codes obtained from a maximal curve have good parameters. However, the quantum stabilizer codes obtained from such classical error-correcting codes via Euclidean or Hermitian self-orthogonality do not always possess good parameters. In this paper, the Hermitian self-orthogonality of algebraic geometry codes obtained from two maximal curves is investigated. It turns out that the stabilizer quantum codes produced from such Hermitian self-orthogonal classical codes have good parameters.
- In an undirected social graph, a friendship link involves two users and the friendship is visible in both the users' friend lists. Such a dual visibility of the friendship may raise privacy threats. This is because both users can separately control the visibility of a friendship link to other users and their privacy policies for the link may not be consistent. Even if one of them conceals the link from a third user, the third user may find such a friendship link from another user's friend list. In addition, as most users allow their friends to see their friend lists in most social network systems, an adversary can exploit the inconsistent policies to launch privacy attacks to identify and infer many of a targeted user's friends. In this paper, we propose, analyze and evaluate such an attack which is called Friendship Identification and Inference (FII) attack. In a FII attack scenario, we assume that an adversary can only see his friend list and the friend lists of his friends who do not hide the friend lists from him. Then, a FII attack contains two attack steps: 1) friend identification and 2) friend inference. In the friend identification step, the adversary tries to identify a target's friends based on his friend list and those of his friends. In the friend inference step, the adversary attempts to infer the target's friends by using the proposed random walk with restart approach. We present experimental results using three real social network datasets and show that FII attacks are generally efficient and effective when adversaries and targets are friends or 2-distant neighbors. We also comprehensively analyze the attack results in order to find what values of parameters and network features could promote FII attacks. Currently, most popular social network systems with an undirected friendship graph, such as Facebook, LinkedIn and Foursquare, are susceptible to FII attacks.
- In the present paper, we show that if the dimension of an arbitrary algebraic geometry code over a finite field of even characters is slightly less than half of its length, then it is equivalent to an Euclidean self-orthogonal code. However, in the literatures, a strong contrition about existence of certain differential is required to obtain such a result. We also show a similar result on Hermitian self-orthogonal algebraic geometry codes. As a consequence, we can apply our result to quantum codes and obtain quantum codes with good asymptotic bounds.
- There have been various constructions of classical codes from polynomial valuations in literature \citeARC04, LNX01,LX04,XF04,XL00. In this paper, we present a construction of classical codes based on polynomial construction again. One of the features of this construction is that not only the classical codes arisen from the construction have good parameters, but also quantum codes with reasonably good parameters can be produced from these classical codes. In particular, some new quantum codes are constructed (see Examples \ref5.5 and \ref5.6).
- It is well known that quantum codes can be constructed through classical symplectic self-orthogonal codes. In this paper, we give a kind of Gilbert-Varshamov bound for symplectic self-orthogonal codes first and then obtain the Gilbert-Varshamov bound for quantum codes. The idea of obtaining the Gilbert-Varshamov bound for symplectic self-orthogonal codes follows from counting arguments.