results for au:Zhong_S in:cs

- Nov 21 2017 cs.MM arXiv:1711.07306v1 Deep learning based image steganalysis has attracted increasing attention in recent years. Several Convolutional Neural Network (CNN) models have been proposed and have achieved state-of-the-art performance in detecting steganography. In this paper, we explore an important technique in deep learning, batch normalization, for the task of image steganalysis. Unlike natural image classification, steganalysis must discriminate between cover images and stego images, which are the result of adding weak stego signals to covers. Because of this, a cover image is statistically more similar to its own stego than to other cover images, so steganalytic methods must use paired learning to extract effective features. Our theoretical analysis shows that a CNN model with multiple normalization layers is hard to generalize to new data in the test set when it is well trained with paired learning. To handle this difficulty, we propose a novel normalization technique called Shared Normalization (SN). Unlike the batch normalization layer, which uses the mini-batch mean and standard deviation to normalize each input batch, SN shares the same statistics across all training and test batches. Based on the proposed SN layer, we further propose a novel neural network model for image steganalysis. Extensive experiments demonstrate that the proposed network with SN layers is stable and detects state-of-the-art steganography with better performance than previous methods.
- The online sports gambling industry employs teams of data analysts to build forecast models that turn the odds at sports games in their favour. While several betting strategies have been proposed to beat bookmakers, from expert prediction models and arbitrage strategies to odds bias exploitation, their returns have been inconsistent and it remains to be shown that a betting strategy can outperform the online sports betting market. We designed a strategy to beat football bookmakers with their own numbers. Instead of building a forecasting model to compete with bookmakers' predictions, we exploited the probability information implicit in the odds publicly available in the marketplace to find bets with mispriced odds. Our strategy proved profitable in a 10-year historical simulation using closing odds, a 6-month historical simulation using minute-to-minute odds, and a 5-month period during which we staked real money with the bookmakers (we made code, data and models publicly available). Our results demonstrate that the football betting market is inefficient: bookmakers can be consistently beaten across thousands of games in both simulated environments and real-life betting. We provide a detailed description of our betting experience to illustrate how the sports gambling industry compensates for these market inefficiencies with discriminatory practices against successful clients.
- The finite-difference time-domain (FDTD) method is commonly used for the numerical solution of electromagnetic (EM) wave propagation through plasma media. However, the FDTD method can incur significantly longer run times for computationally large and complicated EM problems. Graphics Processing Unit (GPU) computing based on the Compute Unified Device Architecture (CUDA) has grown in response to the need to reduce run times. We present a CUDA-based FDTD method with the Runge-Kutta exponential time differencing (RKETD) scheme for unmagnetized plasma, implemented on a GPU. In the paper, we derive the RKETD-FDTD formulation for unmagnetized plasma in full and describe the detailed flowchart of the CUDA-implemented RKETD-FDTD method on a GPU. The accuracy and acceleration performance of the proposed CUDA-based RKETD-FDTD method are substantiated by a numerical experiment that simulates EM waves traveling through an unmagnetized plasma slab and compares the results with a CPU-only RKETD-FDTD method. The accuracy is validated by calculating the reflection and transmission coefficients for a one-dimensional unmagnetized plasma slab. A comparison of the elapsed times of the two methods shows that the GPU-based RKETD-FDTD method achieves better acceleration performance with sufficient accuracy.
- Sep 04 2017 cs.CV arXiv:1709.00192v1 Hyperspectral imaging, which provides abundant spatial and spectral information simultaneously, has attracted a lot of interest in recent years. Unfortunately, due to hardware limitations, the hyperspectral image (HSI) is vulnerable to various degradations, such as noise (random noise; HSI denoising), blur (Gaussian and uniform blur; HSI deblurring), and down-sampling (both spectral and spatial; HSI super-resolution). Previous HSI restoration methods are designed for one specific task only. Moreover, most of them start from 1-D vector or 2-D matrix models and cannot fully exploit the structural spectral-spatial correlation in the 3-D HSI. To overcome these limitations, we propose a unified low-rank tensor recovery model for comprehensive HSI restoration tasks, in which the non-local similarity between spectral-spatial cubes and the spectral correlation are simultaneously captured by third-order tensors. Further, to improve capability and flexibility, we formulate it as a weighted low-rank tensor recovery (WLRTR) model by treating the singular values differently, and we study its analytical solution. We also treat the stripe noise specific to HSI as a gross error by extending WLRTR to robust principal component analysis (WLRTR-RPCA). Extensive experiments demonstrate that the proposed WLRTR models consistently outperform state-of-the-art methods in typical low-level vision HSI tasks, including denoising, destriping, deblurring and super-resolution.
- Mar 20 2017 cs.CV arXiv:1703.05870v2 Chinese font recognition (CFR) has gained significant attention in recent years. However, due to the sparsity of labeled font samples and the structural complexity of Chinese characters, CFR is still a challenging task. In this paper, a DropRegion method is proposed to generate a large number of stochastic variant font samples whose local regions are selectively disrupted, and an inception font network (IFN) with two additional convolutional neural network (CNN) structure elements, i.e., a cascaded cross-channel parametric pooling (CCCP) and global average pooling, is designed. Because the distribution of strokes in a font image is non-stationary, an elastic meshing technique that adaptively constructs a set of local regions with equalized information is developed. Thus, DropRegion is seamlessly embedded in the IFN, which enables end-to-end training; the proposed DropRegion-IFN can be used for high-performance CFR. Experimental results have confirmed the effectiveness of our new approach for CFR.
- The cospark of a matrix is the cardinality of the sparsest vector in the column space of the matrix. Computing the cospark of a matrix is well known to be an NP-hard problem. Given the sparsity pattern (i.e., the locations of the non-zero entries) of a matrix, if the non-zero entries are drawn from independently distributed continuous probability distributions, we prove that the cospark of the matrix equals, with probability one, a particular number termed the generic cospark of the matrix. The generic cospark also equals the maximum cospark of matrices consistent with the given sparsity pattern. We prove that the generic cospark of a matrix can be computed in polynomial time, and offer an algorithm that achieves this.
- In this paper, we consider the energy-bandwidth allocation for a network of multiple users, where the transmitters, each powered by both an energy harvester and the conventional grid, access the network orthogonally on the assigned frequency band. We assume that the energy harvesting state and channel gain of each transmitter can be predicted for $K$ time slots a priori. The different transmitters can cooperate by donating energy to each other. The tradeoff among the weighted sum throughput, the use of grid energy, and the amount of energy cooperation is studied through an optimization objective which is a linear combination of these quantities. This leads to an optimization problem with $O(N^2K)$ constraints, where $N$ is the total number of transmitter-receiver pairs, and the optimization is over seven sets of variables that denote energy and bandwidth allocation, grid energy utilization, and energy cooperation. To solve the problem efficiently, an iterative algorithm is proposed using the Proximal Jacobian ADMM. The optimization sub-problems corresponding to the Proximal Jacobian ADMM steps are solved in closed form. We show that this algorithm converges to the optimal solution with an overall complexity of $O(N^2K^2)$. Numerical results show that the proposed algorithms can make efficient use of the harvested energy, grid energy, energy cooperation, and the available bandwidth.
- Integration between biology and information science benefits both fields. Many related models have been proposed, such as computational visual cognition models, computational motor control models, integrations of both and so on. In general, the robustness and precision of recognition is one of the key problems for object recognition models. In this paper, inspired by features of the human recognition process and their biological mechanisms, a new integrated and dynamic framework is proposed to mimic the semantic extraction, concept formation and feature re-selection in human visual processing. The main contributions of the proposed model are as follows: (1) Semantic feature extraction: local semantic features are learnt from episodic features that are extracted from raw images through a deep neural network; (2) Integrated concept formation: concepts are formed with local semantic information and structural information learnt through the network; (3) Feature re-selection: when ambiguity is detected during the recognition process, distinctive features are re-selected for recognition according to the difference between the ambiguous candidates. Experimental results on hand-written digit and facial shape datasets show that, compared with other methods, the proposed model exhibits higher robustness and precision for visual recognition, especially when input samples are semantically ambiguous. Meanwhile, the introduced biological mechanisms further strengthen the interaction between neuroscience and information science.
- Dec 08 2015 cs.NA arXiv:1512.02183v1 A major challenge of using AMI data in power system analysis is the large size of the data sets. For rapid analysis that addresses historical behavior of systems consisting of a few hundred feeders, all of the AMI load data can be loaded into memory and used in a power flow analysis. However, if a system contains thousands of feeders then the handling of the AMI data in the analysis becomes more challenging. The work here seeks to demonstrate that the information contained in large AMI data sets can be compressed into accurate load models using wavelets. Two types of wavelet-based load models are considered: the multi-resolution wavelet load model for each individual customer and the classified wavelet load model for customers that share similar load patterns. The multi-resolution wavelet load model compresses the data, and the classified wavelet load model further compresses the data. The method of grouping customers into classes using the wavelet-based classification technique is illustrated.
- May 25 2015 cs.CR arXiv:1505.05958v1 Motion sensors (e.g., accelerometers) on smartphones have been demonstrated to be a powerful side channel for attackers to spy on users' inputs on touchscreens. In this paper, we reveal another accelerometer-based attack which is particularly serious: when a person takes the metro, a malicious application on her smartphone can easily use accelerometer readings to trace her. We first propose a basic attack that can automatically extract metro-related data from a large amount of mixed accelerometer readings, and then use an ensemble interval classifier built from supervised learning to infer the riding intervals of the user. While this attack is very effective, the supervised learning part requires the attacker to collect labeled training data for each station interval, which is a significant amount of effort. To improve the efficiency of our attack, we further propose a semi-supervised learning approach, which only requires the attacker to collect labeled data for a very small number of station intervals with obvious characteristics. We conduct real experiments on a metro line in a major city. The results show that the inference accuracy could reach 89% and 92% if the user takes the metro for 4 and 6 stations, respectively.
- May 25 2015 cs.CR arXiv:1505.05960v1 Today's large-scale enterprise networks, data center networks, and wide area networks can be decomposed into multiple administrative or geographical domains. Domains may be owned by different administrative units or organizations. Hence protecting domain information is an important concern. Existing general-purpose Secure Multi-Party Computation (SMPC) methods that preserve privacy for domains are extremely slow for cross-domain routing problems. In this paper we present PYCRO, a cryptographic protocol specifically designed for privacy-preserving cross-domain routing optimization in Software Defined Networking (SDN) environments. PYCRO provides two fundamental routing functions, policy-compliant shortest path computing and bandwidth allocation, while ensuring strong protection for the private information of domains. We rigorously prove the privacy guarantee of our protocol. We have implemented a prototype system that runs PYCRO on servers in a campus network. Experimental results using real ISP network topologies show that PYCRO is very efficient in computation and communication costs.
- Since the advent of software defined networks (SDN), there have been many attempts to outsource complex and costly local network functionality, i.e. the middlebox, to the cloud in the same way as computation and storage are outsourced. Privacy issues, however, may weaken enterprises' willingness to adopt this innovation, since the underlying configurations of these middleboxes may leak crucial and confidential information that attackers can exploit. To address this new problem, we use the firewall as a sample functionality and propose the first privacy-preserving outsourcing framework and schemes in SDN. The basic technique that we exploit is a ground-breaking tool in cryptography, the cryptographic multilinear map. Whereas a naive approach would be infeasibly inefficient, we devise practical schemes that can outsource the middlebox as a blackbox after obfuscating it, such that the cloud provider can efficiently perform the same functionality without knowing its underlying private configurations. Both theoretical analysis and experiments on real-world firewall rules demonstrate that our schemes are secure, accurate, and practical.
- Sep 03 2014 cs.DS arXiv:1409.0706v2 We propose an efficient method for active particle selection for the Hermite Individual Time Steps (HITS) scheme in the direct N-body simulation code $\varphi$GRAPE. For a simulation with $N$ particles, this method can reduce the computational complexity of active particle selection from $O(N\cdot N_{step})$ to $O(\overline{N_{act}}\cdot N_{step})$, where $\overline{N_{act}}$ is the average number of active particles per time step, which is much smaller than $N$, and $N_{step}$ is the total number of time steps integrated during the simulation. This can save considerable time in the active particle selection part, especially when $\overline{N_{act}}$ is low.
- We apply several state-of-the-art techniques developed in recent advances in counting algorithms and statistical physics to study the spatial mixing property of two-dimensional codes arising from local hard (independent set) constraints, including hard-square, hard-hexagon, read/write isolated memory (RWIM), and non-attacking kings (NAK). For these constraints, strong spatial mixing would imply the existence of a polynomial-time approximation scheme (PTAS) for computing the capacity. Strong spatial mixing and a PTAS were previously known to exist for the hard-square constraint. We show the existence of strong spatial mixing for the hard-hexagon and RWIM constraints by establishing strong spatial mixing along self-avoiding walks, and consequently we give a PTAS for computing the capacities of these codes. We also show that for the NAK constraint, strong spatial mixing does not hold along self-avoiding walks.
- Compressive sensing is a newly emerging method in information technology that could impact array beamforming and the associated engineering applications. However, practical measurements are inevitably polluted by noise from external interference and the internal acquisition process. This work therefore studies compressive sensing based beamforming for noisy measurements characterized by a signal-to-noise ratio. In this article, we first introduce the fundamentals of compressive sensing theory. After that, we implement two algorithms (CSB-I and CSB-II). Both algorithms are intended for signals presumed to be spatially sparse and incoherent. The two algorithms were examined using a simple simulation case and a practical aeroacoustic test case. The simulation case clearly shows that the CSB-I algorithm is quite sensitive to the sensing noise. The CSB-II algorithm, on the other hand, is more robust to noisy measurements. The results from CSB-II at $\mathrm{SNR}=-10\,$dB are still reasonable, with good resolution and sidelobe rejection. Therefore, compressive sensing beamforming can be considered a promising array beamforming method for measurements that are inevitably contaminated by noise.
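
The last entry describes beamforming for signals presumed to be spatially sparse, but the listing does not specify how CSB-I or CSB-II work. The sketch below is therefore not either of those algorithms. It swaps in plain matching pursuit, a generic greedy sparse-recovery routine, to illustrate the underlying idea: the array measurement is explained by a few columns of a dictionary of steering vectors. The array geometry (uniform linear array, half-wavelength spacing), the direction grid, and all names (`steering_vector`, `matching_pursuit`, `NUM_MICS`, `GRID`) are assumptions made for illustration only.

```python
import cmath
import math


def steering_vector(num_mics, sin_theta):
    """Far-field steering vector of a uniform linear array with
    half-wavelength spacing (an assumed geometry, not from the abstract)."""
    return [cmath.exp(1j * math.pi * k * sin_theta) for k in range(num_mics)]


def inner(a, b):
    """Complex inner product a^H b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def matching_pursuit(dictionary, y, max_sources, tol=1e-9):
    """Greedy sparse recovery. Returns {grid_index: complex amplitude}."""
    residual = list(y)
    amplitudes = {}
    for _ in range(max_sources):
        # Stop once the residual is numerically zero.
        if math.sqrt(inner(residual, residual).real) < tol:
            break
        # Pick the candidate direction most correlated with the residual.
        scores = [abs(inner(atom, residual)) for atom in dictionary]
        best = max(range(len(dictionary)), key=scores.__getitem__)
        atom = dictionary[best]
        # Project the residual onto that atom and subtract its contribution.
        coeff = inner(atom, residual) / inner(atom, atom).real
        amplitudes[best] = amplitudes.get(best, 0) + coeff
        residual = [r - coeff * a for r, a in zip(residual, atom)]
    return amplitudes


NUM_MICS = 8
GRID = [-0.8 + 0.2 * i for i in range(9)]  # candidate sin(theta) values
DICTIONARY = [steering_vector(NUM_MICS, u) for u in GRID]

# Noise-free measurement from a single source placed on the grid.
TRUE_INDEX = 6
TRUE_AMPLITUDE = 2.0 + 1.0j
measurement = [TRUE_AMPLITUDE * a for a in DICTIONARY[TRUE_INDEX]]

estimate = matching_pursuit(DICTIONARY, measurement, max_sources=3)
```

In this noise-free, single-source case the first greedy step selects the true grid index and the residual vanishes, so the loop stops early. With noisy measurements the residual never reaches `tol`, and `max_sources` is what bounds the number of recovered directions; how well any particular algorithm copes with that noise is the question the abstract addresses.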