results for au:Dai_B in:cs

- Jan 01 2018 cs.LG arXiv:1712.10285v1We revisit the Bellman optimality equation with Nesterov's smoothing technique and provide a unique saddle-point optimization perspective of the policy optimization problem in reinforcement learning based on Fenchel duality. A new reinforcement learning algorithm, called Smoothed Dual Embedding Control or SDEC, is derived to solve the saddle-point reformulation with arbitrary learnable function approximator. The algorithm bypasses the policy evaluation step in the policy optimization from a principled scheme and is extensible to integrate with multi-step bootstrapping and eligibility traces. We provide a PAC-learning bound on the number of samples needed from one single off-policy sample path, and also characterize the convergence of the algorithm. Finally, we show the algorithm compares favorably to the state-of-the-art baselines on several benchmark control problems.
- Jan 01 2018 cs.LG arXiv:1712.10282v1This paper proposes a new actor-critic-style algorithm called Dual Actor-Critic or Dual-AC. It is derived in a principled way from the Lagrangian dual form of the Bellman optimality equation, which can be viewed as a two-player game between the actor and a critic-like function, which is named as dual critic. Compared to its actor-critic relatives, Dual-AC has the desired property that the actor and dual critic are updated cooperatively to optimize the same objective function, providing a more transparent way for learning the critic that is directly related to the objective function of the actor. We then provide a concrete algorithm that can effectively solve the minimax optimization problem, using techniques of multi-step bootstrapping, path regularization, and stochastic dual ascent algorithm. We demonstrate that the proposed algorithm achieves the state-of-the-art performances across several benchmarks.
- Secure and reliable transmission over communication channels is first characterized by Wyner's wiretap channel (WTC) model. In recent years, the WTC with noiseless feedback (WTC-NF) attracts a lot attention, and it has been shown that the legal receiver's feedback enlarges the secrecy capacity of the WTC. Even so, the current feedback strategy for the WTC mainly focuses on generating key used for encrypting the transmitted message, and this strategy does not achieve the secrecy capacity of the WTC-NF. In this paper, a new coding strategy for the WTC-NF is presented. Besides used for generating key, the legal receiver's feedback of this new strategy is also used to produced a help information which helps the legal receiver to ameliorate his decoding performance. For the WTC-NF, the proposed new feedback strategy is shown to be better than the already existing one, and a binary example is given to further explain these results.
- In this paper, density evolution-based construction methods to design good polar codes on impulsive noise channels for single-carrier and multi-carrier systems are proposed and evaluated. For a single-carrier system, the tight bound of the block error probability (BLEP) is derived by applying density evolution and the performance of the proposed construction methods are compared. For the multi-carrier system employing orthogonal frequency-division multiplexing, the accurate BLEP estimation is not feasible so a tight lower bound on the BLEP for polar codes is derived by assuming the noise on each sub-carrier is Gaussian. The results show that the lower bound becomes tighter as the number of carriers increases.
- Convolution as inner product has been the founding basis of convolutional neural networks (CNNs) and the key to end-to-end visual representation learning. Benefiting from deeper architectures, recent CNNs have demonstrated increasingly strong representation abilities. Despite such improvement, the increased depth and larger parameter space have also led to challenges in properly training a network. In light of such challenges, we propose hyperspherical convolution (SphereConv), a novel learning framework that gives angular representations on hyperspheres. We introduce SphereNet, deep hyperspherical convolution networks that are distinct from conventional inner product based convolutional networks. In particular, SphereNet adopts SphereConv as its basic convolution operator and is supervised by generalized angular softmax loss - a natural loss formulation under SphereConv. We show that SphereNet can effectively encode discriminative representation and alleviate training difficulty, leading to easier optimization, faster convergence and comparable (even better) classification accuracy over convolutional counterparts. We also provide some theoretical insights for the advantages of learning on hyperspheres. In addition, we introduce the learnable SphereConv, i.e., a natural improvement over prefixed SphereConv, and SphereNorm, i.e., hyperspherical learning as a normalization method. Experiments have verified our conclusions.
- Nov 03 2017 cs.CV arXiv:1711.00677v1Selecting attractive photos from a human action shot sequence is quite challenging, because of the subjective nature of the "attractiveness", which is mainly a combined factor of human pose in action and the background. Prior works have actively studied high-level image attributes including interestingness, memorability, popularity, and aesthetics. However, none of them has ever studied the "attractiveness" of human action shot. In this paper, we present the first study of the "attractiveness" of human action shots by taking a systematic data-driven approach. Specifically, we create a new action-shot dataset composed of about 8000 high quality action-shot photos. We further conduct rich crowd-sourced human judge studies on Amazon Mechanical Turk(AMT) in terms of global attractiveness of a single photo, and relative attractiveness of a pair of photos. A deep Siamese network with a novel hybrid distribution matching loss was further proposed to fully exploit both types of ratings. Extensive experiments reveal that (1) the property of action shot attractiveness is subjective but predicable (2) our proposed method is both efficient and effective for predicting the attractive human action shots.
- In this paper, we make an important step towards the black-box machine teaching by considering the cross-space teaching setting, where the teacher and the learner use different feature representations and the teacher can not fully observe the learner's model. In such scenario, we study how the teacher is still able to teach the learner to achieve a faster convergence rate than the traditional passive learning. We propose an active teacher model that can actively query the learner (i.e., make the learner take exams) for estimating the learner's status, and provide the sample complexity for both teaching and query, respectively. In the experiments, we compare the proposed active teacher with the omniscient teacher and verify the effectiveness of the active teacher model.
- Oct 10 2017 cs.CV arXiv:1710.02534v1Image captioning, a popular topic in computer vision, has achieved substantial progress in recent years. However, the distinctiveness of natural descriptions is often overlooked in previous work. It is closely related to the quality of captions, as distinctive captions are more likely to describe images with their unique aspects. In this work, we propose a new learning method, Contrastive Learning (CL), for image captioning. Specifically, via two constraints formulated on top of a reference model, the proposed method can encourage distinctiveness, while maintaining the overall quality of the generated captions. We tested our method on two challenging datasets, where it improves the baseline model by significant margins. We also showed in our studies that the proposed method is generic and can be used for models with various structures.
- In this paper, the individual secrecy of two-way wiretap channel is investigated, where two legitimate users' messages are separately guaranteed secure against an external eavesdropper. For one thing, in some communication scenarios, the joint secrecy is impossible to achieve both positive secrecy rates of two users. For another, the individual secrecy satisfies the secrecy demand of many practical communication systems. Thus, firstly, an achievable secrecy rate region is derived for the general two-way wiretap channel with individual secrecy. In a deterministic channel, the region with individual secrecy is shown to be larger than that with joint secrecy.Secondly, outer bounds on the secrecy capacity region are obtained for the general two-way wiretap channel and for two classes of special two-way wiretap channels.The gap between inner and outer bounds on the secrecy capacity region is explored via the binary input two-way wiretap channels and the degraded Gaussian two-way wiretap. Most notably, the secrecy capacity regions are established for the XOR channel and the degraded Gaussian two-way wiretap channel.Furthermore, the secure sum-rate of the degraded Gaussian two-way wiretap channel under the individual secrecy constraint is demonstrated to be strictly larger than that under the joint secrecy constraint.
- Jun 19 2017 cs.LG arXiv:1706.05148v3Variational autoencoders (VAE) represent a popular, flexible form of deep generative model that can be stochastically fit to samples from a given random process using an information-theoretic variational bound on the true underlying distribution. Once so-obtained, the model can be putatively used to generate new samples from this distribution, or to provide a low-dimensional latent representation of existing samples. While quite effective in numerous application domains, certain important mechanisms which govern the behavior of the VAE are obfuscated by the intractable integrals and resulting stochastic approximations involved. Moreover, as a highly non-convex model, it remains unclear exactly how minima of the underlying energy relate to original design purposes. We attempt to better quantify these issues by analyzing a series of tractable special cases of increasing complexity. In doing so, we unveil interesting connections with more traditional dimensionality reduction models, as well as an intrinsic yet underappreciated propensity for robustly dismissing outliers when estimating latent manifolds. With respect to the latter, we demonstrate that the VAE can be viewed as the natural evolution of recent robust PCA models, capable of learning nonlinear manifolds obscured by gross corruptions. However, this previously unexplored feature comes with the cost of potential model collapse to a degenerate distribution that may be less suitable as the basis for generating new samples.
- In this paper, we consider the problem of machine teaching, the inverse problem of machine learning. Different from traditional machine teaching which views the learners as batch algorithms, we study a new paradigm where the learner uses an iterative algorithm and a teacher can feed examples sequentially and intelligently based on the current performance of the learner. We show that the teaching complexity in the iterative case is very different from that in the batch case. Instead of constructing a minimal training set for learners, our iterative machine teaching focuses on achieving fast convergence in the learner model. Depending on the level of information the teacher has from the learner model, we design teaching algorithms which can provably reduce the number of teaching examples and achieve faster convergence than learning without teachers. We also validate our theoretical findings with extensive experiments on different data distribution and real image datasets.
- Apr 12 2017 cs.CV arXiv:1704.03114v2Relationships among objects play a crucial role in image understanding. Despite the great success of deep learning techniques in recognizing individual objects, reasoning about the relationships among objects remains a challenging task. Previous methods often treat this as a classification problem, considering each type of relationship (e.g. "ride") or each distinct visual phrase (e.g. "person-ride-horse") as a category. Such approaches are faced with significant difficulties caused by the high diversity of visual appearance for each kind of relationships or the large number of distinct visual phrases. We propose an integrated framework to tackle this problem. At the heart of this framework is the Deep Relational Network, a novel formulation designed specifically for exploiting the statistical dependencies between objects and their relationships. On two large datasets, the proposed method achieves substantial improvement over state-of-the-art.
- Recently, the finite state Markov channel (FSMC) with an additional eavesdropper and delayed feedback from the legitimate receiver to the transmitter has been shown to be a useful model for the physical layer security of the practical mobile wireless communication systems. In this paper, we extend this model to a multiple-access situation (up-link of the wireless communication systems), which we call the finite state multiple-access wiretap channel (FS-MAC-WT) with delayed feedback. To be specific, the FS-MAC-WT is a channel with two inputs (transmitters) and two outputs (a legitimate receiver and an eavesdropper). The channel depends on a state which undergoes a Markov process, and the state is entirely known by the legitimate receiver and the eavesdropper. The legitimate receiver intends to send his channel output and the perfectly known state back to the transmitters through noiseless feedback channels after some time delay. The main contribution of this paper is to provide inner and outer bounds on the secrecy capacity regions of the FS-MAC-WT with delayed state feedback, and with or without delayed legitimate receiver's channel output feedback. The capacity results are further explained via a degraded Gaussian fading example, and from this example we see that sending the legitimate receiver's channel output back to the transmitters helps to enhance the achievable secrecy rate region of the FS-MAC-WT with only delayed state feedback.
- Mar 20 2017 cs.CV arXiv:1703.06029v3Despite the substantial progress in recent years, the image captioning techniques are still far from being perfect.Sentences produced by existing methods, e.g. those based on RNNs, are often overly rigid and lacking in variability. This issue is related to a learning principle widely used in practice, that is, to maximize the likelihood of training samples. This principle encourages high resemblance to the "ground-truth" captions while suppressing other reasonable descriptions. Conventional evaluation metrics, e.g. BLEU and METEOR, also favor such restrictive methods. In this paper, we explore an alternative approach, with the aim to improve the naturalness and diversity -- two essential properties of human expression. Specifically, we propose a new framework based on Conditional Generative Adversarial Networks (CGAN), which jointly learns a generator to produce descriptions conditioned on images and an evaluator to assess how well a description fits the visual content. It is noteworthy that training a sequence generator is nontrivial. We overcome the difficulty by Policy Gradient, a strategy stemming from Reinforcement Learning, which allows the generator to receive early feedback along the way. We tested our method on two large datasets, where it performed competitively against real people in our user study and outperformed other methods on various tasks.
- The physical layer security in the up-link of the wireless communication systems is often modeled as the multiple access wiretap channel (MAC-WT), and recently it has received a lot attention. In this paper, the MAC-WT has been re-visited by considering the situation that the legitimate receiver feeds his received channel output back to the transmitters via two noiseless channels, respectively. This model is called the MAC-WT with noiseless feedback. Inner and outer bounds on the secrecy capacity region of this feedback model are provided. To be specific, we first present a decode-and-forward (DF) inner bound on the secrecy capacity region of this feedback model, and this bound is constructed by allowing each transmitter to decode the other one's transmitted message from the feedback, and then each transmitter uses the decoded message to re-encode his own messages, i.e., this DF inner bound allows the independent transmitters to co-operate with each other. Then, we provide a hybrid inner bound which is strictly larger than the DF inner bound, and it is constructed by using the feedback as a tool not only to allow the independent transmitters to co-operate with each other, but also to generate two secret keys respectively shared between the legitimate receiver and the two transmitters. Finally, we give a sato-type outer bound on the secrecy capacity region of this feedback model. The results of this paper are further explained via a Gaussian example.
- Learning-based binary hashing has become a powerful paradigm for fast search and retrieval in massive databases. However, due to the requirement of discrete outputs for the hash functions, learning such functions is known to be very challenging. In addition, the objective functions adopted by existing hashing techniques are mostly chosen heuristically. In this paper, we propose a novel generative approach to learn hash functions through Minimum Description Length principle such that the learned hash codes maximally compress the dataset and can also be used to regenerate the inputs. We also develop an efficient learning algorithm based on the stochastic distributional gradient, which avoids the notorious difficulty caused by binary output constraints, to jointly optimize the parameters of the hash function and the associated generative model. Extensive experiments on a variety of large-scale datasets show that the proposed method achieves better retrieval results than the existing state-of-the-art methods.
- Many machine learning tasks, such as learning with invariance and policy evaluation in reinforcement learning, can be characterized as problems of learning from conditional distributions. In such problems, each sample $x$ itself is associated with a conditional distribution $p(z|x)$ represented by samples $\{z_i\}_{i=1}^M$, and the goal is to learn a function $f$ that links these conditional distributions to target values $y$. These learning problems become very challenging when we only have limited samples or in the extreme case only one sample from each conditional distribution. Commonly used approaches either assume that $z$ is independent of $x$, or require an overwhelmingly large samples from each conditional distribution. To address these challenges, we propose a novel approach which employs a new min-max reformulation of the learning from conditional distribution problem. With such new reformulation, we only need to deal with the joint distribution $p(z,x)$. We also design an efficient learning algorithm, Embedding-SGD, and establish theoretical sample complexity for such problems. Finally, our numerical experiments on both synthetic and real-world datasets show that the proposed approach can significantly improve over the existing algorithms.
- The finite state Markov channel (FSMC), where the channel transition probability is controlled by a state undergoing a Markov process, is a useful model for the mobile wireless communication channel. In this paper, we investigate the security issue in the mobile wireless communication systems by considering the FSMC with an eavesdropper, which we call the finite state Markov wiretap channel (FSM-WC). We assume that the state is perfectly known by the legitimate receiver and the eavesdropper, and through a noiseless feedback channel, the legitimate receiver sends his received channel output and the state back to the transmitter after some time delay. Inner and outer bounds on the capacity-equivocation regions of the FSM-WC with delayed state feedback and with or without delayed channel output feedback are provided in this paper, and we show that these bounds meet if the eavesdropper's received symbol is a degraded version of the legitimate receiver's. The above results are further explained via degraded Gaussian and Gaussian fading examples.
- Mar 18 2016 cs.LG arXiv:1603.05629v4Kernel classifiers and regressors designed for structured data, such as sequences, trees and graphs, have significantly advanced a number of interdisciplinary areas such as computational biology and drug design. Typically, kernels are designed beforehand for a data type which either exploit statistics of the structures or make use of probabilistic generative models, and then a discriminative classifier is learned based on the kernels via convex optimization. However, such an elegant two-stage approach also limited kernel methods from scaling up to millions of data points, and exploiting discriminative information to learn feature representations. We propose, structure2vec, an effective and scalable approach for structured data representation based on the idea of embedding latent variable models into feature spaces, and learning such feature spaces using discriminative information. Interestingly, structure2vec extracts features by performing a sequence of function mappings in a way similar to graphical model inference procedures, such as mean field and belief propagation. In applications involving millions of data points, we showed that structure2vec runs 2 times faster, produces models which are $10,000$ times smaller, while at the same time achieving the state-of-the-art predictive performance.
- This paper studies the energy efficiency of the cloud radio access network (C-RAN), specifically focusing on two fundamental and different downlink transmission strategies, namely the data-sharing strategy and the compression strategy. In the data-sharing strategy, the backhaul links connecting the central processor (CP) and the base-stations (BSs) are used to carry user messages -- each user's messages are sent to multiple BSs; the BSs locally form the beamforming vectors then cooperatively transmit the messages to the user. In the compression strategy, the user messages are precoded centrally at the CP, which forwards a compressed version of the analog beamformed signals to the BSs for cooperative transmission. This paper compares the energy efficiencies of the two strategies by formulating an optimization problem of minimizing the total network power consumption subject to user target rate constraints, where the total network power includes the BS transmission power, BS activation power, and load-dependent backhaul power. To tackle the discrete and nonconvex nature of the optimization problems, we utilize the techniques of reweighted $\ell_1$ minimization and successive convex approximation to devise provably convergent algorithms. Our main finding is that both the optimized data-sharing and compression strategies in C-RAN achieve much higher energy efficiency as compared to the non-optimized coordinated multi-point transmission, but their comparative effectiveness in energy saving depends on the user target rate. At low user target rate, data-sharing consumes less total power than compression, however, as the user target rate increases, the backhaul power consumption for data-sharing increases significantly leading to better energy efficiency of compression at the high user rate regime.
- Bayesian methods are appealing in their flexibility in modeling complex data and ability in capturing uncertainty in parameters. However, when Bayes' rule does not result in tractable closed-form, most approximate inference algorithms lack either scalability or rigorous guarantees. To tackle this challenge, we propose a simple yet provable algorithm, \emphParticle Mirror Descent (PMD), to iteratively approximate the posterior density. PMD is inspired by stochastic functional mirror descent where one descends in the density space using a small batch of data points at each iteration, and by particle filtering where one uses samples to approximate a function. We prove result of the first kind that, with $m$ particles, PMD provides a posterior density estimator that converges in terms of $KL$-divergence to the true posterior in rate $O(1/\sqrt{m})$. We demonstrate competitive empirical performances of PMD compared to several approximate inference algorithms in mixture models, logistic regression, sparse Gaussian processes and latent Dirichlet allocation on large scale datasets.
- This paper considers a downlink cloud radio access network (C-RAN) in which all the base-stations (BSs) are connected to a central computing cloud via digital backhaul links with finite capacities. Each user is associated with a user-centric cluster of BSs; the central processor shares the user's data with the BSs in the cluster, which then cooperatively serve the user through joint beamforming. Under this setup, this paper investigates the user scheduling, BS clustering and beamforming design problem from a network utility maximization perspective. Differing from previous works, this paper explicitly considers the per-BS backhaul capacity constraints. We formulate the network utility maximization problem for the downlink C-RAN under two different models depending on whether the BS clustering for each user is dynamic or static over different user scheduling time slots. In the former case, the user-centric BS cluster is dynamically optimized for each scheduled user along with the beamforming vector in each time-frequency slot, while in the latter case the user-centric BS cluster is fixed for each user and we jointly optimize the user scheduling and the beamforming vector to account for the backhaul constraints. In both cases, the nonconvex per-BS backhaul constraints are approximated using the reweighted l1-norm technique. This approximation allows us to reformulate the per-BS backhaul constraints into weighted per-BS power constraints and solve the weighted sum rate maximization problem through a generalized weighted minimum mean square error approach. This paper shows that the proposed dynamic clustering algorithm can achieve significant performance gain over existing naive clustering schemes. This paper also proposes two heuristic static clustering schemes that can already achieve a substantial portion of the gain.
- The general perception is that kernel methods are not scalable, and neural nets are the methods of choice for nonlinear learning problems. Or have we simply not tried hard enough for kernel methods? Here we propose an approach that scales up kernel methods using a novel concept called "doubly stochastic functional gradients". Our approach relies on the fact that many kernel methods can be expressed as convex optimization problems, and we solve the problems by making two unbiased stochastic approximations to the functional gradient, one using random training points and another using random functions associated with the kernel, and then descending using this noisy functional gradient. We show that a function produced by this procedure after $t$ iterations converges to the optimal function in the reproducing kernel Hilbert space in rate $O(1/t)$, and achieves a generalization performance of $O(1/\sqrt{t})$. This doubly stochasticity also allows us to avoid keeping the support vectors and to implement the algorithm in a small memory footprint, which is linear in number of iterations and independent of data dimension. Our approach can readily scale kernel methods up to the regimes which are dominated by neural nets. We show that our method can achieve competitive performance to neural nets in datasets such as 8 million handwritten digits from MNIST, 2.3 million energy materials from MolecularSpace, and 1 million photos from ImageNet.
- In this paper, we investigate the effects of an additional trusted relay node on the secrecy of multiple-access wiretap channel (MAC-WT) by considering the model of multiple-access relay wiretap channel (MARC-WT). More specifically, first, we investigate the discrete memoryless MARC-WT. Three inner bounds (with respect to decode-forward (DF), noise-forward (NF) and compress-forward (CF) strategies) on the secrecy capacity region are provided. Second, we investigate the degraded discrete memoryless MARC-WT, and present an outer bound on the secrecy capacity region of this degraded model. Finally, we investigate the Gaussian MARC-WT, and find that the NF and CF strategies help to enhance Tekin-Yener's achievable secrecy rate region of Gaussian MAC-WT. Moreover, we find that if the noise variance of the transmitters-relay channel is smaller than that of the transmitters-receiver channel, the DF strategy may also enhance Tekin-Yener's achievable secrecy rate region of Gaussian MAC-WT, and it may perform even better than the NF and CF strategies.
- The overhead of internal network monitoring motivates techniques of network tomography. Network coding (NC) presents a new opportunity for network tomography as NC introduces topology-dependent correlation that can be further exploited in topology estimation. Compared with traditional methods, network tomography with NC has many advantages such as the improvement of tomography accuracy and the reduction of complexity in choosing monitoring paths. In this paper we first introduce the problem of tomography with NC and then propose the taxonomy criteria to classify various methods. We also present existing solutions and future trend. We expect that our comprehensive review on network tomography with NC can serve as a good reference for researchers and practitioners working in the area.
- Software Defined Networking (SDN) is a revolutionary network architecture that separates out network control functions from the underlying equipment and is an increasingly trend to help enterprises build more manageable data centers where big data processing emerges as an important part of applications. To concurrently process large-scale data, MapReduce with an open source implementation named Hadoop is proposed. In practical Hadoop systems one kind of issue that vitally impacts the overall performance is know as the NP-complete minimum make span problem. One main solution is to assign tasks on data local nodes to avoid link occupation since network bandwidth is a scarce resource. Many methodologies for enhancing data locality are proposed such as the HDS and state-of-the-art scheduler BAR. However, all of them either ignore allocating tasks in a global view or disregard available bandwidth as the basis for scheduling. In this paper we propose a heuristic bandwidth-aware task scheduler BASS to combine Hadoop with SDN. It is not only able to guarantee data locality in a global view but also can efficiently assign tasks in an optimized way. Both examples and experiments demonstrate that BASS has the best performance in terms of job completion time. To our knowledge, BASS is the first to exploit talent of SDN for big data processing and we believe it points out a new trend for large-scale data processing.
- Given a hypothesis space, the large volume principle by Vladimir Vapnik prioritizes equivalence classes according to their volume in the hypothesis space. The volume approximation has hitherto been successfully applied to binary learning problems. In this paper, we extend it naturally to a more general definition which can be applied to several transductive problem settings, such as multi-class, multi-label and serendipitous learning. Even though the resultant learning method involves a non-convex optimization problem, the globally optimal solution is almost surely unique and can be obtained in O(n^3) time. We theoretically provide stability and error analyses for the proposed method, and then experimentally show that it is promising.
- We investigate the effects of an additional relay node on the secrecy of broadcast channels by considering the model of relay broadcast channels with confidential messages. We show that this additional relay node can increase the achievable secrecy rate region of the broadcast channels with confidential messages. More specifically, first, we investigate the discrete memoryless relay broadcast channels with two confidential messages and one common message. Three inner bounds (with respect to decode-forward, generalized noise-forward and compress-forward strategies) and an outer bound on the capacity-equivocation region are provided. Second, we investigate the discrete memoryless relay broadcast channels with two confidential messages. Inner and outer bounds on the capacity-equivocation region are provided. Finally, we investigate the discrete memoryless relay broadcast channels with one confidential message and one common message. Inner and outer bounds on the capacity-equivocation region are provided, and the results are further explained via a Gaussian example. Compared with Csiszar-Korner's work on broadcast channels with confidential messages (BCC), we find that with the help of the relay node, the secrecy capacity region of the Gaussian BCC is enhanced.
- Spectral methods have greatly advanced the estimation of latent variable models, generating a sequence of novel and efficient algorithms with strong theoretical guarantees. However, current spectral algorithms are largely restricted to mixtures of discrete or Gaussian distributions. In this paper, we propose a kernel method for learning multi-view latent variable models, allowing each mixture component to be nonparametric. The key idea of the method is to embed the joint distribution of a multi-view latent variable into a reproducing kernel Hilbert space, and then the latent parameters are recovered using a robust tensor power method. We establish that the sample complexity for the proposed method is quadratic in the number of latent components and is a low order polynomial in the other relevant parameters. Thus, our non-parametric tensor approach to learning latent variable models enjoys good sample and computational efficiencies. Moreover, the non-parametric tensor power method compares favorably to EM algorithm and other existing spectral algorithms in our experiments.
- Aug 12 2013 cs.NI arXiv:1308.2002v2A Peer-to-Peer (P2P) network can boost its performance if peers are provided with underlying network-layer routing topology. The task of inferring the network-layer routing topology and link performance from an end host to a set of other hosts is termed as network tomography, and it normally requires host computers to send probing messages. We design a passive network tomography method that does not requires any probing messages and takes a free ride over data flows in P2P networks. It infers routing topology based on end-to-end delay correlation estimation (DCE) without requiring any synchronization or cooperation from the intermediate routers. We implement and test our method in the real world Internet environment and achieved the accuracy of 92% in topology recovery. We also perform extensive simulation in OMNet++ to evaluate its performance over large scale networks, showing that its topology recovery accuracy is about 95% for large networks.
- Jul 20 2013 cs.NI arXiv:1307.5085v1Tomography is important for network design and routing optimization. Prior approaches require either precise time synchronization or complex cooperation. Furthermore, active tomography consumes explicit probeing resulting in limited scalability. To address the first issue we propose a novel Delay Correlation Estimation methodology named DCE with no need of synchronization and special cooperation. For the second issue we develop a passive realization mechanism merely using regular data flow without explicit bandwidth consumption. Extensive simulations in OMNeT++ are made to evaluate its accuracy where we show that DCE measured delay correlation is highly identical with the true value. Also from test result we find that mechanism of passive realization is able to achieve both regular data transmission and purpose of tomography with excellent robustness versus different background traffic and package size.
- May 01 2013 cs.CG arXiv:1304.7833v1We consider the problem of computing the time-convex hull of a point set under the general $L_p$ metric in the presence of a straight-line highway in the plane. The traveling speed along the highway is assumed to be faster than that off the highway, and the shortest time-path between a distant pair may involve traveling along the highway. The time-convex hull ${TCH}(P)$ of a point set $P$ is the smallest set containing both $P$ and \emphall shortest time-paths between any two points in ${TCH}(P)$. In this paper we give an algorithm that computes the time-convex hull under the $L_p$ metric in optimal $O(n\log n)$ time for a given set of $n$ points and a real number $p$ with $1\le p \le \infty$.
- Several security models of multiple-access channel (MAC) are investigated. First, we study the degraded MAC with confidential messages, where two users transmit their confidential messages (no common message) to a destination, and each user obtains a degraded version of the output of the MAC. Each user views the other user as a eavesdropper, and wishes to keep its confidential message as secret as possible from the other user. Measuring each user's uncertainty about the other user's confidential message by equivocation, the inner and outer bounds on the capacity-equivocation region for this model have been provided. The result is further explained via the binary and Gaussian examples. Second, the discrete memoryless multiple-access wiretap channel (MAC-WT) is studied, where two users transmit their corresponding confidential messages (no common message) to a legitimate receiver, while an additional wiretapper wishes to obtain the messages via a wiretap channel. This new model is considered into two cases: the general MAC-WT with cooperative encoders, and the degraded MAC-WT with non-cooperative encoders. The capacity-equivocation region is totally determined for the cooperative case, and inner and outer bounds on the capacity-equivocation region are provided for the non-cooperative case. For both cases, the results are further explained via the binary examples.
- Recent technological advances in wireless communications offer new opportunities and challenges for relay network.To enhance system performance, Demodulate-Network Coding (Dm-NC) scheme has been examined at relay node; it works directly to De-map the received signals and after that forward the mixture to the destination. Simulation analysis has been proven that the performance of Dm-NC has superiority over analog-NC. In addition, the Quantize-Decode-NC scheme (QDF-NC) has been introduced. The presented simulation results clearly provide that the QDF-NC perform better than analog-NC. The toggle between analogNC and QDF-NC is simulated in order to investigate delay and power consumption reduction at relay node.
- We propose a general information-theoretic approach called Seraph (SEmi-supervised metRic leArning Paradigm with Hyper-sparsity) for metric learning that does not rely upon the manifold assumption. Given the probability parameterized by a Mahalanobis distance, we maximize the entropy of that probability on labeled data and minimize it on unlabeled data following entropy regularization, which allows the supervised and unsupervised parts to be integrated in a natural and meaningful way. Furthermore, Seraph is regularized by encouraging a low-rank projection induced from the metric. The optimization of Seraph is solved efficiently and stably by an EM-like scheme with the analytical E-Step and convex M-Step. Experiments demonstrate that Seraph compares favorably with many well-known global and local metric learning methods.
- Gaussian processes (GPs) provide a nonparametric representation of functions. However, classical GP inference suffers from high computational cost and it is difficult to design nonstationary GP priors in practice. In this paper, we propose a sparse Gaussian process model, EigenGP, based on the Karhunen-Loeve (KL) expansion of a GP prior. We use the Nystrom approximation to obtain data dependent eigenfunctions and select these eigenfunctions by evidence maximization. This selection reduces the number of eigenfunctions in our model and provides a nonstationary covariance function. To handle nonlinear likelihoods, we develop an efficient expectation propagation (EP) inference algorithm, and couple it with expectation maximization for eigenfunction selection. Because the eigenfunctions of a Gaussian kernel are associated with clusters of samples - including both the labeled and unlabeled - selecting relevant eigenfunctions enables EigenGP to conduct semi-supervised learning. Our experimental results demonstrate improved predictive performance of EigenGP over alternative state-of-the-art sparse GP and semisupervised learning methods for regression, classification, and semisupervised classification.
- This paper presents a relay selection for decode-and-forward based on network coding (DF-NC) and analog-NC protocols in general scheme of cellular network system. In the propose scheme the two source node simultaneously transmit their own information to all the relays as well as the destination node, and then, a single relay i.e. best with a minimum symbol error rate (SER) will be selected to forward the new version of the received signal. Simulation results show that, the DF-NC scheme with considerable performance has exactness over analog-NC scheme. To improve the system performance, optimal power allocation between the two sources and the best relay is determined based on the asymptotic SER. By increasing the number of relays node, the optimum power allocation achieve better performance than asymptotic SER.
- In this paper, first, we investigate the model of degraded broadcast channel with side information and confidential messages. This work is from Steinberg's work on the degraded broadcast channel with causal and noncausal side information, and Csisz$\acute{a}$r-Körner's work on broadcast channel with confidential messages. Inner and outer bounds on the capacity-equivocation regions are provided for the noncausal and causal cases. Superposition coding and double-binning technique are used in the corresponding achievability proofs. Then, we investigate the degraded broadcast channel with side information, confidential messages and noiseless feedback. The noiseless feedback is from the non-degraded receiver to the channel encoder. Inner and outer bounds on the capacity-equivocation region are provided for the noncausal case, and the capacity-equivocation region is determined for the causal case. Compared with the model without feedback, we find that the noiseless feedback helps to enlarge the inner bounds for both causal and noncausal cases. In the achievability proof of the feedback model, the noiseless feedback is used as a secret key shared by the non-degraded receiver and the transmitter, and therefore, the code construction for the feedback model is a combination of superposition coding, Gel'fand-Pinsker's binning, block Markov coding and Ahlswede-Cai's secret key on the feedback system.
- We propose a general information-theoretic approach called Seraph (SEmi-supervised metRic leArning Paradigm with Hyper-sparsity) for metric learning that does not rely upon the manifold assumption. Given the probability parameterized by a Mahalanobis distance, we maximize the entropy of that probability on labeled data and minimize it on unlabeled data following entropy regularization, which allows the supervised and unsupervised parts to be integrated in a natural and meaningful way. Furthermore, Seraph is regularized by encouraging a low-rank projection induced from the metric. The optimization of Seraph is solved efficiently and stably by an EM-like scheme with the analytical E-Step and convex M-Step. Experiments demonstrate that Seraph compares favorably with many well-known global and local metric learning methods.
- This submission has been withdrawn by the author.