results for au:Yang_S in:cs

- Feb 08 2018 cs.DB arXiv:1802.02301v2Usually, game companies avoid sharing their game data with external researchers and only a few number of research groups were granted limited access to gam data so far. Such reluctance of the companies to make data publicly available closes doors on the wide use and development of the data mining techniques and AI research specific to the game industry. In this work, we propose an international competition on game data mining using the commercial game log data from one of the major game companies in Korea: NCSOFT. It provides opportunities to many researchers who wish to develop and apply state-of-the-art data mining techniques to game log data by making the data open. The data has been collected from Blade & Soul, Action Role Playing Game, from NCSoft. The data comprises of approximately 100GB of game logs from 10,000 players. The main aim of the competition was to predict whether a player would churn and when the player would churn during two different periods between which its business model was changed to the free-to-play model from monthly fixed charged one. The final result of the competition reveals that the highly ranked competitors used deep learning, tree boosting, and linear regression.
- Jan 29 2018 cs.AI arXiv:1801.08650v1An intelligent robot agent based on domain ontology, machine learning mechanism, and Fuzzy Markup Language (FML) for students and robot co-learning is presented in this paper. The machine-human co-learning model is established to help various students learn the mathematical concepts based on their learning ability and performance. Meanwhile, the robot acts as a teacher's assistant to co-learn with children in the class. The FML-based knowledge base and rule base are embedded in the robot so that the teachers can get feedback from the robot on whether students make progress or not. Next, we inferred students' learning performance based on learning content's difficulty and students' ability, concentration level, as well as teamwork sprit in the class. Experimental results show that learning with the robot is helpful for disadvantaged and below-basic children. Moreover, the accuracy of the intelligent FML-based agent for student learning is increased after machine learning mechanism.
- Considered in this short note is the design of output layer nodes of feedforward neural networks for solving multi-class classification problems with r (bigger than or equal to 3) classes of samples. The common and conventional setting of output layer, called "one-to-one approach" in this paper, is as follows: The output layer contains r output nodes corresponding to the r classes. And for an input sample of the i-th class, the ideal output is 1 for the i-th output node, and 0 for all the other output nodes. We propose in this paper a new "binary approach": Suppose r is (2^(q minus 1), 2^q] with q bigger than or equal to 2, then we let the output layer contain q output nodes, and let the ideal outputs for the r classes be designed in a binary manner. Numerical experiments carried out in this paper show that our binary approach does equally good job as, but uses less output nodes than, the traditional one-to-one approach.
- Jan 24 2018 cs.DC astro-ph.IM arXiv:1801.07548v1With many large science equipment constructing and putting into use, astronomy has stepped into the big data era. The new method and infrastructure of big data processing has become a new requirement of many astronomers. Cloud computing, Map/Reduce, Hadoop, Spark, etc. many new technology has sprung up in recent years. Comparing to the high performance computing(HPC), Data is the center of these new technology. So, a new computing architecture infrastructure is necessary, which can be shared by both HPC and big data processing. Based on Astronomy Cloud project of Chinese Virtual Observatory (China-VO), we have made much efforts to optimize the designation of the hybrid computing platform. which include the hardware architecture, cluster management, Job and Resource scheduling.
- This paper is concerned with the sparsification of the input-hidden weights of ELM (Extreme Learning Machine). For ordinary feedforward neural networks, the sparsification is usually done by introducing certain regularization technique into the learning process of the network. But this strategy can not be applied for ELM, since the input-hidden weights of ELM are supposed to be randomly chosen rather than to be learned. To this end, we propose a modified ELM, called ELM-LC (ELM with local connections), which is designed for the sparsification of the input-hidden weights as follows: The hidden nodes and the input nodes are divided respectively into several corresponding groups, and an input node group is fully connected with its corresponding hidden node group, but is not connected with any other hidden node group. As in the usual ELM, the hidden-input weights are randomly given, and the hidden-output weights are obtained through a least square learning. In the numerical simulations on some benchmark problems, the new ELM-CL behaves better than the traditional ELM.
- The use of electronic textbooks (e-book) has been heavily studied over the years due to their flexibility, accessibility, interactivity and extensibility. Yet current shortcomings of e-book, which is often just a digitized version of the original book, does not encourage adoption. Consequently, this leads to a rethinking of e-book that should incorporate current technologies to augment its capabilities, where inclusion of information search and organization tools have shown to be favorable. This paper is on a preliminary work to add intelligence into such tools in terms of information retrieval. Construction of knowledge graph for e-book material with little overhead is first introduced. Information retrieval through typed similarity query is then performed via random walk. Case study demonstrate the applicability of the e-book platform, with promising application and advancement in the area of electronic textbooks.
- In spatial spectrum estimation field, medium-scale or large-scale receive antenna array with digital beamforming can be employed at receiver to achieve a high-resolution direction of arrival (DOA) estimation and make a significant interference suppression, but leads to an expensive RF-chain circuit cost. Thus, a robust hybrid analog-and-digital beamforming (ADB) structure is proposed to greatly reduces the number of RF-chains so that the cost of RF-chains is lowered significantly. This ADB structure at receiver is to attenuate the jamming signals from undesired directions and boost the receive quality of the desired signal for the purpose of secure receiving. In this paper, we first propose a hybrid ADB scheme with analytic expression, which consists of null space projection in analog beamforming part and diagonal loading method in digital beamforming part. Then, by exploiting DOA estimation errors, we construct a robust hybrid beamforming algorithm using the conditional expectation of DOA instead of making direct use of the estimated DOA. Simulation result shows that the proposed robust beamforming scheme performs more robust and makes a more efficient reduction in jamming interference to improve the receive signal to interference and noise ratio compared to non-burst ADB scheme proposed by us.
- Conversational agents are exploding in popularity. However, much work remains in the area of non goal-oriented conversations, despite significant growth in research interest over recent years. To advance the state of the art in conversational AI, Amazon launched the Alexa Prize, a 2.5-million dollar university competition where sixteen selected university teams built conversational agents to deliver the best social conversational experience. Alexa Prize provided the academic community with the unique opportunity to perform research with a live system used by millions of users. The subjectivity associated with evaluating conversations is key element underlying the challenge of building non-goal oriented dialogue systems. In this paper, we propose a comprehensive evaluation strategy with multiple metrics designed to reduce subjectivity by selecting metrics which correlate well with human judgement. The proposed metrics provide granular analysis of the conversational agents, which is not captured in human ratings. We show that these metrics can be used as a reasonable proxy for human judgment. We provide a mechanism to unify the metrics for selecting the top performing agents, which has also been applied throughout the Alexa Prize competition. To our knowledge, to date it is the largest setting for evaluating agents with millions of conversations and hundreds of thousands of ratings from users. We believe that this work is a step towards an automatic evaluation process for conversational AIs.
- A major practical limitation of the Maddah-Ali-Niesen coded caching techniques is their high subpacketization level. For the simple network with a single server and multiple users, Yan et al. proposed an alternative scheme with the so-called placement delivery arrays (PDA). Such a scheme requires slightly higher transmission rates but significantly reduces the subpacketization level. In this paper, we extend the PDA framework and propose three low-subpacketization schemes for combination networks, i.e., networks with a single server, multiple relays, and multiple cache-aided users that are connected to subsets of relays. One of the schemes achieves the cutset lower bound on the link rate when the cache memories are sufficiently large. Our other two schemes apply only to resolvable combination networks. For these networks, the new schemes perform closely to the currently best-known coded caching schemes for a wide range of cache sizes while having significantly reduced subpacketization levels.
- We consider coded caching over the fading broadcast channel, where the users, equipped with a memory of finite size, experience asymmetric fading statistics. It is known that a naive application of coded caching over the channel at hand performs poorly especially in the regime of a large number of users due to a vanishing multicast rate. We overcome this detrimental effect by a careful design of opportunistic scheduling policies such that some utility function of the long-term average rates should be maximized while balancing fairness among users. In particular, we propose a threshold-based scheduling that requires only statistical channel state information and one-bit feedback from each user. More specifically, each user indicates via feedback whenever its SNR is above a threshold determined solely by the fading statistics and the fairness requirement. Surprisingly, we prove that this simple scheme achieves the optimal utility in the regime of a large number of users. Numerical examples show that our proposed scheme performs closely to the scheduling with full channel state information, but at a significantly reduced complexity.
- Clinical electroencephalographic (EEG) data varies significantly depending on a number of operational conditions (e.g., the type and placement of electrodes, the type of electrical grounding used). This investigation explores the statistical differences present in two different referential montages: Linked Ear (LE) and Averaged Reference (AR). Each of these accounts for approximately 45% of the data in the TUH EEG Corpus. In this study, we explore the impact this variability has on machine learning performance. We compare the statistical properties of features generated using these two montages, and explore the impact of performance on our standard Hidden Markov Model (HMM) based classification system. We show that a system trained on LE data significantly outperforms one trained only on AR data (77.2% vs. 61.4%). We also demonstrate that performance of a system trained on both data sets is somewhat compromised (71.4% vs. 77.2%). A statistical analysis of the data suggests that mean, variance and channel normalization should be considered. However, cepstral mean subtraction failed to produce an improvement in performance, suggesting that the impact of these statistical differences is subtler.
- To be effective, state of the art machine learning technology needs large amounts of annotated data. There are numerous compelling applications in healthcare that can benefit from high performance automated decision support systems provided by deep learning technology, but they lack the comprehensive data resources required to apply sophisticated machine learning models. Further, for economic reasons, it is very difficult to justify the creation of large annotated corpora for these applications. Hence, automated annotation techniques become increasingly important. In this study, we investigated the effectiveness of using an active learning algorithm to automatically annotate a large EEG corpus. The algorithm is designed to annotate six types of EEG events. Two model training schemes, namely threshold-based and volume-based, are evaluated. In the threshold-based scheme the threshold of confidence scores is optimized in the initial training iteration, whereas for the volume-based scheme only a certain amount of data is preserved after each iteration. Recognition performance is improved 2% absolute and the system is capable of automatically annotating previously unlabeled data. Given that the interpretation of clinical EEG data is an exceedingly difficult task, this study provides some evidence that the proposed method is a viable alternative to expensive manual annotation.
- Dec 25 2017 cs.CV arXiv:1712.08263v1This work shows that it is possible to fool/attack recent state-of-the-art face detectors which are based on the single-stage networks. Successfully attacking face detectors could be a serious malware vulnerability when deploying a smart surveillance system utilizing face detectors. We show that existing adversarial perturbation methods are not effective to perform such an attack, especially when there are multiple faces in the input image. This is because the adversarial perturbation specifically generated for one face may disrupt the adversarial perturbation for another face. In this paper, we call this problem the Instance Perturbation Interference (IPI) problem. This IPI problem is addressed by studying the relationship between the deep neural network receptive field and the adversarial perturbation. As such, we propose the Localized Instance Perturbation (LIP) that uses adversarial perturbation constrained to the Effective Receptive Field (ERF) of a target to perform the attack. Experiment results show the LIP method massively outperforms existing adversarial perturbation generation methods -- often by a factor of 2 to 10.
- Hybrid circuit and packet switching for data center networking (DCN) has received considerable research attention recently. A hybrid-switched DCN employs a much faster circuit switch that is reconfigurable with a nontrivial cost, and a much slower packet switch that is reconfigurable with no cost, to interconnect its racks of servers. The research problem is, given a traffic demand matrix (between the racks), how to compute a good circuit switch configuration schedule so that the vast majority of the traffic demand is removed by the circuit switch, leaving a remaining demand matrix that contains only small elements for the packet switch to handle. In this paper, we propose two new hybrid switch scheduling algorithms under two different scheduling constraints. Our first algorithm, called 2-hop Eclipse, strikes a much better tradeoff between the resulting performance (of the hybrid switch) and the computational complexity (of the algorithm) than the state of the art solution Eclipse/Eclipse++. Our second algorithm, called BFF (best first fit), is the first hybrid switching solution that exploits the potential partial reconfiguration capability of the circuit switch for performance gains.
- Dec 19 2017 cs.CL physics.soc-ph arXiv:1712.06074v1A universal First-Letter Law (FLL) is derived and described. It predicts the percentages of first letters for words in novels. The FLL is akin to Benford's law (BL) of first digits, which predicts the percentages of first digits in a data collection of numbers. Both are universal in the sense that FLL only depends on the numbers of letters in the alphabet, whereas BL only depends on the number of digits in the base of the number system. The existence of these types of universal laws appears counter-intuitive. Nonetheless both describe data very well. Relations to some earlier works are given. FLL predicts that an English author on the average starts about 16 out of 100 words with the English letter `t'. This is corroborated by data, yet an author can freely write anything. Fuller implications and the applicability of FLL remain for the future.
- Dec 08 2017 cs.CV arXiv:1712.02514v1This work tackles the face recognition task on images captured using thermal camera sensors which can operate in the non-light environment. While it can greatly increase the scope and benefits of the current security surveillance systems, performing such a task using thermal images is a challenging problem compared to face recognition task in the Visible Light Domain (VLD). This is partly due to the much smaller amount number of thermal imagery data collected compared to the VLD data. Unfortunately, direct application of the existing very strong face recognition models trained using VLD data into the thermal imagery data will not produce a satisfactory performance. This is due to the existence of the domain gap between the thermal and VLD images. To this end, we propose a Thermal-to-Visible Generative Adversarial Network (TV-GAN) that is able to transform thermal face images into their corresponding VLD images whilst maintaining identity information which is sufficient enough for the existing VLD face recognition models to perform recognition. Some examples are presented in Figure 1. Unlike the previous methods, our proposed TV-GAN uses an explicit closed-set face recognition loss to regularize the discriminator network training. This information will then be conveyed into the generator network in the forms of gradient loss. In the experiment, we show that by using this additional explicit regularization for the discriminator network, the TV-GAN is able to preserve more identity information when translating a thermal image of a person which is not seen before by the TV-GAN.
- Nov 15 2017 cs.CV arXiv:1711.04147v1Scene text detection is a challenging problem in computer vision. In this paper, we propose a novel text detection network based on prevalent object detection frameworks. In order to obtain stronger semantic feature, we adopt ResNet as feature extraction layers and exploit multi-level feature by combining hierarchical convolutional networks. A vertical proposal mechanism is utilized to avoid proposal classification, while regression layer remains working to improve localization accuracy. Our approach evaluated on ICDAR2013 dataset achieves F-measure of 0.91, which outperforms previous state-of-the-art results in scene text detection.
- In this paper, a recognition framework named D-PCN using a discriminator is proposed, which can intensify the feature extracting ability of convolutional neural networks. The framework contains two parallel convolutional neural networks, and a discriminator, which is introduced from the Generative Adversarial Nets and can improve the performance of parallel networks. The two nets are devised side by side, and the discriminator takes in the features from parallel networks as input, aiming to guide the two nets to learn features of different details in a reverse adversarial style. After that, the feature maps from two nets get aggregated, then an extra overall classifier is added and will output the final prediction employing the fused features. The training strategy of the D-PCN is also introduced which ensures the utilization of the discriminator. We experiment the D-PCN with several CNN models including NIN, ResNet, ResNeXt and DenseNet using single NVIDIA TITAN Xp, on the two benchmark datasets: CIFAR-100 and downsampled ImageNet-1k, the D-PCN enhances all models on CIFAR-100 and also reinforces the performance of ResNet on downsampled ImageNet-1k explicitly. In particular, it yields state-of-the-art classification performance on CIFAR-100 with compared to relative works.
- Oct 31 2017 cs.CV arXiv:1710.10749v1Deep region-based object detector consists of a region proposal step and a deep object recognition step. In this paper, we make significant improvements on both of the two steps. For region proposal we propose a novel lightweight cascade structure which can effectively improve RPN proposal quality. For object recognition we re-implement global context modeling with a few modications and obtain a performance boost (4.2% mAP gain on the ILSVRC 2016 validation set). Besides, we apply the idea of pre-training extensively and show its importance in both steps. Together with common training and testing tricks, we improve Faster R-CNN baseline by a large margin. In particular, we obtain 87.9% mAP on the PASCAL VOC 2012 test set, 65.3% on the ILSVRC 2016 test set and 36.8% on the COCO test-std set.
- Oct 31 2017 cs.LG arXiv:1710.10657v1The multi-armed bandit problem where the rewards are realizations of general non-stationary stochastic processes is a challenging setting which has not been previously tackled in the bandit literature in its full generality. We present the first theoretical analysis of this problem by deriving guarantees for both the path-dependent dynamic pseudo-regret and the standard pseudo-regret that, remarkably, are both logarithmic in the number of rounds under certain natural conditions. We describe several UCB-type algorithms based on the notion of weighted discrepancy, a key measure of non-stationarity for stochastic processes. We show that discrepancy provides a unified framework for the analysis of non-stationary rewards. Our experiments demonstrate a significant improvement in practice compared to standard benchmarks.
- Oct 25 2017 cs.CC arXiv:1710.08602v2The complexity class PPA consists of NP-search problems which are reducible to the parity principle in undirected graphs. It contains a wide variety of interesting problems from graph theory, combinatorics, algebra and number theory, but only a few of these are known to be complete in the class. Before this work, the known complete problems were all discretizations or combinatorial analogues of topological fixed point theorems. Here we prove the PPA-completeness of two problems of radically different style. They are PPA-Circuit CNSS and PPA-Circuit Chevalley, related respectively to the Combinatorial Nullstellensatz and to the Chevalley-Warning Theorem over the two elements field GF(2). The input of these problems contain PPA-circuits which are arithmetic circuits with special symmetric properties that assure that the polynomials computed by them have always an even number of zeros. In the proof of the result we relate the multilinear degree of the polynomials to the parity of the maximal parse subcircuits that compute monomials with maximal multilinear degree, and we show that the maximal parse subcircuits of a PPA-circuit can be paired in polynomial time.
- Most of today's high-speed switches and routers adopt an input-queued crossbar switch architecture. Such a switch needs to compute a matching (crossbar schedule) between the input ports and output ports during each switching cycle (time slot). A key research challenge in designing large (in number of input/output ports $N$) input-queued crossbar switches is to develop crossbar scheduling algorithms that can compute "high quality" matchings - i.e. those that result in high switch throughput (ideally $100\%$) and low queueing delays for packets - at line rates. SERENA is arguably the best algorithm in that regard: It outputs excellent matching decisions that result in $100\%$ switch throughput and near-optimal queueing delays. However, since SERENA is a centralized algorithm with $O(N)$ computational complexity, it cannot support switches that both are large (in terms of $N$) and have a very high line rate per port. In this work, we propose SERENADE (SERENA, the Distributed Edition), a parallel algorithm suite that emulates SERENA in only $O(\log N)$ %(distributed) iterations between input ports and output ports, and hence has a time complexity of only $O(\log N)$ per port. Through extensive simulations, we show that all three variants in the SERENADE suite can, either provably or empirically, achieve 100\% throughput, and that they have similar delay performances as SERENA under heavy traffic loads.
- Oct 18 2017 cs.CV arXiv:1710.06177v1In recent years, we witnessed a huge success of Convolutional Neural Networks on the task of the image classification. However, these models are notoriously data hungry and require tons of training images to learn the parameters. In contrast, people are far better learner who can learn a new concept very fast with only a few samples. The plausible mysteries making the difference are two fundamental learning mechanisms: learning to learn and learning by analogy. In this paper, we attempt to investigate a new human-like learning method by organically combining these two mechanisms. In particular, we study how to generalize the classification parameters of previously learned concepts to a new concept. we first propose a novel Visual Analogy Network Embedded Regression (VANER) model to jointly learn a low-dimensional embedding space and a linear mapping function from the embedding space to classification parameters for base classes. We then propose an out-of-sample embedding method to learn the embedding of a new class represented by a few samples through its visual analogy with base classes. By inputting the learned embedding into VANER, we can derive the classification parameters for the new class.These classification parameters are purely generalized from base classes (i.e. transferred classification parameters), while the samples in the new class, although only a few, can also be exploited to generate a set of classification parameters (i.e. model classification parameters). Therefore, we further investigate the fusion strategy of the two kinds of parameters so that the prior knowledge and data knowledge can be fully leveraged. We also conduct extensive experiments on ImageNet and the results show that our method can consistently and significantly outperform state-of-the-art baselines.
- The problem of network function computation over a directed acyclic network is investigated in this paper. In such a network, a sink node desires to compute with zero error a \em target function, of which the inputs are generated at multiple source nodes. The edges in the network are assumed to be error-free and have limited capacity. The nodes in the network are assumed to have unbounded computing capability and be able to perform network coding. The \em computing rate of a network code that can compute the target function over the network is the average number of times that the target function is computed with zero error for one use of the network. In this paper, we obtain an improved upper bound on the computing capacity, which is applicable to arbitrary target functions and arbitrary network topologies. This improved upper bound not only is an enhancement of the previous upper bounds but also is the first tight upper bound on the computing capacity for computing an arithmetic sum over a certain non-tree network, which has been widely studied in the literature. We also introduce a multi-dimensional array approach that facilitates evaluation of the improved upper bound. Furthermore, we apply this bound to the problem of computing a vector-linear function over a network. With this bound, we are able to not only enhance a previous result on computing a vector-linear function over a network but also simplify the proof significantly. Finally, we prove that for computing the binary maximum function over the reverse butterfly network, our improved upper bound is not achievable. This result establishes that in general our improved upper bound is non achievable, but whether it is asymptotically achievable or not remains open.
- Adapted from biological sequence alignment, trace alignment is a process mining technique used to visualize and analyze workflow data. Any analysis done with this method, however, is affected by the alignment quality. The best existing trace alignment techniques use progressive guide-trees to heuristically approximate the optimal alignment in O(N2L2) time. These algorithms are heavily dependent on the selected guide-tree metric, often return sum-of-pairs-score-reducing errors that interfere with interpretation, and are computationally intensive for large datasets. To alleviate these issues, we propose process-oriented iterative multiple alignment (PIMA), which contains specialized optimizations to better handle workflow data. We demonstrate that PIMA is a flexible framework capable of achieving better sum-of-pairs score than existing trace alignment algorithms in only O(NL2) time. We applied PIMA to analyzing medical workflow data, showing how iterative alignment can better represent the data and facilitate the extraction of insights from data visualization.
- Batched sparse (BATS) code is a promising technology for reliable data transmission in multi-hop wireless networks. As a BATS code consists of an outer code and an inner code that typically is a random linear network code, one main research topic for BATS codes is to design an inner code with good performance in transmission efficiency and complexity. In this paper, this issue is addressed with a focus on the problem of minimizing the total number of packets transmitted by the source and intermediate nodes. Subsequently, the problem is formulated as a mixed integer nonlinear programming (MINLP) problem that is NP-hard in general. By exploiting the properties of inner codes and the incomplete beta function, we construct a nonlinear programming (NLP) problem that gives a valid upper bound on the best performance that can be achieved by any feasible solutions. Moreover, both centralized and decentralized real-time optimization strategies are developed. In particular, the decentralized approach is performed independently by each node to find a feasible solution in linear time with the use of look-up tables. Numerical results show that the gap in performance between our proposed approaches and the upper bound is very small, which demonstrates that all feasible solutions developed in the paper are near-optimal with a guaranteed performance bound.
- Aug 11 2017 cs.DC arXiv:1708.03184v2Big data analytics on geographically distributed datasets (across data centers or clusters) has been attracting increasing interests from both academia and industry, but also significantly complicates the system and algorithm designs. In this article, we systematically investigate the geo-distributed big-data analytics framework by analyzing the fine-grained paradigm and the key design principles. We present a dynamic global manager selection algorithm (GMSA) to minimize energy consumption cost by fully exploiting the system diversities in geography and variation over time. The algorithm makes real-time decisions based on the measurable system parameters through stochastic optimization methods, while achieving the performance balances between energy cost and latency. Extensive trace-driven simulations verify the effectiveness and efficiency of the proposed algorithm. We also highlight several potential research directions that remain open and require future elaborations in analyzing geo-distributed big data.
- We consider the multiple-input multiple-output (MIMO) communication channel impaired by phase noises at both the transmitter and receiver. We focus on the maximum likelihood (ML) detection problem for uncoded single-carrier transmission. We derive an approximation of the likelihood function, based on which we propose an efficient detection algorithm. The proposed algorithm, named self-interference whitening (SIW), consists in 1) estimating the self-interference caused by the phase noise perturbation, then 2) whitening the said interference, and finally 3) detecting the transmitted vector. While the exact ML solution is computationally intractable, we construct a simulation-based lower bound on the error probability of ML detection. Leveraging this lower bound, we perform extensive numerical experiments demonstrating that SIW is, in most cases of interest, very close to optimal with moderate phase noise. More importantly and perhaps surprisingly, such near-ML performance can be achieved by applying only twice the nearest neighbor detection algorithm. In this sense, our results reveal a striking fact: near-ML detection of phase noise corrupted MIMO channels can be done as efficiently as for conventional MIMO channels without phase noise.
- Aug 09 2017 cs.CV arXiv:1708.02337v2Face detection and recognition benchmarks have shifted toward more difficult environments. The challenge presented in this paper addresses the next step in the direction of automatic detection and identification of people from outdoor surveillance cameras. While face detection has shown remarkable success in images collected from the web, surveillance cameras include more diverse occlusions, poses, weather conditions and image blur. Although face verification or closed-set face identification have surpassed human capabilities on some datasets, open-set identification is much more complex as it needs to reject both unknown identities and false accepts from the face detector. We show that unconstrained face detection can approach high detection rates albeit with moderate false accept rates. By contrast, open-set face recognition is currently weak and requires much more attention.
- Semantic 3D mapping can be used for many applications such as robot navigation and virtual interaction. In recent years, there has been great progress in semantic segmentation and geometric 3D mapping. However, it is still challenging to combine these two tasks for accurate and large-scale semantic mapping from images. In the paper, we propose an incremental and (near) real-time semantic mapping system. A 3D scrolling occupancy grid map is built to represent the world, which is memory and computationally efficient and bounded for large scale environments. We utilize the CNN segmentation as prior prediction and further optimize 3D grid labels through a novel CRF model. Superpixels are utilized to enforce smoothness and form robust P N high order potential. An efficient mean field inference is developed for the graph optimization. We evaluate our system on the KITTI dataset and improve the segmentation accuracy by 10% over existing systems.
- Jul 18 2017 cs.AI arXiv:1707.04828v1In this paper, we demonstrate the application of Fuzzy Markup Language (FML) to construct an FML-based Dynamic Assessment Agent (FDAA), and we present an FML-based Human-Machine Cooperative System (FHMCS) for the game of Go. The proposed FDAA comprises an intelligent decision-making and learning mechanism, an intelligent game bot, a proximal development agent, and an intelligent agent. The intelligent game bot is based on the open-source code of Facebook Darkforest, and it features a representational state transfer application programming interface mechanism. The proximal development agent contains a dynamic assessment mechanism, a GoSocket mechanism, and an FML engine with a fuzzy knowledge base and rule base. The intelligent agent contains a GoSocket engine and a summarization agent that is based on the estimated win rate, real-time simulation number, and matching degree of predicted moves. Additionally, the FML for player performance evaluation and linguistic descriptions for game results commentary are presented. We experimentally verify and validate the performance of the FDAA and variants of the FHMCS by testing five games in 2016 and 60 games of Google Master Go, a new version of the AlphaGo program, in January 2017. The experimental results demonstrate that the proposed FDAA can work effectively for Go applications.
- In this letter, we investigate the energy efficiency (EE) problem in a millimeter wave (mmWave) massive MIMO (mMIMO) system with non-orthogonal multiple access (NOMA). Multiple two-user clusters are formulated according to their channel correlation and gain difference. Following this, we propose a hybrid analog/digital precoding scheme for the low radio frequency (RF) chains structure at the base station (BS). On this basis, we formulate a power allocation problem aiming to maximize the EE under users' quality of service (QoS) requirements and per-cluster power constraint. An iterative algorithm is proposed to obtain the optimal power allocation. Simulation results show that the proposed NOMA achieves superior EE performance than that of conventional OMA.
- Jul 07 2017 cs.SD arXiv:1707.01670v2In this paper, we aim at improving the performance of synthesized speech in statistical parametric speech synthesis (SPSS) based on a generative adversarial network (GAN). In particular, we propose a novel architecture combining the traditional acoustic loss function and the GAN's discriminative loss under a multi-task learning (MTL) framework. The mean squared error (MSE) is usually used to estimate the parameters of deep neural networks, which only considers the numerical difference between the raw audio and the synthesized one. To mitigate this problem, we introduce the GAN as a second task to determine if the input is a natural speech with specific conditions. In this MTL framework, the MSE optimization improves the stability of GAN, and at the same time GAN produces samples with a distribution closer to natural speech. Listening tests show that the multi-task architecture can generate more natural speech that satisfies human perception than the conventional methods.
- Jun 30 2017 cs.CV arXiv:1706.09579v2In this paper, we propose a novel method called Rotational Region CNN (R2CNN) for detecting arbitrary-oriented texts in natural scene images. The framework is based on Faster R-CNN [1] architecture. First, we use the Region Proposal Network (RPN) to generate axis-aligned bounding boxes that enclose the texts with different orientations. Second, for each axis-aligned text box proposed by RPN, we extract its pooled features with different pooled sizes and the concatenated features are used to simultaneously predict the text/non-text score, axis-aligned box and inclined minimum area box. At last, we use an inclined non-maximum suppression to get the detection results. Our approach achieves competitive results on text detection benchmarks: ICDAR 2015 and ICDAR 2013.
- Every channel can be expressed as a convex combination of deterministic channels with each deterministic channel corresponding to one particular intrinsic state. Such convex combinations are in general not unique, each giving rise to a specific intrinsic-state distribution. In this paper we study the maximum and the minimum capacities of a channel when the realization of its intrinsic state is causally available at the encoder and/or the decoder. Several conclusive results are obtained for binary-input channels and binary-output channels. Byproducts of our investigation include a generalization of the Birkhoff-von Neumann theorem and a condition on the uselessness of causal state information at the encoder.
- Jun 12 2017 cs.CV arXiv:1706.02863v1In this paper, we share our experience in designing a convolutional network-based face detector that could handle faces of an extremely wide range of scales. We show that faces with different scales can be modeled through a specialized set of deep convolutional networks with different structures. These detectors can be seamlessly integrated into a single unified network that can be trained end-to-end. In contrast to existing deep models that are designed for wide scale range, our network does not require an image pyramid input and the model is of modest complexity. Our network, dubbed ScaleFace, achieves promising performance on WIDER FACE and FDDB datasets with practical runtime speed. Specifically, our method achieves 76.4 average precision on the challenging WIDER FACE dataset and 96% recall rate on the FDDB dataset with 7 frames per second (fps) for 900 * 1300 input image.
- This paper is concerned with the construction of algebraic geometric codes defined from GGS curves. It is of significant use to describe bases for the Riemann-Roch spaces associated with totally ramified places, which enables us to study multi-point AG codes. Along this line, we characterize explicitly the Weierstrass semigroups and pure gaps. Additionally, we determine the floor of a certain type of divisor and investigate the properties of AG codes from GGS curves. Finally, we apply these results to find multi-point codes with excellent parameters. As one of the examples, a presented code with parameters $ [216,190,\geqslant 18] $ over $ \mathbb{F}_{64} $ yields a new record.
- May 26 2017 cs.LG arXiv:1705.08971v2Cooperative transmission of data fosters rapid accumulation of knowledge by efficiently combining experiences across learners. Although well studied in human learning and increasingly in machine learning, we lack formal frameworks through which we may reason about the benefits and limitations of cooperative inference. We present such a framework. We introduce novel indices for measuring the effectiveness of probabilistic and cooperative information transmission. We relate our indices to the well-known Teaching Dimension in deterministic settings. We prove conditions under which optimal cooperative inference can be achieved, including a representation theorem that constrains the form of inductive biases for learners optimized for cooperative inference. We conclude by demonstrating how these principles may inform the design of machine learning algorithms and discuss implications for human and machine learning.
- RPL, an IPv6 routing protocol for Low power Lossy Networks (LLNs), is considered to be the de facto routing standard for the Internet of Things (IoT). However, more and more experimental results demonstrate that RPL performs poorly when it comes to throughput and adaptability to network dynamics. This significantly limits the application of RPL in many practical IoT scenarios, such as an LLN with high-speed sensor data streams and mobile sensing devices. To address this issue, we develop BRPL, an extension of RPL, providing a practical approach that allows users to smoothly combine any RPL Object Function (OF) with backpressure routing. BRPL uses two novel algorithms, QuickTheta and QuickBeta, to support time-varying data traffic loads and node mobility respectively. We implement BRPL on Contiki OS, an open-source operating system for the Internet of Things. We conduct an extensive evaluation using both real-world experiments based on the FIT IoT-LAB testbed and large-scale simulations using Cooja over 18 virtual servers on the Cloud. The evaluation results demonstrate that BRPL not only is fully backward compatible with RPL (i.e. devices running RPL and BRPL can work together seamlessly), but also significantly improves network throughput and adaptability to changes in network topologies and data traffic loads. The observed packet loss reduction in mobile networks is, at a minimum, 60% and up to 1000% can be seen in extreme cases.
- May 18 2017 cs.NI arXiv:1705.05988v1The emerging Fog paradigm has been attracting increasing interests from both academia and industry, due to the low-latency, resilient, and cost-effective services it can provide. Many Fog applications such as video mining and event monitoring, rely on data stream processing and analytics, which are very popular in the Cloud, but have not been comprehensively investigated in the context of Fog architecture. In this article, we present the general models and architecture of Fog data streaming, by analyzing the common properties of several typical applications. We also analyze the design space of Fog streaming with the consideration of four essential dimensions (system, data, human, and optimization), where both new design challenges and the issues arise from leveraging existing techniques are investigated, such as Cloud stream processing, computer networks, and mobile computing.
- May 17 2017 cs.DC arXiv:1705.05541v1Fault tolerance is one of the major design goals for HPC. The emergence of non-volatile memories (NVM) provides a solution to build fault tolerant HPC. Data in NVM-based main memory are not lost when the system crashes because of the non-volatility nature of NVM. However, because of volatile caches, data must be logged and explicitly flushed from caches into NVM to ensure consistence and correctness before crashes, which can cause large runtime overhead. In this paper, we introduce an algorithm-based method to establish crash consistence in NVM for HPC applications. We slightly extend application data structures or sparsely flush cache blocks, which introduce ignorable runtime overhead. Such extension or cache flushing allows us to use algorithm knowledge to \textitreason data consistence or correct inconsistent data when the application crashes. We demonstrate the effectiveness of our method for three algorithms, including an iterative solver, dense matrix multiplication, and Monte-Carlo simulation. Based on comprehensive performance evaluation on a variety of test environments, we demonstrate that our approach has very small runtime overhead (at most 8.2\% and less than 3\% in most cases), much smaller than that of traditional checkpoint, while having the same or less recomputation cost.
- In this paper, by employing the results over Kummer extensions, we give an arithmetic characterization of pure gaps at many totally ramified places over the quotients of Hermitian curves, including the well-studied Hermitian curves as special cases. The cardinality of these pure gaps is explicitly investigated. In particular, the numbers of gaps and pure gaps at a pair of distinct places are determined precisely, which can be regarded as an extension of the previous work by Matthews (2001) considered Hermitian curves. Additionally, some concrete examples are provided to illustrate our results.
- This paper considers a multipair amplify-and-forward massive MIMO relaying system with low-resolution ADCs at both the relay and destinations. The channel state information (CSI) at the relay is obtained via pilot training, which is then utilized to perform simple maximum-ratio combining/maximum-ratio transmission processing by the relay. Also, it is assumed that the destinations use statistical CSI to decode the transmitted signals. Exact and approximated closed-form expressions for the achievable sum rate are presented, which enable the efficient evaluation of the impact of key system parameters on the system performance. In addition, optimal relay power allocation scheme is studied, and power scaling law is characterized. It is found that, with only low-resolution ADCs at the relay, increasing the number of relay antennas is an effective method to compensate for the rate loss caused by coarse quantization. However, it becomes ineffective to handle the detrimental effect of low-resolution ADCs at the destination. Moreover, it is shown that deploying massive relay antenna arrays can still bring significant power savings, i.e., the transmit power of each source can be cut down proportional to $1/M$ to maintain a constant rate, where $M$ is the number of relay antennas.
- The state-dependent $K$-user memoryless Broadcast Channel~(BC) with state feedback is investigated. We propose a novel transmission scheme and derive its corresponding achievable rate region, which, compared to some general schemes that deal with feedback, has the advantage of being relatively simple and thus is easy to evaluate. In particular, it is shown that the capacity region of the symmetric erasure BC with an arbitrary input alphabet size is achievable with the proposed scheme. For the fading Gaussian BC, we derive a symmetric achievable rate as a function of the signal-to-noise ratio~(SNR) and a small set of parameters. Besides achieving the optimal degrees of freedom at high SNR, the proposed scheme is shown, through numerical results, to outperform existing schemes from the literature in the finite SNR regime.
- As item trading becomes more popular, users can change their game items or money into real money more easily. At the same time, hackers turn their eyes on stealing other users game items or money because it is much easier to earn money than traditional gold-farming by running game bots. Game companies provide various security measures to block account- theft attempts, but many security measures on the user-side are disregarded by users because of lack of usability. In this study, we propose a server-side account theft detection system base on action sequence analysis to protect game users from malicious hackers. We tested this system in the real Massively Multiplayer Online Role Playing Game (MMORPG). By analyzing users full game play log, our system can find the particular action sequences of hackers with high accuracy. Also, we can trace where the victim accounts stolen money goes.
- May 02 2017 cs.LG arXiv:1705.00132v4We consider a general framework of online learning with expert advice where regret is defined with respect to sequences of experts accepted by a weighted automaton. Our framework covers several problems previously studied, including competing against k-shifting experts. We give a series of algorithms for this problem, including an automata-based algorithm extending weighted-majority and more efficient algorithms based on the notion of failure transitions. We further present efficient algorithms based on an approximation of the competitor automaton, in particular n-gram models obtained by minimizing the ∞-Rényi divergence, and present an extensive study of the approximation properties of such models. Finally, we also extend our algorithms and results to the framework of sleeping experts.
- Obstacle avoidance from monocular images is a challenging problem for robots. Though multi-view structure-from-motion could build 3D maps, it is not robust in textureless environments. Some learning based methods exploit human demonstration to predict a steering command directly from a single image. However, this method is usually biased towards certain tasks or demonstration scenarios and also biased by human understanding. In this paper, we propose a new method to predict a trajectory from images. We train our system on more diverse NYUv2 dataset. The ground truth trajectory is computed from the designed cost functions automatically. The Convolutional Neural Network perception is divided into two stages: first, predict depth map and surface normal from RGB images, which are two important geometric properties related to 3D obstacle representation. Second, predict the trajectory from the depth and normal. Results show that our intermediate perception increases the accuracy by 20% than the direct prediction. Our model generalizes well to other public indoor datasets and is also demonstrated for robot flights in simulation and experiments.
- Apr 25 2017 cs.CV arXiv:1704.06904v1In this work, we propose "Residual Attention Network", a convolutional neural network using attention mechanism which can incorporate with state-of-art feed forward network architecture in an end-to-end training fashion. Our Residual Attention Network is built by stacking Attention Modules which generate attention-aware features. The attention-aware features from different modules change adaptively as layers going deeper. Inside each Attention Module, bottom-up top-down feedforward structure is used to unfold the feedforward and feedback attention process into a single feedforward process. Importantly, we propose attention residual learning to train very deep Residual Attention Networks which can be easily scaled up to hundreds of layers. Extensive analyses are conducted on CIFAR-10 and CIFAR-100 datasets to verify the effectiveness of every module mentioned above. Our Residual Attention Network achieves state-of-the-art object recognition performance on three benchmark datasets including CIFAR-10 (3.90% error), CIFAR-100 (20.45% error) and ImageNet (4.8% single model and single crop, top-5 error). Note that, our method achieves 0.6% top-1 accuracy improvement with 46% trunk depth and 69% forward FLOPs comparing to ResNet-200. The experiment also demonstrates that our network is robust against noisy labels.
- Apr 18 2017 cs.AI arXiv:1704.04719v1In this paper, we present a robotic prediction agent including a darkforest Go engine, a fuzzy markup language (FML) assessment engine, an FML-based decision support engine, and a robot engine for game of Go application. The knowledge base and rule base of FML assessment engine are constructed by referring the information from the darkforest Go engine located in NUTN and OPU, for example, the number of MCTS simulations and winning rate prediction. The proposed robotic prediction agent first retrieves the database of Go competition website, and then the FML assessment engine infers the winning possibility based on the information generated by darkforest Go engine. The FML-based decision support engine computes the winning possibility based on the partial game situation inferred by FML assessment engine. Finally, the robot engine combines with the human-friendly robot partner PALRO, produced by Fujisoft incorporated, to report the game situation to human Go players. Experimental results show that the FML-based prediction agent can work effectively.
- Apr 10 2017 cs.CV arXiv:1704.02224v1We propose a novel 3D neural network architecture for 3D hand pose estimation from a single depth image. Different from previous works that mostly run on 2D depth image domain and require intermediate or post process to bring in the supervision from 3D space, we convert the depth map to a 3D volumetric representation, and feed it into a 3D convolutional neural network(CNN) to directly produce the pose in 3D requiring no further process. Our system does not require the ground truth reference point for initialization, and our network architecture naturally integrates both local feature and global context in 3D space. To increase the coverage of the hand pose space of the training data, we render synthetic depth image by transferring hand pose from existing real image datasets. We evaluation our algorithm on two public benchmarks and achieve the state-of-the-art performance. The synthetic hand pose dataset will be available.
- Reliability or Sustainability: Optimal Data Stream Estimation and Scheduling in Smart Water NetworksMar 30 2017 cs.SI arXiv:1703.09781v1As a typical Cyber-Physical System (CPS), smart water distribution networks require monitoring of underground water pipes with high sample rates for precise data analysis and water network control. Due to poor underground wireless channel quality and long-range communication requirements, high transmission power is typically adopted to communicate high-speed sensor data streams; posing challenges for long term sustainable monitoring. In this paper, we develop the first sustainable water sensing system, exploiting energy harvesting opportunities from water flows. Our system does this by scheduling the transmission of a subset of the data streams, while other correlated streams are estimated using auto-regressive models based on the sound-velocity propagation of pressure signals inside water networks. To compute the optimal scheduling policy, we formalize a stochastic optimization problem to maximize the estimation reliability, while ensuring the system's sustainable operation under dynamic conditions. We develop Data Transmission Scheduling (DTS), an asymptotically optimal scheme; and FAST-DTS, a lightweight online algorithm that can adapt to arbitrary energy and correlation dynamics. Using over 170 days of real data from our smart water system deployment and conducting in-vitro experiments to our small-scale testbed; our evaluation demonstrates that Fast-DTS significantly outperforms three alternatives, considering data reliability, energy utilization, and sustainable operation.
- Deep learning-based approaches have been widely used for training controllers for autonomous vehicles due to their powerful ability to approximate nonlinear functions or policies. However, the training process usually requires large labeled data sets and takes a lot of time. In this paper, we analyze the influences of features on the performance of controllers trained using the convolutional neural networks (CNNs), which gives a guideline of feature selection to reduce computation cost. We collect a large set of data using The Open Racing Car Simulator (TORCS) and classify the image features into three categories (sky-related, roadside-related, and road-related features).We then design two experimental frameworks to investigate the importance of each single feature for training a CNN controller.The first framework uses the training data with all three features included to train a controller, which is then tested with data that has one feature removed to evaluate the feature's effects. The second framework is trained with the data that has one feature excluded, while all three features are included in the test data. Different driving scenarios are selected to test and analyze the trained controllers using the two experimental frameworks. The experiment results show that (1) the road-related features are indispensable for training the controller, (2) the roadside-related features are useful to improve the generalizability of the controller to scenarios with complicated roadside information, and (3) the sky-related features have limited contribution to train an end-to-end autonomous vehicle controller.
- Existing simultaneous localization and mapping (SLAM) algorithms are not robust in challenging low-texture environments because there are only few salient features. The resulting sparse or semi-dense map also conveys little information for motion planning. Though some work utilize plane or scene layout for dense map regularization, they require decent state estimation from other sources. In this paper, we propose real-time monocular plane SLAM to demonstrate that scene understanding could improve both state estimation and dense mapping especially in low-texture environments. The plane measurements come from a pop-up 3D plane model applied to each single image. We also combine planes with point based SLAM to improve robustness. On a public TUM dataset, our algorithm generates a dense semantic 3D model with pixel depth error of 6.2 cm while existing SLAM algorithms fail. On a 60 m long dataset with loops, our method creates a much better 3D model with state estimation error of 0.67%.
- Mar 21 2017 cs.NE arXiv:1703.06263v1In the evolutionary computation research community, the performance of most evolutionary algorithms (EAs) depends strongly on their implemented coordinate system. However, the commonly used coordinate system is fixed and not well suited for different function landscapes, EAs thus might not search efficiently. To overcome this shortcoming, in this paper we propose a framework, named ACoS, to adaptively tune the coordinate systems in EAs. In ACoS, an Eigen coordinate system is established by making use of the cumulative population distribution information, which can be obtained based on a covariance matrix adaptation strategy and an additional archiving mechanism. Since the population distribution information can reflect the features of the function landscape to some extent, EAs in the Eigen coordinate system have the capability to identify the modality of the function landscape. In addition, the Eigen coordinate system is coupled with the original coordinate system, and they are selected according to a probability vector. The probability vector aims to determine the selection ratio of each coordinate system for each individual, and is adaptively updated based on the collected information from the offspring. ACoS has been applied to two of the most popular EA paradigms, i.e., particle swarm optimization (PSO) and differential evolution (DE), for solving 30 test functions with 30 and 50 dimensions at the 2014 IEEE Congress on Evolutionary Computation. The experimental studies demonstrate its effectiveness.
- Most visual odometry algorithm for a monocular camera focuses on points, either by feature matching, or direct alignment of pixel intensity, while ignoring a common but important geometry entity: edges. In this paper, we propose an odometry algorithm that combines points and edges to benefit from the advantages of both direct and feature based methods. It works better in texture-less environments and is also more robust to lighting changes and fast motion by increasing the convergence basin. We maintain a depth map for the keyframe then in the tracking part, the camera pose is recovered by minimizing both the photometric error and geometric error to the matched edge in a probabilistic framework. In the mapping part, edge is used to speed up and increase stereo matching accuracy. On various public datasets, our algorithm achieves better or comparable performance than state-of-the-art monocular odometry methods. In some challenging texture-less environments, our algorithm reduces the state estimation error over 50%.
- We consider the content delivery problem in a fading multi-input single-output channel with cache-aided users. We are interested in the scalability of the equivalent content delivery rate when the number of users, K, is large. Analytical results show that, using coded caching and wireless multicasting, without channel state information at the transmitter (CSIT), linear scaling of the content delivery rate with respect to K can be achieved in three different ways. First, with quasi-static fading, it can be achieved when the number of transmit antennas grows logarithmically with K. Second, even with a fixed number of antennas, we can achieve the linear scaling with a threshold-based user selection requiring only one-bit feedbacks from the users. Third, if the multicast transmission can span over multiple independent sub-channels, e.g., in block fading or multi-carrier systems, linear scaling can be obtained when the product of the number of sub-channels and the number of transmit antennas scales logarithmically with K. When CSIT is available, we propose a mixed strategy that combines spatial multiplexing and multicasting. Numerical results show that, by optimizing the power split between spatial multiplexing and multicasting, we can achieve a significant gain of the content delivery rate with moderate cache size.
- Mar 16 2017 cs.NI arXiv:1703.05174v1The intelligent transportation systems (ITS) framework from European Telecommunication Standards Institute (ETSI) imposes requirements on the exchange of periodic safety messages between components of ITS such as vehicles. In particular, it requires ETSI standardized Decentralized Congestion Control (DCC) algorithm to regulate the beaconing activity of vehicles based on wireless channel utilization. However, the DCC state that defines the beaconing behavior under heavy channel congestion, i.e., the Restrictive state, has a serious connectivity problem that safety beacons do not reach other vehicles in safety-critical distances. In this paper, we demonstrate the problem through analysis, simulation, and on-road measurements. We suggest that DCC change the transmit power setting for the Restrictive state before a full-scale deployment of the ETSI ITS framework starts, and we discuss its consequences in terms of changes in communicability and channel utilization.
- Mar 13 2017 cs.LG arXiv:1703.03478v1We introduce and analyze an on-line learning setting where the learner has the added option of abstaining from making a prediction at the price of a fixed cost. When the learner abstains, no feedback is provided, and she does not receive the label associated with the example. We design several algorithms and derive regret guarantees in both the adversarial and stochastic loss setting. In the process, we derive a new bound for on-line learning with feedback graphs that generalizes and extends existing work. We also design a new algorithm for on-line learning with sleeping experts that takes advantage of time-varying feedback graphs. We present natural extensions of existing algorithms as a baseline, and we then design more sophisticated algorithms that explicitly exploit the structure of our problem. We empirically validate the improvement of these more sophisticated algorithms on several datasets.
- We consider the transmission of a common message from a transmitter to three receivers over a broadcast channel, referred to as a multicast channel in this case. All the receivers are allowed to cooperate with each other over full-duplex bi-directional non-orthogonal cooperation links. We investigate the information-theoretic upper and lower bounds on the transmission rate. In particular, we propose a three-receiver fully interactive cooperation scheme (3FC) based on superpositions of CF and DF at the receivers. In the 3FC scheme, the receivers interactively perform compress-forward (CF) simultaneously to initiate the scheme, and then decode-forward (DF) sequentially to allow a correlation of each layer of the DF superposition in cooperation with the transmitter toward the next receiver in the chain to improve the achievable rate. The analysis leads to a closed-form expression that allows for numerical evaluation, and also gives some insight on key points to design interactive schemes. The numerical results provided in the Gaussian case show that the proposed scheme outperforms existing schemes and show the benefit of interaction.
- Feb 17 2017 cs.OH arXiv:1702.04719v4Trace alignment algorithms have been used in process mining for discovering the consensus treatment procedures and process deviations. Different alignment algorithms, however, may produce very different results. No widely-adopted method exists for evaluating the results of trace alignment. Existing reference-free evaluation methods cannot adequately and comprehensively assess the alignment quality. We analyzed and compared the existing evaluation methods, identifying their limitations, and introduced improvements in two reference-free evaluation methods. Our approach assesses the alignment result globally instead of locally, and therefore helps the algorithm to optimize overall alignment quality. We also introduced a novel metric to measure the alignment complexity, which can be used as a constraint on alignment algorithm optimization. We tested our evaluation methods on a trauma resuscitation dataset and provided the medical explanation of the activities and patterns identified as deviations using our proposed evaluation methods.
- We consider content delivery over fading broadcast channels. A server wants to transmit K files to K users, each equipped with a cache of finite size. Using the coded caching scheme of Maddah-Ali and Niesen, we design an opportunistic delivery scheme where the long-term sum content delivery rate scales with K the number of users in the system. The proposed delivery scheme combines superposition coding together with appropriate power allocation across sub-files intended to different subsets of users. We analyze the long-term average sum content delivery rate achieved by two special cases of our scheme: a) a selection scheme that chooses the subset of users with the largest weighted rate, and b) a baseline scheme that transmits to K users using the scheme of Maddah-Ali and Niesen. We prove that coded caching with appropriate user selection is scalable since it yields a linear increase of the average sum content delivery rate.
- Jan 31 2017 cs.CV arXiv:1701.08393v3We propose a deep convolutional neural network (CNN) for face detection leveraging on facial attributes based supervision. We observe a phenomenon that part detectors emerge within CNN trained to classify attributes from uncropped face images, without any explicit part supervision. The observation motivates a new method for finding faces through scoring facial parts responses by their spatial structure and arrangement. The scoring mechanism is data-driven, and carefully formulated considering challenging cases where faces are only partially visible. This consideration allows our network to detect faces under severe occlusion and unconstrained pose variations. Our method achieves promising performance on popular benchmarks including FDDB, PASCAL Faces, AFW, and WIDER FACE.
- Jan 24 2017 cs.DS arXiv:1701.06005v3In current cloud computing systems, when leveraging virtualization technology, the customer's requested data computing or storing service is accommodated by a set of communicated virtual machines (VM) in a scalable and elastic manner. These VMs are placed in one or more server nodes according to the node capacities or failure probabilities. The VM placement availability refers to the probability that at least one set of all customer's requested VMs operates during the requested lifetime. In this paper, we first study the problem of placing at most H groups of k requested VMs on a minimum number of nodes, such that the VM placement availability is no less than $\delta$, and that the specified communication delay and connection availability for each VM pair under the same placement group are not violated. We consider this problem with and without Shared-Risk Node Group (SRNG) failures, and prove this problem is NP-hard in both cases. We subsequently propose an exact Integer Nonlinear Program (INLP) and an efficient heuristic to solve this problem. We conduct simulations to compare the proposed algorithms with two existing heuristics in terms of performance. Finally, we study the related reliable routing problem of establishing a connection over at most w link-disjoint paths from a source to a destination, such that the connection availability requirement is satisfied and each path delay is no more than a given value. We devise an exact algorithm and two heuristics to solve this NP-hard problem, and evaluate them via simulations.
- Mobile Crowdsourcing is a promising service paradigm utilizing ubiquitous mobile devices to facilitate largescale crowdsourcing tasks (e.g. urban sensing and collaborative computing). Many applications in this domain require Device-to-Device (D2D) communications between participating devices for interactive operations such as task collaborations and file transmissions. Considering the private participating devices and their opportunistic encountering behaviors, it is highly desired to establish secure and trustworthy D2D connections in a fast and autonomous way, which is vital for implementing practical Mobile Crowdsourcing Systems (MCSs). In this paper, we develop an efficient scheme, Trustworthy Device Pairing (TDP), which achieves user-transparent secure D2D connections and reliable peer device selections for trustworthy D2D communications. Through rigorous analysis, we demonstrate the effectiveness and security intensity of TDP in theory. The performance of TDP is evaluated based on both real-world prototype experiments and extensive trace-driven simulations. Evaluation results verify our theoretical analysis and show that TDP significantly outperforms existing approaches in terms of pairing speed, stability, and security.
- Recently, linear codes constructed from defining sets have been studied extensively. They may have nice parameters if the defining set is chosen properly. Let $ m >2$ be a positive integer. For an odd prime $ p $, let $ r=p^m $ and $\text{Tr}$ be the absolute trace function from $\mathbb{F}_r$ onto $\mathbb{F}_p$. In this paper, we give a construction of linear codes by defining the code $ C_{D}=\{(\mathrm{Tr}(ax))_{x\in D}: a \in \mathbb{F}_{r} \}, $ where $ D =\left\{x\in \mathbb{F}_{r} : \mathrm{Tr}(x)=1, \mathrm{Tr}(x^2)=0 \right\}. $ Its complete weight enumerator and weight enumerator are determined explicitly by employing cyclotomic numbers and Gauss sums. In addition, we obtain several optimal linear codes with a few weights. They have higher rate compared with other codes, which enables them to have essential applications in areas such as association schemes and secret sharing schemes.
- Caching popular contents at base stations (BSs) of a heterogeneous cellular network (HCN) avoids frequent information passage from content providers to the network edge, thereby reducing latency and alleviating traffic congestion in backhaul links. In general, the optimal strategies for content placement in HCNs remain largely unknown and deriving them forms the theme of this paper. To this end, we adopt the popular random HCN model where $K$ tiers of BSs are modelled as independent Poisson point processes distributed in the plane with different densities. Further, the random caching scheme is considered where each of a given set of $M$ files with corresponding popularity measures is placed at each BS of a particular tier with a corresponding probability, called placement probability. The probabilities are identical for all BSs in the same tier but vary over tiers, giving the name tier-level content placement. We consider the network performance metric, hit probability, defined as the probability that a file requested by the typical user is delivered successfully to the user. We maximize the hit probability over content placement probabilities, which yields the optimal tier-level placement policies. For the case of uniform received signal-to-interference thresholds for successful transmissions for BSs in different tiers, the policy is in closed-form where the placement probability for a particular file is proportional to the square-root of the corresponding popularity measure with an offset depending on BS caching capacities. For the general case of non-uniform SIR thresholds, the optimization problem is non-convex and a sub-optimal placement policy is designed by approximation, which has a similar structure as in the case of uniform SIR thresholds and shown by simulation to be close-to-optimal.
- Dec 14 2016 cs.CR arXiv:1612.04350v2High-dimensional crowdsourced data collected from a large number of users produces rich knowledge for our society. However, it also brings unprecedented privacy threats to participants. Local privacy, a variant of differential privacy, is proposed as a means to eliminate the privacy concern. Unfortunately, achieving local privacy on high-dimensional crowdsourced data raises great challenges on both efficiency and effectiveness. Here, based on EM and Lasso regression, we propose efficient multi-dimensional joint distribution estimation algorithms with local privacy. Then, we develop a Locally privacy-preserving high-dimensional data Publication algorithm, LoPub, by taking advantage of our distribution estimation techniques. In particular, both correlations and joint distribution among multiple attributes can be identified to reduce the dimension of crowdsourced data, thus achieving both efficiency and effectiveness in locally private high-dimensional data publication. Extensive experiments on real-world datasets demonstrated that the efficiency of our multivariate distribution estimation scheme and confirm the effectiveness of our LoPub scheme in generating approximate datasets with local privacy.
- Dec 09 2016 cs.CV arXiv:1612.02742v1Hand detection is essential for many hand related tasks, e.g. parsing hand pose, understanding gesture, which are extremely useful for robotics and human-computer interaction. However, hand detection in uncontrolled environments is challenging due to the flexibility of wrist joint and cluttered background. We propose a deep learning based approach which detects hands and calibrates in-plane rotation under supervision at the same time. To guarantee the recall, we propose a context aware proposal generation algorithm which significantly outperforms the selective search. We then design a convolutional neural network(CNN) which handles object rotation explicitly to jointly solve the object detection and rotation estimation tasks. Experiments show that our method achieves better results than state-of-the-art detection models on widely-used benchmarks such as Oxford and Egohands database. We further show that rotation estimation and classification can mutually benefit each other.
- Nov 29 2016 cs.CV arXiv:1611.09026v2In this work, we explore the problem of generating fantastic special-effects for the typography. It is quite challenging due to the model diversities to illustrate varied text effects for different characters. To address this issue, our key idea is to exploit the analytics on the high regularity of the spatial distribution for text effects to guide the synthesis process. Specifically, we characterize the stylized patches by their normalized positions and the optimal scales to depict their style elements. Our method first estimates these two features and derives their correlation statistically. They are then converted into soft constraints for texture transfer to accomplish adaptive multi-scale texture synthesis and to make style element distribution uniform. It allows our algorithm to produce artistic typography that fits for both local texture patterns and the global spatial distribution in the example. Experimental results demonstrate the superiority of our method for various text effects over conventional style transfer methods. In addition, we validate the effectiveness of our algorithm with extensive artistic typography library generation.
- Performing arts organizations aim to enrich their communities through the arts. To do this, they strive to match their performance offerings to the taste of those communities. Success relies on understanding audience preference and predicting their behavior. Similar to most e-commerce or digital entertainment firms, arts presenters need to recommend the right performance to the right customer at the right time. As part of the Michigan Data Science Team (MDST), we partnered with the University Musical Society (UMS), a non-profit performing arts presenter housed in the University of Michigan, Ann Arbor. We are providing UMS with analysis and business intelligence, utilizing historical individual-level sales data. We built a recommendation system based on collaborative filtering, gaining insights into the artistic preferences of customers, along with the similarities between performances. To better understand audience behavior, we used statistical methods from customer-base analysis. We characterized customer heterogeneity via segmentation, and we modeled customer cohorts to understand and predict ticket purchasing patterns. Finally, we combined statistical modeling with natural language processing (NLP) to explore the impact of wording in program descriptions. These ongoing efforts provide a platform to launch targeted marketing campaigns, helping UMS carry out its mission by allocating its resources more efficiently. Celebrating its 138th season, UMS is a 2014 recipient of the National Medal of Arts, and it continues to enrich communities by connecting world-renowned artists with diverse audiences, especially students in their formative years. We aim to contribute to that mission through data science and customer analytics.