Aug 10 2017 cs.CV
Recognizing freehand sketches with high arbitrariness is greatly challenging. Most existing methods either ignore the geometric characteristics or treat sketches as handwritten characters with a fixed structural ordering. Consequently, they can hardly yield high recognition performance even when sophisticated learning techniques are employed. In this paper, we propose a sequential deep learning strategy that combines both shape and texture features. A coded shape descriptor is exploited to characterize the geometry of sketch strokes with high flexibility, while the outputs of convolutional neural networks (CNN) are taken as the abstract texture feature. We develop dual deep networks with memorable gated recurrent units (GRUs), and sequentially feed these two types of features into the dual networks, respectively. The outputs of the dual networks are fused by another gated recurrent unit (GRU), allowing sketches to be recognized accurately regardless of stroke ordering. Experiments on the TU-Berlin data set show that our method outperforms average human performance and state-of-the-art algorithms even when significant shape and appearance variations occur.
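A minimal sketch of the dual-network idea described above (not the authors' exact architecture): one GRU consumes per-stroke coded shape descriptors, another consumes per-stroke CNN texture features, and a third GRU fuses their outputs before classification. All dimensions and module names are assumptions.

```python
import torch
import torch.nn as nn

class DualGRUSketchClassifier(nn.Module):
    def __init__(self, shape_dim=32, texture_dim=512, hidden=256, n_classes=250):
        super().__init__()
        self.shape_gru = nn.GRU(shape_dim, hidden, batch_first=True)
        self.texture_gru = nn.GRU(texture_dim, hidden, batch_first=True)
        self.fusion_gru = nn.GRU(2 * hidden, hidden, batch_first=True)
        self.classifier = nn.Linear(hidden, n_classes)

    def forward(self, shape_seq, texture_seq):
        # shape_seq:   (batch, n_strokes, shape_dim)   coded shape descriptors
        # texture_seq: (batch, n_strokes, texture_dim) CNN texture features per stroke
        s_out, _ = self.shape_gru(shape_seq)
        t_out, _ = self.texture_gru(texture_seq)
        fused, _ = self.fusion_gru(torch.cat([s_out, t_out], dim=-1))
        return self.classifier(fused[:, -1])   # predict from the last fused step

# Toy usage with random stroke sequences (4 sketches, 20 strokes each)
model = DualGRUSketchClassifier()
logits = model(torch.randn(4, 20, 32), torch.randn(4, 20, 512))
print(logits.shape)  # torch.Size([4, 250])
```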
Relation detection is a core component of many NLP applications, including Knowledge Base Question Answering (KBQA). In this paper, we propose a hierarchical recurrent neural network enhanced by residual learning that detects KB relations given an input question. Our method uses deep residual bidirectional LSTMs to compare questions and relation names at different levels of abstraction. Additionally, we propose a simple KBQA system that integrates entity linking and our relation detector so that each enhances the other. Experimental results show that our approach not only achieves outstanding relation detection performance but, more importantly, helps our KBQA system achieve state-of-the-art accuracy on both single-relation (SimpleQuestions) and multi-relation (WebQSP) QA benchmarks.
The rapid evolution of Internet-of-Things (IoT) technologies has led to an emerging need to make them smarter. A variety of applications now run simultaneously on an ARM-based processor. For example, devices on the edge of the Internet are provided with higher horsepower so they can be entrusted with storing, processing and analyzing data collected from IoT devices. This significantly improves efficiency and reduces the amount of data that needs to be transported to the cloud for data processing, analysis and storage. However, commodity OSes are prone to compromise; once they are exploited, attackers can access the data on these devices. Since the data stored and processed on these devices can be sensitive, this is particularly disconcerting if left unaddressed. In this paper, we propose a new system, TrustShadow, that shields legacy applications from untrusted OSes. TrustShadow takes advantage of ARM TrustZone technology and partitions resources into the secure and normal worlds. In the secure world, TrustShadow constructs a trusted execution environment for security-critical applications. This trusted environment is maintained by a lightweight runtime system that coordinates the communication between applications and the ordinary OS running in the normal world. The runtime system does not provide system services itself; rather, it forwards requests for system services to the ordinary OS and verifies the correctness of the responses. To demonstrate the efficiency of this design, we prototyped TrustShadow on a real chip board with ARM TrustZone support and evaluated its performance using both microbenchmarks and real-world applications. We showed that TrustShadow introduces only negligible overhead to real-world applications.
This paper proposes a new model for extracting an interpretable sentence embedding by introducing self-attention. Instead of using a vector, we use a 2-D matrix to represent the embedding, with each row of the matrix attending on a different part of the sentence. We also propose a self-attention mechanism and a special regularization term for the model. As a side effect, the embedding comes with an easy way of visualizing what specific parts of the sentence are encoded into the embedding. We evaluate our model on 3 different tasks: author profiling, sentiment classification, and textual entailment. Results show that our model yields a significant performance gain compared to other sentence embedding methods in all of the 3 tasks.
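A minimal sketch of the 2-D self-attentive embedding and its regularization term described above; the dimensions (d_a, r) and variable names follow common conventions and are assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StructuredSelfAttention(nn.Module):
    def __init__(self, hidden=300, d_a=350, r=30):
        super().__init__()
        self.W_s1 = nn.Linear(hidden, d_a, bias=False)
        self.W_s2 = nn.Linear(d_a, r, bias=False)

    def forward(self, H):
        # H: (batch, seq_len, hidden) hidden states over the sentence
        A = F.softmax(self.W_s2(torch.tanh(self.W_s1(H))), dim=1)  # attention over positions
        A = A.transpose(1, 2)                                      # (batch, r, seq_len)
        M = A @ H                                                  # (batch, r, hidden) embedding matrix
        # Penalization term encourages the r attention rows to focus on different parts
        I = torch.eye(A.size(1), device=A.device)
        penalty = ((A @ A.transpose(1, 2) - I) ** 2).sum(dim=(1, 2)).mean()
        return M, A, penalty

M, A, penalty = StructuredSelfAttention()(torch.randn(2, 12, 300))
print(M.shape, penalty.item())   # torch.Size([2, 30, 300]) and a scalar penalty
```

The rows of A can be visualized directly as heat maps over the sentence, which is the "easy way of visualizing" mentioned in the abstract.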
Mar 09 2017 cs.DS
This paper introduces a new single-pass reservoir weighted-sampling stream aggregation algorithm, Priority Sample and Hold (PrSH). PrSH combines aspects of the well-known Sample and Hold algorithm with Priority Sampling. In particular, it achieves a reduced computational cost for rate adaptation in a fixed cache by using a single persistent random variable across the lifetime of each key in the cache. The basic approach can be supplemented with a Sample and Hold pre-sampling stage whose sampling rate adaptation is controlled by PrSH. We prove that PrSH provides unbiased estimates of the true aggregates. We analyze the computational complexity of PrSH and its variants, and provide a detailed evaluation of its accuracy on synthetic and trace data. Weighted relative error is reduced by 40% to 65% at sampling rates of 5% to 17%, relative to Adaptive Sample and Hold; there is also substantial improvement for rank queries.
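A simplified sketch of the core mechanism (not the paper's exact algorithm or its unbiased estimator): each key in a fixed-size cache draws one persistent uniform random number on arrival, its priority is accumulated weight divided by that number, and the lowest-priority key is evicted when the cache overflows.

```python
import random

class PrioritySampleAndHold:
    """Fixed-size cache of per-key aggregates with priority-based eviction."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = {}    # key -> [accumulated_weight, persistent_random_u]
        self.z = 0.0       # highest priority evicted so far (eviction threshold)

    def _priority(self, key):
        w, u = self.cache[key]
        return w / u

    def update(self, key, weight):
        if key in self.cache:
            self.cache[key][0] += weight                       # "hold": keep aggregating cached keys
            return
        self.cache[key] = [weight, 1.0 - random.random()]      # one persistent u in (0, 1] per key
        if len(self.cache) > self.capacity:
            victim = min(self.cache, key=self._priority)       # evict the lowest priority w / u
            self.z = max(self.z, self._priority(victim))
            del self.cache[victim]

    def estimate(self, key):
        # Simplified stand-in for the paper's unbiased estimator: a surviving key's
        # weight is reported as at least the current eviction threshold.
        return max(self.cache[key][0], self.z) if key in self.cache else 0.0

agg = PrioritySampleAndHold(capacity=100)
for key, w in [("a", 2.0), ("b", 1.0), ("a", 3.0)]:
    agg.update(key, w)
print(agg.estimate("a"))
```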
Feb 21 2017 cs.CV
In order to accurately detect defects in patterned fabric images, a novel detection algorithm based on Gabor-HOG (GHOG) and low-rank decomposition is proposed in this paper. Defect-free patterned fabric images exhibit a specific directional regularity, which defects disrupt. Therefore, a direction-aware descriptor, denoted GHOG, is designed by combining Gabor and HOG features; it is extremely valuable for localizing the defect region. On top of this directional descriptor, an efficient low-rank decomposition model is constructed to divide the matrix of directional features extracted from image blocks into a low-rank matrix (background information) and a sparse matrix (defect information). A nonconvex log det(.) function is also exploited as a smooth surrogate for the rank, in place of the nuclear norm, to improve the efficiency of the low-rank model. Computational efficiency is further improved by using the alternating direction method of multipliers (ADMM). Thereafter, the saliency map generated by the sparse matrix is segmented via an optimal threshold algorithm to locate the defect regions. Experimental results show that the proposed method can effectively detect patterned fabric defects and outperforms state-of-the-art methods.
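An illustrative sketch of the low-rank plus sparse split on a block-feature matrix D (rows could be GHOG features of image blocks). For simplicity this uses standard nuclear-norm robust PCA solved with ADMM; the paper replaces the nuclear norm with a nonconvex log-det surrogate, which is not reproduced here.

```python
import numpy as np

def rpca_admm(D, lam=None, mu=1.0, n_iter=200):
    m, n = D.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    L = np.zeros_like(D); S = np.zeros_like(D); Y = np.zeros_like(D)

    def shrink(X, tau):                      # entrywise soft thresholding
        return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

    for _ in range(n_iter):
        # low-rank update: singular value thresholding
        U, sig, Vt = np.linalg.svd(D - S + Y / mu, full_matrices=False)
        L = (U * np.maximum(sig - 1.0 / mu, 0.0)) @ Vt
        # sparse update: soft thresholding of the residual
        S = shrink(D - L + Y / mu, lam / mu)
        # dual variable update
        Y = Y + mu * (D - L - S)
    return L, S   # background (low-rank) and defect-like (sparse) parts

# Toy usage: a rank-1 "background" plus a few sparse "defects"
D = np.outer(np.ones(50), np.linspace(0, 1, 40))
D[10, 5] += 3.0
D[30, 20] -= 2.5
L, S = rpca_admm(D)
print(np.linalg.matrix_rank(L, tol=1e-3), np.count_nonzero(np.abs(S) > 0.5))
```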
Feb 08 2017 cs.CL
Deep neural networks (DNN) have revolutionized the field of natural language processing (NLP). Convolutional neural network (CNN) and recurrent neural network (RNN), the two main types of DNN architectures, are widely explored to handle various NLP tasks. CNN is supposed to be good at extracting position-invariant features and RNN at modeling units in sequence. The state of the art on many NLP tasks often switches due to the battle between CNNs and RNNs. This work is the first systematic comparison of CNN and RNN on a wide range of representative NLP tasks, aiming to give basic guidance for DNN selection.
In this paper, we study a wireless packet broadcast system that uses linear network coding (LNC) to help receivers recover data packets that are missing due to packet erasures. We study two intertwined performance metrics, namely throughput and average packet decoding delay (APDD), and establish strong/weak approximation relations based on whether the approximation holds for the performance of every receiver (strong) or only for the average performance across all receivers (weak). We prove an equivalence between strong throughput approximation and strong APDD approximation. We prove that throughput-optimal LNC techniques can strongly approximate APDD, and that partition-based LNC techniques may weakly approximate throughput. We also prove that memoryless LNC techniques, including instantly decodable network coding techniques, are neither strong throughput/APDD approximation techniques nor weak throughput approximation techniques.
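The strong/weak distinction stated above can be written out as follows; this is one possible formalization and the notation is ours, not necessarily the paper's.

```latex
% Let $D_r(\mathcal{A})$ be the APDD of receiver $r$ under LNC technique
% $\mathcal{A}$, over $R$ receivers, and let $\mathcal{A}^\ast$ be APDD-optimal.
% Technique $\mathcal{A}$ is a strong (resp. weak) $\alpha$-approximation if
\[
  D_r(\mathcal{A}) \;\le\; \alpha \, D_r(\mathcal{A}^\ast) \quad \forall r
  \qquad \text{(strong)},
\]
\[
  \frac{1}{R}\sum_{r=1}^{R} D_r(\mathcal{A})
  \;\le\; \alpha \,\frac{1}{R}\sum_{r=1}^{R} D_r(\mathcal{A}^\ast)
  \qquad \text{(weak)},
\]
% with the analogous definitions for throughput.
```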
Nov 01 2016 cs.CL
This paper proposes dynamic chunk reader (DCR), an end-to-end neural reading comprehension (RC) model that is able to extract and rank a set of answer candidates from a given document to answer questions. DCR is able to predict answers of variable lengths, whereas previous neural RC models primarily focused on predicting single tokens or entities. DCR encodes a document and an input question with recurrent neural networks, and then applies a word-by-word attention mechanism to acquire question-aware representations for the document, followed by the generation of chunk representations and a ranking module to propose the top-ranked chunk as the answer. Experimental results show that DCR achieves state-of-the-art exact match and F1 scores on the SQuAD dataset.
Jun 13 2016 cs.CL
This work focuses on answering single-relation factoid questions over Freebase. Each question can obtain its answer from a single fact of the form (subject, predicate, object) in Freebase. This task, simple question answering (SimpleQA), can be addressed via a two-step pipeline: entity linking and fact selection. In fact selection, we match the subject entity in a fact candidate with the entity mention in the question using a character-level convolutional neural network (char-CNN), and match the predicate in that fact with the question using a word-level CNN (word-CNN). This work makes two main contributions. (i) A simple and effective entity linker over Freebase is proposed; it outperforms the state-of-the-art entity linker on the SimpleQA task. (ii) A novel attentive max-pooling is stacked over the word-CNN, so that the predicate representation can be matched with the predicate-focused question representation more effectively. Experiments show that our system sets a new state of the art on this task.
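A rough sketch of attentive max-pooling over a word-CNN feature map (the exact formulation in the paper may differ): question positions are re-weighted by their similarity to the candidate predicate representation before max-pooling.

```python
import torch
import torch.nn.functional as F

def attentive_maxpool(question_features, predicate_vec):
    # question_features: (batch, seq_len, channels) word-CNN outputs over the question
    # predicate_vec:     (batch, channels)          representation of the candidate predicate
    scores = torch.einsum('bsc,bc->bs', question_features, predicate_vec)
    weights = F.softmax(scores, dim=1).unsqueeze(-1)        # (batch, seq_len, 1)
    weighted = question_features * weights                  # emphasize predicate-relevant positions
    return weighted.max(dim=1).values                       # (batch, channels)

q = torch.randn(2, 10, 128)
p = torch.randn(2, 128)
print(attentive_maxpool(q, p).shape)   # torch.Size([2, 128])
```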
May 24 2016 cs.SE
The mobile application (app) has become the main entrance for accessing the Internet on handheld devices. Unlike the Web, where each webpage has a global URL that can be reached directly, a specific "content page" of an app can be opened only by exploring the app through several operations from the landing page. Interoperability between apps is therefore quite limited, which restricts the value-added "linked data" between apps. Recently, deep links have been proposed to enable targeting and opening a specific page of an app externally through an accessible uniform resource identifier (URI). However, implementing deep links for mobile apps requires considerable manual effort from app developers, which can be error-prone and time-consuming. In this paper, we propose DroidLink to automatically generate deep links for existing Android apps. We design a deep link model suitable for automatic generation. We then explore the transitions between pages and build a navigation graph based on static and dynamic analysis of Android apps. Next, we realize an updating mechanism that keeps revisiting the target app to discover new pages, and thus generates deep links for every single page of the app. Finally, we repackage the app with deep link support, without imposing any additional deployment requirements. We generate deep links for several popular apps and demonstrate the feasibility of DroidLink.
Modern NLP models rely heavily on engineered features, which often combine word and contextual information into complex lexical features. Such combination results in large numbers of features, which can lead to over-fitting. We present a new model that represents complex lexical features---comprised of parts for words, contextual information and labels---in a tensor that captures conjunction information among these parts. We apply low-rank tensor approximations to the corresponding parameter tensors to reduce the parameter space and improve prediction speed. Furthermore, we investigate two methods for handling features that include $n$-grams of mixed lengths. Our model achieves state-of-the-art results on tasks in relation extraction, PP-attachment, and preposition disambiguation.
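An illustrative rank-R factorization of a (word x context x label) parameter tensor, as a generic CP/low-rank trilinear scoring model; the dimensions and factor names are made up and this is not the paper's exact parameterization.

```python
import numpy as np

rng = np.random.default_rng(0)
d_word, d_ctx, n_labels, R = 100, 50, 10, 8

# Factor matrices replacing the full d_word x d_ctx x n_labels parameter tensor
P = rng.normal(size=(R, d_word))
Q = rng.normal(size=(R, d_ctx))
L = rng.normal(size=(R, n_labels))

def score(word_vec, ctx_vec):
    # score for every label y: sum_r (P w)_r (Q c)_r (L e_y)_r
    return ((P @ word_vec) * (Q @ ctx_vec)) @ L   # shape (n_labels,)

w = rng.normal(size=d_word)
c = rng.normal(size=d_ctx)
print(score(w, c).shape)   # (10,)

# The factorization stores R*(d_word + d_ctx + n_labels) parameters instead of
# d_word*d_ctx*n_labels, shrinking the parameter space and speeding up scoring.
print(R * (d_word + d_ctx + n_labels), d_word * d_ctx * n_labels)
```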
Jan 08 2016 cs.CL
Recurrent Neural Networks (RNNs) and one of their specific architectures, Long Short-Term Memory (LSTM), have been widely used for sequence labeling. In this paper, we first enhance LSTM-based sequence labeling to explicitly model label dependencies. We then propose another enhancement that incorporates global information spanning the whole input sequence. The latter method, the encoder-labeler LSTM, first encodes the whole input sequence into a fixed-length vector with an encoder LSTM, and then uses this encoded vector as the initial state of another LSTM that performs sequence labeling. Combining these methods, we can predict the label sequence while taking into account both label dependencies and information from the whole input sequence. In experiments on slot filling, an essential component of natural language understanding, using the standard ATIS corpus, we achieved a state-of-the-art F1-score of 95.66%.
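A minimal sketch of the encoder-labeler idea: an encoder LSTM summarizes the whole input, and its final state initializes a second LSTM that emits a label per token. The dimensions, label count, and shared embedding are assumptions.

```python
import torch
import torch.nn as nn

class EncoderLabelerLSTM(nn.Module):
    def __init__(self, vocab=1000, emb=100, hidden=128, n_labels=127):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.encoder = nn.LSTM(emb, hidden, batch_first=True)
        self.labeler = nn.LSTM(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_labels)

    def forward(self, tokens):
        x = self.embed(tokens)           # (batch, seq_len, emb)
        _, (h, c) = self.encoder(x)      # encode the whole input sequence
        y, _ = self.labeler(x, (h, c))   # labeler starts from the encoded state
        return self.out(y)               # (batch, seq_len, n_labels) label scores

model = EncoderLabelerLSTM()
logits = model(torch.randint(0, 1000, (4, 15)))
print(logits.shape)   # torch.Size([4, 15, 127])
```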
Graph cuts are widely used in computer vision. To speed up the optimization process and improve scalability for large graphs, Strandmark and Kahl introduced a splitting method that splits a graph into multiple subgraphs for parallel computation in both shared and distributed memory models. However, this parallel algorithm (the parallel BK algorithm) does not have a polynomial bound on the number of iterations and is found to be non-convergent in some cases, due to the possible multiple optimal solutions of its sub-problems. To remedy this non-convergence problem, in this work we first introduce a merging method capable of merging any number of adjacent sub-graphs that can hardly reach an agreement on their overlapping regions in the parallel BK algorithm. Based on the pseudo-boolean representation of graph cuts, our merging method is shown to effectively reuse all the flows already computed in these sub-graphs. Through both splitting and merging, we further propose a dynamic parallel and distributed graph-cuts algorithm with guaranteed convergence to the globally optimal solution within a predefined number of iterations. In essence, this work provides a general framework that allows more sophisticated splitting and merging strategies to be employed to further boost performance. Our dynamic parallel algorithm is validated with extensive experimental results.
Sep 17 2015 cs.CV
Recently, very high-dimensional feature representations, e.g., Fisher Vector, have achieved excellent performance for visual recognition and retrieval. However, these lengthy representations always cause extremely heavy computational and storage costs and even become unfeasible in some large-scale applications. A few existing techniques can transfer very high-dimensional data into binary codes, but they still require the reduced code length to be relatively long to maintain acceptable accuracies. To target a better balance between computational efficiency and accuracies, in this paper, we propose a novel embedding method called Binary Projection Bank (BPB), which can effectively reduce the very high-dimensional representations to medium-dimensional binary codes without sacrificing accuracies. Instead of using conventional single linear or bilinear projections, the proposed method learns a bank of small projections via the max-margin constraint to optimally preserve the intrinsic data similarity. We have systematically evaluated the proposed method on three datasets: Flickr 1M, ILSVR2010 and UCF101, showing competitive retrieval and recognition accuracies compared with state-of-the-art approaches, but with a significantly smaller memory footprint and lower coding complexity.
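An illustrative view of the encoding step behind a "bank of small projections": the very high-dimensional vector is split into segments, each segment is mapped by a small projection, and the outputs are binarized. The paper learns these projections with a max-margin objective; random projections are used here purely to show the shape of the computation.

```python
import numpy as np

rng = np.random.default_rng(0)
D, n_segments, bits_per_segment = 10000, 50, 4     # 10000-d vector -> 200-bit code
seg_len = D // n_segments
bank = rng.normal(size=(n_segments, seg_len, bits_per_segment))   # bank of small projections

def encode(x):
    segments = x.reshape(n_segments, seg_len)                      # split the long vector
    codes = [(seg @ W) > 0 for seg, W in zip(segments, bank)]      # small projection + sign
    return np.concatenate(codes).astype(np.uint8)                  # medium-length binary code

x = rng.normal(size=D)
code = encode(x)
print(code.shape, code[:16])
```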
Aug 04 2015 cs.CV
Conventional vision algorithms adopt a single type of feature or a simple concatenation of multiple features, which is always represented in a high-dimensional space. In this paper, we propose a novel unsupervised spectral embedding algorithm called Kernelized Multiview Projection (KMP) to better fuse and embed different feature representations. Computing the kernel matrices from different features/views, KMP can encode them with the corresponding weights to achieve a low-dimensional and semantically meaningful subspace where the distribution of each view is sufficiently smooth and discriminative. More crucially, KMP is linear for the reproducing kernel Hilbert space (RKHS) and solves the out-of-sample problem, which allows it to be competent for various practical applications. Extensive experiments on three popular image datasets demonstrate the effectiveness of our multiview embedding algorithm.
In this paper, we introduce the $k\times n$ (with $k\leq n$) truncated, supplemented Pascal matrix, which has the property that any $k$ columns form a linearly independent set. This property is also present in Reed-Solomon codes; however, Reed-Solomon codes are completely dense, whereas the truncated, supplemented Pascal matrix has multiple zeros. If the maximal-distance separable code conjecture is correct, then our matrix has the maximal number of columns (with the aforementioned property) that the conjecture allows. This matrix has applications in coding, network coding, and matroid theory.
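A small numerical check of the "any $k$ columns are linearly independent" property, using one natural candidate construction: columns $(\binom{j}{0},\dots,\binom{j}{k-1})^T$ for $j = 0,\dots,n-2$, supplemented by the standard basis vector $e_k$, checked over the rationals. The paper's exact construction and base field may differ; the code mainly demonstrates how to verify the property exhaustively for small parameters.

```python
from itertools import combinations
from sympy import Matrix, binomial

def truncated_supplemented_pascal(k, n):
    # columns of binomial coefficients, plus the supplemented column e_k
    cols = [[binomial(j, i) for i in range(k)] for j in range(n - 1)]
    cols.append([0] * (k - 1) + [1])
    return Matrix(cols).T            # k x n matrix with these vectors as columns

k, n = 3, 7
M = truncated_supplemented_pascal(k, n)
ok = all(M[:, list(c)].rank() == k for c in combinations(range(n), k))
print(M)
print("every", k, "columns linearly independent:", ok)
```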
We consider broadcasting a block of packets to multiple wireless receivers under random packet erasures using instantly decodable network coding (IDNC). The sender first broadcasts each packet uncoded once, then generates coded packets according to receivers' feedback about their missing packets. We focus on strict IDNC (S-IDNC), where each coded packet includes at most one missing packet of every receiver. But we will also compare it with general IDNC (G-IDNC), where this condition is relaxed. We characterize two fundamental performance limits of S-IDNC: 1) the number of transmissions to complete the broadcast, and 2) the average delay for a receiver to decode a packet. We derive a closed-form expression for the expected minimum number of transmissions in terms of the number of packets and receivers and the erasure probability. We prove that it is NP-hard to minimize the decoding delay of S-IDNC. We also derive achievable upper bounds on the above two performance limits. We show that G-IDNC can outperform S-IDNC in terms of the number of transmissions without packet erasures, but not necessarily with packet erasures. Next, we design optimal and heuristic S-IDNC transmission schemes and coding algorithms with full/intermittent receiver feedback. We present simulation results to corroborate the developed theory and compare with existing schemes.
Random linear network coding (RLNC) is asymptotically throughput optimal in the wireless broadcast of a block of packets from a sender to a set of receivers, but suffers from heavy computational load and packet decoding delay. To mitigate this problem while maintaining good throughput, we partition the packet block into disjoint generations after broadcasting the packets uncoded once and collecting one round of feedback about receivers' packet reception state. We prove the NP-hardness of the optimal partitioning problem by using a hypergraph coloring approach, and develop an efficient heuristic algorithm for its solution. Simulations show that our algorithm outperforms existing solutions.
Compositional embedding models build a representation (or embedding) for a linguistic structure based on its component word embeddings. We propose a Feature-rich Compositional Embedding Model (FCM) for relation extraction that is expressive, generalizes to new domains, and is easy to implement. The key idea is to combine (unlexicalized) hand-crafted features with learned word embeddings. The model is able to directly tackle the difficulties met by traditional compositional embedding models, such as handling arbitrary types of sentence annotations and utilizing global information for composition. We test the proposed model on two relation extraction tasks, and demonstrate that our model outperforms both previous compositional models and traditional feature-rich models on the ACE 2005 relation extraction task and the SemEval 2010 relation classification task. The combination of our model and a log-linear classifier with hand-crafted features gives state-of-the-art results.
We consider a setting in which a sender wishes to broadcast a block of K data packets to a set of wireless receivers, where each receiver already has a subset of the data packets available to it (e.g., from prior transmissions) and wants the rest of the packets. Our goal is to find a linear network coding scheme that yields the minimum average packet decoding delay (APDD), i.e., the average time it takes for a receiver to decode a data packet. Our contributions can be summarized as follows. First, we prove that this problem is NP-hard by presenting a reduction from the hypergraph coloring problem. Next, we show that an MDS-based solution or random linear network coding (RLNC) provides an approximate solution to this problem with approximation ratio $2$ with high probability. Next, we present a methodology for designing specialized approximation algorithms for this problem that outperform RLNC solutions while maintaining the same throughput. In a special case of practical interest with a small number of wanted packets, our solution can achieve an approximation ratio of $(4-2/K)/3$. Finally, we conduct an experimental study that demonstrates the advantages of the presented methodology.
Deterministic linear network coding (DLNC) is an important family of network coding techniques for wireless packet broadcast. In this paper, we show that DLNC is strongly related to and can be effectively studied using matroid theory without bridging index coding. We prove the equivalence between the DLNC solution and matrix matroid. We use this equivalence to study the performance limits of DLNC in terms of the number of transmissions and its dependence on the finite field size. Specifically, we derive the sufficient and necessary condition for the existence of perfect DLNC solutions and prove that such solutions may not exist over certain finite fields. We then show that identifying perfect solutions over any finite field is still an open problem in general. To fill this gap, we develop a heuristic algorithm which employs graphic matroids to find perfect DLNC solutions over any finite field. Numerical results show that its performance in terms of minimum number of transmissions is close to the lower bound, and is better than random linear network coding when the field size is not so large.
In this article, we analyze the application of options contracts in special commodity supply chains such as fresh agricultural products. The problem is discussed from the retailer's point of view. When both the spot market and the futures market are available, we discuss how the retailer chooses the optimal order quantity. Furthermore, overconfidence is introduced into the supply chain of fresh agricultural products, which has not been considered before. Then, based on the retailer's overconfidence, we explore how overconfidence affects the supply chain system under different circumstances. Finally, we conclude that different overconfidence levels have different effects on the retailer's optimal ordering quantity and profit.
Many social networks in our daily life are bipartite networks built on reciprocity. How can we recommend users/friends to a user, so that the user is interested in and attractive to recommended users? In this research, we propose a new collaborative filtering model to improve user recommendations in reciprocal and bipartite social networks. The model considers a user's "taste" in picking others and "attractiveness" in being picked by others. A case study of an online dating network shows that the new model has good performance in recommending both initial and reciprocal contacts.
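A toy sketch of reciprocal recommendation (not the paper's exact model): the directional interest of a user in a candidate is estimated from the user's past contacts ("taste"), the reverse direction from who the candidate has replied to ("attractiveness"), and the two are combined with a harmonic mean so that both directions must be high for a recommendation. The data layout and similarity measure below are assumptions.

```python
import numpy as np

# contacts[m, w] = 1 if user m messaged user w; replies[m, w] = 1 if w replied to m.
contacts = np.array([[1, 0, 1, 0],
                     [0, 1, 0, 1],
                     [1, 1, 0, 0]], float)
replies  = np.array([[1, 0, 0, 0],
                     [0, 1, 0, 0],
                     [0, 1, 0, 0]], float)

def cosine(a, b):
    d = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b) / d if d else 0.0

def reciprocal_score(m, w):
    # "taste": is candidate w similar to candidates m already contacted?
    liked = contacts[m] > 0
    taste_profile = contacts[:, liked].mean(axis=1) if liked.any() else np.zeros(contacts.shape[0])
    taste = cosine(taste_profile, contacts[:, w])
    # "attractiveness": is m similar to the users that w has replied to?
    repliers = replies[:, w] > 0
    attract_profile = contacts[repliers].mean(axis=0) if repliers.any() else np.zeros(contacts.shape[1])
    attract = cosine(attract_profile, contacts[m])
    # harmonic mean: both directions must be non-trivial for a high score
    return 2 * taste * attract / (taste + attract) if taste + attract else 0.0

print(round(reciprocal_score(0, 1), 3))
```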
Our primary goal in this paper is to traverse the performance gap between two linear network coding schemes: random linear network coding (RLNC) and instantly decodable network coding (IDNC) in terms of throughput and decoding delay. We first redefine the concept of packet generation and use it to partition a block of partially-received data packets in a novel way, based on the coding sets in an IDNC solution. By varying the generation size, we obtain a general coding framework which consists of a series of coding schemes, with RLNC and IDNC identified as two extreme cases. We then prove that the throughput and decoding delay performance of all coding schemes in this coding framework are bounded between the performance of RLNC and IDNC and hence throughput-delay tradeoff becomes possible. We also propose implementations of this coding framework to further improve its throughput and decoding delay performance, to manage feedback frequency and coding complexity, or to achieve in-block performance adaption. Extensive simulations are then provided to verify the performance of the proposed coding schemes and their implementations.
In this paper, a comprehensive study of packet-based instantly decodable network coding (IDNC) for single-hop wireless broadcast is presented. The throughput-optimal IDNC solution is proposed and its packet decoding delay performance is investigated. Lower and upper bounds on the achievable throughput and decoding delay performance of IDNC are derived and assessed through extensive simulations. Furthermore, the impact of receivers' feedback frequency on the performance of IDNC is studied, and optimal IDNC solutions are proposed for scenarios where receivers' feedback is only available after an IDNC round composed of several coded transmissions. However, since finding these optimal IDNC solutions is computationally complex, we further propose simple yet efficient heuristic IDNC algorithms. The impact of system settings and parameters such as channel erasure probability, feedback frequency, and the number of receivers is also investigated, and simple guidelines for practical implementations of IDNC are proposed.
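A generic greedy sketch of building one strict-IDNC coded packet from receiver feedback (an illustration of the IDNC constraint, not the paper's optimal or heuristic algorithms): packets are added to the XOR set in order of how many receivers still want them, subject to each receiver missing at most one packet in the set, so every receiver can immediately decode its missing packet from the coded transmission.

```python
def greedy_idnc_coded_set(wants):
    """wants: dict mapping receiver id -> set of packet ids that receiver is missing."""
    demand = {}
    for missing in wants.values():
        for p in missing:
            demand[p] = demand.get(p, 0) + 1
    coded = set()
    for p in sorted(demand, key=demand.get, reverse=True):   # most-wanted packets first
        candidate = coded | {p}
        # strict-IDNC constraint: at most one missing packet of every receiver
        if all(len(candidate & missing) <= 1 for missing in wants.values()):
            coded = candidate
    return coded   # the XOR of these packets is instantly decodable at every receiver

wants = {"r1": {1, 3}, "r2": {2, 3}, "r3": {1, 4}}
print(greedy_idnc_coded_set(wants))
```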
This paper studies the tension between throughput and decoding delay performance of two widely-used network coding schemes: random linear network coding (RLNC) and instantly decodable network coding (IDNC). A single-hop broadcasting system model is considered that aims to deliver a block of packets to all receivers in the presence of packet erasures. For a fair and analytically tractable comparison between the two coding schemes, the transmission comprises two phases: a systematic transmission phase and a network coded transmission phase, which is further divided into rounds. After the systematic transmission phase and given the same packet reception state, three quantitative metrics are proposed and derived for each scheme: 1) the absolute minimum number of transmissions in the first coded transmission round (assuming no erasures), 2) the probability distribution of extra coded transmissions in a subsequent round (due to erasures), and 3) the average packet decoding delay. This comparative study enables application-aware adaptive selection between IDNC and RLNC after the systematic transmission phase. One contribution of this paper is a deep and systematic understanding of the IDNC scheme, including the notion of packet diversity and an optimal IDNC encoding scheme for minimizing metric 1; this is generally NP-hard, but nevertheless required for characterizing and deriving all three metrics. Analytical and numerical results show that there is no clear winner between RLNC and IDNC if one is concerned with both throughput and decoding delay performance. IDNC is preferable to RLNC when the number of receivers is smaller than the packet block size, and the case reverses when the number of receivers is much greater than the packet block size. In the middle regime, the choice can depend on the application and the specific instance of the problem.
Mar 19 2012 cs.DS
A large fraction of online display advertising is sold via guaranteed contracts: a publisher guarantees to the advertiser a certain number of user visits satisfying the targeting predicates of the contract. The publisher is then tasked with solving the ad serving problem - given a user visit, which of the thousands of matching contracts should be displayed, so that by the expiration time every contract has obtained the requisite number of user visits. The challenges of the problem come from (1) the sheer size of the problem being solved, with tens of thousands of contracts and billions of user visits, (2) the unpredictability of user behavior, since these contracts are sold months ahead of time, when only a forecast of user visits is available, and (3) the minute amount of resources available online, as an ad server must respond with a matching contract in a fraction of a second. We present a solution to the guaranteed delivery ad serving problem using compact allocation plans. These plans, computed offline, can be efficiently queried by the ad server during an ad call; they are small, using only O(1) space per contract; and they are stateless, allowing for distributed serving without any central coordination. We evaluate this approach on a real set of user visits and guaranteed contracts and show that compact allocation plans are an effective way of solving the guaranteed delivery ad serving problem.
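A highly simplified sketch of serving from a precomputed plan (the actual plan structure in the paper is richer): offline, each contract is assigned a constant amount of state, here reduced to a single serving weight; online, the ad server picks among the contracts matching a visit using only that per-contract state, with no central coordination. The contract names and weights are hypothetical.

```python
import random

# Offline output of plan computation: one small value per contract (hypothetical numbers).
allocation_plan = {"sports_fans": 0.30, "under_30": 0.25, "run_of_network": 0.45}

def serve(matching_contracts):
    """Pick one contract for this user visit using only per-contract plan values."""
    weights = [allocation_plan[c] for c in matching_contracts]
    total = sum(weights)
    if total == 0:
        return None
    r = random.random() * total
    for contract, w in zip(matching_contracts, weights):
        r -= w
        if r <= 0:
            return contract
    return matching_contracts[-1]

print(serve(["sports_fans", "run_of_network"]))
```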