# Computer Science (cs)

• We consider a problem introduced by Mossel and Ross [Shotgun assembly of labeled graphs, arXiv:1504.07682]. Suppose a random $n\times n$ jigsaw puzzle is constructed by independently and uniformly choosing the shape of each "jig" from $q$ possibilities. We are given the shuffled pieces. Then, depending on $q$, what is the probability that we can reassemble the puzzle uniquely? We say that two solutions of a puzzle are similar if they only differ by permutation of duplicate pieces, and rotation of rotationally symmetric pieces. In this paper, we show that, with high probability, such a puzzle has at least two non-similar solutions when $2\leq q \leq \frac{2}{\sqrt{e}}n$, all solutions are similar when $q\geq (2+\varepsilon)n$, and the solution is unique when $q=\omega(n)$.
• We present a categorical construction for modelling both definite and indefinite causal structures within a general class of process theories that include classical probability theory and quantum theory. Unlike prior constructions within categorical quantum mechanics, the objects of this theory encode fine-grained causal relationships between subsystems and give a new method for expressing and deriving consequences for a broad class of causal structures. To illustrate this point, we show that this framework admits processes with definite causal structures, namely one-way signalling processes, non-signalling processes, and quantum n-combs, as well as processes with indefinite causal structure, such as the quantum switch and the process matrices of Oreshkov, Costa, and Brukner. We furthermore give derivations of their operational behaviour using simple, diagrammatic axioms.
• A large amount of information exists in reviews written by users. This source of information has been ignored by most of the current recommender systems while it can potentially alleviate the sparsity problem and improve the quality of recommendations. In this paper, we present a deep model to learn item properties and user behaviors jointly from review text. The proposed model, named Deep Cooperative Neural Networks (DeepCoNN), consists of two parallel neural networks coupled in the last layers. One of the networks focuses on learning user behaviors exploiting reviews written by the user, and the other one learns item properties from the reviews written for the item. A shared layer is introduced on the top to couple these two networks together. The shared layer enables latent factors learned for users and items to interact with each other in a manner similar to factorization machine techniques. Experimental results demonstrate that DeepCoNN significantly outperforms all baseline recommender systems on a variety of datasets.
• We propose to leverage concept-level representations for complex event recognition in photographs given limited training examples. We introduce a novel framework to discover event concept attributes from the web and use that to extract semantic features from images and classify them into social event categories with few training examples. Discovered concepts include a variety of objects, scenes, actions and event sub-types, leading to a discriminative and compact representation for event images. Web images are obtained for each discovered event concept and we use (pretrained) CNN features to train concept classifiers. Extensive experiments on challenging event datasets demonstrate that our proposed method outperforms several baselines using deep CNN features directly in classifying images into events with limited training examples. We also demonstrate that our method achieves the best overall accuracy on a dataset with unseen event categories using a single training example.
• While recent deep neural networks have achieved promising results for 3D reconstruction from a single-view image, these rely on the availability of RGB textures in images and extra information as supervision. In this work, we propose novel stacked hierarchical networks and an end-to-end training strategy to tackle a more challenging task for the first time: 3D reconstruction from a single-view 2D silhouette image. We demonstrate, both qualitatively and quantitatively, that our model is able to conduct 3D reconstruction from a single-view silhouette image. Evaluation is performed using ShapeNet for the single-view reconstruction, and results are presented in comparison with a single network to highlight the improvements obtained with the proposed stacked networks and the end-to-end training strategy. Furthermore, 3D reconstruction quality in terms of IoU is compared with state-of-the-art 3D reconstruction from a single-view RGB image, and the proposed model achieves the higher IoU.
• Governments and businesses increasingly rely on data analytics and machine learning (ML) for improving their competitive edge in areas such as consumer satisfaction, threat intelligence, decision making, and product efficiency. However, by cleverly corrupting a subset of data used as input to a target's ML algorithms, an adversary can perturb outcomes and compromise the effectiveness of ML technology. While prior work in the field of adversarial machine learning has studied the impact of input manipulation on correct ML algorithms, we consider the exploitation of bugs in ML implementations. In this paper, we characterize the attack surface of ML programs, and we show that malicious inputs exploiting implementation bugs enable strictly more powerful attacks than the classic adversarial machine learning techniques. We propose a semi-automated technique, called steered fuzzing, for exploring this attack surface and for discovering exploitable bugs in machine learning programs, in order to demonstrate the magnitude of this threat. As a result of our work, we responsibly disclosed five vulnerabilities, established three new CVE-IDs, and illuminated a common insecure practice across many machine learning systems. Finally, we outline several research directions for further understanding and mitigating this threat.
• Jan 18 2017 cs.DC arXiv:1701.04733v1
GPUs are dedicated processors used for complex calculations and simulations, and they can be used effectively for tropical algebra computations. Tropical algebra is based on max-plus algebra and min-plus algebra. In this paper we propose and design a library based on tropical algebra, named Basic Tropical Algebra Subroutines (BTAS), which provides standard vector and matrix operations. The BTAS library is tested by implementing a sequential version of the Floyd-Warshall algorithm on a CPU and a parallel version on a GPU. The developed library delivered substantially better results on a less expensive GPU than on the CPU.
• Variational Autoencoders (VAEs) are expressive latent variable models that can be used to learn complex probability distributions from training data. However, the quality of the resulting model crucially relies on the expressiveness of the inference model used during training. We introduce Adversarial Variational Bayes (AVB), a technique for training Variational Autoencoders with arbitrarily expressive inference models. We achieve this by introducing an auxiliary discriminative network that allows us to rephrase the maximum-likelihood problem as a two-player game, hence establishing a principled connection between VAEs and Generative Adversarial Networks (GANs). We show that in the nonparametric limit our method yields an exact maximum-likelihood assignment for the parameters of the generative model, as well as the exact posterior distribution over the latent variables given an observation. Contrary to competing approaches which combine VAEs with GANs, our approach has a clear theoretical justification, retains most advantages of standard Variational Autoencoders and is easy to implement.
• Jan 18 2017 cs.CV q-bio.NC arXiv:1701.04674v1
Computer vision has made remarkable progress in recent years. Deep neural network (DNN) models optimized to identify objects in images exhibit unprecedented task-trained accuracy and, remarkably, some generalization ability: new visual problems can now be solved more easily based on previous learning. Biological vision (learned in life and through evolution) is also accurate and general-purpose. Is it possible that these different learning regimes converge to similar problem-dependent optimal computations? We therefore asked whether the human system-level computation of visual perception has DNN correlates and considered several anecdotal test cases. We found that perceptual sensitivity to image changes has DNN mid-computation correlates, while sensitivity to segmentation, crowding and shape has DNN end-computation correlates. Our results quantify the applicability of using DNN computation to estimate perceptual loss, and are consistent with the fascinating theoretical view that properties of human perception are a consequence of architecture-independent visual learning.
• We present Convolutional Oriented Boundaries (COB), which produces multiscale oriented contours and region hierarchies starting from generic image classification Convolutional Neural Networks (CNNs). COB is computationally efficient, because it requires a single CNN forward pass for multi-scale contour detection and it uses a novel sparse boundary representation for hierarchical segmentation; it gives a significant leap in performance over the state-of-the-art, and it generalizes very well to unseen categories and datasets. Particularly, we show that learning to estimate not only contour strength but also orientation provides more accurate results. We perform extensive experiments for low-level applications on BSDS, PASCAL Context, PASCAL Segmentation, and NYUD to evaluate boundary detection performance, showing that COB provides state-of-the-art contours and region hierarchies in all datasets. We also evaluate COB on high-level tasks when coupled with multiple pipelines for object proposals, semantic contours, semantic segmentation, and object detection on various databases (MS-COCO, SBD, PASCAL VOC'07), showing that COB also improves the results for all tasks.
• Given a vertex-weighted graph $G=(V,E)$ and a set $S \subseteq V$, a subset feedback vertex set $X$ is a set of the vertices of $G$ such that the graph induced by $V \setminus X$ has no cycle containing a vertex of $S$. The Subset Feedback Vertex Set problem takes as input $G$ and $S$ and asks for the subset feedback vertex set of minimum total weight. In contrast to the classical Feedback Vertex Set problem, which is obtained from the Subset Feedback Vertex Set problem for $S=V$, the Subset Feedback Vertex Set problem restricted to graph classes is known to be NP-complete on split graphs and, consequently, on chordal graphs. However, whereas Feedback Vertex Set is polynomially solvable for AT-free graphs, no such result is known for the Subset Feedback Vertex Set problem on any subclass of AT-free graphs. Here we give the first polynomial-time algorithms for the problem on two unrelated subclasses of AT-free graphs: interval graphs and permutation graphs. As a byproduct, we show that there exists a polynomial-time algorithm for circular-arc graphs by suitably applying our algorithm for interval graphs. Moreover, towards the unknown complexity of the problem for AT-free graphs, we give a polynomial-time algorithm for co-bipartite graphs. Thus we contribute to the first positive results for the Subset Feedback Vertex Set problem when restricted to graph classes for which Feedback Vertex Set is solved in polynomial time.
• The evaluation of a query over a probabilistic database boils down to computing the probability of a suitable Boolean function, the lineage of the query over the database. The method of query compilation approaches the task in two stages: first, the query lineage is implemented (compiled) in a circuit form where probability computation is tractable; and second, the desired probability is computed over the compiled circuit. A basic theoretical quest in query compilation is that of identifying pertinent classes of queries whose lineages admit compact representations over increasingly succinct, tractable circuit classes. Building on previous work by Jha and Suciu (2012) and Petke and Razgon (2013), we focus on queries whose lineages admit circuit implementations with small treewidth, and investigate their compilability within tame classes of decision diagrams. In perfect analogy with the characterization of bounded circuit pathwidth by bounded OBDD width, we show that a class of Boolean functions has bounded circuit treewidth if and only if it has bounded SDD width. Sentential decision diagrams (SDDs) are central in knowledge compilation, being essentially as tractable as OBDDs but exponentially more succinct. By incorporating constant-width SDDs and polynomial-size SDDs, we refine the panorama of query compilation for unions of conjunctive queries with and without inequalities.
• Recently there has been enormous interest in generative models for images in deep learning. In pursuit of this, Generative Adversarial Networks (GANs) and Variational Auto-Encoders (VAEs) have surfaced as the two most prominent and popular models. While VAEs tend to produce excellent reconstructions but blurry samples, GANs generate sharp but slightly distorted images. In this paper we propose a new model called Variational InfoGAN (ViGAN). Our aim is twofold: (i) to generate new images conditioned on visual descriptions, and (ii) to modify an image by fixing its latent representation and varying the visual description. We evaluate our model on Labeled Faces in the Wild (LFW), celebA and a modified version of the MNIST dataset, and demonstrate the ability of our model to generate new images as well as to modify a given image by changing attributes.
• Automatic continuous-time, continuous-value assessment of a patient's pain from face video is highly sought after by the medical profession. Despite the recent advances in deep learning that attain impressive results in many domains, pain estimation risks not being able to benefit from this due to the difficulty in obtaining data sets of considerable size. In this work we propose a combination of hand-crafted and deep-learned features that makes the most of deep learning techniques in small sample settings. Encoding shape, appearance, and dynamics, our method significantly outperforms the current state of the art, attaining an RMSE of less than 1 point on a 16-level pain scale, whilst simultaneously scoring a 67.3% Pearson correlation coefficient between our predicted pain level time series and the ground truth.
• Most existing community-related studies focus on detection, which aims to find the community membership of each user from user friendship links. However, membership alone, without a complete profile of what a community is and how it interacts with other communities, has limited applications. This motivates us to consider systematically profiling the communities and thereby developing useful community-level applications. In this paper, we formalize the concept of community profiling for the first time. With rich user information on the network, such as user-published content and user diffusion links, we characterize a community in terms of both its internal content profile and external diffusion profile. The difficulty of community profiling is often underestimated. We identify three unique challenges and propose a joint Community Profiling and Detection (CPD) model to address them accordingly. We also contribute a scalable inference algorithm, which scales linearly with the data size and is easily parallelizable. We evaluate CPD on large-scale real-world data sets, and show that it is significantly better than the state-of-the-art baselines in various tasks.
• This volume contains the papers presented at LINEARITY 2016, the Fourth International Workshop on Linearity, held on June 26, 2016 in Porto, Portugal. The workshop was a one-day satellite event of FSCD 2016, the first International Conference on Formal Structures for Computation and Deduction. The aim of this workshop was to bring together researchers who are developing theory and applications of linear calculi, to foster their interaction and provide a forum for presenting new ideas and work in progress, and enable newcomers to learn about current activities in this area. Of interest were new results that made a central use of linearity, ranging from foundational work to applications in any field. This included: sub-linear logics, linear term calculi, linear type systems, linear proof-theory, linear programming languages, applications to concurrency, interaction-based systems, verification of linear systems, and biological and chemical models of computation.
• In recent times, the use of separable convolutions in deep convolutional neural network architectures has been explored. Several researchers, most notably Chollet (2016) and Ghosh (2017), have used separable convolutions in their deep architectures and have demonstrated state-of-the-art or close to state-of-the-art performance. However, the underlying mechanism of action of separable convolutions is still not fully understood. Although their mathematical definition is well understood as a depthwise convolution followed by a pointwise convolution, deeper interpretations such as the extreme Inception hypothesis (Chollet, 2016) have failed to provide a thorough explanation of their efficacy. In this paper, we propose a hybrid interpretation that we believe is a better model for explaining the efficacy of separable convolutions.
• This paper is a tutorial for newcomers to the field of automated verification tools, though we assume the reader to be relatively familiar with Hoare-style verification. In this paper, besides introducing the most basic features of the language and verifier Dafny, we place special emphasis on how to use Dafny as an assistant in the development of verified programs. Our main aim is to encourage the software engineering community to make the move towards using formal verification tools.
• How much can pruning algorithms teach us about the fundamentals of learning representations in neural networks? A lot, it turns out. Neural network model compression has become a topic of great interest in recent years, and many different techniques have been proposed to address this problem. In general, this is motivated by the idea that smaller models typically lead to better generalization. At the same time, the decision of what to prune and when to prune necessarily forces us to confront our assumptions about how neural networks actually learn to represent patterns in data. In this work we set out to test several long-held hypotheses about neural network learning representations and numerical approaches to pruning. To accomplish this we first reviewed the historical literature and derived a novel algorithm to prune whole neurons (as opposed to the traditional method of pruning weights) from optimally trained networks using a second-order Taylor method. We then set about testing the performance of our algorithm and analyzing the quality of the decisions it made. As a baseline for comparison we used a first-order Taylor method based on the Skeletonization algorithm and an exhaustive brute-force serial pruning algorithm. Our proposed algorithm worked well compared to a first-order method, but not nearly as well as the brute-force method. Our error analysis led us to question the validity of many widely-held assumptions behind pruning algorithms in general and the trade-offs we often make in the interest of reducing computational complexity. We discovered that there is a straightforward way, however expensive, to serially prune 40-70% of the neurons in a trained network with minimal effect on the learning representation and without any re-training.
• We prove a downward separation for $\mathsf{\Sigma}_2$-time classes. Specifically, we prove that if $\Sigma_2$E does not have polynomial size non-deterministic circuits, then $\Sigma_2$SubEXP does not have fixed polynomial size non-deterministic circuits. To achieve this result, we use Santhanam's technique on augmented Arthur-Merlin protocols defined by Aydinlioğlu and van Melkebeek. We show that augmented Arthur-Merlin protocols with one bit of advice do not have fixed polynomial size non-deterministic circuits. We also prove a weak unconditional derandomization of a certain type of promise Arthur-Merlin protocols. Using Williams' easy hitting set technique, we show that $\Sigma_2$-promise AM problems can be decided in $\Sigma_2$SubEXP with $n^c$ advice, for some fixed constant $c$.
• Distributed computing has become very popular because it offers several advantages over centralized computing, including high-performance computing at very low cost. Each router implements a queuing mechanism that allocates resources as efficiently as possible and governs packet transmission and buffering. In this paper, different queuing disciplines are implemented for packet transmission under allocated bandwidth, where packets are dropped on buffer overflow. This results in latency in packet transmission, as a packet has to wait in a queue before being transmitted again. Common queuing mechanisms include first in first out, priority queuing, and weighted fair queuing. The work simulates a heterogeneous environment with a simulator tool in order to improve quality of service by evaluating the performance of these queuing disciplines. This is demonstrated by interconnecting heterogeneous devices through a step topology. The authors compare data, voice, and video traffic by analyzing performance in terms of packet drop rate, delay variation, end-to-end delay, and queuing delay, and examine how the different queuing disciplines affect the applications and the utilization of network resources at the routers. Before evaluating the performance of the connected devices, a Unified Modeling Language class diagram is designed to represent the static model for evaluating the performance of the step topology. Results are described through various case studies.
• We study the expressive power of subrecursive probabilistic higher-order calculi. More specifically, we show that endowing a very expressive deterministic calculus like Gödel's T with various forms of probabilistic choice operators may result in calculi which are not equivalent as for the class of distributions they give rise to, although they all guarantee almost-sure termination. Along the way, we introduce a probabilistic variation of the classic reducibility technique, and we prove that the simplest form of probabilistic choice leaves the expressive power of T essentially unaltered. The paper ends with some observations about functional expressivity: expectedly, all the considered calculi represent precisely the functions which T itself represents.
• A novel parallel algorithm for solving the classical Decision Boolean Satisfiability problem with clauses in conjunctive normal form is described. My approach to solving SAT does not use algebra or other computational search strategies such as branch and bound, back-forward, tree representation, etc. The method is based on a special class of SAT problems, Simple SAT (SSAT). Like my previous versions, the algorithm features parallel execution, an object-oriented design, and short termination, but it also keeps track of the tested unsatisfactory binary values to improve efficiency and to favor short termination. The resulting algorithm is linear with respect to the number of clauses, plus a data-processing step on the partial solutions of the SSAT subproblems of an arbitrary SAT, and it is bounded by $2^{n}$ iterations, where $n$ is the number of logical variables. The novelty for the solution of arbitrary SAT problems is a linear algorithm, such that its complexity is less than or equal to that of state-of-the-art algorithms for solving SAT. The implication for the class NP is described in detail.
• We consider a game of decentralized timing of jobs to a single server (machine) with a penalty for deviation from some due date and no delay costs. The job sizes are homogeneous and deterministic. Each job belongs to a single decision maker, a customer, who aims to arrive at a time that minimizes his deviation penalty. If multiple customers arrive at the same time, then their order is determined by a uniform random draw. If the cost function has a weighted absolute deviation form, then any Nash equilibrium is pure and symmetric, that is, all customers arrive together. Furthermore, we show that there exist multiple, in fact a continuum of, equilibrium arrival times, and provide necessary and sufficient conditions for the socially optimal arrival time to be an equilibrium. The base model is solved explicitly, but the prevalence of a pure symmetric equilibrium is shown to be robust to several relaxations of the assumptions: inclusion of small waiting costs, stochastic job sizes, random-sized population, heterogeneous due dates and non-linear deviation penalties.
• Nowadays many companies have large amounts of raw, unstructured data available. Among Big Data enabling technologies, a central place is held by the MapReduce framework and, in particular, by its open source implementation, Apache Hadoop. For cost effectiveness considerations, a common approach entails sharing server clusters among multiple users. The underlying infrastructure should provide every user with a fair share of computational resources, ensuring that Service Level Agreements (SLAs) are met and avoiding waste. In this paper we consider two mathematical programming problems that model the optimal allocation of computational resources in a Hadoop 2.x cluster, with the aim of developing new capacity allocation techniques that guarantee better performance in shared data centers. Our goal is to get a substantial reduction of power consumption while respecting the deadlines stated in the SLAs and avoiding penalties associated with job rejections. The core of this approach is a distributed algorithm for runtime capacity allocation, based on Game Theory models and techniques, that mimics the MapReduce dynamics by means of interacting players, namely the central Resource Manager and Class Managers.
• This paper introduces a class of specific puncturing patterns, called symmetric puncturing patterns, which can be characterized and generated from the rows of the generator matrix $G_N$. They are first shown to be non-equivalent, then a low-complexity method to generate symmetric puncturing patterns is proposed, which performs a search tree algorithm with limited depth, over the rows of $G_N$. Symmetric patterns are further optimized by density evolution, and shown to yield better performance than state-of-the-art rate compatible code constructions, relying on either puncturing or shortening techniques.
• Finding the camera pose is an important step in many egocentric video applications. It has been widely reported that state-of-the-art SLAM algorithms fail on egocentric videos. In this paper, we propose a robust method for camera pose estimation, designed specifically for egocentric videos. In an egocentric video, the camera views the same scene point multiple times as the wearer's head sweeps back and forth. We use this specific motion profile to perform short loop closures aligned with the wearer's footsteps. For egocentric videos, depth estimation is usually noisy. In an important departure, we use 2D computations for rotation averaging which do not rely upon depth estimates. These two modifications result in a much more stable algorithm, as is evident from our experiments on various egocentric video datasets for different egocentric applications. The proposed algorithm resolves a long-standing problem in egocentric vision and unlocks new usage scenarios for future applications.
• This paper focuses on the recently introduced Successive Cancellation Flip (SCFlip) decoder of polar codes. Our contribution is twofold. First, we propose the use of an optimized metric to determine the flipping positions within the SCFlip decoder, which improves its ability to find the first error that occurred during the initial SC decoding attempt. We also show that the proposed metric allows closely approaching the performance of an ideal SCFlip decoder. Second, we introduce a generalisation of the SCFlip decoder to a number of $\omega$ nested flips, denoted by SCFlip-$\omega$, using a similar optimized metric to determine the positions of the nested flips. We show that the SCFlip-2 decoder yields significant gains in terms of decoding performance and competes with the performance of the CRC-aided SC-List decoder with list size L=4, while having an average decoding complexity similar to that of the standard SC decoding, at medium to high signal to noise ratio.
• We formulate and analyze a graphical model selection (GMS) method for inferring the conditional independence graph of a high-dimensional non-stationary Gaussian random process (time series) from a finite-length observation. The observed process samples are assumed to be uncorrelated over time but to have different covariance matrices. We characterize the sample complexity of graphical model selection for such processes by analyzing a particular selection method, which is based on sparse neighborhood regression. Our results indicate that, similar to the case of i.i.d. samples, accurate GMS is possible even in the high-dimensional regime if the underlying conditional independence graph is sufficiently sparse.
• A novel processing-in-storage (PRinS) architecture based on Resistive CAM (ReCAM) is described and proposed for Smith-Waterman (S-W) sequence alignment. The ReCAM massively-parallel compare operation finds matching base-pairs in a fixed number of cycles, regardless of sequence length. The ReCAM PRinS S-W algorithm is simulated and compared to FPGA, Xeon Phi and GPU-based implementations, showing at least 3.7x higher throughput and at least 15x lower power dissipation.
• Scene understanding and object recognition are difficult to achieve yet crucial skills for robots. Recently, Convolutional Neural Networks (CNNs) have shown success in these tasks. However, there is still a gap between their performance on image datasets and in real-world robotics scenarios. We present a novel paradigm for incrementally improving a robot's visual perception through active human interaction. In this paradigm, the user introduces novel objects to the robot by means of pointing and voice commands. Given this information, the robot visually explores the object and adds images of it to re-train the perception module. Our base perception module is based on recent developments in object detection and recognition using deep learning. Our method leverages state-of-the-art CNNs from off-line batch learning, human guidance, robot exploration and incremental on-line learning.
• Jan 18 2017 cs.LO arXiv:1701.04691v1
We present an interaction net implementation of optimal reduction for the pure untyped lambda calculus without use of any control nodes to solve the problem of matching fans. While preserving optimality, the implementation beats the interaction net implementation of closed reduction by the total number of interactions.
• The emergence of low-power wide area networks (LPWANs) as a new agent in the Internet of Things (IoT) will result in the incorporation into the digital world of low-automated processes from a wide variety of sectors. The single-hop conception of typical LPWAN deployments, though simple and robust, overlooks the self-organization capabilities of network devices, suffers from lack of scalability in crowded scenarios, and pays little attention to energy consumption. Aiming to make the most of devices' capabilities, the HARE protocol stack is proposed in this paper as a new LPWAN technology flexible enough to adopt uplink multi-hop communications when they prove more energy-efficient. Results from a real testbed show energy savings of up to 15% when using a multi-hop approach while keeping the same network reliability. The system's self-organizing capability and resilience have also been validated by performing numerous iterations of the association mechanism and deliberately switching off network devices.
• A compact information-rich representation of the environment, also called a feature abstraction, can simplify a robot's task of mapping its raw sensory inputs to useful action sequences. However, in environments that are non-stationary and only partially observable, a single abstraction is probably not sufficient to encode most variations. Therefore, learning multiple sets of spatially or temporally local, modular abstractions of the inputs would be beneficial. How can a robot learn these local abstractions without a teacher? More specifically, how can it decide from where and when to start learning a new abstraction? A recently proposed algorithm called Curious Dr. MISFA addresses this problem. The algorithm is based on two underlying learning principles called artificial curiosity and slowness. The former is used to make the robot self-motivated to explore by rewarding itself whenever it makes progress learning an abstraction; the latter is used to update the abstraction by extracting slowly varying components from raw sensory inputs. Curious Dr. MISFA's application is, however, limited to discrete domains constrained by a pre-defined state space and has design limitations that make it unstable in certain situations. This paper presents a significant improvement that is applicable to continuous environments, is computationally less expensive, simpler to use with fewer hyperparameters, and stable in certain non-stationary environments. We demonstrate the efficacy and stability of our method in a vision-based robot simulator.
• In this paper, we investigate whether text from a Community Question Answering (QA) platform can be used to predict and describe real-world attributes. We experiment with predicting a wide range of 62 demographic attributes for neighbourhoods of London. We use the text from QA platform of Yahoo! Answers and compare our results to the ones obtained from Twitter microblogs. Outcomes show that the correlation between the predicted demographic attributes using text from Yahoo! Answers discussions and the observed demographic attributes can reach an average Pearson correlation coefficient of $\rho = 0.54$, slightly higher than the predictions obtained using Twitter data. Our qualitative analysis indicates that there is semantic relatedness between the highest correlated terms extracted from both datasets and their relative demographic attributes. Furthermore, the correlations highlight the different natures of the information contained in Yahoo! Answers and Twitter. While the former seems to offer a more encyclopedic content, the latter provides information related to the current sociocultural aspects or phenomena.
• Jan 18 2017 cs.DB arXiv:1701.04652v1
With XML becoming a ubiquitous language for data interoperability purposes in various domains, efficiently querying XML data is a critical issue. This has led to the design of algebraic frameworks based on tree-shaped patterns akin to the tree-structured data model of XML. Tree patterns are graphic representations of queries over data trees. They are actually matched against an input data tree to answer a query. Since the turn of the twenty-first century, an astounding research effort has been focusing on tree pattern models and matching optimization (a crucial issue). This paper is a comprehensive survey of these topics, in which we outline and compare the various features of tree patterns. We also review and discuss the two main families of approaches for optimizing tree pattern matching, namely pattern tree minimization and holistic matching. We finally present actual tree pattern-based developments, to provide a global overview of this significant research topic.
• We introduce a technique for the analysis of general spatially coupled systems that are governed by scalar recursions. Such systems can be expressed in variational form in terms of a potential functional. We show, under mild conditions, that the potential functional is \emph{displacement convex} and that the minimizers are given by the fixed points of the recursions. Furthermore, we give the conditions on the system such that the minimizing fixed point is unique up to translation along the spatial direction. These conditions match those in \cite{KRU12} for the existence of spatial fixed points. \emph{Displacement convexity} applies to a wide range of spatially coupled recursions appearing in coding theory, compressive sensing, random constraint satisfaction problems, as well as statistical mechanical models. We illustrate it with applications to Low-Density Parity-Check and generalized LDPC codes used for transmission on the binary erasure channel, or general binary memoryless symmetric channels within the Gaussian reciprocal channel approximation, as well as compressive sensing.
• With any (not necessarily proper) edge $k$-colouring $\gamma:E(G)\longrightarrow\{1,\dots,k\}$ of a graph $G$, one can associate a vertex colouring $\sigma_{\gamma}$ given by $\sigma_{\gamma}(v)=\sum_{e\ni v}\gamma(e)$. A neighbour-sum-distinguishing edge $k$-colouring is an edge colouring whose associated vertex colouring is proper. The neighbour-sum-distinguishing index of a graph $G$ is then the smallest $k$ for which $G$ admits a neighbour-sum-distinguishing edge $k$-colouring. These notions naturally extend to total colourings of graphs that assign colours to both vertices and edges. We study in this paper equitable neighbour-sum-distinguishing edge colourings and total colourings, that is, colourings $\gamma$ for which the number of elements in any two colour classes of $\gamma$ differ by at most one. We determine the equitable neighbour-sum-distinguishing index of complete graphs, complete bipartite graphs and forests, and the equitable neighbour-sum-distinguishing total chromatic number of complete graphs and bipartite graphs.
• Crowdsourcing, a major economic issue, is the practice whereby a firm outsources internal tasks to the crowd. It is a form of digital subcontracting for the general public. The evaluation of the participants' work quality is a major issue in crowdsourcing. Indeed, contributions must be controlled to ensure the effectiveness and relevance of the campaign. We are particularly interested in small, fast and not automatable tasks. Several methods have been proposed to solve this problem, but they are applicable when the "golden truth" is not always known. The distinctive feature of this work is that it proposes a method for calculating the degree of expertise in the presence of gold data in crowdsourcing. This method is based on the belief function theory and proposes a structuring of data using graphs. The proposed approach will be assessed and applied to the data.
• We study computable topological spaces and semicomputable and computable sets in these spaces. In particular, we investigate conditions under which semicomputable sets are computable. We prove that a semicomputable compact manifold $M$ is computable if its boundary $\partial M$ is computable. We also show how this result combined with certain construction which compactifies a semicomputable set leads to the conclusion that some noncompact semicomputable manifolds in computable metric spaces are computable.
• Weighted automata (WA) are an important formalism to describe quantitative properties. Obtaining equivalent deterministic machines is a longstanding research problem. In this paper we consider WA with a set semantics, meaning that the semantics is given by the set of weights of accepting runs. We focus on multi-sequential WA that are defined as finite unions of sequential WA. The problem we address is to minimize the size of this union. We call this minimum the degree of sequentiality of (the relation realized by) the WA. For a given positive integer k, we provide multiple characterizations of relations realized by a union of k sequential WA over an infinitary finitely generated group: a Lipschitz-like machine independent property, a pattern on the automaton (a new twinning property) and a subclass of cost register automata. When possible, we effectively translate a WA into an equivalent union of k sequential WA. We also provide a decision procedure for our twinning property for commutative computable groups thus allowing to compute the degree of sequentiality. Last, we show that these results also hold for word transducers and that the associated decision problem is Pspace-complete.
• In this paper we propose and implement novel techniques for performance evaluation of web traffic (response time, response code, etc.), with no reassembly of the underlying TCP connection, which severely restricts the traffic analysis throughput. Furthermore, our proposed software for HTTP traffic analysis runs on standard hardware, which is very cost-effective. In addition, we present sub-TCP connection load balancing techniques that significantly increase throughput at the expense of losing very few HTTP transactions. Such techniques provide performance evaluation statistics which are indistinguishable from the single-threaded alternative with full TCP connection reassembly.
• Traditional social organizations such as those for the management of healthcare are the result of designs that matched well with an operational context considerably different from the one we are experiencing today. The new context reveals all the fragility of our societies. In this paper, a platform is introduced by combining social-oriented communities and complex-event processing concepts: SELFSERV. Its aim is to complement the "old recipes" with smarter forms of social organization based on the self-service paradigm and by exploring culture-specific aspects and technological challenges.
• We devise a new method to calculate a large number of Mellin moments of single scale quantities using the systems of differential and/or difference equations obtained by integration-by-parts identities between the corresponding Feynman integrals of loop corrections to physical quantities. These scalar quantities have a much simpler mathematical structure than the complete quantity. A sufficiently large set of moments may even allow the analytic reconstruction of the whole quantity considered, holding in case of first order factorizing systems. In any case, one may derive highly precise numerical representations in general using this method, which is otherwise completely analytic.
• Content-based routing (CBR) is a powerful model that supports scalable asynchronous communication among large sets of geographically distributed nodes. Yet, preserving privacy represents a major limitation for the wide adoption of CBR, notably when the routers are located in public clouds. Indeed, a CBR router must see the content of the messages sent by data producers, as well as the filters (or subscriptions) registered by data consumers. This represents a major deterrent for companies for which data is a key asset, as for instance in the case of financial markets or to conduct sensitive business-to-business transactions. While there exist some techniques for privacy-preserving computation, they are either prohibitively slow or too limited to be usable in real systems. In this paper, we follow a different strategy by taking advantage of trusted hardware extensions that have just been introduced in off-the-shelf processors and provide a trusted execution environment. We exploit Intel's new software guard extensions (SGX) to implement a CBR engine in a secure enclave. Thanks to the hardware-based trusted execution environment (TEE), the compute-intensive CBR operations can operate on decrypted data shielded by the enclave and leverage efficient matching algorithms. Extensive experimental evaluation shows that SGX adds only limited overhead to insecure plaintext matching outside secure enclaves while providing much better performance and more powerful filtering capabilities than alternative software-only solutions. To the best of our knowledge, this work is the first to demonstrate the practical benefits of SGX for privacy-preserving CBR.
• Jan 18 2017 cs.LG cs.IR arXiv:1701.04600v1
There has been considerable work on improving the popular clustering algorithm K-means in terms of both mean squared error (MSE) and speed. However, most K-means variants compute the distance from each data point to each cluster centroid in every iteration. We propose a fast heuristic to overcome this bottleneck with only a marginal increase in MSE. We observe that across all iterations of K-means, a data point changes its membership only among a small subset of clusters. Our heuristic predicts such clusters for each data point by looking at nearby clusters after the first iteration of K-means. We augment well-known variants of K-means with our heuristic to demonstrate its effectiveness. For various synthetic and real-world datasets, our heuristic achieves a speed-up of up to 3 times when compared to efficient variants of K-means.
• To maximize offloading gain of cache-enabled device-to-device (D2D) communications, content placement and delivery should be jointly designed. In this letter, we jointly optimize caching and scheduling policies to maximize successful offloading probability, defined as the probability that a user can obtain desired file in local cache or via D2D link with data rate larger than a given threshold. We obtain the optimal scheduling factor for a random scheduling policy that can control interference in a distributed manner, and a low complexity solution to compute caching distribution. We show that the offloading gain can be remarkably improved by the joint optimization.
• All existing real-world networks are evolving; hence, the study of traffic dynamics in these enlarged networks is a challenging task. The critical issue is to optimize the network structure to improve network capacity and avoid traffic congestion. We are interested in choosing users' routes that are least congested while keeping network capacity optimal. Network capacity may be improved either by optimizing the network topology or by enhancing the routing approach. In this context, we propose and design a model of time varying data communication networks (TVCN) based on the dynamics of in-flowing links. A newly appearing node prefers to attach to the most influential node present in the network. In this paper, influence is termed \textit{reputation} and is applied for computing overall congestion at any node. The user path with the least betweenness centrality and the most reputation is preferred for routing. Kelly's optimization formulation for a rate allocation problem is used for obtaining optimal rates of distinct users at different time instants, and it is found that the user's path with the lowest betweenness centrality and highest reputation will always give the maximum rate at the stable point.
• We present a novel solution for the Channel Assignment Problem (CAP) in Device-to-Device (D2D) wireless networks that takes into account the throughput estimation noise. CAP is known to be NP-hard in the literature and there is no practical optimal learning algorithm that takes into account the estimation noise. In this paper, we first formulate the CAP as a stochastic optimization problem to maximize the expected sum data rate. To capture the estimation noise, CAP is modeled as a noisy potential game, a novel notion we introduce in this paper. Then, we propose a distributed Binary Log-linear Learning Algorithm (BLLA) that converges to the optimal channel assignments. Convergence of BLLA is proved for bounded and unbounded noise. Proofs for fixed and decreasing temperature parameter of BLLA are provided. A sufficient number of estimation samples is given that guarantees the convergence to the optimal state. We assess the performance of BLLA by extensive simulations, which show that the sum data rate increases with the number of channels and users. Contrary to the better response algorithm, the proposed algorithm achieves the optimal channel assignments distributively even in the presence of estimation noise.
• Optimization is becoming a crucial element in industrial applications involving sustainable alternative energy systems. During the design of such systems, the engineer/decision maker would often encounter noise factors (e.g. solar insolation and ambient temperature fluctuations) when their system interacts with the environment. In this chapter, the sizing and design optimization of the solar powered irrigation system was considered. This problem is multivariate, noisy, nonlinear and multiobjective. This design problem was tackled by first using the Fuzzy Type II approach to model the noise factors. Consequently, the Bacterial Foraging Algorithm (BFA) (in the context of a weighted sum framework) was employed to solve this multiobjective fuzzy design problem. This method was then used to construct the approximate Pareto frontier as well as to identify the best solution option in a fuzzy setting. Comprehensive analyses and discussions were performed on the generated numerical results with respect to the implemented solution methods.
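
The K-means abstract above (arXiv:1701.04600) describes restricting each point's distance computations to a small set of nearby clusters chosen after the first iteration. A minimal sketch of that idea follows; it is not the authors' implementation. The function names, the choice of "the `n_candidates` nearest centroids" as the candidate set, and the stopping rule are all assumptions made for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def update_centroids(points, labels, centroids):
    """Recompute each centroid as the mean of its members (keep it if empty)."""
    dim = len(points[0])
    updated = []
    for j, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            n = len(members)
            updated.append(tuple(sum(p[d] for p in members) / n for d in range(dim)))
        else:
            updated.append(old)
    return updated


def kmeans_with_candidates(points, init_centroids, n_candidates, n_iters=20):
    """K-means where, after a full first pass, each point is only compared
    against its own fixed candidate list (hypothetical variant of the heuristic)."""
    centroids = [tuple(c) for c in init_centroids]
    k = len(centroids)

    # First iteration: full distance computation, remember the nearest clusters.
    candidates, labels = [], []
    for p in points:
        order = sorted(range(k), key=lambda j: sq_dist(p, centroids[j]))
        candidates.append(order[:n_candidates])
        labels.append(order[0])
    centroids = update_centroids(points, labels, centroids)

    # Later iterations: only n_candidates distances per point instead of k.
    for _ in range(n_iters - 1):
        new_labels = [
            min(cand, key=lambda j: sq_dist(p, centroids[j]))
            for p, cand in zip(points, candidates)
        ]
        if new_labels == labels:
            break
        labels = new_labels
        centroids = update_centroids(points, labels, centroids)
    return centroids, labels


points = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0), (20, 1)]
centroids, labels = kmeans_with_candidates(
    points, init_centroids=[(0, 0), (10, 10), (20, 0)], n_candidates=2
)
```

With `n_candidates` equal to the number of clusters the sketch reduces to ordinary Lloyd iterations; smaller values trade a possible increase in MSE for fewer distance evaluations, which is the trade-off the abstract reports.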
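
The definitions in the neighbour-sum-distinguishing colouring abstract above can be checked mechanically on small graphs. The sketch below computes the induced vertex colouring $\sigma_{\gamma}$ from an edge colouring and tests the two properties named there (proper vertex sums, equitable colour classes). The function names and the dictionary representation of a colouring are assumptions for illustration; the code only verifies a given colouring and does not compute the index itself.

```python
from collections import Counter


def vertex_sums(colouring):
    """sigma(v) = sum of colours of edges incident to v.
    `colouring` maps each edge (u, v) to a colour in {1, ..., k}."""
    sums = Counter()
    for (u, v), colour in colouring.items():
        sums[u] += colour
        sums[v] += colour
    return dict(sums)


def is_neighbour_sum_distinguishing(colouring):
    """True if adjacent vertices always receive different sums."""
    sums = vertex_sums(colouring)
    return all(sums[u] != sums[v] for (u, v) in colouring)


def is_equitable(colouring, k):
    """True if any two of the k colour classes differ in size by at most one."""
    sizes = Counter(colouring.values())
    counts = [sizes.get(c, 0) for c in range(1, k + 1)]
    return max(counts) - min(counts) <= 1


# Triangle K_3 with its three edges coloured 1, 2, 3.
triangle = {(0, 1): 1, (1, 2): 2, (0, 2): 3}
sums = vertex_sums(triangle)
ok = is_neighbour_sum_distinguishing(triangle) and is_equitable(triangle, 3)
```

Colouring every edge of the triangle with 1 gives all three vertices the sum 2, so that colouring fails the first check.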

Zoltán Zimborás Jan 12 2017 20:38 UTC

Here is a nice description, with additional links, about the importance of this work if it turns out to be flawless (thanks a lot to Martin Schwarz for this link): [dichotomy conjecture][1].

[1]: http://processalgebra.blogspot.com/2017/01/has-feder-vardi-dichotomy-conjecture.html

J. Smith Dec 14 2016 17:43 UTC

Very good Insight on android security problems and malware. Nice Work !

phaeladr Nov 14 2016 11:03 UTC

[magic mirrors][1] really?

[1]: http://buchderFarben.de

Māris Ozols Oct 21 2016 21:06 UTC

Very nice! Now we finally know how to fairly cut a cake in a finite number of steps! What is more, the number of steps is expected to go down from the whopping $n^{n^{n^{n^{n^n}}}}$ to just barely $n^{n^n}$. I can't wait to get my slice!

https://www.quantamagazine.org/20161006-new-algorithm-solve

sattath Oct 05 2016 12:13 UTC

Thank you for your kind words. Indeed, we worked hard to achieve the attributes you mentioned.

Frédéric Grosshans Oct 04 2016 15:05 UTC

I do not find this second abstract more informative, and it is definitely less entertaining to read. I really like the original abstract because, despite its tale format, it really works as an informative abstract.

Chris Ferrie Oct 04 2016 01:31 UTC

I approve of this comment.

Cedric Yen-Yu Lin Sep 29 2016 12:54 UTC

Sounds like a nice fable for young readers of [this book][1].

[1]: https://www.amazon.com/Quantum-Physics-Babies-Chris-Ferrie/dp/1492309532

sattath Sep 29 2016 11:15 UTC