# Artificial Intelligence (cs.AI)

• Intelligent creatures can explore their environments and learn useful skills without supervision. In this paper, we propose DIAYN ("Diversity is All You Need"), a method for learning useful skills without a reward function. Our proposed method learns skills by maximizing an information theoretic objective using a maximum entropy policy. On a variety of simulated robotic tasks, we show that this simple objective results in the unsupervised emergence of diverse skills, such as walking and jumping. In a number of reinforcement learning benchmark environments, our method is able to learn a skill that solves the benchmark task despite never receiving the true task reward. In these environments, some of the learned skills correspond to solving the task, and each skill that solves the task does so in a distinct manner. Our results suggest that unsupervised discovery of skills can serve as an effective pretraining mechanism for overcoming challenges of exploration and data efficiency in reinforcement learning.
• The use of artificial intelligence in medicine can be traced back to 1968, when Paycha published his paper Le diagnostic a l'aide d'intelligences artificielle, presentation de la premiere machine diagnostri. A few years later, Shortliffe et al. presented an expert system named Mycin, which was able to identify bacteria causing severe blood infections and to recommend antibiotics. Although Mycin outperformed members of the Stanford medical school in the reliability of diagnosis, it was never used in practice because of a legal issue: who do you sue if it gives a wrong diagnosis? However, it was only in 2016, when the artificial intelligence software built into the IBM Watson AI platform correctly diagnosed and proposed an effective treatment for a 60-year-old woman's rare form of leukemia, that the use of AI in medicine became really popular. One of the first papers presenting the use of AI in paediatrics was published in 1984. The paper introduced a computer-assisted medical decision making system called SHELP.
• In this paper, we consider an online optimization process, where the objective functions are not convex (nor concave) but instead belong to a broad class of continuous submodular functions. We first propose a variant of the Frank-Wolfe algorithm that has access to the full gradient of the objective functions. We show that it achieves a regret bound of $O(\sqrt{T})$ (where $T$ is the horizon of the online optimization problem) against a $(1-1/e)$-approximation to the best feasible solution in hindsight. However, in many scenarios, only unbiased estimates of the gradients are available. For such settings, we then propose an online stochastic gradient ascent algorithm that also achieves a regret bound of $O(\sqrt{T})$, albeit against a weaker $1/2$-approximation to the best feasible solution in hindsight. We also generalize our results to $\gamma$-weakly submodular functions and prove the same sublinear regret bounds. Finally, we demonstrate the efficiency of our algorithms on a few problem instances, including non-convex/non-concave quadratic programs, multilinear extensions of submodular set functions, and D-optimal design.
• Although chatbots have been very popular in recent years, they still have some serious weaknesses which limit the scope of their applications. One major weakness is that they cannot learn new knowledge during the conversation process, i.e., their knowledge is fixed beforehand and cannot be expanded or updated during conversation. In this paper, we propose to build a general knowledge learning engine for chatbots to enable them to continuously and interactively learn new knowledge during conversations. As time goes by, they become more and more knowledgeable and better and better at learning and conversation. We model the task as an open-world knowledge base completion problem and propose a novel technique called lifelong interactive learning and inference (LiLi) to solve it. LiLi works by imitating how humans acquire knowledge and perform inference during an interactive conversation. Our experimental results show LiLi is highly promising.
• Objective: This work aims at providing a new method for the automatic detection of atrial fibrillation, other arrhythmia and noise on short single lead ECG signals, emphasizing the importance of the interpretability of the classification results. Approach: A morphological and rhythm description of the cardiac behavior is obtained by a knowledge-based interpretation of the signal using the Construe abductive framework. Then, a set of meaningful features is extracted for each individual heartbeat and as a summary of the full record. The feature distributions were used to elucidate the expert criteria underlying the labeling of the 2017 Physionet/CinC Challenge dataset, enabling a manual partial relabeling to improve the consistency of the classification rules. Finally, state-of-the-art machine learning methods are combined to provide an answer on the basis of the feature values. Main results: The proposal tied for first place in the official stage of the Challenge, with a combined $F_1$ score of 0.83, and was even improved in the follow-up stage to 0.85 with a significant simplification of the model. Significance: This approach demonstrates the potential of Construe to provide robust and valuable descriptions of temporal data even with significant amounts of noise and artifacts. Also, we discuss the importance of consistent classification criteria in manually labeled training datasets, and the fundamental advantages of knowledge-based approaches to formalize and validate those criteria.
• Recent developments in the field of robot grasping have shown great improvements in the grasp success rates when dealing with unknown objects. In this work we improve on one of the most promising approaches, the Grasp Quality Convolutional Neural Network (GQ-CNN) trained on the DexNet 2.0 dataset. We propose a new architecture for the GQ-CNN and describe practical improvements that increase the model validation accuracy from 92.2% to 95.8% and from 85.9% to 88.0% on image-wise and object-wise training and validation splits, respectively.
• This paper describes the N-Tuple Bandit Evolutionary Algorithm (NTBEA), an optimisation algorithm developed for noisy and expensive discrete (combinatorial) optimisation problems. The algorithm is applied to two game-based hyper-parameter optimisation problems. The N-Tuple system directly models the statistics, approximating the fitness and number of evaluations of each modelled combination of parameters. The model is simple, efficient and informative. Results show that the NTBEA significantly outperforms grid search and an estimation of distribution algorithm.
• Recently, the interest in reinforcement learning in game playing has been renewed. This is evidenced by the groundbreaking results achieved by AlphaGo. General Game Playing (GGP) provides a good testbed for reinforcement learning, currently one of the hottest fields of AI. In GGP, a specification of game rules is given. The description specifies a reinforcement learning problem, leaving programs to find strategies for playing well. Q-learning is one of the canonical reinforcement learning methods, and it has been used as a baseline in some previous work (Banerjee & Stone, IJCAI 2007). We implement Q-learning in GGP for three small board games (Tic-Tac-Toe, Connect-Four, Hex). We find that Q-learning converges, and thus that this general reinforcement learning method is indeed applicable to General Game Playing. However, convergence is slow in comparison to MCTS (a reinforcement learning method reported to achieve good results). We enhance Q-learning with Monte Carlo Search. This enhancement improves the performance of pure Q-learning, although it does not yet outperform MCTS. Future work is needed on the relation between MCTS and Q-learning, and on larger problem instances.
• We present a technique for estimating the similarity between objects such as movies or foods whose proper representation depends on human perception. Our technique combines a modest number of human similarity assessments to infer a pairwise similarity function between the objects. This similarity function captures some human notion of similarity which may be difficult or impossible to automatically extract, such as which movie from a collection would be a better substitute when the desired one is unavailable. In contrast to prior techniques, our method does not assume that all similarity questions on the collection can be answered or that all users perceive similarity in the same way. When combined with a user model, we find how each assessor's tastes vary, affecting their perception of similarity.
• Estimating causal models from observational data is a crucial task in data analysis. For continuous-valued data, Shimizu et al. have proposed a linear acyclic non-Gaussian model to understand the data generating process, and have shown that their model is identifiable when the amount of data is sufficiently large. However, situations in which continuous and discrete variables coexist in the same problem are common in practice. Most existing causal discovery methods either ignore the discrete data and apply a continuous-valued algorithm or discretize all the continuous data and then apply a discrete Bayesian network approach. These methods may lose important information by ignoring discrete data, or may introduce approximation error through discretization. In this paper, we define a novel hybrid causal model which consists of both continuous and discrete variables. The model assumes: (1) the value of a continuous variable is a linear function of its parent variables plus a non-Gaussian noise, and (2) each discrete variable is a logistic variable whose distribution parameters depend on the values of its parent variables. In addition, we derive the BIC scoring function for model selection. The new discovery algorithm can learn causal structures from mixed continuous and discrete data without discretization. We empirically demonstrate the power of our method through thorough simulations.
• This work exploits translation data as a source of semantically relevant learning signal for models of word representation. In particular, we exploit equivalence through translation as a form of distributed context and jointly learn how to embed and align with a deep generative model. Our EmbedAlign model embeds words in their complete observed context and learns by marginalisation of latent lexical alignments. In addition, it embeds words as posterior probability densities, rather than point estimates, which allows us to compare words in context using a measure of overlap between distributions (e.g. KL divergence). We investigate our model's performance on a range of lexical semantics tasks, achieving competitive results on several standard benchmarks including natural language inference, paraphrasing, and text similarity.
• Feb 19 2018 cs.AI arXiv:1802.05875v1
We investigate and generalize to an extended framework the notion of 'true on components' introduced by Zhou, Wang and Sun in their paper "Automated Reducible Geometric Theorem Proving and Discovery by Gröbner Basis Method", J. Automat. Reasoning 59 (3), 331-344, 2017. A new, simple criterion is presented for a statement to be simultaneously not generally true and not generally false (i.e. true on components), and its performance is exemplified through the implementation of this test in the dynamic geometry program GeoGebra.
• In this paper, we unify causal and non-causal feature selection methods based on the Bayesian network framework. We first show that the objectives of causal and non-causal feature selection methods are equal: both aim to find the Markov blanket of a class attribute, the theoretically optimal feature set for classification. We demonstrate that causal and non-causal feature selection make different assumptions about the dependency among features to find the Markov blanket, and that their algorithms represent different levels of approximation for finding the Markov blanket. In this framework, we are able to analyze the sample and error bounds of causal and non-causal methods. We conducted extensive experiments to show the correctness of our theoretical analysis.
• Integrated task and motion planning has emerged as a challenging problem in sequential decision making, where a robot needs to compute high-level strategy and low-level motion plans for solving complex tasks. While high-level strategies require decision making over longer time-horizons and scales, their feasibility depends on low-level constraints based upon the geometries and continuous dynamics of the environment. The hybrid nature of this problem makes it difficult to scale; most existing approaches focus on deterministic, fully observable scenarios. We present a new approach where the high-level decision problem occurs in a stochastic setting and can be modeled as a Markov decision process. In contrast to prior efforts, we show that complete MDP policies, or contingent behaviors, can be computed effectively in an anytime fashion. Our algorithm continuously improves the quality of the solution and is guaranteed to be probabilistically complete. We evaluate the performance of our approach on a challenging, realistic test problem: autonomous aircraft inspection. Our results show that we can effectively compute consistent task and motion policies for the most likely execution-time outcomes using only a fraction of the computation required to develop the complete task and motion policy.
• Given a target name, which can be a product aspect or entity, identifying its aspect words and opinion words in a given corpus is a fine-grained task in target-based sentiment analysis (TSA). This task is challenging, especially when we have no labeled data and we want to perform it for any given domain. To address it, we propose a general two-stage approach. Stage one extracts/groups the target-related words (called t-words) for a given target. This is relatively easy as we can apply an existing semantics-based learning technique. Stage two separates the aspect and opinion words from the grouped t-words, which is challenging because we often do not have enough word-level aspect and opinion labels. In this work, we formulate this problem in a PU learning setting and incorporate the idea of lifelong learning to solve it. Experimental results show the effectiveness of our approach.
• In recent years, Convolutional Neural Networks (CNNs) have shown remarkable performance in many computer vision tasks such as object recognition and detection. However, complex training issues, such as "catastrophic forgetting" and hyper-parameter tuning, make incremental learning in CNNs a difficult challenge. In this paper, we propose a hierarchical deep neural network, with CNNs at multiple levels, and a corresponding training method for lifelong learning. The network grows in a tree-like manner to accommodate the new classes of data without losing the ability to identify the previously trained classes. The proposed network was tested on the CIFAR-10 and CIFAR-100 datasets, and compared against the method of fine tuning specific layers of a conventional CNN. We obtained comparable accuracies and achieved 40% and 20% reductions in training effort on CIFAR-10 and CIFAR-100, respectively. The network was able to organize the incoming classes of data into feature-driven super-classes. Our model improves upon existing hierarchical CNN models by adding the capability of self-growth and also yields important observations on feature selective classification.
• Feb 19 2018 cs.AI stat.ML arXiv:1802.05786v1
In the modern era, abundant information is easily accessible from various sources; however, only a few of these sources are reliable, as most contain unverified content. We develop a system to validate the truthfulness of a given statement together with underlying evidence. The proposed system provides supporting evidence when the statement is tagged as false. Our work relies on an inference method on a knowledge graph (KG) to identify the truthfulness of statements. In order to extract the evidence of falseness, the proposed algorithm takes into account combined knowledge from the KG and ontologies. The system shows very good results as it provides valid and concise evidence. The quality of the KG plays a role in the performance of the inference method, which explicitly affects the performance of our evidence-extracting algorithm.
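
For readers unfamiliar with the Q-learning baseline mentioned in the General Game Playing abstract above, the following is a minimal sketch of tabular Q-learning with epsilon-greedy exploration. It is not the authors' implementation and does not use GGP game descriptions: the five-state chain environment, the reward of 1 at the last state, the function names, and all hyper-parameter values are hypothetical choices made only to keep the example self-contained.

```python
import random

N_STATES = 5      # hypothetical toy chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Deterministic toy dynamics: reward 1 only on reaching the last state."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy action selection with random tie-breaking."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[state])
    return rng.choice([a for a in ACTIONS if q[state][a] == best])


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2,
               seed=0, max_steps=100):
    """Run tabular Q-learning and return the learned Q-table."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = choose_action(q, state, epsilon, rng)
            nxt, reward, done = step(state, action)
            # Standard Q-learning target: bootstrap from the best next action,
            # except at terminal states where only the reward counts.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q


def greedy_policy(q):
    """Greedy action for every non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Calling `q = q_learning()` followed by `greedy_policy(q)` yields the learned action for each non-terminal state. On a board game the table would be indexed by game positions rather than chain positions, which is one reason convergence can be slow as the state space grows.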
