# Multiagent Systems (cs.MA)

• The analysis in Part I revealed interesting properties for subgradient learning algorithms in the context of stochastic optimization when gradient noise is present. These algorithms are used when the risk functions are non-smooth and involve non-differentiable components. They have long been recognized as slowly converging methods. However, it was revealed in Part I that the rate of convergence becomes linear for stochastic optimization problems, with the error iterate converging at an exponential rate $\alpha^i$ to within an $O(\mu)$-neighborhood of the optimizer, for some $\alpha \in (0,1)$ and small step-size $\mu$. The conclusion was established under weaker assumptions than in the prior literature and, moreover, several important problems (such as LASSO, SVM, and Total Variation) were shown to satisfy these weaker assumptions automatically (but not the previously used conditions from the literature). These results revealed that subgradient learning methods have more favorable behavior than originally thought when used to enable continuous adaptation and learning. The results of Part I were exclusive to single-agent adaptation. The purpose of the current Part II is to examine the implications of these discoveries when a collection of networked agents employs subgradient learning as their cooperative mechanism. The analysis will show that, despite the coupled dynamics that arise in a networked scenario, the agents are still able to attain linear convergence in the stochastic case; they are also able to reach agreement within $O(\mu)$ of the optimizer.
• In large-scale natural disasters, humans are likely to fail when they attempt to reach high-risk sites or act in search and rescue operations. Robots, however, outdo their counterparts in surviving the hazards and handling the search and rescue missions due to their multiple and diverse sensing and actuation capabilities. The dynamic formation of optimal coalition of these heterogeneous robots for cost efficiency is very challenging and research in the area is gaining more and more attention. In this paper, we propose a novel heuristic. Since the population of robots in large-scale disaster settings is very large, we rely on Quantum Multi-Objective Particle Swarm Optimization (QMOPSO). The problem is modeled as a multi-objective optimization problem. Simulations with different test cases and metrics, and comparison with other algorithms such as NSGA-II and SPEA-II are carried out. The experimental results show that the proposed algorithm outperforms the existing algorithms not only in terms of convergence but also in terms of diversity and processing time.
• Recent progress in artificial intelligence has enabled the design and implementation of autonomous computing devices, agents, that may interact and learn from each other to achieve certain goals. Sometimes, however, a human operator needs to intervene and interrupt an agent in order to prevent certain dangerous situations. Yet, as part of their learning process, agents may link these interruptions that impact their reward to specific states, and deliberately avoid them. The situation is particularly challenging in a distributed context because agents might not only learn from their own past interruptions, but also from those of other agents. This paper defines the notion of safe interruptibility as a distributed computing problem, and studies this notion in the two main learning frameworks: joint action learners and independent learners. We give realistic sufficient conditions on the learning algorithm for safe interruptibility in the case of joint action learners, yet show that these conditions are not sufficient for independent learners. We show, however, that if agents can detect interruptions, it is possible to prune the observations to ensure safe interruptibility even for independent learners.
• We study the problem of cooperative inference where a group of agents interact over a network and seek to estimate a joint parameter that best explains a set of observations. Agents do not know the network topology or the observations of other agents. We explore a variational interpretation of the Bayesian posterior density, and its relation to the stochastic mirror descent algorithm, to propose a new distributed learning algorithm. We show that, under appropriate assumptions, the beliefs generated by the proposed algorithm concentrate around the true parameter exponentially fast. We provide explicit non-asymptotic bounds for the convergence rate. Moreover, we develop explicit and computationally efficient algorithms for observation models belonging to exponential families.
• In this paper, we argue that the future of Artificial Intelligence research resides in two keywords: integration and embodiment. We support this claim by analyzing the recent advances of the field. Regarding integration, we note that the most impactful recent contributions have been made possible through the integration of recent Machine Learning methods (based in particular on Deep Learning and Recurrent Neural Networks) with more traditional ones (e.g. Monte-Carlo tree search, goal babbling exploration or addressable memory systems). Regarding embodiment, we note that the traditional benchmark tasks (e.g. visual classification or board games) are becoming obsolete as state-of-the-art learning algorithms approach or even surpass human performance in most of them, which has recently encouraged the development of first-person 3D game platforms embedding realistic physics. Building upon this analysis, we first propose an embodied cognitive architecture integrating heterogeneous sub-fields of Artificial Intelligence into a unified framework. We demonstrate the utility of our approach by showing how major contributions of the field can be expressed within the proposed framework. We then claim that benchmarking environments need to reproduce ecologically-valid conditions for bootstrapping the acquisition of increasingly complex cognitive skills through the concept of a cognitive arms race between embodied agents.
• The usual epistemic S5 model for multi-agent systems is a Kripke graph, whose edges are labeled with the agents that do not distinguish between two states. We propose to uncover the higher dimensional information implicit in the Kripke graph, by using as a model its dual, a chromatic simplicial complex. For each state of the Kripke model there is a facet in the complex, with one vertex per agent. If an edge (u,v) is labeled with a set of agents S, the facets corresponding to u and v intersect in a simplex consisting of one vertex for each agent of S. Then we use dynamic epistemic logic to study how the simplicial complex epistemic model changes after the agents communicate with each other. We show that there are topological invariants preserved from the initial epistemic complex to the epistemic complex after an action model is applied, that depend on how reliable the communication is. In turn these topological properties determine the knowledge that the agents may gain after the communication happens.
• We present AutonoVi, a novel algorithm for autonomous vehicle navigation that supports dynamic maneuvers and satisfies traffic constraints and norms. Our approach is based on optimization-based maneuver planning that supports dynamic lane-changes, swerving, and braking in all traffic scenarios and guides the vehicle to its goal position. We take into account various traffic constraints, including collision avoidance with other vehicles, pedestrians, and cyclists using control velocity obstacles. We use a data-driven approach to model the vehicle dynamics for control and collision avoidance. Furthermore, our trajectory computation algorithm takes into account traffic rules and behaviors, such as stopping at intersections and stoplights, based on an arc-spline representation. We have evaluated our algorithm in a simulated environment and tested its interactive performance in urban and highway driving scenarios with tens of vehicles, pedestrians, and cyclists. These scenarios include jaywalking pedestrians, sudden stops from high speeds, safely passing cyclists, a vehicle suddenly swerving into the roadway, and high-density traffic where the vehicle must change lanes to progress more effectively.
• Black-Scholes (BS) is the standard mathematical model for option pricing in financial markets. Option prices are calculated using an analytical formula whose main inputs are strike (at which price to exercise) and volatility. The BS framework assumes that volatility remains constant across all strikes; in practice, however, it varies. How do traders come to learn these parameters? We introduce natural models of learning agents, in which they update their beliefs about the true implied volatility based on the opinions of other traders. We prove convergence of these opinion dynamics using techniques from control theory and leader-follower models, thus providing a resolution between theory and market practices. We allow for two different models, one with feedback and one with an unknown leader and no feedback. Both scalar and multidimensional cases are analyzed.
• This dissertation is motivated by the need, in today's globalist world, for a precise way to enable governments, organisations and other regulatory bodies to evaluate the constraints they place on themselves and others. An organisation's modus operandi is enacting and fulfilling contracts between itself and its participants. Yet, organisational contracts should respect external laws, such as those setting out data privacy rights and liberties. Contracts can only be enacted by following contract law processes, which often require bilateral agreement and consideration. Governments need to legislate whilst understanding today's context of a national and international governance hierarchy, where law makers shun isolationism and seek to influence one another. Governments should avoid punishment by respecting constraints from international treaties and human rights charters. Governments can only enact legislation by following their own, pre-existing, law making procedures. In other words, institutions, such as laws and contracts, are designed and enacted under constraints.
• In this work a mixed agent-based and discrete event simulation model is developed for a high frequency bus route in the Netherlands. With this model, different passenger growth scenarios can be easily evaluated. This simulation model helps policy makers to predict changes that have to be made to bus routes and planned travel times before problems occur. The model is validated using several performance indicators, showing that under some model assumptions, it can realistically simulate real-life situations. The simulation's workings are illustrated by two use cases.
• Human societies around the world interact with each other by developing and maintaining social norms, and it is critically important to understand how such norms emerge and change. In this work, we define an evolutionary game-theoretic model to study how norms change in a society, based on the idea that different strengths of norms in societies translate to different game-theoretic interaction structures and incentives. We use this model to study, both analytically and with extensive agent-based simulations, the evolutionary relationships of the need for coordination in a society (which is related to its norm strength) with two key aspects of norm change: cultural inertia (whether or how quickly the population responds when faced with conditions that make a norm change desirable), and exploration rate (the willingness of agents to try out new strategies). Our results show that a high need for coordination leads to both high cultural inertia and a low exploration rate, while a low need for coordination leads to low cultural inertia and a high exploration rate. This is the first work, to our knowledge, on understanding the evolutionary causal relationships among these factors.
• A social approach can be exploited for the Internet of Things (IoT) to manage a large number of connected objects. These objects operate as autonomous agents to request and provide information and services to users. Establishing trustworthy relationships among the objects greatly improves the effectiveness of node interaction in the social IoT and helps nodes overcome perceptions of uncertainty and risk. However, there are limitations in the existing trust models. In this paper, a comprehensive model of trust is proposed that is tailored to the social IoT. The model includes ingredients such as trustor, trustee, goal, trustworthiness evaluation, decision, action, result, and context. Building on this trust model, we clarify the concept of trust in the social IoT in five aspects: (1) mutuality of trustor and trustee, (2) inferential transfer of trust, (3) transitivity of trust, (4) trustworthiness update, and (5) trustworthiness affected by a dynamic environment. Using network connectivity drawn from real-world social networks, a series of simulations is conducted to evaluate the performance of the social IoT operated with the proposed trust model. An experimental IoT network is used to further validate the proposed trust model.
• The paper proposes a hierarchical, agent-based, DES supported, distributed architecture for networked organization control. Taking into account enterprise integration engineering frameworks and business process management techniques, the paper intends to apply control engineering approaches for solving some problems of coordinating networked organizations, such as performance evaluation and optimization of workflows.
• This thesis contributes to the formalisation of the notion of an agent within the class of finite multivariate Markov chains. Agents are seen as entities that act, perceive, and are goal-directed. We present a new measure that can be used to identify entities (called $\iota$-entities), some general requirements for entities in multivariate Markov chains, as well as formal definitions of actions and perceptions suitable for such entities. The intuition behind $\iota$-entities is that entities are spatiotemporal patterns for which every part makes every other part more probable. The measure, complete local integration (CLI), is formally investigated in general Bayesian networks. It is based on the specific local integration (SLI), which is measured with respect to a partition. CLI is the minimum value of SLI over all partitions. We prove that $\iota$-entities are blocks in specific partitions of the global trajectory. These partitions are the finest partitions that achieve a given SLI value. We also establish the transformation behaviour of SLI under permutations of nodes in the network. We go on to present three conditions on general definitions of entities. These are not fulfilled by sets of random variables, i.e., the perception-action loop, which is often used to model agents, is too restrictive. We propose that any general entity definition should in effect specify a subset (called an entity-set) of the set of all spatiotemporal patterns of a given multivariate Markov chain. The set of $\iota$-entities is such a set. Importantly, the perception-action loop also induces an entity-set. We then propose formal definitions of actions and perceptions for arbitrary entity-sets. These specialise to standard notions in the case of the perception-action loop entity-set. Finally, we look at some very simple examples.
• This paper contains an axiomatic study of consistent approval-based multi-winner rules, i.e., voting rules that select a fixed-size group of candidates based on approval ballots. We introduce the class of counting rules, provide an axiomatic characterization of this class and, in particular, show that counting rules are consistent. Building upon this result, we axiomatically characterize three important consistent multi-winner rules: Proportional Approval Voting, Multi-Winner Approval Voting and Approval Chamberlin-Courant. Our results demonstrate the variety of multi-winner rules and the different, orthogonal goals that multi-winner voting rules may pursue.
• If the influence diagram (ID) depicting a Bayesian game is common knowledge to its players, then additional assumptions may allow the players to make use of its embodied irrelevance statements. They can then use these to discover a simpler game which still embodies both their optimal decision policies. However, the impact of this result has been rather limited because many common Bayesian games do not exhibit sufficient symmetry to be fully and efficiently represented by an ID. The tree-based chain event graph (CEG) has been developed specifically for such asymmetric problems. By using these graphs, rational players can make analogous deductions, assuming the topology of the CEG is common knowledge. In this paper we describe these powerful new techniques and illustrate them through an example modelling a game played between a government department and the provider of a website designed to radicalise vulnerable people.
• In this paper, a distributed average tracking problem is studied for Lipschitz-type nonlinear dynamical systems. The objective is to design distributed average tracking algorithms for locally interactive agents to track the average of multiple reference signals. Here, in both the agents' and the reference signals' dynamics, there is a nonlinear term satisfying the Lipschitz-type condition. Three types of distributed average tracking algorithms are designed. First, based on state-dependent-gain design approaches, a robust distributed average tracking algorithm is developed to solve distributed average tracking problems without requiring the same initial condition. Second, by using a gain adaptation scheme, an adaptive distributed average tracking algorithm is proposed to remove the requirement that the Lipschitz constant be known to the agents. Third, to reduce chattering and make the algorithms easier to implement, a continuous distributed average tracking algorithm based on a time-varying boundary layer is further designed as a continuous approximation of the previous discontinuous distributed average tracking algorithms.
• We study the problem of planning tours for an Unmanned Aerial Vehicle (UAV) to visit a given set of sites in the least amount of time. This is the classic Traveling Salesperson Problem (TSP). UAVs have limited battery life and as a result may not be able to visit all the points on a single charge. We envision scenarios where the UAVs can be recharged along the way either by landing on stationary recharging stations or on Unmanned Ground Vehicles (UGVs) acting as mobile recharging stations. We present an algorithm to find the optimal tours, determining not only the order in which to visit the sites but also when and where to land on the UGV to recharge. Our algorithm plans tours for the UGVs and also determines the best locations to place stationary charging stations. While the problem we study is NP-hard, we present a practical solution using Generalized TSP that finds the optimal solution (albeit in possibly exponential worst-case running time). Our simulation results show that the running time is acceptable for reasonably sized instances in practice. We also show how to modify our algorithms to plan for package delivery with UAVs using UGVs as mobile warehouses.
• In search engines, online marketplaces and other human-computer interfaces, large collectives of individuals sequentially interact with numerous alternatives of varying quality. In these contexts, trial and error (exploration) is crucial for uncovering novel high-quality items or solutions, but entails a high cost for individual users. Self-interested decision makers are often better off imitating the choices of individuals who have already incurred the costs of exploration. Although imitation makes sense at the individual level, it deprives the group of additional information that could have been gleaned by individual explorers. In this paper we show that in such problems, preference diversity can function as a welfare-enhancing mechanism. It leads to a consistent increase in the quality of the consumed alternatives that outweighs the increased cost of search for the users.
• Norms have been extensively proposed as coordination mechanisms for both agent and human societies. Nevertheless, choosing the norms to regulate a society is by no means straightforward. The reasons are twofold. First, the norms to choose from may not be independent (i.e., they can be related to each other). Second, different preference criteria may be applied when choosing the norms to enact. This paper advances the state of the art by modeling a series of decision-making problems that regulation authorities confront when choosing the policies to establish. In order to do so, we first identify three different norm relationships (namely, generalisation, exclusivity, and substitutability) and we then consider norm representation power, cost, and associated moral values as alternative preference criteria. Thereafter, we show that the decision-making problems faced by policy makers can be encoded as linear programs, and hence solved with the aid of state-of-the-art solvers.
• An event-based state estimation approach for reducing communication in a networked control system is proposed. Multiple distributed sensor agents observe a dynamic process and sporadically transmit their measurements to estimator agents over a shared bus network. Local event-triggering protocols ensure that data is transmitted only when necessary to meet a desired estimation accuracy. The event-based design is shown to emulate the performance of a centralised state observer design up to guaranteed bounds, but with reduced communication. The stability results for state estimation are extended to the distributed control system that results when the local estimates are used for feedback control. Results from numerical simulations and hardware experiments illustrate the effectiveness of the proposed approach in reducing network communication.
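The event-triggering idea in the last abstract above can be illustrated with a minimal sketch. This is not the paper's method: it assumes a hypothetical scalar process, a single sensor with a noise-free measurement, and a simple threshold ("send-on-delta") trigger. The function name `simulate` and the parameters `a`, `delta`, and `noise` are illustrative choices only.

```python
import random


def simulate(steps=200, a=0.95, delta=0.5, noise=0.1, seed=0):
    """Run a toy event-triggered estimator; return (transmissions, max_error)."""
    rng = random.Random(seed)
    x = 1.0        # true state of the (hypothetical) scalar process
    x_hat = 0.0    # remote estimate; the sensor keeps an identical copy
    sent = 0
    max_err = 0.0
    for _ in range(steps):
        # The process evolves with a small random disturbance.
        x = a * x + rng.gauss(0.0, noise)
        # Sensor and estimator both propagate the same model-based prediction.
        x_hat = a * x_hat
        # Simplifying assumption: the sensor measures the state without noise.
        y = x
        # Event trigger: transmit only when the prediction is off by more than delta.
        if abs(y - x_hat) > delta:
            x_hat = y      # estimator resets to the received measurement
            sent += 1
        max_err = max(max_err, abs(x - x_hat))
    return sent, max_err


if __name__ == "__main__":
    sent, max_err = simulate()
    print(f"transmissions: {sent} of 200, worst-case error: {max_err:.3f}")
```

In this toy setting the estimation error after each step never exceeds `delta`, because any larger deviation triggers a transmission; raising `delta` trades accuracy for fewer messages.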