# Multiagent Systems (cs.MA)

• We propose a novel computational method to extract information about interactions among individuals with different behavioral states in a biological collective from ordinary video recordings. Assuming that individuals are acting as finite state machines, our method first detects discrete behavioral states of those individuals and then constructs a model of their state transitions, taking into account the positions and states of other individuals in the vicinity. We have tested the proposed method through applications to two real-world biological collectives, termites in an experimental setting and human pedestrians in an open space. For each application, a robust tracking system was developed in-house, utilizing interactive human intervention (for termite tracking) or online agent-based simulation (for pedestrian tracking). In both cases, significant interactions were detected between nearby individuals with different states, demonstrating the effectiveness of the proposed method.
• Humanity faces numerous problems of common-pool resource appropriation. This class of multi-agent social dilemma includes the problems of ensuring sustainable use of fresh water, common fisheries, grazing pastures, and irrigation systems. Abstract models of common-pool resource appropriation based on non-cooperative game theory predict that self-interested agents will generally fail to find socially positive equilibria, a phenomenon called the tragedy of the commons. However, in reality, human societies are sometimes able to discover and implement stable cooperative solutions. Decades of behavioral game theory research have sought to uncover aspects of human behavior that make this possible. Most of that work was based on laboratory experiments where participants only make a single choice: how much to appropriate. Recognizing the importance of spatial and temporal resource dynamics, a recent trend has been toward experiments in more complex real-time video game-like environments. However, standard methods of non-cooperative game theory can no longer be used to generate predictions for this case. Here we show that deep reinforcement learning can be used instead. To that end, we study the emergent behavior of groups of independently learning agents in a partially observed Markov game modeling common-pool resource appropriation. Our experiments highlight the importance of trial-and-error learning in common-pool resource appropriation and shed light on the relationship between exclusion, sustainability, and inequality.
• Sensing in complex systems requires large-scale information exchange and on-the-go communications over heterogeneous networks and integrated processing platforms. Many networked cyber-physical systems exhibit hierarchical infrastructures of information flows, which naturally leads to a multi-level tree-like information structure in which each level corresponds to a particular scale of representation. This work focuses on the multiscale fusion of data collected at multiple levels of the system. We propose a multiscale state-space model to represent multi-resolution data over the hierarchical information system and formulate a multi-stage dynamic zero-sum game to design a multiscale $H_{\infty}$ robust filter. We present numerical experiments for one- and two-dimensional signals and provide a comparative analysis of the minimax filter with the standard Kalman filter to show the improvement in signal-to-noise ratio (SNR).
• This paper presents a data-driven approach for multi-robot coordination in partially observable domains based on Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) and macro-actions (MAs). Dec-POMDPs provide a general framework for cooperative sequential decision making under uncertainty and MAs allow temporally extended and asynchronous action execution. To date, most methods assume the underlying Dec-POMDP model is known a priori or a full simulator is available during planning time. Previous methods that aim to address these issues suffer from local optimality and sensitivity to initial conditions. Additionally, few hardware demonstrations involving a large team of heterogeneous robots and with long planning horizons exist. This work addresses these gaps by proposing an iterative sampling-based Expectation-Maximization algorithm (iSEM) to learn policies using only trajectory data containing observations, MAs, and rewards. Our experiments show the algorithm is able to achieve better solution quality than the state-of-the-art learning-based methods. We implement two variants of multi-robot Search and Rescue (SAR) domains (with and without obstacles) on hardware to demonstrate that the learned policies can effectively control a team of distributed robots to cooperate in a partially observable stochastic environment.
• In this article I study the evolutionary adaptivity of two simple population models, based on either an altruistic or an egoistic law of energy exchange. The computational experiments show a convincing advantage for the altruists, which leads to a brief discussion of genetic algorithms and extraterrestrial life.
• The paper considers the problem of planning a set of conflict-free trajectories for a coalition of intelligent agents (mobile robots). Two divergent approaches, centralized and decentralized, are surveyed and analyzed. A decentralized planner, MAPP, is described and applied to the task of finding trajectories for dozens of UAVs performing nap-of-the-earth flight in urban environments. The experimental results suggest that MAPP is a highly efficient planner for tasks of this type.
• We propose a minority route choice game to investigate the effect of the network structure on traffic network performance under the assumption of drivers' bounded rationality. We investigate ring-and-hub topologies to capture the nature of traffic networks in cities, and employ a minority game-based inductive learning process to model the characteristic behavior under the route choice scenario. Through numerical experiments, we find that topological changes in traffic networks induce a phase transition from an uncongested phase to a congested phase. Understanding this phase transition is helpful in planning new traffic networks.
• This work studies the problem of inferring whether an agent is directly influenced by another agent over an adaptive diffusion network. Agent i influences agent j if they are connected (according to the network topology), and if agent j uses the data from agent i to update its online statistic. The solution of this inference task is challenging for two main reasons. First, only the output of the diffusion learning algorithm is available to the external observer that must perform the inference based on these indirect measurements. Second, only output measurements from a fraction of the network agents are available, with the total number of agents itself also being unknown. The main focus of this article is ascertaining under these demanding conditions whether consistent tomography is possible, namely, whether it is possible to reconstruct the interaction profile of the observable portion of the network, with negligible error as the network size increases. We establish a critical achievability result, namely, that for symmetric combination policies and for any given fraction of observable agents, the interacting and non-interacting agent pairs split into two separate clusters as the network size increases. This remarkable property then enables the application of clustering algorithms to identify the interacting agents influencing the observations. We provide a set of numerical experiments that verify the results for finite network sizes and time horizons. The numerical experiments show that the results hold for asymmetric combination policies as well, which is particularly relevant in the context of causation.
• Jul 21 2017 math.OC cs.MA arXiv:1707.06465v1
This work studies the convergence properties of continuous-time fictitious play in potential games. It is shown that in almost every potential game and for almost every initial condition, fictitious play converges to a pure-strategy Nash equilibrium. We focus our study on the class of regular potential games; i.e., the set of potential games in which all Nash equilibria are regular. As byproducts of the proof of our main result we show that (i) a regular mixed-strategy equilibrium of a potential game can only be reached by a fictitious play process from a set of initial conditions with Lebesgue measure zero, and (ii) in regular potential games, solutions of fictitious play are unique for almost all initial conditions.
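As a rough illustration of the behavior described here, the sketch below runs fictitious play on a small identical-interest game, which is a potential game whose potential is the common payoff. It makes several assumptions that are not in the abstract: it uses the discrete-time variant rather than the continuous-time process the paper analyzes, and the payoff matrix, the initial counts, and the function names `best_response` and `fictitious_play` are purely illustrative.

```python
def best_response(expected_payoffs):
    """Index of the action with the highest expected payoff (ties go to the lowest index)."""
    return max(range(len(expected_payoffs)), key=lambda i: expected_payoffs[i])


def fictitious_play(payoff, counts1, counts2, steps=200):
    """Discrete-time fictitious play in a two-player identical-interest game.

    payoff[a][b] is the common payoff when player 1 plays a and player 2 plays b.
    counts1 / counts2 are the initial (fictitious) action counts of each player.
    Returns the empirical action frequencies of both players.
    """
    counts1, counts2 = list(counts1), list(counts2)
    n1, n2 = len(payoff), len(payoff[0])
    for _ in range(steps):
        total1, total2 = sum(counts1), sum(counts2)
        # Each player treats the opponent's empirical frequencies as a mixed strategy.
        belief_on_1 = [c / total1 for c in counts1]
        belief_on_2 = [c / total2 for c in counts2]
        u1 = [sum(payoff[a][b] * belief_on_2[b] for b in range(n2)) for a in range(n1)]
        u2 = [sum(payoff[a][b] * belief_on_1[a] for a in range(n1)) for b in range(n2)]
        # Both players best-respond simultaneously and the counts are updated.
        counts1[best_response(u1)] += 1
        counts2[best_response(u2)] += 1
    return ([c / sum(counts1) for c in counts1],
            [c / sum(counts2) for c in counts2])


# Hypothetical coordination game: both players get 2 at (0, 0) and 1 at (1, 1).
COORDINATION = [[2, 0], [0, 1]]

freq1_a, freq2_a = fictitious_play(COORDINATION, [1, 1], [1, 1])
freq1_b, freq2_b = fictitious_play(COORDINATION, [1, 5], [1, 5])
print(freq1_a, freq2_a)  # empirical play concentrates on action 0
print(freq1_b, freq2_b)  # empirical play concentrates on action 1
```

In this toy game, both starting points lead the empirical frequencies toward one of the two pure-strategy equilibria, and which one is reached depends on the initial counts.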
• Jul 21 2017 math.OC cs.MA arXiv:1707.06466v1
A fundamental problem with the Nash equilibrium concept is the existence of certain "structurally deficient" equilibria that (i) lack fundamental robustness properties, and (ii) are difficult to analyze. The notion of a "regular" Nash equilibrium was introduced by Harsanyi. Such equilibria are highly robust and relatively simple to analyze. A game is said to be regular if all equilibria in the game are regular. In this paper it is shown that almost all potential games are regular. That is, except for a closed subset of potential games with Lebesgue measure zero, all potential games are regular. As an immediate consequence of this, the paper also proves an oddness result for potential games: In almost all potential games, the number of Nash equilibrium strategies is finite and odd.
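A small worked example may make the oddness statement concrete. The game below is an illustrative assumption, not taken from the paper: a two-player, two-action coordination game with common payoff matrix $u$, which is a potential game because the common payoff itself serves as a potential. Writing $q$ for the probability that the opponent plays the first action, the indifference condition for a mixed equilibrium is

$$
u = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}, \qquad 2q = 1 \cdot (1 - q) \;\Longrightarrow\; q = \tfrac{1}{3}.
$$

By symmetry the same condition holds for the other player, so the game has exactly three Nash equilibria: the two pure equilibria on the diagonal and one mixed equilibrium in which each player chooses the first action with probability $1/3$. The count is finite and odd, in line with the result stated above.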
• A novel distributed energy allocation mechanism for a Distribution System Operator (DSO) market through a bi-level iterative auction is proposed. With the locational marginal price at the substation node known, the DSO runs an upper-level auction with aggregators as intermediate agents competing for energy. This DSO-level auction takes into account physical grid constraints such as line flows, transformer capacities and node voltage limits. This auction mechanism is a straightforward implementation of projected gradient descent on the social welfare (SW) of all home-level agents. Aggregators, which serve home-level agents (both buyers and sellers), implement lower-level auctions in parallel, through proportional allocation and without asking for utility functions and generation capacities that are considered private information. The overall bi-level auction is shown to be efficient and weakly budget balanced.
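The abstract does not spell out the allocation rule, so the following is only a minimal sketch of one common form of proportional allocation, assumed here for illustration: each buyer submits a monetary bid and receives a share of the available energy proportional to that bid, without revealing a utility function. The function name `proportional_allocation`, the agent names, and the numbers are hypothetical and not taken from the paper.

```python
def proportional_allocation(bids, supply):
    """Split `supply` among agents in proportion to their bids.

    bids   -- dict mapping agent name to a non-negative monetary bid
    supply -- total amount of energy available to the aggregator
    Returns (allocation dict, implied uniform price per unit of energy).
    """
    total_bid = sum(bids.values())
    if total_bid <= 0 or supply <= 0:
        # Nothing is bid or nothing is available: allocate nothing, no price.
        return {agent: 0.0 for agent in bids}, 0.0
    allocation = {agent: supply * bid / total_bid for agent, bid in bids.items()}
    price = total_bid / supply  # every agent pays the same per-unit price
    return allocation, price


# Hypothetical lower-level auction with two home-level buyers.
allocation, price = proportional_allocation({"home_a": 3.0, "home_b": 1.0}, supply=8.0)
print(allocation, price)
```

Under this assumed rule, an agent only communicates a scalar bid, which is consistent with the abstract's point that utility functions and generation capacities stay private.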