Successful analysis of player skills in video games has important impacts on the process of enhancing player experience without undermining players' continuous skill development. Moreover, player skill analysis becomes more intriguing in team-based video games because such a form of study can help discover useful factors in effective team formation. In this paper, we consider the problem of skill decomposition in MOBA (Multiplayer Online Battle Arena) games, with the goal of understanding which player skill factors are essential to the outcome of a match. To understand the construct of MOBA player skills, we use various skill-based predictive models to decompose player skills into interpretable parts, the impact of which is assessed in statistical terms. We apply this analysis approach to two widely known MOBAs, namely League of Legends (LoL) and Defense of the Ancients 2 (DOTA2). We find that the base skills of in-game avatars, the base skills of players, and players' champion-specific skills are three prominent skill components influencing LoL's match outcomes, while DOTA2's match outcomes are mainly affected by in-game avatars' base skills and not much by the other two.
This paper addresses tracking of a moving target in a multi-agent network. The target follows linear dynamics corrupted by an adversarial noise, i.e., the noise is not generated from a statistical distribution. The location of the target at each time induces a global time-varying loss function, and the global loss is a sum of local losses, each of which is associated with one agent. Agents' noisy observations could be nonlinear. We formulate this problem as a distributed online optimization where agents communicate with each other to track the minimizer of the global loss. We then propose a decentralized version of the Mirror Descent algorithm and provide a non-asymptotic analysis of the problem. Using the notion of dynamic regret, we measure the performance of our algorithm versus its offline counterpart in the centralized setting. We prove that the bound on dynamic regret scales inversely in the network spectral gap, and that it reflects the deviation from the linear dynamics caused by the adversarial noise. Our result subsumes a number of results in the distributed optimization literature. Finally, in a numerical experiment, we verify that our algorithm can be simply implemented for multi-agent tracking with nonlinear observations.
In multi-robot systems where a central decision maker specifies the movement of each individual robot, a communication failure can severely impair the performance of the system. This paper develops a motion strategy that allows robots to safely handle critical communication failures for such multi-robot architectures. For each robot, the proposed algorithm computes a time horizon over which collisions with other robots are guaranteed not to occur. These safe time horizons are included in the commands being transmitted to the individual robots. In the event of a communication failure, the robots execute the last received velocity commands for the corresponding safe time horizons, leading to a provably safe open-loop motion strategy. The resulting algorithm is computationally effective and is agnostic to the task that the robots are performing. The efficacy of the strategy is verified in simulation as well as on a team of differential-drive mobile robots.
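The idea of a safe time horizon can be illustrated with a minimal sketch. It is not the algorithm from the abstract: it assumes point robots in the plane that keep a constant velocity (rather than differential-drive kinematics), and the function name `safe_time_horizon` and the `safety_distance` parameter are hypothetical. It returns the time until two such robots first come within the safety distance of each other.

```python
import math

def safe_time_horizon(p_i, v_i, p_j, v_j, safety_distance):
    """Time until two constant-velocity point robots first get within safety_distance.

    Returns 0.0 if they are already that close and math.inf if they never are.
    Illustrative assumption: straight-line motion at the last commanded velocity.
    """
    # Relative position and velocity of robot j as seen from robot i.
    px, py = p_j[0] - p_i[0], p_j[1] - p_i[1]
    vx, vy = v_j[0] - v_i[0], v_j[1] - v_i[1]

    # Solve |p + v t|^2 = d^2, i.e. a t^2 + b t + c = 0.
    c = px * px + py * py - safety_distance ** 2
    if c <= 0.0:
        return 0.0  # already inside the safety disc
    a = vx * vx + vy * vy
    if a == 0.0:
        return math.inf  # no relative motion, distance never changes
    b = 2.0 * (px * vx + py * vy)
    disc = b * b - 4.0 * a * c
    if disc < 0.0 or b >= 0.0:
        return math.inf  # robots are moving apart or pass at a safe distance
    return (-b - math.sqrt(disc)) / (2.0 * a)

# Two robots driving head-on, 10 m apart, closing at 2 m/s, 2 m safety distance.
horizon = safe_time_horizon((0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (-1.0, 0.0), 2.0)
```

In a centralized architecture of the kind described, a per-robot horizon could be taken as the minimum of such pairwise values, though the abstract does not say how the guarantee is actually derived.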
We consider a swarm of $n$ autonomous mobile robots, distributed on a 2-dimensional grid. A basic task for such a swarm is the gathering process: all robots have to gather at one (not predefined) place. The work in this paper is motivated by the following insight: On the one hand, for swarms of robots distributed in the 2-dimensional Euclidean space, several gathering algorithms are known for extremely simple robots that are oblivious, have bounded viewing radius, no compass, and no "flags" to communicate a status to others. On the other hand, in the case of the 2-dimensional grid, the only known gathering algorithms for robots with bounded viewing radius and without a compass need to memorize a constant number of rounds and need flags. In this paper we contribute what is, to the best of our knowledge, the first gathering algorithm on the grid that works for anonymous, oblivious robots with bounded viewing range, without any further means of communication and without any memory. We prove its correctness and an $O(n^2)$ time bound. This time bound matches those of the best known algorithms for the Euclidean plane mentioned above.
Matrix games like Prisoner's Dilemma have guided research on social dilemmas for decades. However, they necessarily treat the choice to cooperate or defect as an atomic action. In real-world social dilemmas these choices are temporally extended. Cooperativeness is a property that applies to policies, not elementary actions. We introduce sequential social dilemmas that share the mixed incentive structure of matrix game social dilemmas but also require agents to learn policies that implement their strategic intentions. We analyze the dynamics of policies learned by multiple self-interested independent learning agents, each using its own deep Q-network, on two Markov games we introduce here: (1) a fruit Gathering game and (2) a Wolfpack hunting game. We characterize how learned behavior in each domain changes as a function of environmental factors including resource abundance. Our experiments show how conflict can emerge from competition over shared resources and shed light on how the sequential nature of real-world social dilemmas affects cooperation.
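As a small illustration of the "mixed incentive structure" of a matrix game social dilemma, the sketch below checks the commonly used inequalities on the four payoffs of a symmetric two-player game: R (reward for mutual cooperation), P (punishment for mutual defection), S (sucker's payoff) and T (temptation to defect). The function name is hypothetical, and the abstract itself does not spell out these conditions, so treat this as the standard textbook formulation rather than the paper's exact definition.

```python
def is_social_dilemma(R, P, S, T):
    """Check the usual matrix-game social dilemma inequalities.

    Mutual cooperation must beat mutual defection, being exploited, and
    alternating exploitation, while greed (T > R) or fear (P > S) still
    gives an incentive to defect.
    """
    cooperation_is_attractive = R > P and R > S and 2 * R > T + S
    greed = T > R  # exploiting a cooperator pays more than cooperating with them
    fear = P > S   # mutual defection is better than being exploited
    return cooperation_is_attractive and (greed or fear)

# Prisoner's Dilemma payoffs: both greed and fear are present.
prisoners_dilemma = is_social_dilemma(R=3, P=1, S=0, T=5)
# A game with neither greed nor fear is not a dilemma.
no_dilemma = is_social_dilemma(R=4, P=1, S=2, T=3)
```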
Interactions between vehicles and pedestrians have always been a major problem in traffic safety. Experienced human drivers are generally able to analyze the environment and choose driving strategies that will help them avoid crashes. What is not yet clear, however, is how automated vehicles will interact with pedestrians. This paper proposes a new method for evaluating the safety and feasibility of the driving strategy of automated vehicles when encountering unsignalized crossings. MobilEye sensors installed on buses in Ann Arbor, Michigan, collected data on 2,973 valid crossing events. A stochastic interaction model was then created using a multivariate Gaussian mixture model. This model allowed us to simulate the movements of pedestrians reacting to an oncoming vehicle when approaching unsignalized crossings, and to evaluate the passing strategies of automated vehicles. A simulation was then conducted to demonstrate the evaluation procedure.
Social conventions govern countless behaviors all of us engage in every day, from how we greet each other to the languages we speak. But how can shared conventions emerge spontaneously in the absence of a central coordinating authority? The Naming Game model shows that networks of locally interacting individuals can spontaneously self-organize to produce global coordination. Here, we provide a gentle introduction to the main features of the model, from the dynamics observed in homogeneously mixing populations to the role played by more complex social networks, and to how slight modifications of the basic interaction rules give rise to a richer phenomenology in which more conventions can co-exist indefinitely.
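A minimal sketch of the basic Naming Game in a homogeneously mixing population may help make the dynamics concrete. The rules used here are the commonly described minimal version (a speaker utters a word from its inventory, inventing one if it has none; on success both agents keep only that word, on failure the hearer adds it). The function name, parameters and defaults are illustrative assumptions.

```python
import random

def naming_game(num_agents=10, max_steps=100_000, seed=0):
    """Run the minimal Naming Game on a fully mixed population.

    Returns (steps, word) once every agent holds exactly the same single word,
    or (None, None) if max_steps is reached first.
    """
    rng = random.Random(seed)
    inventories = [set() for _ in range(num_agents)]
    next_word = 0  # integer labels stand in for freshly invented words
    for step in range(1, max_steps + 1):
        speaker, hearer = rng.sample(range(num_agents), 2)
        if not inventories[speaker]:
            inventories[speaker].add(next_word)
            next_word += 1
        word = rng.choice(sorted(inventories[speaker]))
        if word in inventories[hearer]:
            # Success: both agents collapse their inventories to the agreed word.
            inventories[speaker] = {word}
            inventories[hearer] = {word}
        else:
            # Failure: the hearer learns the new word.
            inventories[hearer].add(word)
        if all(inventory == {word} for inventory in inventories):
            return step, word
    return None, None

steps, convention = naming_game()
```

No central authority appears anywhere in the loop: agreement, when it is reached, comes only from repeated pairwise interactions.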
Opponent modeling consists of modeling the strategy or preferences of an agent from the data it provides. In the context of automated negotiation and with machine learning, it can result in an advantage so overwhelming that it may deter some casual agents from taking part in the bargaining process. We qualify as "curious" an agent driven by the desire to negotiate in order to collect information and improve its opponent model. However, neither curiosity-based rationality nor curiosity-robust protocols have been studied in automated negotiation. In this paper, we rely on mechanism design to propose three extensions of the standard bargaining protocol that limit information leaks. Those extensions are supported by an enhanced rationality model that considers the exchanged information. They are also theoretically analyzed and experimentally evaluated.
Algorithms for equilibrium computation generally make no attempt to ensure that the computed strategies are understandable by humans. For instance, the strategies for the strongest poker agents are represented as massive binary files. In many situations, we would like to compute strategies that can actually be implemented by humans, who may have computational limitations and may only be able to remember a small number of features or components of the strategies that have been computed. We study poker games where private information distributions can be arbitrary. We create a large training set of game instances and solutions by randomly selecting the information probabilities, and present algorithms that learn from the training instances in order to perform well in games with unseen information distributions. We are able to derive several new fundamental rules about poker strategy that can be easily implemented by humans.
The paper studies the problem of achieving consensus in multi-agent systems in the case where the dependency digraph $\Gamma$ has no spanning in-tree. We consider the regularization protocol that amounts to the addition of a dummy agent (hub) uniformly connected to the agents. The presence of such a hub guarantees the achievement of an asymptotic consensus. For the "evaporation" of the dummy agent, the strength of its influences on the other agents vanishes, which leads to the concept of latent consensus. We obtain a closed-form expression for the consensus when the connections of the hub are symmetric; in this case, the impact of the hub upon the consensus remains fixed. On the other hand, if the hub is essentially influenced by the agents, whereas its influence on them tends to zero, then the consensus is expressed by the scalar product of the vector of column means of the Laplacian eigenprojection of $\Gamma$ and the initial state vector of the system. Another protocol, which assumes the presence of vanishingly weak uniform background links between the agents, leads to the same latent consensus.
We present a distributed (non-Bayesian) learning algorithm for the problem of parameter estimation with Gaussian noise. The algorithm is expressed as explicit updates on the parameters of the Gaussian beliefs (i.e. means and precision). We show a convergence rate of $O(1/k)$ with the constant term depending on the number of agents and the topology of the network. Moreover, we show almost sure convergence to the optimal solution of the estimation problem for the general case of time-varying directed graphs.
In this paper we extend the principle of proportional representation to rankings. We consider the setting where alternatives need to be ranked based on approval preferences. In this setting, proportional representation requires that cohesive groups of voters are represented proportionally in each initial segment of the ranking. Proportional rankings are desirable in situations where initial segments of different lengths may be relevant, e.g., hiring decisions (if it is unclear how many positions are to be filled), the presentation of competing proposals on a liquid democracy platform (if it is unclear how many proposals participants are taking into consideration), or recommender systems (if a ranking has to accommodate different user types). We study the proportional representation provided by several ranking methods and prove theoretical guarantees. Furthermore, we experimentally evaluate these methods and present preliminary evidence as to which methods are most suitable for producing proportional rankings.
In this paper, we discuss how to design the graph topology to reduce the communication complexity of certain algorithms for decentralized optimization. Our goal is to minimize the total communication needed to achieve a prescribed accuracy. We discover that the so-called expander graphs are near-optimal choices. We propose three approaches to construct expander graphs for different numbers of nodes and node degrees. Our numerical results show that the performance of decentralized optimization is significantly better on expander graphs than other regular graphs.
The computational study of elections generally assumes that the preferences of the electorate come in as a list of votes. Depending on the context, it may be much more natural to represent the list succinctly, as the distinct votes of the electorate and their counts. We consider how this succinct representation of the voters affects the computational complexity of election problems. Though the succinct representation may be exponentially smaller than the nonsuccinct representation, we find only one natural case where the complexity increases, namely the complexity of winner determination for Kemeny elections. This is in sharp contrast to the case where each voter has a weight, where the complexity usually increases.
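The difference between the two input representations can be sketched as follows. The example uses plurality only because it is simple to score. The helper name `plurality_winners` and the sample votes are assumptions made for illustration, and nothing here reflects the complexity results stated in the abstract.

```python
from collections import Counter

def plurality_winners(succinct_votes):
    """Plurality winners from a succinct profile {ranking: number of voters}.

    Each ranking is a tuple of candidates, most preferred first.
    """
    scores = Counter()
    for ranking, count in succinct_votes.items():
        scores[ranking[0]] += count  # one pass per distinct vote, not per voter
    best = max(scores.values())
    return sorted(c for c, s in scores.items() if s == best)

# Nonsuccinct representation: one entry per voter.
vote_list = (
    [("a", "b", "c")] * 3
    + [("b", "a", "c")] * 2
    + [("c", "b", "a")] * 2
)
# Succinct representation: distinct votes with their multiplicities.
succinct = Counter(vote_list)
winners = plurality_winners(succinct)
```

With counts stored in binary, the succinct profile can be exponentially shorter than the vote list, which is why an algorithm that is polynomial in the number of voters is not automatically polynomial in the succinct input size.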
Nov 24 2016 cs.MA
Published during a severe economic crisis, this study presents the first spatial microsimulation model for the analysis of income inequalities and poverty in Greece. First, we present a brief overview of the method and discuss its potential for the analysis of multidimensional poverty and income inequality in Greece. We then present the SimAthens model, based on a combination of small-area demographic and socioeconomic information available from the Greek census of population with data from the European Union Statistics on Income and Living Conditions (EU-SILC). The model is based on an iterative proportional fitting (IPF) algorithm, and is used to reweight EU-SILC records to fit small-area descriptions for Athens based on the 2001 and 2011 censuses. This is achieved by using demographic and socioeconomic characteristics as constraint variables. Next, the synthesized labor market and occupation variables are chosen as the main variables for externally validating our results, in order to verify the integrity of the model. Results of this external validation process are found to be extremely satisfactory, indicating a high goodness of fit between simulated and real values. Finally, the study presents a number of model outputs illustrating changes in social and economic geography during a severe economic crisis, offering a great opportunity for discussing the further potential of this model in policy analysis.
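For readers unfamiliar with iterative proportional fitting, a minimal sketch on a two-way table is given below. It is the classic contingency-table form of IPF, not the SimAthens implementation. The function name, the seed table and the target margins are made-up illustrative values, and the sketch assumes a strictly positive seed whose row and column targets have the same total.

```python
def ipf(seed, row_targets, col_targets, iterations=100):
    """Iterative proportional fitting of a 2-D table to row and column totals.

    seed: list of rows with strictly positive entries (the starting weights).
    row_targets / col_targets: desired margins with equal grand totals.
    """
    table = [row[:] for row in seed]  # work on a copy
    for _ in range(iterations):
        # Scale each row so that it sums to its target.
        for i, target in enumerate(row_targets):
            row_sum = sum(table[i])
            table[i] = [x * target / row_sum for x in table[i]]
        # Scale each column so that it sums to its target.
        for j, target in enumerate(col_targets):
            col_sum = sum(row[j] for row in table)
            for row in table:
                row[j] *= target / col_sum
    return table

# Hypothetical example: survey cross-tabulation reweighted to census margins.
fitted = ipf([[1.0, 2.0], [3.0, 4.0]], row_targets=[40.0, 60.0], col_targets=[30.0, 70.0])
row_sums = [sum(row) for row in fitted]
col_sums = [sum(row[j] for row in fitted) for j in range(2)]
```

In a spatial microsimulation setting the same alternating rescaling is applied with one constraint variable per dimension, so that the reweighted survey records reproduce each small area's census totals.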
Logic-based representations of multi-agent systems have been extensively studied. In this work, we focus on the action language BC to formalize global views of MAS domains. Methodologically, we start by representing the behaviour of each agent by an action description from a single-agent perspective. The formalization then goes through two stages that guide the modeler in composing the global view, first designating multi-agent aspects of the domain via potential conflicts and later resolving these conflicts according to the expected behaviour of the overall system. Considering that representing single-agent descriptions is relatively simpler than representing multi-agent descriptions directly, the formalization developed here is valuable from a knowledge representation perspective.
Learning your first language is an incredible feat and not easily duplicated. Doing this using nothing but a few pictureless books, a corpus, would likely be impossible even for humans. As an alternative we propose to use situated interactions between agents as a driving force for communication, and the framework of Deep Recurrent Q-Networks (DRQN) for learning a common language grounded in the provided environment. We task the agents with interactive image search in the form of the game Guess Who?. The images from the game provide a nontrivial environment for the agents to discuss and a natural grounding for the concepts they decide to encode in their communication. Our experiments show that it is possible to learn this task using DRQN and, even more importantly, that the words the agents use correspond to physical attributes present in the images that make up the agents' environment.
Generative adversarial networks (GANs) are a framework for producing a generative model by way of a two-player minimax game. In this paper, we propose the Generative Multi-Adversarial Network (GMAN), a framework that extends GANs to multiple discriminators. In previous work, the successful training of GANs requires modifying the minimax objective to accelerate training early on. In contrast, GMAN can be reliably trained with the original, untampered objective. We explore a number of design perspectives with the discriminator role ranging from formidable adversary to forgiving teacher. Image generation tasks comparing the proposed framework to standard GANs demonstrate GMAN produces higher quality samples in a fraction of the iterations when measured by a pairwise GAM-type metric.
We consider election scenarios with incomplete information, a situation that arises often in practice. There are several models of incomplete information and accordingly, different notions of outcomes of such elections. In one well-studied model of incompleteness, the votes are given by partial orders over the candidates. In this context we can frame the problem of finding a possible winner, which involves determining whether a given candidate wins in at least one completion of a given set of partial votes for a specific voting rule. The possible winner problem is well-known to be NP-complete in general, and it is in fact known to be NP-complete for several voting rules where the number of undetermined pairs in every vote is bounded only by some constant. In this paper, we address the question of determining precisely the smallest number of undetermined pairs for which the possible winner problem remains NP-complete. In particular, we find the exact values of $t$ for which the possible winner problem transitions to being NP-complete from being in P, where $t$ is the maximum number of undetermined pairs in every vote. We demonstrate tight results for a broad subclass of scoring rules which includes all the commonly used scoring rules (such as plurality, veto, Borda, $k$-approval, and so on), Copeland$^\alpha$ for every $\alpha\in[0,1]$, maximin, and Bucklin voting rules. A somewhat surprising aspect of our results is that for many of these rules, the possible winner problem turns out to be hard even if every vote has at most one undetermined pair of candidates.
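The possible winner problem itself can be stated very compactly in code. The brute-force sketch below enumerates every completion of every partial vote and is therefore exponential. It is meant only to pin down the definition, uses plurality with co-winners as an illustrative rule, and introduces hypothetical helper names (`completions`, `is_possible_plurality_winner`).

```python
from itertools import permutations, product

def completions(candidates, partial_order):
    """All linear orders (best first) consistent with the given (better, worse) pairs."""
    result = []
    for order in permutations(candidates):
        position = {c: i for i, c in enumerate(order)}
        if all(position[a] < position[b] for a, b in partial_order):
            result.append(order)
    return result

def is_possible_plurality_winner(candidate, candidates, partial_votes):
    """True if some completion of the partial votes makes `candidate` a plurality co-winner."""
    per_vote = [completions(candidates, vote) for vote in partial_votes]
    for profile in product(*per_vote):
        scores = {c: 0 for c in candidates}
        for order in profile:
            scores[order[0]] += 1  # plurality: one point for the top-ranked candidate
        if scores[candidate] == max(scores.values()):
            return True
    return False

candidates = ["a", "b", "c"]
partial_votes = [
    [("a", "b"), ("b", "c")],  # fully determined: a > b > c
    [("b", "c")],              # position of a is undetermined
    [("b", "c")],              # position of a is undetermined
]
b_possible = is_possible_plurality_winner("b", candidates, partial_votes)
c_possible = is_possible_plurality_winner("c", candidates, partial_votes)
```

Here b can win if both undetermined votes place b first, while c can never be ranked first in any completion.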
In participatory budgeting, communities collectively decide on the allocation of public tax dollars for local public projects. In this work, we consider the question of fairly aggregating the preferences of community members to determine an allocation of funds to projects. This problem is different from standard fair resource allocation because of public goods: The allocated goods benefit all users simultaneously. Fairness is crucial in participatory decision making, since generating equitable outcomes is an important goal of these processes. We argue that the classic game theoretic notion of core captures fairness in the setting. To compute the core, we first develop a novel characterization of a public goods market equilibrium called the Lindahl equilibrium, which is always a core solution. We then provide the first (to our knowledge) polynomial time algorithm for computing such an equilibrium for a broad set of utility functions; our algorithm also generalizes (in a non-trivial way) the well-known concept of proportional fairness. We use our theoretical insights to perform experiments on real participatory budgeting voting data. We empirically show that the core can be efficiently computed for utility functions that naturally model our practical setting, and examine the relation of the core with the familiar welfare objective. Finally, we address concerns of incentives and mechanism design by developing a randomized approximately dominant-strategy truthful mechanism building on the exponential mechanism from differential privacy.
This paper proposes models of the learning process in teams of individuals who collectively execute a sequence of tasks and whose actions are determined by individual skill levels and networks of interpersonal appraisals and influence. The closely related proposed models have increasing complexity, starting with a centralized manager-based assignment and learning model, and finishing with a social model of interpersonal appraisal, assignments, learning, and influences. We show how rational optimal behavior arises along the task sequence for each model, and discuss conditions of suboptimality. Our models are grounded in replicator dynamics from evolutionary games, influence networks from mathematical sociology, and transactive memory systems from organization science.
Sep 27 2016 cs.MA
Finding feasible, collision-free paths for multiagent systems can be challenging, particularly in non-communicating scenarios where each agent's intent (e.g. goal) is unobservable to the others. In particular, finding time-efficient paths often requires anticipating interaction with neighboring agents, the process of which can be computationally prohibitive. This work presents a decentralized multiagent collision avoidance algorithm based on a novel application of deep reinforcement learning, which effectively offloads the online computation (for predicting interaction patterns) to an offline learning procedure. Specifically, the proposed approach develops a value network that encodes the estimated time to the goal given an agent's joint configuration (positions and velocities) with its neighbors. Use of the value network not only admits efficient (i.e., real-time implementable) queries for finding a collision-free velocity vector, but also considers the uncertainty in the other agents' motion. Simulation results show more than 26 percent improvement in path quality (i.e., time to reach the goal) when compared with optimal reciprocal collision avoidance (ORCA), a state-of-the-art collision avoidance strategy.
We overview some results on distributed learning with a focus on a family of recently proposed algorithms known as non-Bayesian social learning. We consider different approaches to the distributed learning problem and its algorithmic solutions for the case of finitely many hypotheses. The original centralized problem is discussed first, followed by a generalization to the distributed setting. The results on convergence and convergence rate are presented for both asymptotic and finite time regimes. Various extensions are discussed, such as those dealing with directed time-varying networks, Nesterov's acceleration technique, and continuum sets of hypotheses.
Microscopic pedestrian studies require large amounts of trajectory data from real-world pedestrian crowds. Such data collection, if done manually, needs tremendous effort and is very time consuming. Though many studies have asserted the possibility of automating this task using video cameras, we found that only a few have demonstrated good performance in very crowded situations or from a top-angled view scene. This paper deals with tracking pedestrian crowds under heavy occlusion from an angled view. Our automated tracking system consists of two modules that run sequentially. The first module detects moving objects as blobs. The second module is a tracking system. We employ a probability distribution from the detection of each pedestrian and use a Bayesian update to track the next position. The result of such tracking is a database of pedestrian trajectories over time and space. With certain prior information, we show that the system can track a large number of people under occlusion and in cluttered scenes.
We study the computational complexity of several scenarios of strategic behavior for the Kemeny procedure in the setting of judgment aggregation. In particular, we investigate (1) manipulation, where an individual aims to achieve a better group outcome by reporting an insincere individual opinion, (2) bribery, where an external agent aims to achieve an outcome with certain properties by bribing a number of individuals, and (3) control (by adding or deleting issues), where an external agent aims to achieve an outcome with certain properties by influencing the set of issues in the judgment aggregation situation. We show that determining whether these types of strategic behavior are possible (and if so, computing a policy for successful strategic behavior) is complete for the second level of the Polynomial Hierarchy. That is, we show that these problems are $\Sigma^p_2$-complete.
In this paper, we develop an agent-based model which integrates four important elements, i.e. organisational energy management policies/regulations, energy management technologies, electric appliances and equipment, and human behaviour, based on a case study, to simulate the energy consumption in office buildings. With the model, we test the effectiveness of different energy management strategies, and solve practical office energy consumption problems. This paper contributes theoretically by integrating the four elements involved in the complex organisational issue of office energy consumption, and practically by applying the agent-based approach to the study of office building energy consumption.
A framework for consensus modelling is introduced using Kleene's three valued logic as a means to express vagueness in agents' beliefs. Explicitly borderline cases are inherent to propositions involving vague concepts where sentences of a propositional language may be absolutely true, absolutely false or borderline. By exploiting these intermediate truth values, we can allow agents to adopt a more vague interpretation of underlying concepts in order to weaken their beliefs and reduce the levels of inconsistency, so as to achieve consensus. We consider a consensus combination operation which results in agents adopting the borderline truth value as a shared viewpoint if they are in direct conflict. Simulation experiments are presented which show that applying this operator to agents chosen at random (subject to a consistency threshold) from a population, with initially diverse opinions, results in convergence to a smaller set of more precise shared beliefs. Furthermore, if the choice of agents for combination is dependent on the payoff of their beliefs, this acting as a proxy for performance or usefulness, then the system converges to beliefs which, on average, have higher payoff.
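A consensus combination operation of this kind can be sketched on Kleene's three truth values. The abstract only states that directly conflicting agents adopt the borderline value. The rule that a borderline value yields to a definite one is an assumption added here so that the operator is total, and the names `combine_value` and `combine_beliefs` are hypothetical.

```python
TRUE, BORDERLINE, FALSE = "true", "borderline", "false"

def combine_value(x, y):
    """Combine two truth values held by different agents for one proposition."""
    if x == y:
        return x
    if x == BORDERLINE:
        return y  # assumption: the more precise (definite) value is kept
    if y == BORDERLINE:
        return x  # assumption: the more precise (definite) value is kept
    return BORDERLINE  # direct conflict (true vs. false) becomes borderline

def combine_beliefs(agent_a, agent_b):
    """Apply the operator proposition by proposition to two belief valuations."""
    return {p: combine_value(agent_a[p], agent_b[p]) for p in agent_a}

agent_a = {"p1": TRUE, "p2": FALSE, "p3": BORDERLINE}
agent_b = {"p1": FALSE, "p2": FALSE, "p3": TRUE}
shared = combine_beliefs(agent_a, agent_b)
```

After combination both agents would hold `shared`, which weakens the conflicting belief about `p1` while keeping the agreed or more precise values for `p2` and `p3`.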
We consider the positioning problem of aerial drone systems for efficient three-dimensional (3-D) coverage. Our solution draws from molecular geometry, where forces among electron pairs surrounding a central atom arrange their positions. In this paper, we propose a 3-D clustering algorithm for autonomous positioning (VBCA) of aerial drone networks based on virtual forces. These virtual forces induce interactions among drones and structure the system topology. The advantages of our approach are that (1) virtual forces enable drones to self-organize the positioning process and (2) VBCA can be implemented in an entirely localized manner. Extensive simulations show that our virtual forces clustering approach produces scalable 3-D topologies exhibiting near-optimal volume coverage. VBCA triggers efficient topology rearrangement for a changing number of nodes, while providing network connectivity to the central drone. We also draw a comparison of volume coverage achieved by VBCA against existing approaches and find VBCA up to 40% more efficient.
A true lie is a lie that becomes true when announced. In a logic of announcements, where the announcing agent is not modelled, a true lie is a formula (that is false and) that becomes true when announced. We investigate true lies and other types of interaction between announced formulas, their preconditions and their postconditions, in the setting of Gerbrandy's logic of believed announcements, wherein agents may have or obtain incorrect beliefs. Our results are on the satisfiability and validity of instantiations of these semantically defined categories, on iterated announcements, including arbitrarily often iterated announcements, and on syntactic characterization. We close with results for iterated announcements in the logic of knowledge (instead of belief), and for lying as private announcements (instead of public announcements) to different agents. Detailed examples illustrate our lying concepts.
Extensive work has been conducted both in game theory and logic to model strategic interaction. An important question is whether we can use these theories to design agents for interacting with people. On the one hand, they provide a formal design specification for agent strategies. On the other hand, people do not necessarily adhere to playing in accordance with these strategies, and their behavior is affected by a multitude of social and psychological factors. In this paper we will consider the question of whether strategies implied by theories of strategic behavior can be used by automated agents that interact proficiently with people. We will focus on automated agents that we built that need to interact with people in two negotiation settings: bargaining and deliberation. For bargaining we will study game-theory based equilibrium agents and for argumentation we will discuss logic-based argumentation theory. We will also consider security games and persuasion games and will discuss the benefits of using equilibrium based agents.
We consider the problem of multiple agents sensing and acting in environments with the goal of maximising their shared utility. In these environments, agents must learn communication protocols in order to share information that is needed to solve the tasks. By embracing deep neural networks, we are able to demonstrate end-to-end learning of protocols in complex environments inspired by communication riddles and multi-agent computer vision problems with partial observability. We propose two approaches for learning in these domains: Reinforced Inter-Agent Learning (RIAL) and Differentiable Inter-Agent Learning (DIAL). The former uses deep Q-learning, while the latter exploits the fact that, during learning, agents can backpropagate error derivatives through (noisy) communication channels. Hence, this approach uses centralised learning but decentralised execution. Our experiments introduce new environments for studying the learning of communication protocols and present a set of engineering innovations that are essential for success in these domains.
May 09 2016 cs.MA
This paper presents a computational approach to modelling group creativity. It presents an analysis of two studies of group creativity selected from different research cultures and identifies a common theme ("idea build-up") that is then used in the formalisation of an agent-based model used to support reasoning about the complex dynamics of building on the ideas of others.
Peer review, evaluation, and selection are the foundation on which modern science is built. Funding bodies the world over employ experts to study and select the best proposals of those submitted for funding. The problem of peer selection, however, is much more universal: a professional society may want to give a subset of its members awards based on the opinions of all the members; an instructor for a MOOC or online course may want to crowdsource grading; or a marketing company may select ideas from group brainstorming sessions based on peer evaluation. We make three fundamental contributions to the study of procedures or mechanisms for peer selection, a specific type of group decision making problem studied in computer science, economics, political science, and beyond. First, we detail a novel mechanism that is strategyproof, i.e., agents cannot benefit themselves by reporting insincere valuations, in addition to other desirable normative properties. Second, we demonstrate the effectiveness of our mechanism through a comprehensive simulation-based comparison of our mechanism with a suite of mechanisms found in the computer science and economics literature. Finally, our mechanism employs a randomized rounding technique that is of independent interest, as it can be used as a randomized method to address the ubiquitous apportionment problem that arises in various settings where discrete resources such as parliamentary representation slots need to be divided fairly.
Despite much scientific evidence, a large fraction of the American public doubts that greenhouse gases are causing global warming. We present a simulation model as a computational test-bed for climate prediction markets. Traders adapt their beliefs about future temperatures based on the profits of other traders in their social network. We simulate two alternative climate futures, in which global temperatures are primarily driven either by carbon dioxide or by solar irradiance. These represent, respectively, the scientific consensus and a hypothesis advanced by prominent skeptics. We conduct sensitivity analyses to determine how a variety of factors describing both the market and the physical climate may affect traders' beliefs about the cause of global climate change. Market participation causes most traders to converge quickly toward believing the "true" climate model, suggesting that a climate market could be useful for building public consensus.
This article outlines a method for automatically generating models of dynamic decision-making that both have strong predictive power and are interpretable in human terms. This is useful for designing empirically grounded agent-based simulations and for gaining direct insight into observed dynamic processes. We use an efficient model representation and a genetic algorithm-based estimation process to generate simple approximations that explain most of the structure of complex stochastic processes. This method, implemented in C++ and R, scales well to large data sets. We apply our methods to empirical data from human subjects game experiments and international relations. We also demonstrate the method's ability to recover known data-generating processes by simulating data with agent-based models and correctly deriving the underlying decision models for multiple agent models and degrees of stochasticity.
The goal of this work is to enable a team of quadrotors to learn how to accurately track a desired trajectory while holding a given formation. We solve this problem in a distributed manner, where each vehicle has access only to the information of its neighbors. The desired trajectory is only available to one (or few) vehicles. We present a distributed iterative learning control (ILC) approach where each vehicle learns from the experience of its own and its neighbors' previous task repetitions, and adapts its feedforward input to improve performance. Existing algorithms are extended in theory to make them more applicable to real-world experiments. In particular, we prove stability for any causal learning function with gains chosen according to a simple scalar condition. Previous proofs were restricted to a specific learning function that only depends on the tracking error derivative (D-type ILC). Our extension provides more degrees of freedom in the ILC design and, as a result, better performance can be achieved. We also show that stability is not affected by a linear dynamic coupling between neighbors. This allows us to use an additional consensus feedback controller to compensate for non-repetitive disturbances. Experiments with two quadrotors attest to the effectiveness of the proposed distributed multi-agent ILC approach. This is the first work to show distributed ILC in experiment.
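The basic ILC idea of adapting a feedforward input from the error of the previous repetition can be sketched for a single agent. This is a deliberately simplified illustration and not the distributed algorithm of the paper: the first-order plant, the proportional-type update on the error itself, and the names `simulate` and `ilc` with their default gains are all assumptions made here.

```python
from typing import List, Tuple


def simulate(u: List[float], a: float = 0.5, b: float = 1.0) -> List[float]:
    """Run one task repetition of the assumed plant x[t+1] = a*x[t] + b*u[t].

    Starts from x[0] = 0 and returns the outputs x[1], ..., x[T].
    """
    x = 0.0
    outputs = []
    for u_t in u:
        x = a * x + b * u_t
        outputs.append(x)
    return outputs


def ilc(reference: List[float], gain: float = 0.5,
        iterations: int = 200) -> Tuple[List[float], List[float]]:
    """Iteratively learn a feedforward input that tracks `reference`.

    After each repetition the input is corrected by gain * error, where the
    error at index t is the deviation of the output that u[t] first affects.
    Returns the final input and the maximum absolute error per repetition.
    """
    u = [0.0] * len(reference)
    max_errors = []
    for _ in range(iterations):
        y = simulate(u)
        error = [r - y_t for r, y_t in zip(reference, y)]
        max_errors.append(max(abs(e) for e in error))
        # Learning update: reuse experience from the previous repetition.
        u = [u_t + gain * e_t for u_t, e_t in zip(u, error)]
    return u, max_errors
```

With the assumed plant gain b = 1 and learning gain 0.5, the scalar quantity |1 - gain * b| is below 1, and the tracking error shrinks over repetitions in this toy setting.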
We investigate the effects of social interactions in task allocation using Evolutionary Game Theory (EGT). We propose a simple task-allocation game and study how different learning mechanisms can give rise to specialised and non-specialised colonies under different ecological conditions. By combining agent-based simulations and adaptive dynamics we show that social learning can result in colonies of generalists or specialists, depending on ecological parameters. Agent-based simulations further show that learning dynamics play a crucial role in task allocation. In particular, introspective individual learning readily favours the emergence of specialists, while a process resembling task recruitment favours the emergence of generalists.
Multi-agent path finding (MAPF) is well-studied in artificial intelligence, robotics, theoretical computer science and operations research. We discuss issues that arise when generalizing MAPF methods to real-world scenarios and four research directions that address them. We emphasize the importance of addressing these issues as opposed to developing faster methods for the standard formulation of the MAPF problem.
Until now, mean-field-type game theory has not focused on cognitively plausible models of choice in the strategic interactions of humans, animals, machines, robots, and software-defined and mobile devices. This work presents some effects of users' psychology in mean-field-type games. In addition to the traditional "material" payoff modelling, psychological patterns are introduced in order to better capture and understand behaviors that are observed in engineering practice or in experimental settings. The psychological payoff value depends upon choices, mean-field states, mean-field actions, empathy and beliefs. It is shown that affective empathy enforces mean-field equilibrium payoff equity and improves fairness between the players. The work establishes equilibrium systems for such interactive decision-making problems. Basic empathy concepts are illustrated in several important problems in engineering including resource sharing, packet collision minimization, energy markets, and forwarding in Device-to-Device communications. The work also conducts an experiment with 47 people who have to decide whether to cooperate or not. The basic empathy metrics of the Interpersonal Reactivity Index were used to measure the empathy distribution of each participant. An Android app called Empathizer was developed to systematically analyze the data obtained from the participants. The experimental results reveal that the dominated strategies of classical game theory are no longer dominated when users' psychology is involved, and a significant level of cooperation is observed among the users who are positively partially empathetic.
This paper investigates the task assignment problem for multiple dispersed robots constrained by limited communication range. The robots are initially randomly distributed and need to visit several target locations while minimizing the total travel time. A centralized rendezvous-based algorithm is proposed, under which all the robots first move towards a rendezvous position until communication paths are established between every pair of robots either directly or through intermediate peers, and then one robot is chosen as the leader to make a centralized task assignment for the other robots. Furthermore, we propose a decentralized algorithm based on a single-traveling-salesman tour, which does not require all the robots to be connected through communication. We investigate the variation of the quality of the assignment solutions as the level of information sharing increases and as the communication range grows, respectively. The proposed algorithms are compared with a centralized algorithm with shared global information and a decentralized greedy algorithm respectively. Monte Carlo simulation results show the satisfying performance of the proposed algorithms.
All-pay auctions, a common mechanism for various human and agent interactions, suffer, like many other mechanisms, from the possibility of players' failure to participate in the auction. We model such failures and fully characterize equilibrium for this class of games: we present a symmetric equilibrium and show that under some conditions the equilibrium is unique. We reveal various properties of the equilibrium, such as the lack of influence of the most-likely-to-participate player on the behavior of the other players. We perform this analysis with two scenarios: the sum-profit model, where the auctioneer obtains the sum of all submitted bids, and the max-profit model of crowdsourcing contests, where the auctioneer can only use the best submissions and thus obtains only the winning bid. Furthermore, we examine various methods of influencing the probability of participation, such as the effects of misreporting one's own probability of participating, and how influencing another player's participation chances changes the player's strategy.
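The difference between the two profit models can be made concrete with a small sketch. It only computes the auctioneer's revenue from a given set of bids and does not model the equilibrium analysis. The function name `auctioneer_revenue`, the `model` labels, and the use of `None` to mark a player who failed to participate are assumptions made for illustration.

```python
from typing import Optional, Sequence


def auctioneer_revenue(bids: Sequence[Optional[float]], model: str) -> float:
    """Revenue of the auctioneer in an all-pay auction with possible no-shows.

    `bids` holds one entry per player; None marks a player who failed to
    participate (an illustrative encoding). `model` is "sum" for the
    sum-profit model or "max" for the max-profit model.
    """
    submitted = [b for b in bids if b is not None]
    if not submitted:
        return 0.0  # nobody showed up, so nothing is collected
    if model == "sum":
        # Sum-profit: the auctioneer obtains every submitted bid.
        return float(sum(submitted))
    if model == "max":
        # Max-profit: only the winning (highest) bid is of use.
        return float(max(submitted))
    raise ValueError("model must be 'sum' or 'max'")
```

With bids of 3 and 5 and one absent player, the sum-profit revenue is 8 while the max-profit revenue is 5.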
This paper studies scenarios of cyclic dominance in a coevolutionary spatial model in which game strategies and links between agents adaptively evolve over time. The Optional Prisoner's Dilemma (OPD) game is employed. The OPD is an extended version of the traditional Prisoner's Dilemma where players have a third option to abstain from playing the game. We adopt an agent-based simulation approach and use Monte Carlo methods to perform the OPD with coevolutionary rules. The necessary conditions to break the scenarios of cyclic dominance are also investigated. This work highlights that cyclic dominance is essential in the sustenance of biodiversity. Moreover, we also discuss the importance of a spatial coevolutionary model in maintaining cyclic dominance in adverse conditions.
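The structure of a single OPD interaction can be sketched as a payoff function. The abstract states only that players have a third option to abstain; the numeric payoffs below and the convention that both players receive a fixed "loner's payoff" whenever either abstains are assumptions drawn from common usage in the OPD literature, not values taken from this paper. The names `opd_payoff` and the constants are hypothetical.

```python
from typing import Tuple

# Assumed payoff values with the usual ordering T > R > L > P > S.
TEMPTATION = 5.0  # defect against a cooperator
REWARD = 3.0      # mutual cooperation
LONER = 2.0       # either player abstains
PUNISHMENT = 1.0  # mutual defection
SUCKER = 0.0      # cooperate against a defector


def opd_payoff(a: str, b: str) -> Tuple[float, float]:
    """Payoffs for one round; strategies are "C", "D" or "A" (abstain)."""
    for move in (a, b):
        if move not in ("C", "D", "A"):
            raise ValueError("strategy must be 'C', 'D' or 'A'")
    # If anyone abstains, the game is not played and both get the loner's payoff.
    if a == "A" or b == "A":
        return (LONER, LONER)
    if a == "C" and b == "C":
        return (REWARD, REWARD)
    if a == "D" and b == "D":
        return (PUNISHMENT, PUNISHMENT)
    if a == "C":  # a cooperates, b defects
        return (SUCKER, TEMPTATION)
    return (TEMPTATION, SUCKER)  # a defects, b cooperates
```

Under these assumed values, defection beats cooperation, abstention beats defection, and cooperation beats abstention in pairwise terms, which is the kind of rock-paper-scissors relation that cyclic dominance refers to.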
Online learning with streaming data in a distributed and collaborative manner can be useful in a wide range of applications. This topic has been receiving considerable attention in recent years with emphasis on both single-task and multitask scenarios. In single-task adaptation, agents cooperate to track an objective of common interest, while in multitask adaptation agents track multiple objectives simultaneously. Regularization is one useful technique to promote and exploit similarity among tasks in the latter scenario. This work examines an alternative way to model relations among tasks by assuming that they all share a common latent feature representation. As a result, a new multitask learning formulation is presented and algorithms are developed for its solution in a distributed online manner. We present a unified framework to analyze the mean-square-error performance of the adaptive strategies, and conduct simulations to illustrate the theoretical findings and potential applications.
Managing micro-tasks on crowdsourcing marketplaces involves balancing conflicting objectives -- the quality of work, total cost incurred and time to completion. Previous agents have focused on cost-quality or cost-time tradeoffs, limiting their real-world applicability. As a step towards greater real-world applicability, we present Octopus, the first AI agent that jointly manages all three objectives. Octopus is based on a computationally tractable, multi-agent formulation consisting of three components: one that sets the price per ballot to adjust the rate of completion of tasks, another that optimizes each task for quality, and a third that performs task selection. We demonstrate that Octopus outperforms existing state-of-the-art approaches in simulation and in experiments with real data. We also deploy Octopus on Amazon Mechanical Turk to establish its ability to manage tasks in a real-world, dynamic setting.
This paper extends and adapts an existing abstract model to an empirical metropolitan region in Brazil. The model - named SEAL: a Spatial Economic Agent-based Lab - comprises a framework to enable ex-ante analysis of public policy. The aim of the model is to use official data and municipalities' spatial boundaries to allow for policy experimentation. The current version considers three markets: housing, labor and goods. Family members age, consume, join the labor market and trade houses. A single consumption tax is collected by municipalities, which invest it back into quality-of-life improvements. We test whether a single metropolitan government - which is an aggregation of municipalities - would be in the best interest of its citizens. Preliminary results for 20 runs indicate that it may be the case. Future developments include improving performance to enable running a higher percentage of the population and a number of runs that makes the model more robust.
Feb 10 2017 cs.MA
In many problems, agents cooperate locally so that a leader or fusion center can infer the state of every agent from probing the state of only a small number of agents. Versions of this problem arise when a fusion center reconstructs an extended physical field by accessing the state of just a few of the sensors measuring the field, or a leader monitors the formation of a team of robots. Given a link cost, the paper presents a polynomial time algorithm to design a minimum cost coordinated network dynamics followed by the agents, under an observability constraint. The problem is placed in the context of structural observability and solved even when up to k agents in the coordinated network dynamics fail.
In human societies, people's willingness to compete and strive for better social status, as well as being envious of those perceived as in some way superior, leads to social structures that are intrinsically hierarchical. Here we propose an agent-based network model to mimic the ranking behaviour of individuals and its possible repercussions in human society. The main ingredient of the model is the assumption that the relevant feature of social interactions is each individual's keenness to maximise his or her status relative to others. The social networks produced by the model are homophilous and assortative, as frequently observed in human communities, and most of the network properties seem quite independent of network size. However, for a small number of agents the resulting network consists of disjoint, weakly connected communities while being highly assortative and homophilic. On the other hand, larger networks turn out to be more cohesive, with larger communities, but less homophilic. We find that the reason for these changes is that larger network size allows agents to use new strategies for maximizing their social status, allowing for more diverse links between them.
We introduce the concept of a V-formation game between a controller and an attacker, where the controller's goal is to maneuver the plant (a simple model of flocking dynamics) into a V-formation, and the goal of the attacker is to prevent the controller from doing so. Controllers in V-formation games utilize a new formulation of model-predictive control we call Adaptive-Horizon MPC (AMPC), giving them extraordinary power: we prove that under certain controllability assumptions, an AMPC controller is able to attain V-formation with probability 1. We define several classes of attackers, including those that in one move can remove R birds from the flock, or introduce random displacement into flock dynamics. We consider both naive attackers, whose strategies are purely probabilistic, and AMPC-enabled attackers, putting them on par strategically with the controllers. While an AMPC-enabled controller is expected to win every game with probability 1, in practice, it is resource-constrained: its maximum prediction horizon and the maximum number of game execution steps are fixed. Under these conditions, an attacker has a much better chance of winning a V-formation game. Our extensive performance evaluation of V-formation games uses statistical model checking to estimate the probability that an attacker can thwart the controller. Our results show that for the bird-removal game with R = 1, the controller almost always wins (restores the flock to a V-formation). For R = 2, the game outcome critically depends on which two birds are removed. For the displacement game, our results again demonstrate that an intelligent attacker, i.e. one that uses AMPC in this case, significantly outperforms its naive counterpart that randomly executes its attack.
Feb 01 2017 cs.MA
Affect Control Theory (ACT) is a powerful and general sociological model of human affective interaction. ACT provides an empirically derived mathematical model of culturally shared sentiments as heuristic guides for human decision making. BayesACT, a variant on classical ACT, combines affective reasoning with cognitive (denotative or logical) reasoning as is traditionally found in AI. BayesACT allows for the creation of agents that are both emotionally guided and goal-directed. In this work, we simulate BayesACT agents in the Iterated Networked Prisoner's Dilemma (INPD), and we show that four out of five known properties of human play in INPD are replicated by these socio-affective agents. In particular, we show how the observed human behaviours of network structure invariance, anti-correlation of cooperation and reward, and player type stratification are all clearly emergent properties of the networked BayesACT agents. We further show that decision hysteresis (Moody Conditional Cooperation) is replicated by BayesACT agents in over $2/3$ of the cases we have considered. In contrast, previously used imitation-based agents are only able to replicate one of the five properties. We discuss the implications of these findings in the development of human-agent societies.
Organic Computing is an initiative in the field of systems engineering that proposed to make use of concepts such as self-adaptation and self-organisation to increase the robustness of technical systems. Based on the observation that traditional design and operation concepts reach their limits, transferring more autonomy to the systems themselves should result in a reduction of complexity for users, administrators, and developers. However, there seems to be a need for an updated definition of the term "Organic Computing", of desired properties of technical, organic systems, and the objectives of the Organic Computing initiative. With this article, we will address these points.