# Multiagent Systems (cs.MA)

• We present an approach for implementing a specific form of collaborative industrial practices, called Industrial Symbiotic Networks (ISNs), as MC-Net cooperative games and address the so-called ISN implementation problem. That is, the characteristics of ISNs may lead to inapplicability of fair and stable benefit allocation methods even if the collaboration is a collectively desired one. Inspired by realistic ISN scenarios and the literature on normative multi-agent systems, we consider regulations and normative socioeconomic policies as two elements that in combination with ISN games resolve the situation and result in the concept of coordinated ISNs.
• Apr 20 2018 cs.MA cs.AI arXiv:1804.07178v1
Interest in emergent communication has recently surged in Machine Learning. The focus of this interest has largely been either on investigating the properties of the learned protocol or on utilizing emergent communication to better solve problems that already have a viable solution. Here, we consider self-driving cars coordinating with each other and focus on how communication influences the agents' collective behavior. Our main result is that communication helps (most) with adverse conditions.
• Apr 18 2018 cs.MA arXiv:1804.06011v1
Queen Daniela of Sardinia is asleep at the center of a round room at the top of the tower in her castle. She is accompanied by her faithful servant, Eva. Suddenly, they are awakened by cries of "Fire". The room is pitch black and they are disoriented. There is exactly one exit from the room somewhere along its boundary. They must find it as quickly as possible in order to save the life of the queen. It is known that with two people searching while moving at maximum speed 1 anywhere in the room, the room can be evacuated (i.e., with both people exiting) in $1 + \frac{2\pi}{3} + \sqrt{3} \approx 4.8264$ time units and this is optimal [Czyzowicz et al., DISC'14], assuming that the first person to find the exit can directly guide the other person to the exit using her voice. Somewhat surprisingly, in this paper we show that if the goal is to save the queen (possibly leaving Eva behind to die in the fire) there is a slightly better strategy. We prove that this "priority" version of evacuation can be solved in time at most $4.81854$. Furthermore, we show that any strategy for saving the queen requires time at least $3 + \pi/6 + \sqrt{3}/2 \approx 4.3896$ in the worst case. If one or both of the queen's other servants (Biddy and/or Lili) are with her, we show that the time bounds can be improved to $3.8327$ for two servants, and $3.3738$ for three servants. Finally we show lower bounds for these cases of $3.6307$ (two servants) and $3.2017$ (three servants). The case of $n\geq 4$ is the subject of an independent study by Queen Daniela's Royal Scientific Team.
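As a quick sanity check of the closed-form bounds quoted above, the arithmetic can be reproduced in a few lines. This is only a numerical restatement of the numbers in the abstract; the variable names are illustrative and the search strategies themselves are not modeled.

```python
import math

# Optimal time for evacuating BOTH searchers from the unit disk
# (the expression quoted from Czyzowicz et al., DISC'14).
evacuation_both = 1 + 2 * math.pi / 3 + math.sqrt(3)

# Lower bound quoted for the "priority" version (only the queen must exit).
priority_lower = 3 + math.pi / 6 + math.sqrt(3) / 2

# Upper bound quoted for the priority version with one servant.
priority_upper = 4.81854

print(f"evacuate both:        {evacuation_both:.4f}")
print(f"priority lower bound: {priority_lower:.4f}")
print(f"gain of priority strategy over evacuating both: "
      f"{evacuation_both - priority_upper:.4f}")
```

The gap between the two upper bounds is small, which matches the abstract's wording of a "slightly better" strategy.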
• In the parable of Simon's Ant, an ant follows a complex path along a beach to reach its goal. The story shows how the interaction of simple rules and a complex environment results in complex behavior. But this relationship can be looked at in another way: given path and rules, we can infer the environment. With a large population of agents, human or animal, it should be possible to build a detailed map of a population's social and physical environment. In this abstract, we describe the development of a framework to create such maps of human belief space. These maps are built from the combined trajectories of a large number of agents. Currently, these maps are built using multidimensional agent-based simulation, but the framework is designed to work using data from computer-mediated human communication. Maps incorporating human data should support visualization and navigation of the "plains of research", "fashionable foothills" and "conspiracy cliffs" of human belief spaces.
• Distributed controllers are often necessary for a multi-agent system to satisfy safety properties such as collision avoidance. Communication and coordination are key requirements in the implementation of a distributed control protocol, but maintaining an all-to-all communication topology is unreasonable and not always necessary. Given a safety objective and a controller implementation, we consider the problem of identifying when agents need to communicate with one another and coordinate their actions to satisfy the safety constraint. We define a coordination-free controllable predecessor operator that is used to derive a subset of the state space that allows agents to act independently, without consulting other agents to double check that the action is safe. Applications are shown for identifying an upper bound on connection delays and a self-triggered coordination scheme. Examples are provided which showcase the potential for designers to visually interpret a system's ability to tolerate delays when initializing a network connection.
• In this work, we present a programming paradigm allowing the control of swarms with a minimum communication bandwidth in a simple manner, yet allowing the emergence of diverse complex behaviors and autonomy of the swarm. Communication in the proposed paradigm is based on single bit "ping"-signals propagating as information-waves throughout the swarm. We show that even this minimum bandwidth communication between agents suffices for the design of a substantial set of behaviors in the domain of essential behaviors of a collective, including locomotion and self awareness of the swarm.
• The ability of algorithms to evolve or learn (compositional) communication protocols has traditionally been studied in the language evolution literature through the use of emergent communication tasks. Here we scale up this research by using contemporary deep learning methods and by training reinforcement-learning neural network agents on referential communication games. We extend previous work, in which agents were trained in symbolic environments, by developing agents which are able to learn from raw pixel data, a more challenging and realistic input representation. We find that the degree of structure found in the input data affects the nature of the emerged protocols, and thereby corroborate the hypothesis that structured compositional language is most likely to emerge when agents perceive the world as being structured.
• Multi-agent reinforcement learning offers a way to study how communication could emerge in communities of agents needing to solve specific problems. In this paper, we study the emergence of communication in the negotiation environment, a semi-cooperative model of agent interaction. We introduce two communication protocols -- one grounded in the semantics of the game, and one which is a priori ungrounded and is a form of cheap talk. We show that self-interested agents can use the pre-grounded communication channel to negotiate fairly, but are unable to effectively use the ungrounded channel. However, prosocial agents do learn to use cheap talk to find an optimal negotiating strategy, suggesting that cooperation is necessary for language to emerge. We also study communication behaviour in a setting where one agent interacts with agents in a community with different levels of prosociality and show how agent identifiability can aid negotiation.
• We present a novel algorithm for computing collision-free navigation for heterogeneous road-agents such as cars, tricycles, bicycles, and pedestrians in dense traffic. Our approach currently assumes the positions, shapes, and velocities of all vehicles and pedestrians are known and computes smooth trajectories for each agent by taking into account the dynamic constraints. We describe an efficient optimization-based algorithm for each road-agent based on reciprocal velocity obstacles that takes into account kinematic and dynamic constraints. Our algorithm uses tight fitting shape representations based on medial axis to compute collision-free trajectories in dense traffic situations. We evaluate the performance of our algorithm in real-world dense traffic scenarios and highlight the benefits over prior reciprocal collision avoidance schemes.
• Decentralized (PO)MDPs provide an expressive framework for sequential decision making in a multiagent system. Given their computational complexity, recent research has focused on tractable yet practical subclasses of Dec-POMDPs. We address such a subclass called CDEC-POMDP where the collective behavior of a population of agents affects the joint-reward and environment dynamics. Our main contribution is an actor-critic (AC) reinforcement learning method for optimizing CDEC-POMDP policies. Vanilla AC has slow convergence for larger problems. To address this, we show how a particular decomposition of the approximate action-value function over agents leads to effective updates, and also derive a new way to train the critic based on local reward signals. Comparisons on a synthetic benchmark and a real-world taxi fleet optimization problem show that our new AC approach provides better quality solutions than previous best approaches.
• The social community of open source software developers has a complex network structure. The network structure represents the relations between projects and engineers in the software developers' community. A project forms teams that consist of engineers categorized into task groups. SourceForge is a well-known open source website. The nodes and arcs in the network structure represent the engineers and the connections among engineers on SourceForge. In a previous study, we found that the growth of a project becomes stronger as the number of developers joining the project increases. In the growing phase, we found some characteristic patterns between the number of agents and the projects produced. Based on these observations, we developed a simulation model of the growing process of a project. In this paper, we introduce altruistic behavior, as shown in the Army Ant model, into the software developers' simulation model. The efficiency of the software development process is investigated through experimental simulation results.
• We present a novel algorithm for reciprocal collision avoidance between heterogeneous agents of different shapes and sizes. We present a novel CTMAT representation based on medial axis transform to compute a tight fitting bounding shape for each agent. Each CTMAT is represented using tuples, which are composed of circular arcs and line segments. Based on the reciprocal velocity obstacle formulation, we reduce the problem to solving a low-dimensional linear programming problem between each pair of tuples belonging to adjacent agents. We precompute the Minkowski Sums of tuples to accelerate the runtime performance. Finally, we provide an efficient method to update the orientation of each agent in a local manner. We have implemented the algorithm and highlight its performance on benchmarks corresponding to road traffic scenarios and different vehicles. The overall runtime performance is comparable to prior multi-agent collision avoidance algorithms that use circular or elliptical agents. Our approach is less conservative and results in fewer false collisions.
• This paper describes an agent-based simulation used to model human actions in belief space, a high-dimensional subset of information space associated with opinions. Using insights from animal collective behavior, we are able to simulate and identify behavior patterns that are similar to nomadic, flocking and stampeding patterns of animal groups. These behaviors have analogous manifestations in human interaction, emerging as solitary explorers, the fashion-conscious, and members of polarized echo chambers. We demonstrate that a small portion of nomadic agents that widely traverse belief space can disrupt a larger population of stampeding agents. Extending the model, we introduce the concept of Adversarial Herding, where bad actors can exploit properties of technologically mediated communication to artificially create self-sustaining runaway polarization. We call this condition the Pishkin Effect as it recalls the large scale buffalo stampedes that could be created by Native American hunters. We then discuss opportunities for system design that could leverage the ability to recognize these negative patterns, and discuss affordances that may disrupt the formation of natural and deliberate echo chambers.
• Self-organization has been an important concept within a number of disciplines, which Artificial Life (ALife) also has heavily utilized since its inception. The term and its implications, however, are often confusing or misinterpreted. In this work, we provide a mini-review of self-organization and its relationship with ALife, aiming at initiating discussions on this important topic with the interested audience. We first articulate some fundamental aspects of self-organization, outline its usage, and review its applications to ALife within its soft, hard, and wet domains. We also provide perspectives for further research.
• Real-time strategy games have been an important field of game artificial intelligence in recent years. This paper presents a reinforcement learning and curriculum transfer learning method to control multiple units in StarCraft micromanagement. We define an efficient state representation, which breaks down the complexity caused by the large state space in the game environment. Then a parameter sharing multi-agent gradient-descent Sarsa($\lambda$) (PS-MAGDS) algorithm is proposed to train the units. The learning policy is shared among our units to encourage cooperative behaviors. We use a neural network as a function approximator to estimate the action-value function, and propose a reward function to help units balance their move and attack. In addition, a transfer learning method is used to extend our model to more difficult scenarios, which accelerates the training process and improves the learning performance. In small scale scenarios, our units successfully learn to combat and defeat the built-in AI with 100% win rates. In large scale scenarios, a curriculum transfer learning method is used to progressively train a group of units, and shows superior performance over some baseline methods in target scenarios. With reinforcement learning and curriculum transfer learning, our units are able to learn appropriate strategies in StarCraft micromanagement scenarios.
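For readers unfamiliar with gradient-descent Sarsa($\lambda$), a minimal sketch of a single update step may help. It assumes a linear action-value function $Q(s,a) = w \cdot \phi(s,a)$ rather than the neural network used in the paper, and the function name, feature vectors and hyperparameter values are all hypothetical.

```python
def sarsa_lambda_step(w, e, phi, phi_next, reward,
                      alpha=0.5, gamma=0.9, lam=0.8):
    """One gradient-descent Sarsa(lambda) update for linear Q(s, a) = w . phi(s, a).

    w        -- weight vector (shared by all units under parameter sharing)
    e        -- eligibility trace, same length as w
    phi      -- features of the current (state, action) pair
    phi_next -- features of the next (state, action) pair chosen by the policy
    """
    q = sum(wi * pi for wi, pi in zip(w, phi))
    q_next = sum(wi * pi for wi, pi in zip(w, phi_next))
    delta = reward + gamma * q_next - q  # on-policy TD error
    # Accumulating trace: decay the old trace, then add the gradient of Q,
    # which for a linear approximator is simply phi.
    e = [gamma * lam * ei + pi for ei, pi in zip(e, phi)]
    w = [wi + alpha * delta * ei for wi, ei in zip(w, e)]
    return w, e


# Toy usage with two hand-picked one-hot features (illustrative only).
w, e = [0.0, 0.0], [0.0, 0.0]
w, e = sarsa_lambda_step(w, e, phi=[1.0, 0.0], phi_next=[0.0, 1.0], reward=1.0)
print(w, e)
```

Under parameter sharing, every unit would call the same update on one common `w`, which is one plausible reading of how a shared policy encourages cooperative behavior.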
• The authors present an overview of a hierarchical framework for coordinating task- and motion-level operations in multirobot systems. Their framework is based on the idea of using simple temporal networks to simultaneously reason about precedence/causal constraints required for task-level coordination and simple temporal constraints required to take some kinematic constraints of robots into account. In the plan-generation phase, the framework provides a computationally scalable method for generating plans that achieve high-level tasks for groups of robots and take some of their kinematic constraints into account. In the plan-execution phase, the framework provides a method for absorbing an imperfect plan execution to avoid time-consuming re-planning in many cases. The authors use the multirobot path-planning problem as a case study to present the key ideas behind their framework for the long-term autonomy of multirobot systems.
• In many real-world settings, a team of agents must coordinate their behaviour while acting in a decentralised way. At the same time, it is often possible to train the agents in a centralised fashion in a simulated or laboratory setting, where global state information is available and communication constraints are lifted. Learning joint action-values conditioned on extra state information is an attractive way to exploit centralised learning, but the best strategy for then extracting decentralised policies is unclear. Our solution is QMIX, a novel value-based method that can train decentralised policies in a centralised end-to-end fashion. QMIX employs a network that estimates joint action-values as a complex non-linear combination of per-agent values that condition only on local observations. We structurally enforce that the joint-action value is monotonic in the per-agent values, which allows tractable maximisation of the joint action-value in off-policy learning, and guarantees consistency between the centralised and decentralised policies. We evaluate QMIX on a challenging set of StarCraft II micromanagement tasks, and show that QMIX significantly outperforms existing value-based multi-agent reinforcement learning methods.
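The monotonicity constraint can be illustrated with a toy mixer. The sketch below is not the QMIX architecture itself: the per-agent values and mixing weights are fixed, hypothetical numbers rather than learned quantities, and the only property it demonstrates is that non-negative mixing weights make the joint argmax coincide with each agent's individual argmax.

```python
import itertools
import math


def elu(x):
    """Strictly increasing activation, so it preserves monotonicity."""
    return x if x > 0 else math.exp(x) - 1.0


def mix(agent_qs, w1, b1, w2, b2):
    """Two-layer mixer; abs() forces every weight to be non-negative."""
    hidden = [elu(sum(abs(w) * q for w, q in zip(row, agent_qs)) + b)
              for row, b in zip(w1, b1)]
    return sum(abs(w) * h for w, h in zip(w2, hidden)) + b2


# Hypothetical per-agent action values (2 agents, 3 actions each).
agent_q = [[0.2, 1.5, -0.3],
           [0.9, -1.0, 0.4]]
# Hypothetical mixer parameters; signs do not matter because of abs().
W1, B1 = [[0.5, -1.2], [2.0, 0.3]], [0.1, -0.4]
W2, B2 = [-0.7, 1.1], 0.05


def q_tot(joint_action):
    chosen = [agent_q[i][a] for i, a in enumerate(joint_action)]
    return mix(chosen, W1, B1, W2, B2)


# Centralised maximisation: enumerate every joint action.
joint_best = max(itertools.product(range(3), repeat=2), key=q_tot)
# Decentralised maximisation: each agent picks its own greedy action.
greedy = tuple(max(range(3), key=lambda a, i=i: agent_q[i][a]) for i in range(2))

print(joint_best, greedy)
```

Because the mixer is increasing in each per-agent value, the exhaustive search over joint actions is unnecessary; the per-agent greedy choice already maximises the joint value, which is what makes decentralised execution consistent with centralised training.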
• Apr 02 2018 cs.MA cs.CY arXiv:1803.11457v1
The emerging field of morphogenetic engineering proposes to design complex heterogeneous systems focused on the paradigm of emergence. Necessarily at the interface of disciplines, its concepts can be defined through multiple viewpoints. This contribution aims at linking a co-evolutionary perspective on such systems with morphogenesis, and therein at bringing a novel conceptual approach to the bottom-up design of complex systems which allows co-evolutive processes to be fully considered. We first situate systems of interest at the interface between biological and social systems, and introduce a multidisciplinary perspective on co-evolution. Building on Holland's signals and boundaries theory of complex adaptive systems, we finally suggest that morphogenetic systems are equivalent to combinations of co-evolutionary niches. This introduces an entry to morphogenetic engineering focused on co-evolution between components of a system. Applications can be found in a broad range of subjects, which we illustrate with the example of planning in territorial systems, suggesting an extended scope for the relevance of morphogenetic engineering concepts.
• This paper develops an optimal relative output-feedback based solution to the containment control problem of linear heterogeneous multi-agent systems. A distributed optimal control protocol is presented for the followers that not only assures that their outputs fall into the convex hull of the leaders' output (i.e., the desired or safe region), but also optimizes their transient performance. The proposed optimal control solution is composed of a feedback part, depending on the followers' state, and a feed-forward part, depending on the convex hull of the leaders' state. To comply with most real-world applications, the feedback and feed-forward states are assumed to be unavailable and are estimated using two distributed observers. That is, since the followers cannot directly sense their absolute states, a distributed observer is designed that uses only relative output measurements with respect to their neighbors (measured, for example, by range sensors in robotics) and the information broadcast by their neighbors to estimate their states. Moreover, another adaptive distributed observer is designed that uses exchange of information between followers over a communication network to estimate the convex hull of the leaders' state. The proposed observer relaxes the restrictive requirement that all followers have complete knowledge of the leaders' dynamics. An off-policy reinforcement learning algorithm on an actor-critic structure is next developed to solve the optimal containment control problem online, using relative output measurements and without requiring all followers to know the leaders' dynamics. Finally, the theoretical results are verified by numerical simulations.
• The amount of personal data collected in our everyday interactions with connected devices offers great opportunities for innovative services fueled by machine learning, as well as raises serious concerns for the privacy of individuals. In this paper, we propose a massively distributed protocol for a large set of users to privately compute averages over their joint data, which can then be used to learn predictive models. Our protocol can find a solution of arbitrary accuracy, does not rely on a third party and preserves the privacy of users throughout the execution in both the honest-but-curious and malicious adversary models. Specifically, we prove that the information observed by the adversary (the set of malicious users) does not significantly reduce the uncertainty in its prediction of private values compared to its prior belief. The level of privacy protection depends on a quantity related to the Laplacian matrix of the network graph and generally improves with the size of the graph. Furthermore, we design a verification procedure which offers protection against malicious users joining the service with the goal of manipulating the outcome of the algorithm.
• Applications in robotics, such as multi-robot target tracking, involve the execution of information acquisition tasks by teams of mobile robots. However, in failure-prone or adversarial environments, robots get attacked, their communication channels get jammed, and their sensors fail, resulting in the withdrawal of robots from the collective task, and, subsequently, the inability of the remaining active robots to coordinate with each other. As a result, traditional design paradigms become insufficient and, in contrast, resilient designs against system-wide failures and attacks become important. In general, resilient design problems are hard, and even though they often involve objective functions that are monotone and (possibly) submodular, scalable approximation algorithms for their solution have been hitherto unknown. In this paper, we provide the first algorithm, enabling the following capabilities: minimal communication, i.e., the algorithm is executed by the robots based only on minimal communication between them, system-wide resiliency, i.e., the algorithm is valid for any number of denial-of-service attacks and failures, and provable approximation performance, i.e., the algorithm ensures for all monotone and (possibly) submodular objective functions a solution that is finitely close to the optimal. We support our theoretical analyses with simulated and real-world experiments, by considering an active information acquisition application scenario, namely, multi-robot target tracking.
• Mar 28 2018 cs.MA cs.SY math.OC arXiv:1803.08950v1
We consider a multi-agent framework for distributed optimization where each agent in the network has access to a local convex function and the collective goal is to achieve consensus on the parameters that minimize the sum of the agents' local functions. We propose an algorithm wherein each agent operates asynchronously and independently of the other agents in the network. When the local functions are strongly-convex with Lipschitz-continuous gradients, we show that a subsequence of the iterates at each agent converges to a neighbourhood of the global minimum, where the size of the neighbourhood depends on the degree of asynchrony in the multi-agent network. When the agents work at the same rate, convergence to the global minimizer is achieved. Numerical experiments demonstrate that Asynchronous Subgradient-Push can minimize the global objective faster than state-of-the-art synchronous first-order methods, is more robust to failing or stalling agents, and scales better with the network size.
• In this paper, we study the problem of distributed multi-agent optimization over a network, where each agent possesses a local cost function that is smooth and strongly convex. The global objective is to find a common solution that minimizes the average of all cost functions. Assuming agents only have access to unbiased estimates of the gradients of their local cost functions, we consider a distributed stochastic gradient tracking method. We show that, in expectation, the iterates generated by each agent are attracted to a neighborhood of the optimal solution, where they accumulate exponentially fast (under a constant step size choice). More importantly, the limiting (expected) error bounds on the distance of the iterates from the optimal solution decrease with the network size, which is a comparable performance to a centralized stochastic gradient algorithm. Numerical examples further demonstrate the effectiveness of the method.
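A small simulation can make the "attracted to a neighborhood of the optimal solution" claim concrete. The sketch below uses one common form of gradient tracking, which may differ in detail from the paper's method, on hypothetical scalar costs $f_i(x) = \tfrac{1}{2}(x - a_i)^2$ with Gaussian gradient noise; the ring network, mixing weights, step size and noise level are all assumptions chosen for illustration.

```python
import random


def dsgt(targets, weights, step=0.1, noise=0.1, iters=200, seed=0):
    """Distributed stochastic gradient tracking on f_i(x) = 0.5 * (x - a_i)^2.

    targets -- the a_i, one per agent (minimiser of the average cost is their mean)
    weights -- doubly stochastic mixing matrix of the communication graph
    """
    rng = random.Random(seed)
    n = len(targets)

    def noisy_grad(i, x):
        # Unbiased estimate of f_i'(x) = x - a_i.
        return (x - targets[i]) + rng.gauss(0.0, noise)

    x = [0.0] * n
    g = [noisy_grad(i, x[i]) for i in range(n)]
    y = g[:]  # tracker of the network-average gradient
    for _ in range(iters):
        # Local descent along the tracked direction, then neighbour averaging.
        z = [x[i] - step * y[i] for i in range(n)]
        x = [sum(weights[i][j] * z[j] for j in range(n)) for i in range(n)]
        # Tracker update: mix with neighbours, add the change in local gradient.
        g_new = [noisy_grad(i, x[i]) for i in range(n)]
        y = [sum(weights[i][j] * y[j] for j in range(n)) + g_new[i] - g[i]
             for i in range(n)]
        g = g_new
    return x


TARGETS = [1.0, 2.0, 3.0, 6.0]  # average-cost minimiser is 3.0
RING = [[0.50, 0.25, 0.00, 0.25],
        [0.25, 0.50, 0.25, 0.00],
        [0.00, 0.25, 0.50, 0.25],
        [0.25, 0.00, 0.25, 0.50]]

estimates = dsgt(TARGETS, RING)
print([round(v, 3) for v in estimates])
```

With a constant step size the agents do not converge exactly; they hover near the common minimiser, with a spread governed by the step size and the noise level.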
• When scheduling public works or events in a shared facility one needs to accommodate preferences of a population. We formalize this problem by introducing the notion of a collective schedule. We show how to extend fundamental tools from social choice theory---positional scoring rules, the Kemeny rule and the Condorcet principle---to collective scheduling. We study the computational complexity of finding collective schedules. We also experimentally demonstrate that optimal collective schedules can be found for instances with realistic sizes.
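As background for the rules being extended, the classical Kemeny rule on plain rankings can be sketched by brute force. This is the textbook social-choice version only, not the paper's collective-scheduling extension, and the three ballots over jobs `a`, `b`, `c` are hypothetical.

```python
from itertools import combinations, permutations


def kendall_tau(r1, r2):
    """Number of item pairs that the two rankings order differently."""
    pos1 = {item: i for i, item in enumerate(r1)}
    pos2 = {item: i for i, item in enumerate(r2)}
    return sum(1 for a, b in combinations(r1, 2)
               if (pos1[a] - pos1[b]) * (pos2[a] - pos2[b]) < 0)


def kemeny(votes):
    """Ranking minimising total Kendall tau distance to all votes.

    Exhaustive search over all permutations, so only usable for a handful
    of items.
    """
    items = sorted(votes[0])
    return min(permutations(items),
               key=lambda r: sum(kendall_tau(r, v) for v in votes))


# Each vote is one voter's preferred order of three jobs.
votes = [("a", "b", "c"), ("a", "c", "b"), ("b", "a", "c")]
best = kemeny(votes)
print(best, sum(kendall_tau(best, v) for v in votes))
```

The exhaustive search grows factorially with the number of items, which gives some intuition for why the computational complexity of finding such schedules is a question worth studying.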
• The spread of autonomous systems into safety-critical areas has increased the demand for their formal verification, not only due to stronger certification requirements but also to public uncertainty over these new technologies. However, the complex nature of such systems, for example, the intricate combination of discrete and continuous aspects, ensures that whole system verification is often infeasible. This motivates the need for novel analysis approaches that modularise the problem, allowing us to restrict our analysis to one particular aspect of the system while abstracting away from others. For instance, while verifying the real-time properties of an autonomous system we might hide the details of the internal decision-making components. In this paper we describe verification of a range of properties across distinct dimensions on a practical hybrid agent architecture. This allows us to verify the autonomous decision-making, real-time aspects, and spatial aspects of an autonomous vehicle platooning system. This modular approach also illustrates how both algorithmic and deductive verification techniques can be applied for the analysis of different system subcomponents.
• Making decisions is a great challenge in distributed autonomous environments due to enormous state spaces and uncertainty. Many online planning algorithms rely on statistical sampling to avoid searching the whole state space, while still being able to make acceptable decisions. However, planning often has to be performed under strict computational constraints making online planning in multi-agent systems highly limited, which could lead to poor system performance, especially in stochastic domains. In this paper, we propose Emergent Value function Approximation for Distributed Environments (EVADE), an approach to integrate global experience into multi-agent online planning in stochastic domains to consider global effects during local planning. For this purpose, a value function is approximated online based on the emergent system behaviour by using methods of reinforcement learning. We empirically evaluated EVADE with two statistical multi-agent online planning algorithms in a highly complex and stochastic smart factory environment, where multiple agents need to process various items at a shared set of machines. Our experiments show that EVADE can effectively improve the performance of multi-agent online planning while offering efficiency w.r.t. the breadth and depth of the planning process.
• This chapter discusses the interplay between structure and dynamics in complex networks. Given a particular network with an endowed dynamics, our goal is to find partitions aligned with the dynamical process acting on top of the network. We thus aim to gain a reduced description of the system that takes into account both its structure and dynamics. In the first part, we introduce the general mathematical setup for the types of dynamics we consider throughout the chapter. We provide two guiding examples, namely consensus dynamics and diffusion processes (random walks), motivating their connection to social network analysis, and provide a brief discussion on the general dynamical framework and its possible extensions. In the second part, we focus on the influence of graph structure on the dynamics taking place on the network, focusing on three concepts that allow us to gain insight into this notion. First, we describe how time scale separation can appear in the dynamics on a network as a consequence of graph structure. Second, we discuss how the presence of particular symmetries in the network gives rise to invariant dynamical subspaces that can be precisely described by graph partitions. Third, we show how this dynamical viewpoint can be extended to study dynamics on networks with signed edges, which allows us to discuss connections to concepts in social network analysis, such as structural balance. In the third part, we discuss how to use dynamical processes unfolding on the network to detect meaningful network substructures. We then show how such dynamical measures can be related to seemingly different algorithms for community detection and coarse-graining proposed in the literature. We conclude with a brief summary and highlight interesting open future directions.
• With the emergence of autonomous vehicles, it is important to understand their impact on the transportation system. However, conventional traffic simulations are time-consuming. In this paper, we introduce an analytical traffic model for unmanaged intersections accounting for microscopic vehicle interactions. The macroscopic property, i.e., delay at the intersection, is modeled as an event-driven stochastic dynamic process, whose dynamics encode the microscopic vehicle behaviors. The distribution of macroscopic properties can be obtained through either direct analysis or event-driven simulation. They are more efficient than conventional (time-driven) traffic simulation, and capture more microscopic details compared to conventional macroscopic flow models. We illustrate the efficiency of this method by delay analyses under two different policies at a two-lane intersection. The proposed model allows for 1) efficient and effective comparison among different policies, 2) policy optimization, 3) traffic prediction, and 4) system optimization (e.g., infrastructure and protocol).
• We studied the long-term dynamics of evolutionary Swarm Chemistry by extending the simulation length ten-fold compared to earlier work and by developing and using a new automated object harvesting method. Both macroscopic dynamics and microscopic object features were characterized and tracked using several measures. Results showed that the evolutionary dynamics tended to settle down into a stable state after the initial transient period, and that the extent of environmental perturbations also affected the evolutionary trends substantially. In the meantime, the automated harvesting method successfully produced a huge collection of spontaneously evolved objects, revealing the system's autonomous creativity at an unprecedented scale.
• Hierarchical Modular Reinforcement Learning (HMRL) consists of two-layered learning, where Profit Sharing plans a prey position in the higher layer and Q-learning trains the state-actions toward the target in the lower layer. In this paper, we extend HMRL to the multi-target problem by taking the distance between targets into consideration. A function called the 'AT field' estimates the interests of an agent according to the distance between two agents and the advantage or disadvantage of the other agent. Moreover, knowledge related to state-action rules is extracted by C4.5, and the action in a given situation is decided using the acquired knowledge. To verify the effectiveness of the proposed method, some experimental results are reported.
• The concept of truth, as a public good, is the product of a collective understanding that emerges from a complex network of social interactions. The recent impact of social networks on shaping the perception of truth in the political arena shows how such perception is corroborated and established collectively by online users. However, investigative journalism for discovering truth is a costly option, given the vast spectrum of online information. In some cases, both journalists and online users choose not to investigate the authenticity of the news they receive, because they assume other actors in the network have borne the cost of validation. Therefore, the new phenomenon of "fake news" has emerged within the context of social networks. Online social networks, similarly to Systems of Systems, give rise to emergent properties, which make authentication processes difficult, given the availability of multiple sources. In this study, we show how this conflict can be modeled as a volunteer's dilemma. We also show how public contribution through news subscription (shared rewards) can impact the dominance of truth over fake news in the network.
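The volunteer's dilemma framing can be made concrete with the textbook symmetric game, which is simpler than the paper's model and ignores subscriptions. Assume $n$ players, each of whom can verify a news item at cost $c$; if at least one verifies, everyone gains benefit $b > c$. In the symmetric mixed equilibrium each player abstains with probability $q = (c/b)^{1/(n-1)}$. The function names and parameter values below are illustrative.

```python
def stay_out_probability(n, benefit, cost):
    """Equilibrium probability that a single player does NOT volunteer."""
    if n < 2 or not 0 < cost < benefit:
        raise ValueError("need n >= 2 and 0 < cost < benefit")
    return (cost / benefit) ** (1.0 / (n - 1))


def nobody_volunteers(n, benefit, cost):
    """Probability that no player verifies, so the public good is lost."""
    return stay_out_probability(n, benefit, cost) ** n


n, b, c = 3, 1.0, 0.25
q = stay_out_probability(n, b, c)

# Indifference condition that defines the equilibrium: volunteering yields
# b - c for sure; abstaining yields b only if one of the other n - 1 players
# volunteers.
payoff_volunteer = b - c
payoff_abstain = b * (1 - q ** (n - 1))

print(q, nobody_volunteers(n, b, c), payoff_volunteer, payoff_abstain)
print(nobody_volunteers(10, b, c))
```

In this simple model the chance that nobody verifies rises as the group grows, which is the free-riding effect the abstract describes when actors assume others have already paid the cost of validation.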
• In this letter we discuss cost optimization of sensor networks monitoring structurally full-rank systems under a distributed observability constraint. Using structured systems theory, the problem is relaxed into two subproblems: (i) sensing cost optimization and (ii) networking cost optimization. Both problems are reformulated as combinatorial optimization problems. The sensing cost optimization is shown to have a polynomial-order solution. The networking cost optimization is shown to be NP-hard in general, but has a polynomial-order solution under specific conditions. A 2-approximation polynomial-order relaxation is provided for general networking cost optimization, which is applicable in large-scale system monitoring.
• The behavior of heterogeneous multi-agent systems is studied when the coupling matrices are possibly all different and/or singular (that is, their rank is less than the system dimension). Rank-deficient coupling allows exchange of limited state information, which is suitable for the study of output coupling in multi-agent systems. We present a coordinate change that transforms the heterogeneous multi-agent system into a singularly perturbed form. The slow dynamics is still a reduced-order multi-agent system consisting of a weighted average of the vector fields of all agents, and some sub-dynamics of agents. The weighted average is an emergent dynamics, which we call a blended dynamics. By analyzing or synthesizing the blended dynamics, one can predict or design the behavior of the heterogeneous multi-agent system when the coupling gain is sufficiently large. For this result, stability of the blended dynamics is required. Since stability of the individual agents is not required, stability of the blended dynamics is the outcome of trading stability among the agents. It can be seen that, under stability of the blended dynamics, the initial conditions of individual agents are forgotten as time goes on, and thus the behavior of the synthesized multi-agent system is initialization-free and suitable for plug-and-play operation. As a showcase, we apply the proposed tool to two application problems: distributed state estimation for linear systems, and practical synchronization of heterogeneous Van der Pol oscillators (for which phase cohesiveness is achieved). We also present the underlying intuition for two more applications: estimation of the number of nodes in a network, and a problem of distributed optimization.
• We consider a scenario consisting of a set of heterogeneous mobile agents located at a depot, and a set of tasks dispersed over a geographic area. The agents are partitioned into different types. The tasks are partitioned into specialized tasks that can only be done by agents of a certain type, and generic tasks that can be done by any agent. The distances between each pair of tasks are specified and satisfy the triangle inequality. Given this scenario, we address the problem of allocating these tasks among the available agents (subject to type compatibility constraints) while minimizing the maximum cost for any agent to tour its allocation and return to the depot. This problem is NP-hard, and we give a three-phase algorithm for it that provides a 5-factor approximation, regardless of the total number of agents and the number of agents of each type. We also show that in the special case where there is only one agent of each type, the algorithm has an approximation factor of 4.
• Groups of humans are often able to find ways to cooperate with one another in complex, temporally extended social dilemmas. Models based on behavioral economics are only able to explain this phenomenon for unrealistic stateless matrix games. Recently, multi-agent reinforcement learning has been applied to generalize social dilemma problems to temporally and spatially extended Markov games. However, this has not yet generated an agent that learns to cooperate in social dilemmas as humans do. A key insight is that many, but not all, human individuals have inequity averse social preferences. This promotes a particular resolution of the matrix game social dilemma wherein inequity-averse individuals are personally pro-social and punish defectors. Here we extend this idea to Markov games and show that it promotes cooperation in several types of sequential social dilemma, via a profitable interaction with policy learnability. In particular, we find that inequity aversion improves temporal credit assignment for the important class of intertemporal social dilemmas. These results help explain how large-scale cooperation may emerge and persist.
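The abstract above does not state how inequity aversion is quantified. A common formalization in behavioral economics is the Fehr-Schmidt utility, sketched below in Python as an assumption rather than as the paper's exact reward. Each agent's own reward is reduced by a penalty `alpha` for disadvantageous inequity (others earn more) and a penalty `beta` for advantageous inequity (the agent earns more). The function name and the parameter values in the usage example are hypothetical.

```python
from typing import Sequence


def inequity_averse_utility(
    rewards: Sequence[float], i: int, alpha: float, beta: float
) -> float:
    """Fehr-Schmidt utility of agent i (assumed form, not taken from the abstract).

    U_i = r_i - alpha/(n-1) * sum_j max(r_j - r_i, 0)
              - beta/(n-1)  * sum_j max(r_i - r_j, 0)
    """
    n = len(rewards)
    if n < 2:
        raise ValueError("need at least two agents")
    own = rewards[i]
    others = [r for j, r in enumerate(rewards) if j != i]
    envy = sum(max(r - own, 0.0) for r in others)   # others are ahead of agent i
    guilt = sum(max(own - r, 0.0) for r in others)  # agent i is ahead of others
    return own - alpha * envy / (n - 1) - beta * guilt / (n - 1)


if __name__ == "__main__":
    rewards = [3.0, 1.0, 1.0]  # hypothetical per-agent rewards
    for agent in range(len(rewards)):
        print(agent, inequity_averse_utility(rewards, agent, alpha=5.0, beta=0.05))
```

With a large `alpha`, an agent that falls behind the others sees a sharply reduced utility, which is the kind of signal that can motivate punishing defectors. In a Markov game such a term would be applied to the per-step or accumulated rewards, and how that is done is a design choice not covered by this sketch.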
• Chu Spaces and Channel Theory are well established areas of investigation in the general context of category theory. We review a range of examples and applications of these methods in logic and computer science, including Formal Concept Analysis, distributed systems and ontology development. We then employ these methods to describe human object perception, beginning with the construction of uncategorized object files and proceeding through categorization, individual object identification and the tracking of object identity through time. We investigate the relationship between abstraction and mereological categorization, particularly as these affect object identity tracking. This we accomplish in terms of information flow that is semantically structured in terms of local logics, while at the same time this framework also provides an inferential mechanism towards identification and perception. We show how a mereotopology naturally emerges from the representation of classifications by simplicial complexes, and briefly explore the emergence of geometric relations and interactions between objects.
• Demand Responsive Shared Transport (DRST) services take advantage of Information and Communication Technologies (ICT) to provide on-demand transport, allowing users to book a ride on a shared vehicle in real time. In this paper, an agent-based model (ABM) is presented to test the feasibility of different service configurations in a real context. First results show the impact of route choice strategy on the system performance.
• This paper presents a distributed position synchronization strategy that also preserves the initial communication links for single-integrator multi-agent systems with time-varying delays. The strategy employs a coordinating proportional control derived from a specific type of potential energy, augmented with damping injected through a dynamic filter. The injected damping maintains all agents within the communication distances of their neighbours, and asymptotically stabilizes the multi-agent system, in the presence of time delays. Regarding the closed-loop single-integrator multi-agent system as a double-integrator system suggests an extension of the proposed strategy to connectivity-preserving coordination of Euler-Lagrange networks with time-varying delays. Lyapunov stability analysis and simulation results validate the two designs.
• Distributed model predictive control (MPC) has proven to be a successful method for regulating the operation of large-scale networks of constrained dynamical systems. This paper is concerned with cooperative distributed MPC, in which the decision actions of the systems are usually derived from the solution of a system-wide optimization problem. However, formulating and solving such large-scale optimization problems is often a hard task which requires extensive information communication among the individual systems and fails to address privacy concerns in the network. Hence, the main challenge is to design decision policies with a prescribed structure so that the resulting system-wide optimization problem admits a loosely coupled structure and is amenable to distributed computation algorithms. In this paper, we propose a decentralized problem synthesis scheme which only requires each system to communicate to neighboring systems the sets that bound its state evolution. The proposed method alleviates privacy concerns since this limited communication scheme does not reveal the exact characteristics of the dynamics within each system. In addition, it enables a distributed computation of the solution, making our method highly scalable. We demonstrate in a number of numerical studies, inspired by engineering and finance, the efficacy of the proposed approach, which leads to solutions that closely approximate those obtained by the centralized formulation at only a fraction of the computational effort.
• In this paper, we propose a distributed model predictive control (DMPC) scheme for linear time-invariant constrained systems which admit a separable structure. To exploit the merits of distributed computation algorithms, the stabilizing terminal controller, value function and invariant terminal set of the DMPC optimization problem need to respect the loosely coupled structure of the system. Although existing methods in the literature address this task, they typically decouple the synthesis of terminal controllers and value functions from that of terminal sets. In addition, these approaches do not explicitly consider the effect of the current state of the system in the synthesis process. These limitations can lead to poor performance of the resulting DMPC scheme, since it may admit small or even empty terminal sets. Unlike other approaches, this paper presents a unified framework that encapsulates the synthesis of both the stabilizing terminal controller and the invariant terminal set into the DMPC formulation. Conditions for Lyapunov stability and invariance are imposed in the synthesis problem in a way that allows the value function and invariant terminal set to admit the desired distributed structure. We illustrate the effectiveness of the proposed method on several examples including a benchmark spring-mass-damper problem.
• This paper is about a new model of opinion dynamics with opinion-dependent connectivity. We assume that agents update their opinions asynchronously and that each agent's new opinion depends on the opinions of the $k$ agents that are closest to it. We show that the resulting dynamics is substantially different from comparable models in the literature, such as bounded-confidence models. We study the equilibria of the dynamics, observing that they are robust to perturbations caused by the introduction of new agents. We also prove that if the number of agents $n$ is smaller than $2k$, the dynamics converge to consensus. This condition is only sufficient.
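The update rule described in the abstract above can be illustrated with a small simulation. This Python sketch rests on assumptions the abstract does not confirm: opinions are real numbers, an agent counts itself among its `k` closest agents, and the new opinion is the plain average of those `k` opinions. The function names `knn_update` and `simulate` are hypothetical.

```python
import random
from typing import List, Sequence


def knn_update(opinions: Sequence[float], i: int, k: int) -> float:
    """New opinion of agent i: mean of the k opinions closest to its own.

    Assumption: agent i itself (distance 0) is one of the k closest agents.
    """
    if not 1 <= k <= len(opinions):
        raise ValueError("need 1 <= k <= number of agents")
    nearest = sorted(opinions, key=lambda x: abs(x - opinions[i]))[:k]
    return sum(nearest) / k


def simulate(opinions: Sequence[float], k: int, steps: int, seed: int = 0) -> List[float]:
    """Asynchronous dynamics: at each step one randomly chosen agent updates."""
    rng = random.Random(seed)
    ops = list(opinions)
    for _ in range(steps):
        i = rng.randrange(len(ops))
        ops[i] = knn_update(ops, i, k)
    return ops


if __name__ == "__main__":
    rng = random.Random(1)
    start = [rng.random() for _ in range(7)]  # n = 7 agents, hypothetical start
    final = simulate(start, k=4, steps=2000)  # n < 2k in this run
    print("initial spread:", max(start) - min(start))
    print("final spread:  ", max(final) - min(final))
```

Because every update is an average of existing opinions, the opinions never leave the interval spanned by the initial values. Whether this toy rule reproduces the consensus result for $n < 2k$ depends on how closely it matches the paper's model, which the abstract alone does not settle.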