# Multiagent Systems (cs.MA)

• In this paper, we study the optimal convergence rate for distributed convex optimization problems in networks. We model the communication restrictions imposed by the network as a set of affine constraints and provide optimal complexity bounds for four different setups, namely: the function $F(\xb) \triangleq \sum_{i=1}^{m}f_i(\xb)$ is strongly convex and smooth, either strongly convex or smooth or just convex. Our results show that Nesterov's accelerated gradient descent on the dual problem can be executed in a distributed manner and obtains the same optimal rates as in the centralized version of the problem (up to constant or logarithmic factors) with an additional cost related to the spectral gap of the interaction matrix. Finally, we discuss some extensions to the proposed setup such as proximal friendly functions, time-varying graphs, improvement of the condition numbers.
• The development and deployment of Autonomous Vehicles (AVs) on our roads is not only realistic in the near future but can also bring significant benefits. In particular, it can potentially solve several problems relating to vehicles and traffic, for instance: (i) possible reduction of traffic congestion, with the consequence of improved fuel economy and reduced driver inactivity; (ii) possible reduction in the number of accidents, assuming that an AV can minimise the human errors that often cause traffic accidents; and (iii) increased ease of parking, especially when one considers the potential for shared AVs. In order to deploy an AV there are significant steps that must be completed in terms of hardware and software. As expected, software components play a key role in the complex AV system and so, at least for safety, we should assess the correctness of these components. In this paper, we are concerned with the high-level software component(s) responsible for the decisions in an AV. We intend to model an AV capable of navigation; obstacle avoidance; obstacle selection (when a crash is unavoidable) and vehicle recovery, etc, using a rational agent. To achieve this, we have established the following stages. First, the agent plans and actions have been implemented within the Gwendolen agent programming language. Second, we have built a simulated automotive environment in the Java language. Third, we have formally specified some of the required agent properties through LTL formulae, which are then formally verified with the AJPF verification tool. Finally, within the MCAPL framework (which comprises all the tools used in previous stages) we have obtained formal verification of our AV agent in terms of its specific behaviours. For example, the agent plans responsible for selecting an obstacle with low potential damage, instead of a higher damage obstacle (when possible) can be formally verified within MCAPL. We must emphasise that the major goal (of our present approach) lies in the formal verification of agent plans, rather than evaluating real-world applications. For this reason we utilised a simple matrix representation concerning the environment used by our agent.
• We discuss the connection between computational social choice (comsoc) and computational complexity. We stress the work so far on, and urge continued focus on, two less-recognized aspects of this connection. Firstly, this is very much a two-way street: Everyone knows complexity classification is used in comsoc, but we also highlight benefits to complexity that have arisen from its use in comsoc. Secondly, more subtle, less-known complexity tools often can be very productively used in comsoc.
• The goal of this paper is to present an end-to-end, data-driven framework to control Autonomous Mobility-on-Demand systems (AMoD, i.e. fleets of self-driving vehicles). We first model the AMoD system using a time-expanded network, and present a formulation that computes the optimal rebalancing strategy (i.e., preemptive repositioning) and the minimum feasible fleet size for a given travel demand. Then, we adapt this formulation to devise a Model Predictive Control (MPC) algorithm that leverages short-term demand forecasts based on historical data to compute rebalancing strategies. We test the end-to-end performance of this controller with a state-of-the-art LSTM neural network to predict customer demand and real customer data from DiDi Chuxing: we show that this approach scales very well for large systems (indeed, the computational complexity of the MPC algorithm does not depend on the number of customers and of vehicles in the system) and outperforms state-of-the-art rebalancing strategies by reducing the mean customer wait time by up to to 89.6%.
• Decentralized control of robots has attracted huge research interests. However, some of the research used unrealistic assumptions without collision avoidance. This report focuses on the collision-free control for multiple robots in both complete coverage and search tasks in 2D and 3D areas which are arbitrary unknown. All algorithms are decentralized as robots have limited abilities and they are mathematically proved. The report starts with the grid selection in the two tasks. Grid patterns simplify the representation of the area and robots only need to move straightly between neighbor vertices. For the 100% complete 2D coverage, the equilateral triangular grid is proposed. For the complete coverage ignoring the boundary effect, the grid with the fewest vertices is calculated in every situation for both 2D and 3D areas. The second part is for the complete coverage in 2D and 3D areas. A decentralized collision-free algorithm with the above selected grid is presented driving robots to sections which are furthest from the reference point. The area can be static or expanding, and the algorithm is simulated in MATLAB. Thirdly, three grid-based decentralized random algorithms with collision avoidance are provided to search targets in 2D or 3D areas. The number of targets can be known or unknown. In the first algorithm, robots choose vacant neighbors randomly with priorities on unvisited ones while the second one adds the repulsive force to disperse robots if they are close. In the third algorithm, if surrounded by visited vertices, the robot will use the breadth-first search algorithm to go to one of the nearest unvisited vertices via the grid. The second search algorithm is verified on Pioneer 3-DX robots. The general way to generate the formula to estimate the search time is demonstrated. Algorithms are compared with five other algorithms in MATLAB to show their effectiveness.
• Due to the complexity of the natural world, a programmer cannot foresee all possible situations a connected and autonomous vehicle (CAV) will face during its operation, and hence, CAVs will need to learn to make decisions autonomously. Due to the sensing of its surroundings and information exchanged with other vehicles and road infrastructure a CAV will have access to large amounts of useful data. This paper investigates a data driven driving policy learning framework through an agent based learning. A reinforcement learning framework is presented in the paper, which simulates the self-evolution of a CAV over its lifetime. The results indicated that overtime the CAVs are able to learn useful policies to avoid crashes and achieve its objectives in more efficient ways. Vehicle to vehicle communication in particular, enables additional useful information to be acquired by CAVs, which in turn enables CAVs to learn driving policies more efficiently. The simulation results indicate that while a CAV can learn to make autonomous decision V2V communication of information improves this capability. The future work will investigate complex driving policies such as roundabout negotiations, cooperative learning between CAVs and deep reinforcement learning to traverse larger state spaces.
• In this paper, we conduct an empirical study on discovering the ordered collective dynamics obtained by a population of artificial intelligence (AI) agents. Our intention is to put AI agents into a simulated natural context, and then to understand their induced dynamics at the population level. In particular, we aim to verify if the principles developed in the real world could also be used in understanding an artificially-created intelligent population. To achieve this, we simulate a large-scale predator-prey world, where the laws of the world are designed by only the findings or logical equivalence that have been discovered in nature. We endow the agents with the intelligence based on deep reinforcement learning, and scale the population size up to millions. Our results show that the population dynamics of AI agents, driven only by each agent's individual self interest, reveals an ordered pattern that is similar to the Lotka-Volterra model studied in population biology. We further discover the emergent behaviors of collective adaptations in studying how the agents' grouping behaviors will change with the environmental resources. Both of the two findings could be explained by the self-organization theory in nature.
• We introduce an axiomatic approach to group recommendations, in line of previous work on the axiomatic treatment of trust-based recommendation systems, ranking systems, and other foundational work on the axiomatic approach to internet mechanisms in social choice settings. In group recommendations we wish to recommend to a group of agents, consisting of both opinionated and undecided members, a joint choice that would be acceptable to them. Such a system has many applications, such as choosing a movie or a restaurant to go to with a group of friends, recommending games for online game players, & other communal activities. Our method utilizes a given social graph to extract information on the undecided, relying on the agents influencing them. We first show that a set of fairly natural desired requirements (a.k.a axioms) leads to an impossibility, rendering mutual satisfaction of them unreachable. However, we also show a modified set of axioms that fully axiomatize a group variant of the random-walk recommendation system, expanding a previous result from the individual recommendation case.
• Agent-based IoT applications have recently been proposed in several domains, such as health care, smart cities and agriculture. Deploying these applications in specific settings has been very challenging for many reasons including the complex static and dynamic variability of the physical devices such as sensors and actuators, the software application behavior and the environment in which the application is embedded. In this paper, we propose a self-configurable IoT agent approach based on feedback-evaluative machine-learning. The approach involves: i) a variability model of IoT agents; ii) generation of sets of customized agents; iii) feedback evaluative machine learning; iv) modeling and composition of a group of IoT agents; and v) a feature-selection method based on manual and automatic feedback.
• Multi-agent Systems (MASs) have found a variety of industrial applications from economics to robotics, owing to their high adaptability, scalability and applicability. However, with the increasing complexity of MASs, multi-agent control has become a challenging problem to solve. Among different approaches to deal with this complex problem, game theoretic learning recently has received researchers' attention as a possible solution. In such learning scheme, by playing a game, each agent eventually discovers a solution on its own. The main focus of this paper is on enhancement of two types of game-theoretic learning algorithms: log linear learning and reinforcement learning. Each algorithm proposed in this paper, relaxes and imposes different assumptions to fit a class of MAS problems. Numerical experiments are also conducted to verify each algorithm's robustness and performance.
• A work zone bottleneck in a roadway network can cause traffic delays, emissions and safety issues. Accurate measurement and prediction of work zone travel time can help travelers make better routing decisions and therefore mitigate its impact. Historically, data used for travel time analyses comes from fixed loop detectors, which are expensive to install and maintain. With connected vehicle technology, such as Vehicle-to-Infrastructure, portable roadside unit (RSU) can be located in and around a work zone segment to communicate with the vehicles and collect traffic data. A PARAMICS simulation model for a prototypical freeway work zone in a connected vehicle environment was built to test this idea using traffic demand data from NY State Route 104. For the simulation, twelve RSUs were placed along the work zone segment and sixteen variables were extracted from the simulation results to explore travel time estimation and prediction. For the travel time analysis, four types of models were constructed, including linear regression, multivariate adaptive regression splines (MARS), stepwise regression and elastic net. The results show that the modeling approaches under consideration have similar performance in terms of the Root of Mean Square Error (RMSE), which provides an opportunity for model selection based on additional factors including the number and locations of the RSUs according to the significant variables identified in the various models. Among the four approaches, the stepwise regression model only needs variables from two RSUs: one placed sufficiently upstream of the work zone and one at the end of the work zone.
• In this paper, we investigate the problem of anti-coordination under rationality constraints in ad-hoc resource allocation settings. Inspired by human behavior, we propose a framework (CA3NONY) that enables fast convergence to efficient and fair allocations based on a simple convention of courtesy. We prove that following such convention induces a strategy which constitutes an approximate subgame-perfect equilibrium of the repeated resource allocation game with discounting. Simulation results highlight the effectiveness of CA3NONY as compared to state-of-the-art bandit algorithms, since it achieves more than two orders of magnitude faster convergence, higher efficiency, fairness, and average payoff for the agents.
• Most of the knowledge Representation formalisms developed for representing prescriptive norms can be categorized as either suitable for representing either low level or high level norms.We argue that low level norm representations do not advance the cause of autonomy in agents in the sense that it is not the agent itself that determines the normative position it should be at a particular time, on the account of a more general rule. In other words an agent on some external system for a nitty gritty prescriptions of its obligations and prohibitions. On the other hand, high level norms which have an explicit description of a norm's precondition and have some form of implication, do not as they exist in the literature do not support generalized inferences about violation like low level norm representations do. This paper presents a logical formalism for the representation of high level norms in open societies that enable violation inferences that detail the situation in which the norm violation took place and the identity of the norm violation. Norms are formalized as logic programs whose heads specify what an agent is obliged or permitted to do when a situation arises and within what time constraint of the situation.Each norm is also assigned an identity using some reification scheme. The body of each logic program describes the nature of the situation in which the agent is expected to act or desist from acting. This kind of violation is novel in the literature.
• Streets are large, diverse, and used for several (and possibly conflicting) transport modalities as well as social and cultural activities. Proper planning is essential and requires data. Manually fabricating data that represent streets (street reconstruction) is error-prone and time consuming. Automatising street reconstruction is a challenge because of the diversity, size, and scale of the details (few centimetres for cornerstone) required. The state-of-the-art focuses on roads (no context, no urban features) and is strongly determined by each application (simulation, visualisation, planning). We propose a unified framework that works on real Geographic Information System (GIS) data and uses a strong, yet simple modelling hypothesis when possible to robustly model streets at the city level or street level. Our method produces a coherent street-network model containing topological traffic information, road surface and street objects. We demonstrate the robustness and genericity of our method by reconstructing the entire city of Paris streets and exploring other similar reconstruction (airport driveway).
• Conversational agents are exploding in popularity. However, much work remains in the area of non goal-oriented conversations, despite significant growth in research interest over recent years. To advance the state of the art in conversational AI, Amazon launched the Alexa Prize, a 2.5-million dollar university competition where sixteen selected university teams built conversational agents to deliver the best social conversational experience. Alexa Prize provided the academic community with the unique opportunity to perform research with a live system used by millions of users. The subjectivity associated with evaluating conversations is key element underlying the challenge of building non-goal oriented dialogue systems. In this paper, we propose a comprehensive evaluation strategy with multiple metrics designed to reduce subjectivity by selecting metrics which correlate well with human judgement. The proposed metrics provide granular analysis of the conversational agents, which is not captured in human ratings. We show that these metrics can be used as a reasonable proxy for human judgment. We provide a mechanism to unify the metrics for selecting the top performing agents, which has also been applied throughout the Alexa Prize competition. To our knowledge, to date it is the largest setting for evaluating agents with millions of conversations and hundreds of thousands of ratings from users. We believe that this work is a step towards an automatic evaluation process for conversational AIs.
• Dialog evaluation is a challenging problem, especially for non task-oriented dialogs where conversational success is not well-defined. We propose to evaluate dialog quality using topic-based metrics that describe the ability of a conversational bot to sustain coherent and engaging conversations on a topic, and the diversity of topics that a bot can handle. To detect conversation topics per utterance, we adopt Deep Average Networks (DAN) and train a topic classifier on a variety of question and query data categorized into multiple topics. We propose a novel extension to DAN by adding a topic-word attention table that allows the system to jointly capture topic keywords in an utterance and perform topic classification. We compare our proposed topic based metrics with the ratings provided by users and show that our metrics both correlate with and complement human judgment. Our analysis is performed on tens of thousands of real human-bot dialogs from the Alexa Prize competition and highlights user expectations for conversational bots.
• Conversational agents are exploding in popularity. However, much work remains in the area of social conversation as well as free-form conversation over a broad range of domains and topics. To advance the state of the art in conversational AI, Amazon launched the Alexa Prize, a 2.5-million-dollar university competition where sixteen selected university teams were challenged to build conversational agents, known as socialbots, to converse coherently and engagingly with humans on popular topics such as Sports, Politics, Entertainment, Fashion and Technology for 20 minutes. The Alexa Prize offers the academic community a unique opportunity to perform research with a live system used by millions of users. The competition provided university teams with real user conversational data at scale, along with the user-provided ratings and feedback augmented with annotations by the Alexa team. This enabled teams to effectively iterate and make improvements throughout the competition while being evaluated in real-time through live user interactions. To build their socialbots, university teams combined state-of-the-art techniques with novel strategies in the areas of Natural Language Understanding, Context Modeling, Dialog Management, Response Generation, and Knowledge Acquisition. To support the efforts of participating teams, the Alexa Prize team made significant scientific and engineering investments to build and improve Conversational Speech Recognition, Topic Tracking, Dialog Evaluation, Voice User Experience, and tools for traffic management and scalability. This paper outlines the advances created by the university teams as well as the Alexa Prize team to achieve the common goal of solving the problem of Conversational AI.
• Many living and non-living complex systems can be modeled and understood as collective systems made of heterogeneous components that self-organize and generate nontrivial morphological structures and behaviors. This chapter presents a brief overview of our recent effort that investigated various aspects of such morphogenetic collective systems. We first propose a theoretical classification scheme that distinguishes four complexity levels of morphogenetic collective systems based on the nature of their components and interactions. We conducted a series of computational experiments using a self-propelled particle swarm model to investigate the effects of (1) heterogeneity of components, (2) differentiation/re-differentiation of components, and (3) local information sharing among components, on the self-organization of a collective system. Results showed that (a) heterogeneity of components had a strong impact on the system's structure and behavior, (b) dynamic differentiation/re-differentiation of components and local information sharing helped the system maintain spatially adjacent, coherent organization, (c) dynamic differentiation/re-differentiation contributed to the development of more diverse structures and behaviors, and (d) stochastic re-differentiation of components naturally realized a self-repair capability of self-organizing morphologies. We also explored evolutionary methods to design novel self-organizing patterns, using interactive evolutionary computation and spontaneous evolution within an artificial ecosystem. These self-organizing patterns were found to be remarkably robust against dimensional changes from 2D to 3D, although evolution worked efficiently only in 2D settings.
• Space syntax matrix has been the main approach for human movement prediction in the urban environment. An alternative, relatively new methodology is an agent-based pedestrian model constructed using machine learning techniques. Even though both approaches have been studied intensively, the quantitative comparison between them has not been conducted. In this paper, comparative analysis of space syntax metrics and maximum entropy inverse reinforcement learning (MEIRL) is performed. The experimental result on trajectory data of artificially generated pedestrian agents shows that MEIRL outperforms space syntax matrix. The possibilities for combining two methods are drawn out as conclusions, and the relative challenges with the data collection are highlighted.
• Jan 03 2018 cs.MA arXiv:1801.00259v1
Public Policy involves proposing changes to existing practices, alternatives, new habits. Citizens and institutions react accordingly, accepting, refuting or adapting. Agent-based modeling is a tool that can enrich the policy analysis package explicitly considering dynamics, space and individual-level interactions. This paper presents a modeling platform called PolicySpace that models public policies within an empirical, spatial environment using data from 46 metropolitan regions in Brazil. We describe the basics of the model, its agents and markets, the tax scheme, the parametrization, and how to run the model. Finally, we validate the model and demonstrate an application of the fiscal analysis. Besides providing the basics of the platform, our results indicate the relevance of the rules of taxes transfer for cities' quality of life.
• Dec 29 2017 cs.MA cs.SE cs.SI arXiv:1712.09496v1
The design of agent-based models (ABMs) is often ad-hoc when it comes to defining their scope. In order for the inclusion of features such as network structure, location, or dynamic change to be justified, their role in a model should be systematically analysed. We propose a mechanism to compare and assess the impact of such features. In particular we are using techniques from software engineering and semantics to support the development and assessment of ABMs, such as graph transformations as semantic representations for agent-based models, feature diagrams to identify ingredients under consideration, and extension relations between graph transformation systems to represent model fragments expressing features.
• Biological systems can share and collectively process information to yield emergent effects, despite inherent noise in communication. While man-made systems often employ intricate structural solutions to overcome noise, the structure of many biological systems is more amorphous. It is not well understood how communication noise may affect the computational repertoire of such groups. To approach this question we consider the basic collective task of rumor spreading, in which information from few knowledgeable sources must reliably flow into the rest of the population. In order to study the effect of communication noise on the ability of groups that lack stable structures to efficiently solve this task, we consider a noisy version of the uniform PULL model. We prove a lower bound which implies that, in the presence of even moderate levels of noise that affect all facets of the communication, no scheme can significantly outperform the trivial one in which agents have to wait until directly interacting with the sources. Our results thus show an exponential separation between the uniform PUSH and PULL communication models in the presence of noise. Such separation may be interpreted as suggesting that, in order to achieve efficient rumor spreading, a system must exhibit either some degree of structural stability or, alternatively, some facet of the communication which is immune to noise. We corroborate our theoretical findings with a new analysis of experimental data regarding recruitment in Cataglyphis niger desert ants.
• The multiagent-based participatory simulation features prominently in urban planning as the acquired model is considered as the hybrid system of the domain and the local knowledge. However, the key problem of generating realistic agents for particular social phenomena invariably remains. The existing models have attempted to dictate the factors involving human behavior, which appeared to be intractable. In this paper, Inverse Reinforcement Learning (IRL) is introduced to address this problem. IRL is developed for computational modeling of human behavior and has achieved great successes in robotics, psychology and machine learning. The possibilities presented by this new style of modeling are drawn out as conclusions, and the relative challenges with this modeling are highlighted.
• This paper discusses new techniques to enhance Automated Transit Networks (ATN, previously called Personal Rapid Transit - PRT) based on Artificial Intelligence tools. The main direction is improvement of the cooperation of autonomous modules that use negotiation protocols, following the IoT paradigm. One of the goals is to increase ATN system throughput by tuning up autonomous vehicles cooperation. Machine learning (ML) was used to improve algorithms designed by human programmers. We used "existing controls" corresponding to near-optimal solutions and built refinement models to more accurately relate a system's dynamics to its performance. A mechanism that mostly influences ATN performance is Empty Vehicle Management (EVM). The algorithms designed by human programmers was used: calls to empty vehicles for waiting passengers and balancing based on reallocation of empty vehicles to achieve better regularity of their settlement. In this paper we discuss how we can improve these algorithms (and tune them to current conditions) by using ML to tailor individual behavioral policies. Using ML techniques was possible because our algorithm is based on a set of parameters. A number of weights and thresholds could be tuned up to give better decisions on moving empty vehicles across the track.
• Running agent-based models (ABMs) is a burdensome computational task, specially so when considering the flexibility ABMs intrinsically provide. This paper uses a bundle of model configuration parameters along with obtained results from a validated ABM to train some Machine Learning methods for socioeconomic optimal cases. A larger space of possible parameters and combinations of parameters are then used as input to predict optimal cases and confirm parameters calibration. Analysis of the parameters of the optimal cases are then compared to the baseline model. This exploratory initial exercise confirms the adequacy of most of the parameters and rules and suggests changing of directions to two parameters. Additionally, it helps highlight metropolitan regions of higher quality of life. Better understanding of ABM mechanisms and parameters' influence may nudge policy-making slightly closer to optimal level.
• We propose the creation of a systematic effort to identify and replicate key findings in neuroscience and allied fields related to understanding human values. Our aim is to ensure that research underpinning the value alignment problem of artificial intelligence has been sufficiently validated to play a role in the design of AI systems.
• In this paper, we aim to find a robust network formation strategy that can adaptively evolve the network topology against network dynamics in a distributed manner. We consider a network coding deployed wireless ad hoc network where source nodes are connected to terminal nodes with the help of intermediate nodes. We show that mixing operations in network coding can induce packet anonymity that allows the inter-connections in a network to be decoupled. This enables each intermediate node to consider complex network inter-connections as a node-environment interaction such that the Markov decision process (MDP) can be employed at each intermediate node. The optimal policy that can be obtained by solving the MDP provides each node with optimal amount of changes in transmission range given network dynamics (e.g., the number of nodes in the range and channel condition). Hence, the network can be adaptively and optimally evolved by responding to the network dynamics. The proposed strategy is used to maximize long-term utility, which is achieved by considering both current network conditions and future network dynamics. We define the utility of an action to include network throughput gain and the cost of transmission power. We show that the resulting network of the proposed strategy eventually converges to stationary networks, which maintain the states of the nodes. Moreover, we propose to determine initial transmission ranges and initial network topology that can expedite the convergence of the proposed algorithm. Our simulation results confirm that the proposed strategy builds a network which adaptively changes its topology in the presence of network dynamics. Moreover, the proposed strategy outperforms existing strategies in terms of system goodput and successful connectivity ratio.
• We introduce MAgent, a platform to support research and development of many-agent reinforcement learning. Unlike previous research platforms on single or multi-agent reinforcement learning, MAgent focuses on supporting the tasks and the applications that require hundreds to millions of agents. Within the interactions among a population of agents, it enables not only the study of learning algorithms for agents' optimal polices, but more importantly, the observation and understanding of individual agent's behaviors and social phenomena emerging from the AI society, including communication languages, leaderships, altruism. MAgent is highly scalable and can host up to one million agents on a single GPU server. MAgent also provides flexible configurations for AI researchers to design their customized environments and agents. In this demo, we present three environments designed on MAgent and show emerged collective intelligence by learning from scratch.
• The human brain is one of the most complex living structures in the known Universe. It consists of billions of neurons and synapses. Due to its intrinsic complexity, it can be a formidable task to accurately depict brain's structure and functionality. In the past, numerous studies have been conducted on modeling brain disease, structure, and functionality. Some of these studies have employed Agent-based approaches including multiagent-based simulation models as well as brain complex networks. While these models have all been developed using agent-based computing, however, to our best knowledge, none of them have employed the use of Agent-Oriented Software Engineering (AOSE) methodologies in developing the brain or disease model. This is a problem because without due process, developed models can miss out on important requirements. AOSE has the unique capability of merging concepts from multiagent systems, agent-based modeling, artificial intelligence, besides concepts from distributed systems. AOSE involves the various tested software engineering principles in various phases of the model development ranging from analysis, design, implementation, and testing phases. In this paper, we employ the use of three different AOSE methodologies for modeling the Multiple Sclerosis brain disease namely GAIA, TROPOS, and MASE. After developing the models, we further employ the use of Exploratory Agent-based Modeling (EABM) to develop an actual model replicating previous results as a proof of concept. The key objective of this study is to demonstrate and explore the viability and effectiveness of AOSE methodologies in the development of complex brain structure and cognitive process models. Our key finding include demonstration that AOSE methodologies can be considerably helpful in modeling various living complex systems, in general, and the human brain, in particular.
• Modeling personality is a challenging problem with applications spanning computer games, virtual assistants, online shopping and education. Many techniques have been tried, ranging from neural networks to computational cognitive architectures. However, most approaches rely on examples with hand-crafted features and scenarios. Here, we approach learning a personality by training agents using a Deep Q-Network (DQN) model on rewards based on psychoanalysis, against hand-coded AI in the game of Pong. As a result, we obtain 4 agents, each with its own personality. Then, we define happiness of an agent, which can be seen as a measure of alignment with agent's objective function, and study it when agents play both against hand-coded AI, and against each other. We find that the agents that achieve higher happiness during testing against hand-coded AI, have lower happiness when competing against each other. This suggests that higher happiness in testing is a sign of overfitting in learning to interact with hand-coded AI, and leads to worse performance against agents with different personalities.
• The combination of Artificial Intelligence (AI) and Internet-of-Things (IoT), which is denoted as AI-powered Internet-of-Things (AIoT), is capable of processing huge amount of data generated from a large number of devices and handling complex problems in social infrastructures. As AI and IoT technologies are becoming mature, in this paper, we propose to apply AIoT technologies for traffic light control, which is an essential component for intelligent transportation system, to improve the efficiency of smart city's road system. Specifically, various sensors such as surveillance cameras provide real-time information for intelligent traffic light control system to observe the states of both motorized traffic and non-motorized traffic. In this paper, we propose an intelligent traffic light control solution by using distributed multi-agent Q learning, considering the traffic information at the neighboring intersections as well as local motorized and non-motorized traffic, to improve the overall performance of the entire control system. By using the proposed multi-agent Q learning algorithm, our solution is targeting to optimize both the motorized and non-motorized traffic. In addition, we considered many constraints/rules for traffic light control in the real world, and integrate these constraints in the learning algorithm, which can facilitate the proposed solution to be deployed in real operational scenarios. We conducted numerical simulations for a real-world map with real-world traffic data. The simulation results show that our proposed solution outperforms existing solutions in terms of vehicle and pedestrian queue lengths, waiting time at intersections, and many other key performance metrics.
• We consider ordinal approximation algorithms for a broad class of utility maximization problems for multi-agent systems. In these problems, agents have utilities for connecting to each other, and the goal is to compute a maximum-utility solution subject to a set of constraints. We represent these as a class of graph optimization problems, including matching, spanning tree problems, TSP, maximum weight planar subgraph, and many others. We study these problems in the ordinal setting: latent numerical utilities exist, but we only have access to ordinal preference information, i.e., every agent specifies an ordering over the other agents by preference. We prove that for the large class of graph problems we identify, ordinal information is enough to compute solutions which are close to optimal, thus demonstrating there is no need to know the underlying numerical utilities. For example, for problems in this class with bounded degree $b$ a simple ordinal greedy algorithm always produces a ($b+1$)-approximation; we also quantify how the quality of ordinal approximation depends on the sparsity of the resulting graphs. In particular, our results imply that ordinal information is enough to obtain a 2-approximation for Maximum Spanning Tree; a 4-approximation for Max Weight Planar Subgraph; a 2-approximation for Max-TSP; and a 2-approximation for various Matching problems.
• Social storage systems are becoming increasingly popular compared to the existing data backup systems like local, centralized and P2P systems. An endogenously built symmetric social storage model and its aspects like the utility of each agent, bilateral stability, contentment, and efficiency have been extensively discussed in Mane et. al. (2017). We include heterogeneity in this model by using the concept of Social Range Matrix from Kuznetsov et. al (2010). Now, each agent is concerned about its perceived utility, which is a linear combination of its utility as well as others utilities (depending upon whether the pair are friends, enemies or do not care about each other). We derive conditions when two agents may want to add or delete a link, and provide an algorithm that checks if a bilaterally stable network is possible or not. Finally, we take some special Social Range Matrices and prove that under certain conditions on network parameters, a bilaterally stable network is unique.
• We consider the problem of computing a matching in a bipartite graph in the presence of one-sided preferences. There are several well studied notions of optimality which include pareto optimality, rank maximality, fairness and popularity. In this paper, we conduct an in-depth experimental study comparing different notions of optimality based on a variety of metrics like cardinality, number of rank-1 edges, popularity, to name a few. Observing certain shortcomings in the standard notions of optimality, we propose an algorithm which maximizes an alternative metric called the Area under Profile Curve ratio (AUPCR). To the best of our knowledge, the AUPCR metric was used earlier but there is no known algorithm to compute an AUPCR maximizing matching. Finally, we illustrate the superiority of the AUPCR-maximizing matching by comparing its performance against other optimal matchings on synthetic instances modeling real-world data.
• Cooperative multi-agent planning (MAP) is a relatively recent research field that combines technologies, algorithms and techniques developed by the Artificial Intelligence Planning and Multi-Agent Systems communities. While planning has been generally treated as a single-agent task, MAP generalizes this concept by considering multiple intelligent agents that work cooperatively to develop a course of action that satisfies the goals of the group. This paper reviews the most relevant approaches to MAP, putting the focus on the solvers that took part in the 2015 Competition of Distributed and Multi-Agent Planning, and classifies them according to their key features and relative performance.
• Multi-agent approach has become popular in computer science and technology. However, the conventional models of multi-agent and multicomponent systems implicitly or explicitly assume existence of absolute time or even do not include time in the set of defining parameters. At the same time, it is proved theoretically and validated experimentally that there are different times and time scales in a variety of real systems - physical, chemical, biological, social, informational, etc. Thus, the goal of this work is construction of a multi-agent multicomponent system models with concurrency of processes and diversity of actions. To achieve this goal, a mathematical system actor model is elaborated and its properties are studied.
• We present ongoing work on a tool that consists of two parts: (i) A raw micro-level abstract world simulator with an interface to (ii) a 3D game engine, translator of raw abstract simulator data to photorealistic graphics. Part (i) implements a dedicated cellular automata (CA) on reconfigurable hardware (FPGA) and part (ii) interfaces with a deep learning framework for training neural networks. The bottleneck of such an architecture usually lies in the fact that transferring the state of the whole CA significantly slows down the simulation. We bypass this by sending only a small subset of the general state, which we call a 'locus of visibility', akin to a torchlight in a darkened 3D space, into the simulation. The torchlight concept exists in many games but these games generally only simulate what is in or near the locus. Our chosen architecture will enable us to simulate on a micro level outside the locus. This will give us the advantage of being able to create a larger and more fine-grained simulation which can be used to train neural networks for use in games.
• This paper proposes a novel game-theoretical autonomous decision-making framework to address a task allocation problem for a swarm of multiple agents. We consider cooperation of self-interested agents and show that agents who have social inhibition can converge to a Nash stable partition (i.e., social agreement) using our proposed decentralised algorithm within polynomial time. The algorithm is simple and executable based on local interactions with neighbour agents under a strongly-connected communication network and even in asynchronous environments. We analytically present a mathematical formulation for computing the lower bound of a converged solution's suboptimality and additionally show that 50 % of suboptimality can be minimally guaranteed if social utilities are non-decreasing functions with respect to the number of co-working agents. Through numerical experiments, it is confirmed that the proposed framework is scalable, fast adaptable against dynamical environments, and robust even in a realistic situation where some of the agents temporarily somehow do not operate during a mission.
• Swarm robotic systems are currently being used to address many real-world problems. One interesting application of swarm robotics is the self-organized formation of structures and shapes. Some of the key challenges in the swarm robotic systems include swarm size constraint, random motion, coordination among robots, localization, and adaptability in a decentralized environment. Rubenstein et al. presented a system ("Programmable self-assembly in a thousand-robot swarm", Science, 2014) for thousand-robot swarm able to form only solid shapes with the robots in aggregated form by applying the collective behavior algorithm. Even though agent-based approaches have been presented in various studies for self-organized formation, however these studies lack agent-based modeling (ABM) approach along with the constraints in term of structure complexity and heterogeneity in large swarms with dynamic localization. The cognitive agent-based computing (CABC) approach is capable of modeling such self-organization based multi-agents systems (MAS). In this paper, we develop a simulation model using ABM under CABC approach for self-organized shape formation in swarm robots. We propose a shape formation algorithm for validating our model and perform simulation-based experiments for six different shapes including hole-based shapes. We also demonstrate the formal specification for our model. The simulation result shows the robustness of the proposed approach having the emergent behavior of robots for the self-organized shape formation. The performance of the proposed approach is evaluated by robots convergence rate.
• We introduce new theoretical insights into two-population asymmetric games allowing for an elegant symmetric decomposition into two single population symmetric games. Specifically, we show how an asymmetric bimatrix game (A,B) can be decomposed into its symmetric counterparts by envisioning and investigating the payoff tables (A and B) that constitute the asymmetric game, as two independent, single population, symmetric games. We reveal several surprising formal relationships between an asymmetric two-population game and its symmetric single population counterparts, which facilitate a convenient analysis of the original asymmetric game due to the dimensionality reduction of the decomposition. The main finding reveals that if (x,y) is a Nash equilibrium of an asymmetric game (A,B), this implies that y is a Nash equilibrium of the symmetric counterpart game determined by payoff table A, and x is a Nash equilibrium of the symmetric counterpart game determined by payoff table B. Also the reverse holds and combinations of Nash equilibria of the counterpart games form Nash equilibria of the asymmetric game. We illustrate how these formal relationships aid in identifying and analysing the Nash structure of asymmetric games, by examining the evolutionary dynamics of the simpler counterpart games in several canonical examples.
• Since their inception, Multi Agent Systems (MASs) have been championed as a solution for the increasing problem of software complexity. Communities of distributed autonomous computing entities that are capable of collaborating, negotiating and acting to solve complex organisational and system management problems are an attractive proposition. Central to this is the requirement for agents to possess the capability of interacting with one another in a structured, consistent and organised manner. This thesis presents the Agent Conversation Reasoning Engine (ACRE), which constitutes a holistic view of communication management for MASs. ACRE is intended to facilitate the practical development, debugging and deployment of communication-heavy MASs. ACRE has been formally defined in terms of its operational semantics, and a generic architecture has been proposed to facilitate its integration with a wide variety of diverse agent development frameworks and Agent Oriented Programming (AOP) languages. A concrete implementation has also been developed that uses the Agent Factory AOP framework as its base. This allows ACRE to be used with a number of different AOP languages, while providing a reference implementation that other integrations can be modelled upon. A standard is also proposed for the modelling and sharing of agent-focused interaction protocols that is independent of the platform within which a concrete ACRE implementation is run. Finally, a user evaluation illustrates the benefits of incorporating conversation management into agent programming.
• To achieve general intelligence, agents must learn how to interact with others in a shared environment: this is the challenge of multiagent reinforcement learning (MARL). The simplest form is independent reinforcement learning (InRL), where each agent treats its experience as part of its (non-stationary) environment. In this paper, we first observe that policies learned using InRL can overfit to the other agents' policies during training, failing to sufficiently generalize during execution. We introduce a new metric, joint-policy correlation, to quantify this effect. We describe an algorithm for general MARL, based on approximate best responses to mixtures of policies generated using deep reinforcement learning, and empirical game-theoretic analysis to compute meta-strategies for policy selection. The algorithm generalizes previous ones such as InRL, iterated best response, double oracle, and fictitious play. Then, we present a scalable implementation which reduces the memory requirement using decoupled meta-solvers. Finally, we demonstrate the generality of the resulting policies in two partially observable settings: gridworld coordination games and poker.
• For a group of cooperating UAVs, localizing each other is often a key task. This paper studies the localization problem for a group of UAVs flying in 3D space with very limited information, i.e., when noisy distance measurements are the only type of inter-agent sensing that is available, and when only one UAV knows a global coordinate basis, the others being GPS-denied. Initially for a two-agent problem, but easily generalized to some multi-agent problems, constraints are established on the minimum number of required distance measurements required to achieve the localization. The paper also proposes an algorithm based on semidefinite programming (SDP), followed by maximum likelihood estimation using a gradient descent initialized from the SDP calculation. The efficacy of the algorithm is verified with experimental noisy flight data.
• Nov 02 2017 cs.CC cs.GT cs.MA arXiv:1711.00201v1
We believe that economic design and computational complexity---while already important to each other---should become even more important to each other with each passing year. But for that to happen, experts in on the one hand such areas as social choice, economics, and political science and on the other hand computational complexity will have to better understand each other's worldviews. This article, written by two complexity theorists who also work in computational social choice theory, focuses on one direction of that process by presenting a brief overview of how most computational complexity theorists view the world. Although our immediate motivation is to make the lens through which complexity theorists see the world be better understood by those in the social sciences, we also feel that even within computer science it is very important for nontheoreticians to understand how theoreticians think, just as it is equally important within computer science for theoreticians to understand how nontheoreticians think.
• We propose a multiagent distributed actor-critic algorithm for multitask reinforcement learning (MRL), named Diff-DAC. The agents are connected, forming a (possibly sparse) network. Each agent is assigned a task and has access to data from this local task only. During the learning process, the agents are able to communicate some parameters to their neighbors. Since the agents incorporate their neighbors' parameters into their own learning rules, the information is diffused across the network, and they can learn a common policy that generalizes well across all tasks. Diff-DAC is scalable since the computational complexity and communication overhead per agent grow with the number of neighbors, rather than with the total number of agents. Moreover, the algorithm is fully distributed in the sense that agents self-organize, with no need for coordinator node. Diff-DAC follows an actor-critic scheme where the value function and the policy are approximated with deep neural networks, being able to learn expressive policies from raw data. As a by-product of Diff-DAC's derivation from duality theory, we provide novel insights into the standard actor-critic framework, showing that it is actually an instance of the dual ascent method to approximate the solution of a linear program. Experiments illustrate the performance of the algorithm in the cart-pole, inverted pendulum, and swing-up cart-pole environments.
• When voting on a proposal one in fact chooses between two alternatives: (i) A new hypothetical social state depicted by the proposal and (ii) the status quo (henceforth: Reality); a Yes vote favors a transition to the proposed hypothetical state, while a No vote favors Reality. Social Choice theory generalizes voting on one proposal to ranking multiple proposals; that Reality was forsaken during this generalization is, in our view, inexplicable. Here we propose to rectify this neglect and incorporate Reality into Social Choice, distinguishing between Reality and hypothesis. We do so by recognizing Reality as an ever-present, always-relevant, evolving social state that is distinguished from hypothetical social states, and explore the ramifications of this recognition. Incorporating Reality into Social Choice offers: (i) A natural way to resolve the Condorcet paradox and Condorcet cycles; (ii) a resolution to the vexing ambiguity regarding what do approval voters, in fact, approve? (iii) a simple and practical show-of-hands agenda that implements an approval vote in one round and Condorcet-consistent voting in multiple rounds; and (iv) reasoned nullification of Independence of Irrelevant Alternatives and hence abdication of Arrow's Theorem. Arrow's theorem was taken to show that democracy, conceived as government by the will of the people, is an incoherent illusion. Incorporating Reality into Social Choice may clear this intellectual blemish on democracy; pave the way for the broad application of the Condorcet criterion; and, more generally, help restore trust in democracy by showing that it offers a coherent and hopeful vision.
• In the ICS, WUT a platform for simulation of cooperation of physical and virtual mobile agents is under development. The paper describes the motivation of the research, an organization of the platform, a model of agent, and the principles of design of the platform. Several experimental simulations are briefly described.
• A challenge in multiagent control systems is to ensure that they are appropriately resilient to communication failures between the various agents. In many common game-theoretic formulations of these types of systems, it is implicitly assumed that all agents have access to as much information about other agents' actions as needed. This paper endeavors to augment these game-theoretic methods with policies that would allow agents to react on-the-fly to losses of this information. Unfortunately, we show that even if a single agent loses communication with one other weakly-coupled agent, this can cause arbitrarily-bad system states to emerge as various solution concepts of an associated game, regardless of how the agent accounts for the communication failure and regardless of how weakly coupled the agents are. Nonetheless, we show that the harm that communication failures can cause is limited by the structure of the problem; when agents' action spaces are richer, problems are more susceptible to these types of pathologies. Finally, we undertake an initial study into how a system designer might prevent these pathologies, and explore a few limited settings in which communication failures cannot cause harm.
• Since the publication of 'Complex Contagions and the Weakness of Long Ties' in 2007, complex contagions have been studied across an enormous variety of social domains. In reviewing this decade of research, we discuss recent advancements in applied studies of complex contagions, particularly in the domains of health, innovation diffusion, social media, and politics. We also discuss how these empirical studies have spurred complementary advancements in the theoretical modeling of contagions, which concern the effects of network topology on diffusion, as well as the effects of individual-level attributes and thresholds. In synthesizing these developments, we suggest three main directions for future research. The first concerns the study of how multiple contagions interact within the same network and across networks, in what may be called an ecology of contagions. The second concerns the study of how the structure of thresholds and their behavioral consequences can vary by individual and social context. The third area concerns the roles of diversity and homophily in the dynamics of complex contagion, including both diversity of demographic profiles among local peers, and the broader notion of structural diversity within a network. Throughout this discussion, we make an effort to highlight the theoretical and empirical opportunities that lie ahead.
• A key challenge in multi-robot and multi-agent systems is generating solutions that are robust to other self-interested or even adversarial parties who actively try to prevent the agents from achieving their goals. The practicality of existing works addressing this challenge is limited to only small-scale synchronous decision-making scenarios or a single agent planning its best response against a single adversary with fixed, procedurally characterized strategies. In contrast this paper considers a more realistic class of problems where a team of asynchronous agents with limited observation and communication capabilities need to compete against multiple strategic adversaries with changing strategies. This problem necessitates agents that can coordinate to detect changes in adversary strategies and plan the best response accordingly. Our approach first optimizes a set of stratagems that represent these best responses. These optimized stratagems are then integrated into a unified policy that can detect and respond when the adversaries change their strategies. The near-optimality of the proposed framework is established theoretically as well as demonstrated empirically in simulation and hardware.