The goal of this paper is to present an end-to-end, data-driven framework to control Autonomous Mobility-on-Demand systems (AMoD, i.e. fleets of self-driving vehicles). We first model the AMoD system using a time-expanded network, and present a formulation that computes the optimal rebalancing strategy (i.e., preemptive repositioning) and the minimum feasible fleet size for a given travel demand. Then, we adapt this formulation to devise a Model Predictive Control (MPC) algorithm that leverages short-term demand forecasts based on historical data to compute rebalancing strategies. We test the end-to-end performance of this controller with a state-of-the-art LSTM neural network to predict customer demand and real customer data from DiDi Chuxing: we show that this approach scales very well for large systems (indeed, the computational complexity of the MPC algorithm does not depend on the number of customers and of vehicles in the system) and outperforms state-of-the-art rebalancing strategies by reducing the mean customer wait time by up to to 89.6%.
Decentralized control of robots has attracted huge research interests. However, some of the research used unrealistic assumptions without collision avoidance. This report focuses on the collision-free control for multiple robots in both complete coverage and search tasks in 2D and 3D areas which are arbitrary unknown. All algorithms are decentralized as robots have limited abilities and they are mathematically proved. The report starts with the grid selection in the two tasks. Grid patterns simplify the representation of the area and robots only need to move straightly between neighbor vertices. For the 100% complete 2D coverage, the equilateral triangular grid is proposed. For the complete coverage ignoring the boundary effect, the grid with the fewest vertices is calculated in every situation for both 2D and 3D areas. The second part is for the complete coverage in 2D and 3D areas. A decentralized collision-free algorithm with the above selected grid is presented driving robots to sections which are furthest from the reference point. The area can be static or expanding, and the algorithm is simulated in MATLAB. Thirdly, three grid-based decentralized random algorithms with collision avoidance are provided to search targets in 2D or 3D areas. The number of targets can be known or unknown. In the first algorithm, robots choose vacant neighbors randomly with priorities on unvisited ones while the second one adds the repulsive force to disperse robots if they are close. In the third algorithm, if surrounded by visited vertices, the robot will use the breadth-first search algorithm to go to one of the nearest unvisited vertices via the grid. The second search algorithm is verified on Pioneer 3-DX robots. The general way to generate the formula to estimate the search time is demonstrated. Algorithms are compared with five other algorithms in MATLAB to show their effectiveness.
This paper deals with a new type of warehousing system, Robotic Mobile Fulfillment Systems (RMFS). In such systems, robots are sent to carry storage units, so-called "pods", from the inventory and bring them to human operators working at stations. At the stations, the items are picked according to customers' orders. There exist new decision problems in such systems, for example, the reallocation of pods after their visits at work stations or the selection of pods to fulfill orders. In order to analyze decision strategies for these decision problems and relations between them, we develop a simulation framework called "RAWSim-O" in this paper. Moreover, we show a real-world application of our simulation framework by integrating simple robot prototypes based on vacuum cleaning robots.
In multi-agent navigation, agents need to move towards their goal locations while avoiding collisions with other agents and static obstacles, often without communication with each other. Existing methods compute motions that are optimal locally but do not account for the aggregated motions of all agents, producing inefficient global behavior especially when agents move in a crowded space. In this work, we develop methods to allow agents to dynamically adapt their behavior to their local conditions. We accomplish this by formulating the multi-agent navigation problem as an action-selection problem, and propose an approach, ALAN, that allows agents to compute time-efficient and collision-free motions. ALAN is highly scalable because each agent makes its own decisions on how to move using a set of velocities optimized for a variety of navigation tasks. Experimental results show that the agents using ALAN, in general, reach their destinations faster than using ORCA, a state-of-the-art collision avoidance framework, the Social Forces model for pedestrian navigation, and a Predictive collision avoidance model.
We consider multi-agent stochastic optimization problems over reproducing kernel Hilbert spaces (RKHS). In this setting, a network of interconnected agents aims to learn decision functions, i.e., nonlinear statistical models, that are optimal in terms of a global convex functional that aggregates data across the network, with only access to locally and sequentially observed samples. We propose solving this problem by allowing each agent to learn a local regression function while enforcing consensus constraints. We use a penalized variant of functional stochastic gradient descent operating simultaneously with low-dimensional subspace projections. These subspaces are constructed greedily by applying orthogonal matching pursuit to the sequence of kernel dictionaries and weights. By tuning the projection-induced bias, we propose an algorithm that allows for each individual agent to learn, based upon its locally observed data stream and message passing with its neighbors only, a regression function that is close to the globally optimal regression function. That is, we establish that with constant step-size selections agents' functions converge to a neighborhood of the globally optimal one while satisfying the consensus constraints as the penalty parameter is increased. Moreover, the complexity of the learned regression functions is guaranteed to remain finite. On both multi-class kernel logistic regression and multi-class kernel support vector classification with data generated from class-dependent Gaussian mixture models, we observe stable function estimation and state of the art performance for distributed online multi-class classification. Experiments on the Brodatz textures further substantiate the empirical validity of this approach.
Explanation of the hot topic "multi-agent path finding".
The paper reports on some results concerning Aqvist's dyadic logic known as system G, which is one of the most influential logics for reasoning with dyadic obligations ("it ought to be the case that ... if it is the case that ..."). Although this logic has been known in the literature for a while, many of its properties still await in-depth consideration. In this short paper we show: that any formula in system G including nested modal operators is equivalent to some formula with no nesting; that the universal modality introduced by Aqvist in the first presentation of the system is definable in terms of the deontic modality.
Oct 11 2017 cs.MA
Trajectory interpolation, the process of filling-in the gaps and removing noise from observed agent trajectories, is an essential task for the motion inference in multi-agent setting. A desired trajectory interpolation method should be robust to noise, changes in environments or agent densities, while also being yielding realistic group movement behaviors. Such realistic behaviors are, however, challenging to model as they require avoidance of agent-agent or agent-environment collisions and, at the same time, seek computational efficiency. In this paper, we propose a novel framework composed of data-driven priors (local, global or combined) and an efficient optimization strategy for multi-agent trajectory interpolation. The data-driven priors implicitly encode the dependencies of movements of multiple agents and the collision-avoiding desiderata, enabling elimination of costly pairwise collision constraints and resulting in reduced computational complexity and often improved estimation. Various combinations of priors and optimization algorithms are evaluated in comprehensive simulated experiments. Our experimental results reveal important insights, including the significance of the global flow prior and the lesser-than-expected influence of data-driven collision priors.
Smart grid can be considered as a complex network where each node represents generation unit, producer, and links represent transmission lines. Agent-based modeling (ABM) is a paradigm represents a complex system of autonomous agents and their interactions with each other. Previously, numbers of study have been presented in smart grid domain using ABM methodologies. However, these study lack in following any ABM specification approach. The purely text based specification tends to be ambiguous and make the model difficult to be replicated. In this paper, we present an ABM for smart grid using a combination of agent-based and complex network based approaches. Then we present ODD and DREAM specification approach for our ABM in the smart grid. We compare these two specification techniques qualitatively as well as quantitatively and conclude that DREAM approach is better than ODD in term of understanding and replicating ABM.
Oct 03 2017 cs.MA
Within the area of multi-agent systems, normative systems are a widely used framework for the coordination of interdependent activities. A crucial problem associated with normative systems is that of synthesising norms that effectively accomplish a coordination task and whose compliance forms a rational choice for the agents within the system. In this work, we introduce a framework for the synthesis of normative systems that effectively coordinate a multi-agent system and whose norms are likely to be adopted by rational agents. Our approach roots in evolutionary game theory. Our framework considers multi-agent systems in which evolutionary forces lead successful norms to prosper and spread within the agent population, while unsuccessful norms are discarded. The outputs of this evolutionary norm synthesis process are normative systems whose compliance forms a rational choice for the agents. We empirically show the effectiveness of our approach through empirical evaluation in a simulated traffic domain.
While reigning models of diffusion have privileged the structure of a given social network as the key to informational exchange, real human interactions do not appear to take place on a single graph of connections. Using data collected from a pilot study of the spread of HIV awareness in social networks of homeless youth, we show that health information did not diffuse in the field according to the processes outlined by dominant models. Since physical network diffusion scenarios often diverge from their more well-studied counterparts on digital networks, we propose an alternative Activation Jump Model (AJM) that describes information diffusion on physical networks from a multi-agent team perspective. Our model exhibits two main differentiating features from leading cascade and threshold models of influence spread: 1) The structural composition of a seed set team impacts each individual node's influencing behavior, and 2) an influencing node may spread information to non-neighbors. We show that the AJM significantly outperforms existing models in its fit to the observed node-level influence data on the youth networks. We then prove theoretical results, showing that the AJM exhibits many well-behaved properties shared by dominant models. Our results suggest that the AJM presents a flexible and more accurate model of network diffusion that may better inform influence maximization in the field.
Developing a safe and efficient collision avoidance policy for multiple robots is challenging in the decentralized scenarios where each robot generate its paths without observing other robots' states and intents. While other distributed multi-robot collision avoidance systems exist, they often require extracting agent-level features to plan a local collision-free action, which can be computationally prohibitive and not robust. More importantly, in practice the performance of these methods are much lower than their centralized counterparts. We present a decentralized sensor-level collision avoidance policy for multi-robot systems, which directly maps raw sensor measurements to an agent's steering commands in terms of movement velocity. As a first step toward reducing the performance gap between decentralized and centralized methods, we present a multi-scenario multi-stage training framework to find an optimal policy which is trained over a large number of robots on rich, complex environments simultaneously using a policy gradient based reinforcement learning algorithm. We validate the learned sensor-level collision avoidance policy in a variety of simulated scenarios with thorough performance evaluations and show that the final learned policy is able to find time efficient, collision-free paths for a large-scale robot system. We also demonstrate that the learned policy can be well generalized to new scenarios that do not appear in the entire training period, including navigating a heterogeneous group of robots and a large-scale scenario with 100 robots. Videos are available at https://sites.google.com/view/drlmaca
This paper focuses on two commonly used path assignment policies for agents traversing a congested network: self-interested routing, and system-optimum routing. In the self-interested routing policy each agent selects a path that optimizes its own utility, while the system-optimum routing agents are assigned paths with the goal of maximizing system performance. This paper considers a scenario where a centralized network manager wishes to optimize utilities over all agents, i.e., implement a system-optimum routing policy. In many real-life scenarios, however, the system manager is unable to influence the route assignment of all agents due to limited influence on route choice decisions. Motivated by such scenarios, a computationally tractable method is presented that computes the minimal amount of agents that the system manager needs to influence (compliant agents) in order to achieve system optimal performance. Moreover, this methodology can also determine whether a given set of compliant agents is sufficient to achieve system optimum and compute the optimal route assignment for the compliant agents to do so. Experimental results are presented showing that in several large-scale, realistic traffic networks optimal flow can be achieved with as low as 13% of the agent being compliant and up to 54%.
Monte Carlo Tree Search (MCTS) has been extended to many imperfect information games. However, due to the added complexity that uncertainty introduces, these adaptations have not reached the same level of practical success as their perfect information counterparts. In this paper we consider the development of agents that perform well against humans in imperfect information games with partially observable actions. We introduce the Semi-Determinized-MCTS (SDMCTS), a variant of the Information Set MCTS algorithm (ISMCTS). More specifically, SDMCTS generates a predictive model of the unobservable portion of the opponent's actions from historical behavioral data. Next, SDMCTS performs simulations on an instance of the game where the unobservable portion of the opponent's actions are determined. Thereby, it facilitates the use of the predictive model in order to decrease uncertainty. We present an implementation of the SDMCTS applied to the Cheat Game, a well-known card game, with partially observable (and often deceptive) actions. Results from experiments with 120 subjects playing a head-to-head Cheat Game against our SDMCTS agents suggest that SDMCTS performs well against humans, and its performance improves as the predictive model's accuracy increases.
This paper provides an overview of evolutionary robotics techniques applied to on-line distributed evolution for robot collectives -- namely, embodied evolution. It provides a definition of embodied evolution as well as a thorough description of the underlying concepts and mechanisms. The paper also presents a comprehensive summary of research published in the field since its inception (1999-2017), providing various perspectives to identify the major trends. In particular, we identify a shift from considering embodied evolution as a parallel search method within small robot collectives (fewer than 10 robots) to embodied evolution as an on-line distributed learning method for designing collective behaviours in swarm-like collectives. The paper concludes with a discussion of applications and open questions, providing a milestone for past and an inspiration for future research.
This work studies the labeled multi-robot path and motion planning problem in continuous domains, in the absence of static obstacles. Given $n$ robots which may be arbitrarily close to each other and assuming random start and goal configurations for the robots, we derived an $O(n^3)$, complete algorithm that produces solutions with constant-factor optimality guarantees on both makespan and distance optimality, in expectation. Furthermore, our algorithm only requires a small constant factor expansion of the initial and goal configuration footprints for solving the problem. In addition to strong theoretical guarantees, we present a thorough computational evaluation of the proposed solution. Beyond the baseline solution, adapting an effective (but non-polynomial time) robot routing subroutine, we also provide a highly efficient implementation that quickly computes near-optimal solutions. Hardware experiments on microMVP platform composed of non-holonomic robots confirms the practical applicability of our algorithmic pipeline.
Existing socio-psychological studies suggest that users of a social network form their opinions relying on the opinions of their neighbors. According to DeGroot opinion formation model, one value of particular importance is the asymptotic consensus value---the sum of user opinions weighted by the users' eigenvector centralities. This value plays the role of an attractor for the opinions in the network and is a lucrative target for external influence. However, since any potentially malicious control of the opinion distribution in a social network is clearly undesirable, it is important to design methods to prevent the external attempts to strategically change the asymptotic consensus value. In this work, we assume that the adversary wants to maximize the asymptotic consensus value by altering the opinions of some users in a network; we, then, state DIVER---an NP-hard problem of disabling such external influence attempts by strategically adding a limited number of edges to the network. Relying on the theory of Markov chains, we provide perturbation analysis that shows how eigenvector centrality and, hence, DIVER's objective function change in response to an edge's addition to the network. The latter leads to the design of a pseudo-linear-time heuristic for DIVER, whose computation relies on efficient estimation of mean first passage times in a Markov chain. We confirm our theoretical findings in experiments.
Much research in artificial intelligence is concerned with the development of autonomous agents that can interact effectively with other agents. An important aspect of such agents is the ability to reason about the behaviours of other agents, by constructing models which make predictions about various properties of interest (such as actions, goals, beliefs) of the modelled agents. A variety of modelling approaches now exist which vary widely in their methodology and underlying assumptions, catering to the needs of the different sub-communities within which they were developed and reflecting the different practical uses for which they are intended. The purpose of the present article is to provide a comprehensive survey of the salient modelling methods which can be found in the literature. The article concludes with a discussion of open problems which may form the basis for fruitful future research.
Sep 25 2017 cs.MA
The introduction of autonomous vehicles (AVs) will have far-reaching effects on road traffic in cities and on highways.The implementation of automated highway system (AHS), possibly with a dedicated lane only for AVs, is believed to be a requirement to maximise the benefit from the advantages of AVs. We study the ramifications of an increasing percentage of AVs on the traffic system with and without the introduction of a dedicated AV lane on highways. We conduct an analytical evaluation of a simplified scenario and a macroscopic simulation of the city of Singapore under user equilibrium conditions with a realistic traffic demand. We present findings regarding average travel time, fuel consumption, throughput and road usage. Instead of only considering the highways, we also focus on the effects on the remaining road network. Our results show a reduction of average travel time and fuel consumption as a result of increasing the portion of AVs in the system. We show that the introduction of an AV lane is not beneficial in terms of average commute time. Examining the effects of the AV population only, however, the AV lane provides a considerable reduction of travel time (approx. 25%) at the price of delaying conventional vehicles (approx. 7%). Furthermore a notable shift of travel demand away from the highways towards major and small roads is noticed in early stages of AV penetration of the system. Finally, our findings show that after a certain threshold percentage of AVs the differences between AV and no AV lane scenarios become negligible.
Swarm systems constitute a challenging problem for reinforcement learning (RL) as the algorithm needs to learn decentralized control policies that can cope with limited local sensing and communication abilities of the agents. Although there have been recent advances of deep RL algorithms applied to multi-agent systems, learning communication protocols while simultaneously learning the behavior of the agents is still beyond the reach of deep RL algorithms. However, while it is often difficult to directly define the behavior of the agents, simple communication protocols can be defined more easily using prior knowledge about the given task. In this paper, we propose a number of simple communication protocols that can be exploited by deep reinforcement learning to find decentralized control policies in a multi-robot swarm environment. The protocols are based on histograms that encode the local neighborhood relations of the agents and can also transmit task-specific information, such as the shortest distance and direction to a desired target. In our framework, we use an adaptation of Trust Region Policy Optimization to learn complex collaborative tasks, such as formation building, building a communication link, and pushing an intruder. We evaluate our findings in a simulated 2D-physics environment, and compare the implications of different communication protocols.
The incorporation of macro-actions (temporally extended actions) into multi-agent decision problems has the potential to address the curse of dimensionality associated with such decision problems. Since macro-actions last for stochastic durations, multiple agents executing decentralized policies in cooperative environments must act asynchronously. We present an algorithm that modifies Generalized Advantage Estimation for temporally extended actions, allowing a state-of-the-art policy optimization algorithm to optimize policies in Dec-POMDPs in which agents act asynchronously. We show that our algorithm is capable of learning optimal policies in two cooperative domains, one involving real-time bus holding control and one involving wildfire fighting with unmanned aircraft. Our algorithm works by framing problems as "event-driven decision processes," which are scenarios where the sequence and timing of actions and events are random and governed by an underlying stochastic process. In addition to optimizing policies with continuous state and action spaces, our algorithm also facilitates the use of event-driven simulators, which do not require time to be discretized into time-steps. We demonstrate the benefit of using event-driven simulation in the context of multiple agents taking asynchronous actions. We show that fixed time-step simulation risks obfuscating the sequence in which closely-separated events occur, adversely affecting the policies learned. Additionally, we show that arbitrarily shrinking the time-step scales poorly with the number of agents.
Inspired by biological swarms, robotic swarms are envisioned to solve real-world problems that are difficult for individual agents. Biological swarms can achieve collective intelligence based on local interactions and simple rules; however, designing effective distributed policies for large-scale robotic swarms to achieve a global objective can be challenging. Although it is often possible to design an optimal centralized strategy for smaller numbers of agents, those methods can fail as the number of agents increases. Motivated by the growing success of machine learning, we develop a deep learning approach that learns distributed coordination policies from centralized policies. In contrast to traditional distributed control approaches, which are usually based on human-designed policies for relatively simple tasks, this learning-based approach can be adapted to more difficult tasks. We demonstrate the efficacy of our proposed approach on two different tasks, the well-known rendezvous problem and a more difficult particle assignment problem. For the latter, no known distributed policy exists. From extensive simulations, it is shown that the performance of the learned coordination policies is comparable to the centralized policies, surpassing state-of-the-art distributed policies. Thereby, our proposed approach provides a promising alternative for real-world coordination problems that would be otherwise computationally expensive to solve or intangible to explore.
In this paper, we investigate how to learn to control a group of cooperative agents with limited sensing capabilities such as robot swarms. The agents have only very basic sensor capabilities, yet in a group they can accomplish sophisticated tasks, such as distributed assembly or search and rescue tasks. Learning a policy for a group of agents is difficult due to distributed partial observability of the state. Here, we follow a guided approach where a critic has central access to the global state during learning, which simplifies the policy evaluation problem from a reinforcement learning point of view. For example, we can get the positions of all robots of the swarm using a camera image of a scene. This camera image is only available to the critic and not to the control policies of the robots. We follow an actor-critic approach, where the actors base their decisions only on locally sensed information. In contrast, the critic is learned based on the true global state. Our algorithm uses deep reinforcement learning to approximate both the Q-function and the policy. The performance of the algorithm is evaluated on two tasks with simple simulated 2D agents: 1) finding and maintaining a certain distance to each others and 2) locating a target.
Cooperative motion planning is still a challenging task for robots. Recently, Value Iteration Networks (VINs) were proposed to model motion planning tasks as Neural Networks. In this work, we extend VINs to solve cooperative planning tasks under non-holonomic constraints. For this, we interconnect multiple VINs to pay respect to each other's outputs. Policies for cooperation are generated via iterative gradient descend. Validation in simulation shows that the resulting networks can resolve non-holonomic motion planning problems that require cooperation.
This paper deals with the use of hybrid simulation to build and compose heterogeneous simulation scenarios that can be proficiently exploited to model and represent the Internet of Things (IoT). Hybrid simulation is a methodology that combines multiple modalities of modeling/simulation. Complex scenarios are decomposed into simpler ones, each one being simulated through a specific simulation strategy. All these simulation building blocks are then synchronized and coordinated. This simulation methodology is an ideal one to represent IoT setups, which are usually very demanding, due to the heterogeneity of possible scenarios arising from the massive deployment of an enormous amount of sensors and devices. We present a use case concerned with the distributed simulation of smart territories, a novel view of decentralized geographical spaces that, thanks to the use of IoT, builds ICT services to manage resources in a way that is sustainable and not harmful to the environment. Three different simulation models are combined together, namely, an adaptive agent-based parallel and distributed simulator, an OMNeT++ based discrete event simulator and a script-language simulator based on MATLAB. Results from a performance analysis confirm the viability of using hybrid simulation to model complex IoT scenarios.
Although adverse effects of attacks have been acknowledged in many cyber-physical systems, there is no system-theoretic comprehension of how a compromised agent can leverage communication capabilities to maximize the damage in distributed multi-agent systems. A rigorous analysis of cyber-physical attacks enables us to increase the system awareness against attacks and design more resilient control protocols. To this end, we will take the role of the attacker to identify the worst effects of attacks on root nodes and non-root nodes in a distributed control system. More specifically, we show that a stealthy attack on root nodes can mislead the entire network to a wrong understanding of the situation and even destabilize the synchronization process. This will be called the internal model principle for the attacker and will intensify the urgency of designing novel control protocols to mitigate these types of attacks.
This paper presents a methodology for simulating the Internet of Things (IoT) using multi-level simulation models. With respect to conventional simulators, this approach allows us to tune the level of detail of different parts of the model without compromising the scalability of the simulation. As a use case, we have developed a two-level simulator to study the deployment of smart services over rural territories. The higher level is base on a coarse grained, agent-based adaptive parallel and distributed simulator. When needed, this simulator spawns OMNeT++ model instances to evaluate in more detail the issues concerned with wireless communications in restricted areas of the simulated world. The performance evaluation confirms the viability of multi-level simulations for IoT environments.
Multi-agent path finding (MAPF) is a well-studied problem in artificial intelligence, where one needs to find collision-free paths for agents with given start and goal locations. In video games, agents of different types often form teams. In this paper, we demonstrate the usefulness of MAPF algorithms from artificial intelligence for moving such non-homogeneous teams in congested video game environments.
How does the control architecture of a multiagent system impact the achievable performance guarantees? In this paper we answer this question relative to a class of resource allocation problems, termed multiagent set covering problems. More specifically, we demonstrate how the degree of information available to each agent imposes constraints on the achievable efficiency guarantees associated with the emergent collective behavior. We measure these efficiency guarantees using the well-known metrics of price of anarchy and price of stability, which seek to bound the performance of the worst and best performing equilibria, respectively. Our first result demonstrates that when agents only know local information about the resources they can select, a system operator is forced to manage an inherent trade-off between the price of anarchy and price of stability, i.e., improving the quality of the worst equilibria comes at the expenses of the best equilibria. Our second result demonstrates that a system operator can move well beyond this price of anarchy / price of stability frontier by designing a communication protocol that provides agents with a limited amount of system-level information beyond these local quantities.
In large scale collective decision making, social choice is a normative study of how one ought to design a protocol for reaching consensus. However, in instances where the underlying decision space is too large or complex for ordinal voting, standard voting methods of social choice may be impractical. How then can we design a mechanism - preferably decentralized, simple, scalable, and not requiring any special knowledge of the decision space - to reach consensus? We propose sequential deliberation as a natural solution to this problem. In this iterative method, successive pairs of agents bargain over the decision space using the previous decision as a disagreement alternative. We describe the general method and analyze the quality of its outcome when the space of preferences define a median graph. We show that sequential deliberation finds a 1.208- approximation to the optimal social cost on such graphs, coming very close to this value with only a small constant number of agents sampled from the population. We also show lower bounds on simpler classes of mechanisms to justify our design choices. We further show that sequential deliberation is ex-post Pareto efficient and has truthful reporting as an equilibrium of the induced extensive form game. We finally show that for general metric spaces, the second moment of of the distribution of social cost of the outcomes produced by sequential deliberation is also bounded.
Based on an approach with bounded virtual exosystems, the output synchronization of heterogeneous linear networks is addressed. Instead of exact synchronization with zero steady-state error, we introduce optimal stationary synchronization. Then, each agent is able to balance its synchronization error versus its consumed input-energy. In this regard, a local, time-invariant optimal control is obtained by means of recently developed methods for infinite-time linear quadratic tracking. Extending these methods, we present a parametric optimization problem to obtain an optimal control that guarantees given error-bounds on the synchronization error. In addition, we are able to relax standard assumptions and our approach applies even for agents which usually cannot synchronize exactly, e.g. agents with less inputs than outputs. All these aspects are demonstrated by an illustrative simulation example for which a detailed analysis is provided.
In decentralized optimization, nodes cooperate to minimize an overall objective function that is the sum (or average) of per-node private objective functions. Algorithms interleave local computations with communication among all or a subset of the nodes. Motivated by a variety of applications---distributed estimation in sensor networks, fitting models to massive data sets, and distributed control of multi-robot systems, to name a few---significant advances have been made towards the development of robust, practical algorithms with theoretical performance guarantees. This paper presents an overview of recent work in this area. In general, rates of convergence depend not only on the number of nodes involved and the desired level of accuracy, but also on the structure and nature of the network over which nodes communicate (e.g., whether links are directed or undirected, static or time-varying). We survey the state-of-the-art algorithms and their analyses tailored to these different scenarios, highlighting the role of the network topology.
In the smart grid, smart meters, and numerous control and monitoring applications employ bidirectional wireless communication, where security is a critical issue. In key management based encryption method for the smart grid, the Trusted Third Party (TTP), and links between the smart meter and the third party are assumed to be fully trusted and reliable. However, in wired/wireless medium, a man-in-middle may want to interfere, monitor and control the network, thus exposing its vulnerability. Acknowledging this, in this paper, we propose a novel two level encryption method based on two partially trusted simple servers (constitutes the TTP) which implement this method without increasing packet overhead. One server is responsible for data encryption between the meter and control center/central database, and the other server manages the random sequence of data transmission. Numerical calculation shows that the number of iterations required to decode a message is large which is quite impractical. Furthermore, we introduce One-class support vector machine (machine learning) algorithm for node-to-node authentication utilizing the location information and the data transmission history (node identity, packet size and frequency of transmission). This secures data communication privacy without increasing the complexity of the conventional key management scheme.
Sep 26 2017 cs.MA
Modeling and simulation of pedestrian behavior is used in applications such as planning large buildings, disaster management, or urban planning. Realistically simulating pedestrian behavior is challenging, due to the complexity of individual behavior as well as the complexity of interactions of pedestrians with each other and with the environment. This work-in-progress paper addresses the tactical (path planning) and the operational level (movement control) of pedestrian simulation from the perspective of multiagent-based modeling. We propose (1) an novel extension of the JPS routing algorithm for tactical planning, and (2) an architecture how path planning can be integrated with a social-force based movement control. The architecture is inspired by layered architectures for robot planning and control. We validate correctness and efficiency of our approach through simulation runs.
As the connectivity of consumer devices is rapidly growing and cloud computing technologies are becoming more widespread, cloud-aided techniques for parameter estimation can be designed to exploit the theoretically unlimited storage memory and computational power of the cloud, while relying on information provided by multiple sources. With the ultimate goal of developing monitoring and diagnostic strategies, this report focuses on the design of a Recursive Least-Squares (RLS) based estimator for identification over a group of devices connected to the cloud. The proposed approach, that relies on Node-to-Cloud-to-Node (N2C2N) transmissions, is designed so that: (i) estimates of the unknown parameters are computed locally and (ii) the local estimates are refined on the cloud. The proposed approach requires minimal changes to local (pre-existing) RLS estimators.
This paper considers a discrete-time opinion dynamics model in which each individual's susceptibility to being influenced by others is dependent on her current opinion. We assume that the social network has time-varying topology and that the opinions are scalars on a continuous interval. We first propose a general opinion dynamics model based on the DeGroot model, with a general function to describe the functional dependence of each individual's susceptibility on her own opinion, and show that this general model is analogous to the Friedkin-Johnsen model, which assumes a constant susceptibility for each individual. We then consider two specific functions in which the individual's susceptibility depends on the \emphpolarity of her opinion, and provide motivating social examples. First, we consider stubborn positives, who have reduced susceptibility if their opinions are at one end of the interval and increased susceptibility if their opinions are at the opposite end. A court jury is used as a motivating example. Second, we consider stubborn neutrals, who have reduced susceptibility when their opinions are in the middle of the spectrum, and our motivating examples are social networks discussing established social norms or institutionalized behavior. For each specific susceptibility model, we establish the initial and graph topology conditions in which consensus is reached, and develop necessary and sufficient conditions on the initial conditions for the final consensus value to be at either extreme of the opinion interval. Simulations are provided to show the effects of the susceptibility function when compared to the DeGroot model.
This paper addresses the problem of formation control and tracking a of desired trajectory by an Euler-Lagrange multi-agent systems. It is inspired by recent results by Qingkai et al. and adopts an event-triggered control strategy to reduce the number of communications between agents. For that purpose, to evaluate its control input, each agent maintains estimators of the states of the other agents. Communication is triggered when the discrepancy between the actual state of an agent and the corresponding estimate reaches some threshold. The impact of additive state perturbations on the formation control is studied. A condition for the convergence of the multi-agent system to a stable formation is studied. Simulations show the effectiveness of the proposed approach.
In this paper, the problem of energy trading between smart grid prosumers, who can simultaneously consume and produce energy, and a grid power company is studied. The problem is formulated as a single-leader, multiple-follower Stackelberg game between the power company and multiple prosumers. In this game, the power company acts as a leader who determines the pricing strategy that maximizes its profits, while the prosumers act as followers who react by choosing the amount of energy to buy or sell so as to optimize their current and future profits. The proposed game accounts for each prosumer's subjective decision when faced with the uncertainty of profits, induced by the random future price. In particular, the framing effect, from the framework of prospect theory (PT), is used to account for each prosumer's valuation of its gains and losses with respect to an individual utility reference point. The reference point changes between prosumers and stems from their past experience and future aspirations of profits. The followers' noncooperative game is shown to admit a unique pure-strategy Nash equilibrium (NE) under classical game theory (CGT) which is obtained using a fully distributed algorithm. The results are extended to account for the case of PT using algorithmic solutions that can achieve an NE under certain conditions. Simulation results show that the total grid load varies significantly with the prosumers' reference point and their loss-aversion level. In addition, it is shown that the power company's profits considerably decrease when it fails to account for the prosumers' subjective perceptions under PT.
We study the problem of distributed maximum computation in an open multi-agent system, where agents can leave and arrive during the execution of the algorithm. The main challenge comes from the possibility that the agent holding the largest value leaves the system, which changes the value to be computed. The algorithms must as a result be endowed with mechanisms allowing to forget outdated information. The focus is on systems in which interactions are pairwise gossips between randomly selected agents. We consider situations where leaving agents can send a last message, and situations where they cannot. For both cases, we provide algorithms able to eventually compute the maximum of the values held by agents.
We consider open multi-agent systems. Unlike the systems usually studied in the literature, here agents may join or leave while the process studied takes place. The system composition and size evolve thus with time. We focus here on systems where the interactions between agents lead to pairwise gossip averages, and where agents either arrive or are replaced at random times. These events prevent any convergence of the system. Instead, we describe the expected system behavior by showing that the evolution of scaled moments of the state can be characterized by a 2-dimensional (possibly time-varying) linear dynamical system. We apply this technique to two cases : (i) systems with fixed size where leaving agents are immediately replaced, and (ii) systems where new agents keep arriving without ever leaving, and whose size grows thus unbounded.