Multi-Agent Coordination

Publisher: John Wiley & Sons Limited. ISBN: 9781119699026.


Book Description

Discover the latest developments in multi-robot coordination techniques with this insightful and original resource.

Multi-Agent Coordination: A Reinforcement Learning Approach delivers a comprehensive, insightful, and unique treatment of the development of multi-robot coordination algorithms with minimal computational burden and reduced storage requirements compared to traditional algorithms. The accomplished academics, engineers, and authors provide readers with both a high-level introduction to, and an overview of, multi-robot coordination, as well as in-depth analyses of learning-based planning algorithms. You'll learn how to accelerate the exploration of the team-goal, along with alternative approaches to speeding up the convergence of TMAQL by identifying the preferred joint action for the team. The authors also propose novel approaches to consensus Q-learning that address the equilibrium selection problem, and a new way of evaluating the threshold value for uniting empires without imposing any significant computational overhead. Finally, the book concludes with an examination of the likely direction of future research in this rapidly developing field.

Readers will discover cutting-edge techniques for multi-agent coordination, including:

An introduction to multi-agent coordination by reinforcement learning and evolutionary algorithms, including topics like the Nash equilibrium and correlated equilibrium
Improving convergence speed of multi-agent Q-learning for cooperative task planning
Consensus Q-learning for multi-agent cooperative planning
The efficient computing of correlated equilibrium for cooperative Q-learning-based multi-agent planning
A modified imperialist competitive algorithm for multi-agent stick-carrying applications

Perfect for academics, engineers, and professionals who regularly work with multi-agent learning algorithms, Multi-Agent Coordination: A Reinforcement Learning Approach also belongs on the bookshelves of anyone with an advanced interest in machine learning and artificial intelligence as it applies to the field of cooperative or competitive robotics.
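To ground the terminology, the minimal sketch below (in Python, with hypothetical state and action counts) shows the generic joint state-action Q-table backup that cooperative multi-agent Q-learning builds on, where the team bootstraps on its best joint action. It is a standard Team-Q-style update for illustration only, not any of the algorithms proposed in the book.

```python
# Minimal, illustrative joint-action Q-table update for two cooperative agents.
# All sizes, the learning rate, and the discount factor below are hypothetical.
import numpy as np

n_states, n_actions = 16, 4        # hypothetical grid-world sizes
alpha, gamma = 0.1, 0.9            # learning rate and discount factor

# A single Q-table over the joint state-action space: Q[s, a1, a2]
Q = np.zeros((n_states, n_actions, n_actions))

def team_q_update(s, a1, a2, r, s_next):
    """Cooperative backup: bootstrap on the best joint action at the next state."""
    best_joint = Q[s_next].max()   # max over all (a1', a2') pairs
    Q[s, a1, a2] += alpha * (r + gamma * best_joint - Q[s, a1, a2])
```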

Contents

Amit Konar. Multi-Agent Coordination

Table of Contents

List of Tables

List of Illustrations

Guide

Pages

Multi‐Agent Coordination. A Reinforcement Learning Approach

Preface

Acknowledgments

About the Authors

1 Introduction: Multi‐agent Coordination by Reinforcement Learning and Evolutionary Algorithms

1.1 Introduction

1.2 Single Agent Planning

1.2.1 Terminologies Used in Single Agent Planning. Definition 1.1

Definition 1.2

Definition 1.3

Definition 1.4

Definition 1.5

Definition 1.6

Definition 1.7

Definition 1.8

Example 1.1

1.2.2 Single Agent Search‐Based Planning Algorithms

1.2.2.1 Dijkstra's Algorithm

1.2.2.2 A* (A‐star) Algorithm

Definition 1.9

Definition 1.10

Algorithm 1.1 Dijkstra's Algorithm

Algorithm 1.2 A* Algorithm

Example 1.2

1.2.2.3 D* (D‐star) Algorithm

1.2.2.4 Planning by STRIPS‐Like Language

1.2.3 Single Agent RL

1.2.3.1 Multiarmed Bandit Problem

1.2.3.2 DP and Bellman Equation

1.2.3.3 Correlation Between RL and DP

1.2.3.4 Single Agent Q‐Learning

Definition 1.11

Algorithm 1.3 Single Agent Q‐Learning

1.2.3.5 Single Agent Planning Using Q‐Learning

1.3 Multi‐agent Planning and Coordination

1.3.1 Terminologies Related to Multi‐agent Coordination

Definition 1.12

Definition 1.13

1.3.2 Classification of MAS

1.3.3 Game Theory for Multi‐agent Coordination

Definition 1.14

Definition 1.15

Definition 1.16

Example 1.3

1.3.3.1 Nash Equilibrium

Definition 1.17

Pure Strategy NE

Mixed Strategy NE

Example 1.4

Example 1.5

1.3.3.2 Correlated Equilibrium

Definition 1.18

Example 1.6

1.3.3.3 Static Game Examples

Example 1.7

Example 1.8

1.3.4 Correlation Among RL, DP, and GT

1.3.5 Classification of MARL

1.3.5.1 Cooperative MARL

Static

Independent Learner and Joint Action Learner

Algorithm 1.4 Independent Learners

Algorithm 1.5 Joint Action Learners

Frequency Maximum Q‐Value heuristic

Algorithm 1.6 FMQ heuristic

Dynamic

Team‐Q

Algorithm 1.7 Team‐Q

Distributed‐Q

Algorithm 1.8 Distributed Q‐learning

Example 1.9

Example 1.10

Optimal Adaptive Learning

Algorithm 1.9 Optimal Adaptive Learning

Sparse Cooperative Q‐learning

Sequential Q‐learning

Algorithm 1.10 Joint Action Formation in SQL

Frequency of the Maximum Reward Q‐learning

Algorithm 1.11 FMRQ for an Agent i in Repeated Games

1.3.5.2 Competitive MARL

Minimax‐Q Learning

Algorithm 1.12 Minimax Q‐learning

Heuristically Accelerated Multi‐agent Reinforcement Learning

Algorithm 1.13 HAMRL for Zero‐sum Game

1.3.5.3 Mixed MARL

Static

Belief‐Based Learning Rule

Algorithm 1.14 Bully Algorithm

Algorithm 1.15 Meta Strategy Algorithm

Algorithm 1.16 AWESOME Algorithm

Direct Policy Search Based Algorithm

Dynamic

Equilibrium Dependent

Algorithm 1.17 NQL in General‐sum Game

Algorithm 1.18 Correlated‐Q Learning

Algorithm 1.19 Asymmetric‐Q Learning for the Leader

Algorithm 1.20 Asymmetric‐Q Learning for the Follower

Algorithm 1.21 Friend‐or‐Foe Q‐Learning

Definition 1.19

Definition 1.20

Algorithm 1.22 Negotiation to Evaluate the PSNE for Agent i in a Normal‐form Game

Algorithm 1.23 Negotiation to Evaluate the Nonstrict EDSP for Agent i in a Normal‐form Game

Algorithm 1.24 Negotiation‐Q Learning for Agent i in a Markov Game

Algorithm 1.25 Equilibrium Transfer‐Based MAQL

Equilibrium Independent

Algorithm 1.26 Policy Hill‐Climbing (PHC)

Algorithm 1.27 Win or Learn Fast‐PHC (WoLF‐PHC)

Algorithm 1.28 Non‐Stationary Converging Policies

Algorithm 1.29 EXORL for Agent i

1.3.6 Coordination and Planning by MAQL

1.3.7 Performance Analysis of MAQL and MAQL‐Based Coordination

1.4 Coordination by Optimization Algorithm

1.4.1 PSO Algorithm

Algorithm 1.30 Particle Swarm Optimization (PSO)

1.4.2 Firefly Algorithm

1.4.2.1 Initialization

1.4.2.2 Attraction to Brighter Fireflies

1.4.2.3 Movement of Fireflies

1.4.3 Imperialist Competitive Algorithm

Algorithm 1.31 Traditional Firefly Algorithm (FA)

1.4.3.1 Initialization

1.4.3.2 Selection of Imperialists and Colonies

1.4.3.3 Formation of Empires

1.4.3.4 Assimilation of Colonies

1.4.3.5 Revolution

1.4.3.6 Imperialistic Competition

Total Empire Power Evaluation

Reassignment of Colonies and Removal of Empire

Union of Empires

1.4.4 Differential Evolution Algorithm

1.4.4.1 Initialization

1.4.4.2 Mutation

1.4.4.3 Recombination

1.4.4.4 Selection

1.4.5 Off‐line Optimization

1.4.6 Performance Analysis of Optimization Algorithms

1.4.6.1 Friedman Test

1.4.6.2 Iman–Davenport Test

1.5 Summary

References

2 Improve Convergence Speed of Multi‐Agent Q‐Learning for Cooperative Task Planning

2.1 Introduction

2.2 Literature Review

2.3 Preliminaries

2.3.1 Single Agent Q‐learning

2.3.2 Multi‐agent Q‐learning

Definition 2.1

Definition 2.2

Definition 2.3

Algorithm 2.1 NE/CE‐Based Multi‐agent‐Q Learning

2.4 Proposed MAQL

Definition 2.4

2.4.1 Two Useful Properties

Statute 2.1

Property 2.1

Proof:

Definition 2.5

Property 2.2

Proof

2.5 Proposed FCMQL Algorithms and Their Convergence Analysis

2.5.1 Proposed FCMQL Algorithms

Algorithm 2.2 Fast Cooperative Multi‐agent Q‐Learning (FCMQL)

2.5.2 Convergence Analysis of the Proposed FCMQL Algorithms

Theorem 2.1

Proof

2.6 FCMQL‐Based Cooperative Multi‐agent Planning

Definition 2.6

Theorem 2.2

Proof

Algorithm 2.3 FCMQL‐Based Cooperative Multi‐agent Planning

2.7 Experiments and Results

Experiment 2.1 Study of Convergence Speed

Experiment 2.2 Planning Performance

Experiment 2.3 Run‐Time Complexity

Experiment 2.4 Real‐Time Planning

2.8 Conclusions

2.9 Summary

2.A More Details on Experimental Results. 2.A.1 Additional Details of Experiment 2.1

2.A.2 Additional Details of Experiment 2.2

2.A.3 Additional Details of Experiment 2.4

References

3 Consensus Q‐Learning for Multi‐agent Cooperative Planning

3.1 Introduction

3.2 Preliminaries

3.2.1 Single Agent Q‐Learning

3.2.2 Equilibrium‐Based Multi‐agent Q‐Learning

Definition 3.1

Definition 3.2

Algorithm 3.1 Equilibrium‐Based MAQL

3.3 Consensus

Definition 3.3

Definition 3.4

Definition 3.5

3.4 Proposed CoQL and Planning

3.4.1 Consensus Q‐Learning

Theorem 3.1

Proof

Algorithm 3.2 Consensus Q‐Learning (CoQL)

3.4.2 Consensus‐Based Multi‐robot Planning

Algorithm 3.3 Consensus‐Based Planning

3.5 Experiments and Results

3.5.1 Experimental Setup

3.5.2 Experiments for CoQL

3.5.3 Experiments for Consensus‐Based Planning

3.6 Conclusions

3.7 Summary

References

4 An Efficient Computing of Correlated Equilibrium for Cooperative Q‐Learning‐Based Multi‐Robot Planning

4.1 Introduction

4.2 Single‐Agent Q‐Learning and Equilibrium‐Based MAQL

4.2.1 Single Agent Q‐Learning

4.2.2 Equilibrium‐Based MAQL

Definition 4.1

4.3 Proposed Cooperative MAQL and Planning

4.3.1 Proposed Schemes with Their Applicability

4.3.2 Immediate Rewards in Scheme‐I and ‐II

Definition 4.2

4.3.3 Scheme‐I‐Induced MAQL

Note 4.1

Note 4.2

Note 4.3

Theorem 4.1

Proof

Theorem 4.2

Proof

4.3.4 Scheme‐II‐Induced MAQL

Definition 4.3

Note 4.4

Lemma 4.1

Proof

Lemma 4.2

Proof

Lemma 4.3

Proof

Lemma 4.4

Proof

Lemma 4.5

Proof

Lemma 4.6

Proof

Theorem 4.3

Proof

Theorem 4.4

Proof

4.3.5 Algorithms for Scheme‐I and II

Algorithm 4.1 Scheme‐I and ‐II‐Induced ΩQL (ΩQL‐I and ΩQL‐II)

4.3.6 Constraint ΩQL‐I/ΩQL‐II (CΩQL‐I/CΩQL‐II)

4.3.7 Convergence

Algorithm 4.2 Constraint ΩQL‐I/ΩQL‐II (CΩQL‐I/CΩQL‐II)

Lemma 4.7

Proof

Lemma 4.8

Proof

Lemma 4.9

Proof

Theorem 4.5

Proof

Theorem 4.6

Proof

4.3.8 Multi‐agent Planning

Algorithm 4.3 Ω Multi‐agent Planning (ΩMP)

Algorithm 4.4 Constraint Ω Multi‐agent Planning (CΩMP)

4.4 Complexity Analysis

4.4.1 Complexity of CQL

4.4.1.1 Space Complexity. Learning

Planning

4.4.1.2 Time Complexity. Learning

Planning

4.4.2 Complexity of the Proposed Algorithms

4.4.2.1 Space Complexity. Learning

Planning

4.4.2.2 Time Complexity. Learning

Planning

Space‐complexity Analysis of Stick‐Carrying and Triangle‐Carrying Problem

4.4.3 Complexity Comparison

4.4.3.1 Space Complexity

4.4.3.2 Time Complexity

4.5 Simulation and Experimental Results

4.5.1 Experimental Platform

4.5.1.1 Simulation

4.5.1.2 Hardware

4.5.2 Experimental Approach

4.5.2.1 Learning Phase

4.5.2.2 Planning Phase

4.5.3 Experimental Results

Experiment 4.1 Performance Test of the Proposed Learning Algorithms

Experiment 4.2 Object (Box)‐Carrying By Scheme‐I‐Based Proposed Planning Algorithms

Experiment 4.3 Stick‐ or Triangle‐Carrying By Model‐II‐Based Proposed Planning Algorithms

4.6 Conclusion

4.7 Summary

Appendix 4.A Supporting Algorithm and Mathematical Analysis

Algorithm 4.A.1 Correlated Q‐Learning (CQL)

References

5 A Modified Imperialist Competitive Algorithm for Multi‐Robot Stick‐Carrying Application

5.1 Introduction

5.2 Problem Formulation for Multi‐Robot Stick‐Carrying

5.3 Proposed Hybrid Algorithm

5.3.1 An Overview of ICA

5.3.1.1 Initialization

5.3.1.2 Selection of Imperialists and Colonies

5.3.1.3 Formation of Empires

5.3.1.4 Assimilation of Colonies

5.3.1.5 Revolution

5.3.1.6 Imperialistic Competition

5.3.1.6.1 Total Empire Power Evaluation

5.3.1.6.2 Reassignment of Colonies and Removal of Empire

5.3.1.6.3 Union of Empires

5.4 An Overview of FA

5.4.1 Initialization

5.4.2 Attraction to Brighter Fireflies

5.4.3 Movement of Fireflies

5.5 Proposed ICFA

5.5.1 Assimilation of Colonies

5.5.1.1 Attraction to Powerful Colonies

5.5.1.2 Modification of Empire Behavior

5.5.1.3 Union of Empires

Algorithm 5.1 Imperialist Competitive Firefly Algorithm (ICFA)

5.6 Simulation Results

5.6.1 Comparative Framework

5.6.2 Parameter Settings

5.6.3 Analysis on Explorative Power of ICFA

5.6.4 Comparison of Quality of the Final Solution

5.6.5 Performance Analysis

5.7 Computer Simulation and Experiment

5.7.1 Average Total Path Deviation (ATPD)

5.7.2 Average Uncovered Target Distance (AUTD)

5.7.3 Experimental Setup in Simulation Environment

5.7.4 Experimental Results in Simulation Environment

5.7.5 Experimental Setup with Khepera Robots

5.7.6 Experimental Results with Khepera Robots

5.8 Conclusion

5.9 Summary

Appendix 5.A Additional Comparison of ICFA

References

6 Conclusions and Future Directions

6.1 Conclusions

6.2 Future Directions

Index

WILEY END USER LICENSE AGREEMENT

Excerpt from the Book

IEEE Press, 445 Hoes Lane, Piscataway, NJ 08854

.....

Unlike CQL, Chapter 4 proposes an attractive approach to adapt composite rewards of all the agents in one Q‐table in joint state–action space during learning, and subsequently, these rewards are employed to compute CE in the planning phase. Two separate models of multi‐agent Q‐learning have been proposed. If the success of only one agent is enough to make the team successful, then model‐I is employed. However, if an agent's success is contingent upon other agents and simultaneous success of the agents is required, then model‐II is employed. It is also shown that the CE obtained by the proposed algorithms and by the traditional CQL are identical. In order to restrict the exploration within the feasible joint states, constraint versions of the said algorithms are also proposed. Complexity analysis and experiments have been undertaken to validate the performance of the proposed algorithms in multi‐robot planning on both simulated and real platforms.
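For readers unfamiliar with the baseline, the sketch below shows how traditional CQL typically extracts a (utilitarian) correlated equilibrium from per-agent Q-values at a single joint state: a small linear program over the joint-action distribution with no-profitable-deviation constraints for each agent. The NumPy/SciPy formulation, the matrix shapes, and the utilitarian objective are illustrative assumptions; the chapter's point of departure is precisely to avoid maintaining such per-agent tables.

```python
# Hedged sketch: utilitarian correlated equilibrium (CE) from two agents'
# Q-values at one joint state, solved as a linear program (as in traditional
# CQL). Q1 and Q2 are illustrative payoff matrices, not the composite-reward
# tables proposed in Chapter 4.
import numpy as np
from scipy.optimize import linprog

def utilitarian_ce(Q1, Q2):
    """Return a joint-action distribution p[a1, a2] maximizing the summed payoff
    subject to both agents' CE incentive (no-deviation) constraints."""
    m, n = Q1.shape
    c = -(Q1 + Q2).ravel()                     # linprog minimizes, so negate
    rows = []
    for a1 in range(m):                        # agent 1: deviating a1 -> a1' must not pay
        for a1_dev in range(m):
            if a1_dev == a1:
                continue
            row = np.zeros((m, n))
            row[a1, :] = Q1[a1_dev, :] - Q1[a1, :]
            rows.append(row.ravel())
    for a2 in range(n):                        # agent 2: deviating a2 -> a2' must not pay
        for a2_dev in range(n):
            if a2_dev == a2:
                continue
            row = np.zeros((m, n))
            row[:, a2] = Q2[:, a2_dev] - Q2[:, a2]
            rows.append(row.ravel())
    A_ub, b_ub = np.array(rows), np.zeros(len(rows))
    A_eq, b_eq = np.ones((1, m * n)), np.array([1.0])   # probabilities sum to one
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq)
    return res.x.reshape(m, n)
```

In equilibrium-based MAQL this kind of program is solved at every visited joint state during learning, which is exactly the per-state overhead that motivates the reduced-complexity schemes described above.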

Chapter 5 hybridizes the Firefly Algorithm (FA) and the Imperialist Competitive Algorithm (ICA). This hybridization results in the Imperialist Competitive Firefly Algorithm (ICFA), which is employed to determine the time-optimal trajectory of a stick, carried by two robots, from a given starting position to a predefined goal position amidst static obstacles in a robot world map. The motion dynamics of the fireflies in FA are embedded into the sociopolitical evolution-based meta-heuristic ICA. In addition, the trade-off between exploration and exploitation is balanced by modifying the random walk strategy based on the position of the candidate solutions in the search space. The superiority of the proposed ICFA is studied with run-time and accuracy as the performance metrics. Finally, the proposed algorithm is verified on a real-time multi-robot stick-carrying problem.
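As a rough illustration of the hybridization idea, the sketch below shows an FA-style assimilation step in which a colony is drawn toward its imperialist with an attractiveness that decays with distance, plus a small random walk. The parameter names and values (beta0, gamma_abs, alpha_rw) are assumptions for illustration, not the exact ICFA update, whose random-walk strategy additionally adapts to the position of the candidate solutions.

```python
# Hedged sketch of a firefly-style assimilation step inside an ICA-like loop:
# a colony (candidate solution vector) moves toward its imperialist with a
# distance-dependent attraction plus a random walk. Parameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def assimilate(colony, imperialist, beta0=1.0, gamma_abs=1.0, alpha_rw=0.2):
    """Move one colony (NumPy array) toward its imperialist using FA-style attraction."""
    r2 = np.sum((imperialist - colony) ** 2)            # squared distance
    beta = beta0 * np.exp(-gamma_abs * r2)               # attractiveness decays with distance
    step = beta * (imperialist - colony)                 # attraction toward the imperialist
    walk = alpha_rw * (rng.random(colony.shape) - 0.5)   # small random-walk term
    return colony + step + walk
```

The distance-dependent attraction lets nearby solutions exploit while distant ones keep exploring, which is the exploration-exploitation balance the chapter tunes through its modified random walk.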

.....
