Behavior Shaping and Analysis in Multi-Agent Reinforcement Learning: Experiments in Combat Tactics and Survivability
The goal of Multi-Agent Reinforcement Learning (MARL) is for agents to learn tasks autonomously, without being explicitly programmed to perform them. Video games serve as an ideal environment for training MARL agents. This research uses the real-time strategy game StarCraft II (SC2) to explore how MARL agents solve their tasks, as well as various methods for shaping new agent behaviors.

Chapter One introduces MARL and the motivation for using a video game environment for research. It also defines the working hypotheses that guided this work and describes how the research was carried out. Chapter Two provides background on:

• the theory behind multi-agent reinforcement learning;
• behavioral cloning and transfer learning methods used to shape behaviors and improve agent performance;
• the SC2 environment and the targeting system of the native SC2 AI;
• how MARL agents can behave unexpectedly.

Chapter Three discusses the general approach used to train the agents. Chapter Four provides an in-depth discussion of the design and results of each of the five experiments. Chapter Five concludes the thesis with a summary of the results, a discussion of their importance, and directions for future research.

The MARL agents were trained against a variety of adversaries in the SC2 environment. Principal discoveries from the five experiments include:

• agents can improve their survivability when given appropriate rewards;
• agents develop new behaviors in response to their enemies;
• learned behaviors can be retained through transfer learning and applied against new enemies;
• agents that train alongside expert agents significantly reduce their training time.
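To illustrate the first finding, survivability-oriented reward shaping can be sketched as a per-step reward that trades off damage dealt against health lost, with a small bonus for staying alive. The function name, terms, and weights below are hypothetical, chosen for illustration only; they are not the reward function used in the experiments.

```python
def shaped_reward(damage_dealt: float,
                  health_lost: float,
                  alive: bool,
                  w_damage: float = 1.0,
                  w_health: float = 0.5,
                  survival_bonus: float = 0.1) -> float:
    """Illustrative per-agent, per-step reward combining combat
    effectiveness with survivability incentives (weights are assumptions)."""
    reward = w_damage * damage_dealt    # encourage dealing damage
    reward -= w_health * health_lost    # discourage taking damage
    if alive:
        reward += survival_bonus        # small per-step bonus for surviving
    return reward

# Example: an agent deals 10 damage, loses 4 health, and survives the step.
print(shaped_reward(10.0, 4.0, True))  # 1.0*10 - 0.5*4 + 0.1 = 8.1
```

Increasing `w_health` or `survival_bonus` relative to `w_damage` biases agents toward self-preservation over aggression, which is the general mechanism by which reward design can shape survivability.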