Modelling complex systems for decision making, using multi-agent reinforcement learning (MARL) methods to compute agents’ joint policies for action in critical domains, is both a priority and a challenge here. Specifically, we aim to address large-scale, complex settings in which agents have conflicting preferences but need to act jointly and in coordination to resolve common problems.
For instance, agents representing flights need to decide on their own delays with respect to their own preferences, having no information about others’ payoffs, preferences and constraints, while planning to execute their trajectories jointly with others and adhering to operational constraints. These problems can also be considered as Markov games in which interacting agents need to reach an equilibrium. What makes the problem more interesting is the dynamic setting in which agents operate, which is also due to the unforeseen, emergent effects of their decisions on the whole system.
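To make the flight-delay example concrete, the following is a minimal sketch and not the method proposed here. It assumes a stateless simplification of the Markov game and independent Q-learners in which each flight observes only its own reward. The delay options, slot capacity, overload penalty, cost weights and function names are all hypothetical values chosen for illustration.

```python
import random
from collections import Counter

# All constants below are hypothetical, chosen only for illustration.
DELAYS = (0, 1, 2)        # delay options in time slots (value == index into Q-row)
CAPACITY = 2              # assumed maximum number of flights per time slot
OVERLOAD_PENALTY = 5.0    # assumed penalty per excess flight in a slot


def rewards(scheduled, delay_cost, actions):
    """Return one reward per flight.

    Each flight pays its private delay cost, plus a penalty when the slot
    it ends up in exceeds CAPACITY (the operational constraint).
    """
    slots = [s + a for s, a in zip(scheduled, actions)]
    load = Counter(slots)
    return [
        -delay_cost[i] * actions[i]
        - OVERLOAD_PENALTY * max(0, load[slots[i]] - CAPACITY)
        for i in range(len(actions))
    ]


def train(scheduled, delay_cost, episodes=2000, alpha=0.1, epsilon=0.1, seed=0):
    """Independent epsilon-greedy Q-learning; returns each flight's greedy delay.

    No agent sees the others' costs, actions or rewards; coordination can
    only emerge through the congestion term in each agent's own reward.
    """
    rng = random.Random(seed)
    n = len(scheduled)
    q = [[0.0 for _ in DELAYS] for _ in range(n)]
    for _ in range(episodes):
        actions = []
        for i in range(n):
            if rng.random() < epsilon:
                actions.append(rng.choice(DELAYS))                   # explore
            else:
                actions.append(max(DELAYS, key=lambda a: q[i][a]))   # exploit
        r = rewards(scheduled, delay_cost, actions)
        for i, a in enumerate(actions):
            q[i][a] += alpha * (r[i] - q[i][a])   # stateless Q-update
    return [max(DELAYS, key=lambda a: q[i][a]) for i in range(n)]


if __name__ == "__main__":
    # Three flights scheduled for the same slot, with different private costs.
    print(train(scheduled=[0, 0, 0], delay_cost=[1.0, 2.0, 3.0]))
```

Independent learners are used here only because they match the no-shared-information assumption in the simplest way. They offer no general guarantee of reaching an equilibrium, and the sketch omits the state dynamics and emergent effects that make the full setting hard.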
A further challenge concerns the level of automation that the system can accommodate, so as to address issues of transparency and trust in decision making.