Both Research Units will contribute to the organisation of the project. The coordinator of each WP will ensure that the research activities carried out by the WP participants are integrated, monitoring their progress and suggesting forms of collaboration among the units to achieve the project goals. A kick-off meeting will be organised in Naples at the beginning of the project (M1). Coordination meetings between the partners will then be held every month by videoconference, with one meeting every 6 months held in person, alternating between Naples and Bologna. A final meeting to announce and disseminate the project results will be organised in Naples at M24.
Workplan
This WP will carry out a careful review of existing methods for the control of multi-agent systems based on machine learning techniques, and will define appropriate metrics to assess their performance in terms of both control and data efficiency. The aim is to identify the key strategies presented in the literature, which will be used in the rest of the project as benchmarks for the proposed strategies. BO will implement and evaluate all the strategies numerically through simulations, while NA will focus on all the control-related aspects, so that the expertise of both teams is merged to achieve a full characterisation and classification of the available strategies.
This WP will be devoted to deriving strategies to control single systems by combining deep learning with control laws derived from partial knowledge of the system to be controlled. Specifically, NA will focus on extending the recently developed CTQL strategy, which combines Q-learning with state feedback control and has been used to control specific applications from OpenAI Gym (e.g., inverted pendulum stabilisation). Using the expertise on deep learning and reinforcement learning available at BO, a novel CTDL (control-tutored deep learning) algorithm will be synthesised at NA and validated at BO on a set of testbed problems selected from OpenAI Gym.
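To illustrate the general idea of pairing Q-learning with a state-feedback tutor, the following is a minimal sketch under stated assumptions; it is not the CTQL or CTDL algorithm of the project. The toy plant (a saturated integrator on an integer grid), the tutor law u = -sign(x), the reward, the rule of following the tutor with a fixed probability `p_tutor`, and all function names and parameter values are hypothetical choices made only for illustration.

```python
import random

STATES = range(-5, 6)   # hypothetical discretised state space
ACTIONS = (-1, 0, 1)    # hypothetical discrete control inputs


def step(x, u):
    """Toy plant (assumption): x+ = clip(x + u); reward penalises distance from 0."""
    x_next = max(-5, min(5, x + u))
    return x_next, -abs(x_next)


def tutor(x):
    """State-feedback tutor u = -sign(x), standing in for a law built on a partial model."""
    return (x < 0) - (x > 0)


def greedy(q, x):
    """Action with the highest learned value in state x."""
    return max(ACTIONS, key=lambda u: q[(x, u)])


def train(episodes=1000, horizon=30, alpha=0.5, gamma=0.9,
          p_tutor=0.3, p_random=0.1, seed=0):
    """Tabular Q-learning in which part of the exploration is delegated to the tutor."""
    rng = random.Random(seed)
    q = {(x, u): 0.0 for x in STATES for u in ACTIONS}
    for _ in range(episodes):
        x = rng.choice(list(STATES))
        for _ in range(horizon):
            r = rng.random()
            if r < p_tutor:
                u = tutor(x)              # follow the control tutor's suggestion
            elif r < p_tutor + p_random:
                u = rng.choice(ACTIONS)   # plain random exploration
            else:
                u = greedy(q, x)          # exploit the current Q-table
            x_next, reward = step(x, u)
            # Standard Q-learning update, applied whoever chose the action.
            target = reward + gamma * max(q[(x_next, a)] for a in ACTIONS)
            q[(x, u)] += alpha * (target - q[(x, u)])
            x = x_next
    return q


def rollout(q, x0, steps=10):
    """Closed-loop trajectory under the learned greedy policy (tutor switched off)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(step(xs[-1], greedy(q, xs[-1]))[0])
    return xs
```

Calling `rollout(train(), 5)` returns the trajectory produced by the learned policy alone. In this sketch the tutor only biases which actions are tried during learning, and the Q-table is updated in the usual way regardless of who selected the action.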
This WP will be focussed on the synthesis of multi-agent reinforcement learning (MARL) strategies informed by control tutors, as detailed in [O2]. The work on the control of complex multi-agent systems at NA will be complemented by research on MARL at BO to synthesise strategies combining model-based controllers with MARL. The numerical implementation will be carried out at BO, with input from NA on the implementation of the control strategies. All strategies will then be tested and validated numerically on the two testbed scenarios (synchronisation and herding) described in [O3], so as to evaluate their control and learning performance and to compare it with that of the existing alternatives reviewed as part of WP1.
This WP will be focussed on exploiting the results of WP3 to engineer multi-agent autonomous systems that can be deployed for SAR applications in natural disaster zones. Given the time scale of the project, we will study a variety of emergency scenarios using simulations. We will develop simulation and visualisation environments using tools such as Unity. These simulations will be data-driven and will involve experts from civil protection and emergency services. In particular, we will consider scenarios with constraints imposed by the nature of the disaster (e.g., volcanic eruptions, earthquakes, and environmental disasters), the morphology of the territory and the availability of rescue crews. The units at NA and BO will also benefit from collaborations with groups in Sydney and New York for the numerical and experimental investigation of the proposed solutions. We believe that the results of the experiments will provide a powerful proof of concept for these technologies and demonstrate their general applicability for future adoption and deployment, not only for SAR.
This WP will ensure that all the dissemination and exploitation activities described in the project are carried out as planned by both research units.