Author: George Vouros

CARE aims to devise inherently interpretable safe reinforcement learning (RL) methods that provide transparency with regard to operational constraints.

The main activities include:

– Study symbolic representations for constrained RL policy models.

– Design, implement and validate an interpretable safe RL-based solution in constrained settings.

CARE is an ENFIELD Exchange Scheme project on Human-Centric AI, carried out in collaboration with Eindhoven University of Technology (TU/e) and the Institute for Systems and Computer Engineering, Technology and Science (INESC TEC). It will deliver symbolic models for an inherently interpretable safe RL method in constrained settings.

Duration

2024-2025

Innovations foreseen:

– Extend safe RL methods to exploit symbolic models, so that agents safely adhere to operational constraints that are either learnt from demonstrations or made explicit in some form (depending on the data available from the use case to be decided).

– Devise an inherently interpretable safe RL method able to provide explanations that explicate, among other contextual factors, the operational constraints.
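
To make the idea of safely adhering to explicit symbolic constraints concrete, here is a minimal, illustrative Python sketch (not CARE's actual method; the `safe_action` helper and the speed-limit rule are invented for this example). It "shields" the agent's action choice by masking out any action that violates a symbolic predicate over the state–action pair:

```python
# Illustrative sketch, not the CARE method: "shielding" an RL agent's
# action choice with symbolic operational constraints, expressed as
# predicates over (state, action) pairs.

def safe_action(state, q_values, actions, constraints):
    """Return the highest-value action satisfying every symbolic constraint."""
    permitted = [a for a in actions if all(c(state, a) for c in constraints)]
    if not permitted:
        raise RuntimeError("no action satisfies the operational constraints")
    return max(permitted, key=lambda a: q_values[a])

# Toy example: a symbolic speed limit in a 1-D driving abstraction.
speed_limit = [lambda s, a: s["speed"] + a <= 4]    # rule: resulting speed <= 4
state = {"speed": 4}
q_values = {-1: 0.1, 0: 0.5, 1: 0.9}                # learner prefers accelerating
print(safe_action(state, q_values, [-1, 0, 1], speed_limit))  # -> 0: accelerating is blocked
```

Because the constraints are explicit predicates rather than weights in a network, the same objects that enforce safety can also be inspected and cited in explanations, which is the transparency CARE targets.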

Impact:

  • Train interpretable policies that are safe with respect to constraints, allowing inspection of the operational constraints learnt and of adherence to them.
  • Increase humans’ abilities to maintain control in safety critical settings by offering human-understandable explanations adhering to operational needs and constraints.
  • Increase situational awareness and system trustworthiness by offering explanations that indicate explicitly the operational constraints ensuring safety of operations.
DeepHAC

The overall goal of DeepHAC is to advance Human-Agent Collaboration (HAC) by building explainable deep reinforcement learning (DRL) methods that enable agents to perform tasks in collaboration with humans, respecting human preferences, constraints and objectives, and promoting safety and efficacy in performing collaborative tasks. The approach proposed by DeepHAC relies on three main pillars:
  1. Learning collaborative policies aligned with human preferences, constraints and objectives.
  2. Making policies explainable and transparent, and
  3. Learning to act safely and effectively in safety-critical settings.
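
One standard way to realise the third pillar, learning to act safely while pursuing a task objective, is a Lagrangian primal-dual update. The Python sketch below is illustrative only (it is not DeepHAC's algorithm; the `lagrangian_step` function and the toy quadratic problem are invented for this example):

```python
# Illustrative sketch, not DeepHAC's algorithm: a Lagrangian-style
# primal-dual update, one standard way to learn policies that maximise
# reward subject to a safety-cost budget.

def lagrangian_step(theta, lam, reward_grad, cost_grad, cost, limit,
                    lr_theta=0.1, lr_lam=0.05):
    """One step on the Lagrangian L(theta, lam) = R(theta) - lam * (C(theta) - limit)."""
    theta = theta + lr_theta * (reward_grad - lam * cost_grad)  # ascend on reward
    lam = max(0.0, lam + lr_lam * (cost - limit))               # grow multiplier while unsafe
    return theta, lam

# Toy problem: reward R = -(theta - 2)^2, safety cost C = theta^2 <= 1.
# The unconstrained optimum is theta = 2; the cost budget pulls it back to theta = 1.
theta, lam = 0.0, 0.0
for _ in range(200):
    theta, lam = lagrangian_step(theta, lam,
                                 reward_grad=-2 * (theta - 2),
                                 cost_grad=2 * theta,
                                 cost=theta ** 2, limit=1.0)
print(round(theta, 2))  # settles near 1.0, the constraint boundary
```

The multiplier `lam` rises while the safety cost exceeds its budget and relaxes otherwise, so the learnt trade-off between task reward and safety is explicit in a single interpretable scalar.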

Duration

2024 – 2026

Objectives

The specific objectives of the project are the following:
  1. Develop DRL methods for effective Human-Agent Collaboration, considering human preferences, constraints, and objectives.
  2. Develop explainable DRL methods for Human-Agent Collaboration, able to align policies with human preferences, constraints, and objectives, promoting safety in executing tasks.
  3. Evaluate and validate how the proposed methods balance safety and efficiency in executing tasks in various settings.