HJ Reachability analyis is a faormal verfication method for guranteeing performance and safety properties of dynamical systems
A reach-avoid state is computed:
set of state from which system can be driven to a target
Applications:
Aircraft landing, MPC of quadrotors, path planning, real time safe motion planning.
Limitations:
Intractable as the state space dimension increase
Computation of grid representations of PDEs (Curse of dimensionality)
Backward reachable sets for safety:
Start from known unsafe states and compute backward reachable sets which the system should avoid.
HJ is applicable to nonlinear systems, and handles control and disturbance, and can create arbitrary shaped sets.
Backward Reachable Sets (BRS):
Backward reachable set: set of states such that the trajectories that start from this set can reach some given target set.
If the target consits of failure states, then the BRS contains states that are unsafe (potentially) and should be avoided.
The computation for BRS is in terms of a two-player game:
Player 1 and Player 2 being control inputs
Player 1 $(a(.))$ will try to steer the system away from the target
Player 2 $(b(.))$ will try to steer the system toward the target
Solving the dynamics system in backwards time: \(\dot{x}(s) = f(x(s), a(s), b(s)), s \in [t,0], a(s) \in \mathcal{A}, b(s) \in \mathcal{B}\)
So, the BRS set is as follows: \(\mathcal{G}(t) = \{x:\exists \gamma \in \Gamma (t), \forall a(.) \in A, \zeta (0; x, t, a(.), \gamma [a](.)) \in \mathcal(G)_0 \}\)
Where $\zeta$ represents the trajectory starting from $x$ and after $t$ ending in $G_0$.
Requires solving a “differential game”
Player 2 can only use non-anticipative strategies.
It cannot respond differently to Player 1 controls until they become different
Player 2 also has the advantage of factoring in Player 1’s choice of input at every instant $t$
“Instantaneious informational advantage”
Important to safety and robust control
Reach Avoid Problems: goal is to control the agent to reach a target set of states while simultaneously avoiding a failure set of states.
Hamiltonian encodes the instantaneous battle between two players at each point in time and space
$\nabla G(t,x)$ is called co state
Optimal solution of $a$ ($b$ can be found similarily) given the value function: \(a^*(t,x) = arg max_a min_b c(x,a,b,t) + \lambda f(x,a,b)\)
Level Set Approach:
What should cost function $J$ be?
Signed distance function from the target boundary: $g(x)$ such that the zero sublevel set is within the target set $G_0$ \(z \in \mathcal(G)_0 \Leftrightarrow g(x) \leq 0\)
Cost function: $J_t(s, a(.), b(.)) = g(x(0))$
Here $g(x(0))$ means the signed distance of state $x$ at $t=0$ from the target set.
System reaches target set under controls $a$ and $b$ if and only if $J_t(s, a(.), b(.)) \leq 0$
$b$ wants to minimize the cost, $a$ wants to maximize
BRS can be obtained as: \(\mathcal{G}(t) = \{ x: G(t,x) \leq 0 \}\)
If within the set, Player 2 has a control sequence which will drive the system to target at time 0 (worst case scenario).
If outside $x$ is outside $G(t)$ then Player has a control sequence that will keep the system of the target set, regardless of the control applied by player 2.
If target is failure states:
Player 2 is the disturbances in system, and $G(t)$ becomes the unsafe set (set of states from disturbance can drive the system to failure states despite best control efforts)
Thus, want to find safe set $\mathcal{G} (t) ^C$ and controller $a^*$ from reachability set.
Backward Reachable Tubes (BRT):
BRS: set of states from which the system can reach a target set at exactly time 0
BRT: set of states from which the system can reach a target within a duration of $\mid t \mid$.
This discount factor can be seen as the probability of the episode continuting and its converse $1-\gamma$ represents the probability of transitionaing to a terminal state.