Summary of existing approaches to guarantee safety in robotic systems.
Designing robots with safety in mind is crucial to prevent the robot from crashing into obstacles and harming itself or, worse, harming a human or their possessions. In most applications, safety is very much a requirement and not a feature that can be traded off against other metrics. Hence, we need modules on the robot which guarantee safety and are mathematically verifiable.
Before discussing methodologies, it is useful to establish some definitions and provide more nuanced context to the problem of safety.
Mathematically, the unsafe conditions of a robot can be represented by a set $F \subseteq X$ which the robot's state is not allowed to enter. This set is usually called the constraint set, and the corresponding constraint needs to be enforced during the planning and control stages of the robot.
But the complement of the constraint set is not necessarily safe; for example, a drone not currently in collision is still unsafe if it is flying toward an obstacle too fast to avoid it. A safe set is rather a set from which the robot can ensure safety at all future times. This motivates the methods inspired by positive invariance later on. A positively invariant set is a set from which no trajectory can escape in forward time; in simpler terms, if a trajectory starts inside the set, it can never leave it.
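To make this concrete, here is a minimal sketch for a 1D robot driving toward a wall with a bounded braking deceleration (the wall position, braking limit, and function names are illustrative choices, not part of any specific method):

```python
import numpy as np

# Minimal sketch: 1D double integrator (position p, velocity v) approaching a
# wall at p_wall, with braking limited to a_max. The failure set F is simply
# {p >= p_wall}, but the *safe* set is smaller: any state whose stopping
# distance overshoots the wall is already doomed, even though it is not in F yet.

P_WALL = 10.0   # wall position (hypothetical numbers)
A_MAX = 2.0     # maximum braking deceleration

def in_failure_set(p: float) -> bool:
    """x in F: the robot has hit the wall."""
    return p >= P_WALL

def in_safe_set(p: float, v: float) -> bool:
    """x in the (positively invariant) safe set: full braking stops before the wall."""
    if v <= 0.0:
        return not in_failure_set(p)          # moving away (or stopped) and not in collision
    stopping_distance = v**2 / (2.0 * A_MAX)  # kinematic braking distance
    return p + stopping_distance < P_WALL

# A state can be outside F yet unsafe: not yet in collision, but unavoidably headed there.
print(in_failure_set(8.0))        # False: no collision yet
print(in_safe_set(8.0, 4.0))      # False: needs 4 m to stop but only 2 m remain
print(in_safe_set(8.0, 2.0))      # True: 1 m stopping distance fits
```

The set of states from which braking succeeds is exactly the kind of positively invariant safe set that the methods discussed below try to compute or approximate.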
Stability is often discussed alongside safety as it also relies on the concept of positive invariance, but they are two different concepts and there is often a tradeoff between the two. Refer to my Lyapunov analysis piece to learn more about it!
When we think of safety, we also have to account for the uncertainties of the real world, such as sensor noise and errors in the dynamics model of the robot. These uncertainties give rise to either robust safety guarantees (if we know a bound on the uncertainty) or probabilistic safety guarantees (when the uncertainty is high and no bound is known). Ideally, we would want robust guarantees, but in certain cases they cannot be achieved, so we have to settle for probabilistic guarantees.
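Concretely, with $d$ denoting the uncertainty acting on the system, the two flavors of guarantee can be sketched as follows (the bound $\mathcal{D}$ and risk tolerance $\delta$ are notation introduced here for illustration):

\[
\underbrace{x_t(d) \notin F \quad \forall\, d \in \mathcal{D},\ \forall\, t \ge 0}_{\text{robust (bounded uncertainty)}}
\qquad \text{versus} \qquad
\underbrace{\Pr\big(x_t \notin F \ \ \forall\, t \ge 0\big) \ge 1 - \delta}_{\text{probabilistic (risk tolerance } \delta)}
\]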
(Important note: This article is primarily based on the paper by
As an aside, I think it is a useful exercise to contemplate the definition of safety in the first place. Traditionally, in robotics, safety is framed as a collision avoidance problem, but I think that is a rather naive way to frame it. There are robot tasks which require active contact with the environment (e.g., quadrupeds, inspection robots) and there are tasks where safety has a much more abstract definition (e.g., preventing spillage during cooking). Hence, we need a more generalizable way to represent safety specifications for a given task.
Even under this broader context, the framework of safe and unsafe sets still applies; the sets just have to be represented more abstractly in a learnt “latent space”. This can be achieved, for example, by training a safety classifier network on top of a learnt dynamics model (also called a world model) and running any of the methods mentioned in this article *. This allows the robot to learn the failure set and plan around it accordingly. Of course, as learning is introduced, the guarantees become probabilistic, but it is still a step forward in accounting for more complex safety specifications.
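As a rough sketch of this idea (every name below is hypothetical, and the “learnt” components are stand-ins with random weights), a planner could query a world model for predicted latent states and a safety classifier for their failure probability:

```python
import numpy as np

# Hypothetical sketch: a learnt world model predicts the next latent state, and a
# learnt classifier scores how likely that latent state is to violate an abstract
# safety specification (spillage, loss of contact, ...). Together they define a
# failure set in latent space that the planner can avoid.

rng = np.random.default_rng(0)
W_dyn = rng.normal(size=(8, 8 + 2)) * 0.1   # stand-in for learnt dynamics weights
w_clf = rng.normal(size=8)                  # stand-in for learnt classifier weights

def world_model_step(z: np.ndarray, a: np.ndarray) -> np.ndarray:
    """Predict the next latent state z' from latent state z and action a."""
    return np.tanh(W_dyn @ np.concatenate([z, a]))

def unsafe_probability(z: np.ndarray) -> float:
    """Learnt safety classifier: probability that latent state z lies in the failure set."""
    return 1.0 / (1.0 + np.exp(-(w_clf @ z)))

def plan_is_safe(z0: np.ndarray, actions: list[np.ndarray], threshold: float = 0.1) -> bool:
    """Roll the candidate action sequence through the world model and flag any
    predicted latent state whose failure probability exceeds the threshold."""
    z = z0
    for a in actions:
        z = world_model_step(z, a)
        if unsafe_probability(z) > threshold:
            return False
    return True

z0 = rng.normal(size=8)
candidate_plan = [rng.normal(size=2) for _ in range(10)]
print(plan_is_safe(z0, candidate_plan))
```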
The rest of the article tackles the safety problem in the typical low-dimensional “collision avoidance” setting, but it is good to keep in mind that “collision avoidance” can be applied in a more abstract way.
As seen previously with stability, performance metrics and safety cannot both be optimally achieved in most cases; there has to be a compromise. Usually, our lives are more important, so we make safety a constraint and, subject to that constraint, we try to maximize the performance metrics. This formulation can be written as follows:
\[
J^*(x_0) = \min_{\pi} J(x_0, \pi) \quad \text{subject to} \quad x_t \notin F \;\; \forall t \ge 0
\]
The above formulation of the safety-performance problem can be handled either by a single controller or by two separate controllers: one focusing on performance and the other correcting the first if its action would enter the unsafe zone. Using just a single controller to account for both performance and safety is possible (e.g., MPC), but it is usually limited in applicability and scalability. Separating the performance and safety problems is hypothesized to achieve faster computation times and to prevent overly restrictive control actions.
Having this separation principle also enables the concept of “Safe Learning”, where the development of safety-agnostic learnt task policies can be complemented with task-agnostic safety filters. This allows the development of safe RL approaches which can freely explore and update their policies without damaging the robot or its environment.
The safety filter can be divided into two modules: monitoring and intervention. The monitoring module continuously supervises the robot’s course of action and determines whether it is safe or not. If it is unsafe, the filter intervenes by modulating or overriding the proposed control input to enforce the safety constraint. The intervention is usually based on a fallback strategy, which might be manually designed, computed offline, or computed at runtime using an optimization-based process. Figure 1 shows a brief overview of this anatomy *.
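A minimal sketch of this anatomy might look as follows, assuming the task policy, monitor, and fallback are provided as functions (all names hypothetical):

```python
from typing import Callable
import numpy as np

# Minimal sketch of the monitor/intervene anatomy. `task_policy` is the
# performance-driven controller, `is_safe` is the monitoring module, and
# `fallback_policy` is the precomputed (or manually designed) intervention.

def safety_filter(
    x: np.ndarray,
    task_policy: Callable[[np.ndarray], np.ndarray],
    is_safe: Callable[[np.ndarray, np.ndarray], bool],
    fallback_policy: Callable[[np.ndarray], np.ndarray],
) -> np.ndarray:
    """Pass the task action through unchanged if the monitor deems it safe,
    otherwise override it with the fallback action."""
    u_task = task_policy(x)
    if is_safe(x, u_task):       # monitoring: would applying u_task keep us recoverable?
        return u_task
    return fallback_policy(x)    # intervention: override with the safety-preserving action
```

The filter itself is agnostic to how `is_safe` and `fallback_policy` are obtained; the HJ, CBF, and rollout-based methods below are essentially different ways of constructing them.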
Hamilton-Jacobi (HJ) Reachability Analysis allows the computation of a least restrictive, maximal safe set using backward induction principles from dynamic programming. HJ reachability analysis can be thought of as a pursuit-evasion game in which the controller tries to keep the system out of the failure set while an adversarial agent aims to steer the system into it. It is formulated in discrete time as follows:
\[
V(x) = \min\Big\{ \ell(x),\ \max_{u \in \mathcal{U}} \min_{d \in \mathcal{D}} V\big(f(x, u, d)\big) \Big\}, \qquad \ell(x) > 0 \iff x \notin F,
\]
where $\ell$ is a margin function that is positive exactly outside the failure set $F$, the controller $u$ maximizes to evade $F$, and the adversarial disturbance $d$ minimizes. The maximal safe set is then $\{x : V(x) > 0\}$, and the maximizing control defines the fallback policy. In practice, this results in control chatter and erratic motion near the safe set boundary due to constant switching between the fallback policy and the task policy. Furthermore, due to the computational complexity and memory requirements of solving the above equation over a state-space grid, this method is unrealistic for systems with more than six continuous state variables.
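A minimal grid-based sketch of this value iteration, for the same 1D double-integrator/wall example used earlier (grid limits, resolution, and the control discretisation are illustrative, and no disturbance is modelled, so the inner minimum over $d$ is dropped):

```python
import numpy as np

# Grid-based sketch of the discrete-time safety value iteration above, for the
# 1D double-integrator / wall example. Nearest-neighbour lookup stands in for
# proper interpolation to keep the sketch short.

P_WALL, A_MAX, DT = 10.0, 2.0, 0.1
p_grid = np.linspace(0.0, 12.0, 121)     # position grid
v_grid = np.linspace(-3.0, 5.0, 81)      # velocity grid
P, V = np.meshgrid(p_grid, v_grid, indexing="ij")

ell = P_WALL - P                         # l(x) > 0  <=>  x is outside the failure set F
value = ell.copy()                       # initialise V_0 = l

def nearest_index(grid, x):
    """Clamp to the grid and return the index of the closest grid point."""
    step = grid[1] - grid[0]
    return np.clip(np.round((x - grid[0]) / step).astype(int), 0, len(grid) - 1)

for _ in range(200):                     # fixed-point iteration of the safety Bellman backup
    best_next = np.full_like(value, -np.inf)
    for u in (-A_MAX, 0.0, A_MAX):       # maximise over a discretised control set
        p_next, v_next = P + V * DT, V + u * DT
        best_next = np.maximum(best_next, value[nearest_index(p_grid, p_next),
                                                nearest_index(v_grid, v_next)])
    new_value = np.minimum(ell, best_next)
    converged = np.max(np.abs(new_value - value)) < 1e-6
    value = new_value
    if converged:
        break

def is_in_safe_set(p, v):
    """Query the computed maximal safe set {x : V(x) > 0}."""
    return bool(value[nearest_index(p_grid, np.asarray(p)),
                      nearest_index(v_grid, np.asarray(v))] > 0.0)

print(is_in_safe_set(8.0, 4.0))          # False: braking cannot prevent reaching the wall
print(is_in_safe_set(8.0, 2.0))          # True: full braking stops short of the wall
```

Even in this toy example the value function lives on a dense grid; the grid grows exponentially with the state dimension, which is exactly the scalability limit noted above.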
Instead of trying to find the maximal safe set, settling for a more conservative estimate can relax the computational requirements, and that is where control barrier functions come in.
Control Barrier Functions (CBFs) are one way of ensuring safety, by defining a safe set that is rendered forward invariant. This condition is imposed as follows (refer to the Lyapunov article for a detailed discussion):
\[
\sup_{u \in \mathcal{U}} \Big[ L_f h(x) + L_g h(x)\, u \Big] \geq -\alpha\big(h(x)\big), \qquad \mathcal{S} = \{x : h(x) \geq 0\},
\]
for control-affine dynamics $\dot{x} = f(x) + g(x)u$, where $h$ is the barrier function defining the safe set $\mathcal{S}$ and $\alpha$ is an extended class-$\mathcal{K}$ function. These can be integrated with convex quadratic programs (jointly called the CBF-QP) to create a safety filter. This filter has been used in many applications, with various flavors that deal with discrete-time formulations, perceptual uncertainty, etc. Even with such attractive properties, CBFs are challenging to implement because finding a valid CBF for a system is often difficult and non-trivial. Although recent advancements in data-driven methods for finding CBFs have shown promise, they still suffer from scalability issues, especially in high-dimensional systems.
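A minimal sketch of a CBF-QP filter for a 2D single integrator avoiding a circular obstacle (the obstacle, gain, and barrier choice are illustrative; with a single linear constraint the QP reduces to a closed-form projection, so no dedicated QP solver is needed here):

```python
import numpy as np

# Sketch of a CBF-QP filter: a 2D single integrator x_dot = u avoiding a circular
# obstacle. The barrier h(x) = ||x - x_obs||^2 - r^2 is >= 0 outside the obstacle,
# and the filter solves
#     min_u ||u - u_task||^2   s.t.   grad_h(x)^T u >= -gamma * h(x).

X_OBS, R, GAMMA = np.array([2.0, 0.0]), 1.0, 1.0

def cbf_qp_filter(x: np.ndarray, u_task: np.ndarray) -> np.ndarray:
    h = np.dot(x - X_OBS, x - X_OBS) - R**2      # barrier value
    grad_h = 2.0 * (x - X_OBS)                   # barrier gradient
    a, b = grad_h, -GAMMA * h                    # CBF condition as a^T u >= b
    if a @ u_task >= b:
        return u_task                            # task action already satisfies the condition
    # Otherwise project u_task onto the halfspace {u : a^T u >= b}
    return u_task + (b - a @ u_task) / (a @ a) * a

x = np.array([0.5, 0.1])                         # robot approaching the obstacle
u_task = np.array([1.0, 0.0])                    # task controller pushes straight at it
print(cbf_qp_filter(x, u_task))                  # filtered action slows the approach so the condition holds
```

With multiple constraints or actuation limits, the projection no longer has a closed form and an off-the-shelf QP solver would be used instead.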
To address the scalability issues of HJ reachability and CBFs, rollout-based filters are generally employed. These methods verify safety at runtime by simulating the dynamics (“rollouts”) into the future or by solving a trajectory optimization problem.
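A minimal sketch of such a rollout-based filter, again for the 1D double-integrator/wall example (the braking fallback and horizon are illustrative choices):

```python
import numpy as np

# Rollout-based filter sketch: instead of a precomputed value function, safety is
# checked at runtime by simulating a candidate fallback (full braking) from the
# state reached after applying the task action.

P_WALL, A_MAX, DT, HORIZON = 10.0, 2.0, 0.1, 100

def step(p: float, v: float, u: float) -> tuple[float, float]:
    """One Euler step of the double integrator."""
    return p + v * DT, v + u * DT

def braking_rollout_is_safe(p: float, v: float) -> bool:
    """Simulate full braking and check that the failure set is never entered."""
    for _ in range(HORIZON):
        if p >= P_WALL:
            return False
        if v <= 0.0:
            return True                       # stopped (or reversing) before the wall
        p, v = step(p, v, -A_MAX)
    return False                              # could not certify safety within the horizon

def rollout_filter(p: float, v: float, u_task: float) -> float:
    """Apply u_task only if a braking fallback from the resulting state is still safe."""
    p_next, v_next = step(p, v, u_task)
    if braking_rollout_is_safe(p_next, v_next):
        return u_task
    return -A_MAX                             # intervene: switch to the fallback (brake)

print(rollout_filter(8.0, 2.0, u_task=1.0))   # accelerating is still certifiably safe -> 1.0
print(rollout_filter(8.0, 3.5, u_task=1.0))   # accelerating would be unrecoverable -> brake (-2.0)
```

The tradeoff is that safety is only certified over the simulated horizon and relative to the chosen fallback, so the resulting filter is more conservative than the maximal safe set computed by HJ reachability.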