SafeMDP
class safemdp.SafeMDP(graph, gp, S_hat0, h, L, beta=2)
Base class for safe exploration in MDPs.
This class only provides basic functionality to compute the safely reachable and returnable sets. The actual update of the safety feature must be done in a class that inherits from SafeMDP. See safemdp.GridWorld for an example.
Parameters:
graph: networkx.DiGraph
The graph that models the MDP. Each edge has an attribute safe in its metadata, which determines the safety of the transition.
gp: GPy.core.GPRegression
A Gaussian process model that can be used to determine the safety of transitions. The exact structure depends heavily on the use case.
S_hat0: boolean array
An array that has True at the ith position if the ith node in the graph is part of the safe set.
h: float
The safety threshold.
L: float
The Lipschitz constant.
beta: float, optional
Scaling factor for the confidence interval used by the GP model.
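To make the roles of h and beta concrete, here is a minimal sketch of a pessimistic safety check. It assumes that beta scales the GP standard deviation and that a value counts as safe when its lower confidence bound stays at or above h. This page does not confirm that reading, and the helper is_safe is hypothetical, not part of safemdp.

```python
def is_safe(mean, std, h, beta=2.0):
    """Return True if the lower confidence bound clears the threshold h.

    Assumption (not taken from the safemdp docs): the confidence interval
    is mean +/- beta * std, and safety is judged on its lower end.
    """
    lower_bound = mean - beta * std
    return lower_bound >= h


# A confident prediction clears the threshold; an uncertain one does not.
confident = is_safe(mean=1.0, std=0.2, h=0.5)
uncertain = is_safe(mean=1.0, std=0.3, h=0.5)
```

Under this reading, a larger beta makes the check more conservative, because the same prediction needs a smaller standard deviation to pass.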
Methods
add_gp_observations(x_new, y_new): Add observations to the GP model.
compute_S_hat(): Compute the safely reachable set given the current safe_set.
add_gp_observations(x_new, y_new)
Add observations to the GP model.
compute_S_hat()
Compute the safely reachable set given the current safe_set.
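The idea of a safely reachable and returnable set can be illustrated with a small standalone sketch. It is not the library's implementation: it uses a plain edge list in place of a networkx.DiGraph, treats the third tuple entry as the safe edge attribute, and the function names reachable and safe_reach_and_return are hypothetical. A node is kept only if it can be reached from the initial safe set over safe edges and can also get back to that set over safe edges.

```python
from collections import deque


def reachable(adjacency, start):
    """Breadth-first search: every node reachable from `start`."""
    seen = set(start)
    queue = deque(start)
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def safe_reach_and_return(edges, safe_seed):
    """Nodes reachable from, and able to return to, `safe_seed`
    using only edges whose safe flag is True."""
    forward, backward = {}, {}
    for u, v, safe in edges:
        if safe:
            forward.setdefault(u, []).append(v)   # u -> v
            backward.setdefault(v, []).append(u)  # reversed edge
    can_reach = reachable(forward, safe_seed)
    can_return = reachable(backward, safe_seed)
    return can_reach & can_return


# (source, target, safe) triples; node 0 is the initial safe set.
EDGES = [
    (0, 1, True),
    (1, 0, True),
    (1, 2, True),
    (2, 3, False),  # unsafe: node 2 has no safe way back
    (3, 0, True),
    (0, 4, False),  # unsafe: node 4 is never safely reached
]
result = safe_reach_and_return(EDGES, {0})
```

In this example node 2 can be reached safely but has no safe path back, and node 3 could return but is never safely reached, so only nodes 0 and 1 remain.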