pyotc.otc_backend.policy_iteration package
Subpackages
- pyotc.otc_backend.policy_iteration.dense package
  - Submodules
    - pyotc.otc_backend.policy_iteration.dense.approx_tce module
    - pyotc.otc_backend.policy_iteration.dense.entropic module
    - pyotc.otc_backend.policy_iteration.dense.entropic_tci module
    - pyotc.otc_backend.policy_iteration.dense.exact module
    - pyotc.otc_backend.policy_iteration.dense.exact_tce module
    - pyotc.otc_backend.policy_iteration.dense.exact_tci_lp module
    - pyotc.otc_backend.policy_iteration.dense.exact_tci_pot module
  - Module contents
- pyotc.otc_backend.policy_iteration.sparse package
Submodules
pyotc.otc_backend.policy_iteration.utils module
- pyotc.otc_backend.policy_iteration.utils.get_best_stat_dist(P, c)[source]
Given a transition matrix P and a cost vector c, this function computes the stationary distribution that minimizes the expected cost via linear programming.
- Parameters:
P (np.ndarray) – Transition matrix.
c (np.ndarray) – Cost vector.
- Returns:
stat_dist (np.ndarray) – Best stationary distribution.
exp_cost (float) – Corresponding expected cost.
- Return type:
tuple of (np.ndarray, float)
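The linear program described above can be sketched as follows. This is a minimal illustration, not the package's implementation: the function name `best_stat_dist_sketch` is hypothetical, and the use of `scipy.optimize.linprog` with the HiGHS solver is an assumption. It minimizes `c @ pi` subject to the stationarity constraint `pi @ P = pi`, the normalization `sum(pi) = 1`, and `pi >= 0`.

```python
import numpy as np
from scipy.optimize import linprog


def best_stat_dist_sketch(P, c):
    """Hypothetical sketch: cheapest stationary distribution via an LP."""
    n = P.shape[0]
    # Stationarity pi @ P = pi is equivalent to (P.T - I) @ pi = 0.
    # The last row enforces sum(pi) = 1.
    A_eq = np.vstack([P.T - np.eye(n), np.ones((1, n))])
    b_eq = np.concatenate([np.zeros(n), [1.0]])
    res = linprog(c, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * n, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x, float(res.fun)


# A chain with two closed classes: states {0, 1} and state {2}.
P = np.array([[0.5, 0.5, 0.0],
              [0.5, 0.5, 0.0],
              [0.0, 0.0, 1.0]])
c = np.array([1.0, 2.0, 5.0])

# The LP puts all mass on the cheaper class {0, 1}.
pi, cost = best_stat_dist_sketch(P, c)
```

When the chain has several closed classes, as in this example, the stationary distribution is not unique, which is the case where minimizing over stationary distributions makes a difference.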
- pyotc.otc_backend.policy_iteration.utils.get_stat_dist(P, method='best', c=None)[source]
Computes the stationary distribution of a Markov chain given its transition matrix P.
- Supports multiple methods:
‘best’: Solves a linear program that minimizes cost under stationarity constraints.
‘eigen’: Solves for the stationary distribution using the eigenvalue method.
‘iterative’: Uses power iteration for large or sparse matrices.
- Parameters:
P (np.ndarray) – Transition matrix of the Markov chain, shape (n, n).
method (str) – Method used to compute the stationary distribution. One of ‘eigen’, ‘iterative’, or ‘best’. Defaults to ‘best’.
c (np.ndarray) – Cost vector of shape (n,). Defaults to None. Per the method descriptions above, the cost enters through the ‘best’ method.
- Returns:
pi (np.ndarray) – Stationary distribution vector of shape (n,), summing to 1.
exp_cost (float) – Expected cost under the stationary distribution.
- Return type:
tuple of (np.ndarray, float)
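For intuition about the ‘eigen’ and ‘iterative’ options, the two techniques can be sketched with plain NumPy. The helper names `stat_dist_eigen` and `stat_dist_power`, the tolerance, and the iteration cap are all hypothetical choices for illustration. The package's own code may differ.

```python
import numpy as np


def stat_dist_eigen(P):
    """Hypothetical sketch: left eigenvector of P for eigenvalue 1."""
    # Left eigenvectors of P are right eigenvectors of P.T.
    vals, vecs = np.linalg.eig(P.T)
    k = np.argmin(np.abs(vals - 1.0))   # eigenvalue closest to 1
    v = np.real(vecs[:, k])
    return v / v.sum()                  # normalize to sum to 1


def stat_dist_power(P, tol=1e-12, max_iter=10_000):
    """Hypothetical sketch: power iteration pi <- pi @ P."""
    n = P.shape[0]
    pi = np.full(n, 1.0 / n)            # start from the uniform distribution
    for _ in range(max_iter):
        nxt = pi @ P
        if np.abs(nxt - pi).sum() < tol:
            return nxt
        pi = nxt
    return pi                           # last iterate if tol was not reached


# Irreducible, aperiodic two-state chain with stationary distribution (5/6, 1/6).
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])
pi_eig = stat_dist_eigen(P)
pi_pow = stat_dist_power(P)
```

Both sketches assume an irreducible chain, so the stationary distribution is unique. Power iteration additionally assumes aperiodicity, and it only needs matrix-vector products, which is why it suits large or sparse matrices.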