pyotc.otc_backend.policy_iteration package

Subpackages

Submodules

pyotc.otc_backend.policy_iteration.utils module

pyotc.otc_backend.policy_iteration.utils.get_best_stat_dist(P, c)

Given a transition matrix P and a cost vector c, computes by linear programming the stationary distribution of P that minimizes the expected cost.

Parameters:

P (np.ndarray) – Transition matrix.
c (np.ndarray) – Cost vector.

Returns:

stat_dist (np.ndarray) – Best stationary distribution.
exp_cost (float) – Corresponding expected cost.

Return type:

tuple (np.ndarray, float)
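
The linear program described above can be sketched as follows. This is a minimal illustration, not the package's implementation: the helper name best_stat_dist_sketch and the example matrix are hypothetical, and the solver choice (scipy.optimize.linprog with the HiGHS backend) is an assumption. The program minimizes c · pi subject to pi P = pi, sum(pi) = 1 and pi >= 0.

```python
import numpy as np
from scipy.optimize import linprog


def best_stat_dist_sketch(P, c):
    """Hypothetical helper: minimize c @ pi over stationary distributions of P."""
    n = P.shape[0]
    # Stationarity pi @ P = pi, written as (P.T - I) @ pi = 0,
    # stacked with the normalization constraint sum(pi) = 1.
    A_eq = np.vstack([P.T - np.eye(n), np.ones((1, n))])
    b_eq = np.concatenate([np.zeros(n), [1.0]])
    res = linprog(c, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * n, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x, float(res.fun)


# Reducible chain with two closed classes: {0} and {1, 2}.
# Each class carries its own stationary distribution, so the choice matters.
P = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.5, 0.5],
              [0.0, 0.5, 0.5]])
c = np.array([3.0, 1.0, 2.0])

stat_dist, exp_cost = best_stat_dist_sketch(P, c)
print(stat_dist, exp_cost)
```

In this example the class {1, 2} has average cost 1.5 while state 0 costs 3, so the minimizer places all of its mass on states 1 and 2.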

pyotc.otc_backend.policy_iteration.utils.get_stat_dist(P, method='best', c=None)

Computes the stationary distribution of a Markov chain given its transition matrix P.

Supports multiple methods:

‘best’: Solves a linear program that minimizes cost under stationarity constraints.
‘eigen’: Solves for the stationary distribution using the eigenvalue method.
‘iterative’: Uses power iteration for large or sparse matrices.

Parameters:

P (np.ndarray) – Transition matrix of the Markov chain, shape (n, n).
method (str) – Method used to compute the stationary distribution. One of ‘eigen’, ‘iterative’, or ‘best’. Defaults to ‘best’.
c (np.ndarray, optional) – Cost vector of shape (n,), used only when method='best'.

Returns:

pi (np.ndarray) – Stationary distribution vector of shape (n,), summing to 1.

Return type:

np.ndarray

Raises:

ValueError – If method is ‘best’ but cost vector c is not provided, or if an invalid method name is given.
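
To make the 'eigen' and 'iterative' options concrete, the sketch below shows one common way to compute a stationary distribution with each technique using NumPy only. The function names stat_dist_eigen and stat_dist_power, the tolerance, and the iteration cap are hypothetical and are not taken from the package. The sketch assumes an irreducible chain, and an aperiodic one for power iteration.

```python
import numpy as np


def stat_dist_eigen(P):
    """Left eigenvector of P for the eigenvalue closest to 1, scaled to sum to 1."""
    vals, vecs = np.linalg.eig(P.T)
    k = np.argmin(np.abs(vals - 1.0))
    v = np.real(vecs[:, k])
    return v / v.sum()


def stat_dist_power(P, tol=1e-12, max_iter=10_000):
    """Power iteration: repeat pi <- pi @ P, starting from the uniform distribution."""
    pi = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(max_iter):
        nxt = pi @ P
        # Stop once successive iterates agree in L1 norm.
        if np.abs(nxt - pi).sum() < tol:
            return nxt
        pi = nxt
    return pi


P = np.array([[0.9, 0.1],
              [0.5, 0.5]])

pi_eigen = stat_dist_eigen(P)
pi_power = stat_dist_power(P)
print(pi_eigen, pi_power)
```

For this two-state chain the balance condition 0.1 · pi[0] = 0.5 · pi[1] gives pi = (5/6, 1/6), and both approaches should agree with it up to numerical tolerance.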
