pyotc.otc_backend.policy_iteration package

Subpackages

Submodules

pyotc.otc_backend.policy_iteration.utils module

pyotc.otc_backend.policy_iteration.utils.get_best_stat_dist(P, c)

Given a transition matrix P and a cost vector c, this function uses linear programming to compute the stationary distribution of P with the lowest expected cost.

Parameters:
  • P (np.ndarray) – Transition matrix.

  • c (np.ndarray) – Cost vector.

Returns:
  • stat_dist (np.ndarray) – Best stationary distribution.

  • exp_cost (float) – Corresponding expected cost.
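
Example:

The linear program described above can be sketched as follows. This is an illustration under stated assumptions, not the package's actual implementation: the helper name best_stat_dist_sketch is hypothetical, and the use of scipy.optimize.linprog with the HiGHS solver is an assumption. It minimizes c · pi subject to pi P = pi, sum(pi) = 1, and pi >= 0.

```python
import numpy as np
from scipy.optimize import linprog


def best_stat_dist_sketch(P, c):
    """Hypothetical helper: cheapest stationary distribution of P under cost c."""
    P = np.asarray(P, dtype=float)
    c = np.asarray(c, dtype=float).ravel()
    n = P.shape[0]

    # Stationarity written as (P^T - I) pi = 0, plus one row enforcing sum(pi) = 1.
    A_eq = np.vstack([P.T - np.eye(n), np.ones((1, n))])
    b_eq = np.concatenate([np.zeros(n), [1.0]])

    # Non-negativity of pi is expressed through the variable bounds.
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * n, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x, float(res.fun)


# A chain with two absorbing states (0 and 1) and one transient state (2).
# Every stationary distribution is a mixture of the two absorbing states,
# so the minimizer puts all mass on the cheaper one (state 1).
P = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.5, 0.5, 0.0]])
c = np.array([3.0, 1.0, 2.0])
stat_dist, exp_cost = best_stat_dist_sketch(P, c)
```

The choice among stationary distributions only matters when the chain has more than one, as in this reducible example; for an irreducible chain the stationary distribution is unique and the cost vector does not change the result.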

pyotc.otc_backend.policy_iteration.utils.get_stat_dist(P, method='best', c=None)

Computes the stationary distribution of a Markov chain given its transition matrix P.

Supports multiple methods:
  • ‘best’: Solves a linear program that minimizes cost under stationarity constraints.

  • ‘eigen’: Solves for the stationary distribution using the eigenvalue method.

  • ‘iterative’: Uses power iteration for large or sparse matrices.

Parameters:
  • P (np.ndarray) – Transition matrix of the Markov chain, shape (n, n).

  • method (str) – Method used to compute the stationary distribution. One of ‘eigen’, ‘iterative’, or ‘best’. Defaults to ‘best’.

  • c (np.ndarray, optional) – Cost vector of shape (n,) used only when method='best'.

Returns:

pi (np.ndarray) – Stationary distribution vector of shape (n,), summing to 1.

Raises:

ValueError – If method is ‘best’ but cost vector c is not provided, or if an invalid method name is given.
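
Example:

The 'eigen' and 'iterative' methods listed above can be sketched with NumPy alone. The function names stat_dist_eigen and stat_dist_power, and the tol and max_iter parameters, are hypothetical and chosen for illustration; the package's own code may differ. Both sketches assume the chain has a unique stationary distribution, and the power iteration additionally assumes the chain is aperiodic so that the iterates converge.

```python
import numpy as np


def stat_dist_eigen(P):
    """Hypothetical helper: left eigenvector of P for the eigenvalue nearest 1."""
    P = np.asarray(P, dtype=float)
    # Left eigenvectors of P are right eigenvectors of P^T.
    eigvals, eigvecs = np.linalg.eig(P.T)
    k = np.argmin(np.abs(eigvals - 1.0))
    pi = np.real(eigvecs[:, k])
    # Dividing by the sum fixes both the scale and the sign of the eigenvector.
    return pi / pi.sum()


def stat_dist_power(P, tol=1e-12, max_iter=10_000):
    """Hypothetical helper: repeat pi <- pi P from the uniform distribution."""
    P = np.asarray(P, dtype=float)
    pi = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(max_iter):
        nxt = pi @ P
        # Stop once successive iterates agree in L1 norm.
        if np.abs(nxt - pi).sum() < tol:
            return nxt
        pi = nxt
    return pi


# Two-state chain whose stationary distribution is (5/6, 1/6):
# balance requires 0.1 * pi[0] = 0.5 * pi[1].
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])
pi_eigen = stat_dist_eigen(P)
pi_power = stat_dist_power(P)
```

The eigenvalue approach needs a full dense decomposition, while power iteration only needs repeated vector-matrix products, which is why the latter suits large or sparse matrices.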

Module contents