Stationary and Non-Stationary State Transitions in Markov Decision Process (MDP) - Python Automation and Machine Learning for ICs - An Online Book
Python Automation and Machine Learning for ICs http://www.globalsino.com/ICs/
In a Markov Decision Process (MDP), stationarity refers to whether the transition probabilities between states remain constant over time.

Stationary State Transitions: In a stationary scenario, the probability of reaching a given next state depends only on the current state and the chosen action, not on the time step at which the transition occurs. For instance, consider a board game in which we roll a fair six-sided die to determine how many spaces we move. The transition probabilities from one space to another are stationary because they depend only on the current space and the outcome of the die roll. The probability of moving from space S_i to space S_j is the same regardless of how many turns have passed.

Non-Stationary State Transitions: In a non-stationary scenario, the transition probabilities for the same state-action pair may change with the time step. For instance, consider an MDP modeling a stock market. The transition probabilities between different states (representing different price levels or market conditions) may change over time due to market trends, economic events, or other factors. The dynamics of the market are non-stationary, and the probability of moving from one state to another can vary over time.

Another example is weather forecasting. In an MDP modeling weather, the transition probabilities between weather states (e.g., sunny, rainy, cloudy) may be non-stationary. They could depend on the current season, time of day, or recent weather patterns. Unless such factors are included in the state itself, the same weather state leads to different next-state probabilities at different times, which makes the system non-stationary.
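The distinction can be illustrated with a short Python sketch. Everything specific in it is an assumption made for illustration and does not come from the text above: the 12-space circular board, the two seasonal weather matrices (SUMMER and WINTER), the cosine blend over a 365-step period, and the helper names. The idea is only that a stationary model returns the same transition matrix for every time step t, while a non-stationary model returns a matrix that varies with t.

```python
import math

def board_transitions(t, n_spaces=12):
    """Stationary model: P[i][j] for one fair die roll on a circular board.

    The time step t is accepted but deliberately ignored.
    The board size is a hypothetical choice.
    """
    matrix = []
    for space in range(n_spaces):
        row = [0.0] * n_spaces
        for roll in range(1, 7):
            row[(space + roll) % n_spaces] += 1.0 / 6.0
        matrix.append(row)
    return matrix

# Hypothetical weather matrices; rows and columns are sunny, cloudy, rainy.
SUMMER = [[0.8, 0.15, 0.05],
          [0.5, 0.3, 0.2],
          [0.4, 0.3, 0.3]]
WINTER = [[0.4, 0.4, 0.2],
          [0.2, 0.4, 0.4],
          [0.1, 0.3, 0.6]]

def weather_transitions(t, period=365):
    """Non-stationary model: blend of SUMMER and WINTER that depends on t.

    The cosine weighting is an assumed seasonal pattern, not a fitted model.
    A convex blend of two row-stochastic matrices is still row-stochastic.
    """
    w = 0.5 * (1.0 + math.cos(2.0 * math.pi * t / period))
    return [[w * s + (1.0 - w) * c for s, c in zip(s_row, c_row)]
            for s_row, c_row in zip(SUMMER, WINTER)]

def is_stationary(transition_fn, horizon, tol=1e-9):
    """Check whether transition_fn(t) matches transition_fn(0) for all t < horizon."""
    reference = transition_fn(0)
    for t in range(1, horizon):
        current = transition_fn(t)
        for ref_row, cur_row in zip(reference, current):
            if any(abs(a - b) > tol for a, b in zip(ref_row, cur_row)):
                return False
    return True

if __name__ == "__main__":
    print(is_stationary(board_transitions, 100))    # True
    print(is_stationary(weather_transitions, 100))  # False
```

Passing t to both functions keeps the interface identical, so the only difference between the two models is whether the returned matrix actually varies with t. A non-stationary model of this kind can be made stationary by adding the time-dependent factor (here, the day of the year) to the state.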