Hi all (and sorry if 'Issues' is not the appropriate place for this question),
For some IRL work, I need the transition matrix of these games. Using the RAM version, I get at most 12825618 possible states. However, since a game is fully determined by its initial state and the subsequent actions, can't I just sample all the initial states and build the MDP from them? If so, how do I sample all the initial states? Or do you have any advice on a simpler and more efficient way to achieve my goal (I doubt the approach above is efficient)?
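To make the idea concrete, here is a minimal sketch of what I have in mind. It is not tied to any real emulator API: `step` is a hypothetical deterministic transition function, `enumerate_transitions` is a name I made up, and the toy counter below only stands in for a game. With a real game, `step` would have to restore a saved state, apply the action, and read back the RAM, which I have not verified is feasible at this scale.

```python
from collections import deque


def enumerate_transitions(initial_states, actions, step):
    """Breadth-first enumeration of every state reachable from initial_states.

    `step(state, action)` is a hypothetical deterministic transition function
    that returns the next state. States must be hashable.
    """
    transitions = {}               # (state, action) -> next_state
    seen = set(initial_states)     # every state discovered so far
    queue = deque(initial_states)  # states whose successors are not yet expanded
    while queue:
        state = queue.popleft()
        for action in actions:
            next_state = step(state, action)
            transitions[(state, action)] = next_state
            if next_state not in seen:
                seen.add(next_state)
                queue.append(next_state)
    return seen, transitions


def toy_step(state, action):
    """Toy stand-in for a game: a counter clamped to the range 0..4."""
    return min(4, max(0, state + action))


states, transitions = enumerate_transitions([0], [-1, +1], toy_step)
print(len(states), len(transitions))
```

Because the dynamics are assumed deterministic, the resulting dictionary is the transition matrix in sparse form: each (state, action) pair maps to exactly one next state with probability 1.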
Thanks in advance for your response.