The reinforcement learning problem can be formally solved via a Bayesian statistical framework, where a probability distribution is used to express our belief about which is the true environment model. Unfortunately, this necessitates performing planning in the space of all possible beliefs, something which is usually intractable. In addition, there is a set of many possible models to choose from, none of which may be true in reality, and which may have different complexity of simulating.
In this project, the student will develop methodologies for using available data to perform planning in uncertain environments, combining information from models at multiple levels of details and accuracy. The main problem is how much computation time to devote to each model as we get more data. It is to be expected that the more complex models will become more useful the more data we have.
Background: Probability, Bayesian inference, Good Programming Skills