Speaker: Larry Moss (Indiana University Bloomington)
Date and Time: Thursday, February 15th 2024, 16:30-18:00
Venue: ILLC seminar room F1.15 in Science Park 107 and online.
Title: Markov Decision Processes and Coinduction.
Abstract: Markov decision processes (MDPs) are automata-like objects in which an agent moves from state to state by executing actions, accruing rewards along the way. MDPs are used in many applications, including speech recognition, control, and self-driving cars. Reinforcement learning (RL) is connected to MDPs, but my talk will not get to RL.
This talk looks at one foundational result in the theory of MDPs: policy iteration. I am interested in policy iteration because the classical argument for its correctness has ‘overtones of circularity’. The overall problem in the talk is to relate these classical results to current work in theoretical computer science on coalgebra: this is what ‘coinduction’ in the title refers to. The point is (1) to extend the current work to settings involving analysis and probability, and (2) to give algebraic treatments of the classical results.
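To fix intuitions before the talk, here is a minimal sketch of value iteration on a made-up toy MDP (the states, actions, transition probabilities, rewards, and discount factor below are all invented for illustration; none of them come from the talk). Repeatedly applying the Bellman optimality operator drives the value function to its fixed point.

```python
# Hypothetical toy MDP: P[state][action] is a list of
# (probability, next_state, reward) triples. Purely illustrative.
GAMMA = 0.9  # discount factor (assumed; any value < 1 works)

P = {
    0: {'a': [(1.0, 1, 0.0)], 'b': [(1.0, 2, 1.0)]},
    1: {'a': [(0.5, 0, 2.0), (0.5, 2, 0.0)], 'b': [(1.0, 1, 0.5)]},
    2: {'a': [(1.0, 0, 0.0)], 'b': [(1.0, 2, 0.1)]},
}

def bellman_update(v):
    """One application of the Bellman optimality operator to v."""
    return {
        s: max(
            sum(p * (r + GAMMA * v[s2]) for p, s2, r in outcomes)
            for outcomes in acts.values()
        )
        for s, acts in P.items()
    }

def value_iteration(tol=1e-10):
    """Iterate the Bellman operator from the zero function until
    successive iterates differ by less than tol in the sup norm."""
    v = {s: 0.0 for s in P}
    while True:
        v_next = bellman_update(v)
        if max(abs(v_next[s] - v[s]) for s in P) < tol:
            return v_next
        v = v_next

v_star = value_iteration()
```

Because the discount factor is below 1, the Bellman operator is a contraction in the sup norm, so the iteration converges to the unique optimal value function; this contraction property is exactly where fixed-point theorems enter the classical story.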
This talk will not presuppose knowledge of MDPs; I’ll present everything that is needed. The talk also has a new fixed-point theorem which extends (slightly) the Banach Fixed Point Theorem. Time permitting, the last section will present a general theory calling on more specialized ideas from coalgebra. For that, and other related matters, one might check out the LLAMA seminar on February 14. But this talk will be self-contained.
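The classical Banach Fixed Point Theorem that the talk's new result extends can be illustrated in a few lines (this is the standard theorem only, not the talk's extension; the particular contraction below is an arbitrary example): iterating a contraction from any starting point converges to its unique fixed point.

```python
import math

def contraction(x):
    # x -> 0.5 * cos(x) has Lipschitz constant at most 0.5 < 1
    # on the reals, so the Banach Fixed Point Theorem applies.
    return 0.5 * math.cos(x)

def iterate_to_fixed_point(f, x0, tol=1e-12, max_steps=10_000):
    """Picard iteration: apply f until successive values agree to tol."""
    x = x0
    for _ in range(max_steps):
        x_next = f(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("did not converge")

fp = iterate_to_fixed_point(contraction, 3.0)
```

Starting from any other point, say -10.0, the iteration reaches the same fixed point, illustrating the uniqueness claim of the theorem.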
This is joint work with Frank Feys and Helle Hansen.