DIP Colloquium

Speaker: Michael Franke (Tübingen)
Title: On causal & social world models in language models
Date:
Time: 16:00 - 17:30
Location: SP107 F1.15 (ILLC Seminar Room)

Abstract: Recent work has suggested that “world models” emerge in large language models as a side effect of the compression necessary for their high performance on the language modeling task. While robust evidence for such world models exists for simple (e.g., finite-state) problems, more needs to be said about at least two issues: (i) should we expect LMs to also evolve (veridical / human-like) world models for human-relevant but highly abstract latent variables such as causal information or social cognition, and (ii) what are good methods or in-principle arguments to address (i) in the first place? In this talk, I argue that formal results from statistical machine learning and causal discovery already imply clear limits on emergent world models’ veridicality and human-likeness. However, they also imply that abstract function-sharing across tasks can be a reason for efficient compression to recover human-relevant concepts. This function-sharing hypothesis could be used, in practice, to guide methodology. To demonstrate this approach, I present a case study pitting pragmatic reasoning and Theory-of-Mind reasoning in language models against each other. Behavioral and interventionist experiments provide evidence for function-sharing between these tasks, suggesting an abstract layer of shared representations for both.