Computational Linguistics Seminar

What
The CLS is the Computational Linguistics Seminar of the University of Amsterdam. Seminars are open to all interested researchers and students of all levels from UvA and elsewhere.

Who
The seminar is organized by Jelke Bloem and Alina Leidinger.

How
To receive notifications about upcoming talks and Zoom details, please join the CLS mailing list.

When
To make sure you do not miss any talks, you can add the CLS agenda to your calendar by clicking on the link below.

Where
The CLS takes place on Tuesdays at 16:00 in room L3.36 at LAB42 in Amsterdam Science Park, or via Zoom. Other days and locations are occasionally possible; see the details for each talk. To receive the details, please subscribe to the CLS mailing list. Zoom links will be distributed via the mailing list on the day of each seminar.

Upcoming seminar

Lukas Galke, Max Planck Institute for Psycholinguistics

What makes a language easy to deep-learn?

Tuesday 16th May, 16:00. Room L3.36 at LAB42, Amsterdam Science Park, plus live streaming on Zoom.

Neural networks drive the success of natural language processing. A fundamental property of natural languages is their compositional structure, allowing us to describe new meanings systematically. However, neural networks notoriously struggle with systematic generalization and do not necessarily benefit from a compositional structure in emergent communication simulations. Here, we test how neural networks compare to humans in learning and generalizing a new language. We do this by closely replicating an artificial language learning study (conducted originally with human participants) and evaluating the memorization and generalization capabilities of deep neural networks with respect to the degree of structure in the input language. Our results show striking similarities between humans and deep neural networks: More structured linguistic input leads to more systematic generalization and better convergence between humans and neural network agents and between different neural agents. We then replicate this structure bias found in humans and our recurrent neural networks with a Transformer-based large language model (GPT-3), showing a similar benefit for structured linguistic input regarding generalization systematicity and memorization errors. These findings show that the underlying structure of languages is crucial for systematic generalization. Due to the correlation between community size and linguistic structure in natural languages, our findings underscore the challenge of automated processing of low-resource languages. Nevertheless, the similarity between humans and machines opens new avenues for language evolution research.

Past seminar

Sarenne Wallbridge, University of Edinburgh

Speech as a multi-channel system: Quantifying perceptual channel value

Tuesday 28th March, 16:00. Room L3.36 at LAB42, Amsterdam Science Park, plus live streaming on Zoom.

Speech is one of the most complex and intrinsic modalities of human communication. When we speak, we convey information through both the lexical channel of which words are said, and the non-lexical channel of how those words are spoken. The problem of representing lexical information has dominated the field of speech processing and spoken language understanding. The problem of representing non-lexical information, however, has received less focus. The non-lexical channel contains a host of information, some of which serves important communicative functions such as indicating communicative intent or marking novel information, while other features pertaining to a speaker’s environment or identity may be less relevant during communication. Understanding which components of the lexical and non-lexical channels are perceptually salient is crucial for modelling human comprehension of spoken language and could lead to more efficient methods of representing speech.

In my work, I aim to quantify the perceptual value of the lexical and non-lexical components of speech for comprehension, specifically how much they constrain expectations of upcoming communication. In this talk, I will present our investigations into quantifying the value of the lexical and non-lexical channels in spoken dialogue. I will discuss when current language models align with this aspect of perception and when they diverge, as well as how we can use them to study perception. Finally, I will conclude by discussing potential approaches for quantifying the value of lexical and non-lexical information in terms of compression and entropy reduction.