about

I am a Postdoctoral Researcher at KU Leuven (🇧🇪), working within the LAGoM-NLP group led by Miryam de Lhoneux. My research concerns Multilingual NLP, where I investigate methods for making language representation fair and equitable. In particular, I am interested in how principled data selection and language sampling can be employed as part of the pre-training process, so that models can learn from better data rather than simply more of it. I am also interested in how typological bias is reflected in commonly employed evaluation datasets and metrics, and how this affects our appraisal of multilingual models.

Prior to joining KU Leuven, I earned my PhD in computational linguistics at Uppsala University (🇸🇪), supervised by Joakim Nivre and Anders Søgaard. My dissertation focused on the syntactic knowledge encoded by language models, investigated through the lens of dependency parsing (available here). Before my PhD, I graduated from the EM-LCT program, where I spent my first year at the University of Groningen (🇳🇱) and my second year at the University of the Basque Country (🇪🇸). I grew up in Western Massachusetts (🇺🇸).


publications


fall in wm