about

I am a Postdoctoral Researcher at KU Leuven (🇧🇪), working within the LAGoM-NLP group led by Miryam de Lhoneux. My research concerns Multilingual NLP, where I investigate methods for making language representation fair and equitable. In particular, I am interested in how principled data selection and language sampling can be employed as part of the pre-training process, so that models can learn from better data rather than simply more of it. I am also interested in how typological bias is reflected in commonly employed evaluation datasets and metrics, and how this affects our appraisal of multilingual models.

Prior to joining KU Leuven, I earned my PhD in computational linguistics at Uppsala University (🇸🇪), supervised by Joakim Nivre and Anders Søgaard. My dissertation focused on the syntactic knowledge encoded by language models, investigated through the lens of dependency parsing (available here). Before my PhD, I graduated from the EM-LCT program, where I spent my first year at the University of Groningen (🇳🇱) and my second year at the University of the Basque Country (🇪🇸). I grew up in Western Massachusetts (🇺🇸).

publications

A Kulmizev, J Nivre: Investigating UD Treebanks via Dataset Difficulty Measures. EACL 2023. Dubrovnik, Croatia.
M Abdou, V Ravishankar, A Kulmizev, A Søgaard: Word Order Does Matter and Shuffled Language Models Know It. ACL 2022. Dublin, Ireland.
A Kulmizev, J Nivre: Schrödinger’s Tree – On Syntax and Neural Language Models. Frontiers in Artificial Intelligence.
M Abdou, A Kulmizev, D Hershcovich, S Frank, E Pavlick, A Søgaard: Can Language Models Encode Perceptual Structure Without Grounding? A Case Study in Color. CoNLL 2021. Punta Cana, DR.
Z Luo, A Kulmizev, X Mao: Positional Artefacts Propagate Through Masked Language Model Embeddings. ACL 2021. Digital.
A Kulmizev, V Ravishankar, M Abdou, A Søgaard, J Nivre: Attention Can Reflect Syntactic Structure (If You Let It). EACL 2021. Digital.
A Kulmizev, V Ravishankar, M Abdou, J Nivre: Do Neural Language Models Show Preferences for Syntactic Formalisms? ACL 2020. Digital.
A Kulmizev, M de Lhoneux, J Gontrum, E Fano, J Nivre: Deep Contextualized Word Embeddings in Transition-Based and Graph-Based Dependency Parsing – A Tale of Two Parsers Revisited. EMNLP 2019. Hong Kong.
M Abdou, A Kulmizev, F Hill, D Low, A Søgaard: Higher-order Comparisons of Sentence Encoder Representations. EMNLP 2019. Hong Kong.
M Abdou, A Kulmizev, V Ravishankar, L Abzianidze, J Bos: What can we learn from Semantic Tagging? EMNLP 2018. Brussels, Belgium.
M Abdou, A Kulmizev, V Ravishankar: [MGAD: Multilingual Generation of Analogy Datasets](https://www.aclweb.org/anthology/L18-1320.pdf). LREC 2018. Miyazaki, Japan.

fall in wm