Joint Workshop on Multiword Expressions and Universal Dependencies (MWE-UD 2024)

Colocated with: LREC-COLING 2024 (Torino, Italia)

Date of the Workshop: May 25, 2024

Organised and sponsored by:
The Special Interest Group on the Lexicon (SIGLEX) of the Association for Computational Linguistics (ACL), SIGLEX’s Multiword Expressions Section (SIGLEX-MWE), Universal Dependencies (UD), and the UniDive COST Action CA21167.


Multiword expressions (MWEs) are word combinations that exhibit lexical, syntactic, semantic, pragmatic, and/or statistical idiosyncrasies (Baldwin and Kim, 2010), such as "by and large", "hot dog", "pay a visit", and "pull someone’s leg". The notion encompasses closely related phenomena: idioms, compounds, light-verb constructions, phrasal verbs, rhetorical figures, collocations, institutionalized phrases, etc. Their behavior is often unpredictable; for example, their meaning often does not result from the direct combination of the meanings of their parts. Given their irregular nature, MWEs pose complex problems in linguistic modeling (e.g. annotation), NLP tasks (e.g. parsing), and end-user applications (e.g. natural language understanding and MT), and they therefore remain an open issue for computational linguistics (Constant et al., 2017).

Universal Dependencies (UD; De Marneffe et al., 2021) is a framework for cross-linguistically consistent treebank annotation that has so far been applied to over 100 languages. The framework aims to capture similarities as well as idiosyncrasies among typologically different languages (e.g., morphologically rich languages, pro-drop languages, and languages featuring clitic doubling). The goal in developing UD was not only to support comparative evaluation and cross-lingual learning but also to facilitate multilingual natural language processing and enable comparative linguistic studies.

After each running a successful series of workshops independently, the MWE and UD communities are now joining forces to organize a joint workshop. This is a timely collaboration because the two communities clearly have overlapping interests. For instance, while UD has several dependency relations that can be used to annotate MWEs, both the annotation guidelines (i.e. is syntactic irregularity and inflexibility, or semantic non-compositionality, the leading criterion?) and the annotation practice (both across treebanks for a single language and across languages) for these relations can be improved (Schneider and Zeldes, 2021). The PARSEME MWE-annotated corpora for 26 languages build on UD-annotated corpora (Savary et al., 2023). Both communities share an interest in developing guidelines, datasets, and tools that can be applied to a wide range of typologically diverse languages, raising fundamental questions about tokenization, lemmatization, and morphological decomposition of tokens. Proposals for harmonizing annotation practice between PARSEME and UD, and for expanding PARSEME MWE annotation to non-verbal MWEs, are also central to the recently started UniDive COST Action (CA21167).

The workshop invites submissions of original research on MWE, UD, and the interplay of both. In particular, the following topics are especially relevant:

Submission Formats

The workshop invites two types of submissions:

Paper Submission and Templates

Papers should be submitted via the workshop’s START submission page. Please choose the appropriate submission format (archival/non-archival). Submissions must follow the LREC-COLING 2024 stylesheet.

When submitting a paper from the START page, authors will be asked to provide essential information about resources (in a broad sense, i.e. also technologies, standards, evaluation kits, etc.) that have been used for the research described in the paper or are a new result of the research. Moreover, ELRA encourages all LREC-COLING authors to share the described LRs (data, tools, services, etc.) to enable their reuse and the replicability of experiments (including evaluation experiments).

Archival papers with existing reviews from ACL Rolling Review (ARR) will also be considered. A paper may not be under review through ARR and MWE-UD simultaneously. A paper that has received or will receive reviews through ARR may not be submitted for review to MWE-UD.

Best Paper Award and Travel Grants

Important Dates

Paper submission deadline: March 3, 2024
ARR commitment deadline: March 25, 2024
Notification of acceptance: April 1, 2024
Camera-ready papers due: April 8, 2024
Underline upload deadline: TBD
Workshop: May 25, 2024

All deadlines are at 23:59 UTC-12 (Anywhere on Earth).

Keynote Speakers

Natalia Levshina, Radboud University
Harish Tayyar Madabushi, University of Bath

Organizing Committee

Core Organizers
Archna Bhatia, Institute for Human and Machine Cognition, USA
Gosse Bouma, Groningen University, NL
Kilian Evang, Heinrich Heine University Düsseldorf, DE
Marcos Garcia, University of Santiago de Compostela, Galiza, Spain
Voula Giouli, Institute for Language & Speech Processing, ATHENA RC, Greece
Lifeng Han, Univ. of Manchester, UK
Joakim Nivre, Uppsala University and Research Institutes of Sweden, Sweden
A. Seza Dogruöz, Ghent University, Belgium
Alexandre Rademaker, IBM Research, Brazil

Program Committee

Jean-Yves Antoine, University of Tours
Verginica Barbu Mititelu, Romanian Academy
Cherifa Ben Kehlil, University of Tours
Philippe Blache, Aix-Marseille Uni
Francis Bond, Palacký University
Claire Bonial, U.S. Army Research Laboratory
Julia Bonn, University of Colorado Boulder
Tiberiu Boroș, Adobe
Miriam Butt, Universität Konstanz
Marie Candito, Université Paris Cité
Giuseppe G. A. Celano, Leipzig Uni
Çağrı Çöltekin, Tübingen
Paul Cook, University of New Brunswick
Monika Czerepowicka, University of Warmia and Mazury
Daniel Dakota, Indiana University
Marie-Catherine de Marneffe, UC Louvain
Valeria de Paiva, Nuance
Gaël Dias, University of Caen Basse-Normandie
Kaja Dobrovoljc, University of Ljubljana
Rafael Ehren, Heinrich Heine University Düsseldorf
Meghdad Farahmand, Berlin, Germany
Christiane Fellbaum, Princeton University
Jennifer Foster, Dublin City University
Aggeliki Fotopoulou, Institute for Language and Speech Processing, ATHENA RC
Stefan Th. Gries, UC Santa Barbara & JLU Giessen
Bruno Guillaume, Université de Lorraine
Tunga Gungor, Bogaziçi University
Eleonora Guzzi, Universidade da Coruña
Cvetana Krstev, University of Belgrade
Timm Lichte, University of Tübingen
Irina Lobzhanidze, Ilia State University
Teresa Lynn, ADAPT Centre
Stella Markantonatou, Institute for Language & Speech Processing, ATHENA RC
John P. McCrae, National University of Ireland, Galway
Nurit Melnik, The Open University of Israel
Laura A. Michaelis, University of Colorado Boulder
Johanna Monti, “L’Orientale” University of Naples
Jan Odijk, University of Utrecht
Petya Osenova, Bulgarian Academy of Sciences
Yannick Parmentier, University of Lorraine
Agnieszka Patejuk, University of Oxford and Institute of Computer Science, Polish Academy of Sciences
Pavel Pecina, Charles University
Ted Pedersen, University of Minnesota
Scott Piao, Lancaster University
Martin Popel, Charles University
Prokopis Prokopidis, Institute for Language and Speech Processing, ATHENA RC
Carlos Ramisch, Aix Marseille University
Manfred Sailer, Goethe-Universität Frankfurt am Main
Tanja Samardžić, University of Zurich
Agata Savary, Université Paris-Saclay
Nathan Schneider, Georgetown University
Sabine Schulte im Walde, University of Stuttgart
Sebastian Schuster, Saarland University
Maria Simi, Università di Pisa
Kiril Simov, Bulgarian Academy of Sciences
Ivelina Stoyanova, Bulgarian Academy of Sciences
Pavel Straňák, Uni Karlova
Stan Szpakowicz, University of Ottawa
Zeerak Talat, Simon Fraser University
Shiva Taslimipoor, University of Cambridge
Harish Tayyar Madabushi, University of Bath
Beata Trawinski, Leibniz Institute for the German Language
Ashwini Vaidya, Indian Institute of Technology
Marion Di Marco, Ludwig Maximilian University of Munich
Amir Zeldes, Georgetown University
Daniel Zeman, Charles University

Sponsors and Support

Cost Action

Anti-harassment Policy

The workshop follows the LREC-COLING anti-harassment policy.


