21st Workshop on Multiword Expressions (MWE 2025)

Colocated with: NAACL-2025, Albuquerque, New Mexico, U.S.A.

Date of the Workshop: May 4, 2025

Organised and sponsored by:
The Special Interest Group on the Lexicon (SIGLEX) of the Association for Computational Linguistics (ACL) and SIGLEX’s Multiword Expressions Section (SIGLEX-MWE).


Proceedings and video recording

TBD


Program

TBD


Keynote speaker

Nathan Schneider (Georgetown University)

Bio: Nathan Schneider is a computational linguist. As Associate Professor of Linguistics and Computer Science at Georgetown University, he leads the NERT lab, looking for synergies between practical language technologies and the scientific study of language, with an emphasis on how words, grammar, and context conspire to convey meaning. He is the recipient of an NSF CAREER award to study NLP vis-à-vis metalinguistic enterprises like language learning, linguistics, and legal interpretation. Recently, he has weighed in on specific interpretive debates in U.S. law; one of these analyses was cited by U.S. Supreme Court justices in a major firearms case. He is active in the NLP community—especially ACL’s SIGANN and SIGLEX—and the Universal Dependencies project; and cofounded the SOLID forum for empirical research on legal interpretation. Prior to Georgetown, he inhabited UC Berkeley, Carnegie Mellon University, and the University of Edinburgh. Apart from annotation scheming and computational modeling, he enjoys classical music and chocolate chip cookies.

Title: Meaning Construction at the Syntax-Lexis Nexus

Abstract: When words and grammar come into contact, things sometimes get messy: idiosyncratic expressions and patterns disobey ordinary principles of regularity and compositionality. A useful point of reference is the theoretical perspective of Construction Grammar, which exhorts us to view linguistic knowledge in terms of form-function mappings—at all levels of granularity. How can this perspective inform a broad-coverage, multilingual approach to lexicosyntactic conundrums? First, I will discuss implications for corpus annotation: while some multiword expressions and names (e.g. “at least”, “in order to”, “Chapter 1”) test the limits of categorical annotation standards like Universal Dependencies, UD treebanks nevertheless enable empirical investigation of some functionally-defined constructions across languages. Second, I will discuss efforts to interpret the latent representations of constructional form and meaning in transformer language models, with the NPN construction (noun-preposition-noun, as in “face to face”) as a case study.


Registration

To attend the workshop (either in person or virtually), please register through NAACL 2025’s registration system. Note that to attend MWE 2025, it is sufficient to select this workshop during registration; you do not have to register for the main conference.


Description

Multiword expressions (MWEs), i.e., word combinations that exhibit lexical, syntactic, semantic, pragmatic, and/or statistical idiosyncrasies (Baldwin and Kim, 2010), such as “by and large”, “hot dog”, “make a decision” and “break one’s leg”, are still a pain in the neck for Natural Language Processing (NLP). The notion encompasses closely related phenomena: idioms, compounds, light-verb constructions, phrasal verbs, rhetorical figures, collocations, institutionalized phrases, etc. Given their irregular nature, MWEs often pose complex problems in linguistic modeling (e.g. annotation), NLP tasks (e.g. parsing), and end-user applications (e.g. natural language understanding and machine translation), and so remain an open issue for computational linguistics (Constant et al., 2017).

For more than two decades, modelling and processing MWEs for NLP has been the topic of the MWE workshop, organised since 2003 by the MWE section of SIGLEX in conjunction with major NLP conferences. Impressive progress has been made in the field, but our understanding of MWEs still requires much research, given their relevance and usefulness to NLP applications. This is also relevant to domain-specific NLP pipelines, which need to handle terminology that is most often realised as MWEs. Following previous years, for this 21st edition of the workshop, we identified the following topics on which contributions are particularly encouraged:

Through this workshop, we will bring together researchers in various NLP subfields and encourage them to submit their MWE-related research. We also intend to consolidate the converging results of the previous joint workshops LAW-MWE-CxG 2018, MWE-WN 2019 and MWE-LEX 2020, the joint MWE-WOAH panel in 2021, the MWE-SIGUL 2022 joint session, and MWE-UD 2024, extending our scope to MWEs in e-lexicons and WordNets, MWE annotation, as well as grammatical constructions. Correspondingly, we call for papers on research related (but not limited) to MWEs and constructions in:


Submission Formats

The workshop invites two types of submissions:



Paper Submission and Templates

Papers should be submitted via the OpenReview submission page. Please choose the appropriate submission format (archival/non-archival). Archival papers with existing reviews will also be accepted through the ACL Rolling Review (ARR). Submissions must follow the ACL stylesheet. For further information on this initiative, please refer to NAACL 2025.

Papers already reviewed through ARR can be committed here.


Best Paper Award

TBD


Important Dates

What                          When
Paper submission deadline     February 13, 2025
ARR commitment deadline       February 27, 2025
Notification of acceptance    March 8, 2025
Camera-ready papers due       March 17, 2025
Underline upload deadline     April 8, 2025
Workshop                      May 4, 2025

All deadlines are at 23:59 UTC-12 (Anywhere on Earth).


Organizing Committee (Listed alphabetically)

A. Seza Doğruöz, Ghent University, Belgium
Alexandre Rademaker, FGV/EMA, Brazil
Atul Kr. Ojha, Insight Research Ireland Centre for Data Analytics, University of Galway
Gražina Korvel, VU Institute of Data Science and Digital Technologies
Mathieu Constant, Université de Lorraine
Verginica Barbu Mititelu, Romanian Academy Research Institute for Artificial Intelligence
Voula Giouli, Institute for Language & Speech Processing, ATHENA RC, Greece

Program Committee

Agata Savary, Université Paris-Saclay
Beata Trawinski, Leibniz Institute for the German Language
Carlos Ramisch, LIS - Laboratoire d’Informatique et Systèmes
Chikara Hashimoto, Rakuten Institute of Technology
Cvetana Krstev, University of Belgrade, Faculty of Philology
Eric G C Laporte, Université Gustave Eiffel
Francis Bond, Palacký University Olomouc
Gaël Dias, University of Caen Normandy
Gražina Korvel, Vilnius University
Irina Lobzhanidze, Ilia Chavchavadze State University
Ismail El Maarouf, Imprevicible
Ivelina Stoyanova, Deaf Studies Institute
Jan Odijk, Utrecht University
John Philip McCrae, National University of Ireland Galway
Kenneth Church, Northeastern University
Manfred Sailer, Johann Wolfgang Goethe Universität Frankfurt am Main
Mathieu Constant, Université de Lorraine, CNRS, ATILF
Matthew Shardlow, The Manchester Metropolitan University
Meghdad Farahmand, University of Genoa
Miriam Butt, Universität Konstanz
Paul Cook, University of New Brunswick
Pavel Pecina, Charles University
Petya Osenova, Sofia University “St. Kliment Ohridski”
Ranka Stanković, University of Belgrade
Sabine Schulte im Walde, University of Stuttgart
Shiva Taslimipoor, University of Cambridge
Stan Szpakowicz, University of Ottawa
Stella Markantonatou, ATHENA RIC
Tiberiu Boros, Adobe Systems
Tunga Gungor, Bogazici University

Sponsors and Support

ACL SIGLEX

Anti-harassment Policy

The workshop follows the ACL anti-harassment policy.


Contact

For any inquiries regarding the workshop, please send an email to the Organizing Committee at mwe2025workshop@gmail.com.

To join our mailing list, please register with SIGLEX and check the “MWE Section” box.