18th Workshop on Multiword Expressions (MWE 2022)

Colocated with LREC 2022 (Marseille, France)

Date of the Workshop: June 25, 2022

Organised and sponsored by:
Special Interest Group on the Lexicon (SIGLEX) of the Association for Computational Linguistics (ACL)


09:00-09:10   Opening
  Session 1: Oral presentations
  Chair: Agata Savary, Online co-chair: Marcos Garcia
09:10-09:25 A General Framework for Detecting Metaphorical Collocations (short, on-site)
  Marija Brkić Bakarić, Lucia Načinović Prskalo and Maja Popović
09:25-09:40 Improving Grammatical Error Correction for Multiword Expressions (short, on-site)
  Shiva Taslimipoor, Christopher Bryant and Zheng Yuan
09:40-09:50 Native and Non-native Speakers’ Idiom Production: What Can Read Speech Tell Us? (non-archival, on-site)
  Jing Liu and Helmer Strik
09:50-10:10 An Analysis of Attention in German Verbal Idiom Disambiguation (long, online)
  Rafael Ehren, Laura Kallmeyer and Timm Lichte
10:10-10:30 Support Verb Constructions across the Ocean Sea (long, online)
  Jorge Baptista, Nuno Mamede and Sónia Reis
10:30-11:00 Coffee break
  Keynote: Sabine Schulte im Walde
11:00-12:00 Figurative Language in Noun Compound Models across Target Properties, Domains and Time
  Chair: Carlos Ramisch, Online co-chair: Archna Bhatia
  Session 2: Oral presentations
  Chair: Harish Tayyar Madabushi, Online co-chair: Archna Bhatia
12:00-12:20 A Matrix-Based Heuristic Algorithm for Extracting Multiword Expressions from a Corpus (long, online)
  Orhan Bilgin
12:20-12:40 Multi-word Lexical Units Recognition in WordNet (long, online)
  Marek Maziarz, Ewa Rudnicka and Łukasz Grabowski
12:40-13:00 Automatic Detection of Difficulty of French Medical Sequences in Context (long, online)
  Anaïs Koptient and Natalia Grabar
13:00-14:00 Lunch break
14:00-15:00 Session 3: Joint SIGUL-MWE poster session
  Chair: Shiva Taslimipoor - Phar’Club area
  [MWE] Annotating “Particles” in Multiword Expressions in te reo Māori for a Part-of-Speech Tagger
  Aoife Finn, Suzanne Duncan, Peter-Lucas Jones, Gianna Leoni and Keoni Mahelona
  [MWE] Metaphor Detection for Low Resource Languages: From Zero-Shot to Few-Shot Learning in Middle High German
  Felix Schneider, Sven Sickert, Phillip Brandes, Sophie Marshall and Joachim Denzler
  [MWE] Automatic Bilingual Phrase Dictionary Construction from GIZA++ Output (retracted)
  Albina Khusainova, Vitaly Romanov and Adil Khan
  [MWE] A BERT’s Eye View: Identification of Irish Multiword Expressions Using Pre-trained Language Models
  Abigail Walsh, Teresa Lynn and Jennifer Foster
  [MWE] Enhancing the PARSEME Turkish Corpus of Verbal Multiword Expressions
  Yagmur Ozturk, Najet Hadj Mohamed, Adam Lion-Bouton and Agata Savary
  [MWE] German Light Verb Constructions in Business Process Models (non-archival)
  Kristin Kutzner and Ralf Laue
  Published at the LREC 2022 main conference
  [SIGUL] Evaluating Unsupervised Approaches to Morphological Segmentation for Wolastoqey
  Diego Bear and Paul Cook
  [SIGUL] Baseline English and Maltese-English Classification Models for Subjectivity Detection, Sentiment Analysis, Emotion Analysis, Sarcasm Detection, and Irony Detection
  Keith Cortis and Brian Davis
  [SIGUL] Building Open-source Speech Technology for Low-resource Minority Languages with Sámi as an Example - Tools, Methods and Experiments
  Katri Hiovain-Asikainen and Sjur Moshagen
  [SIGUL] Investigating the Quality of Static Anchor Embeddings from Transformers for Under-Resourced Languages
  Pranaydeep Singh, Orphee De Clercq and Els Lefever
  [SIGUL] Introducing YakuToolkit. Yakut Treebank and Morphological Analyzer
  Tatiana Merzhevich and Fabrício Ferraz Gerardi
  [SIGUL] A Language Model for Spell Checking of Educational Texts in Kurdish (Sorani)
  Roshna Abdulrahman and Hossein Hassani
  [SIGUL] SimRelUz: Similarity and Relatedness Scores as a Semantic Evaluation Dataset for Uzbek Language
  Ulugbek Salaev, Elmurod Kuriyozov and Carlos Gómez-Rodríguez
  [SIGUL] ENRICH4ALL: A First Luxembourgish BERT Model for a Multilingual Chatbot
  Dimitra Anastasiou
  Joint SIGUL-MWE keynote: Steven Bird
15:00-16:00 Multiword Expressions and the Low-Resource Scenario from the Perspective of a Local Oral Culture
  Chair: Shiva Taslimipoor, Online co-chair: Paul Cook
  Grand Large room
16:00-16:30 Coffee break
  Session 4: Oral presentations
  Chair: Teresa Lynn, Online co-chair: Paul Cook
16:30-16:40 Compound-internal Anaphora: Evidence from Acceptability Judgements on Italian Argumental Compounds (non-archival, online)
  Irene Lami and Joost van de Weijer
16:40-16:50 Light Verb Constructions in Corpora of Historical English (non-archival, online)
  Eva Zehentner
16:50-17:05 Sample Efficient Approaches for Idiomaticity Detection (short, online)
  Dylan Robert Schumacher Phelps, Xuan-Rui Fan, Edward Gow-Smith, Harish Tayyar Madabushi, Carolina Scarton and Aline Villavicencio
17:05-17:20 mwetoolkit-lib: Adaptation of the mwetoolkit as a Python Library and an Application to MWE-based Document Clustering (short, online)
  Fernando Rezende Zagatti, Paulo Augusto de Lima Medeiros, Esther da Cunha Soares, Lucas Nildaimon dos Santos Silva, Carlos Ramisch and Livy Real
17:20-17:40 Handling Idioms in Symbolic Multilingual Natural Language Generation (long, online)
  Michaelle Dubé and François Lareau
17:40-18:00 MWE community discussion
  Chair: Carlos Ramisch
  Open to all MWE Section members for online participation

Keynote Speakers

This year, we are going to have two amazing talks by:

Sabine Schulte im Walde, University of Stuttgart

Steven Bird, Charles Darwin University


Multiword expressions (MWEs) are word combinations which exhibit lexical, syntactic, semantic, pragmatic and/or statistical idiosyncrasies (Baldwin & Kim 2010), such as by and large, hot dog, pay a visit and pull one’s leg. The notion encompasses closely related phenomena: idioms, compounds, light-verb constructions, phrasal verbs, rhetorical figures, collocations, institutionalised phrases, etc. Their behaviour is often unpredictable; for example, their meaning often does not result from the direct combination of the meanings of their parts. Given their irregular nature, MWEs often pose complex problems in linguistic modelling (e.g. annotation), NLP tasks (e.g. parsing), and end-user applications (e.g. natural language understanding and MT), hence still representing an open issue for computational linguistics (Constant et al. 2017).

For almost two decades, modelling and processing MWEs for NLP has been the topic of the MWE workshop, which the MWE Section of SIGLEX has organised in conjunction with major NLP conferences since 2003. Impressive progress has been made in the field, but MWEs still require much research, given their importance for NLP applications. For this 18th edition of the workshop, we identified three topics on which contributions are particularly encouraged:

Through this workshop, we would like to bring together researchers from various NLP subfields and encourage them to submit MWE-related research, so that approaches to processing MWEs, including those for low-resource languages and for various applications, can benefit from each other. We also intend to consolidate the converging effects of the previous joint workshops LAW-MWE-CxG 2018, MWE-WN 2019 and MWE-LEX 2020, and of the joint MWE-WOAH panel in 2021, extending our scope to MWEs in e-lexicons and WordNets, MWE annotation, and grammatical constructions. Correspondingly, we call for papers on research related (but not limited) to MWEs and constructions in:

Joint session with SIGUL 2022 Workshop

Pursuing its efforts in building bridges with other communities, the MWE Section organises a joint session with the workshop of the Special Interest Group on Under-resourced Languages (SIGUL 2022). The goal is to foster future synergies that could address scientific challenges in the creation of resources, models and applications to deal with multiword expressions and related phenomena in low-resource scenarios, in accordance with one of our special topics in MWE 2022. The session will feature a joint poster session and a joint keynote talk by Steven Bird.

Submission modalities

The workshop invites two types of submissions:

All papers should be submitted via the workshop’s START submission space. Please choose the appropriate link for standard Archival submission or for the Non-archival submission. Registering for the workshop will be necessary to present both archival and non-archival submissions. Presentation and participation formats (online, on-site, both) will depend on LREC 2022 main conference arrangements and will be announced later.

Instructions for authors

The double-blind submissions (Archival submissions) should adhere to the ACL Author Guidelines. There is no limit on the number of reference pages.

The PMWE book series editors have put forward a list of conventions to cite multilingual MWE examples and a checklist for PMWE authors. Parts of the checklist are specific to PMWE authors, but sections like Terms, abbreviations and spelling can be relevant for MWE 2022 submissions. We encourage authors to adopt these conventions whenever relevant, without enforcing them. We hope that, in the long term, these could become widely adopted standards in the community.

All submissions should be made via the workshop’s START space. Please choose the appropriate submission modality as described in the Submission modalities section above.

Important dates

Paper Submission Deadline: April 17, 2022
Notification of Acceptance: May 3, 2022
Camera-ready Papers Deadline: May 23, 2022
Workshop: June 25, 2022


