Joint Workshop on Multiword Expressions and Universal Dependencies (MWE-UD 2024)

Co-located with: LREC-COLING 2024 (Turin, Italy)

Date of the Workshop: May 25, 2024

Organised and sponsored by:
The Special Interest Group on the Lexicon (SIGLEX) of the Association for Computational Linguistics (ACL), SIGLEX’s Multiword Expressions Section (SIGLEX-MWE), Universal Dependencies (UD), and the UniDive COST Action CA21167.


MWE-UD 2024 online group picture

Proceedings and video recording

The proceedings are available in the ACL Anthology.


Subject to change.

Note: presentations marked “non-archival” are either already published elsewhere or works in progress. They have not undergone MWE-UD 2024’s formal peer review process and are not included in the proceedings; they are only listed in the programme. Due to last-minute changes in the lineup of non-archival presentations, the version here differs slightly from the one in the proceedings.

Date: Saturday, 25 May 2024
Location: Room “Madrid”, Lingotto Conference Centre, Turin, Italy
Zoom link: see Catalyst conference app

Time Session
09:00–09:05 Welcome
Session chair: Voula Giouli
09:05–09:50 Keynote 1
Session chair: Voula Giouli
Every Time We Hire an LLM, the Reasoning Performance of the Linguists Goes Up
Harish Tayyar Madabushi
[abstract] [slides]
09:50–10:30 Oral session 1
Session chair: Lifeng Han
Assessing BERT’s sensitivity to idiomaticity
Li Liu and François Lareau
[paper] [slides]
Sign of the Times: Evaluating the use of Large Language Models for Idiomaticity Detection
Dylan Phelps, Thomas M. R. Pickard, Maggie Mi, Edward Gow-Smith and Aline Villavicencio
[paper] [slides]
10:30–11:00 Coffee break
11:00–12:00 Poster session 1
Identification and Annotation of Body Part Multiword Expressions in Old Egyptian
Roberto Díaz Hernández
[paper] [poster]
Diachronic Analysis of Multi-word Expression Functional Categories in Scientific English
Diego Alves, Stefania Degaetano-Ortlieb, Elena Schmidt and Elke Teich
[paper] [poster]
Lexicons Gain the Upper Hand in Arabic MWE Identification
Najet Hadj Mohamed, Agata Savary, Cherifa Ben Khelil, Jean-Yves Antoine, Iskandar Keskes and Lamia Hadrich-Belguith
[paper] [poster]
Revisiting VMWEs in Hindi: Annotating Layers of Predication
Kanishka Jain and Ashwini Vaidya
[paper] [poster]
Combining Grammatical and Relational Approaches. A Hybrid Method for the Identification of Candidate Collocations from Corpora
Damiano Perri, Irene Fioravanti, Osvaldo Gervasi and Stefania Spina
[paper] [poster]
Multiword Expressions between the Corpus and the Lexicon: Universality, Idiosyncrasy, and the Lexicon-Corpus Interface
Verginica Barbu Mititelu, Voula Giouli, Stella Markantonatou, Ivelina Stoyanova, Petya Osenova, Kilian Evang, Daniel Zeman, Simon Krek, Carole Tiberius, Christian Chiarcos and Ranka Stanković
[paper] [poster]
MultiMWE: Building a Multi-lingual Multi-Word Expression (MWE) Parallel Corpora
Lifeng Han, Gareth Jones and Alan Smeaton
[non-archival] [poster]
A demonstration of MWE-Finder and MWE-Annotator
Jan Odijk, Martin Kroon, Tijmen Baarda, Ben Bonfil and Sheean Spoel
[non-archival] [poster]
Is Less More? Quality, Quantity and Context in Idiom Processing with Natural Language Models
Agne Knietaite, Adam Allsebrook, Anton Minkov, Adam Tomaszewski, Norbert Slinko, Richard Johnson, Thomas Pickard and Aline Villavicencio
[non-archival] [poster]
Annotating Compositionality Scores for Irish Noun Compounds is Hard Work
Abigail Walsh, Teresa Clifford, Emma Daly, Jane Dunne, Brian Davis and Gearóid Ó Cleircín
[non-archival] [poster]
12:00–13:00 Oral session 2
Session chair: Joakim Nivre
Universal Feature-based Morphological Trees
Federica Gamba, Abishek Stephen and Zdeněk Žabokrtský
[paper] [slides]
Light Verb Constructions in Universal Dependencies for South Asian Languages
Abishek Stephen and Daniel Zeman
[paper] [slides]
Strategies for the Annotation of Pronominalised Locatives in Turkic Universal Dependency Treebanks
Jonathan Washington, Çağrı Çöltekin, Furkan Akkurt, Bermet Chontaeva, Soudabeh Eslami, Gulnura Jumalieva, Aida Kasieva, Aslı Kuzgun, Büşra Marşan and Chihiro Taguchi
[paper] [slides]
13:00–14:00 Lunch
14:00–14:45 Keynote 2
Session chair: Gosse Bouma
Using Universal Dependencies for testing hypotheses about communicative efficiency
Natalia Levshina
[abstract] [slides]
14:45–15:00 Booster session: virtual presentations
Session chair: Kilian Evang
Automatic Manipulation of Training Corpora to Make Parsers Accept Real-world Text
Hiroshi Kanayama, Ran Iwamoto, Masayasu Muraoka, Takuya Ohko and Kohtaroh Miyamoto
[paper] [slides] [booster video] [presentation video]
Fitting Fixed Expressions into the UD Mould: Swedish as a Use Case
Lars Ahrenberg
[paper] [slides] [booster video] [presentation video]
The Vedic Compound Dataset
Sven Sellmer and Oliver Hellwig
[paper] [slides] [booster video] [presentation video]
Overcoming Early Saturation on Low-Resource Languages in Multilingual Dependency Parsing
Jiannan Mao, Chenchen Ding, Hour Kaing, Hideki Tanaka, Masao Utiyama and Tadahiro Matsumoto
[paper] [slides] [booster video] [presentation video]
Towards the semantic annotation of SR-ELEXIS corpus: Insights into Multiword Expressions and Named Entities
Cvetana Krstev, Ranka Stanković, Aleksandra Marković and Teodora Mihajlov
[paper] [slides] [booster video] [presentation video]
Universal Dependencies for Saraiki
Meesum Alam, Francis Tyers, Emily Hanink and Sandra Kübler
[paper] [slides] [booster video] [presentation video]
BERT-based Idiom Identification using Language Translation and Word Cohesion
Arnav Yayavaram, Siddharth Yayavaram, Prajna Devi Upadhyay and Apurba Das
[paper] [slides] [booster video] [presentation video]
15:00–16:00 Oral session 3
Session chair: Marcos Garcia
To Leave No Stone Unturned: Annotating Verbal Idioms in the Parallel Meaning Bank
Rafael Ehren, Kilian Evang and Laura Kallmeyer
[paper] [slides]
Annotation of Multiword Expressions in the SUK 1.0 Training Corpus of Slovene: Lessons Learned and Future Steps
Jaka Čibej, Polona Gantar and Mija Bon
[paper] [slides]
Ad Hoc Compounds for Stance Detection
Qi Yu, Fabian Schlotterbeck, Hening Wang, Naomi Reichmann, Britta Stolterfoht, Regine Eckardt and Miriam Butt
[paper] [slides]
16:00–16:30 Coffee break
16:30–17:20 Poster session 2
Synthetic-Error Augmented Parsing of Swedish as a Second Language: Experiments with Word Order
Arianna Masciolini, Emilie Francis and Maria Irena Szawerna
[paper] [poster]
A Universal Dependencies Treebank for Gujarati
Mayank Jobanputra, Maitrey Mehta and Çağrı Çöltekin
[paper] [poster]
Part-of-Speech Tagging for Northern Kurdish
Peshmerge Morad, Sina Ahmadi and Lorenzo Gatti
[paper] [poster]
Domain-Weighted Batch Sampling for Neural Dependency Parsing
Jacob Striebel, Daniel Dakota and Sandra Kübler
[paper] [poster]
AlphaMWE-Arabic: Arabic Edition of Multilingual Parallel Corpora with Multiword Expression Annotations
Najet Hadj Mohamed, Malak Rassem, Lifeng Han and Goran Nenadic
[non-archival] [poster]
MaiBaam: A Multi-Dialectal Bavarian Universal Dependency Treebank
Verena Blaschke, Barbara Kovačić, Siyao Peng, Hinrich Schütze and Barbara Plank
[non-archival] [poster]
Redefining Syntactic and Morphological Tasks for Typologically Diverse Languages
Omer Goldman, Leonie Weissweiler and Reut Tsarfaty
[non-archival] [poster]
UCxn: Typologically Informed Annotation of Constructions Atop Universal Dependencies
Leonie Weissweiler, Nina Böbel, Kirian Guiller, Santiago Herrera, Wesley Scivetti, Arthur Lorenzi, Nurit Melnik, Archna Bhatia, Hinrich Schütze, Lori Levin, Amir Zeldes, Joakim Nivre, William Croft and Nathan Schneider
[non-archival] [poster]
Sparse Logistic Regression with High-order Features for Automatic Grammar Rule Extraction from Treebanks
Santiago Herrera, Caio Corro and Sylvain Kahane
[non-archival] [poster]
Joint Annotation of Morphology and Syntax in Dependency Treebanks
Bruno Guillaume, Kim Gerdes, Kirian Guiller, Sylvain Kahane and Yixuan Li
[non-archival] [poster]
17:20–18:00 Best paper award, announcements, community discussion
Session chair: Voula Giouli

Keynote Speakers

Natalia Levshina: Using Universal Dependencies for testing hypotheses about communicative efficiency

Abstract: There is abundant evidence that language structure and use are influenced by language users’ tendency to be efficient, trying to minimize the cost-to-benefit ratio of communication (e.g., Hawkins, 2004; Gibson et al., 2019; Levshina, 2022). In my talk I will show how data from corpora annotated with Universal Dependencies can be used to test hypotheses about the role of communicative efficiency in shaping language structure and use. The hypotheses are as follows:

  1. As discussed by typologists (Sapir, 1921; Sinnemäki, 2008), rigid word order can compensate for lack of formal marking of core arguments. The hypothesis is then that there are positive correlations between the entropy of subject and object in a transitive clause in a corpus and the relative frequency of disambiguating case forms or verb forms. These correlations are expected to minimize the articulation effort involved in the use of argument flags or indices.

  2. There is a positive correlation between semantic tightness (Hawkins, 1986), operationalized as Mutual Information between lexemes and syntactic roles, and the relative frequency of verb-final clauses in a corpus. Strong associations between lexemes and roles help to avoid the costs of reanalysis in verb-final languages.

  3. There is a negative correlation between the relative frequency of verb-final clauses in a corpus and the average number of overt core arguments, which helps to save the processing costs required for keeping longer dependencies in mind (cf. Ueno & Polinsky, 2009).

These hypotheses will be tested on corpus data annotated with Universal Dependencies, with the help of mixed-effects models with genealogical and geographic information as random effects.


Gibson, Edward, Richard Futrell, Steven P. Piantadosi, Isabelle Dautriche, Kyle Mahowald, Leon Bergen & Roger Levy. 2019. How efficiency shapes human language. *Trends in Cognitive Sciences* 23(5): 389–407.

Hawkins, John A. 1986. *A Comparative Typology of English and German: Unifying the Contrasts*. London: Croom Helm.

Hawkins, John A. 2004. *Efficiency and Complexity in Grammars*. Oxford: Oxford University Press.

Levshina, Natalia. 2022. *Communicative Efficiency: Language Structure and Use*. Cambridge: Cambridge University Press.

Sapir, Edward. 1921. *Language: An Introduction to the Study of Speech*. New York: Harcourt.

Sinnemäki, Kaius. 2008. Complexity trade-offs in core argument marking. In: Matti Miestamo, Kaius Sinnemäki and Fred Karlsson (eds.), *Language Complexity: Typology, Contact, Change*, 67–88. Amsterdam: John Benjamins.

Ueno, Mieko & Maria Polinsky. 2009. Does headedness affect processing? A new look at the VO-OV contrast. *Journal of Linguistics* 45: 675–710.

Bio: Dr. Natalia Levshina is an assistant professor of communication and computational methods at Radboud University in Nijmegen, the Netherlands. Her main research interests are linguistic typology, corpora, AI, and cognitive and functional linguistics. After obtaining her PhD at the University of Leuven in 2011, she worked in Jena, Marburg, Louvain-la-Neuve and Leipzig, where she obtained her habilitation in 2019, followed by a research position at the Max Planck Institute for Psycholinguistics in Nijmegen. She has published the book “Communicative Efficiency: Language Structure and Use” (Cambridge University Press, 2022), in which she formulates the main principles of communicatively efficient linguistic behaviour and shows how these principles can explain why human languages are the way they are. Natalia is also the author of the best-selling statistical manual “How to Do Linguistics with R” (Benjamins, 2015).

Harish Tayyar Madabushi: Every Time We Hire an LLM, the Reasoning Performance of the Linguists Goes Up

Abstract: Pre-Trained Language Models (PLMs), trained on the cloze-like task of masked language modelling, have demonstrated access to a broad range of linguistic information, including both syntax and semantics. Given this access, coupled with their data-driven foundations, which align with usage-based theories, it is valuable and interesting to examine the constructional information they encode. Early work confirmed that these models have access to a substantial amount of constructional information. However, more recent research focusing on the types of constructions PLMs can accurately interpret, and those they find challenging, suggests that an increase in schematicity correlates with a decline in model proficiency. Crucially, schematicity—the extent to which constructional slots are fixed or allow for a range of elements that satisfy a particular semantic role associated with the slot—correlates with the extent of “reasoning” needed to interpret constructions, a task that poses significant challenges for language models. In this talk, I will begin by reviewing the constructional information encoded in both earlier models and more recent large language models. I will explore how these aspects are intertwined with the models’ reasoning abilities and introduce promising new approaches that could integrate theoretical insights from linguistics with practical, data-driven approaches of PLMs.

Bio: Dr. Tayyar Madabushi’s research focuses on understanding the fundamental mechanisms that underpin the performance and functioning of Large Language Models. His work on LLMs was included in the discussion paper on the Capabilities and Risks of Frontier AI, which was used as one of the foundational research works for discussions at the UK AI Safety Summit held at Bletchley Park. His research on the constructional information encoded in language models has been influential in bringing together the fields of construction grammar and pre-trained language models. In addition, his work on language models includes collaborative industrial research aimed at rectifying biases in speech-to-text systems widely utilised across the UK. Before starting his PhD in automated question answering at the University of Birmingham, Dr. Tayyar Madabushi founded and headed a social media data analytics company based in Singapore.


To attend the workshop (either in person or virtually), please register through LREC-COLING 2024’s registration system. Note that to attend MWE-UD 2024, it is sufficient to select this workshop during registration; you do not have to register for the main conference.


Multiword expressions (MWEs) are word combinations that exhibit lexical, syntactic, semantic, pragmatic, and/or statistical idiosyncrasies (Baldwin and Kim, 2010), such as by and large, hot dog, pay a visit and pull someone’s leg. The notion encompasses closely related phenomena: idioms, compounds, light-verb constructions, phrasal verbs, rhetorical figures, collocations, institutionalized phrases, etc. Their behavior is often unpredictable; for example, their meaning often does not result from the direct combination of the meanings of their parts. Given their irregular nature, MWEs often pose complex problems in linguistic modeling (e.g. annotation), NLP tasks (e.g. parsing), and end-user applications (e.g. natural language understanding and MT), and thus remain an open issue for computational linguistics (Constant et al., 2017).

Universal Dependencies (UD; De Marneffe et al., 2021) is a framework for cross-linguistically consistent treebank annotation that has so far been applied to over 100 languages. The framework aims to capture similarities as well as idiosyncrasies among typologically different languages (e.g., morphologically rich languages, pro-drop languages, and languages featuring clitic doubling). The goal in developing UD was not only to support comparative evaluation and cross-lingual learning but also to facilitate multilingual natural language processing and enable comparative linguistic studies.

After independently running successful series of workshops, the MWE and UD communities are now joining forces to organize a joint workshop. This is a timely collaboration because the two communities clearly have overlapping interests. For instance, while UD has several dependency relations that can be used to annotate MWEs, both the annotation guidelines (i.e. is syntactic irregularity and inflexibility or semantic non-compositionality the leading criterion?) and the annotation practice (both across treebanks for a single language and across languages) for these relations can be improved (Schneider and Zeldes, 2021). The PARSEME MWE-annotated corpora for 26 languages build on UD-annotated corpora (Savary et al., 2023). Both communities share an interest in developing guidelines, datasets, and tools that can be applied to a wide range of typologically diverse languages, raising fundamental questions about tokenization, lemmatization, and morphological decomposition of tokens. Proposals for harmonizing annotation practice between what has been achieved in PARSEME and UD, and for expanding PARSEME MWE annotation to non-verbal MWEs, are also central to the recently started UniDive COST Action (CA21167).

The workshop invites submissions of original research on MWEs, UD, and the interplay of both. The following topics are especially relevant:

Submission Formats

The workshop invites two types of submissions:

Paper Submission and Templates

Papers should be submitted via the workshop’s START submission page. Please choose the appropriate submission format (archival/non-archival). Submissions must follow the LREC-COLING 2024 stylesheet.

When submitting a paper from the START page, authors will be asked to provide essential information about resources (in a broad sense, i.e. also technologies, standards, evaluation kits, etc.) that have been used for the research described in the paper or are a new result of the research. Moreover, ELRA encourages all LREC-COLING authors to share the described LRs (data, tools, services, etc.) to enable their reuse and the replicability of experiments (including evaluation ones).

Archival papers with existing reviews from ACL Rolling Review (ARR) will also be considered. A paper may not be under review through ARR and MWE-UD simultaneously; a paper that has received or will receive reviews through ARR may not be submitted for regular review to MWE-UD.

Best Paper Award and Travel Grants

Important Dates

What When
Paper submission deadline March 3, 2024
ARR commitment deadline March 25, 2024
Notification of acceptance April 1, 2024
Camera-ready papers due April 8, 2024
Underline upload deadline TBD
Workshop May 25, 2024

All deadlines are at 23:59 UTC-12 (Anywhere on Earth).

Organizing Committee

Archna Bhatia Institute for Human and Machine Cognition, USA
Gosse Bouma Groningen University, NL
A. Seza Doğruöz Ghent University, Belgium
Kilian Evang Heinrich Heine University Düsseldorf, DE
Marcos Garcia University of Santiago de Compostela, Galiza, Spain
Voula Giouli Institute for Language & Speech Processing, ATHENA RC, Greece
Lifeng Han University of Manchester, UK
Joakim Nivre Uppsala University and Research Institutes of Sweden, Sweden
Alexandre Rademaker IBM Research, Brazil

Program Committee

Verginica Barbu Mititelu Romanian Academy
Cherifa Ben Khelil University of Tours
Philippe Blache Aix-Marseille University
Francis Bond Palacký University
Claire Bonial U.S. Army Research Laboratory
Julia Bonn University of Colorado Boulder
Tiberiu Boroș Adobe
Marie Candito Université Paris Cité
Giuseppe G. A. Celano Leipzig University
Kenneth Church Baidu
Çağrı Çöltekin University of Tübingen
Mathieu Constant Université de Lorraine
Monika Czerepowicka University of Warmia and Mazury
Daniel Dakota Indiana University
Miryam de Lhoneux KU Leuven
Marie-Catherine de Marneffe UCLouvain
Valeria de Paiva Nuance
Gaël Dias University of Caen Basse-Normandie
Kaja Dobrovoljc University of Ljubljana
Rafael Ehren Heinrich Heine University Düsseldorf
Gülşen Eryiğit Istanbul Technical University
Meghdad Farahmand Berlin, Germany
Christiane Fellbaum Princeton University
Jennifer Foster Dublin City University
Aggeliki Fotopoulou Institute for Language and Speech Processing, ATHENA RC
Stefan Th. Gries UC Santa Barbara & JLU Giessen
Bruno Guillaume Université de Lorraine
Tunga Güngör Boğaziçi University
Eleonora Guzzi Universidade da Coruña
Laura Kallmeyer Heinrich Heine University Düsseldorf
Cvetana Krstev University of Belgrade
Timm Lichte University of Tübingen
Irina Lobzhanidze Ilia State University
Teresa Lynn ADAPT Centre
Stella Markantonatou Institute for Language & Speech Processing, ATHENA RC
John P. McCrae National University of Ireland, Galway
Nurit Melnik The Open University of Israel
Johanna Monti “L’Orientale” University of Naples
Dmitry Nikolaev University of Manchester
Jan Odijk University of Utrecht
Petya Osenova Bulgarian Academy of Sciences
Yannick Parmentier University of Lorraine
Agnieszka Patejuk University of Oxford and Institute of Computer Science, Polish Academy of Sciences
Pavel Pecina Charles University
Ted Pedersen University of Minnesota
Prokopis Prokopidis Institute for Language and Speech Processing, ATHENA RC
Manfred Sailer Goethe-Universität Frankfurt am Main
Tanja Samardžić University of Zurich
Agata Savary Université Paris-Saclay
Nathan Schneider Georgetown University
Sabine Schulte im Walde University of Stuttgart
Sebastian Schuster Saarland University
Matthew Shardlow University of Manchester
Joaquim Silva Universidade NOVA de Lisboa
Maria Simi Università di Pisa
Ranka Stanković University of Belgrade
Ivelina Stoyanova Bulgarian Academy of Sciences
Stan Szpakowicz University of Ottawa
Shiva Taslimipoor University of Cambridge
Beata Trawinski Leibniz Institute for the German Language
Ashwini Vaidya Indian Institute of Technology
Marion Di Marco Ludwig Maximilian University of Munich
Amir Zeldes Georgetown University
Daniel Zeman Charles University

Sponsors and Support

COST Action

Anti-harassment Policy

The workshop follows the LREC-COLING anti-harassment policy.


For any inquiries regarding the workshop, please send an email to the Organizing Committee at

Please register with SIGLEX and check the “MWE Section” box to be added to our mailing list.