Machine Learning for Ancient Languages

The Workshop

The ML4AL Workshop aims to inspire collaboration and support research momentum in the emerging field of Machine Learning for the study of ancient texts.

Ancient languages preserve the cultures and histories of the past. However, their study is fraught with difficulties, and experts must tackle a range of challenging text-based tasks, from deciphering lost languages and restoring damaged inscriptions to determining the authorship of works of literature.

Technological aids have long supported the study of ancient texts, but in recent years advances in Artificial Intelligence and Machine Learning have enabled analyses of ancient languages at an unprecedented scale and in unparalleled detail.

The ML4AL workshop will showcase the scientific opportunities at the intersection of the Humanities and ML, and spotlight promising directions for future endeavours within this rising field.

When

15 August 2024.

Where

Hybrid format: Bangkok (Thailand) and online.

Registration

Refer to the official ACL 2024 website.

Call for Papers

Topics of interest

Digitization: bringing textual sources to a high-quality machine-readable format.

Restoration: recovering missing text and reassembling fragmented written artifacts.

Attribution: contextualising a document within its original geographical, chronological and authorial setting.

Linguistic analysis: linguistic tasks such as semantic analysis, part-of-speech (POS) tagging, text parsing and segmentation.

Textual criticism: reconstructing a text’s philological tradition and the history of its transmission.

Translation and decipherment: making a text’s language comprehensible and interpretable to modern-day researchers.

We particularly encourage submissions which tackle low-data, underrepresented, non-Western ancient languages. We also invite dataset publications to further enrich our understanding of these languages and their contexts.

Scope

ML4AL is designed to facilitate and invigorate the ongoing collaborative momentum between ML and the Humanities, to foster a deeper understanding of our past.

We invite contributions tackling texts from all corners of the globe, in any language, script or medium. The chronological scope runs from the inception of writing systems in ancient Mesopotamia and Egypt (3400 BCE) to the late first millennium CE.

We welcome long (8-page) and short (4-page) paper submissions on OpenReview: see our Submission page for more information.

Accepted regular workshop papers will be included in the workshop proceedings, but non-archival submissions are also welcome.

SUBMISSIONS ARE NOW CLOSED

Objectives

An inclusive scope

By showcasing the scientific opportunities at the intersection of the Humanities and ML, the workshop offers a roadmap for this burgeoning interdisciplinary field.

A collaborative mindset

ML4AL emphasises the value and urgency of active collaboration between specialists from both fields, to produce compelling and consequential research.

An explainable approach

ML4AL aims to generate awareness of the risks of data bias and digital colonialism, emphasise the importance of standardised datasets, metrics and benchmarks, and encourage the development of explainable AI tools.

Information

Important dates

All deadlines are 11:59 pm UTC-12 (“anywhere on Earth”).

Please note the extended deadlines!

~~17 May 2024~~ 24 May 2024

Direct paper submission deadline

~~17 June 2024~~ 21 June 2024

Notification of acceptance

~~1 July 2024~~ 5 July 2024

Camera-ready paper due

15 August 2024

Workshop

Organising Committee

Dr John Pavlopoulos, Athens University of Economics and Business, and Archimedes/Athena RC, Greece.

Dr Thea Sommerschield, University of Nottingham, UK.

Dr Yannis Assael, Google DeepMind, UK.

Assisted by:

Dr Shai Gordin, Ariel University, Israel.
Prof. Kyunghyun Cho, NYU, CIFAR, Genentech, USA.
Prof. Marco Passarotti, Università Cattolica del Sacro Cuore, Italy.
Dr Bin Li, Nanjing Normal University, China.
Dr Rachele Sprugnoli, Università di Parma, Italy.
Dr Yudong Liu, Western Washington University, USA.
Dr Adam Anderson, UC Berkeley, USA.

Program Committee

Masayuki Asahara, National Institute for Japanese Language and Linguistics, Japan.
John Bodel, Brown University, USA.
Claudia Corbetta, Università di Bergamo, Italy.
Mark Depauw, KU Leuven, Belgium.
Hanne Eckhoff, University of Oxford, UK.
Margherita Fantoli, KU Leuven, Belgium.
Ethan Fetaya, Bar-Ilan University, Israel.
Federica Gamba, Charles University, Czech Republic.
Chul Heo, Pusan University, Republic of Korea.
Petra Heřmánková, Aarhus University, Denmark & Johannes Gutenberg-Universität Mainz, Germany.
Marietta Horster, Johannes Gutenberg-Universität Mainz, Germany.
Renfen Hu, Beijing University, China.
Federica Iurescia, Università Cattolica di Milano, Italy.
Kyle Johnson, TikTok, USA.
Alek Keersmaekers, KU Leuven, Belgium.
Ussen Kimanuka, Pan African University Institute, Kenya.
Thomas Koentges, You Say Data Limited, New Zealand.
Els Lefever, Ghent University, Belgium.
Eliese-Sophia Lincke, Freie Universität Berlin, Germany.
Chao-Lin Liu, Chengchi University, Taiwan.
Liu Liu, Nanjing University, China.
Jiaming Luo, Google Canada, Canada.
Massimo Maiocchi, Ca' Foscari University of Venice, Italy.
Isabelle Marthot-Santaniello, University of Basel, Switzerland.
Barbara McGillivray, King's College London, UK.
Alex Mullen, University of Nottingham, UK.
Chiara Palladino, Furman University, USA.
Chanjun Park, Upstage, Republic of Korea.
Matteo Pellegrini, Università Cattolica di Milano, Italy.
Mladen Popovic, University of Groningen, The Netherlands.
Jonathan Prag, University of Oxford, UK.
Avital Romach, Yale University, USA.
Edgar Roman-Rangel, Instituto Tecnológico Autónomo de México, Mexico.
Matteo Romanello, University of Lausanne, Switzerland.
Brent Seales, University of Kentucky, USA.
Andrew Senior, Google DeepMind, UK.
Gustav Ryberg Smidt, Ghent University, Belgium.
Richard Sproat, Google DeepMind, Japan.
Gabriel Stanovsky, The Hebrew University of Jerusalem, Israel.
Silvia Stopponi, University of Groningen, The Netherlands.
Qi Su, Peking University, China.
Matthew I. Swindall, Middle Tennessee State University, USA.
Xuri Tang, Huazhong University, China.
Charlotte Tupman, University of Exeter, UK.
Haneul Yoo, KAIST, Republic of Korea.
Chongsheng Zhang, Henan University, China.

Sponsors & Support

Diamond Tier Sponsor

Google DeepMind

Silver Tier Sponsor

The Vesuvius Challenge - Scroll Prize

Supporting Organisation

ARCHIMEDES Unit - Research on artificial intelligence, data science and algorithms
