Workshop on
eLexicography between Digital Humanities and Artificial Intelligence:
Complexities in Data, Technologies, Communities

9th July, Utrecht, the Netherlands

TivoliVredenburg: Vredenburgkade 11, 3511 WC Utrecht

Co-located with Digital Humanities (DH) Conference 2019

About the Workshop

Lexicography is currently embracing rapid change as the traditional methods of publishing dictionaries are replaced by the ubiquity of lexical information on the Web. Furthermore, the application of computational techniques to the processes of lexicography is revolutionizing how dictionaries can be constructed. The proposed event will be the second iteration of a highly successful workshop first run before the EADH Conference in Galway on 6 December 2018.

This edition of the workshop places specific emphasis on complexity with regard to the data, technology and community aspects of lexicography. Complexity in data concerns access to lexicographic data across stakeholders (national institutes, research groups, individuals); representation formats; linguistic assumptions, underlying theories, and the scope of analysis and representation; legal restrictions and licenses; multilingual, cross-lingual, comparative and typological issues; and advanced aspects of multimodality concerning audio-visual representation. Complexity in technologies concerns the challenges of creating and expanding novel lexicographic resources, which requires combining many technologies, including natural language processing tools, machine learning approaches and AI methods in general, for data linking and data management, in order to identify and represent words, their senses and definitions.

Complexity in communities lies partly in differences in stakeholders’ type and status (national language institutes, standardization bodies, research groups, individuals), the status of the language in question (official, minority, regional, etc.), and involvement in networking activities. It also stems from the several academic disciplines involved in eLexicographic research, such as linguistics, natural language processing (NLP), digital humanities, artificial intelligence (AI), computational linguistics and computer science. This makes it challenging to provide professionals with training opportunities and to ensure knowledge exchange among all stakeholders.

This workshop is co-organized and supported by the two European projects ELEXIS and Prêt-à-LLOD.

Topics of Interest

Lexicography and Digital Humanities
AI for Lexicography
Complexity in Data for Lexicography
Complexity in Technologies for Lexicography
Complexity across Lexicography and Related Communities
Access and usage of dictionaries on the Web
Retro-digitization of lexicographic resources
Lexicography for language learning
Use and applications of NLP for lexicography
Lexicography for terminology and translation
Lexicography for under-resourced languages
Linked Data for lexical resources

Call for Papers

The deadline for abstract submission has been extended until May 6th 2019.

We welcome submissions of abstracts of up to 500 words, which will be presented as posters at the workshop. Submissions should present methodologies, experiments, use cases, descriptions of ongoing or planned research projects, or position papers related to the topics of interest listed above. We especially welcome papers describing interdisciplinary research that combines lexicography, linguistics, computer science and digital humanities approaches and gives insights into complexity with regard to the data, technology and community aspects of lexicography.

Please submit abstracts by May 6th 2019 (extended from April 30th) in any language, including an English translation of the title for reviewing purposes. Submissions will be reviewed by at least three reviewers and will be made available online prior to the workshop.

Papers should be submitted via EasyChair at

https://easychair.org/conferences/?conf=lexdhai2019

Notifications will be sent by the end of May, and final versions of abstracts will be required by the end of June.

Registration

Please register for the workshop via the DH2019 Conference registration.

Workshop Schedule

9:00 - 9:30 Invited talk: Andrea Abel (Eurac Research, Italy)
"Unity in Variety: Observation and Lexicographic Treatment of the German Standard Variety used in South Tyrol"

9:30 - 10:00 Invited talk: Piek Vossen (Vrije Universiteit Amsterdam, Netherlands)
"Framing in the Dutch Language: from structured data to text and back from text to structured data on situations"

10:00 - 10:25 Introduction to Prêt-à-LLOD project

10:25 - 10:55 Coffee break

10:55 - 11:20 Introduction to ELEXIS project: Anna Woldrich (Austrian Academy of Sciences, Austria)
"Linking Communities through ELEXIS: the social aspects of an infrastructure"

11:20 - 12:00 Lightning talks

12:00 - 13:00 Poster session

"ONAMA – Ontology of the Narratives of the Middle Ages" (Peter Hinkelmanns)
"Planning a domain-specific electronic dictionary for the mathematical field of graph theory" (Theresa Kruse)
"Towards a representation of figurative language and semantic shift in computational lexica: a case study in Old English emotions" (Javier Enrique Díaz-Vera, Fahad Khan and Monica Monachini)
"Wikidata: lexicographical data for everyone" (Lydia Pintscher)

Tanja Wissik - Austrian Centre for Digital Humanities

Tanja Wissik is a senior researcher at the Austrian Centre for Digital Humanities (ACDH) of the Austrian Academy of Sciences and teaches information technologies for translators at the University of Graz. She graduated from the University of Graz in translation and interpretation studies. She holds a PhD in translation studies from the University of Vienna, with a specialization in terminology and corpus linguistics. She has worked on numerous national and international research projects related to language resources and language technologies, first as a junior researcher at the Institute for Specialized Communication and Multilingualism of the European Academy Bolzano, and then as a researcher and lecturer at the University of Vienna.

Paul Buitelaar - National University of Ireland Galway

Paul Buitelaar is a Senior Lecturer at the National University of Ireland Galway (NUIG), vice-director of the Insight Centre for Data Analytics at NUIG and head of the Insight Unit for Natural Language Processing. His main research interests are in the development and use of Natural Language Processing methods and solutions for semantic-based information access. He has been involved in a large number of nationally and internationally funded projects in this area. In recent years he has been involved in the development of the Saffron framework for knowledge extraction and in the definition and implementation of lemon, a vocabulary for Linguistic Linked Data.

John P. McCrae - National University of Ireland Galway

John P. McCrae is a lecturer above-the-bar at the National University of Ireland, Galway in the school of information technology. His work has focussed on the application of linked data to language resources. In particular, he is the original developer of the lemon-OntoLex model, which has become a de facto standard for representing lexicons on the Web. In addition, he is a board member of the Global WordNet Association. He has also organized many events, including the Language Data and Knowledge Conferences (2017, 2019), the Linked Data in Linguistics Workshops (2013, 2014, 2015, 2016, 2018), Summer Datathons/Summer Schools on Linguistic Linked Open Data (2015, 2017) and seven other workshops.

Justin Tonra - National University of Ireland Galway

Justin Tonra is Lecturer in English (Digital Humanities) at the National University of Ireland Galway. His areas of research interest include digital approaches to literary studies, book history, textual studies and bibliography, scholarly editing, and literature of the Romantic period. He is currently joint National Coordinator for DARIAH Ireland, and a working-group leader for COST Action CA16204 Distant Reading for European Literary History.

Toma Tasovac - Belgrade Center for Digital Humanities

Toma Tasovac is Director of the Belgrade Center for Digital Humanities (BCDH) and, as of September 2018, Director of the Digital Research Infrastructure for the Arts and Humanities (DARIAH). His areas of interest include lexicography, data modeling, TEI, digital editions and research infrastructures. Toma was previously a Steering Group member of the European Network for eLexicography (ENeL), and is currently also affiliated with the European Lexicographic Infrastructure (ELEXIS).

Ksenia Zaytseva - Austrian Centre for Digital Humanities

Ksenia Zaytseva is a data analyst at the Austrian Centre for Digital Humanities (ACDH) of the Austrian Academy of Sciences. She coordinates work on Reference Resources and Controlled Vocabularies, with a main focus on further developing and maintaining the ACDH Vocabularies platform. She is involved in the development of data-driven applications for digital humanities projects in the archaeological and linguistic domains. Her research interests are Semantic Web technologies and Linked (Open) Data, and methods and technologies in knowledge engineering and machine learning for data analysis.