Workshops and Lectures
Workshops
Big Data in the Chamber: Corpus-Assisted Studies of Parliamentary Discourse Across Time and Space
Instructors: Anna Kryvenko
Duration: both weeks
Syllabus: View details
Abstract:
Parliaments are pivotal institutions in democracies, shaping policies that impact citizens by deliberating critical societal issues. The debates are commonly recorded as open-access digital proceedings enriched with metadata. These records are valuable for researchers exploring political, societal, historical, cultural or communicational dynamics in fields such as linguistics, discourse analysis, political science, history, sociology and gender studies, as well as for various teaching contexts. This workshop takes advantage of the interoperability and comparability of the ParlaMint corpora, which contain parliamentary proceedings from 26 national and 3 regional parliaments across Europe, covering at least the years 2015–2022 (several ParlaMint corpora include data spanning a much longer period). Available in the original languages and machine-translated to English, the corpora also feature metadata on speakers, parties and speeches, including names, gender, age, roles, party affiliation, power positions, political leanings, speech dates, topics and sentiment. This hands-on, project-oriented tutorial will provide skills and methodological training to explore ParlaMint version 5.0, which can be obtained by downloading the files or by accessing the preloaded data via online platforms – primarily noSketch Engine and TEITOK. All data and tools are open access and can be used free of charge. Designed for researchers in Social Sciences and Humanities with an interest in parliamentary discourse but little or no familiarity with corpus linguistic tools, this workshop will train participants to leverage extensive content, annotations and metadata via user-friendly concordancers, facilitating research on individual national parliaments, enabling transnational comparisons, and fostering cross-disciplinary collaboration. Participants will also discover CLARIN, the Common Language Resources and Technology Infrastructure.
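The concordancers mentioned above present each occurrence of a query word centred in its surrounding context. A minimal keyword-in-context (KWIC) routine in Python sketches the underlying idea (the function name and sample sentence are invented for illustration; this is not how noSketch Engine or TEITOK are implemented):

```python
import re

def kwic(text, keyword, window=4):
    """Return keyword-in-context lines: `window` tokens of
    left and right context around each hit of `keyword`."""
    tokens = re.findall(r"\w+", text.lower())
    hits = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append(f"{left} [{tok}] {right}")
    return hits

speech = ("The honourable member asked about climate policy. "
          "Climate change demands urgent parliamentary action.")
for line in kwic(speech, "climate"):
    print(line)
```

Real concordancers add indexing, regular-expression queries and metadata filters on top of this basic pattern.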
Critical AI and Large Language Models for Humanists
Instructors: Anouk Lang
Duration: 2nd week
Syllabus: View details
Abstract:
This course is based on the premise that scholarly expertise from the humanities can, and should, be brought to bear on understanding how generative AI and related algorithmic technologies work, their broader effects on the world, and how we might more thoughtfully and responsibly engage with them. The course seeks to give enough technical background—from the probabilistic underpinnings of large language models to considerations of training corpora and interface design—for participants to be able to take an informed stance on the pressing social, ethical and environmental concerns emerging around generative AI, and to better understand the potential impact of these technologies on institutions important to the continued functioning of democracy, such as the media and education. It incorporates hands-on activities that offer participants the chance to explore features of the technologies for themselves and follow their own interests. In bringing developments from one set of disciplinary orientations—the fields of science and technology studies (STS) and critical AI—together with theoretical and methodological approaches from humanities subjects—such as historical perspectives on text generation technologies and insights from literary theory with established explanatory power for grasping the workings of narrative, genre and style—it seeks to model for participants a reflexively critical orientation to ‘AI’: not as the singular, stable entity which is so often the referent of technocapitalist hype, but as a collection of imbricating technologies, emerging from specific historical and cultural contexts, whose putative dominance we have the ability to critique and to contest.
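The "probabilistic underpinnings" the abstract refers to can be glimpsed even in a toy model. The sketch below uses a bigram model—vastly simpler than the neural networks behind actual LLMs, and offered purely as an illustration of the counting-based estimation of next-token probabilities; the corpus and function names are invented:

```python
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count bigram frequencies and convert them to
    conditional next-token probabilities P(next | current)."""
    counts = defaultdict(Counter)
    for cur, nxt in zip(tokens, tokens[1:]):
        counts[cur][nxt] += 1
    return {cur: {nxt: c / sum(nxts.values())
                  for nxt, c in nxts.items()}
            for cur, nxts in counts.items()}

corpus = "the model predicts the next word and the next sentence".split()
model = train_bigram(corpus)
# After "the", which continuations are most probable?
print(model["the"])
```

A large language model replaces these raw counts with a learned neural function over long contexts, but the output is still a probability distribution over next tokens, sampled to generate text.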
Digital Archives: Reading and Manipulating Large-Scale Catalogues, Curating and Creating Small-Scale Archives
Instructors: Yael Netzer
Duration: both weeks
Syllabus: View details
Abstract:
This two-week intensive workshop provides participants with the practical and critical skills necessary to navigate the lifecycle of digital knowledge representation. The first week focuses on “reading” and manipulating data, utilizing OpenRefine to clean, enrich, and reconcile messy datasets with Linked Open Data (LOD) sources like Wikidata and GeoNames. In the second week, the focus shifts to the creation of “Archives of the Present,” where participants move from theory to implementation. Using tools such as Omeka-S, students will conceptualize metadata structures and build small-scale digital archives while grappling with the ethical and technical challenges of preserving contemporary, ephemeral digital traces. By blending hands-on technical training—including web scraping, API integration, and machine learning-assisted metadata extraction—with archival theory, this workshop empowers researchers to transform raw data into structured, sustainable, and accessible digital resources. No prior technical knowledge is required.
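Reconciliation, as performed in OpenRefine, comes down to matching messy strings against authority records. A minimal sketch of that fuzzy-matching idea using only the Python standard library—the gazetteer here is a hypothetical stand-in for a Wikidata or GeoNames lookup, not a real service call:

```python
import difflib

# Hypothetical authority list standing in for a Wikidata/GeoNames lookup.
gazetteer = ["Besançon", "Besalú", "Bessans", "Briançon"]

def reconcile(name, candidates, cutoff=0.6):
    """Return the closest authority-record label for a messy
    input string, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(name, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(reconcile("Besancon", gazetteer))
```

OpenRefine's reconciliation services add candidate scoring, type constraints and human review on top of this basic similarity matching.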
Digital Curation and Cultural Heritage
Instructors: Carol Chiodo
Duration: one week
Syllabus: View details
Abstract:
The large-scale digitization of cultural heritage, emerging forms of born-digital collections and archives, and the new ways in which researchers across disciplines engage with these materials are challenging traditional theories and practices in cultural heritage. They also present new opportunities for practitioners and researchers working with digitized collections and archival materials by expanding the means of discovery, engagement, and stewardship. These opportunities include the application of computational tools and methods, integrating computational analysis with traditional cultural heritage curation methods. This workshop will introduce digital curation as it pertains to the cultural heritage sector, understood as the selection, acquisition, preservation, maintenance, and delivery of digital data derived from objects held by libraries and museums. We will examine the intersection of emerging computational and analytical methods and technologies with museum collections, rare books and archival holdings, while weighing the advantages and disadvantages for historical, social, scientific, and cultural research engagement with these materials. The course will provide an overview of best practices for the management and stewardship of these born-digital and digitized holdings in libraries and museums, and we will also explore recent applications of digital curation to cultural heritage, including generative AI, geospatial analysis, audiovisual curation, collections as data, and more. The workshop is suitable for those new to the field, such as graduate students and early career scholars, as well as for experienced practitioners interested in gaining new perspectives on computational curation in cultural heritage.
Digital XML-TEI Content Creation and Processing Aided by AI Tools
Instructors: Alejandro Bia
Duration: both weeks
Syllabus: View details
Abstract:
In this workshop, you’re going to learn how to create documents using XML-based encoding schemes like TEI, HTML, and ePub. Additionally, you will discover how to manage document structure to normalize large collections and how to render and transform documents automatically using technologies such as CSS, XPath, and XSLT. We will also explore how the most recent AI tools can help us perform some of these tasks.
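As a taste of the XPath side of the workshop, even the Python standard library can run simple XPath-style queries over a TEI-encoded fragment. The fragment below is a made-up minimal example for illustration, not a complete valid TEI document:

```python
import xml.etree.ElementTree as ET

# A minimal TEI fragment (illustrative only, not a full TEI document).
tei = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <p>First paragraph.</p>
    <p>Second <hi rend="italic">emphasised</hi> paragraph.</p>
  </body></text>
</TEI>"""

ns = {"tei": "http://www.tei-c.org/ns/1.0"}
root = ET.fromstring(tei)
# XPath-style query: all <p> elements anywhere in the tree.
paragraphs = root.findall(".//tei:p", ns)
for p in paragraphs:
    print("".join(p.itertext()))
```

Full XPath and XSLT, as covered in the workshop, go well beyond the limited subset `xml.etree.ElementTree` supports, but the pattern of addressing elements by path is the same.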
Distant Reading in R. Analyse the text & visualize the data
Instructors: Simone Rebora, Giovanni Pietro Vitali
Duration: both weeks
Syllabus: View details
Abstract:
Distant reading is one of the most well-known methodological approaches in digital humanities, formalized by Franco Moretti in the article “Conjectures on World Literature” (2000). It benefits greatly from computational tools. For this reason, we are proposing a course based on the use of R, one of the most popular programming languages in the scientific community today. The philosophy of the course is to analyze text and visualize data, and its structure follows this dichotomy. The objective is to introduce participants to different methodological perspectives and provide practical tools they can use in their own research. The course offers a compact introduction to natural language processing, computational text analysis, machine learning, graph theory, and geospatial humanities. By the end of the two-week course, participants will be able to use R and RStudio to apply textual and spatial analysis. An important component of the course is data visualization, an area in which R excels, offering a comprehensive framework for creating graphs, maps, and trees. The final part of the course will focus on open-source programs like Gephi, GIMP, and Inkscape, which allow users to manipulate and rework vector and graphical files. The course is suitable for beginners who want to start their digital humanities training with a complete overview of the most common tools used for distant reading.
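Although the course itself works in R, the core gesture of distant reading—counting words at a scale no close reader could manage—is easy to sketch in any language. A minimal Python illustration, with an invented sample text:

```python
import re
from collections import Counter

def word_frequencies(text, top=5):
    """Tokenize on word characters and return the most frequent types."""
    tokens = re.findall(r"\w+", text.lower())
    return Counter(tokens).most_common(top)

sample = ("Distant reading counts words so that patterns across many "
          "texts become visible, words the close reader might overlook.")
print(word_frequencies(sample, top=3))
```

Scaled up to thousands of texts and combined with the visualization layers R provides, this simple counting becomes the raw material for the trends, maps and trees that distant reading is known for.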
Humanities Data and Mapping Environments
Instructors: David Wrisley, Voica Pușcașiu
Duration: both weeks
Syllabus: View details
Abstract:
This spatial humanities workshop will introduce participants to different ways of thinking about humanities data, their curation within projects, and their use in digital mapping environments. The workshop will not be a traditional course in Geographic Information Systems (GIS), although we will use open source GIS and web mapping along the way. The workshop is designed for the total beginner who would like to explore how a spatial dimension can enrich humanities and interdisciplinary research projects and to learn some fundamental skills for collecting and organizing data in order to be able to integrate such methods into their research workflows. Drawing inspiration from the location of the ESU in the historical center of Besançon, participants will gather data from within the city and will work with data from local cultural institutions. The workshop will also introduce students to ways in which artificial intelligence and machine learning are opening up new horizons for spatial humanities research.
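As a flavour of the computation underlying web mapping, the great-circle distance between two points on the Earth can be obtained with the haversine formula. The sketch below uses approximate coordinates for Besançon and Paris; the function name and rounding are illustrative choices:

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two
    (latitude, longitude) points, via the haversine formula."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

# Approximate coordinates: Besançon (47.24 N, 6.02 E), Paris (48.86 N, 2.35 E).
print(round(haversine_km(47.24, 6.02, 48.86, 2.35), 1))
```

GIS software and web mapping libraries handle projections, layers and rendering on top of such distance and coordinate calculations.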
Introduction to Stylometry
Instructors: Jeremi Ochab
Duration: one week
Syllabus: View details
Abstract:
You may have heard that computational text analysis for digital humanities and cultural analytics is fun. We assure you it’s not: it’s a grim endeavour that can easily go wrong really quickly and requires patience, expertise and making responsible choices. In this one-week course, we offer a comprehensive survival guide to multivariate text analysis with R, where we start with the basics of counting words and spend a lot of time on fundamentals: text representation, calculation of differences and similarities, vector manipulations, unsupervised and supervised methods of text classification. We will guide you through the user-friendly interface of stylo software to introduce important concepts and operations. Then, we will show you how to expand on that: understand the workflow, design your own research, discuss real-world studies and run simple replication experiments.
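The multivariate methods implemented in stylo can be previewed in miniature. The sketch below implements the core of Burrows's Delta—z-scored relative word frequencies compared across texts—in Python rather than R, with invented toy frequencies standing in for real function-word counts:

```python
import statistics

def burrows_delta(freqs_a, freqs_b, corpus_freqs):
    """Burrows's Delta: mean absolute difference of z-scored
    relative word frequencies between two texts.

    corpus_freqs maps each word to its relative frequencies
    across the reference corpus.
    """
    deltas = []
    for word, samples in corpus_freqs.items():
        mu = statistics.mean(samples)
        sigma = statistics.stdev(samples)
        za = (freqs_a[word] - mu) / sigma
        zb = (freqs_b[word] - mu) / sigma
        deltas.append(abs(za - zb))
    return sum(deltas) / len(deltas)

# Toy relative frequencies of two function words in a 3-text corpus.
corpus = {"the": [0.05, 0.06, 0.07], "of": [0.02, 0.03, 0.04]}
text_a = {"the": 0.05, "of": 0.02}
text_b = {"the": 0.07, "of": 0.04}
print(round(burrows_delta(text_a, text_b, corpus), 3))
```

A low Delta suggests stylistic similarity (and possibly shared authorship); stylo wraps this and related measures in a point-and-click interface with built-in visualization.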
Measuring Manuscripts: Quantitative Approaches to Ancient Language and Script
Instructors: Christian Casey
Duration: both weeks
Syllabus: View details
Abstract:
This two-week workshop introduces participants to computational philology through hands-on work with manuscripts, text corpora, and digital tools. The course begins with traditional philological methods—textual criticism, paleography, orthography, and phonology—and then shows how these practices can be structured, modeled, and analyzed using accessible computational techniques. Participants will learn how to build small digital corpora, align textual variants, generate concordances and word lists, and encode script and linguistic data in structured formats. The workshop introduces basic quantitative approaches such as frequency analysis, diachronic comparison, simple statistical correlation, and exploratory visualization (including clustering and dimensionality reduction). Students will also explore how sound data, script data, and textual data can be treated as analyzable datasets. Throughout the course, each participant develops a small individual project based on one method covered in class (e.g., a concordance, variant alignment, paleographic clustering visualization, diachronic frequency comparison, or phonological analysis). In the final sessions, students publish their project online using lightweight, sustainable infrastructure. No prior programming experience is required. By the end of the workshop, participants will have designed, implemented, and publicly shared a small born-digital philological project grounded in structured data and computational analysis.
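Variant alignment, one of the methods listed above, can be sketched with the Python standard library. The two "witnesses" below are invented, and `difflib` stands in for the more specialised collation tools used in philology:

```python
import difflib

# Two hypothetical manuscript witnesses of the same line.
witness_a = "in the beginning was the word".split()
witness_b = "in beginning was that word".split()

# Align the two token sequences and report the variants.
matcher = difflib.SequenceMatcher(a=witness_a, b=witness_b)
for op, a0, a1, b0, b1 in matcher.get_opcodes():
    if op != "equal":
        print(f"{op}: {witness_a[a0:a1]} -> {witness_b[b0:b1]}")
```

The aligned omissions and substitutions this produces are exactly the raw material of a critical apparatus; dedicated collation software adds normalization, multi-witness alignment and apparatus output formats.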
Voices as Data: Computational Phonetics and Responsible AI for Cultural Soundscapes
Instructors: Katarzyna Foremniak, Mariusz Sozański
Duration: both weeks
Syllabus: View details
Abstract:
While Digital Humanities has extensively explored textual and visual sources, the computational analysis of speech and vocal performance remains underrepresented. This two-week workshop introduces voice as structured cultural data and critically examines how AI technologies mediate access to vocal heritage. Participants will learn how to design and curate small speech corpora, extract acoustic and prosodic features, and apply basic statistical models to analyze phonetic variation. Building on these foundations, the workshop integrates AI-based speech technologies, including automatic speech recognition and large language models, with a strong emphasis on evaluation, bias detection, and epistemic risk. Rather than focusing on tool usage alone, the course foregrounds responsible AI practices, transparency, and reproducibility in the analysis of spoken cultural data. Through hands-on sessions using open datasets such as Mozilla Common Voice and selected oral history recordings, participants will collaboratively build a reusable, open-source research toolkit hosted on GitHub. This toolkit will include containerized workflows, analysis notebooks, documentation, and a responsible-AI framework tailored to speech data in humanities research. The workshop combines computational practice, infrastructural awareness, and critical reflection, empowering participants from diverse disciplinary backgrounds to treat voice as analyzable cultural evidence while remaining attentive to ethical, methodological, and social implications. No advanced programming skills are required.
Lectures
The ESU also offers the opportunity to attend lectures on various topics in Digital Humanities. Representatives from the CLARIN Trainers’ Network as well as from Humanistica have already confirmed their presence.