Workshops and Lectures
Workshops
Big Data in the Chamber: Corpus-Assisted Studies of Parliamentary Discourse Across Time and Space
Instructors: Anna Kryvenko
Duration: both weeks
Syllabus: View details
Abstract:
Parliaments are pivotal institutions in democracies, shaping policies that impact citizens by deliberating critical societal issues. The debates are commonly recorded as open-access digital proceedings enriched with metadata. These records are valuable for researchers exploring political, societal, historical, cultural or communicational dynamics in fields such as linguistics, discourse analysis, political science, history, sociology and gender studies, as well as for various teaching contexts. This workshop takes advantage of the interoperability and comparability of the ParlaMint corpora, which contain parliamentary proceedings from 26 national and 3 regional parliaments across Europe, covering at least the years 2015–2022 (several ParlaMint corpora include data spanning a much longer period). Available in the original languages and machine-translated to English, the corpora also feature metadata on speakers, parties and speeches, including names, gender, age, roles, party affiliation, power positions, political leanings, speech dates, topics and sentiment. This hands-on, project-oriented tutorial will provide skills and methodological training to explore ParlaMint version 5.0, which can be obtained by downloading the files or by accessing the preloaded data via online platforms – primarily noSketch Engine and TEITOK. All data and tools are open access and can be used free of charge. Designed for researchers in Social Sciences and Humanities with an interest in parliamentary discourse but little or no familiarity with corpus linguistic tools, this workshop will train participants to leverage extensive content, annotations and metadata via user-friendly concordancers, facilitating research on individual national parliaments, enabling transnational comparisons, and fostering cross-disciplinary collaboration. Participants will also discover CLARIN, the Common Language Resources and Technology Infrastructure.
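The concordancers mentioned above present each occurrence of a query word centred in its surrounding context. A minimal keyword-in-context (KWIC) routine in Python sketches the underlying idea (the function name and sample sentence are invented for illustration; this is not how noSketch Engine or TEITOK are implemented):

```python
import re

def kwic(text, keyword, window=4):
    """Return keyword-in-context lines: `window` tokens of
    left and right context around each hit of `keyword`."""
    tokens = re.findall(r"\w+", text.lower())
    hits = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append(f"{left} [{tok}] {right}")
    return hits

speech = ("The honourable member asked about climate policy. "
          "Climate change demands urgent parliamentary action.")
for line in kwic(speech, "climate"):
    print(line)
```

Real concordancers add indexing, regular-expression queries and metadata filters on top of this basic pattern.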
Critical AI and Large Language Models for Humanists
Instructors: Anouk Lang
Duration: 2nd week
Syllabus: View details
Abstract:
This course is based on the premise that scholarly expertise from the humanities can, and should, be brought to bear on understanding how generative AI and related algorithmic technologies work, their broader effects on the world, and how we might more thoughtfully and responsibly engage with them. The course seeks to give enough technical background—from the probabilistic underpinnings of large language models to considerations of training corpora and interface design—for participants to be able to take an informed stance on the pressing social, ethical and environmental concerns emerging around generative AI, and to better understand the potential impact of these technologies on institutions important to the continued functioning of democracy, such as the media and education. It incorporates hands-on activities that offer participants the chance to explore features of the technologies for themselves and follow their own interests. In bringing developments from one set of disciplinary orientations—the fields of science and technology studies (STS) and critical AI—together with theoretical and methodological approaches from humanities subjects—such as historical perspectives on text generation technologies and insights from literary theory with established explanatory power for grasping the workings of narrative, genre and style—it seeks to model for participants a reflexively critical orientation to ‘AI’: not as the singular, stable entity which is so often the referent of technocapitalist hype, but as a collection of imbricating technologies, emerging from specific historical and cultural contexts, whose putative dominance we have the ability to critique and to contest.
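The "probabilistic underpinnings" the abstract refers to can be glimpsed even in a toy model. The sketch below uses a bigram model—vastly simpler than the neural networks behind actual LLMs, and offered purely as an illustration of the counting-based estimation of next-token probabilities; the corpus and function names are invented:

```python
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count bigram frequencies and convert them to
    conditional next-token probabilities P(next | current)."""
    counts = defaultdict(Counter)
    for cur, nxt in zip(tokens, tokens[1:]):
        counts[cur][nxt] += 1
    return {cur: {nxt: c / sum(nxts.values())
                  for nxt, c in nxts.items()}
            for cur, nxts in counts.items()}

corpus = "the model predicts the next word and the next sentence".split()
model = train_bigram(corpus)
# After "the", which continuations are most probable?
print(model["the"])
```

A large language model replaces these raw counts with a learned neural function over long contexts, but the output is still a probability distribution over next tokens, sampled to generate text.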
Digital Archives: Reading and Manipulating Large-Scale Catalogues, Curating and Creating Small-Scale Archives
Instructors: Yael Netzer
Duration: both weeks
Syllabus: View details
Abstract:
This two-week intensive workshop provides participants with the practical and critical skills necessary to navigate the lifecycle of digital knowledge representation. The first week focuses on “reading” and manipulating data, utilizing OpenRefine to clean, enrich, and reconcile messy datasets with Linked Open Data (LOD) sources like Wikidata and GeoNames. In the second week, the focus shifts to the creation of “Archives of the Present,” where participants move from theory to implementation. Using tools such as Omeka-S, students will conceptualize metadata structures and build small-scale digital archives while grappling with the ethical and technical challenges of preserving contemporary, ephemeral digital traces. By blending hands-on technical training—including web scraping, API integration, and machine learning-assisted metadata extraction—with archival theory, this workshop empowers researchers to transform raw data into structured, sustainable, and accessible digital resources. No prior technical knowledge is required.
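Reconciliation, as performed in OpenRefine, comes down to matching messy strings against authority records. A minimal sketch of that fuzzy-matching idea using only the Python standard library—the gazetteer here is a hypothetical stand-in for a Wikidata or GeoNames lookup, not a real service call:

```python
import difflib

# Hypothetical authority list standing in for a Wikidata/GeoNames lookup.
gazetteer = ["Besançon", "Besalú", "Bessans", "Briançon"]

def reconcile(name, candidates, cutoff=0.6):
    """Return the closest authority-record label for a messy
    input string, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(name, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(reconcile("Besancon", gazetteer))
```

OpenRefine's reconciliation services add candidate scoring, type constraints and human review on top of this basic similarity matching.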
Digital Curation and Cultural Heritage
Instructors: Carol Chiodo
Duration: one week
Syllabus: View details
Abstract:
The large-scale digitization of cultural heritage, emerging forms of born-digital collections and archives, and the new ways in which researchers across disciplines engage with these materials are challenging traditional theories and practices in cultural heritage. They also present new opportunities for practitioners and researchers working with digitized collections and archival materials by expanding the means of discovery, engagement, and stewardship. These opportunities include the application of computational tools and methods, integrating computational analysis with traditional cultural heritage curation methods. This workshop will introduce digital curation as it pertains to the cultural heritage sector, understood as the selection, acquisition, preservation, maintenance, and delivery of digital data derived from objects held by libraries and museums. We will examine the intersection of emerging computational and analytical methods and technologies with museum collections, rare books and archival holdings, while weighing the advantages and disadvantages for historical, social, scientific, and cultural research engagement with these materials. The course will provide an overview of best practices for the management and stewardship of these born-digital and digitized holdings in libraries and museums, and we will also explore recent applications of digital curation to cultural heritage, including generative AI, geospatial analysis, audiovisual curation, collections as data, and more. The workshop is suitable for those new to the field, such as graduate students and early career scholars, as well as for experienced practitioners interested in gaining new perspectives on computational curation in cultural heritage.
Digital XML-TEI Content Creation and Processing Aided by AI Tools
Instructors: Alejandro Bia
Duration: both weeks
Syllabus: View details
Abstract:
In this workshop, you’re going to learn how to create documents using XML-based encoding schemes like TEI, HTML, and ePub. Additionally, you will discover how to manage document structure to normalize large collections and how to render and transform documents automatically using technologies such as CSS, XPath, and XSLT. We will also explore how the most recent AI tools can help us perform some of these tasks.
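As a taste of the XPath side of the workshop, even the Python standard library can run simple XPath-style queries over a TEI-encoded fragment. The fragment below is a made-up minimal example for illustration, not a complete valid TEI document:

```python
import xml.etree.ElementTree as ET

# A minimal TEI fragment (illustrative only, not a full TEI document).
tei = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <p>First paragraph.</p>
    <p>Second <hi rend="italic">emphasised</hi> paragraph.</p>
  </body></text>
</TEI>"""

ns = {"tei": "http://www.tei-c.org/ns/1.0"}
root = ET.fromstring(tei)
# XPath-style query: all <p> elements anywhere in the tree.
paragraphs = root.findall(".//tei:p", ns)
for p in paragraphs:
    print("".join(p.itertext()))
```

Full XPath and XSLT, as covered in the workshop, go well beyond the limited subset `xml.etree.ElementTree` supports, but the pattern of addressing elements by path is the same.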
Distant Reading in R. Analyse the text & visualize the data
Instructors: Simone Rebora, Giovanni Pietro Vitali
Duration: both weeks
Syllabus: View details
Abstract:
Distant reading is one of the most well-known methodological approaches in digital humanities, formalized by Franco Moretti in the article “Conjectures on World Literature” (2000). It benefits greatly from computational tools. For this reason, we are proposing a course based on the use of R, one of the most popular programming languages in the scientific community today. The philosophy of the course is to analyze text and visualize data, and its structure follows this dichotomy. The objective is to introduce participants to different methodological perspectives and provide practical tools they can use in their own research. The course offers a compact introduction to natural language processing, computational text analysis, machine learning, graph theory, and geospatial humanities. By the end of the two-week course, participants will be able to use R and RStudio to apply textual and spatial analysis. An important component of the course is data visualization, an area in which R excels, offering a comprehensive framework for creating graphs, maps, and trees. The final part of the course will focus on open-source programs like Gephi, GIMP, and Inkscape, which allow users to manipulate and rework vector and graphical files. The course is suitable for beginners who want to start their digital humanities training with a complete overview of the most common tools used for distant reading.
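Although the course itself works in R, the core gesture of distant reading—counting words at a scale no close reader could manage—is easy to sketch in any language. A minimal Python illustration, with an invented sample text:

```python
import re
from collections import Counter

def word_frequencies(text, top=5):
    """Tokenize on word characters and return the most frequent types."""
    tokens = re.findall(r"\w+", text.lower())
    return Counter(tokens).most_common(top)

sample = ("Distant reading counts words so that patterns across many "
          "texts become visible, words the close reader might overlook.")
print(word_frequencies(sample, top=3))
```

Scaled up to thousands of texts and combined with the visualization layers R provides, this simple counting becomes the raw material for the trends, maps and trees that distant reading is known for.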
Humanities Data and Mapping Environments
Instructors: David Wrisley, Voica Pușcașiu
Duration: both weeks
Syllabus: View details
Abstract:
This spatial humanities workshop will introduce participants to different ways of thinking about humanities data, their curation within projects, and their use in digital mapping environments. The workshop will not be a traditional course in Geographic Information Systems (GIS), although we will use open source GIS and web mapping along the way. The workshop is designed for the total beginner who would like to explore how a spatial dimension can enrich humanities and interdisciplinary research projects and to learn some fundamental skills for collecting and organizing data in order to be able to integrate such methods into their research workflows. Drawing inspiration from the location of the ESU in the historical center of Besançon, participants will gather data from within the city and will work with data from local cultural institutions. The workshop will also introduce students to ways in which artificial intelligence and machine learning are opening up new horizons for spatial humanities research.
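As a flavour of the computation underlying web mapping, the great-circle distance between two points on the Earth can be obtained with the haversine formula. The sketch below uses approximate coordinates for Besançon and Paris; the function name and rounding are illustrative choices:

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two
    (latitude, longitude) points, via the haversine formula."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

# Approximate coordinates: Besançon (47.24 N, 6.02 E), Paris (48.86 N, 2.35 E).
print(round(haversine_km(47.24, 6.02, 48.86, 2.35), 1))
```

GIS software and web mapping libraries handle projections, layers and rendering on top of such distance and coordinate calculations.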
Introduction to Stylometry
Instructors: Jeremi Ochab
Duration: one week
Syllabus: View details
Abstract:
You may have heard that computational text analysis for digital humanities and cultural analytics is fun. We assure you it’s not: it’s a grim endeavour that can easily go wrong really quickly and requires patience, expertise and making responsible choices. In this one-week course, we offer a comprehensive survival guide to multivariate text analysis with R, where we start with the basics of counting words and spend a lot of time on fundamentals: text representation, calculation of differences and similarities, vector manipulations, unsupervised and supervised methods of text classification. We will guide you through the user-friendly interface of stylo software to introduce important concepts and operations. Then, we will show you how to expand on that: understand the workflow, design your own research, discuss real-world studies and run simple replication experiments.
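The multivariate methods implemented in stylo can be previewed in miniature. The sketch below implements the core of Burrows's Delta—z-scored relative word frequencies compared across texts—in Python rather than R, with invented toy frequencies standing in for real function-word counts:

```python
import statistics

def burrows_delta(freqs_a, freqs_b, corpus_freqs):
    """Burrows's Delta: mean absolute difference of z-scored
    relative word frequencies between two texts.

    corpus_freqs maps each word to its relative frequencies
    across the reference corpus.
    """
    deltas = []
    for word, samples in corpus_freqs.items():
        mu = statistics.mean(samples)
        sigma = statistics.stdev(samples)
        za = (freqs_a[word] - mu) / sigma
        zb = (freqs_b[word] - mu) / sigma
        deltas.append(abs(za - zb))
    return sum(deltas) / len(deltas)

# Toy relative frequencies of two function words in a 3-text corpus.
corpus = {"the": [0.05, 0.06, 0.07], "of": [0.02, 0.03, 0.04]}
text_a = {"the": 0.05, "of": 0.02}
text_b = {"the": 0.07, "of": 0.04}
print(round(burrows_delta(text_a, text_b, corpus), 3))
```

A low Delta suggests stylistic similarity (and possibly shared authorship); stylo wraps this and related measures in a point-and-click interface with built-in visualization.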
Measuring Manuscripts: Quantitative Approaches to Ancient Language and Script
Instructors: Christian Casey
Duration: both weeks
Syllabus: View details
Abstract:
This two-week workshop introduces participants to computational philology through hands-on work with manuscripts, text corpora, and digital tools. The course begins with traditional philological methods—textual criticism, paleography, orthography, and phonology—and then shows how these practices can be structured, modeled, and analyzed using accessible computational techniques. Participants will learn how to build small digital corpora, align textual variants, generate concordances and word lists, and encode script and linguistic data in structured formats. The workshop introduces basic quantitative approaches such as frequency analysis, diachronic comparison, simple statistical correlation, and exploratory visualization (including clustering and dimensionality reduction). Students will also explore how sound data, script data, and textual data can be treated as analyzable datasets. Throughout the course, each participant develops a small individual project based on one method covered in class (e.g., a concordance, variant alignment, paleographic clustering visualization, diachronic frequency comparison, or phonological analysis). In the final sessions, students publish their project online using lightweight, sustainable infrastructure. No prior programming experience is required. By the end of the workshop, participants will have designed, implemented, and publicly shared a small born-digital philological project grounded in structured data and computational analysis.
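Variant alignment, one of the methods listed above, can be sketched with the Python standard library. The two "witnesses" below are invented, and `difflib` stands in for the more specialised collation tools used in philology:

```python
import difflib

# Two hypothetical manuscript witnesses of the same line.
witness_a = "in the beginning was the word".split()
witness_b = "in beginning was that word".split()

# Align the two token sequences and report the variants.
matcher = difflib.SequenceMatcher(a=witness_a, b=witness_b)
for op, a0, a1, b0, b1 in matcher.get_opcodes():
    if op != "equal":
        print(f"{op}: {witness_a[a0:a1]} -> {witness_b[b0:b1]}")
```

The aligned omissions and substitutions this produces are exactly the raw material of a critical apparatus; dedicated collation software adds normalization, multi-witness alignment and apparatus output formats.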
Voices as Data: Computational Phonetics and Responsible AI for Cultural Soundscapes
Instructors: Katarzyna Foremniak, Mariusz Sozański
Duration: both weeks
Syllabus: View details
Abstract:
While Digital Humanities has extensively explored textual and visual sources, the computational analysis of speech and vocal performance remains underrepresented. This two-week workshop introduces voice as structured cultural data and critically examines how AI technologies mediate access to vocal heritage. Participants will learn how to design and curate small speech corpora, extract acoustic and prosodic features, and apply basic statistical models to analyze phonetic variation. Building on these foundations, the workshop integrates AI-based speech technologies, including automatic speech recognition and large language models, with a strong emphasis on evaluation, bias detection, and epistemic risk. Rather than focusing on tool usage alone, the course foregrounds responsible AI practices, transparency, and reproducibility in the analysis of spoken cultural data. Through hands-on sessions using open datasets such as Mozilla Common Voice and selected oral history recordings, participants will collaboratively build a reusable, open-source research toolkit hosted on GitHub. This toolkit will include containerized workflows, analysis notebooks, documentation, and a responsible-AI framework tailored to speech data in humanities research. The workshop combines computational practice, infrastructural awareness, and critical reflection, empowering participants from diverse disciplinary backgrounds to treat voice as analyzable cultural evidence while remaining attentive to ethical, methodological, and social implications. No advanced programming skills are required.
Lectures
The ESU also offers the opportunity to attend lectures on various topics in Digital Humanities. Representatives from the CLARIN Trainers’ Network as well as from Humanistica have already confirmed their presence.