Tutorials | DHASA 2021

Several tutorials will be organized. More information on the dates and times of the events will follow.

CATMA 6.0

Malte Meister & Dominik Gerstorfer
https://www.catma.de

This workshop provides an introduction to the annotation software CATMA 6.0. We will show you how to create projects, upload your text data and annotate it in a very free and undogmatic way. Manual annotation features let you add any annotation category you like to text passages of any length. You can even annotate the same passage with any number of tags. Once you have added a thick description to your text in this way, you can analyze and visualize your interpretation of it. To speed up your annotation process, you can use semi-automatic annotation or work on your annotations collaboratively. The CATMA GUI is designed to be used intuitively by Humanities scholars without any programming skills.

Target Group: Humanities Scholars
Requirements: A background in a text-based Humanities discipline (Literary Studies, History, Linguistics or other) is helpful for following the workshop

About the presenter
Malte Meister is a technical associate (software developer) at Technische Universität Darmstadt, where he is a member of the DFG-funded “forTEXT” project team. He is primarily responsible for the maintenance and ongoing development of the CATMA web application.

Preparations
– Create a CATMA account at https://app.catma.de/catma/ using an email address and a password (click ‘Sign up’), or simply sign in with a Google account (click ‘Sign in’).

Hackathon: Glosario
(A multilingual glossary package)

Angelique Trusler
https://carpentries.org/

Glosario is an open source glossary of terms used in data science that is available online and also as a library in both R and Python. By adding glossary keys to a lesson’s metadata, authors can indicate what the lesson teaches, what learners ought to know before they start, and where they can go to find that knowledge. Authors can also use the library’s functions to insert consistent hyperlinks for terms and definitions in their lessons in any of several languages. To help expand Glosario, we will add definitions in other languages to existing terms. You do not need to know any particular programming language to contribute to Glosario.

Target Group: Everyone
Requirements: GitHub account (optional)

About the presenter
Angelique, PhD (Educational Psychology), is the Regional Consultant for Southern Africa at The Carpentries. She aims to develop strategies and workflows to adapt The Carpentries’ membership and workshop programs to function optimally in Southern Africa. The role also involves assisting and advising on a plan to create and support a strong team of volunteer instructors across the region.

Preparations
– Please sign up for a GitHub account
– Read the documentation on how to make a contribution to Glosario, and browse the Glossary

WikimediaZA edit-a-thon

Bobby Shabangu
http://wikimedia.org.za/wiki/Main_Page

Wikipedia is the biggest online encyclopedia, accessed by millions of users daily, yet like the rest of the internet it is dominated by American and European content. This workshop shows participants how they can add notable content to Wikipedia, particularly African content, and, if they speak an African language, how to do so in their own mother tongue.

Target Group: Everyone
Requirements: No background knowledge or programming experience is needed. Participants only need a computer, tablet or phone with internet access.

About the presenter
Bobby Shabangu is a South African Wikimedian who is passionate about African content and decolonising the internet. Shabangu started editing Wikipedia in 2013 and has since contributed more content to the SiSwati Wikipedia. Some of his notable projects include collaborating with galleries, libraries, archives and museums to create Wikipedia pages on topics such as the history of old buildings in the city of Johannesburg, the South African constitution, notable South African artists, South African food and political figures.

Preparations
– Create a Wikipedia user account by following the instructions in this 2-minute YouTube video.
– To get an idea of what to expect, watch this short clip about the history of Wikipedia.

Digitalization and Visualization
(A practical approach with TEI data)

Sree Thottempudi
https://www.sadilar.org/index.php/en/9-about/250-postdoc

The tutorial “Digitalization and Visualization (A practical approach with TEI data)” consists of a series of chapters that build on each other and introduce the coding, editing and visualization of texts according to the guidelines of the Text Encoding Initiative (TEI). The tutorial is designed for less-resourced languages and their digital representation, but can also be used for Semantic Web applications.

This tutorial consists of three elements: first, a set of slides for an introductory presentation that introduces the most important terms and elements of TEI; secondly, one or more exercise sheets that are used to practice what has been learned; and thirdly, parsing and visualization of TEI data. You can use your own language data for your practice.
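As a rough illustration of the third element (parsing and visualizing TEI data), the following Python sketch reads a tiny TEI document and prints a text-based bar chart of its most frequent words. It is only a minimal sketch and not the tutorial's own material: the sample document, the variable names and the choice of Python's standard library are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# TEI elements live in this XML namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical sample document, invented for illustration only.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample text</title></titleStmt>
      <publicationStmt><p>Unpublished example</p></publicationStmt>
      <sourceDesc><p>Invented for illustration</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <p>The river is wide.</p>
      <p>The river is deep and the water is cold.</p>
    </body>
  </text>
</TEI>"""

# Parse the XML and pull out the title and the body paragraphs.
root = ET.fromstring(SAMPLE_TEI)
title = root.find(".//tei:titleStmt/tei:title", TEI_NS).text
paragraphs = [
    "".join(p.itertext())
    for p in root.findall(".//tei:body/tei:p", TEI_NS)
]

# Count lower-cased words, stripping simple punctuation.
word_counts = Counter(
    word.strip(".,;:!?").lower()
    for paragraph in paragraphs
    for word in paragraph.split()
)

# A very simple "visualization": one '#' per occurrence.
print(title)
for word, count in word_counts.most_common(3):
    print(f"{word:<8}{'#' * count}")
```

To try this with your own language data, replace `SAMPLE_TEI` with the contents of your own TEI file, for example by reading it from disk.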

Target Group: Everyone
Requirements: Basic XML Knowledge

About the presenter
Sree did his PhD on the application of DH and data science to Telugu classical manuscripts. He also holds an MS in Information Technology and an MA in Linguistics and Literature, and is associated with SADILAR.

Text Mining for Newbies

Stefan Ziehe
https://www.gcdh.de/en/welcome/

The goal of this tutorial is to give a basic introduction to Text Mining (the automated extraction of relevant information from text data). The participants will be shown how to use Python libraries such as scikit-learn and NLTK in order to perform text mining tasks such as classification or clustering. The code from this tutorial can be used by the participants as a starting point for possible future projects.
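To give a feel for the kind of task described above, here is a minimal sketch of text classification and clustering with scikit-learn. It is not the tutorial's actual code: the tiny corpus, the labels, and the choice of TF-IDF features, a Naive Bayes classifier and k-means clustering are assumptions made purely for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy corpus with hand-assigned labels.
train_texts = [
    "the striker scored a late goal in the football match",
    "the team won the cup final after extra time",
    "the parliament passed the new budget law",
    "the minister announced an election date",
]
train_labels = ["sport", "sport", "politics", "politics"]

# Turn each text into a vector of TF-IDF weights (one column per word).
vectorizer = TfidfVectorizer(stop_words="english")
X_train = vectorizer.fit_transform(train_texts)

# Classification: learn from labelled texts, then label a new one.
classifier = MultinomialNB()
classifier.fit(X_train, train_labels)

new_texts = ["the goalkeeper saved a penalty in the match"]
X_new = vectorizer.transform(new_texts)
predicted = classifier.predict(X_new)
print("Predicted label:", predicted[0])

# Clustering: group the same texts without using the labels at all.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X_train)
print("Cluster assignments:", list(cluster_ids))
```

With a real corpus you would hold back part of the labelled data to check how well the classifier generalizes; four sentences are far too few to draw any conclusions.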

Target Group: Everyone
Requirements: Basic programming (Python) and very basic mathematics/statistics knowledge (linear algebra, i.e. vectors and matrices)

About the presenter
Stefan Ziehe is a researcher at the University of Göttingen, Germany. He has a B.Sc. and an M.Sc. in Applied Computer Science. His primary research interests are currently Natural Language Processing, Deep Learning and Multimodality.

Preparations
Install Jupyter Notebook (https://jupyter.org/install) with the following packages:

– NumPy (https://numpy.org/install/)
– scikit-learn (https://scikit-learn.org/stable/install.html)
– Matplotlib (https://matplotlib.org/stable/users/installing/index.html)
– NLTK (https://www.nltk.org/install.html)

Alternatively, sign up to Google Colab (https://colab.research.google.com/).

Expanding the influence of DH throughout the humanities – Introducing Gale Digital Scholar Lab

Chris Houghton
https://www.gale.com/intl/primary-sources/digital-scholar-lab

Released in 2018, Gale Digital Scholar Lab is Gale’s innovative cloud-hosted text and data mining platform. For the first time, major archive material has been integrated with powerful analytical tools to provide access to DH across the humanities department, in a way that is accessible to all, regardless of technical experience. Universities around the world are using Gale Digital Scholar Lab as the basis of digital humanities courses, and increasingly, to introduce DH concepts into the undergraduate classroom.

This tutorial will introduce Gale Digital Scholar Lab and illustrate its development journey to becoming the world’s most popular integrated TDM solution. Learn how it is being used in classrooms and how it is providing major benefits to researchers.

Target Group: Linguists, Library and Information Scientists, Humanities Scholars, Social Science Scholars
Requirements: None

About the presenter
Chris Houghton is Head of Digital Scholarship for Gale, and works to strengthen the relationship between Gale and the Global DH community. With responsibility for Gale Digital Scholar Lab outside of North America, Chris collaborates with scholars around the world to support research projects and with instructors to integrate the Lab into teaching, from undergraduate to PhD level.
Chris is passionate about working with developing DH communities and has led partnerships with researchers in China, Brazil, India and throughout Africa.

Digital Humanities Association of Southern Africa
