Section: 3. Software Environment Module | AI for Social Sciences and Humanities

In this module, you will be introduced to key software tools used to create and explore semantic data in the humanities and social sciences. Through guided demonstrations, you will see how Protégé and TEDI are used to build ontologies and ontoterminologies, how SPARQL enables querying of RDF data, and how LEAF-Writer supports semantic text encoding using TEI standards. You will also get an overview of how NLP notebooks can assist in basic language processing tasks. This module gives you a practical understanding of the tools behind semantic technologies in the social sciences and humanities (SSH).

3a. Protégé
This MOOC introduces Protégé, a free and open-source ontology editor developed at Stanford University, widely used for building OWL ontologies. It explains the foundational elements of ontologies—individuals, properties, and classes—emphasizing that meaning arises from the relationships between objects. Ontologies are structured as class hierarchies supported by logical constraints and property restrictions. An example is given through the Krater Ontology, which models various types of Ancient Greek vases and their characteristics. The process of ontology building in Protégé involves defining class hierarchies, annotating terms, and populating the ontology with individuals. The presentation raises two key open questions about defining essential characteristics and integrating the linguistic dimension of ontology. Protégé supports reasoning tools to ensure logical consistency and is compliant with W3C standards, making it a powerful tool for knowledge representation, particularly in the humanities and social sciences.
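The three building blocks named above (classes, individuals, properties) and the idea of a class hierarchy can be sketched in a few lines of plain Python. This is only an illustration of the concepts, not Protégé's data model or an OWL reasoner, and the class, individual, and property names are hypothetical rather than taken from the actual Krater Ontology.

```python
# Illustrative only: names below are hypothetical, not from the Krater Ontology.
# Class hierarchy: each class points to its direct superclass.
SUBCLASS_OF = {
    "Krater": "Vase",
    "VoluteKrater": "Krater",
    "CalyxKrater": "Krater",
}

# Individuals are typed by a class; properties relate individuals to values.
INDIVIDUALS = {"krater_001": "VoluteKrater"}
PROPERTIES = [("krater_001", "hasHandleShape", "volute")]


def superclasses(cls):
    """Return all ancestors of cls by following subclass links upward."""
    chain = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def is_instance_of(individual, cls):
    """True if the individual's asserted class is cls or a subclass of cls."""
    asserted = INDIVIDUALS[individual]
    return asserted == cls or cls in superclasses(asserted)


print(superclasses("VoluteKrater"))          # ['Krater', 'Vase']
print(is_instance_of("krater_001", "Vase"))  # True
```

The second call shows the kind of inference a reasoner automates: the individual is only asserted to be a `VoluteKrater`, yet it is also a `Vase` because of the hierarchy. Real OWL reasoners additionally handle property restrictions and consistency checks, which this sketch leaves out.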
Protégé (mp4-en) File
Protégé (mp4-fr) File
Protégé (ppt in pdf format) File
Protégé (ppt with script in pdf format) File

3b. TEDI
This MOOC session introduces TEDI, software freely distributed by Université Savoie Mont Blanc for academic and research use. TEDI is designed for building multilingual ontoterminologies, that is, terminologies whose conceptual system is a formal ontology. The session begins with a recap of ontoterminology, highlighting its role in representing and standardizing domain knowledge by combining ontology and terminology. TEDI supports both a conceptual dimension, where users define concepts, essential characteristics, relations, and instances based on Aristotelian principles, and a linguistic dimension, where terms and proper names are assigned independently in each language but linked through a shared ontology. The session demonstrates TEDI’s editors and shows how to structure and populate an ontoterminology. Export options include HTML (for web-based dictionaries), RDF (for tools like Protégé), TBX (the ISO standard for terminological data), and CSV (e.g., for CmapTools), making TEDI a versatile and standards-compliant tool for humanities research and education.
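The split between a conceptual dimension and a linguistic dimension can be sketched as a small data structure: one concept, defined by its essential characteristics, carries terms in several languages. The sketch below is an assumption-laden illustration in plain Python; the `Concept` class, the column names, and the sample entry are hypothetical and do not reproduce TEDI's internal model or its actual CSV export layout.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Concept:
    """Hypothetical concept record: conceptual part plus per-language terms."""
    name: str
    parent: Optional[str] = None
    essential_characteristics: list = field(default_factory=list)
    terms: dict = field(default_factory=dict)  # language code -> term


def to_csv(concepts):
    """Flatten concepts into CSV text, one row per (concept, language) pair."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["concept", "parent", "essential_characteristics",
                     "language", "term"])
    for concept in concepts:
        for lang, term in sorted(concept.terms.items()):
            writer.writerow([
                concept.name,
                concept.parent or "",
                "; ".join(concept.essential_characteristics),
                lang,
                term,
            ])
    return buffer.getvalue()


# Sample entry invented for illustration.
volute = Concept(
    name="VoluteKrater",
    parent="Krater",
    essential_characteristics=["with volute-shaped handles"],
    terms={"en": "volute krater", "fr": "cratère à volutes"},
)

print(to_csv([volute]))
```

The terms differ from language to language, but every row points back to the same concept name, which is the linking role the shared ontology plays.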
TEDI (mp4-en) File
TEDI (ppt in pdf format) File
TEDI (ppt with script in pdf format) File

3c. SPARQL
This MOOC session introduces SPARQL, the standard query language for RDF data. It begins with a recap of RDF and its structure as subject–predicate–object triples in directed, labeled graphs. It then defines SPARQL and explains its importance for querying RDF datasets, much as SQL is used for relational databases. Next, it outlines the structure of a SPARQL query: PREFIX declarations, query forms such as SELECT and ASK, graph patterns, and modifiers. Examples show how to query linked data through endpoints such as DBpedia, including retrieving labels and depictions and verifying patterns. The session concludes by emphasizing SPARQL’s role in semantic search, large-scale data exploration, and linking diverse data sources.
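The query structure described above (PREFIX declarations, a SELECT form, a graph pattern, and a modifier) can be sketched as a query string sent to an endpoint over HTTP. This is a minimal sketch using only the Python standard library, not the session's own example: the resource `dbr:Krater` is chosen purely for illustration, the helper names are hypothetical, and the public DBpedia endpoint address is assumed to be `https://dbpedia.org/sparql`.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumed address of the public DBpedia endpoint.
DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"

# PREFIX declarations, SELECT form, graph pattern with a FILTER, LIMIT modifier.
LABEL_QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbr: <http://dbpedia.org/resource/>

SELECT ?label WHERE {
  dbr:Krater rdfs:label ?label .
  FILTER (lang(?label) = "en")
}
LIMIT 5
"""


def build_query_url(endpoint, query):
    """Encode the query as the 'query' parameter of a SPARQL protocol GET request."""
    return endpoint + "?" + urlencode({"query": query})


def run_query(endpoint, query, timeout=30):
    """Send the query and parse a JSON result set (requires network access)."""
    request = Request(
        build_query_url(endpoint, query),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Only the URL is built here; call run_query(...) to actually contact the endpoint.
print(build_query_url(DBPEDIA_ENDPOINT, LABEL_QUERY))
```

Replacing `SELECT ?label WHERE` with `ASK` (and dropping `LIMIT`) turns the same graph pattern into a yes/no check on whether any match exists.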
SPARQL (mp4-en) File
SPARQL (ppt in pdf format) File
SPARQL (ppt with script in pdf format) File

3d. LEAF-Writer
LEAF-Writer is a free, web-based text encoding tool that requires no installation or configuration and supports collaborative editing. The session explains text encoding as the process of making human-readable text machine-readable through markup, covering structural, presentational, and semantic markup. XML (eXtensible Markup Language) is introduced as a flexible language for structuring and labeling data without predefined tags, which gives users freedom but makes interoperability harder. The Text Encoding Initiative (TEI) is presented as an XML-based standard from the humanities, designed for consistent encoding of texts in literary and linguistic contexts, such as manuscripts, historical archives, and critical editions. LEAF-Writer supports TEI schemas and offers on-the-fly validation and entity tagging, allowing users to encode texts easily via a web platform. Export options include XML, HTML, and Markdown formats, facilitating integration and reuse. The session closes with links for further learning and access to the LEAF-Writer platform.
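To make semantic markup concrete, the sketch below parses a small TEI-style fragment with Python's standard XML library and lists the tagged entities. It is not LEAF-Writer output: the sample sentence is invented for illustration, and the fragment omits the `teiHeader` that a schema-valid TEI document requires, so it is well-formed XML but would not pass TEI validation.

```python
import xml.etree.ElementTree as ET

# Namespace used by TEI P5 documents.
TEI_NS = "http://www.tei-c.org/ns/1.0"

# Invented sample; <persName> and <placeName> are TEI entity elements.
SAMPLE = f"""<TEI xmlns="{TEI_NS}">
  <text>
    <body>
      <p><persName>Euphronios</persName> worked in <placeName>Athens</placeName>.</p>
    </body>
  </text>
</TEI>"""


def extract_entities(xml_text, tags=("persName", "placeName")):
    """Return (tag, text) pairs for each requested TEI element, grouped by tag."""
    root = ET.fromstring(xml_text)  # raises ParseError if not well-formed
    entities = []
    for tag in tags:
        for element in root.iter(f"{{{TEI_NS}}}{tag}"):
            entities.append((tag, element.text))
    return entities


print(extract_entities(SAMPLE))
# [('persName', 'Euphronios'), ('placeName', 'Athens')]
```

Because the tags say what the text is (a person, a place) rather than how it looks, a program can retrieve every person or place mentioned without any language analysis.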
LEAF-Writer (mp4-en) File
LEAF-Writer (ppt in pdf format) File
LEAF-Writer (ppt with script in pdf format) File

3e. A notebook for NLP
Designed for newcomers, this hands-on course offers a gentle introduction to Natural Language Processing (NLP) using Python in Google Colab. Participants learn to read and analyze text files, calculate basic text metrics like word and character counts, and perform simple preprocessing tasks such as converting text to lowercase. The course emphasizes practical coding skills and foundational concepts, setting the stage for deeper exploration into NLP techniques like lemmatization, part-of-speech tagging, and sentiment analysis using popular libraries such as NLTK. It’s an ideal starting point for anyone looking to unlock the potential of text data.
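The basic steps mentioned above (lowercasing, counting characters and words) might look like the following in a notebook cell. This is a minimal sketch rather than the course notebook itself: the function name and sample sentence are hypothetical, and the simple regular expression only recognizes unaccented Latin letters, so a library such as NLTK would be the better choice for real tokenization.

```python
import re
from collections import Counter


def text_metrics(text):
    """Compute simple metrics after lowercasing the text."""
    lowered = text.lower()
    # Naive tokenization: runs of a-z and apostrophes count as words.
    words = re.findall(r"[a-z']+", lowered)
    return {
        "characters": len(text),
        "words": len(words),
        "frequencies": Counter(words),
    }


# In a notebook the text would typically come from a file, e.g.
# open("my_text.txt", encoding="utf-8").read()  (hypothetical file name).
sample = "The krater is a vase. The Krater was used for mixing wine."
metrics = text_metrics(sample)

print(metrics["words"])                  # 12
print(metrics["frequencies"]["krater"])  # 2
```

Lowercasing first is what lets "krater" and "Krater" be counted as the same word.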
COLAB NLP Notebook (mp4-en) File
NLP Colab Notebook (ppt in pdf format) File

3f. Quiz on Module 3
Quiz on Software Environment Module
