Section outline

    Welcome to the TALOS – AI for Social Sciences and Humanities MOOC!

    This course introduces Artificial Intelligence (AI) to Humanists and Social Scientists, with no prior technical background required. Developed within the TALOS project, it focuses on building and applying FAIR datasets—Findable, Accessible, Interoperable, and Reusable—enriched with semantic technologies such as Knowledge Graphs, Ontologies, and Ontoterminologies.

    The MOOC also explores practical tools and methods for semantic data creation, while addressing ethical and educational implications. Key AI concepts, including Large Language Models (LLMs) and Natural Language Processing (NLP), are introduced in context.

    The course is structured into:

    1. Introduction Module
    2. Knowledge Graph Module
    3. Software Environments Module
  • In this introductory module, you will learn what Artificial Intelligence is and how it compares to human intelligence. You will explore the main AI approaches, including symbolic AI and neural networks, and understand the historical evolution of AI up to modern technologies like large language models. You will discover why AI literacy is essential for Humanists and Social Scientists, examine educational and social issues related to AI, and get an introduction to computational linguistics.

    • 1a. What is AI?

This unit introduces Artificial Intelligence (AI) without requiring any prior expertise in computer science. It begins by exploring the concept of intelligence, highlighting key abilities such as reasoning, learning, adapting, and communicating, and comparing human and artificial forms of intelligence. The presentation outlines the two main traditional approaches to AI: Symbolic AI, based on explicit knowledge representation and logical reasoning, and Connectionist AI, inspired by neural networks and capable of learning from large datasets. The historical evolution of AI is discussed, from its symbolic origins to the modern successes of deep learning and generative models such as large language models (LLMs). AI is now widely used across various fields, including healthcare, finance, robotics, and smart cities, profoundly impacting our daily lives. A hybrid approach, combining both symbolic and connectionist paradigms, is presented as a promising direction for building more robust and explainable AI systems. Finally, ethical considerations are emphasized, including algorithmic bias, data privacy, and the importance of regulatory frameworks like the EU’s AI Act to ensure safe and transparent AI deployment.

    • 1b. The need for AI literacy for Humanists and Social Scientists

      The second unit of the Introductory section of this MOOC focuses on the need for AI literacy for Humanists and Social Scientists. It explores what Artificial Intelligence can offer to the Social Sciences and Humanities (SSH), highlighting new pedagogical possibilities and the transformative potential of AI in research and education. The unit also addresses why AI literacy is essential for SSH scholars and discusses the evolving roles of humanists and social scientists in an increasingly AI-driven world.

    • 1c. Educational and Social Issues

      This unit discusses the social, educational, and ethical issues raised by AI. It examines the potential risks posed by AI technologies, and presents a vision for their ethical, human-centered, and critically informed use, in alignment with EU regulations and the guidelines of other global organizations. Finally, it highlights the critical role that the Social Sciences and Humanities (SSH) can play in shaping ethical AI tools and guiding their responsible application.

    • 1d. Computational Linguistics

      This unit introduces the field of Computational Linguistics (CL), exploring how computers process and understand human language. It covers key applications such as machine translation, sentiment analysis, named entity recognition, and authorship attribution. Learners gain insights into the use of corpora (large text collections) for training AI models and conducting linguistic research. Emphasis is placed on the role of Large Language Models (LLMs) in the digital humanities and social sciences, highlighting both their capabilities and ethical considerations. The unit provides a foundational understanding of how CL and corpus analysis empower modern AI and language research.
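
      As a first taste of corpus work, the short Python sketch below counts the most frequent word forms in one of NLTK’s bundled sample texts. It is a minimal illustration only; the corpus file comes from NLTK’s Gutenberg sample collection, and nltk itself is assumed to be installed (as it is in Google Colab, used later in unit 3e).

        # Minimal corpus-frequency sketch with NLTK (assumes nltk is installed).
        import nltk
        nltk.download("gutenberg")                    # fetch the sample corpus once
        from nltk.corpus import gutenberg
        from nltk import FreqDist

        # Lowercase alphabetic tokens from a sample corpus file.
        words = [w.lower() for w in gutenberg.words("austen-emma.txt") if w.isalpha()]
        print(FreqDist(words).most_common(10))        # ten most frequent word forms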

    • 1e. Quiz on Module 1
  • In this module, you will learn how Knowledge Graphs are used in the Social Sciences and Humanities to structure and link information. You will explore the standards and formats that make data interoperable and reusable, with a focus on W3C recommendations. You will also gain a clear understanding of what an ontology is, how it supports knowledge representation, and how it is applied in real-world contexts such as medicine, smart cities, and digital humanities. Finally, you will be introduced to key ontology representation languages like RDF and OWL.

    • 2a. KG for SSH

      This unit begins by introducing the concept of a graph — a structure made of nodes and links used to represent relationships between entities. Building on this foundation, it explores how graphs evolve into Knowledge Graphs (KGs) when the nodes and links carry semantic meaning. Led by Dr. Maria Papadopoulou, Assistant Professor in Digital Humanities & Classics, the unit guides learners through how KGs represent real-world knowledge in a machine-readable form. By using subject–predicate–object triples, KGs encode facts and reveal complex relationships between people, places, events, and ideas. The unit showcases how these models are built, how Uniform Resource Identifiers (URIs) give unique meaning to concepts, and how KGs support reasoning, search, and data integration. Real examples—from the Peloponnesian War to museum datasets—demonstrate their power in the SSH context. Through this approach, the unit equips learners to model, query, and link information in ways traditional databases cannot.
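
      To make the triple model concrete, here is a minimal sketch in Python using the rdflib library. The namespace and the resource and property names (foughtBetween, describedBy) are hypothetical, invented for illustration rather than taken from the course materials.

        # A tiny knowledge graph built from subject-predicate-object triples.
        from rdflib import Graph, Namespace

        EX = Namespace("http://example.org/ssh/")     # hypothetical namespace
        g = Graph()
        g.bind("ex", EX)

        # Each fact is one triple: subject, predicate, object.
        g.add((EX.Peloponnesian_War, EX.foughtBetween, EX.Athens))
        g.add((EX.Peloponnesian_War, EX.foughtBetween, EX.Sparta))
        g.add((EX.Peloponnesian_War, EX.describedBy, EX.Thucydides))
        g.add((EX.Thucydides, EX.bornIn, EX.Athens))

        print(g.serialize(format="turtle"))           # human-readable Turtle output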

    • 2a. KG for SSH (Introduction)

    • 2a. KG for SSH (Part I)

    • 2a. KG for SSH (Part II)

    • 2b. What is Ontology?

      This unit provides a structured introduction to ontology as understood in Knowledge Engineering. It is organized into six main sections. The first part explores the origins of ontology, tracing its roots from philosophy to its contemporary use in computer science. The second section provides definitions of ontology, particularly in the context of information systems and knowledge sharing, emphasizing the formal specification of conceptualizations that can be interpreted by machines.
      Next, three concrete examples—drawn from medicine, smart city systems, and digital humanities—illustrate how ontologies are used in practice to organize knowledge, enhance interoperability, and manage complex data structures. The fourth section examines theories of concept, which underpin various approaches to conceptual modeling. It distinguishes between essential and descriptive characteristics, and explains the difference between concepts and classes as units of knowledge organization.
      The fifth section surveys different ontology representation languages, including graphical, AI-based, logical, and W3C languages such as RDF and OWL. These languages vary in formality, expressiveness, and suitability depending on the application domain. The final section presents two ontology-building environments—Protégé and Tedi—highlighting their respective modeling paradigms and practical uses in constructing ontologies based on either class-based or concept-based approaches.
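
      To give a feel for what a class-based ontology looks like at the level of RDF and OWL statements, the sketch below declares a two-class hierarchy with Python’s rdflib. The class names are hypothetical, and the example is deliberately far simpler than a real domain ontology.

        # Declaring OWL classes and a subclass relation as RDF triples.
        from rdflib import Graph, Namespace, Literal
        from rdflib.namespace import RDF, RDFS, OWL

        EX = Namespace("http://example.org/onto/")    # hypothetical namespace
        g = Graph()
        g.bind("ex", EX)

        g.add((EX.Artefact, RDF.type, OWL.Class))
        g.add((EX.Vase, RDF.type, OWL.Class))
        g.add((EX.Vase, RDFS.subClassOf, EX.Artefact))            # every vase is an artefact
        g.add((EX.Vase, RDFS.label, Literal("vase", lang="en")))  # linguistic label

        print(g.serialize(format="turtle"))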

    • 2c. Quiz on Module 2
  • In this module, you will be introduced to key software tools used to create and explore semantic data in the humanities and social sciences. Through guided demonstrations, you will see how Protégé and Tedi are used to build ontologies and ontoterminologies, how SPARQL enables querying of RDF data, and how LEAF-Writer supports semantic text encoding using TEI standards. You will also get an overview of how NLP notebooks can assist in basic language processing tasks. This module gives you a practical understanding of the tools behind semantic technologies in SSH.
    • 3a. Protégé

      This unit introduces Protégé, a free and open-source ontology editor developed at Stanford University, widely used for building OWL ontologies. It explains the foundational elements of ontologies—individuals, properties, and classes—emphasizing that meaning arises from the relationships between objects. Ontologies are structured as class hierarchies supported by logical constraints and property restrictions. An example is given through the Krater Ontology, which models various types of Ancient Greek vases and their characteristics. The process of ontology building in Protégé involves defining class hierarchies, annotating terms, and populating the ontology with individuals. The unit raises two key open questions about defining essential characteristics and integrating the linguistic dimension of ontology. Protégé supports reasoning tools to ensure logical consistency and is compliant with W3C standards, making it a powerful environment for knowledge representation, particularly in the humanities and social sciences.
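
      The same pattern of class hierarchies populated with individuals can also be sketched programmatically, for example with the Python library owlready2, which reads and writes the OWL files Protégé uses. The vase classes and the individual below are illustrative guesses, not the actual Krater Ontology.

        # A minimal OWL ontology with a class hierarchy and one individual.
        from owlready2 import get_ontology, Thing

        onto = get_ontology("http://example.org/krater.owl")   # hypothetical IRI

        with onto:
            class Vase(Thing): pass            # top-level class
            class Krater(Vase): pass           # every Krater is a Vase
            class VoluteKrater(Krater): pass   # a more specific subclass

        derveni = VoluteKrater("derveni_krater")        # an individual (instance)
        print(list(onto.classes()), derveni)
        onto.save(file="krater.owl", format="rdfxml")   # file openable in Protégé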

    • 3b. TEDI

      This unit introduces TEDI, software freely distributed by the University of Crete for academic and research use. TEDI is designed for building multilingual ontoterminologies—terminologies whose conceptual system is a formal ontology. The unit begins with a recap of ontoterminology, highlighting its role in representing and standardizing domain knowledge by combining ontology and terminology. TEDI supports both a conceptual dimension, where users define concepts, essential characteristics, relations, and instances based on Aristotelian principles, and a linguistic dimension, where terms and proper names are assigned independently across languages but linked through a shared ontology. The unit demonstrates TEDI’s editors and shows how to structure and populate an ontoterminology. Export options include HTML (for web-based dictionaries), RDF (for tools like Protégé), TBX (ISO-standard for terminological data), and CSV (e.g., for CmapTools), making TEDI a versatile and standards-compliant tool for humanities research and education.
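
      Because TEDI can export RDF, its output can be processed with standard Semantic Web tooling. The sketch below assumes a hypothetical export file named ontoterminology.rdf and simply loads and inspects it with Python’s rdflib.

        # Loading a (hypothetical) TEDI RDF export for further processing.
        from rdflib import Graph

        g = Graph()
        g.parse("ontoterminology.rdf")      # rdflib infers the RDF serialization
        print(len(g), "triples loaded")
        for s, p, o in list(g)[:5]:         # peek at the first few triples
            print(s, p, o)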


    • 3c. SPARQL

      This unit introduces SPARQL, the standard query language for RDF data. It begins with a recap of RDF and its structure as subject–predicate–object triples in directed, labeled graphs. It then defines SPARQL and explains its importance for querying RDF datasets, much as SQL serves relational databases, before outlining the structure of SPARQL queries—PREFIX declarations, query forms like SELECT and ASK, graph patterns, and modifiers. Examples show how to query linked data through endpoints like DBpedia, including retrieving labels and depictions and verifying graph patterns. The unit concludes by emphasizing SPARQL’s role in semantic search, large-scale data exploration, and linking diverse data sources.
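
      The sketch below shows the query shape described above (a PREFIX declaration, a SELECT form, and a graph pattern) sent to the public DBpedia endpoint from Python via the SPARQLWrapper library. The endpoint and resource IRI are real DBpedia ones, chosen here purely for illustration; availability of the public endpoint is not guaranteed.

        # Querying DBpedia with SPARQL from Python (SPARQLWrapper).
        from SPARQLWrapper import SPARQLWrapper, JSON

        sparql = SPARQLWrapper("https://dbpedia.org/sparql")
        sparql.setReturnFormat(JSON)
        sparql.setQuery("""
            PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
            SELECT ?label WHERE {
                <http://dbpedia.org/resource/Peloponnesian_War> rdfs:label ?label .
                FILTER (lang(?label) = "en")
            }
        """)

        results = sparql.query().convert()
        for binding in results["results"]["bindings"]:
            print(binding["label"]["value"])    # the English label of the resource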

    • 3d. LEAF-Writer

      LEAF-Writer is a free, web-based text encoding tool that requires no installation or configuration and supports collaborative editing. The unit explains text encoding as the process of making human-readable text machine-readable through markup, covering structural, presentational, and semantic markup. XML (eXtensible Markup Language) is introduced as a flexible language for structuring and labeling data without predefined tags, balancing user freedom with interoperability challenges. The Text Encoding Initiative (TEI) is presented as a humanistic XML standard designed for consistent text encoding in literary and linguistic contexts, such as manuscripts, historical archives, and critical editions. LEAF-Writer supports the TEI schema, offers on-the-fly validation and entity tagging, and allows users to encode texts easily via a web platform. Export options include XML, HTML, and Markdown formats, facilitating integration and reuse. The unit closes with links for further learning and access to the LEAF-Writer platform.
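
      To show what semantic markup makes possible once a text is encoded, the sketch below parses a tiny hand-written TEI fragment with Python’s standard library and extracts the tagged person and place names. The fragment is invented for illustration; only the TEI namespace is the real one.

        # Extracting semantically tagged names from a minimal TEI/XML fragment.
        import xml.etree.ElementTree as ET

        tei_fragment = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
          <text><body>
            <p><persName ref="#thuc">Thucydides</persName> chronicled the war
               between <placeName>Athens</placeName> and
               <placeName>Sparta</placeName>.</p>
          </body></text>
        </TEI>"""

        ns = {"tei": "http://www.tei-c.org/ns/1.0"}   # the TEI namespace
        root = ET.fromstring(tei_fragment)
        for el in root.iterfind(".//tei:persName", ns):
            print("person:", el.text, el.get("ref"))
        for el in root.iterfind(".//tei:placeName", ns):
            print("place:", el.text)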


    • 3e. A notebook for NLP

      Designed for newcomers, this hands-on unit offers a gentle introduction to Natural Language Processing (NLP) using Python in Google Colab. Participants learn to read and analyze text files, calculate basic text metrics like word and character counts, and perform simple preprocessing tasks such as converting text to lowercase. The unit emphasizes practical coding skills and foundational concepts, setting the stage for deeper exploration of NLP techniques like lemmatization, part-of-speech tagging, and sentiment analysis using popular libraries such as NLTK. It is an ideal starting point for anyone looking to unlock the potential of text data.
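
      The first steps described above fit in a few lines of Python that run as-is in Google Colab. The file name sample.txt is a placeholder for whichever text file you upload to the notebook.

        # Basic text metrics and preprocessing (file name is a placeholder).
        with open("sample.txt", encoding="utf-8") as f:
            text = f.read()

        print("characters:", len(text))     # character count
        words = text.split()                # naive whitespace tokenization
        print("words:", len(words))

        text_lower = text.lower()           # simple preprocessing step
        print(text_lower[:200])             # preview the normalized text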

    • 3f. Quiz on Module 3