How do you construct your Knowledge Graph?

Tutorial at ECAI 2024 - 19th October 2024

More and more knowledge graphs (KGs) are constructed for private use, e.g., Google, or public use, e.g., DBpedia and Wikidata. These knowledge bases commonly support AI tasks such as entity recognition, question answering, and semantic data labeling. With the advent of Large Language Models (LLMs), KGs have been positioned as a trustworthy input source that helps LLMs answer any kind of user prompt. However, these KGs need to be constructed and maintained so that their knowledge stays up to date and available for effective LLM support. In this tutorial, you will learn best practices for declaratively constructing KGs from heterogeneous input datasets. The tutorial will motivate the need for good practices in knowledge management and demonstrate how experts currently implement them in real-world projects. It will be guided by a real use case from the transport domain at the EU level, which integrates several data sources and is used for route planning across all EU members. At the end of the tutorial, attendees will be able to set up a sustainable workflow that transforms data in any format into a well-formed and usable KG.


Program

Knowledge Graph Construction

Part I: Getting started with the Semantic Web stack

We will briefly introduce the main W3C-recommended technologies used for constructing and exploiting knowledge graphs, such as RDF, RDFS, OWL, R2RML, and SPARQL.

Part II: The RML specification and its possibilities

Declarative mappings are commonly used for specifying how a dataset can be transformed into an RDF knowledge graph according to the schema provided, usually in the form of an ontology. We present one of the most popular languages used to this end, RML, and the novel features of its most recent release by the W3C.
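To make the idea of a declarative mapping concrete, the following is a minimal Python sketch. It is not RML and does not use any RML engine: the mapping is a plain dictionary invented for illustration, and the CSV data, the `http://example.org/` IRIs, and the function names are all assumptions. It only shows the principle that the mapping states what to generate, while a generic procedure decides how.

```python
import csv
import io

# Hypothetical input: a tiny CSV of transport stops (illustrative data only).
CSV_DATA = """stop_id,stop_name
S1,Gare du Nord
S2,Atocha
"""

# Hypothetical declarative mapping: it describes WHAT to generate, not HOW.
# The structure loosely mirrors the subject / predicate-object idea, but it
# is NOT RML syntax.
MAPPING = {
    "subject": "http://example.org/stop/{stop_id}",
    "class": "http://example.org/ontology#Stop",
    "predicate_objects": [
        ("http://www.w3.org/2000/01/rdf-schema#label", "stop_name"),
    ],
}

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def escape_literal(value):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def apply_mapping(csv_text, mapping):
    """Return N-Triples lines produced by applying the mapping to each row."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Fill the subject template with values from the current row.
        subject = mapping["subject"].format(**row)
        triples.append(f"<{subject}> <{RDF_TYPE}> <{mapping['class']}> .")
        for predicate, column in mapping["predicate_objects"]:
            literal = escape_literal(row[column])
            triples.append(f'<{subject}> <{predicate}> "{literal}" .')
    return triples


if __name__ == "__main__":
    for line in apply_mapping(CSV_DATA, MAPPING):
        print(line)
```

Changing the ontology or the input columns in this sketch only requires editing `MAPPING`, not the procedure, which is the property that declarative mapping languages aim for.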

Part III: Creating declarative mapping rules

Different approaches for mapping creation will be presented. A guided hands-on session will follow, in which participants create rules using YARRRML, the user-friendly serialization for RML.

Part IV: KG Construction

Several engines and approaches will be presented with their advantages and disadvantages. We will show how a KG can be materialized or virtualized and which techniques work best depending on the use case. Finally, attendees will use an engine to materialize their dataset into RDF.


Presenters

Ana Iglesias-Molina

Universidad Politécnica de Madrid

ana.iglesias (at) upm.es

Ana Iglesias-Molina is a PhD student at the Ontology Engineering Group - UPM. Her research focuses on knowledge graph construction and management with declarative mapping languages, ontology engineering, and knowledge representation. She has co-organized and presented three tutorials on Knowledge Graph Construction (ISWC2020, ESWC2022, K-CAP2023), and since 2023 she has also been part of the organization of the Knowledge Graph Construction Workshop (ESWC2023, ESWC2024). In recent years, she has taught Data Science in a Business School Master's program in Madrid, and Semantic Web courses at Bachelor's and Master's levels at Universidad Politécnica de Madrid. She has also been part of the KG Construction Community Group since its foundation.

David Chaves-Fraga

Universidade de Santiago de Compostela

david.chaves (at) usc.es

Prof. David Chaves-Fraga is an Assistant Professor at the University of Santiago de Compostela. He obtained his PhD in Artificial Intelligence at the Ontology Engineering Group (Technical University of Madrid) in 2021. His research is mainly focused on data management techniques for knowledge graphs. He is the chair of the W3C Community Group on Knowledge Graph Construction and has organized three academic tutorials on these topics (ESWC2019, ISWC2020, and ESWC2022). Since 2019, he has also co-organized the International Workshop on Knowledge Graph Construction, co-located with the Extended Semantic Web Conference (2019-now). In recent years, he has taught Semantic Web, Knowledge Representation, and Data Management in several university courses at Bachelor's and Master's levels at UPM and KU Leuven. He is the main author of the Spanish national guideline for RDF-KG generation from open data.

Anastasia Dimou

KU Leuven

anastasia.dimou (at) kuleuven.be

Prof. Anastasia Dimou is a tenure-track assistant professor at the Declarative Languages and Artificial Intelligence (DTAI) group, KU Leuven. She received her PhD in 2017 on high-quality Knowledge Graph (KG) construction from heterogeneous data, for which she was awarded the SWSA distinguished dissertation award. Her research interests focus on declarative, (semi-)automated, and efficient KG construction from heterogeneous data. Anastasia is affiliated with Flanders Make, the Flemish research institute for the manufacturing industry, hence her research is often applied to real-world manufacturing use cases. Since 2013, Prof. Dimou has been involved in more than 20 research projects and has authored more than 100 publications. She has participated in the OC of ISWC (2021, 2022), ESWC (2021-2024), SEMANTICS (2022), the Semantic Publishing Challenge (2015, 2016), and the KG construction workshop (2019-now). She has given more than 10 tutorials on KG construction at conferences and companies, and she teaches a course on KGs. In 2023, she and her team won the SemTab challenge, which benchmarks systems for automated table annotation with KGs. She is the promotor of the Belgian Network on KGs for Data Integration (KG4DI, w3id.org/kg4di/), the chair of the W3C Community Group on KG Construction (W3C-KGC), which has more than 160 members from all over the world, and the science communication officer of the Distributed KGs COST Action (DKG, cost-dkg.eu/), which has more than 100 members from more than 30 countries in the EU.


Materials

All the necessary materials for the tutorial are available at:
Technologies

Sponsors