Mapping & Transforming Data for Semantic Integration: The X3ML Toolkit
This content discusses the importance of handling metadata from diverse institutions like galleries, libraries, and museums in a unified manner for research advancement. It highlights the necessity of data aggregation and integration to create valuable resources for various purposes, stressing the role of tools for modeling, cleaning, and transforming data while preserving and enhancing semantics.
Presentation Transcript
Mapping & Transforming Data for Semantic Integration
The X3ML Toolkit
Syntactic and Semantic Conversion of Metadata
RDA 17th Plenary Meeting, Edinburgh (Virtual), 20-23 April 2021
Maria Theodoridou
Foundation for Research & Technology Hellas (FORTH), Institute of Computer Science (ICS)
maria@ics.forth.gr
www.ics.forth.gr/isl/cci
Outline
- Motivation - Goals
- Requirements
- Data Transformation Workflow
- X3ML toolkit
  - X3ML Mapping Definition Language
  - 3M Editor
  - X3ML Engine
- Exploitation
- Pros & Cons
Motivation - Goals
Institutions such as galleries, libraries, archives and museums curate different types of collections. Even between similar types of institutions, these collections are documented in different ways and in different languages, are influenced by different disciplines, objectives and geography, and are encoded using different metadata schemas.
Handling such metadata as a unified whole is vital for advancing new fields of research and discovery, and for enabling better-informed information retrieval and (meta)data exchange.
A unified source is the result of data aggregation and integration, which has the potential to create rich resources useful for a range of purposes, from research and data modelling to education and engagement.
This is accomplished by combining several tools for modelling, cleaning, normalizing and transforming data.
Requirements
To facilitate data transformations while preserving or even enhancing the semantics of data.
In terms of conceptualization:
- Definition of guidelines and best practices
- Reliance on standards
- Specifications for supporting schema mappings
In terms of technology:
- Software for assisting users in describing their transformations
- Software for supporting data transformations, with emphasis on:
  - Configurability
  - Extensibility
  - Scaling
  - Automation
  - Ease of use
Data Transformation Workflow
[Diagram] Labels: Aggregator; Semantic Integration & Interoperability; CIDOC CRM & family of models; Transformation - X3ML engine; Data Transformation Tools; Terminology Mapping; URI generation specification - LOD; Schema Matching; Data Providers; Heterogeneous Data Collections.
Example
[Diagram] Labels: Aggregator; Semantic Integration & Interoperability; Transformation - X3ML engine; Data Transformation Tools; Terminology Mapping; URI generation specification - LOD; Schema Matching; Data Providers (Places, Objects, Documents, Photos, Persons); Heterogeneous Data Collections.
X3ML Toolkit
A set of small, open-source microservices with open interfaces, easily customized and adapted to complex environments, that assist the data provisioning process for information integration using X3ML, a mapping definition language:
- X3ML mapping definition language: an XML-based declarative language that describes schema mappings in such a way that they can be collaboratively created and discussed by experts.
- 3M, the Mapping Memory Manager: a tool for managing mapping definitions.
- 3M Editor: a web application suite that assists users during the mapping definition process, using a human-friendly user interface and a set of sub-components that either suggest or validate the user input.
- X3ML Engine: a tool that transforms data resources to a target format according to an X3ML mapping definition.
X3ML Mapping Definition Language
Specification: X3ML is a declarative, XML-based language that describes schema mappings in such a way that they can be collaboratively created and discussed by experts.
Key features:
- Provides a declarative way of describing schema mappings
- Focuses on properly mapping schema resources
- Decoupled from the URI and value generation process
- Mappings are described using an XML serialization
https://github.com/isl/x3ml/blob/master/docs/x3ml-language.md
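The decoupling named in the key features can be illustrated with a toy example. The sketch below is not X3ML syntax and does not use the X3ML Engine: the source XML, the dict layout of `MAPPING`, the `make_uri` policy and the `http://example.org/` base are all assumptions invented for illustration, and the CIDOC CRM identifiers follow the usual naming pattern but their exact labels vary between CRM versions. It only shows the idea that the schema mapping is plain declarative data, while URI generation is supplied separately.

```python
import xml.etree.ElementTree as ET

# Hypothetical source record; element and attribute names are invented.
SOURCE_XML = """
<collection>
  <object id="42"><title>Amphora</title></object>
  <object id="43"><title>Kylix</title></object>
</collection>
"""

# Declarative mapping: pure data, no URI logic.
# This dict layout is an illustration only, NOT the X3ML format.
MAPPING = {
    "domain": {"source": "object", "target_type": "crm:E22_Human-Made_Object"},
    "links": [
        {"source": "title",
         "property": "crm:P102_has_title",
         "target_type": "crm:E35_Title"},
    ],
}


def make_uri(target_type, key):
    """URI policy kept apart from the mapping (assumed base URI)."""
    local = target_type.split(":")[1]
    return f"http://example.org/{local}/{key}"


def transform(xml_text, mapping, uri_policy):
    """Apply the mapping to the source XML and return (s, p, o) tuples."""
    root = ET.fromstring(xml_text)
    triples = []
    dom = mapping["domain"]
    for node in root.iter(dom["source"]):
        subject = uri_policy(dom["target_type"], node.get("id"))
        triples.append((subject, "rdf:type", dom["target_type"]))
        for link in mapping["links"]:
            for child in node.findall(link["source"]):
                key = f'{node.get("id")}/{link["source"]}'
                obj = uri_policy(link["target_type"], key)
                triples.append((subject, link["property"], obj))
                triples.append((obj, "rdf:type", link["target_type"]))
                triples.append((obj, "rdfs:label", child.text))
    return triples


triples = transform(SOURCE_XML, MAPPING, make_uri)
for t in triples:
    print(t)
```

Because `transform` takes the URI policy as an argument, the same `MAPPING` can be reused with a different URI scheme without editing the mapping itself, which is the kind of separation the slide describes.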
3M Editor
- Enables the creation of mapping definitions (X3ML) between source and target schemata
- Supports guided mappings by analyzing source resources and target schemata
- Provides user space and mapping storage
- Transforms data (in RDF format) using the X3ML Engine
http://www.ics.forth.gr/isl/3M
3M Editor
- Implemented using modern and responsive technologies
- Faster and lightweight (on the client side)
- Allows concurrent edits of mappings by different users (à la Google Docs)
- Beta version to be announced early 2020
X3ML Engine
Realizes the transformation of data resources to a target format with respect to an X3ML mapping definition.
Main principles:
- Simplicity by design
- Transparency in terms of expected output
- Re-use of standards and technologies as much as possible
- Facilitating the instance matching process
Available as: API, executable (console-based & GUI), service
History:
- Designed by FORTH.
- Initial development by DELVING B.V. with the support and contribution of FORTH (until v.1.3).
- FORTH has been responsible for the full development since 3/2015.
- 24 releases (latest: v.1.9.4, 8/2020)
[Diagram] Labels: X3ML; Input; X3ML Engine; Ontology-based descriptions; Generator Policy; Terminology.
https://github.com/isl/x3ml
Exploitation / Assets Matrix
Columns (assets): X3ML, 3M Editor, X3ML Engine
Rows (projects and institutions): FP7 ARIADNE, H2020 BlueBRIDGE, H2020 BlueCloud, H2020 VRE4EIC, H2020 PARTHENOS, H2020 SSHOC, H2020 ARIADNEplus, H2020 SeaLiT, British Museum
X3ML Toolkit pros & cons
Pros:
- Simple model for defining mappings
- Supports incremental changes of source & target schemata
- Supports customized URI generation policies
- Decouples schema mapping from URI specification
- Shallow learning curve
- Scalability (good for small and large datasets, memory issues with huge datasets - big data)
- Easily deployed in different environments
- Promotes the collaborative work of experts
Cons:
- Currently only XML to RDF
- URI specification needs technical skills
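The note on memory issues with huge datasets is a general property of loading a whole XML document at once. One common mitigation, shown here only as a generic sketch and not as something the slides say the X3ML Engine does, is to stream the input and handle one record at a time. The `iter_records` helper and the sample data are assumptions for illustration.

```python
import io
import xml.etree.ElementTree as ET


def iter_records(stream, tag):
    """Yield each element named `tag` as soon as it has been fully parsed."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == tag:
            yield elem
            # Once the consumer is done with the record, drop its children,
            # text and attributes so they can be garbage-collected.
            elem.clear()


# Small in-memory stand-in for a large file (invented sample data).
data = io.BytesIO(
    b"<collection><object id='1'/><object id='2'/><object id='3'/></collection>"
)
ids = [rec.get("id") for rec in iter_records(data, "object")]
print(ids)
```

With this pattern each record can be transformed and written out before the next one is read, so memory use depends mainly on the size of a single record rather than the whole file.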
Useful links
The X3ML Toolkit: https://www.ics.forth.gr/isl/x3ml-toolkit
The source code is open source and available on GitHub:
- 3M - Mapping Memory Manager: https://github.com/isl/Mapping-Memory-Manager
- 3M Editor: https://github.com/isl/3MEditor
- X3ML Engine: https://github.com/isl/x3ml
Free-to-use deployment of 3M: https://isl.ics.forth.gr/3M/
Thank you for your attention! Maria Theodoridou Foundation for Research and Technology Hellas (FORTH) maria@ics.forth.gr