
ISOcat: An Overview of Typological Database Systems
This CLARIN-NL tutorial looks beyond ISOcat: how to make the semantics of linguistic resources explicit by associating data categories with them through PIDs, and where to put those PIDs (in a schema, in the resource itself, or in its metadata). It shows how to annotate XML resources, Relax NG schemas, and Extended Backus-Naur Form (EBNF) grammars for text documents, and introduces two companion registries: RELcat for relations between data categories and SCHEMAcat for resource schemata.
Presentation Transcript
Beyond ISOcat
www.isocat.org
CLARIN-NL ISOcat tutorial, 20 June 2013

Vision
[Diagram with three layers. Relation registries: RR, MPI RR. Data category registries: MPI DCR, ISO DCR. Linguistic resources: Typological Database System (TDS database), MPI archive, resource.]

How to make semantics explicit?
Associate data categories with your resources using the PIDs.
Where to put the PIDs?
- Preferably in a schema
- Or in the resource itself (redundant)
- Or in the metadata of the resource (less specific)

What is a schema?
"comes from the Greek word σχῆμα (skhēma), which means shape, or more generally, plan." (Wikipedia)
A collection of building blocks and rules on how to combine them into a valid resource:
- XML document: DTD, XML Schema, Relax NG. Easy; see http://www.isocat.org/12620/
- RDF graph: annotation property. Easy; see http://www.isocat.org/ns/dcr.rdf
- Text document: a grammar in Extended Backus-Naur Form (EBNF) ... how to embed Data Category PIDs?

XML resource

    <lmf:lexicon xml:lang="jp" alphabet="ipa">
      <lmf:entry>
        <lmf:lemma>
          <lmf:writtenForm>nihongo</lmf:writtenForm>
        </lmf:lemma>
      </lmf:entry>
    </lmf:lexicon>

XML resource (with a data category reference)

    <lmf:lexicon xml:lang="jp" alphabet="ipa">
      <lmf:entry>
        <lmf:lemma>
          <lmf:writtenForm dcr:datcat="http://www.isocat.org/datcat/...">
            nihongo
          </lmf:writtenForm>
        </lmf:lemma>
      </lmf:entry>
    </lmf:lexicon>

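To illustrate the "PIDs in the resource itself" option, here is a minimal Python sketch that reads an annotated lexicon and lists every element carrying a dcr:datcat attribute. The slides show only the prefixes, so both namespace URIs are assumptions (the lmf one is purely hypothetical), and DC-0000 is a placeholder for the PID the slide elides.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs: the slides only show the prefixes lmf: and dcr:.
LMF_NS = "http://example.org/lmf"        # hypothetical
DCR_NS = "http://www.isocat.org/ns/dcr"  # assumption

# DC-0000 is a placeholder PID; the slide elides the real one.
SAMPLE = """\
<lmf:lexicon xmlns:lmf="http://example.org/lmf"
             xmlns:dcr="http://www.isocat.org/ns/dcr"
             xml:lang="jp" alphabet="ipa">
  <lmf:entry>
    <lmf:lemma>
      <lmf:writtenForm dcr:datcat="http://www.isocat.org/datcat/DC-0000">nihongo</lmf:writtenForm>
    </lmf:lemma>
  </lmf:entry>
</lmf:lexicon>
"""


def datcat_annotations(xml_text):
    """Return (qualified tag, PID) pairs for every element with dcr:datcat."""
    root = ET.fromstring(xml_text)
    attr = "{%s}datcat" % DCR_NS
    return [
        (el.tag, el.get(attr))
        for el in root.iter()
        if el.get(attr) is not None
    ]
```

Calling `datcat_annotations(SAMPLE)` yields one pair, for the `writtenForm` element. Every instance of the element would have to repeat the annotation, which is why the slides call this placement redundant.
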
XML Relax NG schema

    <rng:attribute name="alphabet" dcr:datcat="http://www.isocat.org/datcat/...">
      <rng:value dcr:datcat="http://www.isocat.org/datcat/...">
        ipa
      </rng:value>
    </rng:attribute>

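When the PIDs live in the schema, a tool can look up the semantics of an attribute and of its allowed values once, for all resources that follow the schema. The sketch below does this for a Relax NG fragment like the one on the slide. The Relax NG namespace is the standard one; the dcr namespace URI is an assumption, and DC-0001 and DC-0002 are placeholder PIDs, since the slide elides the real ones.

```python
import xml.etree.ElementTree as ET

RNG_NS = "http://relaxng.org/ns/structure/1.0"  # Relax NG namespace
DCR_NS = "http://www.isocat.org/ns/dcr"         # assumption

# DC-0001 and DC-0002 are placeholder PIDs, not taken from the slides.
SCHEMA = """\
<rng:attribute xmlns:rng="http://relaxng.org/ns/structure/1.0"
               xmlns:dcr="http://www.isocat.org/ns/dcr"
               name="alphabet"
               dcr:datcat="http://www.isocat.org/datcat/DC-0001">
  <rng:value dcr:datcat="http://www.isocat.org/datcat/DC-0002">ipa</rng:value>
</rng:attribute>
"""


def attribute_semantics(schema_text):
    """Map each attribute name to its PID and the PIDs of its values."""
    root = ET.fromstring(schema_text)
    datcat = "{%s}datcat" % DCR_NS
    result = {}
    # iter() also visits the root element itself if it matches.
    for attr in root.iter("{%s}attribute" % RNG_NS):
        values = {
            (val.text or "").strip(): val.get(datcat)
            for val in attr.iter("{%s}value" % RNG_NS)
        }
        result[attr.get("name")] = {"pid": attr.get(datcat), "values": values}
    return result
```

`attribute_semantics(SCHEMA)["alphabet"]` then holds the PID of the attribute itself plus a mapping from the literal `ipa` to its own PID.
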
CGN/DCOI grammar with DC references
http://lux13.mpi.nl/schemacat/schema/CGN (early alpha version)

    (* @dcr:datcat 'N' http://www.isocat.org/datcat/DC-4909 *)
    ...
    tag = 'N', '(', NTYPE, ',', GETAL, ',', GRAAD, ',', GENUS, ',', NAAMVAL, ')'
    ...
    (* @dcr:datcat NTYPE http://www.isocat.org/datcat/DC-4908 *)
    (* @dcr:datcat 'soortnaam' http://www.isocat.org/datcat/DC-4910 *)
    (* @dcr:datcat 'eigennaam' http://www.isocat.org/datcat/DC-4911 *)
    NTYPE = 'soortnaam' | 'eigennaam' ;
    ...

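The annotation comments in the grammar above follow a regular pattern, so they can be pulled out with a single regular expression. The following Python sketch assumes every annotation has exactly the form `(* @dcr:datcat SYMBOL URL *)` seen on the slide; how SCHEMAcat itself parses these comments is not described in the slides.

```python
import re

# Excerpt of the annotated grammar shown on the slide.
GRAMMAR = """\
(* @dcr:datcat NTYPE http://www.isocat.org/datcat/DC-4908 *)
(* @dcr:datcat 'soortnaam' http://www.isocat.org/datcat/DC-4910 *)
(* @dcr:datcat 'eigennaam' http://www.isocat.org/datcat/DC-4911 *)
NTYPE = 'soortnaam' | 'eigennaam' ;
"""

# Assumed comment shape: (* @dcr:datcat SYMBOL URL *)
ANNOTATION = re.compile(r"\(\*\s*@dcr:datcat\s+(\S+)\s+(\S+)\s*\*\)")


def datcat_comments(grammar_text):
    """Map each annotated grammar symbol or literal to its data category PID."""
    return {symbol: pid for symbol, pid in ANNOTATION.findall(grammar_text)}
```

Terminals keep their quotes as part of the key (for example `'soortnaam'`), so a rule name and a literal with the same spelling cannot collide.
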
Multiple DCRs?
Actually, we don't need multiple DCRs to have overlapping subsets. Overlaps arise because:
- Data categories are typed, and might not have the type you need
  - The POS field (closed DC) of the lexical entry "walk" gets the value verb (simple DC): PoS = verb
  - The Verb (open DC) feature of a feature structure gets the value "walk": Verb = walk
- External sets are imported just as they are: NKJP, GOLD, STTS, ...
  - Only some take the effort to also provide mappings
- There might be very fine differences between your data category and an existing one, and the owner doesn't want to adapt
Still, we would like to know that these data categories are the same or almost the same!

Relation Registry - RELcat
http://lux13.mpi.nl/relcat/ (alpha version)
Stores user-specific sets of relations. Labels from the example diagram, in slide order:
    language ID      isocat:DC-2482    dc:language
    language name    isocat:DC-2484    relcat:subClassOf
    time coverage    isocat:DC-1502    dc:coverage

Relation types
There already exist large collections of relations with their own vocabularies, e.g., OWL (2), SKOS, ...
RELcat has a basic relation type hierarchy:
    rel:related
        rel:sameAs
        rel:almostSameAs
        rel:broaderThan
            rel:superClassOf
            rel:hasPart
        rel:narrowerThan
            rel:subClassOf
            rel:partOf
which can be extended for other vocabularies:
    rel:sameAs
        owl:sameAs
        skos:exactMatch
    rel:almostSameAs
        skos:closeMatch

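One way to use such a hierarchy is to ask whether a relation from an external vocabulary counts as a more general RELcat relation. The Python sketch below encodes the relation types as a child-to-parent table and walks upward. The nesting is an assumption read off the order of the names on the slide, and `is_a` is a hypothetical helper, not part of any RELcat API.

```python
# Child -> parent table. The nesting is an assumption based on the order
# in which the relation types appear on the slide.
PARENT = {
    "rel:sameAs": "rel:related",
    "rel:almostSameAs": "rel:related",
    "rel:broaderThan": "rel:related",
    "rel:narrowerThan": "rel:related",
    "rel:superClassOf": "rel:broaderThan",
    "rel:hasPart": "rel:broaderThan",
    "rel:subClassOf": "rel:narrowerThan",
    "rel:partOf": "rel:narrowerThan",
    # Extensions for other vocabularies
    "owl:sameAs": "rel:sameAs",
    "skos:exactMatch": "rel:sameAs",
    "skos:closeMatch": "rel:almostSameAs",
}


def is_a(relation, ancestor):
    """Return True if relation equals ancestor or descends from it."""
    current = relation
    while current is not None:
        if current == ancestor:
            return True
        current = PARENT.get(current)  # None once the root is passed
    return False
```

With this table, `is_a("skos:exactMatch", "rel:sameAs")` holds while `is_a("skos:closeMatch", "rel:sameAs")` does not, so a consumer that only trusts exact equivalences can filter relations accordingly.
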
RELcat usage
RELcat is still in an alpha phase:
- no user interface yet
- relations are uploaded via the system administrator (isocat@mpi.nl)
- however, there is a read-only API, which is in use by (experimental) parts of the CLARIN infrastructure, e.g., the CMDI semantic mapping component

Another new kitten: SCHEMAcat
- Resource schemata of any type should be stored somewhere persistently and get a PID
- These schemata are preferably annotated with data categories (SCHEMAcat, ISOcat)
- These data categories will then have (typed) relationships among each other (SCHEMAcat, RELcat)
Status: very early alpha, but some schemata are already available
- CGN: http://lux13.mpi.nl/schemacat/schema/CGN

A whole litter!
[Diagram. Labels: Linguistic resource (schema); Linguistic knowledge base; Data categories; Containers; Concepts; Relation. Registries: Schema Registry (SCHEMAcat), Data Category Registry (ISOcat), Concept Registry, Relation Registry (RELcat).]