Discussion of Data Fabric Terms & Preparation for RDA P7 Virtual Meeting

Join the discussion on Data Fabric terms and preparation for the RDA P7 Virtual Meeting happening on January 25, 2016. Explore new terms, use cases, and plans surrounding vocabulary issues. Broaden the conversation on data management principles and policies to enhance digital data management practices.

  • Data Fabric
  • RDA
  • Virtual Meeting
  • Data Management
  • Vocabulary

Presentation Transcript


  1. Discussion of Data Fabric Terms & Preparation for RDA P7 Virtual Meeting Monday, January 25, 2016 Organized by Gary Berg-Cross (DFT-IG) and Peter Wittenburg (DF-IG)

  2. Agenda, Context and Recap
       1. Brief update by Gary Berg-Cross on DFT IG activities: (1) new terms, (2) use cases for vocabulary services, (3) context for the DF term discussion, and (4) P7 plans.
       2. Overview of vocabulary issues from the Data Fabric standpoint by Peter Wittenburg, who will provide some overview of terms and issues as part of the meeting.
       3. Discussion of how to handle vocabulary issues going forward. This meeting will be an opportunity to discuss some of the troublesome DF terms (and maybe other ones) in context, to see if we can develop some working draft definitions that can be firmed up over time.
       4. Vocabulary issues and plans from other RDA groups. Interested people can respond here with candidate terms or issues, and perhaps working definitions, as well as bring them up at the meeting as noted in the agenda.

  3. DFTIG Status and Plans
       1. Some terms about repository registries, for example, have been entered into the RDA DFT term tool, based on recent DF discussions and posts as well as RDA-WDS Data-Pub Workflows: http://smw-rda.esc.rzg.mpg.de/index.php/Special:AllPages
            • Collection Registry
            • Repository Registry
            • Data repository entry
            • Data review
            • Data journal
       2. In addition, we are working with the Vocabulary Services IG to use some of their tool-based services to improve our vocabularies:
            • Providing URLs for each term for referencing
            • Creating taxonomies from the definitions
            • Handling synonyms etc.

  4. Broadening the Discussion (Stepwise or Scope-wise)
       Data Management (and use) is broad, so we are building out from our start:
            • Digital Data Management, including unregistered data (is a broader concept)
            • Digital Object Management (registered, digital data)
       Where are datasets?

  5. Based on practical principles, policy defines when in a workflow a PID is created, as well as other curation activities. These definitions are linked.
       Integrate Concepts: Policy-based Digital Data Management Concept Graph (Reagan Moore)
       [Figure: concept graph. Node labels include Purpose, Collection, Digital Object, Attribute, Property, Policy, Procedure, Workflow, Function, Operation, Client Action, Enforcement Point, Persistent State Information and Periodic Assessment Criteria. Example nodes include Replication Policy, Checksum Policy, Quota Policy, Data Type Policy, DATA_ID, DATA_REPL_NUM, DATA_CHECKSUM, GetUserACL, SetDataType, SetQuota, DataObjRepl and SysChksumDataObj. Edge labels include Has, Isa, Defines, Updates, Controls, Chains, Invokes, SubType and HasFeature.]

  6. Based on DF discussions, we developed suggested concepts with candidate terminology. Examples:
       1. Data practice is the actual application/use of ideas & methods (as opposed to theories) about how data are collected, created, stored (maintained), curated, used, shared and released (disseminated).
       2. Data principles are rules that provide guidance across data management and use for such things as data acquisition, data lifecycle control, data policy & ownership, metadata practices, data quality etc.
       3. Common data solutions are agreed upon, easily available, tested & approved approaches to widely occurring problems in data management and use.
       4. Data discovery is a process of query and/or search to find (research) data of interest.
       5. Database cracking features incremental partial indexing and/or sorting of the data. It combines features of automatic index selection and partial indexes. It reorganizes data within the query operators, integrating the re-organization effort (occasionally invoking creation or removal of indexes on tables and views based on use) into query execution. It shifts the cost of index maintenance from updates to query processing.
       6. Adaptive indexing is characterized by the partial creation and refinement of preliminary or fixed DB indexes as side effects to support efficient query execution. (after http://www.vldb.org/pvldb/vol4/p586-idreos.pdf)

  7. Now we have a new, long list of terms to discuss
       • For example, "searchable": what makes something (data, publication etc.) searchable? Rich metadata, use of a standard vocabulary, use of a registry etc.
       • Some terms on our list have relevant RDA groups: Metadata (e.g. rich metadata etc.); Data publishing workflow (e.g. workflow); Domain repository; Repository Platforms for Research Data IG; Active Data Management Plans IG; BioSharing Registry: connecting data policies, standards & databases in life sciences WG; Practical Policy (follow on)? etc.
       • For some (general) terms we can leverage standards organizations & bodies (NIST, ISO etc.): system, architecture, actor, service, schema, protocols, layer, physical layer, re-usable.
       • Some we may have particular advocates for (Research Object, self-documentation etc.).
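
The "database cracking" definition in slide 6 can be made concrete with a small sketch. This is not taken from the slides or from any particular database system: it is a simplified, in-memory illustration for a single integer column, and the names `CrackedColumn`, `_crack` and `range_query` are hypothetical. For readability it partitions each piece with temporary lists rather than the in-place swaps a real engine would likely use. Each range query reorganizes only the piece of the column it touches and records the split points, so the index is built up as a side effect of querying.

```python
import bisect


class CrackedColumn:
    """Hypothetical illustration of database cracking on one integer column."""

    def __init__(self, values):
        self.data = list(values)  # cracker column: a copy that queries reorganize
        self.pivots = []          # sorted pivot values cracked so far
        self.bounds = []          # bounds[i]: first position with a value >= pivots[i]

    def _crack(self, pivot):
        """Partition the piece containing `pivot`; return the split position."""
        i = bisect.bisect_left(self.pivots, pivot)
        if i < len(self.pivots) and self.pivots[i] == pivot:
            return self.bounds[i]  # an earlier query already cracked here
        # The piece lies between the neighbouring split points (or the column ends).
        lo = self.bounds[i - 1] if i > 0 else 0
        hi = self.bounds[i] if i < len(self.bounds) else len(self.data)
        piece = self.data[lo:hi]
        smaller = [v for v in piece if v < pivot]
        larger = [v for v in piece if v >= pivot]
        self.data[lo:hi] = smaller + larger
        split = lo + len(smaller)
        # Remember the new split point so later queries can reuse it.
        self.pivots.insert(i, pivot)
        self.bounds.insert(i, split)
        return split

    def range_query(self, low, high):
        """Return all values v with low <= v < high, cracking as a side effect."""
        start = self._crack(low)
        end = self._crack(high)
        return self.data[start:end]


if __name__ == "__main__":
    column = CrackedColumn([13, 16, 4, 9, 2, 12, 7, 1, 19, 3, 14, 11, 8, 6])
    print(sorted(column.range_query(5, 10)))
    print(column.pivots, column.bounds)
```

The first query pays for partitioning the whole column; later queries only partition the pieces that overlap their bounds, which is the sense in which the cost of index maintenance moves into query processing.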
