Linked Data Techniques for Exposing Text Database Content

Learn about utilizing a combination of tools to expose text database records as Linked Data. Explore the process involving the SRW/U server, content negotiation, and URI patterns to enhance accessibility and search capabilities for database content.

  • Linked Data
  • Text Database
  • Content Exposition
  • Data Framework
  • Linked Data Techniques

Presentation Transcript


  1. OCLC Open Source Linked Data Framework
       • OCLC Research TAI CHI Webinar, 7/1/2010
       • Ralph LeVan, Sr. Research Scientist, OCLC Research

  2. Goal: Expose Text Database Content as Linked Data
       • Technique: Using a combination of the urlrewritefilter from tuckey.org, the content negotiation component from the Freie Universität Berlin's Pubby server and our Open Source SRW/U server, you can expose the records in your database as Linked Data

  3. Roadmap
       1. SRW/U Server
       2. URIs for records
       3. Real World Objects for records
       4. Multiple record formats
       5. RDF needs to be returned
       6. Content Negotiation

  4. SRW/U Server
       • SRW/U server sits in front of text databases
       • We have interfaces for DSpace, Lucene and Pears
       • Easy to write your own interface:
           • Convert CQL query to native query language
           • Do search and return a resultset object
           • Return records from the resultset
       • (The Lucene interface is a good simple example of how to build your own database interface)
       • I expose my SRW/U service as <context>/search, but you can put it wherever you want.
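
The three steps a database interface performs (translate the CQL query, run the search, return records from the result set) can be sketched as follows. This is a toy illustration, not the SRW/U server's actual Java interface: the names `parse_clause`, `to_native` and `ToyDatabase` are hypothetical, only the single clause shape `index exact "term"` is handled, and the Lucene-style `field:"phrase"` output is an assumed target syntax.

```python
import re

# Only one CQL clause shape is recognised here: index exact "term"
CLAUSE = re.compile(r'^\s*([\w.]+)\s+exact\s+"([^"]*)"\s*$')


def parse_clause(cql):
    """Split a simple CQL clause into (index, term)."""
    match = CLAUSE.match(cql)
    if match is None:
        raise ValueError("unsupported CQL: " + cql)
    return match.group(1), match.group(2)


def to_native(cql):
    """Step 1: convert the CQL query to a Lucene-style field:"phrase" query."""
    index, term = parse_clause(cql)
    return index + ':"' + term + '"'


class ToyDatabase:
    """Hypothetical in-memory stand-in for a text database interface."""

    def __init__(self, records):
        self.records = records  # list of dicts, one per record

    def search(self, cql):
        """Steps 2 and 3: run the search and return the matching records."""
        index, term = parse_clause(cql)
        return [record for record in self.records if record.get(index) == term]
```

For example, `to_native('viafID exact "123"')` yields a query string a Lucene-backed interface could hand to its query parser, while `ToyDatabase.search` plays the role of the result set.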

  5. URIs for records
       • urlrewritefilter implements Apache mod_rewrite patterns for Java servlets
       • It sees the URI and converts it to an SRU search:
           <from>^/([0-9][0-9]+)/$</from>
           <to>/search?query=local.viafID+exact+%22$1%22</to>
       • E.g. viaf/123 becomes viaf/search?query=viafID+exact+%22123%22
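
To make the rewrite rule concrete, here is a minimal Python sketch of what the `<from>`/`<to>` pair does to a path. It only imitates the rule with a regular expression; it is not urlrewritefilter itself, and the function name `rewrite_to_sru` is hypothetical. Paths are taken relative to the web application context (`viaf` in the slide's example).

```python
import re

# Same pattern as the slide's <from> element: two or more digits, trailing slash.
RECORD_PATTERN = re.compile(r"^/([0-9][0-9]+)/$")


def rewrite_to_sru(path):
    """Return the SRU search path for a record URI, or None if the rule does not apply."""
    match = RECORD_PATTERN.match(path)
    if match is None:
        return None
    # %22 is a percent-encoded double quote; "+" stands for a space in a query string.
    return "/search?query=local.viafID+exact+%22" + match.group(1) + "%22"
```

Note that the pattern as written requires at least two digits and a trailing slash, so `/123` and `/5/` are left for other rules to handle.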

  6. Aside: What to Return?
       • An SRU query returns a searchRetrieveResponse.
       • A smart client can pick its record out of that response, but that seems wrong
       • A bad URI will result in "no records found", but a 404 (record not found) is more appropriate
       • Solution: add a new parameter (service=APP) to signal that this was a request for a single record
       • E.g. viaf/123 becomes viaf/search?query=viafID+exact+%22123%22&service=APP
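
One way to read the `service=APP` idea is as a switch on the response status. The sketch below is an assumption about how a server might act on the parameter, not a description of the SRW/U implementation; the function name `single_record_status` is hypothetical.

```python
def single_record_status(params, record_count):
    """Choose an HTTP status for a search response.

    params: dict of query parameters from the rewritten URL.
    record_count: number of records the search found.
    """
    is_single_record_request = params.get("service") == "APP"
    if is_single_record_request and record_count == 0:
        # A record URI that matches nothing is a missing resource.
        return 404
    # Ordinary searches report zero hits inside a normal response.
    return 200
```

With this split, a plain SRU search that finds nothing still succeeds with an empty result, while a record URI that finds nothing is reported as not found.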

  7. Real World Objects for records
       • urlrewritefilter can generate 303 (see other) redirects based on URI patterns:
           <from>^/([0-9][0-9]+)$</from>
           <to type="seeother-redirect">/viaf/$1/</to>
       • E.g. viaf/123 redirects to viaf/123/
       • The target, viaf/123/, is called the Generic Record
       • (note: now we use viaf/123/ as the URI that gets turned into the SRU search)
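
The redirect rule can be imitated the same way. This sketch returns the status and `Location` value a client would see; it mirrors the slide's pattern but is not urlrewritefilter code, and `see_other` is a hypothetical name.

```python
import re

# Same pattern as the slide's <from> element: digits with no trailing slash.
REAL_WORLD_OBJECT = re.compile(r"^/([0-9][0-9]+)$")


def see_other(path):
    """Return (status, location) for a Real World Object URI, or None if no redirect applies."""
    match = REAL_WORLD_OBJECT.match(path)
    if match is None:
        return None
    # 303 See Other points the client at the Generic Record.
    return 303, "/viaf/" + match.group(1) + "/"
```

The URI without a trailing slash names the thing itself and only ever redirects; the URI with the slash is the one that is rewritten into a search.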

  8. Multiple record formats
       • urlrewritefilter plus the new httpAccept parameter in SRU
       • E.g., viaf/123/marc21.xml becomes viaf/search?query=viafID+exact+123&httpAccept=application/marc21+xml
       • SRU is configured with a list of supported media types and the XSL stylesheets that render them
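
A sketch of the suffix-to-`httpAccept` rewrite follows. The suffix table is hypothetical: only the `marc21.xml` pairing appears on this slide, and the `rdf.xml` pairing is inferred from later slides. The sketch also percent-encodes the `+` in the media type as `%2B`, since a literal `+` in a query string is normally decoded as a space; the slide writes it unencoded.

```python
import re
from urllib.parse import quote

# Hypothetical suffix table, not taken from a real configuration file.
SUFFIX_TO_MEDIA_TYPE = {
    "marc21.xml": "application/marc21+xml",
    "rdf.xml": "application/rdf+xml",
}

# Record number, then a format name such as marc21.xml.
FORMAT_PATTERN = re.compile(r"^/([0-9][0-9]+)/([A-Za-z0-9.]+)$")


def rewrite_format_uri(path):
    """Turn /123/marc21.xml into an SRU search carrying an httpAccept parameter."""
    match = FORMAT_PATTERN.match(path)
    if match is None:
        return None
    record_id, suffix = match.groups()
    media_type = SUFFIX_TO_MEDIA_TYPE.get(suffix)
    if media_type is None:
        return None
    # Keep "/" readable but encode "+" so it is not decoded as a space.
    return ("/search?query=viafID+exact+" + record_id
            + "&httpAccept=" + quote(media_type, safe="/"))
```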

  9. MimeType Configuration
       XML.mimeTypes=application/sru+xml;q=0.85, application/xml, text/xml
       HTML.mimeTypes=text/html;q=0.9, application/xhtml+xml
       RSS.mimeTypes=application/rss+xml;q=0.8
       RSS.styleSheet=viaf2rss.xsl
       M21.mimeTypes=application/marc21+xml;q=0.7
       M21.styleSheet=viaf2marc21.xsl
       marc21HTML.mimeTypes=application/marc21+html;q=0.7
       marc21HTML.styleSheet=viaf2marc21.xsl
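
To show how such a configuration could be read into a lookup table, here is a small Python parser for a subset of the lines above. It is an illustration of the file's shape only, not the SRW/U server's loader; `parse_mime_config` is a hypothetical name, and treating a media type with no `q` value as quality 1.0 is an assumption borrowed from the HTTP Accept convention.

```python
CONFIG = """\
XML.mimeTypes=application/sru+xml;q=0.85, application/xml, text/xml
HTML.mimeTypes=text/html;q=0.9, application/xhtml+xml
RSS.mimeTypes=application/rss+xml;q=0.8
RSS.styleSheet=viaf2rss.xsl
M21.mimeTypes=application/marc21+xml;q=0.7
M21.styleSheet=viaf2marc21.xsl
"""


def parse_mime_config(text):
    """Return {format: {"types": [(media_type, q), ...], "styleSheet": name or None}}."""
    formats = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first "=" only; later ones belong to q=... parameters.
        key, value = line.split("=", 1)
        name, prop = key.split(".", 1)
        entry = formats.setdefault(name, {"types": [], "styleSheet": None})
        if prop == "mimeTypes":
            for item in value.split(","):
                media_type, _, param = item.strip().partition(";")
                # Assumption: no q parameter means quality 1.0.
                q = float(param.split("=", 1)[1]) if param.startswith("q=") else 1.0
                entry["types"].append((media_type, q))
        elif prop == "styleSheet":
            entry["styleSheet"] = value
    return formats
```

Formats without a `styleSheet` entry (XML and HTML here) come back with `None`, which a caller could read as "no transformation configured".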

  10. Aside: 123/marc21.xml NOT 123.m21
       • The Generic Record being at viaf/123/ seems to imply that it is a collection of records
       • How do I ask for the HTML version of the MARC21 version of viaf/123 if suffix mangling is all I have?

  11. RDF needs to be returned
       • viaf/123/rdf.xml
       • Making good RDF is tricky and beyond the scope of this presentation (but I think we're getting close to agreements of sensible basics)

  12. Content Negotiation on Generic Record
       • Pubby has a really nice Content Negotiation module
       • It is configured with the list of supported media types with optional quality measures
       • It takes an HTTP Accept header and returns the supported media type that matches best
       • SRU is configured with a list of supported media types and their quality measures and the XSL stylesheets that render them (see previous slide)
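
The matching step can be sketched in Python as below. This is a simplified illustration of Accept-header negotiation, not Pubby's module: the function names are hypothetical, the score is assumed to be the client's quality times the server's quality, and returning a configured default when no Accept header is sent is an assumption.

```python
def parse_accept(header):
    """Parse an Accept header into [(media_range, q), ...]."""
    ranges = []
    for item in header.split(","):
        parts = [part.strip() for part in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0  # HTTP default when no q parameter is given
        for param in parts[1:]:
            if param.startswith("q="):
                q = float(param[2:])
        ranges.append((parts[0], q))
    return ranges


def client_quality(ranges, media_type):
    """Quality the client assigns to media_type, using the most specific matching range."""
    best_specificity, best_q = -1, 0.0
    for media_range, q in ranges:
        if media_range == media_type:
            specificity = 2
        elif media_range == "*/*":
            specificity = 0
        elif media_range.endswith("/*") and media_type.startswith(media_range[:-1]):
            specificity = 1
        else:
            continue
        if specificity > best_specificity:
            best_specificity, best_q = specificity, q
    return best_q


def best_match(accept_header, supported, default=None):
    """supported: list of (media_type, server_q). Returns the best media type, or None."""
    ranges = parse_accept(accept_header or "")
    if not ranges:
        return default  # assumption: no stated preference falls back to a default
    best_type, best_score = None, 0.0
    for media_type, server_q in supported:
        score = client_quality(ranges, media_type) * server_q
        if score > best_score:
            best_type, best_score = media_type, score
    return best_type


# Server qualities for the first two come from the slides; the RDF value is assumed.
SUPPORTED = [
    ("application/sru+xml", 0.85),
    ("text/html", 0.9),
    ("application/rdf+xml", 0.8),
]
```

A browser-style header such as `text/html,application/xhtml+xml,*/*;q=0.8` selects `text/html` here, while a client sending only `application/rdf+xml` gets RDF, and a header naming only unsupported types yields `None`.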

  13. Result
       • viaf/123 redirected to viaf/123/
       • viaf/123/ turned into viaf/search?query=viafID+exact+123
       • VIAF record returned by SRW/U server
       • Content Negotiation causes viaf/123/viaf.html to be returned to googlebot and browsers, viaf/123/rdf.xml to applications that ask for application/rdf+xml, and viaf/123/viaf.xml to be returned when no preference is provided

  14. Lucene Database Demonstration TBD

  15. Questions?
