Saturday 24 November 2007

BarCamp London: Don't Scrape, Glean (GRDDL) - Tom Morris

APIs break the DRY principle (Don't Repeat Yourself): a separate API duplicates data that is already in your pages, much like a separate “accessible version” of a web site…

W3C have defined GRDDL (Gleaning Resource Descriptions from Dialects of Languages): a well-structured process for converting XHTML (or HTML4 using Tidy) into RDF

  • works like a stylesheet on top of your HTML

GRDDL can also define how to get at the data in APIs (incl. RSS & Atom)

aim to make data layer separate from HTML but bound to it, just like CSS

XSLT templates provided to convert HTML into RDF using classes

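to transform, run the stylesheet over the page with xsltproc and redirect the result, e.g. xsltproc stylesheet.xsl page.html > output.rdf (file names here are placeholders)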
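
To make the class-to-RDF idea concrete, here is a rough Python sketch of the kind of mapping such a stylesheet performs. It is not XSLT and not an official GRDDL transform. The sample page, the hCard-style class names (vcard, fn, url), the FOAF predicates and the example.org URIs are all assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"

# Hypothetical sample page; a real one would be your own XHTML.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>People</title></head>
  <body>
    <div class="vcard" id="tom">
      <span class="fn">Tom Morris</span>
      <a class="url" href="http://example.org/tom">home page</a>
    </div>
  </body>
</html>"""

# Assumed mapping from class names to predicate URIs (illustration only).
CLASS_TO_PREDICATE = {
    "fn": "http://xmlns.com/foaf/0.1/name",
    "url": "http://xmlns.com/foaf/0.1/homepage",
}


def glean_triples(xhtml_text, base_uri):
    """Return (subject, predicate, object) tuples for each vcard block."""
    root = ET.fromstring(xhtml_text)
    triples = []
    for card in root.iter(XHTML_NS + "div"):
        if "vcard" not in card.get("class", "").split():
            continue
        subject = base_uri + "#" + card.get("id", "")
        # Walk the card and map every recognised class to a predicate.
        for node in card.iter():
            for cls in node.get("class", "").split():
                predicate = CLASS_TO_PREDICATE.get(cls)
                if predicate is None:
                    continue
                # Prefer a link target; fall back to the element text.
                value = node.get("href") or (node.text or "").strip()
                triples.append((subject, predicate, value))
    return triples


triples = glean_triples(SAMPLE, "http://example.org/people")
for triple in triples:
    print(triple)
```

The point of the sketch is only that the class attributes already in the markup carry enough structure to derive triples, so no second copy of the data is needed.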

  • have a template HTML page that includes links to the XSLT stylesheets
    • either your own or standard ones
  • add a link to that page in the profile attribute of the head tag of the content page
    • can have space separated profiles
  • then go to triplr.org to convert page into RDF triples or JSON data
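
The first thing a GRDDL-aware tool has to do with the steps above is read the space-separated profile URIs off the head tag. A minimal sketch of that step, using only the Python standard library, might look like this. The class and function names are made up for the example, and the second profile URI is hypothetical; the first is the data-view profile URI used by the GRDDL spec.

```python
from html.parser import HTMLParser


class ProfileFinder(HTMLParser):
    """Collects the profile URIs declared on the <head> tag."""

    def __init__(self):
        super().__init__()
        self.profiles = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            # The attribute may be missing or empty, so fall back to "".
            value = dict(attrs).get("profile") or ""
            # Profiles are space separated, so split on whitespace.
            self.profiles.extend(value.split())


def find_profiles(html_text):
    finder = ProfileFinder()
    finder.feed(html_text)
    return finder.profiles


# Hypothetical content page declaring two profiles.
PAGE = (
    '<html><head profile="http://www.w3.org/2003/g/data-view '
    'http://example.org/my-profile"><title>Notes</title></head>'
    "<body></body></html>"
)

profiles = find_profiles(PAGE)
print(profiles)
```

Each URI found this way would then be fetched and searched for links to the XSLT to apply, which this sketch does not attempt.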

most languages have a SPARQL library for querying RDF, pretty much as you would query a database with SQL

examples:

extends microformats: enables you to write your own
