PBCore for publishing, sharing, and preservation

Jack Brighton, February 28th, 2007

(I’m moving this from the comment section of John Proffitt’s post “RSS a good start, but a federated PBCore-based metadata archive would be better” at his suggestion. Comments are perhaps getting buried, but please do see that thread for more context and great points by all participants. Of course I edited this since I can’t leave anything alone…)

John’s and Dale’s ideas here about using PBCore are excellent, and this is a great place to discuss shaping new practices with media and metadata. I do think XML is the key to unlock access to content, and to expose metadata in whatever flavor and variety is wanted for particular purposes. So an RSS 2.0 feed is good for one purpose, and a PBCore XML record works for something a little more full-blown. In the latter case, we could use PBCore records to connect the dots in a federated collection of public media content at a highly granular level, by developing applications to parse, sift, search, and serve the data. It could look like one collection, but the content could be anywhere. This model is becoming more common in the library world where an XML protocol like OAI-PMH is used. (See http://www.openarchives.org/ )
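To make the "one collection, content anywhere" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the station URLs, the sample records, and the flat `record` / `identifier` / `title` / `subject` / `mediaUrl` element names, which are a simplified stand-in for PBCore rather than the real schema. A real harvester would fetch the XML over HTTP (for example via OAI-PMH) instead of reading embedded strings.

```python
import xml.etree.ElementTree as ET

# Hypothetical feeds from two stations. The element names are a simplified
# stand-in for PBCore records, not the actual PBCore schema.
STATION_FEEDS = {
    "http://station-a.example.org/pbcore": """
<records>
  <record>
    <identifier>a-001</identifier>
    <title>Prairie restoration in Illinois</title>
    <subject>ecology</subject>
    <mediaUrl>http://station-a.example.org/media/a-001.mp3</mediaUrl>
  </record>
  <record>
    <identifier>a-002</identifier>
    <title>Small town jazz</title>
    <subject>music</subject>
    <mediaUrl>http://station-a.example.org/media/a-002.mp3</mediaUrl>
  </record>
</records>""",
    "http://station-b.example.org/pbcore": """
<records>
  <record>
    <identifier>b-107</identifier>
    <title>Wetlands and migration</title>
    <subject>ecology</subject>
    <mediaUrl>http://station-b.example.org/media/b-107.mp3</mediaUrl>
  </record>
</records>""",
}


def build_index(feeds):
    """Parse every feed and merge the records into one flat index."""
    index = []
    for source, xml_text in feeds.items():
        root = ET.fromstring(xml_text.strip())
        for rec in root.iter("record"):
            index.append({
                "source": source,  # where the record (and the media) lives
                "identifier": rec.findtext("identifier", default=""),
                "title": rec.findtext("title", default=""),
                "subjects": [s.text for s in rec.findall("subject") if s.text],
                "media_url": rec.findtext("mediaUrl", default=""),
            })
    return index


def search(index, term):
    """Return entries whose title contains the term or whose subject equals it."""
    term = term.lower()
    return [
        entry for entry in index
        if term in entry["title"].lower()
        or any(term == subject.lower() for subject in entry["subjects"])
    ]


index = build_index(STATION_FEEDS)
for hit in search(index, "ecology"):
    print(hit["title"], "->", hit["media_url"])
```

The search results look like one collection, but each hit still points back to media hosted by the station that published it.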

With this in mind I recently developed templates in my content management system to output various XML formats, including RSS, Atom, PBCore, and Dublin Core. You might have seen me fumble thru a demo of this at the IMA Tech Session Show and Tell. You can see the beta version of this at http://will.atlas.uiuc.edu/index.php/prairiefire/ . Scroll down to find the Syndication menu in the left nav. The PBCore link will generate PBCore XML for the latest 10 episodes of this show we produce called Prairie Fire. When you are on one of the Episode content pages, the PBCore URL reflects just that episode. Same for Segment PBCore URLs. The URL calls the template to display the specific record or set of records, so it becomes the key to everything.
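The "URL is the key" pattern can be sketched roughly as follows. This is not the CMS's actual template code; the URL paths, the `select_records` and `render` helpers, the in-memory episode list, and the simplified element names are all hypothetical. The point is only that the path decides the scope (latest 10 episodes, one episode, or one segment) and a single template renders whatever comes back.

```python
import xml.etree.ElementTree as ET

# Hypothetical content store: 12 episodes, each with one segment.
EPISODES = [
    {
        "id": n,
        "title": f"Episode {n}",
        "segments": [{"id": n * 100 + 1, "title": f"Episode {n}, segment 1"}],
    }
    for n in range(1, 13)
]


def select_records(path):
    """Map a URL path to the records it should expose (assumed URL scheme)."""
    parts = [p for p in path.strip("/").split("/") if p]
    if parts == ["pbcore"]:
        # Show-level URL: the latest 10 episodes, newest first.
        return sorted(EPISODES, key=lambda e: e["id"], reverse=True)[:10]
    if len(parts) == 3 and parts[0] == "pbcore" and parts[2].isdigit():
        wanted = int(parts[2])
        if parts[1] == "episode":
            return [e for e in EPISODES if e["id"] == wanted]
        if parts[1] == "segment":
            return [s for e in EPISODES for s in e["segments"] if s["id"] == wanted]
    return []


def render(records):
    """Render the selected records with one shared (simplified) XML template."""
    root = ET.Element("records")
    for record in records:
        rec = ET.SubElement(root, "record")
        ET.SubElement(rec, "identifier").text = str(record["id"])
        ET.SubElement(rec, "title").text = record["title"]
    return ET.tostring(root, encoding="unicode")


print(render(select_records("/pbcore/episode/3")))
```

Swapping `render` for an RSS, Atom, or Dublin Core template would leave `select_records` untouched, which is why the same URL structure can serve every format.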

To what end? Right now, it’s just a demonstration or proof of concept. Eventually this could be used by Content Depot or NGIS to suck in metadata and media objects for system-wide syndication. (You know, as in Syndication.) In this case, the primary media item would be a broadcast-quality file, not a streaming archive. Then you’d also have a reference to the streaming archive as part of the PBCore record, along with other versions and related assets like a thumbnail image, etc. But I’m not sure PBCore is the right format to wrap up related media assets, so we could use standards like MODS or METS which can include PBCore records as nested elements. In fact, when people begin using our media we’ll want to harvest tags and trackbacks, which add valuable metadata to the existing record. So we’ll want a way to encode this metadata and allow the total package to evolve. PBCore can be the item-level metadata format, but all related items might best be encoded in something else. Then everything can live and breathe as an item, a collection of related items, and a collection of collections. (Am I getting too meta here?) I’m suggesting that this method leads to media objects that harness collective intelligence, with metadata records that evolve with use. Our technical systems should allow for preservation of this metadata along with the media object at its core.
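As a rough illustration of the wrapping idea, the sketch below nests an item-level record inside a METS package and lists the related files next to it. The METS element and attribute names (`dmdSec`, `mdWrap`, `xmlData`, `fileSec`, `fileGrp`, `FLocat`, `structMap`, `fptr`) come from the METS schema; everything else is assumed for the example, including the `wrap_in_mets` helper, the file URLs, the `USE` labels, and the simplified stand-in for a PBCore record. The output has not been validated against the METS schema.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def m(tag):
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def wrap_in_mets(item_record, files):
    """Wrap one item-level record and its related files in a METS package.

    files is a list of (use, mime_type, url) tuples (hypothetical values).
    """
    mets = ET.Element(m("mets"))

    # Descriptive metadata: the item-level record travels inside xmlData.
    dmd = ET.SubElement(mets, m("dmdSec"), ID="dmd1")
    wrap = ET.SubElement(dmd, m("mdWrap"), MDTYPE="OTHER", OTHERMDTYPE="PBCORE")
    ET.SubElement(wrap, m("xmlData")).append(item_record)

    # Related media: broadcast master, streaming copy, thumbnail, and so on.
    file_sec = ET.SubElement(mets, m("fileSec"))
    struct = ET.SubElement(mets, m("structMap"))
    div = ET.SubElement(struct, m("div"), DMDID="dmd1")
    for n, (use, mime_type, url) in enumerate(files, start=1):
        group = ET.SubElement(file_sec, m("fileGrp"), USE=use)
        f = ET.SubElement(group, m("file"), ID=f"file{n}", MIMETYPE=mime_type)
        ET.SubElement(f, m("FLocat"), {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": url})
        ET.SubElement(div, m("fptr"), FILEID=f"file{n}")
    return mets


# Simplified stand-in for a PBCore record (not the real schema).
record = ET.fromstring(
    "<record><identifier>ep-12</identifier><title>Episode 12</title></record>"
)
package = wrap_in_mets(record, [
    ("broadcast master", "audio/x-wav", "http://media.example.org/ep-12.wav"),
    ("streaming", "audio/mpeg", "http://media.example.org/ep-12.mp3"),
    ("thumbnail", "image/jpeg", "http://media.example.org/ep-12.jpg"),
])
print(ET.tostring(package, encoding="unicode"))
```

Harvested tags and trackbacks could plausibly go into an additional `dmdSec` of their own, so the package can grow over time without touching the original item record.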

So what to do next? I’m going to finish building out my little CMS implementation and see where it leads. There are zero actual PBCore applications that can use this stuff, far as I know. But this is really easy to do, and it might lead to some other easy ideas…which I think are often the best kind!
