RSS a good start, but a federated PBCore-based metadata archive would be better
John Proffitt, February 27th, 2007
I’d like to echo Dale’s posting, and expand upon it just a bit more.
First off, I agree that the political hurdles to implementing a standardized and centralized media back-end for the public media world are daunting. Further, I think what we see as “public media” is going to shift around rapidly in the next couple of years, so determining who is “allowed” into the fold will become increasingly difficult (e.g. can a library join, or do you have to be a broadcaster with an active high-power FM or TV license?). There are other challenges as well, but let’s leave that issue alone for the moment. Back to the tech…
I think a centralized storage system is probably a bad idea, or at least one that would be difficult to achieve for all kinds of reasons. It’s also unnecessary. Why does everything have to be stored together, under one roof? The storage can be anywhere. It’s the live, searchable content index that would be most useful to the public, to other stations, to search engines and more. Let’s just remember that storage and indexing do not have to occur at the same place.
Now, about RSS. I think RSS is a great syndication system for short-form and linked media for recently published items. But RSS strikes me as insufficient as a deep-catalog syndication system. For example, how would I syndicate, using RSS, a catalog of 50,000 items or 100,000 items, in which the items are drawn from a variety of subjects and media formats and sources, each with various rights and authors associated with them? Theoretically, RSS could do this, as it’s just a string of XML. However, RSS 2.0 in its baseline configuration doesn’t carry all the data a centralized search system would need. Sure, you can extend RSS with your own additional XML tags (just look at iTunes), but it still sounds a little silly to me to do it that way.
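To make the "extend RSS with your own tags" option concrete, here is a minimal sketch in Python of an RSS 2.0 item that carries two extra elements in a separate XML namespace. The namespace URI, the `rights` and `subject` element names, and the function names are all made up for illustration; they are not taken from PBCore or from any published extension.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace for the extra catalog fields; not a published schema.
EXT_NS = "http://example.org/catalog-ext"
ET.register_namespace("ext", EXT_NS)


def build_item(title, link, media_url, media_bytes, rights, subject):
    """Return an RSS 2.0 <item> carrying two extension elements."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = link
    # RSS 2.0 enclosure: url, length (in bytes) and type attributes.
    ET.SubElement(item, "enclosure", url=media_url,
                  length=str(media_bytes), type="audio/mpeg")
    # Extension elements sit in their own namespace so plain readers can skip them.
    ET.SubElement(item, f"{{{EXT_NS}}}rights").text = rights
    ET.SubElement(item, f"{{{EXT_NS}}}subject").text = subject
    return item


def build_feed(channel_title, channel_link, items):
    """Wrap the items in a minimal RSS 2.0 channel and serialize to a string."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = channel_title
    channel.extend(items)
    return ET.tostring(rss, encoding="unicode")
```

This works mechanically, but every extra field needs its own agreed-upon tag, which is where the approach starts to feel strained for a deep catalog.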
What I would propose is the establishment of a standard metadata description and storage pointer language, based on the PBCore schema (which is pretty complete already). Each public media entity would then expose its metadata index and its digital media archive to the public, to other stations, and to a centralized repository that would periodically accept updates from the edge storage and indexing systems. Access to the data could be tiered as desired, exposing only those items you wish to expose to various users or partners.
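A rough sketch of that arrangement, in Python, might look like the following. It assumes each record carries a storage pointer (a URL back to the edge system) instead of the media itself, and that access tiers are simple labels. The field names, the tier names "public" and "partner", and the `CentralIndex` class are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    station: str    # which edge repository owns the item
    item_id: str    # identifier local to that station
    title: str
    media_url: str  # storage pointer: the media itself stays at the edge
    modified: str   # ISO 8601 UTC timestamp, so string comparison orders correctly
    tier: str       # hypothetical access tier: "public" or "partner"


class CentralIndex:
    """Searchable index that holds metadata and pointers, never the media."""

    def __init__(self):
        self._records = {}

    def accept_update(self, records):
        """Merge a batch from one edge repository, keeping the newest copy."""
        for rec in records:
            key = (rec.station, rec.item_id)
            current = self._records.get(key)
            if current is None or rec.modified > current.modified:
                self._records[key] = rec

    def search(self, term, allowed_tiers=("public",)):
        """Case-insensitive title search limited to the caller's tiers."""
        term = term.lower()
        return [r for r in self._records.values()
                if r.tier in allowed_tiers and term in r.title.lower()]
```

A public search would call `search("farm")`, while a partner station could pass `allowed_tiers=("public", "partner")` to see the wider set.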
Using this metadata standard would allow the proposed central index to gather information from repositories both inside and outside the public media world.
In this way, we have the local control required (for whatever reasons) over media assets, yet the central searchability of our content is not impaired. Local entities would be required to meet certain metadata standards (and tests) before being accepted into the central indexing system. And getting into the system would be a high priority for any media companies wanting to be “found” online, especially in areas beyond the reach of any legacy transmitters.
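The "metadata standards (and tests)" gate could start out as something very simple. The sketch below is one possible shape for it in Python; the list of required fields is a placeholder I made up, not the actual PBCore required-element list, and the function names are hypothetical.

```python
# Hypothetical baseline; a real list would come from the agreed standard.
REQUIRED_FIELDS = ("identifier", "title", "description", "media_url", "rights")


def check_record(record):
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    for name in REQUIRED_FIELDS:
        value = record.get(name)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty field: {name}")
    url = record.get("media_url")
    if isinstance(url, str) and url.strip() and not url.startswith(("http://", "https://")):
        problems.append("media_url is not an http(s) URL")
    return problems


def admit(sample_records):
    """Admit a repository only if it supplies records and all of them pass."""
    return bool(sample_records) and all(not check_record(r) for r in sample_records)
```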
The big plus is that while there would have to be an entity building and maintaining the indexing service, the various players would only have to meet a baseline standard protocol, mostly eliminating the politics. Yes, fights break out at the IEEE from time to time, but in the end, they do reach broadly interoperable standards.
Or… and here’s a subversive bit… do we just implement the metadata standard and then call up Google and tell them how and where to index all our content?
February 27th, 2007 at 8:39 am
Thanks John. I agree that the PBCore standard is more comprehensive than RSS. But we are going to have to operate and distribute content (and import content) within a media universe that is unlikely to adopt our standards, no matter how capable. Feedreading is a pretty flexible art, however, and it should be possible to generate feeds that have the complete PBCore data set included, along with the RSS. At NCPR our feeds output basic RSS as well as the extended iTunes tags that include the media enclosure and other special requirements for a podcast feed. On the reader end, the extraneous or duplicate tags are ignored. It makes for a fatter feed, but flexible. This would allow platforms that are not PBCore-friendly to use our content, and for us to also use content distributed via the more widely used syndication method.
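Here is a small Python sketch of the reader side of that idea: a feed that mixes plain RSS, an iTunes tag, and some extra catalog tags, read by code that only knows about basic RSS. The `pb` namespace URI and its element names are placeholders invented for the example, not the real PBCore schema.

```python
import xml.etree.ElementTree as ET

# Sample "fat" feed: basic RSS plus iTunes and hypothetical catalog extensions.
FEED = """<rss version="2.0"
     xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd"
     xmlns:pb="http://example.org/pbcore-ext">
  <channel>
    <title>Demo feed</title>
    <link>http://example.org/</link>
    <description>Demo</description>
    <item>
      <title>Episode 1</title>
      <link>http://example.org/ep1</link>
      <enclosure url="http://example.org/ep1.mp3" length="1234" type="audio/mpeg"/>
      <itunes:author>Station staff</itunes:author>
      <pb:subject>agriculture</pb:subject>
      <pb:rights>station-owned</pb:rights>
    </item>
  </channel>
</rss>"""


def read_basic_items(xml_text):
    """Read only un-namespaced RSS 2.0 elements; everything else is skipped."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        items.append({
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            "media_url": enclosure.get("url") if enclosure is not None else None,
        })
    return items
```

Because the extension elements live in their own namespaces, the lookups for `title`, `link`, and `enclosure` never see them, so a plain reader and a catalog-aware reader can share the same feed.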
February 27th, 2007 at 9:11 am
So it would be a bit odd to go for centralization after a week of hearing about expanding networks and reaching further into and out of the community, wouldn’t it? In any case, there’s got to be some way to utilize PBCore and RSS together, as a kind of unified standard. Wonder what it would take to make that happen? In order to make it truly proper, we would in fact need a standard, otherwise feeds wouldn’t validate. This may seem a trivial issue, but in fact it’s not, as invalid feeds can cause reader meltdown at times. But it seems it might be worth it to do whatever is necessary to get that standard approved, eh? Any W3C-connected folks in the building, please report to the front desk…
February 27th, 2007 at 12:27 pm
Hey, we’re actually reading each other’s posts and commenting — it’s like a real blog!
Allow me one more counter-proposal… Assuming my content metadata catalog is in a centralized database (that I maintain at my location), I could choose to establish multiple feeds and data interchange formats to fit anyone’s needs.
So I could actually establish multiple feeds off the same data set. One feed for plain-vanilla RSS 2.0, or whatever the current version would be at any given time. That feed would validate perfectly with the W3C and would be readable without incident by any RSS 2.0 reader, podcatcher, etc.
Then I have another feed set up for the current edition of the iTunes format. Then yet another feed, this one a customized one, for some kind of syndication to selected partners. This last one would contain additional tags taken from the PBCore data set in an agreed-upon format and could be very, very long indeed.
Additionally, in that last feed category, you could break up your content database into many sub-feeds, each specifically focused on one program or one class of programs, or programs about particular subjects, etc. Perhaps that’s another area for collaboration — a standard subject taxonomy between radio and TV and other public media collaborators.
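As a sketch of "multiple feeds off the same data set," the Python below renders one in-memory catalog either as plain RSS 2.0 or as a partner feed with extra tags, and slices it into sub-feeds by program or subject. The catalog entries, program names, namespace URI, and tag names are all invented for illustration.

```python
import xml.etree.ElementTree as ET

PARTNER_NS = "http://example.org/partner-ext"  # hypothetical partner namespace
ET.register_namespace("px", PARTNER_NS)

# Made-up sample catalog standing in for the local metadata database.
CATALOG = [
    {"title": "Storm watch", "link": "http://example.org/1",
     "program": "Morning Show", "subject": "weather", "rights": "station-owned"},
    {"title": "Planting season", "link": "http://example.org/2",
     "program": "Morning Show", "subject": "farming", "rights": "station-owned"},
    {"title": "Late set", "link": "http://example.org/3",
     "program": "Evening Jazz", "subject": "music", "rights": "licensed"},
]


def select(catalog, program=None, subject=None):
    """Pick the records for one sub-feed; no filters means the whole catalog."""
    return [r for r in catalog
            if (program is None or r["program"] == program)
            and (subject is None or r["subject"] == subject)]


def render(records, feed_title, partner=False):
    """Render records as RSS 2.0, adding the extra tags only for partner feeds."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = "http://example.org/"
    ET.SubElement(channel, "description").text = feed_title
    for rec in records:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = rec["title"]
        ET.SubElement(item, "link").text = rec["link"]
        if partner:
            for name in ("program", "subject", "rights"):
                ET.SubElement(item, f"{{{PARTNER_NS}}}{name}").text = rec[name]
    return ET.tostring(rss, encoding="unicode")
```

The plain feed is `render(select(CATALOG), "Everything")`; a partner sub-feed for one program is `render(select(CATALOG, program="Morning Show"), "Morning Show", partner=True)`. A shared subject taxonomy would simply fix the values allowed in `subject`.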
I’m still not convinced that an XML feed (RSS or whatever) is the most efficient way to expose my (someday) large catalog of content, but it certainly would be the easiest one to set up. One last proposal… why don’t we do this in a phased way? Let’s start with RSS 2.0, then once that’s working move to deeper and deeper extensions? That gives us a chance to feel good about collaboration via some quick successes. We may need that to foster trust within the sometimes competitive public media environment.
February 27th, 2007 at 1:23 pm
Brilliant, John. Much more sensible than modifying already existing standards that work perfectly fine. And also kind of a big “duh” moment for me. What was I thinking? Dale and I spoke a bit about modifying Public Media Manager so it accommodates PBCore. I think we’d just put together a PBCore-structured XML feed, based on those we already have for RSS. Any suggestions on what might be a better idea?
February 27th, 2007 at 1:47 pm
My only comment now would be… we should push for development of a standardized XML structure for the interchange, if indeed we want to use XML. And I wouldn’t encode the entire PBCore data element list, as it’s probably too expansive for what’s required. So for now, toy with the setup, but we would need agreement across the industry to make a go of this. This is why ongoing discussion amongst the technical folks is important — we can get past the political wranglings by agreeing upon data exchange protocols amongst ourselves. Although, I must admit, techies can get political too. Which is better, Mac or PC?
February 27th, 2007 at 7:17 pm
One thing I love about you guys is you seem to read my mind, then find solutions I’m too dumb or timid to suggest. I do think XML is the key to unlock access to content, and to expose metadata in whatever flavor and variety is wanted for particular purposes. So an RSS 2.0 feed is good for one purpose, and a PBCore XML record works for something a little more full-blown. In the latter case, we could use PBCore records to connect the dots in a federated collection of public media content at a highly granular level, by developing applications to parse, sift, search, and serve the data. It could look like one collection, but the content could be anywhere. This model is becoming more common in the library world where an XML protocol like OAI-PMH is used. (See http://www.openarchives.org/ )
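For anyone who hasn't seen OAI-PMH, harvesting boils down to HTTP requests with a `verb` parameter, plus a `resumptionToken` for paging through long lists. The Python sketch below builds `ListRecords` requests and walks the pages. The base URL is a placeholder, the `fetch` callable is something you would supply yourself (for example a thin wrapper around an HTTP client), and whether a given repository offers a PBCore `metadataPrefix` is an assumption you would have to check; only `oai_dc` is guaranteed by the protocol.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build the first ListRecords request for a repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # only records changed since this date
    if set_spec:
        params["set"] = set_spec    # optional subset of the repository
    return f"{base_url}?{urlencode(params)}"


def resume_url(base_url, token):
    """Build a follow-up request; resumptionToken replaces the other arguments."""
    return f"{base_url}?{urlencode({'verb': 'ListRecords', 'resumptionToken': token})}"


def harvest(base_url, fetch, metadata_prefix="oai_dc"):
    """Yield every <record> element, following resumption tokens page by page.

    `fetch` is a caller-supplied function that takes a URL and returns XML text.
    """
    url = list_records_url(base_url, metadata_prefix)
    while url:
        root = ET.fromstring(fetch(url))
        for record in root.iter(f"{OAI_NS}record"):
            yield record
        token = root.findtext(f".//{OAI_NS}resumptionToken")
        # A missing or empty token means the list is complete.
        url = resume_url(base_url, token) if token else None
```

A central index would run `harvest(...)` against each station on a schedule, passing `from_date` to pick up only what changed since the last visit.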
With this in mind I recently developed templates in my content management system to output various XML formats, including RSS, Atom, PBCore, and Dublin Core. You might have seen me fumble thru a demo of this at the IMA Tech Session Show and Tell. You can see the beta version of this at http://will.atlas.uiuc.edu/index.php/prairiefire/ . Scroll down to find the Syndication menu in the left nav. The PBCore link will generate PBCore XML for the latest 10 episodes of this show we produce called Prairie Fire. When you are on one of the Episode content pages, the PBCore URL reflects just that episode. Same for Segment PBCore URLs. The URL calls the template to display the specific record or set of records, so it becomes the key to everything.
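The "URL is the key" idea can be sketched in a few lines of Python. This is not how the CMS above is implemented; the path layout (`/pbcore`, `/<episode>/pbcore`, `/<episode>/<segment>/pbcore`), the record shapes, and the sample data are all assumptions made up to show the principle that the scope of the URL decides which records the template receives.

```python
def resolve(path, episodes, limit=10):
    """Map a hypothetical PBCore URL path to the records it should display.

    `episodes` is assumed to be ordered newest first.
    """
    parts = [p for p in path.split("/") if p]
    if not parts or parts[-1] != "pbcore":
        raise ValueError("not a PBCore URL")
    scope = parts[:-1]
    if len(scope) == 0:
        return episodes[:limit]              # show level: latest episodes
    matches = [e for e in episodes if e["id"] == scope[0]]
    if len(scope) == 1:
        return matches                       # episode level: just that episode
    if len(scope) == 2:
        return [s for e in matches for s in e["segments"] if s["id"] == scope[1]]
    raise ValueError("unrecognized scope")


# Made-up sample data: 12 episodes, two segments each.
EPISODES = [
    {"id": f"ep{n:02d}", "title": f"Episode {n}",
     "segments": [{"id": "seg1", "title": "Opening"}, {"id": "seg2", "title": "Interview"}]}
    for n in range(12, 0, -1)
]
```

With this data, `resolve("/pbcore", EPISODES)` returns the ten newest episodes, and `resolve("/ep03/seg2/pbcore", EPISODES)` returns a single segment.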
To what end? Right now, it’s just a demonstration or proof of concept. Eventually this could be used by Content Depot or NGIS to suck in metadata and media objects for system-wide syndication. (You know, as in Syndication.) In this case, the primary media item would be a broadcast-quality file, not a streaming archive. Then you’d also have a reference to the streaming archive as part of the PBCore record, along with other versions and related assets like a thumbnail image, etc. But I’m not sure PBCore is the right format to wrap up related media assets, so we could use standards like MODS or METS which can include PBCore records as nested elements. In fact, when people begin using our media we’ll want to harvest tags and trackbacks, which add valuable metadata to the existing record. So we’ll want a way to encode this metadata and allow the total package to evolve. PBCore can be the item-level metadata format, but all related items might best be encoded in something else. Then everything can live and breathe as an item, a collection of related items, and a collection of collections. (Am I getting too meta here?)
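The item / package / collection-of-collections layering can be pictured as a small data structure. The Python sketch below is only a mental model under my own assumptions: the class names and asset roles are invented, and it says nothing about how MODS or METS actually encode the wrapper.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    role: str  # e.g. "broadcast", "stream", "thumbnail" (illustrative roles)
    url: str


@dataclass
class Package:
    """One item: its item-level metadata plus related assets and harvested tags."""
    item_metadata: dict                        # stands in for the PBCore record
    assets: list = field(default_factory=list)
    tags: set = field(default_factory=set)     # grows as listeners tag the item

    def add_tag(self, tag):
        self.tags.add(tag.strip().lower())


@dataclass
class Collection:
    """A collection may hold packages or other collections."""
    title: str
    members: list = field(default_factory=list)

    def packages(self):
        """Walk nested collections and yield every package in order."""
        for member in self.members:
            if isinstance(member, Collection):
                yield from member.packages()
            else:
                yield member
```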
So what to do next? I’m going to finish building out my little CMS implementation and see where it leads. There are zero actual PBCore applications that can use this stuff, far as I know. But this is really easy to do, and it might lead to some other easy ideas…which I think are often the best kind!
February 27th, 2007 at 7:43 pm
Hey guys — great conversation. I’d encourage you all to actually create new blog posts rather than replies, if you’d like. Comments are good, but get less overall visibility. Your ideas deserve more exposure!