RSS a good start, but a federated PBCore-based metadata archive would be better
John Proffitt, February 27th, 2007
I’d like to echo Dale’s posting, and expand upon it just a bit more.
First off, I agree that the political hurdles to implementing a standardized and centralized media back-end for the public media world are daunting. Further, I think what we see as “public media” is going to shift around rapidly in the next couple of years, so determining who is “allowed” into the fold will become increasingly difficult (e.g. can a library join, or do you have to be a broadcaster with an active high-power FM or TV license?). There are other challenges as well, but let’s leave that issue alone for the moment. Back to the tech…
I think a centralized storage system is probably a bad idea, or at least one that would be difficult to achieve for all kinds of reasons. It’s also unnecessary. Why does everything have to be stored together, under one roof? The storage can be anywhere. It’s the live, searchable content index that would be most useful to the public, to other stations, to search engines and more. Let’s just remember that storage and indexing do not have to occur at the same place.
Now, about RSS. I think RSS is a great syndication system for short-form and linked media for recently published items. But RSS strikes me as insufficient as a deep-catalog syndication system. For example, how would I use RSS to syndicate a catalog of 50,000 or 100,000 items, in which the items are drawn from a variety of subjects, media formats and sources, each with various rights and authors associated with them? Theoretically, RSS could do this, as it’s just a string of XML. However, RSS 2.0 in its baseline configuration doesn’t carry all the data a centralized search system would need. Sure, you can extend RSS with your own additional XML tags (just look at iTunes), but it still sounds a little silly to me to do it that way.
What I would propose is the establishment of a standard metadata description and storage pointer language, based on the PBCore schema (which is pretty complete already). Each public media entity would then expose its metadata index and its digital media archive to the public, to other stations, and to a centralized repository that would periodically accept updates from the edge storage and indexing systems. Access to the data could be tiered as desired, exposing only those items you wish to expose to various users or partners.
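To make the idea a little more concrete, here is a rough Python sketch of the arrangement described above: each station keeps its own media and exposes metadata records carrying a storage pointer, and a central index periodically pulls whatever has changed. Everything in it is an assumption for illustration. The field names are simplified stand-ins rather than actual PBCore element names, and the tier names, the class names and the harvesting method are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical access tiers, ordered from most open to most restricted.
TIERS = {"public": 0, "partner": 1, "internal": 2}


@dataclass
class MediaRecord:
    # Simplified stand-ins for PBCore-style fields (not the real element names).
    identifier: str
    title: str
    subjects: list
    media_url: str      # storage pointer: the media file stays with the station
    rights: str
    access_tier: str    # hypothetical: one of the TIERS keys
    modified: datetime


class StationRepository:
    """An edge system: one station's own metadata index."""

    def __init__(self, station, records):
        self.station = station
        self.records = records

    def changes_since(self, since, clearance="public"):
        # Expose only records changed after `since` that the caller may see.
        allowed = TIERS[clearance]
        return [r for r in self.records
                if r.modified > since and TIERS[r.access_tier] <= allowed]


class CentralIndex:
    """Holds metadata and pointers only; it never stores the media itself."""

    def __init__(self):
        self.entries = {}        # (station, identifier) -> MediaRecord
        self.last_harvest = {}   # station -> time of the previous harvest

    def harvest(self, repo, now, clearance="public"):
        epoch = datetime.min.replace(tzinfo=timezone.utc)
        since = self.last_harvest.get(repo.station, epoch)
        changed = repo.changes_since(since, clearance)
        for record in changed:
            self.entries[(repo.station, record.identifier)] = record
        self.last_harvest[repo.station] = now
        return len(changed)

    def search(self, term):
        # Naive substring match over titles and subjects.
        term = term.lower()
        return [(station, r.title, r.media_url)
                for (station, _), r in self.entries.items()
                if term in r.title.lower()
                or any(term in s.lower() for s in r.subjects)]
```

A search against the central index returns the station name and a URL pointing back at that station's own storage, so the index and the media never have to live under the same roof. A real system would need a proper wire protocol and a real search engine behind it; this only shows the division of labor.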
Using this metadata standard would allow the proposed central index to gather information from repositories both inside and outside the public media world.
In this way, we have the local control required (for whatever reasons) over media assets, yet the central searchability of our content is not impaired. Local entities would be required to meet certain metadata standards (and tests) before being accepted into the central indexing system. And getting into the system would be a high priority for any media companies wanting to be “found” online, especially in areas beyond the reach of any legacy transmitters.
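The "standards (and tests)" part could be as simple as an automated check that a station's records have to pass before the central index will accept them. Here is a minimal sketch in Python, assuming a made-up baseline: the required field names, the URL rule and the failure threshold are all hypothetical and are not drawn from PBCore itself.

```python
from urllib.parse import urlparse

# Hypothetical baseline: fields every record must carry to be indexed.
REQUIRED_FIELDS = ("identifier", "title", "media_url", "rights")


def validate_record(record):
    """Return a list of problems found in one record (empty means it passes)."""
    problems = []
    for name in REQUIRED_FIELDS:
        value = record.get(name)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty field: {name}")

    # The storage pointer must be something the index can actually link to.
    url = record.get("media_url")
    if isinstance(url, str) and url.strip():
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append("media_url is not an absolute http(s) URL")
    return problems


def passes_baseline(records, max_failure_rate=0.0):
    """Decide whether a station's sample of records meets the baseline."""
    if not records:
        return False  # nothing to judge, so nothing to admit
    failures = sum(1 for r in records if validate_record(r))
    return failures / len(records) <= max_failure_rate
```

The point of returning a list of problems rather than a bare yes or no is that a station that fails the test can see exactly what to fix before trying again.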
The big plus is that while there would have to be an entity building and maintaining the indexing service, the various players would only have to meet a baseline standard protocol, mostly eliminating the politics. Yes, fights break out at the IEEE from time to time, but in the end, they do reach broadly interoperable standards.
Or… and here’s a subversive bit… do we just implement the metadata standard and then call up Google and tell them how and where to index all our content?