I'd like your opinions on a few issues. I work at a data center that
primarily archives and distributes remote sensing data about the
environment, and as such I am probably one of the few, if not the only,
science data managers interested in PREMIS metadata. My
reasons for being interested are:
1. I am a fan of the OAIS reference model
2. We need to use something to manage information about our archive
3. I hate reinventing the wheel (though I am willing to balance wheels
and add new spokes as needed)
4. It should eventually help interoperability, allowing archives to
easily back each other up through the exchange of metadata and data
5. It should eventually help harmonize the directions in which digital
libraries and data centers are headed (I'd love to add the GIS
community to this too), thereby providing a more seamless transition
from data to information and knowledge, if not exactly to wisdom...
I've noticed that I get a lot of pushback on using PREMIS, not just
internally but also from my fellow science data managers
elsewhere. Some of that may simply be resistance to change or the
infamous "not invented here" syndrome. It also may be partly that
there is a lot of pressure to head in the direction of less metadata
rather than more (for example, I hear sentiments like "metadata isn't the
solution - we need a better hammer" a lot). One question I'm often
asked is why anything beyond FGDC metadata is needed (almost all of
my and my colleagues' data is documented at the data set level by
FGDC metadata). My answers about storage information, fixity,
preservation events and agents, and rights are routinely met with
statements like "FGDC can do that." That has always seemed strange
to me since the FGDC standard was specifically developed to contain
metadata to support the following:
"The information included in the standard was selected based on four
roles that metadata play:
- availability -- data needed to determine the sets of data that
exist for a geographic location.
- fitness for use -- data needed to determine if a set of data meets
a specific need.
- access -- data needed to acquire an identified set of data.
- transfer -- data needed to process and use a set of data." - from
CSDGM, 1999
Not one of these purposes is to ensure the long-term preservation of
data. As such, my first thought has always been that FGDC and PREMIS
metadata should be orthogonal - in other words, there shouldn't be a
lot of overlap between the standards. If that is the case, then it
seems to me that it would make a lot of sense to use both standards
simultaneously - FGDC to deal with external user access, PREMIS to
deal with preservation needs. Since I've gotten so much pushback on
this, I decided to see how much overlap there really is between the
two standards. I've attached a very rough draft of a PREMIS-to-FGDC
mapping and am contemplating drafting the inverse FGDC-to-PREMIS
mapping as well. The map was drafted for our internal wiki, so many
of the comments are NSIDC-specific. My own impression is that there
is a bit more overlap than I was expecting, but that I had to pound
those square pegs pretty hard to get them to fit in those round
holes.
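Just to give a flavor of the kind of crosswalk I mean, here is a
purely illustrative sketch of my own in Python (it is NOT an excerpt
from the attached draft, and the particular PREMIS-unit-to-CSDGM-element
pairings are only guesses at plausible matches):

# Illustrative sketch only -- not the attached draft mapping.
# Pairs a few PREMIS semantic units with the FGDC CSDGM elements that
# seem closest; None marks units with no obvious FGDC counterpart.
PREMIS_TO_FGDC_SKETCH = {
    "objectIdentifier": "Identification_Information/Citation",
    "objectCharacteristics/size":
        "Distribution_Information/.../Digital_Transfer_Information/Transfer_Size",
    "objectCharacteristics/format":
        "Distribution_Information/.../Digital_Transfer_Information/Format_Name",
    "objectCharacteristics/fixity": None,   # no checksum element in CSDGM
    "event (eventType, eventDateTime)": None,  # preservation events not modeled
    "agent": "Metadata_Reference_Information/Metadata_Contact",  # a hard squeeze
    "rightsStatement": "Identification_Information/Use_Constraints",
}

def unmapped_units(crosswalk):
    """Return the PREMIS units that have no plausible FGDC home."""
    return [unit for unit, fgdc in crosswalk.items() if fgdc is None]

if __name__ == "__main__":
    for unit in unmapped_units(PREMIS_TO_FGDC_SKETCH):
        print("No FGDC counterpart found for:", unit)

Even in a toy fragment like that, the fixity and event units end up
with no real home on the FGDC side, which is exactly the square-peg
problem I mean.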
My questions for the group are:
1. Is something like this mapping useful to the community or is it
silly to even think about it? [I would be willing to post it on the
PIG wiki but reserve the right to publish it in a paper I am
writing]. If it is useful, would the community be interested in
reviewing/helping me solidify the draft?
2. What is your take on the FGDC vs. PREMIS metadata issue?
3. What is your opinion on the question of whether data centers and
digital libraries, etc. should/could use the same set of standards?
Thanks a bunch (in advance),
Ruth Duerr
NSIDC Data Stewardship Program Manager and MODIS/PARCA Data Coordinator