Robin, the only question that leaps immediately to mind concerns
"use." How much of this is overlap with information intended for the
At 12:35 PM 3/21/2002 -0500, Robin Wendler wrote:
>On Thu, 21 Mar 2002, Jerome McDonough wrote:
> > Technical Metadata for Audio/Video: Michigan State, Library of
> > Congress, Harvard
>We plan to have a mtg report and revised schema drafts available
>sometime the week of April 1st.
> > Technical Metadata for Text: New York University, Harvard
>I was supposed to revise and send around a paper on this, mea maxima
>culpa. If I can clean up what I have for general consumption I'll send it
>out by April, but more likely I will publicly declare defeat.
>I would LOVE to see what NYU has done on this.
>As a summary, our gross areas of local concern for archiving text (i.e.,
>character data, not Word, et al.) were
>-- Character set -- about which there is an eye-opening technical report
>on the Unicode site: "Character Encoding Model (Unicode technical report
>#17)" http://www.unicode.org/unicode/reports/tr17/ What aspects of this
>do we need to record and how can we determine them? (Since our
>contributors sure as heck won't know...)
>-- Markup -- What DTDs, entity files, schemas, style sheets, etc. do we
>want to a) know about and/or b) deposit along with the text file? How do
>we manage versioning of said auxiliary files?
>-- Processing history -- what if anything do we want to know about the
>hardware/software environment in which these text files were produced (OCR
>-- Use -- and this is a fuzzy one: what if anything do we need to record
>about the application/processing environment in which the file is intended
>to be used? For example, a DTD/style sheet may not tell you (as an
>archive) everything you need to know about a text object in order to
>preserve the functions that it currently fulfills in its application
>context. (Does that make any sense?) We can preserve the bits, we're
>fairly confident we can preserve the characters and markup, but what
>if the app dies?
>Robin Wendler ........................ work (617) 495-3724
>Office for Information Systems ....... fax (617) 495-0491
>Harvard University Library ........... [log in to unmask]
>Cambridge, MA, USA 02138 .............
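
On the character set question above ("how can we determine them?"), here is a
rough sketch of what can be recovered from the bytes alone when the
contributor cannot tell us. It is only an illustration: the function name
describe_text_bytes and the record fields are hypothetical, not part of any
draft schema, and the check covers only byte order marks, US-ASCII, and
UTF-8. Legacy 8-bit encodings cannot be told apart this way and come back
as "unknown".

```python
import codecs
import hashlib

# Byte order marks, checked longest-first so UTF-32LE is not mistaken
# for UTF-16LE (the two share their first two bytes).
_BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
]


def describe_text_bytes(data: bytes) -> dict:
    """Build a hypothetical technical-metadata record for a text file.

    The field names are assumptions made for this sketch only.
    """
    record = {
        "size_bytes": len(data),
        "sha1": hashlib.sha1(data).hexdigest(),
        "bom": None,
        "encoding_guess": "unknown",
    }
    # A byte order mark, when present, is the strongest evidence we have.
    for bom, name in _BOMS:
        if data.startswith(bom):
            record["bom"] = name
            record["encoding_guess"] = name
            return record
    # No BOM: try the two encodings that can be validated strictly.
    try:
        data.decode("ascii")
        record["encoding_guess"] = "US-ASCII"
    except UnicodeDecodeError:
        try:
            data.decode("utf-8")
            record["encoding_guess"] = "UTF-8"
        except UnicodeDecodeError:
            # Could be Latin-1, a Windows code page, etc.; the bytes
            # alone do not say, so the guess stays "unknown".
            pass
    return record
```

Anything that comes back "unknown" would still need a human or some depositor
documentation, which is exactly the gap the question points at.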