Begin forwarded message:

From: Amanda Xu <[log in to unmask]>
Date: February 10, 2012 16:08:21 EST
To: "[log in to unmask]" <[log in to unmask]>
Subject: Re: [BIBFRAME] URIs, was: The German National Library's response
Reply-To: Amanda Xu <[log in to unmask]>

Dear Karen:

Thank you so much for asking those great questions, which many of us have in mind as well!  Which fields we need to carry over to the next-generation lib systems on a global scale really depends on the degree to which 1) we want to satisfy users who are going to interact with the systems as planners, owners, designers, builders, integrators, and users; and 2) we have to be in compliance with the requirements of library standards, e.g. ACRL standards for libraries in higher ed. and visual literacy competency standards, IT industry standards, and regulatory ones.

FRBR does address some of the info-seeking concerns of users who are going to maintain and use the data in the systems, but it was defined more than 10 years ago.  Users' expectations today, and the recently recommended ACRL library standards, might be a lot higher than that.  I would hope that our next systems will help libraries measure individual libraries' contributions to institutions' effectiveness, including research, teaching and learning outcomes, according to library standards.  That is critical right now for libraries of all types around the globe.

All of the lib codes so far emphasize only description and access points for lib resources of all content, media and format types.  We need to take care of those perspectives (including the FRBR objectives) very well.  In the meantime, we have to integrate user data into what works well in FRBR and other lib standards for content models.

For instance, this is what the masses want for videos, e.g. http://www.youtube.com/watch?v=6LAPFM3dgag; and this is what we offer in the catalog through our MARC mapping, http://www.loc.gov/marc/bibliographic/bd007v.html, and library metadata, http://www.loc.gov/standards/amdvmd/index.html.

Of course, our emphasis is on transitioning videos from analog to digital and to cloud-based research, teaching, and learning environments, and on data curation and preservation.  I would hope that at least the needs and wants of the masses will be incorporated into the design choices we have to make here, if not those of the domain experts, e.g. music major students.

For info about library system support of users' views from 5000 feet above the ground in the discovery systems for academics, please refer to my position paper in 2009: http://www.dlib.org/dlib/july09/khoo/Xu.pdf  

The following is my attempt to address your specific concerns:

>What do we do with all of the data in the 0xx fields in MARC?

AX: Remember that what we need to improve here is support for analyzing how well individual libraries have contributed to institutional effectiveness.  Consider what the MODS mapping covers for the 0xx fields in MARC, available from http://www.loc.gov/standards/mods/mods-mapping.html#genre.  Did we address that concern?  If yes, how?  If not, what changes do we need to make?

>There are a lot of codes and identifiers that don't yet have a URI representation. Some people want to identify things like publisher and place of publication -- which sounds sensible until you remember that these are *transcribed* elements and thus represent the text of the title page, not an entity. What should we do about that?

AX: As far as I am concerned, all the codes should have the option of both URIs and literals.  Let the users and systems pick which one is preferred.  In the examples by Thom Hickey and others at VIAF.org, <owl:sameAs> is used for URIs, <foaf:name> for RDF literals, <skos:inScheme> for the authority scheme indicating each library's contribution to the name authority file, etc., which I thought was brilliantly done.
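
A minimal sketch of the "both URI and literal" option, in Python with only the standard library.  The property names owl:sameAs, foaf:name and skos:inScheme are the ones mentioned above; the class, every URI, and the Turtle-like output are hypothetical placeholders for illustration, not actual VIAF data or the VIAF schema.

```python
from dataclasses import dataclass, field


@dataclass
class NameEntry:
    """One named entity carrying both a URI form and a literal form."""
    uri: str                                        # hypothetical identifier
    literal: str                                    # the name as a plain string
    same_as: list = field(default_factory=list)     # equivalent URIs elsewhere
    in_scheme: list = field(default_factory=list)   # contributing authority files

    def preferred(self, want_uri: bool) -> str:
        """Let the calling user or system pick the form it prefers."""
        return self.uri if want_uri else self.literal

    def to_statements(self) -> list:
        """Serialize as simple Turtle-like subject/predicate/object lines."""
        lines = [f'<{self.uri}> foaf:name "{self.literal}" .']
        lines += [f"<{self.uri}> owl:sameAs <{u}> ." for u in self.same_as]
        lines += [f"<{self.uri}> skos:inScheme <{s}> ." for s in self.in_scheme]
        return lines


# All values below are made up for the example.
entry = NameEntry(
    uri="http://example.org/id/person/123",
    literal="Example, Author",
    same_as=["http://example.org/other/person/abc"],
    in_scheme=["http://example.org/authorities/libraryA"],
)
```

Calling entry.preferred(True) gives the URI and entry.preferred(False) gives the literal, so the choice stays with the consumer rather than with the record.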

As for publisher and place of publication, we really need to emphasize authority control of both, and make use of the access control record as suggested by Dr. Barbara Tillett many years ago.  Catalogers should continue to transcribe in the bibs, and record changes to the elements through authority control.  The systems would then support the look-up of preferred names for the publisher and place of publication.
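
One way to picture that split between transcription and look-up, as a rough Python sketch.  The authority table, its IDs, the field name publisher_transcribed, and the matching rule (case-folding plus whitespace collapsing) are all assumptions made for the example, not an existing system's behavior.

```python
# Hypothetical authority file: authority ID -> preferred name and known variants.
PUBLISHER_AUTHORITY = {
    "pub-001": {
        "preferred": "Example University Press",
        "variants": {"Example Univ. Press", "Example U.P."},
    },
}


def normalize(text: str) -> str:
    """Case-fold and collapse whitespace so trivial differences still match."""
    return " ".join(text.casefold().split())


def build_index(authority: dict) -> dict:
    """Map every normalized preferred or variant form to its authority ID."""
    index = {}
    for auth_id, record in authority.items():
        for name in {record["preferred"], *record["variants"]}:
            index[normalize(name)] = auth_id
    return index


def preferred_name(transcribed: str, authority: dict, index: dict):
    """Return (authority ID, preferred name), or (None, transcribed) if unmatched."""
    auth_id = index.get(normalize(transcribed))
    if auth_id is None:
        return None, transcribed
    return auth_id, authority[auth_id]["preferred"]


INDEX = build_index(PUBLISHER_AUTHORITY)

# The bib keeps the title-page text exactly as transcribed.
bib_record = {"publisher_transcribed": "Example  Univ. Press"}
```

The transcribed string in the bib is never overwritten; the preferred form is resolved at look-up time, so a change in the authority file does not require touching the bibs.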

>I am assuming that a major effort for the bibframe development will be analyzing RDA and MARC and determining which elements can be represented by URIs.

AX: I would assume the same.  That's what a URI like this is for: http://rdvocab.info/uri/schema/FRBRentitiesRDA/Person.  LC has been working fantastically fast in codifying elements in bibs: http://www.loc.gov/standards/valuelist/index.html and this is the one for the 008 field: http://www.loc.gov/standards/valuelist/marcgt.html.  Have we caught up with the changes in the metadata registry?

>Another large effort will be reconciling the differences between MARC and RDA. MARC has many elements that were never part of the cataloging code (e.g. all of the fixed field elements), and RDA has not yet been developed into a machine-readable format

AX: It really depends on 1) which users' viewpoints and workflows we want to support, 2) the needs of the user-centered lib services that we are creating, e.g. info-seeking and research, and 3) the types of analysis we need for service measurement and refinement, etc. in the systems, which hopefully will still support MARC, RDA, and other content standards that are in the pipeline to be developed by the lib communities.  The flexibility of linked library data and actionable URIs will let the data be moved and processed by machine and, hopefully, remain controllable by humans, at least to the extent of being human-readable, configurable, and debuggable via a browser.
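
The urn.fi examples quoted further down in this thread show one form such an actionable, browser-debuggable URI can take: a resolver address, the identifier, and an optional service request after "?".  A small Python sketch of building and taking apart URIs of that shape follows.  The function names are made up, and nothing here assumes that any resolver actually honors a given service code.

```python
from urllib.parse import urlsplit

# Resolver base taken from the examples quoted in this thread.
RESOLVER = "http://urn.fi/"


def actionable_uri(urn: str, service: str = "") -> str:
    """Attach an identifier (and an optional service request) to the resolver."""
    uri = RESOLVER + urn
    return f"{uri}?{service}" if service else uri


def split_actionable(uri: str):
    """Recover (identifier, service request) from an actionable URI."""
    parts = urlsplit(uri)
    return parts.path.lstrip("/"), parts.query


base = actionable_uri("URN:ISBN:978-952-10-7612-1")
metadata = actionable_uri("URN:ISBN:978-952-10-7612-1", "I2C")
```

Because the service request sits in the query part, the identifier itself stays the same whichever service is asked for, and either form can be pasted into a browser to see what comes back.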


Thanks a million again!  


Amanda Xu    
 
                


From: Karen Coyle <[log in to unmask]>
To: [log in to unmask]
Sent: Friday, February 10, 2012 7:21 AM
Subject: Re: [BIBFRAME] URIs, was: The German National Library's response

On 2/9/12 5:07 PM, Amanda Xu wrote:
> I thought everyone in the group had some kind of default agreement on what to give URIs for (e.g. named entities, media types, concepts, etc.)
>

I don't think we do have such an agreement. There are some obvious
things that can be identified, such as everything that today has an
authority record. Beyond that it gets pretty fuzzy. What do we do with
all of the data in the 0xx fields in MARC? There are a lot of codes and
identifiers that don't yet have a URI representation. Some people want
to identify things like publisher and place of publication -- which
sounds sensible until you remember that these are *transcribed* elements
and thus represent the text of the title page, not an entity. What
should we do about that?

I am assuming that a major effort for the bibframe development will be
analyzing RDA and MARC and determining which elements can be represented
by URIs.

Another large effort will be reconciling the differences between MARC
and RDA. MARC has many elements that were never part of the cataloging
code (e.g. all of the fixed field elements), and RDA has not yet been
developed into a machine-readable format. I don't think we even know for
sure what the RDA elements are -- the ones in the elements list were
really a first pass, from what I understand.

kc

> I discussed about bi-directional linking behavior between source and target data a few days ago. In a navigable info space, where each named entity is considered as a little distributed computer, the source data usually refers to data from operational data store, which has system specific record identifier, in addition to OCLC record number.
>
>
>    Imagine we give
>
> each named entity (e.g. person, family, corporate,  concept, object
>
> Amanda Xu Sent from my iPhone
>
> On Feb 9, 2012, at 14:50, "Hickey,Thom"<[log in to unmask]>  wrote:
>
>> I like actionable URIs as identifiers, but agree that the creator of the
>> identifier may be better left out of the URI.
>>
>> PURLs and VIAF IDs are examples of identifiers that use a domain name
>> specific to themselves, not the agency maintaining them (in both cases
>> OCLC).
>>
>> The separate domain names make them much cooler.
>>
>> --Th
>>
>> -----Original Message-----
>> From: Bibliographic Framework Transition Initiative Forum
>> [mailto:[log in to unmask]] On Behalf Of Juha Hakala
>> Sent: Thursday, February 09, 2012 8:53 AM
>> To: [log in to unmask]
>> Subject: Re: [BIBFRAME] The German National Library's response
>>
>> Hello,
>>
>> Karen Coyle wrote:
>>
>>> Juha, thanks for the info regarding IETF activity. The issue I see with
>>> URNs is not the structure but the minting: should libraries begin to
>>> link their data I see a need for thousands or even tens of thousands of
>>> identifiers (hundreds of thousands?) when we figure out a way to make
>>> library holdings available to the linked data space. Surely we'll need
>>> at least an identifier for each library. At least URIs piggy-back on the
>>> domain system, which already exists.
>>
>> Yes, a lot of identifiers will be needed. And if someone prefers to use
>> URNs for this purpose, RFC 3188bis (the revised namespace registration
>> request for National Bibliography Numbers, NBNs) makes it clear that
>> these identifiers can be assigned to data elements as well.
>>
>> Where these URN:NBNs resolve to and what kind of services they will be
>> able to support will depend on the technical infrastructure available.
>>>
>>> Definitely, this gives us something to think about, and I have no doubt
>>> that we could develop some kind of naming/identifying system to carry
>>> this data. Obviously the first step is to figure out what we need to
>>> identify, a kind of requirements study.
>>
>> Yes; and in addition we may need to consider what kind of services the
>> identified things require.
>>
>>> What I dislike about the persistent identifier is that you lose the link
>>> to the originating agency that you have in the URI. That might be just a
>>> "human thing" - that I feel better when looking at the URI that I can
>>> see WHO is responsible.
>>
>> A persistent identifier may show the originating agency as well. Whether
>> they do or don't, depends on the identifier system used. With URN:NBN
>> the namespace specific string (the identifier part of the URN) may be
>> semantic, if that is the preference of the organization assigning those
>> identifiers. But in the long run it may not be a good idea to include
>> the originating agency into the identifier, since organisations (and
>> even more so, their domain names) may be more short-lived than the
>> things they create. Cool URIs, just like semantic identifiers, may tell
>> who originated the resource, but there is a good chance that they do not
>> tell who is currently responsible for keeping the resource available. A
>> different method for finding this out must be available.
>>
>>> ARKs, of course, give you both, at least in
>>> theory. Is anyone using the "?" feature of ARKs that lets you query for
>>> that information? Should such info be part of our best practices?
>>
>> I don't know if the "?" and "??" features of ARK are in use, and if so,
>> by whom. John Kunze may be able to tell that. But I do think that
>> providing this functionality in a PID system is a good idea, and will
>> "lend" it into the URN system (in case John doesn't mind ;-)). Although
>> the practical implementation in the URN system will probably be an
>> option of retrieving preservation metadata / rights metadata about the
>> resource.
>>
>> Revised version of the URN syntax (RFC2141bis) allows the use of <query>
>> and <fragment>. <query> will never be part of the URN, but it could be
>> used to carry service-related information. For example, this base URN:
>>
>> http://urn.fi/URN:ISBN:978-952-10-7612-1
>>
>> provides the user the default service (splash page describing the
>> resource, and providing a link to the book), but this URN:
>>
>> http://urn.fi/URN:ISBN:978-952-10-7612-1?I2C
>>
>> will supply descriptive metadata about the resource in the default
>> format, provided that the resolution service knows how to deal with the
>> service request in <query> (I2C = URI to resource description).
>>
>> In the context of linked data, we might be interested in enabling for
>> instance retrieval of the definition of a concept in the chosen language
>> (?ENG for English, ?SWE for Swedish, and so on). Whatever linking
>> mechanisms are used (PIDs, cool URIs or something else) they should
>> enable us to do whatever needs to be done.
>>
>> Links are an essential feature in linked data, and we should plan
>> carefully the implementation of this functionality - and not take for
>> instance the functionality cool URIs are currently providing as the
>> predetermined basis for our work.
>>
>> All the best,
>>
>> Juha
>>>
>>> kc
>>>
>>>>
>>>>>> - what should the URI resolve to?
>>>>
>>>> URN-related RFCs are currently being revised (see
>>>> http://datatracker.ietf.org/wg/urnbis/). I am currently writing a new
>>>> version of RFC 2483, which specifies the resolution services URN can
>>>> provide. In the present RFC 2483 the list of services is fixed. RFC
>>>> 2483bis will be based on the idea that IANA should establish a registry
>>>> of informal and formal resolution services. Then URN user communities
>>>> could register new services at will (and parameters to these services,
>>>> for instance for requesting descriptive metadata about the resource in
>>>> different formats).
>>>>
>>>> Existing persistent identifier systems provide a diverse set of
>>>> services. With ARK, for instance, it is possible to check the
>>>> preservation commitment of the organisation holding a resource. I don't
>>>> know if the PID systems will become more homogeneous in this respect in
>>>> the future.
>>>>
>>>> Nobody knows what the URIs utilized within this initiative should
>>>> resolve to, but I am sure that the mechanism to be built should be
>>>> flexible so that it can be adjusted to meet the future needs we don't
>>>> foresee yet.
>>>>
>>>> Best regards,
>>>>
>>>> Juha
>>>>
>>>>>>
>>>>>> That kind of thing.
>>>>>>
>>>>>
>>>>> Does anyone know an answer to any of these questions? Therefore, I
>>>>> think, no URI is better than no URI at all. Use brief and simple and
>>>>> easily memorized codes for vocabularies like the terms in 337-338, and
>>>>> use IDnumbers for names and subjects and titles.
>>>>> Any implementation can easily relate them to all sorts of URIs that may
>>>>> be in current use or follow best practice or resolve to something
>>>>> useful for the purpose at hand. Verbal terms need changes and are
>>>>> language-bound, URLs are perishable, only codes and numbers are robust,
>>>>> easy to handle, and versatile.
>>>>>
>>>>> B.Eversberg
>>>>
>>>
>>
>> --
>>
>>  Juha Hakala
>>  Senior advisor, standardisation and IT
>>
>>  The National Library of Finland
>>  P.O.Box 15 (Unioninkatu 36, room 503), FIN-00014 Helsinki University
>>  Email [log in to unmask], tel +358 50 382 7678

--
Karen Coyle
[log in to unmask] http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet