Ron, I think that the author ordering example is a great one. Journal articles can have (believe me, I saw it) over 100 authors in some fields. It can be the case that the first n authors are in a particular order. I doubt if the order of the remaining 90 or so is precise.

What I learned about those long author lists is that there are "rules" for who gets first and second and possibly third billing.[1] They vary, naturally, by field. Some fields (mathematics) stick to strictly alphabetical order. Others appear to put the most important/involved folks toward the top (leading to some disagreements about relative placement). Oddly, in some fields principal investigators come LAST, not first. So the order only imparts some semantics if you happen to know about the field, its rules, and under what set of rules (which will have changed over time) the order was decided. Because of this, some journals now designate a "contact author", the person who should be contacted for any requests for information, and so "contact author" has become a status symbol because it is the only semantic designation given to authors in multi-authored works.

I still think this calls for clearer semantics, although with legacy data it may be necessary to preserve order.

kc

[1] http://en.wikipedia.org/wiki/Academic_authorship

On 9/3/14, 10:44 AM, Murray, Ronald wrote:
So yes, it's a data modeling issue:

There are social and maybe legal arrangements encoded in text layouts from which readers can make inferences based on order. In multiperson scholarly publications, for example, notions like "first author" and "last author" can be very important in evaluating source credibility. If we throw away such ordering due to the inability to represent it, we toss out an opportunity for end-user (or computed) inferencing. (This bumps us against a previous techno/cataloging policy-motivated data capture limitation - that of not enumerating all authors of a publication. That decision would be pretty problematic in an age of multinational collaborative experimentation such as those at CERN.*)

Wherever one sees person names out of alphabetical order - e.g., many actor listings in motion picture titles and trailers along with "with" and "and" - some readers and those of us behind the scenes should give a thought to why it is so. There's a big difference between having no ability to specify well-motivated data orderings (i.e., going easy on designers and programmers) and having an ability that is employed according to specified rules, i.e., for orderings that are less oriented towards data display.

Ron Murray

*Click on the ATLAS collaboration author link at: http://cds.cern.ch/record/1753190
Note that they are all in alphabetic order, whereas this MEDLINE citation: http://www.ncbi.nlm.nih.gov/pubmed/24808340 hints at a different kind of relationship.

---------------------


From: "[log in to unmask]" <[log in to unmask]>
Reply-To: Bibliographic Framework Transition Initiative Forum <[log in to unmask]>
Date: Wednesday, September 3, 2014 11:23 AM
To: "[log in to unmask]" <[log in to unmask]>
Subject: Re: [BIBFRAME] Medium of performance (Music) question

On Sep 3, 2014, at 10:35 AM, Karen Coyle <[log in to unmask]> wrote:

RDF also has ordered collections, called rdf:Seq. Like the SKOS collection it appears to be oriented toward display (and as such could be used for the eye-readable publication statement). It also has rdf:List, which is a kind of linked list with a chain from first -> rest -> last. Although these exist, they are apparently awkward to work with compared to simple triples. Therefore, like blank nodes, they give you an out when you have no other choice, but other solutions (like re-thinking the actual meaning of the data elements that were previously connected by order) are preferable.

As someone who writes code to manage RDF, I agree heartily with this. It is true that RDF can represent order. But it is _very_ true that other solutions are preferable. <grin> As Karen Coyle points out, that requires that effort be spent not in encoding, but in modeling itself. I don't pretend to know much about description for music, but in this case, if I do understand the intent properly, it seems that we're reaching for a "medium of performance" class that itself could feature various properties, like "instrument used". That would allow a construction:

resource bf:musicMedium example:oboe-guitar-duet .
example:oboe-guitar-duet mybibframe:instrument [
        mybibframe:mediumOfPerformance <http://id.loc.gov/authorities/performanceMediums/mp2013015507> ;
        mybibframe:numberOfInstruments "1" ] ;
    mybibframe:instrument [
        mybibframe:mediumOfPerformance <http://id.loc.gov/authorities/performanceMediums/mp2013015306> ;
        mybibframe:numberOfInstruments "1" ] .

or the like. With an actual "medium of performance" class combined with inferencing, some of the properties might be omitted.
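To illustrate why grouping of this kind does not depend on order, here is a minimal Python sketch. It is not tied to any RDF library: triples are plain tuples, the blank node labels `_:b1` and `_:b2` and the helper `instrumentation` are hypothetical, and the `mybibframe:` properties are the placeholder names used in the construction above, not real BIBFRAME terms.

```python
# Triples are plain (subject, predicate, object) tuples.
# The two medium URIs are the ones quoted in this thread.
MEDIUM_1 = "http://id.loc.gov/authorities/performanceMediums/mp2013015507"
MEDIUM_2 = "http://id.loc.gov/authorities/performanceMediums/mp2013015306"

TRIPLES = [
    ("example:oboe-guitar-duet", "mybibframe:instrument", "_:b1"),
    ("_:b1", "mybibframe:mediumOfPerformance", MEDIUM_1),
    ("_:b1", "mybibframe:numberOfInstruments", "1"),
    ("example:oboe-guitar-duet", "mybibframe:instrument", "_:b2"),
    ("_:b2", "mybibframe:mediumOfPerformance", MEDIUM_2),
    ("_:b2", "mybibframe:numberOfInstruments", "1"),
]


def instrumentation(medium_node, triples):
    """Return {medium URI: count} for one 'medium of performance' node.

    The pairing of medium and count comes from the shared blank node,
    not from the position of any triple in the list.
    """
    groups = [o for s, p, o in triples
              if s == medium_node and p == "mybibframe:instrument"]
    result = {}
    for group in groups:
        props = {p: o for s, p, o in triples if s == group}
        medium = props["mybibframe:mediumOfPerformance"]
        result[medium] = int(props["mybibframe:numberOfInstruments"])
    return result
```

Calling `instrumentation("example:oboe-guitar-duet", TRIPLES)` gives the same answer however the triples are shuffled, which is the property that order-based encodings lack.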

Given that MARC bibliographic was essentially a display format, we have to assume that in some cases (publisher statement being an obvious one) there was no impetus to develop a more precise coding. It is instructive to compare the treatment of the basic bibliographic fields in MARC bibliographic to the MARC holdings format, which was designed for machine processing, especially in the area of serial receipt prediction. I'm in favor of doing this re-thinking NOW before we move these practices into BIBFRAME. Note, however, that this will require collaboration with the cataloging community, since in some cases the rules may still support the older concepts.

Yes. One of the "social" issues here is the need to recognize that what a cataloger (or other professional) sees looking at the raw data is not at all what the patron (or other consumer) necessarily will see. We already know that from MARC. We don't put field numbers or subfield codes in the displays of OPACs. {grin} But we still assume a very simple mapping between raw data and displayed data. Processes of indexing and transformation are freeing for the cataloger, allowing a precise description to be manipulated a great deal before display, and in different ways for different displays.

On 9/3/14, 7:09 AM, Murray, Ronald wrote:
If RDF is supposed to be representing information as graphs, then it should also have the ability to represent orderings among linked elements in RDF graphs. This is how fundamental structures like directed graphs (think about map directions) are constructed.

RDF, treated as a graph, is directed. Paths through the graph can be represented and discovered fairly easily using constructions like SPARQL property paths or LDPath programs, although there aren't always obvious encodings in RDF itself for those constructions. Ordering of collections, however, is certainly different. Ordering for display, for example, can be done via linked lists, or indexed items, and probably in other ways. But it's not clear to me that this example is really about ordering at all. It seems to me to be more about aggregation.
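For what it's worth, the two display-ordering approaches just mentioned can be sketched in a few lines of Python. This is only an illustration under stated assumptions: triples are plain tuples rather than objects from an RDF library, the function names and blank node labels are hypothetical, and only the rdf:first / rdf:rest / rdf:nil terms and the rdf:_1, rdf:_2 ... membership properties come from RDF itself.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
NIL = RDF + "nil"


def as_linked_list(items):
    """Encode items as an rdf:List-style chain of first/rest triples."""
    nodes = [f"_:n{i}" for i in range(len(items))]
    triples = []
    for i, item in enumerate(items):
        rest = nodes[i + 1] if i + 1 < len(items) else NIL
        triples.append((nodes[i], RDF + "first", item))
        triples.append((nodes[i], RDF + "rest", rest))
    head = nodes[0] if nodes else NIL
    return head, triples


def read_linked_list(head, triples):
    """Recover the order by walking the chain until rdf:nil."""
    first = {s: o for s, p, o in triples if p == RDF + "first"}
    rest = {s: o for s, p, o in triples if p == RDF + "rest"}
    items, node = [], head
    while node != NIL:
        items.append(first[node])
        node = rest[node]
    return items


def as_indexed(container, items):
    """Encode items rdf:Seq-style, with rdf:_1, rdf:_2, ... properties."""
    return [(container, f"{RDF}_{i}", item)
            for i, item in enumerate(items, start=1)]


def read_indexed(container, triples):
    """Recover the order by sorting on the numeric property suffix."""
    prefix = RDF + "_"
    members = [(int(p[len(prefix):]), o) for s, p, o in triples
               if s == container and p.startswith(prefix)]
    return [o for _, o in sorted(members)]
```

In both encodings the order lives in the data (the chain, or the index in the property name) and not in the sequence of the triples, so a reader has to do extra work to get it back. That extra work is part of why these structures feel awkward next to simple triples.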

---
A. Soroka
The University of Virginia Library




On 9/3/14, 7:09 AM, Murray, Ronald wrote:
If RDF is supposed to be representing information as graphs, then it should also have the ability to represent orderings among linked elements in RDF graphs. This is how fundamental structures like directed graphs (think about map directions) are constructed.

---> If I had not dug a little deeper, I would have said this ---> This sounds like an oversight at RDF design time, probably due to the designers (a.) not following through on their basic graph definitions, or (b.) not being able to imagine the need for data ordering, or (c.) not looking around for existing graph-like resource description structures in the wild (e.g., library catalogs from Cutter on). Given that lapse at the RDF design level, is there a reason not to implement graph ordering in RDA? All order-oriented graph definitions can be found in graph theory textbooks, and algorithms for traversing ordered graphs - and especially knowing when to stop - are equally accessible.

However - SKOS already includes a definition of an OrderedCollection (thanks, RDF folks!):

<rdf:Description rdf:about="#OrderedCollection">
  <rdfs:label xml:lang="en">Ordered Collection</rdfs:label>
  <rdfs:isDefinedBy rdf:resource="http://www.w3.org/2004/02/skos/core"/>
  <skos:definition xml:lang="en">An ordered collection of concepts, where both the grouping and the ordering are meaningful.</skos:definition>
  <skos:scopeNote xml:lang="en">Ordered collections can be used where you would like a set of concepts to be displayed in a specific order, and optionally under a 'node label'.</skos:scopeNote>
  <!-- S28 -->
  <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
  <!-- S29 --><rdfs:subClassOf rdf:resource="#Collection"/>
</rdf:Description>

But the SKOS scope note does not suggest/dictate that element ordering can play a role beyond data display.

Now we know better, and need to correct our BIBFRAME conceptualization & implementation to (a.) accommodate the previous examples, and (b.) generalize the use of OrderedCollections for application elsewhere.

Ron Murray



From: Karen Coyle <[log in to unmask]>
Reply-To: Bibliographic Framework Transition Initiative Forum <[log in to unmask]>
Date: Tuesday, September 2, 2014 6:07 PM
To: "[log in to unmask]" <[log in to unmask]>
Subject: Re: [BIBFRAME] Medium of performance (Music) question

I had to go back and read the blog post because it was 5 years ago and I didn't remember the particulars. (OK, I also didn't remember the generalities ;-))

I think you are right to be thinking about this. Connecting elements based on their order is not something we can easily achieve with RDF, and I would think is not a practice that we want to carry forward.

Unlike MARC or XML, RDF emphasizes individual, atomic elements, and in fact doesn't do order very well. There are mechanisms for imposing order in RDF but what I hear from RDF developers is that one should use ordering very sparingly. We can, however, create logical groups of elements, such as the instrument and the number. I believe that is what Joerg was demonstrating, although to include the number we would need yet another level of grouping:

<piece for oboe and guitar> a bf:Work ;
            bf:Title "Title of this piece for oboe and guitar" ;
            mybibframe:instrument [
                mybibframe:mediumOfPerformance
                <http://id.loc.gov/authorities/performanceMediums/mp2013015507> ;
                mybibframe:numberOfInstruments "3" ] ;
            mybibframe:instrument [
                mybibframe:mediumOfPerformance
                <http://id.loc.gov/authorities/performanceMediums/mp2013015306> ;
                mybibframe:numberOfInstruments "1" ] .

Or something like that.

We need to rethink areas of our records that depend on order today. Some places where it is used it could be replaced by separate data elements, each with their own meaning. For example, a 245 with parallel titles, each in a $a with related $b's needs to become two separate title "graphs", one for each set of title elements for a language. The very unfortunate $g in the MARC X00 fields (which can contain information relating to either the author portion or the title portion of the field, depending on where it is placed) needs to be rethought entirely. Ditto multiple $x subfields in the 6XX area. If the order of those is important for its meaning, then we will need to resolve that by combining them into a single element, or defining better what each $x means. (See examples on page 72 of Chan & O'Neill on FAST [1], where they show topics with qualifiers, but not actual subdivisions. I don't know if this is the "normal" case.)

Another difficulty is the repeatability of data, especially in cases where US libraries (or perhaps all anglo-american libraries) do not separately catalog the resources in aggregates. Even today I believe that many library systems do not separately index, say, subject fields, so that one gets false hits from words in different subject headings. This problem is compounded in music data because so much of one's holdings consists of aggregates, and some terms (symphonies, concertos) are so common that without a uniform title browse you can't get much precision in searching. To me this says that we haven't adequately identified the *things* in our data and therefore are mixing together elements that should be attributes of different resources. RDA skims past this a bit by allowing one to identify individual works, but FRBR, as one can see from the solution proposed by the Working Group on Aggregates [2], failed (IMO) to come up with a workable solution (thus at least in part negating the nice attempt by RDA).

Now is the time to resolve these issues, before they become baked into yet another library data model.

kc

[1] O'NEILL, E. T., & CHAN, L. M. (2003). FAST (Faceted Application of Subject Terminology) a simplified LCSH-based vocabulary. [Hague, Netherlands], IFLA. http://www.ifla.org/IV/ifla69/papers/010e-ONeill_Mai-Chan.pdf
[2] http://www.ifla.org/node/923

On 9/1/14, 1:44 PM, Kirk-Evan Billet wrote:
It's a Bibframe vocabulary question and also a syntax question. I want to refer back to something Karen Coyle discussed in 2009, responding to Martha Yee (http://kcoyle.blogspot.com/2009/07/yee-questions-9-11.html). I'm calling it the "two oboes and three guitars" problem, but I'll reduce it down to just the 1 oboe and 1 guitar problem.

For some music resource with an instrumentation of oboe plus guitar, we might have, in MARC:

382 0  $a oboe $n 1 $a guitar $n 1 $s 2 $2 lcmpt

But in Bibframe, the following would be inaccurate:

<piece for oboe and guitar> a bf:Work ;
            bf:Title "Title of this piece for oboe and guitar" ;
            bf:[?property] <http://id.loc.gov/authorities/performanceMediums/mp2013015507> ;
            bf:[?property] <http://id.loc.gov/authorities/performanceMediums/mp2013015306> ;

because the medium of performance is not oboe and it's also not guitar; it's oboe + guitar. Is there a syntax that can "wrap" these two separate statements together so that we're making one assertion about the work's medium? Alternatively, and especially since no such bf property currently exists (have I missed it?), is this a case where it is expected that we will use a non-bf vocabulary? (bf:musicMedium is included in category "Title information" and corresponds to MARC bib 240 $m or auth 1XX $m.)
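To make the order dependence concrete, here is a rough Python sketch of how the MARC 382 example above ties each count to an instrument purely by position. It assumes, based only on that example, that each $n belongs to the $a immediately before it; the function names `parse_subfields` and `group_mediums` are hypothetical, and the string handling is far simpler than real MARC parsing.

```python
def parse_subfields(field):
    """Split a '$a oboe $n 1 ...' string into ordered (code, value) pairs."""
    pairs = []
    for chunk in field.split("$")[1:]:
        code, _, value = chunk.partition(" ")
        pairs.append((code, value.strip()))
    return pairs


def group_mediums(pairs):
    """Attach each $n to the most recent $a, relying only on subfield order."""
    groups = []
    for code, value in pairs:
        if code == "a":
            groups.append({"medium": value, "count": None})
        elif code == "n" and groups:
            groups[-1]["count"] = int(value)
    return groups


FIELD_382 = "$a oboe $n 1 $a guitar $n 1 $s 2 $2 lcmpt"
```

`group_mediums(parse_subfields(FIELD_382))` pairs oboe with 1 and guitar with 1, but only because the subfields arrive in that sequence. Once each subfield becomes a separate, unordered statement, that pairing has to be carried some other way.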

Thanks for any insights, clarifications, or corrections.

Kirk-Evan Billet


-- 
Karen Coyle
[log in to unmask] http://kcoyle.net
m: +1-510-435-8234
skype: kcoylenet/+1-510-984-3600

