Alphabetical order, as I show in that piece I cited, may be useful for a small list of items, but when information systems retrieve hundreds of thousands or millions of entries, alphabetical order is as good as no order at all. Why has Google won the search engine wars? Not because it retrieves more than anyone else (we have no idea what is left out) but because it provides what is, in a large number of cases, a "useful order." If you did a search on "Barack Obama" in Google (or Bing or Yahoo) and the results were returned in alphabetical order by title, you'd be clicking for days before you reached the "B"s. (There are over 200 million results.) Instead, http://www. and the Wikipedia article are the first two items. In alphabetical order, those would be hundreds of pages along, and you'd never get there. It's all a question of the "Right Tool for the Job." With big data, the alphabet is not the right tool.

Yet, to a large extent, our cataloging rules design headings for alphabetical order, and do nothing to facilitate "useful order" for large retrieval sets.

So it's not that there is no use for alphabetical order, but it is often not useful in the situations in which we use it in library catalogs today. And you won't find it being used in the major web sites, like Amazon, Google, IMDB, MusicBrainz, or even online phone books (e.g. or WorldCat!. Designing primarily for alphabetical order, and ignoring other useful orders, is a huge gap in our data planning.


On 7/27/14, 11:50 AM, Tim Thompson wrote:
Yes, alphabetical order for its own sake is not particularly helpful, but it can be useful as an information retrieval mechanism. For example, if you try to do a lookup for "London (England)" in the current version of the BIBFRAME Editor, you may be disappointed to find that your lookup retrieves a number of individual London locations, but not the top-level entity "London (England)" itself.

This is because the current lookup is not left-anchored, but rather does a keyword search over individual headings. In this case, simple alphabetical order would seem to make the lookup task easier. Hint: as one astute user discovered, if you do a lookup for "Londinium," you will in fact retrieve "London (England)" as the only result, since "Londinium" has been provided as a variant heading in the authority record for "London (England)" (and apparently appears nowhere else in the Name Authority File).
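
To make the distinction concrete, here is a minimal sketch of the two lookup styles. It is not the BIBFRAME Editor's actual code: the `Authority` class, the `RECORDS` list, and both function names are hypothetical, and the sample headings other than "London (England)" and its variant "Londinium" are made up for illustration. The keyword version matches query words anywhere in a heading or variant, while the left-anchored version only matches labels that begin with the query string.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field


@dataclass
class Authority:
    """Hypothetical authority record: one preferred heading plus variants."""
    heading: str
    variants: list[str] = field(default_factory=list)

    def labels(self) -> list[str]:
        return [self.heading, *self.variants]


# Toy data, invented for this sketch (not taken from the Name Authority File).
RECORDS = [
    Authority("London Bridge (London, England)"),
    Authority("Tower of London (London, England)"),
    Authority("London (England)", ["Londinium"]),
    Authority("Paris (France)", ["Lutetia"]),
]


def _tokens(text: str) -> set[str]:
    """Lower-cased word tokens, ignoring punctuation."""
    return set(re.findall(r"\w+", text.casefold()))


def keyword_lookup(term: str, records: list[Authority]) -> list[str]:
    """Return headings where every query word occurs somewhere in a label."""
    wanted = _tokens(term)
    return [
        r.heading
        for r in records
        if any(wanted <= _tokens(label) for label in r.labels())
    ]


def left_anchored_lookup(term: str, records: list[Authority]) -> list[str]:
    """Return headings where some label starts with the query string."""
    prefix = term.casefold()
    hits = [
        r.heading
        for r in records
        if any(label.casefold().startswith(prefix) for label in r.labels())
    ]
    return sorted(hits, key=str.casefold)
```

With this toy data, a keyword lookup for "London (England)" returns all three London headings in whatever order they are stored, whereas the left-anchored lookup returns only the top-level heading. A keyword lookup for "Londinium" also returns only "London (England)", because the variant appears on that one record.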


Tim A. Thompson
Metadata Librarian (Spanish/Portuguese Specialty)
Princeton University Library

On Sun, Jul 27, 2014 at 10:07 AM, Karen Coyle <[log in to unmask]> wrote:
On 7/27/14, 3:41 AM, Thomas Berger wrote:
With XML documents there was a distinction between data centric
and document centric approaches, often characterized by the
permission of "mixed content". Traditional bibliographic
records somehow redundantly follow both approaches, i.e. you
have "data" in MARC 100 and 700 and the same facts recorded again
as parts of the text in 245$c. Especially in cases where the
redundancy is not very high, i.e. when the form recorded in
the statement of responsibility grossly deviates from the form
given in the heading one would wish for additional markup
linking the substring in the SoR with the heading or - like TEI does -
embedding heading information in markup distinguishing the name
in the SoR.

Now RDF (with string data types) enforces a strict data-centric
view on our bibliographic situation which even in circumstances
we usually consider as "pure data" fails to provide appropriate

As I said earlier in this thread:

3. I'm not convinced that it makes sense to convert the entire document that is a "bibliographic description" to RDF, any more than I would want to convert an entire web page to RDF. RDF was designed to surface the data hidden in web pages, not to turn the entire web into triples.

There are aspects of our data that are document-like, and I see no reason to force these into RDF if they don't fit comfortably. We need to turn the question around, from "How do I fit this into RDF?" to "What do I want to do with this data?" If we wish to provide users with notes about the resource, reviews, or handy hints as to where to find it on the library shelves, there's no reason that these have to be in RDF. If they are, it is for the convenience of processing, not because they result in useful RDF (which is, as you say, designed for linkable data).

At the same time, I question the need to carry forward certain practices, like reversing author names to the comma-delimited form, which exists solely to support alphabetical order [1]. We (will) have an identifier for the person, and we can have any number of display forms. We need to re-think our data for the web, not try to turn the web into a card catalog.
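
As a rough illustration of "one identifier, any number of display forms," here is a small sketch. Everything in it is hypothetical: the `Person` class, its field names, and the example.org identifier are assumptions made for this example, not part of BIBFRAME or any existing vocabulary. The point is only that the inverted form can be derived when a display needs it, rather than stored as the name itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    """Hypothetical person entity keyed by an identifier, not by a name string."""
    identifier: str  # assumed URI; example.org is a placeholder domain
    given: str
    family: str

    def direct_form(self) -> str:
        """Natural-order display form."""
        return f"{self.given} {self.family}"

    def inverted_form(self) -> str:
        """Comma-delimited form, generated only where alphabetical filing is wanted."""
        return f"{self.family}, {self.given}"


# Usage: the same entity yields either display form on demand.
author = Person("http://example.org/person/1", "Jane", "Austen")
print(author.direct_form())
print(author.inverted_form())
```

This simple split into given and family parts is itself an assumption that does not hold for all names; a fuller design would store several labelled display forms per identifier rather than computing them.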

[1] Where I question alphabetical order:

Karen Coyle
[log in to unmask]
m: 1-510-435-8234
skype: kcoylenet
