Alphabetical order, as I show in that piece I cited, may be useful for a 
small list of items, but when information systems retrieve hundreds of 
thousands or millions of entries, alphabetical order is as good as no 
order at all. Why has Google won the search engine wars? Not because it 
retrieves more than anyone else (we have no idea what is left out) but 
because it provides what in a large number of cases is a "useful order." 
If you did a search on "Barack Obama" in Google (or Bing or Yahoo) and 
the results were returned in alphabetical order by title, you'd be 
clicking for days before you reached the "B"s. (There are over 200 
million results.) Instead, http://www.barackobama.com and the Wikipedia 
article are the first two items. In alphabetical order, those would be 
hundreds of pages along, and you'd never get there. It's all a question 
of the "Right Tool for the Job". With big data, the alphabet is not the 
right tool.

Yet, to a large extent, our cataloging rules design headings for 
alphabetical order, and do nothing to facilitate "useful order" for 
large retrieval sets.

So it's not that there is no use for alphabetical order, but it is often 
not useful in the situations in which we use it in library catalogs 
today. And you won't find it being used on the major web sites, like 
Amazon, Google, IMDB, MusicBrainz, or even online phone books (e.g. 
http://www.whitepages.com) or WorldCat! Designing primarily for 
alphabetical order, and ignoring other useful orders, is a huge gap in 
our data planning.
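
To make the contrast concrete, here is a minimal sketch in Python. The 
titles and relevance scores are invented for illustration, and the score 
is only a stand-in: nothing here reflects how Google or any other 
service actually ranks results. It simply sorts the same small result 
set two ways, alphabetically by title and by score.

```python
# Hypothetical result set: (title, relevance score).
# Titles and scores are made up for illustration only.
results = [
    ("Barack Obama - Wikipedia", 0.97),
    ("A Brief History of Chicago Politics", 0.21),
    ("Barack Obama official site", 0.99),
    ("About the 2008 Election", 0.35),
]


def alphabetical(items):
    """Sort by title, ignoring case, as a browse list would."""
    return sorted(items, key=lambda item: item[0].casefold())


def by_relevance(items):
    """Sort by score, highest first: one possible 'useful order'."""
    return sorted(items, key=lambda item: item[1], reverse=True)


alpha_titles = [title for title, _ in alphabetical(results)]
ranked_titles = [title for title, _ in by_relevance(results)]

print("Alphabetical:", alpha_titles)
print("By relevance:", ranked_titles)
```

With four items either order is easy to scan. The point is that the 
alphabetical position of the item the user wants is unrelated to how 
much they want it, so the gap only grows with the size of the set.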

kc

On 7/27/14, 11:50 AM, Tim Thompson wrote:
> Yes, alphabetical order for its own sake is not particularly helpful, 
> but it can be useful as an information retrieval mechanism. For 
> example, if you try to do a lookup for "London (England)" in the 
> current version of the BIBFRAME Editor, you may be disappointed to 
> find that your lookup retrieves a number of individual London 
> locations, but not the top-level entity "London (England)" itself.
>
> This is because the current lookup is not left-anchored, but rather 
> does a keyword search over individual headings. In this case, simple 
> alphabetical order would seem to make the lookup task easier. Hint: as 
> one astute user discovered, if you do a lookup for "Londinium," you 
> will in fact retrieve "London (England)" as the only result, since 
> "Londinium" has been provided as a variant heading in the authority 
> record for "London (England)" (and apparently appears nowhere else in 
> the Name Authority File).
>
> Tim
>
>
> --
> Tim A. Thompson
> Metadata Librarian (Spanish/Portuguese Specialty)
> Princeton University Library
>
> On Sun, Jul 27, 2014 at 10:07 AM, Karen Coyle <[log in to unmask] 
> <mailto:[log in to unmask]>> wrote:
>
>     On 7/27/14, 3:41 AM, Thomas Berger wrote:
>
>         With XML documents there was a distinction between data centric
>         and document centric approaches, often characterized by the
>         permission of "mixed content". Traditional bibliographic
>         records somehow redundantly follow both approaches, i.e. you
>         have "data" in MARC 100 and 700 and the same facts recorded again
>         as parts of the text in 245$c. Especially in cases where the
>         redundancy is not very high, i.e. when the form recorded in
>         the statement of responsibility grossly deviates from the form
>         given in the heading, one would wish for additional markup
>         linking the substring in the SoR with the heading or, like
>         TEI does, embedding heading information in markup
>         distinguishing the name in the SoR.
>
>         Now RDF (with string data types) enforces a strict data-centric
>         view on our bibliographic situation which even in circumstances
>         we usually consider as "pure data" fails to provide appropriate
>         descriptions.
>
>
>     As I said earlier in this thread:
>
>     3. I'm not convinced that it makes sense to convert the entire
>     document that is a "bibliographic description" to RDF, any more
>     than I would want to convert an entire web page to RDF. RDF was
>     designed to surface the data hidden in web pages, not to turn the
>     entire web into triples.
>
>     There are aspects of our data that are document-like, and I see no
>     reason to force these into RDF if they don't fit comfortably. We
>     need to turn the question around, from "How do I fit this into
>     RDF?" to "What do I want to do with this data?" If we wish to
>     provide users with notes about the resource, reviews, or handy
>     hints as to where to find it on the library shelves, there's no
>     reason that these have to be in RDF. If they are, it is for the
>     convenience of processing, not because they result in useful RDF
>     (which is, as you say, designed for linkable data).
>
>     At the same time, I question the need to carry forward certain
>     practices, like reversing author names to the comma-delimited
>     form, which exists solely to support alphabetical order [1]. We
>     (will) have an identifier for the person, and we can have any
>     number of display forms. We need to re-think our data for the web,
>     not try to turn the web into a card catalog.
>
>     kc
>     [1] Where I question alphabetical order:
>     http://kcoyle.net/presentations/thinkDiff.pdf
>
>     -- 
>     Karen Coyle
>     [log in to unmask] <mailto:[log in to unmask]> http://kcoyle.net
>     m: 1-510-435-8234 <tel:1-510-435-8234>
>     skype: kcoylenet
>
>

-- 
Karen Coyle
[log in to unmask] http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet