BIBFRAME Archives

BIBFRAME September 2011

Subject: Re: Content Data vs. Textual Data
From: Jeffrey Trimble <[log in to unmask]>
Reply-To: Bibliographic Framework Transition Initiative Forum <[log in to unmask]>
Date: Wed, 28 Sep 2011 09:45:52 -0400
Content-Type: text/plain

Karen,

Historically, many of the indicators were used for card production.  We have the 240/245 indicators that say to "print" or "not print".
Other indicators were used to code controlled vocabularies (as in the 6XX fields).

As for their continuation, I think we can continue, but again, we'll need to make the definitions of the indicators more precise.  And we'll have to
assume that the card production environment is DEAD.

That all said, well and good, we will have a much larger problem at hand:  the ILS vendors themselves.  As many of us can attest, no two vendors use the MARC
record the same way.  The same can be said of XML structure, but I will address that later in this posting.

Some vendors fully support and implement the MARC21 standard as we have it now.  Some do it halfway.  Some just load the record and use it for
'pretty display and editing', but they transfer it to internal tables and strip off the guts of the record.  Exporting it back out of that ILS is impossible.  (And I know
of one ILS that does this--not one of the three big ones.)

This brings me to XML structure.  Again, ILS vendors will have to support it in ways we don't know about just yet.  Let's look at Dublin Core.  DCMI has one standard,
OCLC uses a completely different standard, DSpace uses "Qualified Dublin Core"--and it hasn't been updated since version 1.2 (Version 1.8 is in beta!).  Fedora
uses something different.

If we were to transfer MARC21 to an XML structure, I think we would have as many XML standards as there are catalogers in the world.  XML is a wrapping language,
not a content standard.  I could write this email with XML wrapping--it wouldn't mean a thing unless you were to use "my interpreter program" to understand what my
XML wrappers mean.  I would have to establish my namespace (by the way, I do have a namespace for my own XML coding, and it can be found on the internet)
for the interpreter to know what to do with the data.
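
As a minimal sketch of the namespace point, in Python: the namespace URI and the element names `record` and `title` below are invented for illustration and are not any published schema. The parser reads the wrapper without complaint, but only a consumer that has been told the namespace out of band can find the title.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI and element names, invented for this sketch.
NS = "http://example.org/my-cataloging"

doc = f"""<rec:record xmlns:rec="{NS}">
  <rec:title>Example title</rec:title>
</rec:record>"""

root = ET.fromstring(doc)

# A consumer that does not know the namespace looks for a bare "title"
# element and finds nothing.
unaware = root.find("title")

# A consumer that knows the namespace finds the element.
aware = root.find("rec:title", {"rec": NS})

print(unaware)      # None
print(aware.text)   # Example title
```

The XML is well-formed either way; what the tags mean lives outside the document.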

Let's look a little further, at XHTML and HTML.  The latest and greatest standards are there, but do all the browsers implement them the same way?  I wish.  I now have
about 5 CSS style sheets for our web services here--each to address a different browser.

Funny, we have one MARC21 standard, and yet most of the ILS vendors display the data pretty well.  I said display--I didn't say anything about data extraction/interpolation, etc.

Back to the indicators.  We may need to define indicators and subfield codes in a "paired" environment.  We may need to think about each MARC tag in a deep sense and any
associated indicators.  We will now have to associate data content with subfield codes and indicators.
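
One way to picture a "paired" definition is a small table stating, for each tag and indicator position, which subfields the indicator governs. The Python sketch below fills in a single entry, the 245 second indicator (nonfiling characters) applied to $a. The table layout and the function name `filing_title` are assumptions made for illustration; nothing like this table is part of MARC21 today.

```python
# Hypothetical pairing table: (tag, indicator position) -> subfield codes
# the indicator applies to. Only one entry is filled in for this sketch.
INDICATOR_SCOPE = {
    ("245", 2): ["a"],  # nonfiling-character count applies to $a only
}

def filing_title(tag, indicators, subfields):
    """Return the first in-scope subfield value with nonfiling characters skipped.

    indicators: two-character string, e.g. "14"
    subfields:  list of (code, value) pairs in field order
    Returns None when the table has no entry for the tag or no
    in-scope subfield is present.
    """
    scope = INDICATOR_SCOPE.get((tag, 2), [])
    ind2 = indicators[1]
    # A blank or non-numeric second indicator is treated as "skip nothing".
    skip = int(ind2) if ind2.isdigit() else 0
    for code, value in subfields:
        if code in scope:
            return value[skip:]
    return None

field = [("a", "The adventures of Huckleberry Finn /"), ("c", "Samuel Clemens.")]
print(filing_title("245", "14", field))  # adventures of Huckleberry Finn /
```

With the pairing written down as data, a system no longer has to hard-code which subfields an indicator touches.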

We are back to the basics:  if the ILS doesn't do what the standard says, it doesn't matter a hill of beans what you use to store the cataloging data.  We can store the data
in PostgreSQL or Oracle tables and edit from there, but how our vendors interpret that data will be (and is) paramount.

--jat



On Sep 28, 2011, at 12:28 AM, Karen Coyle wrote:

> Jeff,
> 
> I've just done a study of the MARC indicators which has made me feel somewhat cautious about continuing their use.
> 
> http://kcoyle.blogspot.com/2011/09/meaning-in-marc-indicators.html
> 
> Aside from the fact that the indicators serve a wide variety of functions, some of which, with hindsight, seem a bit dubious, I find that there is at least one basic flaw in the design: there is no way to make clear which subfields the indicator value should be applied to. In some cases the indicator refers to the entire field, but in most cases it logically applies to only some of the subfields (I give more about this in the blog post, but the non-filing indicator and the 245 $a is an obvious one). However, there is nothing explicit in the standard nor in the actual instance records that would make clear which subfields are being addressed by the indicator. It's possible that could be defined on a field-by-field basis, but that means that a system needs to have "outside information" in order to process the data -- I think it's best when fields and subfields self-define so that it isn't necessary to refer elsewhere for processing information.
> 
> This is an issue also for indicators that are not defined. "Not defined" is coded as "blank", but not all blanks mean "not defined" so again it is necessary to build into a system the information about which indicators are defined and which are not. This kind of complexity and special knowledge is a deterrent to data exchange with other communities because there is a steep curve to getting the information that you need in order to process the records, and much of that information isn't in the records themselves. (Not to mention that we don't have a machine-readable version of the MARC format that one could use for validation... sheeeesh!)
> 
> Although it may seem wasteful, my preference is for each data element to be fully described in itself. So rather than having a single field that can carry different forms of the data based on indicators, I would prefer that each "semantic unit" have its own data element (which in MARC means its own field). If that seems too complex for input (although it doesn't actually change the number of meanings in the record, only their encoding), the user-interface could present something like:
> 
> title/textual
> title/content
> etc.
> 
> making sure that the various forms of the same element can easily be seen as a logical unit to the person doing the input.
> 
> kc
> 
> Quoting Jeffrey Trimble <[log in to unmask]>:
> 
>> I've been thinking about this issue because it is an interesting way for Catalogers, Data Analysts and Librarians to look at the issue.  This also plays into
>> the Cataloging thinking of "transcribing a title" and the interesting new feature(s) of RDA.  Let me remind you that I'm doing this off the cuff, so some things
>> are not presented "pretty" or necessarily logically--I'm thinking aloud.
>> 
>> So we have this MARC record structure.  As I have mentioned before, it is possible to expand the structure.  For the sake of this discussion, let's assume
>> we were to expand the Indicators from 2 to 3.  The new indicator has these definitions:
>> 
>> 0	This is textual data [transcribed]
>> 1	This is content data [transcribed]
>> 2	This is textual data [non-transcribed]
>> 3	This is content data [non-transcribed]
>> 4	This is transcribed data (textual and content)
>> 5	This is non-transcribed data (as it appears on a title page or on the item textual and content)
>> 
>> ...  maybe more.
>> 
>> 1.	Transcription Solution:
>> 
>> So you could then define a 245 in two ways:
>> 
>> 245 104  The adventures of Huckleberry Finn / $c Samuel Clemens.
>> 245 005  The ADVENTURES of HUCKLEBERRY FINN / $c samuel CLEMENS  <== Appears on the t.p.
>> 
>> Notice that I actually used indicator position 1 to indicate indexing (printing or not printing on the card). Now the ILS vendor has to make sure that, when two 245s are present, these indicators work correctly, or you will have duplicate entries.  (And filing can be a problem if the ILS does not normalize the character string when indexing and gives different weighting to upper case letters and lower case letters.)
>> 
>> 2.  Content vs. textual.
>> 
>> 300 ##0 $a xii, 543 p. : $b ill., maps ; $c 28 cm.
>> 300 ##1 $a xii $a 543 $a p. $b ill. $b maps $c 28 cm.
>> 
>> You can now teach the display to see the second 300 as content data, and the computer knows roman numerals from non-roman.
>> 
>> 3.  Example with Imprint statement
>> 
>> 260 ##0	[New York, N.Y.] : $b Moonshine Press, $c c1990.
>> 260 ##2   NEW YORK :$b MoonSHINE,     <==appears on t.p., but no date until you turn to t.p. verso.
>> 260 ##1   New York, New York : $b The Moonshine Press, $c 1990, $g 2008
>> 
>> Do you see where I'm going with this?  We are able to record data in a variety of ways and let the machine manipulate it as needed.  The subfield codes can more or less stay the same, but we still may need to expand on this area.
>> 
>> --Jeff
>> 
>> 
>> Jeffrey Trimble
>> System Librarian
>> William F.  Maag Library
>> Youngstown State University
>> 330.941.2483 (Office)
>> [log in to unmask]
>> http://www.maag.ysu.edu
>> http://digital.maag.ysu.edu
>> "For he is the Kwisatz Haderach..."
>> 
> 
> 
> 
> -- 
> Karen Coyle
> [log in to unmask] http://kcoyle.net
> ph: 1-510-540-7596
> m: 1-510-435-8234
> skype: kcoylenet
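
The second 300 in the quoted message assumes that, once each piece of the extent sits in its own subfield, software can tell roman numerals from arabic ones. A minimal Python sketch of that step follows. The function names are invented for illustration, the sample values are the three $a values from that quoted 300, and nothing here is a MARC21 or vendor feature.

```python
import re

# Matches conventional roman numerals up to the thousands, either case.
# The lookahead rejects the empty string.
ROMAN = re.compile(
    r"(?=[ivxlcdm])m*(cm|cd|d?c{0,3})(xc|xl|l?x{0,3})(ix|iv|v?i{0,3})",
    re.IGNORECASE,
)
VALUES = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100, "d": 500, "m": 1000}

def roman_to_int(numeral):
    """Convert a roman numeral to an integer, reading right to left."""
    total = 0
    prev = 0
    for ch in reversed(numeral.lower()):
        val = VALUES[ch]
        if val < prev:
            total -= val   # subtractive pair, e.g. the "i" in "iv"
        else:
            total += val
            prev = val
    return total

def classify_extent(value):
    """Label one subfield value as arabic, roman, or plain text."""
    v = value.strip()
    if v.isdigit():
        return ("arabic", int(v))
    if ROMAN.fullmatch(v):
        return ("roman", roman_to_int(v))
    return ("text", v)

for piece in ["xii", "543", "p."]:
    print(classify_extent(piece))
```

A word that happens to be a valid numeral (for example "mix") would still be read as roman, so a real system would need more context than this sketch uses.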

Jeffrey Trimble
System Librarian
William F.  Maag Library
Youngstown State University
330.941.2483 (Office)
[log in to unmask]
http://www.maag.ysu.edu
http://digital.maag.ysu.edu
"For he is the Kwisatz Haderach..."
