LISTSERV mailing list manager LISTSERV 16.0

BIBFRAME Archives

[email protected]

BIBFRAME November 2011

Subject:

Re: What the data tells us -> Dublin Core application profiles

From:

Thomas Baker <[log in to unmask]>

Reply-To:

Bibliographic Framework Transition Initiative Forum <[log in to unmask]>

Date:

Tue, 8 Nov 2011 12:47:11 -0500

Content-Type:

text/plain

Parts/Attachments:

text/plain (224 lines)

I agree with the strategy of "modularity" presented here.  This thread reminds
me of Dublin Core workshops circa 1996-2000, when the focus shifted from "the
core" per se to extensibility, modularization, and application profiles.
Application profiles "sequester" complexity, as Roy puts it, by providing a
context for complexification while retaining interoperability on the basis of
the shared or overlapping parts (e.g., "core" properties).

A strategy of modularity suggests that MARC be replaced not by any _one_
"unworkably complex data carrier" (as Karen puts it), but perhaps by a series
of well-designed application profiles.  Indeed, a DCMI/RDA Task Group -- formed
after a meeting among participants from the RDA, DCMI, and Semantic Web worlds
[1] -- has been working towards the goal of expressing RDA in RDF and
constructing application profiles on its basis since 2007.

In a Linked Data environment, interoperability is achieved not by sharing
specific "formats", but by ensuring that diverse applications produce triples
that are coherent -- i.e., that overlap semantically and use, or are mapped to,
common vocabularies.
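The idea of coherence through shared vocabularies can be sketched in a few lines of Python. This is a toy illustration, not any actual Linked Data tooling: triples are plain tuples, the two "applications," their local property IRIs, and the mapping tables are all invented for the example; only the Dublin Core `dcterms:title` IRI is a real identifier.

```python
# Hypothetical sketch: two applications emit triples using their own
# local property IRIs, but each publishes a mapping to a shared
# vocabulary. Triples are plain (subject, predicate, object) tuples.

DC_TITLE = "http://purl.org/dc/terms/title"  # a real Dublin Core term

# Each application's data, with locally named properties (invented IRIs)
app_a = {
    ("urn:rec:1", "http://example.org/a/mainTitle", "Moby Dick"),
    ("urn:rec:1", "http://example.org/a/shelfMark", "PS2384"),
}
app_b = {
    ("urn:rec:1", "http://example.org/b/titleProper", "Moby Dick"),
    ("urn:rec:1", "http://example.org/b/incipit", "Call me Ishmael"),
}

# Mappings from local properties to the shared vocabulary
mapping_a = {"http://example.org/a/mainTitle": DC_TITLE}
mapping_b = {"http://example.org/b/titleProper": DC_TITLE}

def normalize(triples, mapping):
    """Rewrite locally named predicates to their shared equivalents."""
    return {(s, mapping.get(p, p), o) for (s, p, o) in triples}

# After mapping, the two datasets overlap on a shared "core" triple,
# while each keeps its specialized statements intact.
shared = normalize(app_a, mapping_a) & normalize(app_b, mapping_b)
```

The specialized statements (shelf mark, incipit) survive unchanged; interoperability comes only from the overlapping, commonly named part.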

Putting the design of application profiles onto a formal basis was the
motivation behind the DCMI Abstract Model, or DCAM (mentioned in the LC
announcement) and related specifications.  The process of designing application
profiles was seen as directed towards the specification of implementation
_formats_ [2] -- formats with straightforward mappings to triples.

If the goal is to enable data to be managed on the back end by a variety of
implementation technologies (e.g., in XML or databases), and cleanly exposed as
triples, then the DCAM specifications can provide one good starting point --
one that would need to be revised in light of specific requirements and
subjected to iterative testing, perhaps incorporating new approaches from the
OWL community.
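What a "straightforward mapping to triples" from a back-end format might look like can be sketched as follows. This is a minimal illustration under invented names, not the actual DCAM machinery: records are flat dicts on the back end, and a fixed field-to-property table exposes them as triples (the Dublin Core property IRIs are real; everything else is hypothetical).

```python
# Hypothetical sketch of exposing back-end records as triples via a
# fixed mapping table. Field names and record data are invented.

FIELD_TO_PROPERTY = {
    "title":   "http://purl.org/dc/terms/title",
    "creator": "http://purl.org/dc/terms/creator",
    "issued":  "http://purl.org/dc/terms/issued",
}

def record_to_triples(subject_iri, record):
    """Expose a flat back-end record as (subject, predicate, object) triples.

    Fields with no agreed mapping are skipped rather than guessed at,
    so the exposed data stays coherent with the shared vocabulary.
    """
    triples = []
    for field, value in record.items():
        prop = FIELD_TO_PROPERTY.get(field)
        if prop is not None:
            triples.append((subject_iri, prop, value))
    return triples

record = {
    "title": "Moby Dick",
    "creator": "Melville, Herman",
    "local_shelf_code": "PS2384",  # unmapped local field, not exposed
}
triples = record_to_triples("urn:rec:1", record)
```

An application profile, on this view, is essentially the mapping table plus constraints on which statements must be present; the back-end storage technology is free to vary.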

The notion of "core" need not be seen as absolute either, but rather as
relative to requirements for "coherence" between different types of data.  The
goal should not be to perfect some "ideal core", but to identify which
statements are needed to produce coherent data in which situations.

In the library world, DCMI is still widely associated with simple XML formats
from the early 2000s based on "the fifteen elements", but its potentially far
more useful contribution to the bibliographic framework initiative lies in
building on the DCMI/RDA and DCAM work to help to bridge the gap between
closed-world implementations and the open world of Linked Data.  

Tom (wearing his DCMI hat)

[1] http://dublincore.org/documents/singapore-framework/
[2] http://www.bl.uk/bibliographic/meeting.html


On Sun, Nov 06, 2011 at 12:00:11PM -0800, Karen Coyle wrote:
> Roy, I wish you'd said all of this to begin with! Yes, we need to
> create a simple core structure that can be extended. This is what we
> do not have with MARC, and we definitely do NOT have with RDA.
> Unfortunately, RDA is more like MARC than what you describe below.
> We do have an opportunity to create something more workable in
> this transition, but if we do not then we will be stuck with an
> unworkably complex data carrier for a very long time. As some said
> when RDA was still in progress, this may be our last chance to get
> it right because we are falling further and further behind as
> information providers.
>
> Coming up with a core is tricky, to say the least. RDA's core
> includes elements that are core for all of the formats that it
> supports -- so there are core music elements, core maps elements,
> etc., all as part of a single core. I'm not sure that helps us.
> FRBR's entities are probably a better core -- although I find there
> to be some idiosyncrasies in FRBR (the four Group 3 entities, to
> start) that need to be ironed out. I do think that it is essential
> that we start from zero and re-think core for the purposes of a new
> framework.
> 
> kc
> 
> Quoting Roy Tennant <[log in to unmask]>:
> 
> >Karen,
> >I think you missed my point. The point wasn't to enrage music catalogers by
> >leaving a field or subfield behind that they simply must have -- it was
> >rather to determine a core of bibliographic description (which I submit the
> >data DOES tell us), then allow communities of interest to specify ways in
> >which that core can be decorated with what they require without ending up
> >where we did with MARC -- with an arguably bloated record (and I'm including
> >subfields here) that tries to be prepared for every eventuality. That's why
> >I suggested modularity as being an excellent strategy for accomplishing one
> >of my pet goals (to respond to Hal Cain's request):
> >
> >· Simple aims should be simple to accomplish.
> >
> >· Complexity should be avoided unless it is absolutely required to achieve
> >the goal.
> >
> >· If complexity is required, it should be sequestered. That is, complexity
> >should not spill over to affect those who don't need it to achieve their
> >goals.
> >
> >When a MARC subfield is used 17 times out of 240 million records we may want
> >to consider just how important it is to create it, document it, and write
> >software to process it.
> >Roy
> >
> >On 11/5/11 1:24 PM, "Karen Coyle" <[log in to unmask]> wrote:
> >
> >>Quoting Roy Tennant <[log in to unmask]>:
> >>
> >>>I believe you are missing the point. The evidence is clear -- the vast
> >>>majority of the some 3,000 data elements in MARC go unused except for a
> >>>small percentage of records in terms of the whole. What isn't there cannot
> >>>be indexed or presented in a catalog, no matter how hard you try. In other
> >>>words, which fields were coded is the only relevant information. It is the
> >>>ONLY relevant information when you are discussing how to move forward.
> >>
> >>I disagree. (As does the OCLC report, BTW.) To some extent the stats on
> >>MARC records reflect the many special interests that MARC tries to
> >>address. I have spent more time on the Moen statistics [1] than the
> >>OCLC ones, although since they were done on the same body of data I
> >>don't see how they could be very different.
> >>
> >>In the case of what Moen turned up, the most highly used fields were
> >>ones that systems require (001, 005, 008, 245, 260, 300) -- it's a bit
> >>hard to attribute that to cataloger choice. But for the remainder of
> >>the fields there is no way to know if the field is present in all of
> >>the records that it *should* be, or not.
> >>
> >>At least some of the low use fields are ones that serve a small-ish
> >>specialized community. Only 1.3% of the OCLC records have a
> >>Cartographic Mathematical Data (255), but according to the OCLC report
> >>that represents a large portion of the Maps records (p. 23 of OCLC
> >>report). It's harder to make this kind of analysis for fields that can
> >>be used across resource types. For example, 35-47% of the records
> >>(OCLC v. LC-only, respectively, from Moen's stats) have a Geographic
> >>Area code (043). Undoubtedly some records should not have that field,
> >>so is this field a reliable indicator that the resource has geographic
> >>relevance? We have no way of knowing. In addition, as MARC fields are
> >>constantly being added, some fields suffer from not having been
> >>available in the past. (Moen does a comparison of fields used over
> >>time [2], and the OCLC report also looks at this; see below.)
> >>
> >>Neither the Moen stats nor the OCLC report really tell us what we need
> >>to know. It's not their fault, however, because we have no way to know
> >>what the cataloger intended to represent, nor if the MARC record is
> >>complete in relation to the resource. My experience with some
> >>specialized libraries (mainly music and maps) was that these
> >>communities are diligent in their coding of very complex data. These,
> >>however, represent only small numbers in a general catalog.
> >>
> >>The OCLC report reaches this conclusion:
> >>
> >>"That leaves 86 tags that are little used, or not used at all, as
> >>listed in the "MARC 21 fields little or not used" table (Table 2.14,
> >>p. 32). Of these infrequently occurring fields, 16 are fields that
> >>were introduced between 2001 and 2008. Three of these fields
> >>(highlighted in orange) have no occurrences in WorldCat since OCLC has
> >>no plans to implement them."
> >>
> >>This means that there are really 67 fields that seem to be underused.
> >>That is out of 185 tags (not 3000, which would be more like the number
> >>of subfields). That's about 1/3. Having sat in on many MARBI meetings,
> >>however, I am sure that there are communities that would be very upset
> >>if some of these fields were removed (e.g. musical incipits, GPO item
> >>number). Admittedly, some fields were introduced that then turned out
> >>not to be useful. If those can be identified, so much the better.
> >>
> >>Basically, there is no way to know a priori what fields *should* be in
> >>a MARC record other than the few that are required. Deciding which
> >>fields can be left behind is going to take more than a statistical
> >>analysis. I agree that we should not carry forward all MARC data just
> >>"because it is there." The analysis, though, is going to be fairly
> >>difficult. Even more difficult will be the analysis of the fixed
> >>fields. I could go on about those at length, but that analysis will be
> >>complicated by the fact that the fixed fields are frequently a
> >>duplicate of data already in the record, and we never should have
> >>expected catalogers to do the same input twice for the same
> >>information -- we should have had a way to accomplish indexing and
> >>display with a single input.
> >>
> >>kc
> >>[1] http://www.mcdu.unt.edu/?p=41
> >>[2] http://www.mcdu.unt.edu/?p=47
> >>
> >>>
> >>>The one thing you said that I agree with wholeheartedly, is that we should
> >>>know what data is useful to users. Yes. That.
> >>>Roy
> >>>
> >>>
> >>>On 11/4/11 10:41 PM, "J. McRee Elrod" <[log in to unmask]> wrote:
> >>>
> >>>>Roy Tennant <[log in to unmask]> wrote:
> >>>>
> >>>>
> >>>>>"Implications of MARC Tag Usage on Library Metadata Practices"
> >>>>>http://www.oclc.org/research/publications/library/2010/2010-06.pdf
> >>>>
> >>>>This study told us what fields were in records, not whether those
> >>>>fields were utilized in OPACs.  MARC has a wealth of information never
> >>>>put to practical use.   Which fields were coded is fairly useless
> >>>>information.
> >>>>
> >>>>A study of what fields OPACs actually use might be helpful, but that
> >>>>still does not tell us what fields might be helpful to patrons if they
> >>>>were utilized.
> >>>>
> >>>>
> >>>>   __       __   J. McRee (Mac) Elrod ([log in to unmask])
> >>>>  {__  |   /     Special Libraries Cataloguing   HTTP://www.slc.bc.ca/
> >>>>  ___} |__ \__________________________________________________________
> >>>>
> >>>
> >>
> >>
> >
> 
> 
> 
> -- 
> Karen Coyle
> [log in to unmask] http://kcoyle.net
> ph: 1-510-540-7596
> m: 1-510-435-8234
> skype: kcoylenet

-- 
Tom Baker <[log in to unmask]>
