Kelley and Amanda,

As I recall, this very situation was a topic that Jason Thomale discussed
in his code4lib article [1]. Unfortunately, the code4lib journal isn't
reachable at the moment, but hopefully that will get fixed.

kc

[1] http://journal.code4lib.org/articles/3832

On 1/11/18 4:05 PM, Kelley McGrath wrote:
> Hi Amanda,
> 
>  
> 
> Thanks for the additional info. I have not looked at the BF2 converter,
> but this example does demonstrate a couple of challenges for conversion.
> 
>  
> 
> 1.       The preceding punctuation determines the meaning of some
> subfields, such as 245$b, and shouldn’t be ignored (and don’t forget to
> account for the records where the punctuation is mistakenly placed after
> the subfield marker)
> 
>  
> 
> 2.       Inconsistent practice
> 
>  
> 
> Depending what was on the piece and what cataloging rules were used, I
> think you should have something like one of these where the non-Roman
> characters are paired with the Romanization. But there are all sorts of
> variations out there if you start looking at cataloging in the wild.
> 
>  
> 
> 245/880 00 西遊記= ǂb Journey to the west / ǂc 中国电视剧制作中心...
> 
> 245 00 Xi you ji = ǂb Journey to the west/ ǂc Zhong guo dian shi ju zhi
> zuo zhong xin...
> 
> 246 31 Journey to the west
> 
>  
> 
> 245/880 00 西遊記/ ǂc 中国电视剧制作中心...
> 
> 245 00 Xi you ji / ǂc Zhong guo dian shi ju zhi zuo zhong xin...
> 
> 246 11 Journey to the west
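
To make point 1 above concrete, here is a minimal Python sketch of reading a 245 so that the ISBD mark in front of each subfield decides its role. This is not how the BF2 converter works. The function name parse_245, the ROLE_BY_MARK table, and the use of "ǂ" as the subfield marker in a flat string are assumptions made only for illustration. The sketch also checks for a mark that was mistakenly keyed after the subfield marker.

```python
import re

# ISBD marks that precede a subfield and signal its role (assumed subset).
ROLE_BY_MARK = {
    "=": "parallel title",
    ":": "other title information",
    "/": "statement of responsibility",
}

def parse_245(field):
    """Split a 245 display string on 'ǂx' markers and label each part."""
    parts = re.split(r"ǂ([a-z0-9])", field)
    # parts is [text_a, code, text, code, text, ...]; the first chunk is $a.
    subfields = [("a", parts[0])] + list(zip(parts[1::2], parts[2::2]))
    result = []
    mark = None  # ISBD mark carried over from the end of the previous subfield
    for code, text in subfields:
        text = text.strip()
        # Mark mistakenly placed after the subfield marker instead of before it.
        if code != "a" and mark is None and text[:1] in ROLE_BY_MARK:
            mark, text = text[0], text[1:].strip()
        role = "title proper" if code == "a" else ROLE_BY_MARK.get(mark, "unknown")
        # A trailing mark describes the NEXT subfield, so carry it forward.
        next_mark = None
        if text[-1:] in ROLE_BY_MARK:
            next_mark, text = text[-1], text[:-1].strip()
        result.append((code, role, text))
        mark = next_mark
    return result

for row in parse_245("西遊記= ǂb Journey to the west / ǂc 中国电视剧制作中心"):
    print(row)
```

With this reading, the "=" makes ǂb a parallel title and a ":" would make it other title information, which is the distinction that is lost if punctuation is ignored. Real records will need more cases than this handles.
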
> 
>  
> 
> Kelley
> 
>  
> 
> *From:* Bibliographic Framework Transition Initiative Forum
> [mailto:[log in to unmask]] *On Behalf Of *Xu, Amanda
> *Sent:* Monday, January 08, 2018 3:03 PM
> *To:* [log in to unmask]
> *Subject:* Re: [BIBFRAME] CC:AAM Statement in Support of the
> Internationalization of BIBFRAME
> 
>  
> 
> Hi Kelley,
> 
>  
> 
> Thank you so much for your comments on the example that I provided to
> Karen.  My example (西遊記 = Journey to the west) is based on what can
> be converted to BIBFRAME work title using MARC2BIBFRAME2 converter
> available from  https://github.com/lcnetdev/marc2bibframe2. 
> 
>  
> 
> To show two parallel titles in BIBFRAME Instance, we need to add two
> MARC 246 fields as the following:
> 
>  
> 
>  
> 
> This will bring us to the following BIBFRAME Instance Title with mock-up
> of BCP47 language codes:
> 
>  
> 
> bf:title [ a bf:ParallelTitle,
> 
>                 bf:Title,
> 
>                 bf:VariantTitle;
> 
>             rdfs:label "西遊記"@zh-cmn-hant;
> 
>             bflc:titleSortKey "西遊記";
> 
>             bf:mainTitle "西遊記"@zh-cmn-hant ],
> 
>         [ a bf:Title;
> 
>             rdfs:label "西遊記 = Journey to the west /"@zh-cmn-hant;
> 
>             bflc:titleSortKey "西遊記 = Journey to the west /";
> 
>             bf:mainTitle "西遊記"@zh-cmn-hant;
> 
>             bf:subtitle "Journey to the west"@en-us ],
> 
>         [ a bf:ParallelTitle,
> 
>                 bf:Title,
> 
>                 bf:VariantTitle;
> 
>             rdfs:label "Journey to the west"@en;
> 
>             bflc:titleSortKey "Journey to the west";
> 
>             bf:mainTitle "Journey to the west"@en-us ],
> 
>         [ a bf:Title;
> 
>             rdfs:label "Xi you ji /"@pinyin;
> 
>             bflc:titleSortKey "Xi you ji /";
> 
>             bf:mainTitle "Xi you ji"@pinyin ];
> 
>  
> 
> The conversion specs only cover the conversion of MARC fields and
> subfields, not symbols like the equals sign “=”.  Parallel titles are only
> recognized through the conversion of MARC 246 2nd indicator “1” and are
> designated as BIBFRAME Instance titles.  For more information, please
> check conversion specifications – Fields 200-24X, except 240 – Titles – R1,
> available from http://www.loc.gov/bibframe/mtbf/
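
As a rough illustration of the rule described above (parallel titles recognized only through 246 second indicator "1"), here is a small Python sketch. The indicator labels are the MARC 21 values for 246. The function title_classes_for_246 is hypothetical, and treating every other indicator value as a plain bf:VariantTitle is my assumption, not something taken from the conversion specifications.

```python
# MARC 21 labels for the 246 second indicator (type of title).
IND2_LABELS = {
    " ": "No type specified",
    "0": "Portion of title",
    "1": "Parallel title",
    "2": "Distinctive title",
    "3": "Other title",
    "4": "Cover title",
    "5": "Added title page title",
    "6": "Caption title",
    "7": "Running title",
    "8": "Spine title",
}

def title_classes_for_246(ind2):
    """Return the BIBFRAME title classes this sketch would assign to a 246."""
    # Assumption: every 246 becomes a bf:Title that is also a bf:VariantTitle.
    classes = ["bf:Title", "bf:VariantTitle"]
    # Only second indicator "1" is recognized as a parallel title.
    if ind2 == "1":
        classes.insert(0, "bf:ParallelTitle")
    return classes

for ind2 in ("1", "3"):
    print(ind2, IND2_LABELS[ind2], title_classes_for_246(ind2))
```

Note that nothing here looks at the "=" in the 245, so a parallel title that appears only in 245 ǂb would still come out as a subtitle.
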
> 
>  
> 
> If this is not right, we need to improve the conversion specifications
> and converters for multi-script records.
> 
>  
> 
> Amanda   
> 
>  
> 
>  
> 
>> Amanda’s example (西遊記 = Journey to the west) actually shows two
>> parallel titles and not a title and subtitle. I’m not sure there’s any
>> use case for putting them in the same sort key and they might be better
>> treated as two separate titles. If you want to retain an ISBD display
>> reflecting the title page, they might be better modeled as equivalent
>> titles appearing sequentially rather than as one thing.
> 
>  
> 
> -----Original Message-----
> From: Bibliographic Framework Transition Initiative Forum
> [mailto:[log in to unmask]] On Behalf Of Kelley McGrath
> Sent: Sunday, January 07, 2018 4:59 PM
> To: [log in to unmask] <mailto:[log in to unmask]>
> Subject: Re: [BIBFRAME] CC:AAM Statement in Support of the
> Internationalization of BIBFRAME
> 
>  
> 
> I welcome support for more complex language coding. Language coding for
> audiovisual resources, such as video and musical recordings, is often
> more complicated than it is for textual resources. Current MARC coding
> isn’t capable of sufficiently expressing this in machine-actionable
> form. I would also like to see more systems support ISO 639-3, which
> provides better support for spoken language. Chinese is the most
> egregious example, but there are other cases where this would be
> useful.  I know some people want to record whether Chinese subtitles are
> in simplified or traditional characters (or both).
> 
>  
> 
> For audiovisual materials there is not a reliable link between the
> language of the resource and the language of transcribed titles or other
> text taken from the manifestation. There’s no reason to think users need
> to understand the words to a song to enjoy listening to it. Video
> records often have many language options and are inconsistent in whether
> they take the title in the original language as the title proper or the
> title from an external source in the language of whatever country the
> video is being marketed to. The Iranian film “The Circle,” where a
> common North American DVD version is in Persian with English subtitles,
> but gives the title and some of the other credits on-screen in Italian,
> is a more extreme example of disconnection (oclc# 48590961; it was an
> Italian co-production).
> 
>  
> 
> Amanda’s example (西遊記 = Journey to the west) actually shows two
> parallel titles and not a title and subtitle. I’m not sure there’s any
> use case for putting them in the same sort key and they might be better
> treated as two separate titles. If you want to retain an ISBD display
> reflecting the title page, they might be better modeled as equivalent
> titles appearing sequentially rather than as one thing.
> 
>  
> 
> There are occasional examples of titles that mix languages (“Le francais
> pour moi : learning a second language”), but I wonder how often the
> title as a whole doesn’t have a language-specific intended audience.
> 
>  
> 
> Joe said that for non-transcribed data, like notes, that is given in the
> language of the catalog, recording the language is trivial. If we are
> creating statements not records, don’t we need this info at the
> statement level? Many notes are formulaic and would be better modeled as
> structured data that can more easily be transformed into a note in the
> language of the user’s choice for display (e.g., instead of writing
> “Includes bibliographical references (pages 157-160)” we should record
> some controlled data representing bibliographical references combined
> with start and end number and type of pagination and let the computer
> put it together). However, many notes are unique and can only be
> expressed as text. Presumably, we only want to show the user the notes
> they can read. And do we need some way to link various language versions
> of the same note (at least until Google Translate et al. are good enough)?
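
The structured-note idea above can be sketched in a few lines of Python. Everything named here (the BibliographyNote class, the TEMPLATES and UNITS tables, the render function) is hypothetical, and the French wording is only an illustrative template, not an agreed standard phrase list.

```python
from dataclasses import dataclass

@dataclass
class BibliographyNote:
    """Controlled data standing in for a formulaic note."""
    start: int
    end: int
    pagination: str  # type of pagination, e.g. "page" or "leaf"

# Display templates per language of the user (illustrative only).
TEMPLATES = {
    "en": "Includes bibliographical references ({unit} {start}-{end})",
    "fr": "Comprend des références bibliographiques ({unit} {start}-{end})",
}
UNITS = {
    "en": {"page": "pages", "leaf": "leaves"},
    "fr": {"page": "pages", "leaf": "feuillets"},
}

def render(note, lang):
    """Build the display note in the requested language from the stored data."""
    unit = UNITS[lang][note.pagination]
    return TEMPLATES[lang].format(unit=unit, start=note.start, end=note.end)

note = BibliographyNote(start=157, end=160, pagination="page")
print(render(note, "en"))
print(render(note, "fr"))
```

Unique free-text notes would not fit this pattern and would still need a language tag on the text itself.
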
> 
>  
> 
> Marking language, script etc. of bibliographic information could be
> useful to select data for display in a user’s preferred language for
> multi-lingual manifestations, for work info and for info about
> expressions that aren’t language specific (e.g., music). For monolingual
> resources and language-specific expressions, you might want the language
> of the resource or perhaps the language of the catalog.
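
One way to picture the display use case above: given several language-tagged titles, pick the one that best matches the user's preferences. This is a minimal Python sketch, not a proposal for any system. The function pick_label is hypothetical, the prefix comparison is a simplified stand-in for proper language-range matching, and the tags are copied from Amanda's mock-up.

```python
def pick_label(literals, preferred):
    """Return the first text whose tag matches a preferred language.

    literals:  list of (text, language_tag) pairs
    preferred: language tags in order of preference
    """
    for want in preferred:
        want = want.lower()
        for text, tag in literals:
            tag = tag.lower()
            # Match the exact tag, or a longer tag that starts with the wanted prefix.
            if tag == want or tag.startswith(want + "-"):
                return text
    # Nothing matched: fall back to the first literal, if there is one.
    return literals[0][0] if literals else None

titles = [
    ("西遊記", "zh-cmn-hant"),
    ("Journey to the west", "en-us"),
    ("Xi you ji", "pinyin"),
]
print(pick_label(titles, ["en"]))
print(pick_label(titles, ["fi", "zh"]))
```

This only works if the tag describes the language of the string, which is exactly what is in doubt for transcribed titles on audiovisual materials.
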
> 
>  
> 
> Is it useful to have notes in the language of the catalog for
> consistency in searching and in the language of the resource for display?
> 
>  
> 
> Kelley
> 
>  
> 
> -----Original Message-----
> 
> From: Bibliographic Framework Transition Initiative Forum
> [mailto:[log in to unmask]] On Behalf Of Karen Coyle
> 
> Sent: Saturday, January 06, 2018 10:45 AM
> 
> To: [log in to unmask] <mailto:[log in to unmask]>
> 
> Subject: Re: [BIBFRAME] CC:AAM Statement in Support of the
> Internationalization of BIBFRAME
> 
>  
> 
> Thanks, Amanda. This is a good example (your BCP47 example) of the
> complexity.
> 
>  
> 
>         [ a bf:Title;
> 
>  
> 
>             rdfs:label "西遊記 = Journey to the west"@zh-cmn-hant;
> 
>  
> 
>             bflc:titleSortKey "西遊記 = Journey to the west";
> 
>  
> 
>             bf:mainTitle "西遊記"@zh-cmn-hant;
> 
>  
> 
>             bf:subtitle "Journey to the west"@en-us ] .
> 
>  
> 
> shows that mixed language strings (rdfs:label, bflc:titleSortKey)
> present problems. (One has a language tag, the other does not -
> 
> intentional?) I would assume that BCP47 would mainly be used by
> specialist libraries or for special collections.
> 
>  
> 
> Perhaps I should offer my questions in list form, and we can tick them off?
> 
>  
> 
> - 2-char or 3-char codes, or BCP47? Which are used, and under what
> circumstances?
> 
> - mixed language strings - which language are they given, if any, and why?
> 
> - what is the relationship between language of title and language of the
> text of the resource, if any? (cf. "Quo Vadis?"[1])
> 
> - who is responsible for the rules that govern decisions? RDA group?
> 
> PCC? LoC BIBFRAME?
> 
> - does this affect display in any way?
> 
> - does this affect indexing in any way?
> 
> - does this affect search in any way?
> 
>  
> 
> I think that's it, along with the obvious need for use cases.
> 
>  
> 
> kc
> 
> [1]
> 
> http://www.worldcat.org/search?qt=worldcat_org_bks&q=quo+vadis&fq=dt%3Abks
> 
>  
> 
> (Also, as a note, one can only see the full OCLC record when accessing
> from a member organization. The rest of us see only a fairly reduced
> record and never the MARC fields. In addition, given that RDA is behind
> a paywall, that also isn't available to those outside of a subscribing
> institution. This affects not only us renegade retirees, but also many
> librarians in libraries who cannot afford these services. This
> "have/have not" is not, IMO, good for the library world in general.
> 
> Silos may be necessary, but they always create a barrier.)
> 
>  
> 
> On 1/5/18 3:23 PM, Xu, Amanda wrote:
> 
>> Hi Karen,
> 
>>
> 
>>  
> 
>>
> 
>> Thank you so much for these wonderful questions.  According to W3C
> 
>> Recommendation 25 February 2014, RDF 1.1 Concepts and Abstract Syntax,
> 
>> a literal in an RDF graph consists of two or three elements.  If the
> 
>> third element is present, a literal is a language-tagged string.
> 
>> Lexical representations of language tags may be converted to lower case.
> 
>> The value space of language tags is always in lower case.  The
> 
>> language tag must be well-formed according to section 2.2.9 of BCP47,
> 
>> available from https://tools.ietf.org/html/bcp47.  You can find the
> 
>> language codes from
> 
>> https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry
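
For anyone who wants to try the lower-casing and well-formedness rules mentioned above, here is a small Python sketch. The pattern is a simplified version of the BCP47 (RFC 5646) langtag grammar: extensions, private-use subtags and grandfathered tags are deliberately left out, and the names normalize_tag and parse_tag are made up for this example. Well-formed is weaker than valid, since validity also needs a lookup in the IANA registry, which this sketch does not do.

```python
import re

# Simplified well-formedness pattern after the BCP47 (RFC 5646) langtag grammar.
# Assumption: extensions, private-use subtags and grandfathered tags are omitted.
_LANGTAG = re.compile(
    r"""^
    (?P<language>[a-z]{2,3}(?:-[a-z]{3}){0,3}|[a-z]{4}|[a-z]{5,8})
    (?:-(?P<script>[a-z]{4}))?
    (?:-(?P<region>[a-z]{2}|[0-9]{3}))?
    (?P<variants>(?:-(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3}))*)
    $""",
    re.VERBOSE,
)

def normalize_tag(tag):
    """Lower-case the tag, matching the lower-case value space in RDF 1.1."""
    return tag.strip().lower()

def parse_tag(tag):
    """Return the subtags of a well-formed tag, or None if it is not well-formed."""
    match = _LANGTAG.match(normalize_tag(tag))
    if match is None:
        return None
    parts = match.groupdict()
    # Turn "-pinyin-xyz12" into ["pinyin", "xyz12"]; an empty string gives [].
    parts["variants"] = [v for v in parts["variants"].split("-") if v]
    return parts

for tag in ("zh-cmn-Hant", "en-US", "zh-Latn-pinyin", "pinyin", "zh_cmn"):
    print(tag, parse_tag(tag))
```

One thing this brings out: "pinyin" on its own passes the syntax check only because it has the shape of a language subtag. In the IANA registry "pinyin" is listed as a variant subtag (with prefixes such as zh-Latn), so a tag like zh-Latn-pinyin may be closer to what the mock-ups intend.
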
> 
>>
> 
>>
> 
>>  
> 
>>
> 
>> For compliance, we may consider the adoption of BCP47 language tags.
> 
>> However, I agree with you that we must build good use cases for the
> 
>> coding change given the complexity of our data.  I also agree with Joe
> 
>> Kiegel that in native BIBFRAME and a good user interface, assigning
> 
>> language tags may not be difficult or time consuming.  In addition, I
> 
>> am hoping that the next version of MARC2BIBFRAME converter can handle
> 
>> multiscript record conversion better with the use of BCP47 language
> 
>> tags if an agreement can be reached by PCC or some such group.
> 
>>
> 
>>  
> 
>>
> 
>> One experiment that I did might be the starting point for us to
> 
>> collect sample data for use case development.  You can check
> OCLC#122820377 .
> 
>> It is not an RDA record and relator codes are missing.  But we may list
> 
>> it as an example for a multiscript record.   The transcribed title and
> 
>> subtitle are in different language scripts.  Author/title groups,
> 
>> personal names, TOC, etc. are in different language scripts.
> 
>>
> 
>>  
> 
>>
> 
>> _Paired field for MARC 245 title field in OCLC_:
> 
>>
> 
>>  
> 
>>
> 
>>  
> 
>> [embedded image: image001.png]
> 
>>
> 
>>  
> 
>>
> 
>> _Titles with language tags using BCP47 in BIBFRAME description_:
> 
>>
> 
>>  
> 
>>
> 
>> <http://example.org/ocn122820377#Work> a bf:MovingImage,
> 
>>
> 
>>         bf:Work;
> 
>>
> 
>>     rdfs:label "Xi you ji"@pinyin;
> 
>>
> 
>>  
> 
>>
> 
>> bf:contribution
> 
>>
> 
>> [ a bf:Contribution;
> 
>>
> 
>>                                 bf:agent
> 
>> <http://example.org/ocn122820377#Agent700-31>;
> 
>>
> 
>>                                 bf:role
> 
>> <http://id.loc.gov/vocabulary/relators/ctb> ],
> 
>>
> 
>> [ a bf:Contribution;
> 
>>
> 
>>                                bf:agent
> 
>> <http://example.org/ocn122820377#Agent880-44>;
> 
>>
> 
>>                                bf:role
> 
>> <http://id.loc.gov/vocabulary/relators/ctb> ];
> 
>>
> 
>>  
> 
>>
> 
>> bf:title [ a bf:Title;
> 
>>
> 
>>             rdfs:label "Xi you ji"@pinyin;
> 
>>
> 
>>             bflc:titleSortKey "Xi you ji";
> 
>>
> 
>>             bf:mainTitle "Xi you ji"@pinyin ],
> 
>>
> 
>>         [ a bf:Title;
> 
>>
> 
>>             rdfs:label "西遊記 = Journey to the west"@zh-cmn-hant;
> 
>>
> 
>>             bflc:titleSortKey "西遊記 = Journey to the west";
> 
>>
> 
>>             bf:mainTitle "西遊記"@zh-cmn-hant;
> 
>>
> 
>>             bf:subtitle "Journey to the west"@en-us ] .
> 
>>
> 
>>  
> 
>>
> 
>>  
> 
>>
> 
>> <http://example.org/ocn122820377#Agent700-31> a bf:Agent,
> 
>>
> 
>>         bf:Person;
> 
>>
> 
>>     rdfs:label "Yang, Jie"@pinyin;
> 
>>
> 
>>     bflc:name00MarcKey "7001 $6880-04$aYang, Jie";
> 
>>
> 
>>     bflc:name00MatchKey "Yang, Jie" .
> 
>>
> 
>>  
> 
>>
> 
>> <http://example.org/ocn122820377#Agent880-44> a bf:Agent,
> 
>>
> 
>>         bf:Person;
> 
>>
> 
>>     rdfs:label "杨洁"@zh-cmn-hant;
> 
>>
> 
>>     bflc:name00MarcKey "8801 $6700-04/$1$a杨洁";
> 
>>
> 
>>     bflc:name00MatchKey "杨洁" .
> 
>>
> 
>>  
> 
>>
> 
>> Attached is the entire record in .ttl format.  Thanks a lot!
> 
>>
> 
>>  
> 
>>
> 
>> Amanda
> 
>>
> 
>>  
> 
>>
> 
>>  
> 
>>
> 
>> ---
> 
>>
> 
>> Amanda Xu
> 
>>
> 
>> Metadata Analyst Librarian
> 
>>
> 
>> Cataloging and Metadata Department
> 
>>
> 
>> University of Iowa Libraries
> 
>>
> 
>> 100 Main Library (LIB)
> 
>>
> 
>> Iowa City, IA 52242-1420
> 
>>
> 
>>  
> 
>>
> 
>>  
> 
>>
> 
>>  
> 
>>
> 
>> -----Original Message-----
> 
>> From: Bibliographic Framework Transition Initiative Forum
> 
>> [mailto:[log in to unmask]] On Behalf Of Karen Coyle
> 
>> Sent: Friday, January 05, 2018 2:24 PM
> 
>> To: [log in to unmask] <mailto:[log in to unmask]>
> 
>> Subject: Re: [BIBFRAME] CC:AAM Statement in Support of the
> 
>> Internationalization of BIBFRAME
> 
>>
> 
>>  
> 
>>
> 
>> On 1/2/18 9:14 AM, Joseph Kiegel wrote:
> 
>>
> 
>>> There will be edge cases that are difficult, but for the vast
> 
>>> majority
> 
>> of strings, the language will be obvious to the cataloger.
> 
>>
> 
>>> 
> 
>>
> 
>>> In native BIBFRAME and a good user interface, assigning language tags
> 
>> will not be difficult or time consuming.  The language of cataloging
> 
>> is known and those fields can be tagged automatically.  Templates can
> 
>> assign tags for catalogers who routinely catalog in a given language.
> 
>> I have experimented with language tags in a test interface and it was
> 
>> not hard.
> 
>>
> 
>>  
> 
>>
> 
>> This is the kind of statement which makes me hunger for more detail.
> 
>> For example, what were the rules for assignment for: transcribed titles?
> 
>>
> 
>> titles with subtitles in different languages? author/title groups (if
> 
>> they exist in BF - I don't remember the structuring of those)?
> 
>> personal names? Are there strings with more than one language and how
> 
>> is that handled? Can a title ever be in a different language than the
> 
>> language of the text when the text is monolingual?
> 
>>
> 
>>  
> 
>>
> 
>> Also, do we have or is anyone developing rules or guidelines for
> 
>> cataloging decisions regarding language tagging of individual strings?
> 
>>
> 
>> (This would seem to fall to PCC or some such group?)  Is this covered
> 
>> in RDA anywhere? What standard are we using? ISO 639-1, -2, or BCP 47?
> 
>>
> 
>>  
> 
>>
> 
>> But above all I have yet to read anything that addresses the use cases
> 
>> where such encoding facilitates or is essential for user services. We
> 
>> have long had the separation of subject access by language (O Canada!),
> 
>> and the selection of language materials by language. But I haven't
> 
>> seen a non-speculative, practical use for language tagging of strings.
> 
>> I realize that language tagging of strings is coming to us from RDF,
> 
>> and is somewhat new, and may in the future be obligatory, but I still
> 
>> think we need use cases before undertaking coding so that said coding
> 
>> will provide the desired outcomes, given the complexity of our data.
> 
>>
> 
>>  
> 
>>
> 
>> Perhaps what this amounts to is a knowledge gap between the BF
> 
>> practitioners and those of us who are on the sidelines. If so, please
> 
>> point us to the relevant documentation!
> 
>>
> 
>>  
> 
>>
> 
>> thanks,
> 
>>
> 
>> kc
> 
>>
> 
>>  
> 
>>
> 
>>> 
> 
>>
> 
>>> 
> 
>>
> 
>>> 
> 
>>
> 
>>> -----Original Message-----
> 
>>
> 
>>> From: Bibliographic Framework Transition Initiative Forum
> 
>>
> 
>>> [mailto:[log in to unmask]] On Behalf Of Karen Coyle
> 
>>
> 
>>> Sent: Friday, December 22, 2017 6:42 AM
> 
>>
> 
>>> To: [log in to unmask] <mailto:[log in to unmask]>
> <mailto:[log in to unmask]>
> 
>>
> 
>>> Subject: Re: [BIBFRAME] CC:AAM Statement in Support of the
> 
>>
> 
>>> Internationalization of BIBFRAME
> 
>>
> 
>>> 
> 
>>
> 
>>> Osma, I took all of those examples of 1984 from LoC's catalog. While
> 
>> Wikidata may think they have different titles, we don't know how that
> 
>> decision was made (there are no cataloging rules for Wikidata). In no
> 
>> case have I seen "Nineteen Eighty-Four" for the English version
> 
>> (although it was filed that way in card catalogs as per the ALA Filing
> 
>> Rules). Your examples all conveniently prove your point, but I still
> 
>> think that asking catalogers to determine the language of every field
> 
>> is going to create difficulties. It would be a good idea to take a
> 
>> sampling of records and try this out. From the cataloger's point of view.
> 
>>
> 
>>> 
> 
>>
> 
>>> kc
> 
>>
> 
>>> 
> 
>>
> 
>>> On 12/21/17 7:44 AM, Osma Suominen wrote:
> 
>>
> 
>>>>> However, there is a big problem with trying to attribute
> 
>>
> 
>>>>> *language* to fields in bibliographic data. It only takes a few
> 
>>
> 
>>>>> examples to understand why:
> 
>>
> 
>>>>> 
> 
>>
> 
>>>>> Title:
> 
>>
> 
>>>>> 1984 (book in German)
> 
>>
> 
>>>>> 1984 (book in Hebrew)
> 
>>
> 
>>>>> 1984 (book in English)
> 
>>
> 
>>>> 
> 
>>
> 
>>>> I don't think that's a problem at all. In fact this is a great
> 
>>
> 
>>>> example, since the name of Orwell's novel (assuming you meant it)
> 
>>
> 
>>>> actually differs between many languages. According to Wikidata
> 
>>
> 
>>>> (http://www.wikidata.org/entity/Q208460) it is called
> 
>>
> 
>>>> 
> 
>>
> 
>>>> "1984" in German
> 
>>
> 
>>>> "1984" in Hebrew (but rendered with right-to-left alignment!)
> 
>>
> 
>>>> "Nineteen Eighty-Four" in English (not 1984!) "Vuonna 1984" in
> 
>>
> 
>>>> Finnish "নাইন্টিন এইটি-ফোর" in Bengali
> 
>>
> 
>>> 
> 
>>
> 
>>> --
> 
>>
> 
>>> Karen Coyle
> 
>>
> 
>>> [log in to unmask] <mailto:[log in to unmask]>
> <mailto:[log in to unmask]> http://kcoyle.net
> 
>>
> 
>>> m: +1-510-435-8234
> 
>>
> 
>>> skype: kcoylenet/+1-510-984-3600
> 
>>
> 
>>> 
> 
>>
> 
>>  
> 
>>
> 
>> --
> 
>>
> 
>> Karen Coyle
> 
>>
> 
>> [log in to unmask] <mailto:[log in to unmask]>
> <mailto:[log in to unmask]> http://kcoyle.net
> 
>>
> 
>> m: +1-510-435-8234
> 
>>
> 
>> skype: kcoylenet/+1-510-984-3600
> 
>>
> 
>  
> 
> --
> 
> Karen Coyle
> 
> [log in to unmask] <mailto:[log in to unmask]> http://kcoyle.net
> 
> m: +1-510-435-8234
> 
> skype: kcoylenet/+1-510-984-3600
> 

-- 
Karen Coyle
[log in to unmask] http://kcoyle.net
m: +1-510-435-8234
skype: kcoylenet/+1-510-984-3600