Kelley,

I had some thoughts about some of your questions, which I hope will be
helpful.  You've done a great job of articulating these issues, by the way!

On Sun, Jan 8, 2012 at 7:51 PM, Kelley McGrath <[log in to unmask]> wrote:

>
> I'd like to organize my thoughts around some issues that came up when OLAC
> was doing our initial investigations into the potential of the FRBR model
> to improve access to moving images (see
> http://www.olacinc.org/drupal/?q=node/27, particularly part
> 3a). There we were talking about works, but I'd like to work through an
> example using language track information on DVDs.
>
> It's easy to see how to construct a statement that says this DVD is usable
> in English
>
> DVD1 -- hasLanguage -- English
>
> But during our discussions, we realized that we wanted to record several
> more specific aspects of language information, including whether the
> language is
>
> Spoken, signed or written
> Within written whether it is captions (open, closed, SDH), subtitles, or
> intertitles
> The original language or a translation
> Primary or secondary
>
>
These seem like important considerations, well worth including. But your
question is how to do that.



> Primary vs. secondary might seem like an odd thing to want to know, but in
> practice, you can go wrong if you don't make this distinction. IMDb often
> fails on this count, which leads to a list of the most popular Thai
> language films being topped by The Hangover Part II (2011) and Rambo (2008)
> (see http://www.imdb.com/language/th)
> and The Godfather (http://www.imdb.com/title/tt0068646/)
> is listed as if it is equally in English, Italian and Latin. You also see
> this lack of distinction in library bibliographic records, especially for
> educational/documentary videos with a few subtitled sequences in another
> language.
>
> So maybe one way to go at this would be to combine all these
> characteristics into one mega predicate
>
> DVD1 -- hasLanguagePrimaryAudio -- English
>
> And then map that to the less restrictive cases so
>
> hasLanguagePrimaryAudio -- isSubTypeOf -- hasLanguagePrimary
> hasLanguagePrimaryAudio -- isSubTypeOf -- hasLanguageAudio
> hasLanguagePrimary -- isSubTypeOf -- hasLanguage
> hasLanguageAudio -- isSubTypeOf -- hasLanguage
>

Are you speaking about property relationships here, or relationships
between concepts used as part of a descriptive vocabulary (which is what
your subType relationships sound like to me)?  If you take a look at some
of the RDA relationships in the OMR (for example,
http://metadataregistry.org/schemapropel/list/schema_property_id/422.html,
where you can see all the subproperties for adaptationOfWork), there is a
more general property [basedOnWork] and more specific properties
[novelizationOfWork].
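
If the answer is property relationships, the mapping can be sketched in a few lines. This is only an illustration: the property names are the ones from your example, and the hand-written lookup below stands in for whatever sub-property handling a real RDF toolkit would provide.

```python
# Property names come from Kelley's example; this hand-rolled lookup is an
# assumption for illustration, not the behavior of any particular RDF library.
SUB_PROPERTY_OF = {
    "hasLanguagePrimaryAudio": {"hasLanguagePrimary", "hasLanguageAudio"},
    "hasLanguagePrimary": {"hasLanguage"},
    "hasLanguageAudio": {"hasLanguage"},
}


def super_properties(prop):
    """Return every property that `prop` refines, directly or indirectly."""
    found = set()
    stack = [prop]
    while stack:
        for parent in SUB_PROPERTY_OF.get(stack.pop(), ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def infer(triples):
    """Add the more general statements implied by each specific one."""
    inferred = set(triples)
    for s, p, o in triples:
        for parent in super_properties(p):
            inferred.add((s, parent, o))
    return inferred


data = {("DVD1", "hasLanguagePrimaryAudio", "English")}
closed = infer(data)
```

Someone who only cares about the unrefined level can then ask for `hasLanguage` statements in `closed` without knowing that the more specific property was the one actually recorded.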

In the value vocabularies, the different concepts may have hierarchical
relationships, which are expressed as SKOS broader/narrower relationships.
See http://metadataregistry.org/concept/list/vocabulary_id/99.html for
an example of one of those vocabularies.
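
A rough sketch of the value-vocabulary alternative, using the written-language types you listed. The concept labels are made up for illustration, and the plain dictionary only stands in for skos:broader links; it is not taken from any registered vocabulary.

```python
# Hypothetical concept labels; each entry plays the role of a skos:broader link.
BROADER = {
    "OpenCaption": "Caption",
    "ClosedCaption": "Caption",
    "SDH": "Caption",
    "Caption": "Written",
    "Subtitle": "Written",
    "Intertitle": "Written",
}


def ancestors(concept):
    """Walk broader links from a concept up to the top of its hierarchy."""
    chain = []
    while concept in BROADER:
        concept = BROADER[concept]
        chain.append(concept)
    return chain


def matches(value, wanted):
    """True if `value` is `wanted` or a narrower concept beneath it."""
    return value == wanted or wanted in ancestors(value)
```

Here the hierarchy lives in the values rather than in the properties, so a single LanguageType property could carry "SDH" and a search for "Written" would still find it.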



> so if someone is just looking at the unrefined language level they can get
> that. But it does seem like an awful lot of possibilities to account for.
>
> Maybe another way would be to introduce an intermediate entity between the
> DVD and the language information like this. One advantage is that you could
> distinguish mixed soundtracks from multiple soundtracks as in statements 1
> and 2 in the example below for a DVD with the movie's original mixed Arabic
> and French soundtrack, a dubbed Spanish soundtrack and an English subtitle
> track.
>
> DVD1 hasLanguageStatement LanguageStatement1
> LanguageStatement1 -- Language -- Arabic
> LanguageStatement1 -- Language -- French
> LanguageStatement1 -- LanguageLevel -- Primary
> LanguageStatement1 -- LanguageType -- Audio
> LanguageStatement1 -- LanguageOriginal -- Original
> LanguageStatement1 -- InfoSource -- Container
>
> DVD1 hasLanguageStatement LanguageStatement2
> LanguageStatement2 -- Language -- Spanish
> LanguageStatement2 -- LanguageLevel -- Primary
> LanguageStatement2 -- LanguageType -- Audio
> LanguageStatement2 -- LanguageOriginal -- Translation
> LanguageStatement2 -- InfoSource -- Container
>
>
> DVD1 hasLanguageStatement LanguageStatement3
> LanguageStatement3 -- Language -- English
> LanguageStatement3 -- LanguageLevel -- Primary
> LanguageStatement3 -- LanguageType -- Written
> LanguageStatement3 -- LanguageTypeWritten -- Subtitle
> LanguageStatement3 -- LanguageOriginal -- Translation
> LanguageStatement3 -- InfoSource -- Container
>
> And then you would have to give people who want to use this data some way
> to connect the dots, which I'm not sure how to do.
>
> This approach would also be useful for ordering data. For instance, for
> film and video, the order in which cast names are presented is important,
> as well as the type of ordering. In addition, this could allow you to make
> statements about where the data came from. So you could have something that
> linked transcribed names with identifiers.
>
> Work1 hasCastCredits CastStatement1
>
> CastStatement1 hasSource Manifestation1 [or
> http://www.imdb.com/title/tt0101531/ which is where I
> actually took this from or some other reference source or unspecified for
> legacy data or where someone doesn't want to bother]
> CastStatement1 hasOrder CreditsOrder
>
> CastStatement1 hasCredit CreditStatement1
> CreditStatement1 hasPosition 1
> CreditStatement1 hasTranscribedName "Charlie Sheen"
> CreditStatement1 hasNAR http://id.loc.gov/authorities/names/n88368094 [Sheen, Charlie]
> CreditStatement1 hasFunction http://id.loc.gov/vocabulary/relators/act.html [actor]
> ...
> CastStatement1 hasCredit CreditStatement15
> CreditStatement15 hasPosition 15
> CreditStatement15 hasTranscribedName "Larry Fishburne"
> CreditStatement15 hasNAR http://id.loc.gov/authorities/names/no93030105 [Fishburne, Laurence, 1961-]
> CreditStatement15 hasFunction http://id.loc.gov/vocabulary/relators/act.html [actor]
>
> Of course this is a lot of nesting and you'd have to make it work for data
> consumers who didn't want all that complexity.
>

Your use of the word 'nesting' suggests to me that you're
thinking of this problem more as an XML thing than an RDF thing.
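
To make the contrast concrete, here is a minimal sketch that treats your language statements as a flat list of triples rather than a nested record. The predicate names are yours (with the LanguageLevel and InfoSource lines left out for brevity); the helper functions and the keyword-filter convention are assumptions made up for this example.

```python
# Flat triples taken from the DVD example; nothing is nested inside anything.
triples = [
    ("DVD1", "hasLanguageStatement", "LanguageStatement1"),
    ("LanguageStatement1", "Language", "Arabic"),
    ("LanguageStatement1", "Language", "French"),
    ("LanguageStatement1", "LanguageType", "Audio"),
    ("LanguageStatement1", "LanguageOriginal", "Original"),
    ("DVD1", "hasLanguageStatement", "LanguageStatement2"),
    ("LanguageStatement2", "Language", "Spanish"),
    ("LanguageStatement2", "LanguageType", "Audio"),
    ("LanguageStatement2", "LanguageOriginal", "Translation"),
    ("DVD1", "hasLanguageStatement", "LanguageStatement3"),
    ("LanguageStatement3", "Language", "English"),
    ("LanguageStatement3", "LanguageType", "Written"),
    ("LanguageStatement3", "LanguageOriginal", "Translation"),
]


def objects(subject, predicate):
    """All objects of triples with the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def languages(item, **filters):
    """Languages of `item`, optionally restricted by statement properties."""
    result = []
    for stmt in objects(item, "hasLanguageStatement"):
        if all(value in objects(stmt, prop) for prop, value in filters.items()):
            result.extend(objects(stmt, "Language"))
    return result


everything = languages("DVD1")
original_audio = languages("DVD1", LanguageType="Audio", LanguageOriginal="Original")
```

A consumer who wants only the simple view calls `languages("DVD1")` and never sees the intermediate statements; one who wants the mixed original soundtrack follows the same links with filters. That is one possible way of connecting the dots, not the only one.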


> How would you approach these kinds of problems with a named graph? Or is
> this not something where you'd want a named graph? Is it better not to do
> all this in linked data but rather some format for internal consumption and
> just use the linked data for the simplified data that external users are
> likely to want? Am I hopelessly on the wrong track?
>
>
For the named graph approach, I think you would need to look more carefully
at how your data structures and extensions are built, rather than thinking
of the process of making relationships as 'nesting'.
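
As a very rough sketch of the idea, a named graph can be pictured as a fourth element on each triple, so that statements about source and ordering attach to the graph as a whole. The graph names and helper functions below are invented for illustration, and only a few of the credit statements from your example are included.

```python
# Hypothetical graph names; each tuple is (subject, predicate, object, graph).
META = "graph:meta"
CREDITS = "graph:credits-imdb"

quads = [
    ("CreditStatement15", "hasPosition", 15, CREDITS),
    ("CreditStatement15", "hasTranscribedName", "Larry Fishburne", CREDITS),
    ("CreditStatement1", "hasPosition", 1, CREDITS),
    ("CreditStatement1", "hasTranscribedName", "Charlie Sheen", CREDITS),
    # Statements about the credits graph itself live in a separate graph.
    (CREDITS, "hasSource", "http://www.imdb.com/title/tt0101531/", META),
    (CREDITS, "hasOrder", "CreditsOrder", META),
]


def triples_in(graph_name):
    """The triples that belong to one named graph."""
    return [(s, p, o) for s, p, o, g in quads if g == graph_name]


def credits_in_order(graph_name):
    """Transcribed names from a credits graph, sorted by hasPosition."""
    facts = triples_in(graph_name)
    positions = {s: o for s, p, o in facts if p == "hasPosition"}
    names = {s: o for s, p, o in facts if p == "hasTranscribedName"}
    return [names[s] for s in sorted(positions, key=positions.get)]


def source_of(graph_name):
    """Where a graph's statements came from, if that was recorded."""
    for s, p, o in triples_in(META):
        if s == graph_name and p == "hasSource":
            return o
    return None
```

The source is stated once, about the graph, rather than repeated on every credit, and a consumer who does not care about provenance can simply ignore the fourth element.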

Does that make sense?

Diane


> Kelley
>
>
> Kelley McGrath
> University of Oregon
> [log in to unmask]
>