I think it is up to the implementors how to create a unique identifier for a work (similar to RDA 6.6).

In a good implementation, a librarian can search by title, author, and first publication date and get the work that matches the query. If more than one work matches, the user interface can let the librarian select the correct one. The assumption that ranking algorithms can help is right; that is why I use search engine technology for implementing Linked Data catalogs.

Because the title, author, and date of a work are embedded in a local cultural context (language, personal name rules, calendar, etc.), it is a challenge to define a generic rule for building globally unique identifiers for works. It would be a great help to declare the cultural context together with the catalog data, so that others can decode the data by applying the rules of that context.
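To make the idea of a declared context a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration only: the `Context` fields, the two name-order values, and the choice of a SHA-1 hash over normalized title / author / year are not taken from RDA, Bibframe, or any existing catalog. Note also that stripping diacritics is lossy and would need to follow the rules of the declared language in a real system.

```python
import hashlib
import unicodedata
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Hypothetical context declaration shipped alongside the catalog data."""
    language: str    # e.g. "de"; carried along but not used in this sketch
    name_order: str  # "family-first" ("Goethe, Johann") or "given-first"
    calendar: str    # only "gregorian" is handled in this sketch


def normalize(text: str) -> str:
    # Drop diacritics, case, and punctuation so trivial variants collapse.
    decomposed = unicodedata.normalize("NFKD", text)
    kept = [c for c in decomposed if not unicodedata.combining(c)]
    cleaned = "".join(c if c.isalnum() else " " for c in kept)
    return " ".join(cleaned.casefold().split())


def author_key(name: str, ctx: Context) -> str:
    # Use the declared name rule to bring every name into family-first order.
    if ctx.name_order == "family-first":
        family, _, given = name.partition(",")
    else:
        given, _, family = name.rpartition(" ")
    return normalize(f"{family} {given}")


def work_key(title: str, author: str, year: int, ctx: Context) -> str:
    # Assumed rule: hash of normalized title, author, and first publication year.
    if ctx.calendar != "gregorian":
        raise ValueError("this sketch only handles Gregorian years")
    raw = "|".join([normalize(title), author_key(author, ctx), str(year)])
    return hashlib.sha1(raw.encode("utf-8")).hexdigest()


# Two catalogs describe the same work under different local conventions.
ctx_a = Context(language="de", name_order="family-first", calendar="gregorian")
ctx_b = Context(language="en", name_order="given-first", calendar="gregorian")
key_a = work_key("Faust. Eine Tragödie", "Goethe, Johann Wolfgang von", 1808, ctx_a)
key_b = work_key("FAUST - eine Tragodie", "Johann Wolfgang von Goethe", 1808, ctx_b)
```

In this example both catalogs end up with the same key, because each one declares how its names are written and the decoder applies that rule before hashing.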

As always, duplicates in a catalog must be avoided at all costs. That means a good implementation must prevent human entry errors. Otherwise, the quality of the catalog decreases.

My interest is in examining how Linked Data (Bibframe) catalogs can be merged into a union catalog. Such merge algorithms are simpler than they were before RDA, provided the rules for creating unique identifiers are transparent to the software developer. If the unique identifiers in the source catalogs are opaque, they must instead be re-created from the transported data by a special algorithm that automatically avoids duplicates within the scope of the new catalog.
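As a rough illustration of the opaque-identifier case, the following Python sketch re-creates a key from the transported data and uses it to collapse duplicates while merging. The record layout (`id`, `title`, `author`, `year`) and the `rebuild_key` rule are hypothetical; a real merge would need a much more careful key than case and whitespace folding.

```python
def rebuild_key(record: dict) -> tuple:
    # Hypothetical rule: ignore the opaque source id and derive the key
    # from the transported title / author / year, folded for case and spacing.
    return tuple(
        " ".join(str(record[field]).casefold().split())
        for field in ("title", "author", "year")
    )


def merge_catalogs(*catalogs: list) -> dict:
    # Build the union catalog; records with the same rebuilt key become one work.
    union = {}
    for catalog in catalogs:
        for record in catalog:
            key = rebuild_key(record)
            work = union.setdefault(
                key,
                {
                    "title": record["title"],
                    "author": record["author"],
                    "year": record["year"],
                    "source_ids": [],
                },
            )
            # Keep the opaque source ids so the origin stays traceable.
            work["source_ids"].append(record["id"])
    return union


catalog_a = [
    {"id": "a-17", "title": "Faust", "author": "Goethe", "year": 1808},
    {"id": "a-18", "title": "Effi Briest", "author": "Fontane", "year": 1895},
]
catalog_b = [
    {"id": "x9f3", "title": "FAUST ", "author": "goethe", "year": 1808},
]
union = merge_catalogs(catalog_a, catalog_b)
```

Here the union catalog holds two works, and the entry for Faust lists both source identifiers, so the duplicate is avoided without losing the link back to either source catalog.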


On Tue, Jun 30, 2015 at 6:50 PM, Karen Coyle <[log in to unmask]> wrote:
Surely there will be many work duplicates in the vast world of bibliographic data, just like there are duplicate records for manifestations today. How much effort should go in to preventing this duplication?