On 3/10/14 5:56 PM, Ford, Kevin wrote:
> Dear All,
>
> New LC/NACO (Names) and LCSH bulk downloads are available.
Wonderful! Thanks for generating a new set! I'm continuing to use
these files for The Online Books Page, for subject and name validation,
and for the Forward to Libraries service between the OLBP, Wikipedia,
and various other libraries (currently supporting a repertoire of just
over 500 in 17 countries).
Regarding data issues, I'm still seeing a few rogue triples in the
SKOS N-Triples file for LCSH. In particular, a few subjects have
multiple preferred headings, apparently mostly because of typos in
either the heading or the ID. It's a very small set (small enough
to list below), and I can easily override the bad ones, but in case
you find it useful to know, here they are:
"Ese language" is given as a preferred form for both
ID sh2007006292 (correct)
and ID sh2008006292 (actually "Alice Holt Forest (England)")
"Europe--Maps" is given as a preferred form for both
ID sh2008114958 (correct)
and ID sh2008117958 (actually "Geology--Pacific Ocean")
"Commercial law--Germany (West)" is given as a preferred form for both
ID sh2009120790 (correct)
and ID sh2008120790 (actually "Folk poetry, Turkish--History and
criticism")
"Actors--Correspondence" is given as a preferred form for both
ID sh2009113523 (correct)
and ID sh2009113423 (actually "Administrative discretion--Germany
(West)")
sh200912788 is given the correct preferred label
"Jews, Yemeni--Israel--Biography"
and also the incorrect (misspelled) preferred label
"Jews, Yemini--Israel--Biography"
"Women's studies" is assigned as a preferred form for both
ID sh85147771 (correct)
and ID sh2010000020 (actually "Women's studies--Awards")
sh2010000090 is given the correct preferred label
"Iyo language (Papua New Guinea)"
and also the incorrect (misspelled) preferred label
"Iyo language (Papau New Guinea)"
"Woodyard, Lee McKinney (Fictitious character)" is assigned as a
preferred form for both
ID sh2013000335 (correct)
and ID sh2012000335 (actually "Game reserves--North Dakota")
sh85019484 is given the correct preferred label
"Canberra (Military aircraft)"
and also the incorrect (obsolete) preferred label
"Canberra (Bomber)"
sh85032000 is given the correct preferred label
"Cooking (Duck)"
and also the incorrect (obsolete) preferred label
"Cookery (Ducks)"
sh98005660 is given the correct preferred label
"HIV-positive men--United States"
and also the incorrect (misspelled) preferred label
"HIV-postive men--United States"
sh99001551 is given the correct preferred label
"Films for French, [Spanish, etc.] speakers"
and also the incorrect (missing comma) preferred label
"Films for French [Spanish, etc.] speakers"
But the good data and updates far outweigh these anomalies.
Thanks again for making the new sets available!
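
For anyone who wants to run a similar check on the download, here is a minimal sketch in Python. It is an illustration only, not necessarily how the anomalies above were found. The function name, the sample URIs, and the "@en" language tags are assumptions made for the example. It uses a simple regular expression rather than a full N-Triples parser, ignores language tags when comparing labels, and leaves escape sequences in literals undecoded.

```python
import re
from collections import defaultdict

PREF_LABEL = "http://www.w3.org/2004/02/skos/core#prefLabel"

# Matches one N-Triples line of the form:
#   <subject> <skos:prefLabel> "literal"@lang .
# The language tag or datatype after the literal is optional.
TRIPLE_RE = re.compile(
    r'^<([^>]+)>\s+<' + re.escape(PREF_LABEL) + r'>\s+'
    r'"((?:[^"\\]|\\.)*)"(?:@[A-Za-z0-9-]+|\^\^<[^>]+>)?\s*\.\s*$'
)


def find_pref_label_conflicts(lines):
    """Return (subjects with more than one prefLabel,
    prefLabels shared by more than one subject)."""
    labels_by_subject = defaultdict(set)
    subjects_by_label = defaultdict(set)
    for line in lines:
        match = TRIPLE_RE.match(line)
        if match is None:
            continue  # not a prefLabel triple; skip it
        subject, label = match.groups()
        labels_by_subject[subject].add(label)
        subjects_by_label[label].add(subject)
    multi_label = {s: sorted(labels)
                   for s, labels in labels_by_subject.items()
                   if len(labels) > 1}
    shared = {label: sorted(subjects)
              for label, subjects in subjects_by_label.items()
              if len(subjects) > 1}
    return multi_label, shared


# Hypothetical sample lines, modeled on two of the cases listed above.
sample = [
    '<http://id.loc.gov/authorities/subjects/sh85019484> '
    '<http://www.w3.org/2004/02/skos/core#prefLabel> '
    '"Canberra (Military aircraft)"@en .',
    '<http://id.loc.gov/authorities/subjects/sh85019484> '
    '<http://www.w3.org/2004/02/skos/core#prefLabel> '
    '"Canberra (Bomber)"@en .',
    '<http://id.loc.gov/authorities/subjects/sh85147771> '
    '<http://www.w3.org/2004/02/skos/core#prefLabel> '
    '"Women\'s studies"@en .',
    '<http://id.loc.gov/authorities/subjects/sh2010000020> '
    '<http://www.w3.org/2004/02/skos/core#prefLabel> '
    '"Women\'s studies"@en .',
]

multi_label, shared = find_pref_label_conflicts(sample)
for subject, labels in multi_label.items():
    print(subject, "has multiple preferred labels:", labels)
for label, subjects in shared.items():
    print(repr(label), "is preferred for multiple IDs:", subjects)
```

For the full file, an open file handle can be passed in place of the sample list, since the function only iterates over lines. A mistyped ID will typically show up in both reports, while a misspelled or obsolete label shows up only in the first.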
John