Subject:

Re: New bulk LCSH export pilot

From:

Steven Michael Folsom <[log in to unmask]>

Reply-To:

LC Linked Data Service Discussion List <[log in to unmask]>

Date:

Thu, 24 Oct 2019 18:33:57 +0000

Content-Type:

text/plain

Not to nag, but do you think this download option for the "smaller" vocabs might happen in the foreseeable future? We're starting to get more and more requests for these in the QA service.

If not, can we forward some of the use cases we're hearing to improve the LC APIs? E.g. folks are asking for display values for http://id.loc.gov/vocabulary/descriptionConventions to include both the label and code. This is something we could do in QA through the label (plus other context information) we provide.


On 10/2/19, 12:17 PM, "Steven Michael Folsom" <[log in to unmask]> wrote:

Yes, exactly. One archive would be great!

On 10/2/19, 11:26 AM, "LC Linked Data Service Discussion List on behalf of Miller, Matthew" <[log in to unmask] on behalf of [log in to unmask]> wrote:

Thanks for the feedback. When you say smaller vocabularies, are you referring to, for example, the vocabularies under the "Cataloging" section on the homepage? http://id.loc.gov/
I can look into whether we can gather up these smaller ones, which are not available on the download page, into one archive available to download.

Thanks,
Matt


-----Original Message-----
From: LC Linked Data Service Discussion List <[log in to unmask]> On Behalf Of Steven Michael Folsom
Sent: Tuesday, October 01, 2019 3:46 PM
To: [log in to unmask]
Subject: Re: [ID.LOC.GOV] New bulk LCSH export pilot

Hi Matt,

Yay to more frequent downloads... now we just need to be ready to act on them locally. :)

The efforts to make the files more legible are helpful too.

Along with more frequent downloads, are you considering adding some/all of the smaller vocabularies as downloads? I don't know whether others need these as downloads, or whether our needs are unique because we're trying to provide normalized lookup services for these datasets (and others) via QA (https://github.com/samvera/questioning_authority).

Thanks for progressing id.loc.gov,
Steven



On 9/18/19, 12:19 PM, "LC Linked Data Service Discussion List on behalf of Matt Miller" <[log in to unmask] on behalf of [log in to unmask]> wrote:

Hello,
We are testing a new bulk export process for LCSH and would like to hear feedback from anyone who uses the bulk downloads. The new bulk files can be found at http://id.loc.gov/download/ under the title LC Subject Headings (LCSH) *NEW Pilot*

New:
- New compacted JSON-LD serialization
- The JSON-LD and XML files are now newline-delimited, meaning each line in the file is a completely self-contained record
- There are VoID files for each download with the date the export was created, title, description and MD5 hash of the unzipped download.
- The N-Triples file now has record separators as comments; each group of triples starts with "# Start of sh12345678"
- Increased update frequency; the files should be updated when new LCSH updates are released monthly.
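
For anyone consuming the pilot files, the layout described in the list above could be read roughly as follows. This is a minimal sketch, not the export's official tooling: the function names are hypothetical, and the exact separator text ("# Start of " followed by the record identifier) is assumed from the example in the announcement rather than checked against the real files.

```python
import hashlib
import io
import json
from typing import Iterable, Iterator, List, Tuple

# Assumption: separator comments look exactly like "# Start of sh12345678".
SEPARATOR_PREFIX = "# Start of "


def iter_json_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed record per non-empty line of a newline-delimited JSON-LD file."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def iter_ntriples_groups(lines: Iterable[str]) -> Iterator[Tuple[str, List[str]]]:
    """Yield (record_id, triples) pairs, splitting on the separator comments.

    Lines that appear before the first separator are ignored.
    """
    record_id = None
    triples: List[str] = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(SEPARATOR_PREFIX):
            if record_id is not None:
                yield record_id, triples
            record_id = line[len(SEPARATOR_PREFIX):].strip()
            triples = []
        elif line.strip() and not line.startswith("#") and record_id is not None:
            triples.append(line)
    if record_id is not None:
        yield record_id, triples


def md5_hex(stream, chunk_size: int = 1 << 20) -> str:
    """Return the MD5 hex digest of a binary stream, read in chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Tiny in-memory stand-ins for the real download files.
    sample_nt = [
        "# Start of sh1\n",
        "<a> <b> <c> .\n",
        "# Start of sh2\n",
        "<d> <e> <f> .\n",
    ]
    for rec_id, group in iter_ntriples_groups(sample_nt):
        print(rec_id, len(group))
    print(md5_hex(io.BytesIO(b"example bytes")))
```

Because every function takes an iterable of lines or a stream, an open file handle on the unzipped download can be passed in directly and the file is processed one record at a time. The digest from `md5_hex` would then be compared by hand against the hash published in the accompanying VoID file.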

The same:
- The new LCSH export will contain the same data as before, including broader and narrower relationships, but is slightly more verbose.
- It is available in MADSRDF, SKOS, and combined MADSRDF and SKOS, in all serializations.

Thinking of removing:
- The current XML dump is one large XML file; in the new XML, each record is RDF/XML on its own individual line. The current XML file could be used for bulk loading into a triple store, but the current and future NT files could be used in the same way. Is anyone using the current XML dump file for bulk loading?
- The Turtle serialization

Samples:
The first 10 records for MADSRDF & SKOS in all serializations:
https://gist.github.com/thisismattmiller/0691f815478a5dc337e2e140becfc549

Thanks for any feedback,
Matt Miller





