Scope: The Authority File is maintained locally by CUL-IT. CUL-IT loads new, updated and deleted headings weekly from Peter Ward (authority record vendor) into the authority file, indexing in a authority database, cross referencing with the Blacklight Solr, and generates a JSON report of new, changed, and deleted authorities related to headings in our bib records. Peter Ward files are delivered as two separate files, names and subjects. When a spreadsheet for Names or Subjects is created, they can be filtered to show just new, updated, or deleted headings. This procedure outlines how to create a spreadsheet from the JSON to facilitate authority maintenance in Folio.
Contacts: Mary Campany
Unit: Metadata Design and Operations
Date last updated: June 2023
Date of next review: June 2024
- Download .json file provided by CUL-IT in the “Work with CUL-IT” folder in Box.
- Open the file in OpenRefine.
- To tell OpenRefine what a row show be (specify a record path), click on the top section near the first curly bracket. The section should highlight yellow when a mouse hovers over it. The yellow highlighted section should look like the section below, otherwise the json will not parse correctly.
- Click “Create project”.
- In the “Undo/Redo” section, click “Apply”.
- Open the Openrefine processing script with Notepad. The script can be found in Box under “Authority Working Documents>Authority Change Queue>JSON>Openrefine processing scripts”. Copy and paste the entire script into the “Apply Operation History” box in Openrefine. Run the script.
- This script formats the .json into a more workable spreadsheet.
- Export as an Excel 2007+ (.xlsx) spreadsheet.
- Open the new spreadsheet that was exported from OpenRefine in step 7.
- Add a column on the far left. Write “Completed” in the top row of the column.
- For consistent formatting between weekly spreadsheets, download and open the template spreadsheet found in the [[tbd]] folder. Select and copy the entire spreadsheet.
- Select the entire spreadsheet and paste in only the formatting. Some fields will be hidden because of the existing formatting and can be unhidden if needed. See the screenshot below for the location of the “paste formatting” icon.
- Save the spreadsheet as “un(type)(year).(week)_queue” (ex. “Unname21.28_queue”, “unsub21.27-52_queue”)
- Upload to Box in the “names” or “subjects” folders in Authority Working Documents>Authority Change Queue.
Explanation of fields
- Completed (column where entries are marked as complete after updating)
- Input file (Peter Ward weekly label)
- Change category (i.e. Updated, New, Deleted)
- Id (LC identifier)
- Id.loc.gov uri
- Actionable Headings (e.g. 1XX & 4XX fields from authority record)
- Heading (New or Updated heading, for Deletes replacement heading)
- Old Headings (Version of the heading to be updated or changed). This field can be hidden.
- Relevant Headings (Headings to be changed that appear in our bib records).
- Type (author or subject)
- Type 2 (e.g. person, corporate, geographic location)
- Vocabulary (e.g. LC or Unknown)
- Instance count (number of CUL bib records affected)
- Blacklight Link (Links to affected bib records in Blacklight)
- SOLR Link (Link to FOLIO UUIDs from SOLR index for affected records)
- Variants (Results that match heading in $a but are missing $d, $c, or $q)
- Diacritics (Results where Heading and Old Heading are the same, but diacritics are different)
- New Main Heading? (TRUE or FALSE, a flag that there is a new main heading for an existing heading?)***
- Undifferentiated (TRUE/FALSE indicating whether record is an undifferentiated name, used only for deletes)