Scope: Procedure for handling batch creation of MARC Bib, MFHD, and Item Records for Digitization-produced assets to support their management in the Catalog.

Contact: Jasmine Burns

Unit: Batch Processing, Cataloging, Metadata Services

Date created: 02/17/2017

Date of next review: February 2018


This procedure is triggered by DCAPS or Digitization work. Relevant template or standard DCAPS procedures that involve this step are documented here.

1. Digitization Project Kick-off Meeting Review

Participant: Jasmine Burns

Input: n/a

Output: Expectations for upcoming, project-based work

Steps Involved:

During the kick-off meeting, a Metadata Services representative (with invited member from Batch or Cataloging, as needed) discusses the:

  • Identifier/Record/Metadata Needs & Fit for the resources being digitized 
    • i.e. not all resources being digitized will be a good fit for record/metadata management in the Catalog.
    • Where should the metadata of record exist for the resources being digitized? Are those identifiers stable enough for use in digitization and preservation workflows?
  • Assess Metadata/Cataloging/Batch work needed:
    • What metadata already exists, for which manifestation (the analog version, a digitized version, other?), and where is this metadata?
    • Do we need to create digital asset records:
      • from scratch? (Note: Metadata, Cataloging & Batch does not do this, but we can help coordinate the effort with the requesting parties)
      • derived from a physical or analog metadata record?
      • derived from a non-MARC source?
      • converted from a non-MARC source?
    • What turn-around / timeframe is required? (Note: Metadata, Cataloging & Batch generally need at least 2 weeks' notice.)
  • Confirm Understanding:
    • The records will be managed in the Catalog by Cataloging, Batch & Metadata staff.
    • We can handle suppressed records, but this is not a preferred situation (keeping suppressed records in the Catalog for digitization and preservation management).
    • We can derive metadata for other delivery or preservation systems (eCommons, SharedShelf, Hydra, CULAR, ...) based off of the records.

2. Digitization Lab Inventory & Digital Assets Identifiers Request

Participants: Jasmine Burns, Gary Branch (as needed), Pamela Stansbury (as needed)

Input: Inventory of Items to be digitized - CSV with Call Number, Title, BIBID for physical/analog resource (if analog resource is already cataloged)

Output: Inventory of Items Needing a Catalog Record & Existing Records/Metadata to leverage

Steps Involved:

  • The Digitization Lab hands the final inventory of items to be digitized (generally, Call Number, Title, BIBID of the physical/analog resource) to Metadata for review & routing. Metadata review includes:
    • Checking list for completeness.
    • Logging workflow steps into project tracking systems (probably just Zoho).
    • Capturing the hand-off in some version-controlled documentation source like https://github.com/cmh2166/AV2eCommons.
  • According to decisions made at Digital Project Kick-off Meeting (Step 1), the Metadata Contact routes this inventory list and additional context to the appropriate Batch contact for digital (and possibly analog) record creation.
    • If the metadata of record is not destined for the Catalog, then Step 3 is skipped and Batch / Cataloging is not involved in the workflow past a Collection-level record.
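
The "checking list for completeness" step above could be sketched as follows. This is only an illustrative check, not the tool Metadata actually uses: the column headers ("Call Number", "Title", "BIBID") and the function name find_incomplete_rows are assumptions based on the inventory description, and BIBID is treated as optional because it exists only when the analog resource is already cataloged.

```python
import csv
import io

# Assumed column headers; BIBID is left out because it is only present
# when the analog resource has already been cataloged.
REQUIRED_COLUMNS = ["Call Number", "Title"]


def find_incomplete_rows(csv_text, required=REQUIRED_COLUMNS):
    """Return (line_number, missing_columns) for each incomplete row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    problems = []
    # Data rows start on line 2 because line 1 is the header row.
    for line_no, row in enumerate(reader, start=2):
        missing = [col for col in required if not (row.get(col) or "").strip()]
        if missing:
            problems.append((line_no, missing))
    return problems


sample = (
    "Call Number,Title,BIBID\n"
    "ABC 123,Lecture 1,1001\n"
    ",Lecture 2,\n"
)
print(find_incomplete_rows(sample))
```

Rows reported here would go back to the Digitization Lab before the inventory is routed to Batch.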

3. Digital & Analog (as needed) Asset MARC Record Creation & Loading

Participants: Gary Branch (hands off to Batch staff member), Pamela Stansbury (as needed), Jasmine Burns (as needed)

Depending on the required next step, one of 3a, 3b, 3c, or 3d is followed:

3a. MARC exists for the Analog/Physical Item, Need to Derive MARC for the Digital Asset

Input: List of Analog Bibliographic Record IDs

Output: Derived, suppressed Digital Asset Bibliographic Record IDs, and a List of those IDs matched to the Analog Bib IDs

  • A list of Analog/Physical Bibliographic Records is handed to Batch Processing (Gary as point person).
  • Batch Processing:
    • Derives Bibliographic Records for the Digital Surrogate from the Analog/Physical Bibliographic Record.
    • The Digital Asset, Derived Bibliographic Records have no holdings and are kept suppressed until reloaded with links (Step 6).
    • The Digital Asset, Derived Bibliographic Records need a unique flag, such as a 995 ignore, so that Batch Processing's validation jobs do not flag them as errors.
  • Bib record derivation specification and data profile:
    • based on format, to be documented.
    • upon generation of the Digital Asset, Derived Bibliographic Records, Cataloging (Point Person, Pam) is requested to review a sample to check for adequate data.
    • The resulting decisions/updates/edits should be added back to the format-specific derivation profile, to inform future derivations.
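
As a rough illustration of the derivation described above, the sketch below copies an analog record into a suppressed digital stub with no holdings and a 995 flag. It is not Batch Processing's actual derivation profile (which is still to be documented): the dictionary layout, the function name derive_digital_stub, and the use of subfield a for the 995 "ignore" value are all assumptions made for the example.

```python
import copy


def derive_digital_stub(analog_record):
    """Build a suppressed digital-asset stub from an analog bib record.

    Records are plain dicts here (hypothetical layout), not real MARC objects.
    """
    # Copy every field except the 001 control number, which the ILS would
    # assign afresh when the derived record is loaded.
    fields = [
        copy.deepcopy(field)
        for field in analog_record["fields"]
        if field["tag"] != "001"
    ]
    # Assumed flag shape: 995 $a ignore, so validation jobs skip the stub.
    fields.append({"tag": "995", "ind": "  ", "subfields": [("a", "ignore")]})
    return {
        "derived_from": analog_record["bib_id"],
        "suppressed": True,   # stays suppressed until Step 6
        "holdings": [],       # no holdings until Step 6
        "fields": fields,
    }


analog = {
    "bib_id": "1001",
    "fields": [
        {"tag": "001", "data": "1001"},
        {"tag": "245", "ind": "00", "subfields": [("a", "Lecture 1")]},
    ],
}
stub = derive_digital_stub(analog)
print([field["tag"] for field in stub["fields"]])
```

Keeping derived_from on the stub gives the list of digital IDs matched to analog Bib IDs that this step is expected to output.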

3b. MARC does not exist for the Analog/Physical Item, Need to Derive MARC for the Analog and the Digital Asset from a provided CSV

Input: CSV including Title (of Work), Date, Barcode (on Item), Part Number, Notes

Output: Generated, minimum-level Physical/Analog Bibliographic, MFHD, & Item Records; and Derived suppressed Digital Asset Bibliographic Record IDs, and a List of those IDs matched to the Analog Bib IDs

  • The Requesting Party for the Digital Project creates and provides a CSV with the following information:
    • Title (of Work), Date, Barcode (on Item), Part Number, Notes, Other Metadata Fields as encountered / able to be pulled
  • The CSV is handed to Batch Processing (Gary as point person).
    • That CSV is used by Batch (can Metadata help with this at all?) to generate minimum-level MARC Bibliographic, Holdings & Items records for the Physical/Analog Resource
    • The Physical/Analog Bibliographic Records are unsuppressed unless suppression is explicitly requested by the originating party (and approved in the kick-off meeting).
  • Batch Processing:
    • Derives Bibliographic Records for the Digital Surrogate from the Analog/Physical Bibliographic Records generated above.
    • The Digital Asset, Derived Bibliographic Records have no holdings and are kept suppressed until reloaded with links (Step 6).
    • The Digital Asset, Derived Bibliographic Records need a unique flag, such as a 995 ignore, so that Batch Processing's validation jobs do not flag them as errors.
  • Bib record derivation specification and data profile:
    • based on format, to be documented.
    • upon generation of the Digital Asset, Derived Bibliographic Records, Cataloging (Point Person, Pam) is requested to review a sample (of the physical and the digital records) to check for adequate data.
    • The resulting decisions/updates/edits should be added back to the format-specific derivation profile, to inform future derivations.
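
To make the CSV-to-record step more concrete, here is a minimal sketch of turning one CSV row into a minimum-level bib and item stub. The field mapping (Title to 245 $a, Part Number to 245 $n, Date to 260 $c, Notes to 500 $a) is purely an assumption for illustration, since the format-specific profile is still to be documented; the function name and the tuple/dict layout are likewise hypothetical.

```python
import csv
import io


def minimal_records_from_csv(csv_text):
    """Turn each CSV row into a minimum-level bib/item stub (illustrative)."""
    results = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Normalize cells; skip the None key DictReader uses for extra cells.
        clean = {k: (v or "").strip() for k, v in row.items() if k is not None}
        title_subfields = [("a", clean["Title"])]
        if clean.get("Part Number"):
            title_subfields.append(("n", clean["Part Number"]))
        bib = [("245", "00", title_subfields)]
        if clean.get("Date"):
            bib.append(("260", "  ", [("c", clean["Date"])]))
        if clean.get("Notes"):
            bib.append(("500", "  ", [("a", clean["Notes"])]))
        results.append({
            "bib": bib,
            "item": {"barcode": clean.get("Barcode", "")},
            # Physical records are unsuppressed by default per the step above.
            "suppressed": False,
        })
    return results


rows = (
    "Title,Date,Barcode,Part Number,Notes\n"
    "Lecture series,1975,31924000000001,Tape 2,\n"
)
records = minimal_records_from_csv(rows)
print(records[0]["bib"])
```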

3c. MARC exists (but not in Voyager) for the Analog/Physical Asset, Need to Load MARC for Analog/Physical Asset and Derive MARC for the Digital Asset

Input: Minimum level MARC records (Bib with holdings and items information as relevant)

Output: Voyager-loaded Physical/Analog Bibliographic, MFHD, & Item Records; and Derived suppressed Digital Asset Bibliographic Record IDs, and a List of those IDs matched to the Analog Bib IDs

  • The Requesting Party for the Digital Project creates and provides a set of MARC Bibliographic records stored somewhere open to them, Metadata and Batch Processing
    • The Physical/Analog Bibliographic Records are loaded into Voyager.
    • These records are unsuppressed unless suppression is explicitly requested by the originating party (and approved in the kick-off meeting).
  • Batch Processing:
    • Derives Bibliographic Records for the Digital Surrogate from the Analog/Physical Bibliographic Records loaded above.
    • The Digital Asset, Derived Bibliographic Records have no holdings and are kept suppressed until reloaded with links (Step 6).
    • The Digital Asset, Derived Bibliographic Records need a unique flag, such as a 995 ignore, so that Batch Processing's validation jobs do not flag them as errors.
  • Bib record derivation specification and data profile:
    • based on format, to be documented.
    • upon generation of the Digital Asset, Derived Bibliographic Records, Cataloging (Point Person, Pam) is requested to review a sample (of the physical and the digital records) to check for adequate data.
    • The resulting decisions/updates/edits should be added back to the format-specific derivation profile, to inform future derivations.

3d. MARC does not exist for the Analog/Physical Item, Do NOT need to create record for Digital Asset

Input: CSV including Title (of Work), Date, Barcode (on Item), Part Number, Notes

Output: Generated, minimum-level Physical/Analog Bibliographic, MFHD, & Item Records

  • The Requesting Party for the Digital Project creates and provides a CSV with the following information:
    • Title (of Work), Date, Barcode (on Item), Part Number, Notes, Other Metadata Fields as encountered / able to be pulled
  • The CSV is handed to Batch Processing (Gary as point person).
    • That CSV is used by Batch (can Metadata help with this at all?) to generate minimum-level MARC Bibliographic, Holdings & Items records for the Physical/Analog Resource
    • The Physical/Analog Bibliographic Records are unsuppressed unless suppression is explicitly requested by the originating party (and approved in the kick-off meeting).
  • Bib record derivation specification and data profile:
    • based on format, to be documented.
    • upon generation of the Bibliographic Records, Cataloging (Point Person, Pam) is requested to review a sample (of the physical records) to check for adequate data.
    • The resulting decisions/updates/edits should be added back to the format-specific derivation profile, to inform future derivations.

4. Generate Updated Inventory with "eBibs", Hand back to Digitization

Participants: Jasmine Burns, Batch staff

  • Batch provides Metadata with a list of Bibs generated for the digital and physical items.
  • Metadata adds these identifiers to the original inventory CSV handed over by digitization (see step 2).
  • Metadata hands this updated inventory to Digitization to be used for Digitization, Filenaming, & loading into Preservation.
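
Adding the generated identifiers to the original inventory might look like the sketch below. It assumes the inventory has a "BIBID" column and that Batch's list can be read as a mapping from analog BIBID to digital "eBIBID"; both column names and the function name add_ebibs are assumptions, not an existing script.

```python
import csv
import io


def add_ebibs(inventory_text, bib_to_ebib):
    """Append an eBIBID column to the inventory; report unmatched BIBIDs."""
    reader = csv.DictReader(io.StringIO(inventory_text))
    fieldnames = list(reader.fieldnames) + ["eBIBID"]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    unmatched = []
    for row in reader:
        bibid = (row.get("BIBID") or "").strip()
        row["eBIBID"] = bib_to_ebib.get(bibid, "")
        if not row["eBIBID"]:
            # Keep the row, but record it so it can be followed up with Batch.
            unmatched.append(bibid)
        writer.writerow(row)
    return out.getvalue(), unmatched


inventory = (
    "Call Number,Title,BIBID\n"
    "A1,Lecture 1,1001\n"
    "A2,Lecture 2,1002\n"
)
updated, unmatched = add_ebibs(inventory, {"1001": "9001"})
print(updated)
```

Rows without a match are kept with an empty eBIBID rather than dropped, so the inventory handed back to Digitization still lists every item.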

5. Descriptive Metadata Integration

Participants: Jasmine Burns

After digitization magic happens, we need to integrate existing descriptive metadata with the generated files and simple inventory metadata for loading into delivery systems (eCommons, Hydra, SharedShelf, etc.)

  • Metadata staff is contacted with updated information for the post-digitization inventory (updated with filenames, digitization notes, and Kaltura IDs if appropriate).
  • Metadata staff will integrate KalturaIDs, preservation IDs, and catalog (or other) metadata into one CSV and prepare spreadsheets for delivery assets to be loaded into the specified delivery system.
  • Metadata staff will also perform any necessary normalization for preparing metadata for ingest.
  • Handoff: Metadata Staff will give metadata ingest spreadsheet for loading delivery assets into repository to Delivery repository contact (determined at Kick-off meeting).

6. Hydrate eBIB Stubs

Participants: Jasmine Burns, Gary Branch (or designated Batch Staff member), Pamela Stansbury (as needed)

After the assets are loaded into a delivery system (eCommons, Hydra, SharedShelf, etc.), the digital asset records created need to be unsuppressed and updated with persistent URLs to the resource in the delivery system.

  • The Delivery System contact (identified at the kick off meeting) will hand off a spreadsheet with BIBID, eBIBID, and persistent URL for delivery system to Metadata Services contact. The Metadata Services contact will:
    • Check list for completeness.
    • Log workflow steps into project tracking systems (probably just Zoho).
    • Capture updated metadata inventory in some version-controlled documentation source like https://github.com/cmh2166/AV2eCommons.
  • If the system of record is the Catalog:
    • Metadata will hand to Batch Processing a CSV of digital asset bib ids and persistent URLs 
    • Batch Processing will un-suppress the digital asset bibliographic record and add the persistent URL.
    • Batch Processing will attach an appropriate Holdings for the Digital Asset, Derived Bibliographic Record (no Item record is attached to Digital Asset Bibliographic/Holdings records)
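
The completeness check and the hand-off CSV for Batch could be prepared along these lines. The column names ("BIBID", "eBIBID", "URL") and the function name prepare_hydration_rows are assumptions about the delivery contact's spreadsheet, and the URL test only checks for an http(s) scheme and a host, not whether the link resolves.

```python
import csv
import io
from urllib.parse import urlparse


def prepare_hydration_rows(handoff_text):
    """Split hand-off rows into (eBIBID, URL) pairs ready for Batch and rejects."""
    ready, rejected = [], []
    for row in csv.DictReader(io.StringIO(handoff_text)):
        ebib = (row.get("eBIBID") or "").strip()
        url = (row.get("URL") or "").strip()
        parsed = urlparse(url)
        if ebib and parsed.scheme in ("http", "https") and parsed.netloc:
            ready.append((ebib, url))
        else:
            # Missing eBIBID or unusable URL: send back to the delivery contact.
            rejected.append(row)
    return ready, rejected


handoff = (
    "BIBID,eBIBID,URL\n"
    "1001,9001,https://example.org/handle/1\n"
    "1002,,https://example.org/handle/2\n"
    "1003,9003,not-a-url\n"
)
ready, rejected = prepare_hydration_rows(handoff)
print(ready)
```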

PostScript: Linking the Digital Asset Bib Record to the Physical/Analog Asset Bib Record

When deriving digital asset Catalog records from physical or analog asset Catalog records, we want to indicate the derivative relationship between the two resources.

  • a special 035 field is generated by Batch that links the derived, digital asset bibliographic record to the physical/analog bibliographic record (documentation for this?)
  • 776  0 8  $i Print version | Online version $w (Voyager) bibidnum
  • 899 ind1 = 0 $a 899code
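
One possible reading of the 776 pattern above is that the digital record points at the print version and the analog record points back at the online version. The sketch below builds both field strings on that assumption; the reciprocal link on the analog record, the helper names, and the exact spacing after "(Voyager)" are not confirmed by this procedure.

```python
def linking_776(target_bib_id, target_is_print):
    """Format a 776 08 linking field that points at target_bib_id."""
    label = "Print version" if target_is_print else "Online version"
    return f"776 08 $i {label} $w (Voyager) {target_bib_id}"


def link_pair(analog_bib_id, digital_bib_id):
    """Return the 776 field to add to each record, keyed by the record's bib ID."""
    return {
        # The digital record links to its print/analog source.
        digital_bib_id: linking_776(analog_bib_id, target_is_print=True),
        # Assumed reciprocal link from the analog record to the online version.
        analog_bib_id: linking_776(digital_bib_id, target_is_print=False),
    }


links = link_pair("1001", "9001")
print(links["9001"])
print(links["1001"])
```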

Metadata Examples & Profiles for the Above Workflow

Example: Pre-Digitization Inventory

to be added

Example: Post-Digitization Inventory / Metadata

https://github.com/cmh2166/AV2eCommons/blob/master/data/CUlecturetapes_20160211/post-Kaltura_files/CUlecturetapes_metadata_kalturaIDs.csv

Example: Digital Asset MARC Bibliographic Record

to be added

Example: Digital Collection-level MARC Bibliographic Record

Not yet updated for RDA or reviewed for current Batch Processing codes / flags / etc.

LEADER 01852cam a22003255a 4500
001 6790930
005 20160128091848.0
006 m        d        
007 cr cn ---|||||
008 091211s2010    nyua    s     000 0 eng d
035 __ ‡a (OCoLC)667617581
035 __ ‡a 6790930
245 04 ‡a The Huntington Free Library Native American Collection ‡h [electronic resource].
260 __ ‡a [Ithaca, N.Y.] : ‡b Cornell University Library, ‡c c2010.
520 __ ‡a One of the largest collections of books and manuscripts of its kind, the Huntington collection contains extensive materials documenting the history, culture, languages, and arts of the native tribes of both North and South America. Contemporary politics and human rights issues are also important components of the collection.
520 8_ ‡a Full text of a selection of 91 books from the Huntington Free Library Native American Collection representing the various genres in the collection.
500 __ ‡a Title from home page, viewed Sept. 30, 2010.
538 __ ‡a Mode of access: World Wide Web.
650 _0 ‡a Indians of North America.
650 _0 ‡a Indians of South America.
650 _0 ‡a Indians of North America ‡x History ‡v Sources.
650 _0 ‡a Indians of South America ‡x History ‡v Sources.
710 2_ ‡a Huntington Free Library.
710 2_ ‡a Cornell University. ‡b Library. ‡b Division of Rare and Manuscript Collections. ‡0 http://vivo.cornell.edu/individual/individual32117 ‡0 http://id.loc.gov/authorities/names/nr97000951 ‡2 http://id.loc.gov/authorities/names ‡4 http://id.loc.gov/vocabulary/relators/cur ‡e Curator
856 40 ‡u http://resolver.library.cornell.edu/misc/6790930 ‡x http://dlxs.library.cornell.edu/h/hunt/
899 __ ‡a CULDigReg
906 __ ‡a gs
948 2_ ‡a 20140618 ‡b m ‡d batch ‡e lts ‡x add899CULDigReg
948 1_ ‡a 20100930 ‡b o ‡d mnr1 ‡e lts
948 2_ ‡a 20160128 ‡b m ‡d str1 ‡e lts ‡x RDC Experiment