non-MARC policies, practices and workflows

Compile an inventory of all policies, practices and workflows involving non-MARC metadata activity at both institutions -- including systems and schemas.

Columbia

There are two primary locations for metadata documentation:

a) Metadata Wiki

b) Digital Collections Wiki (not open)

Some metadata documentation may be maintained on project-specific wikis, such as the Durst Collection Wiki, though those are generally linked from the Metadata Wiki.

The Metadata Wiki contains project-specific documentation on the CU Digital Projects Metadata Documentation page. This includes data dictionaries, metadata mapping tables, meeting minutes, etc.

A Table of Core CUL Metadata elements was created in Dec. 2012: https://wiki.cul.columbia.edu/display/metadata/Table+of+Core+CUL+Metadata+Elements

This table is MODS-based since this is the primary non-MARC schema used by Columbia University Libraries.

More information on Columbia’s MODS usage can be found in the MODS Implementation Registry.

Metadata is being created in many places throughout the libraries and information services. Technical services is primarily involved in the following areas:

Omeka:

Documentation and training materials for Omeka are available via the Metadata Wiki: https://wiki.cul.columbia.edu/display/metadata/Omeka

Omeka is used to create online exhibitions. The tool is Dublin Core based, but Columbia staff created a MODS plug-in, so the Omeka workform currently contains DC elements, local fields supplied by the Omeka developers, and MODS elements. Most users enter metadata in a spreadsheet, which is bulk uploaded to Omeka. However, for those working directly in Omeka, the mixed data template model -- DC, local fields, and MODS -- is cumbersome. There is a plan to create a MODS-only template, which will be implemented as development time allows.

Once an Omeka exhibition has been approved, the Digital Projects Librarian, the Metadata Coordinator, and the Metadata Assistant meet with the curator and staff to determine the item-level metadata needs. This process results in the selection of a sub-set of elements from the general Omeka data dictionary and in an exhibition-specific data dictionary. The exhibition curator (or, more often, a graduate student) creates the item-level metadata in a spreadsheet. The metadata is reviewed and remediated by either the Digital Projects Librarian, the Metadata Coordinator, or, more recently, the Metadata Assistant located in OSMC. The spreadsheet is then sent to the scan lab. After imaging, the metadata and the images are uploaded into Omeka. Finally, the Metadata Assistant reviews the metadata once more against the actual images.

The current process of metadata consultation and review was not in place when Omeka was implemented in 2009. The Metadata Coordinator and Metadata Assistant are now remediating the older exhibitions.

Customer orders:

“Customer orders metadata” is generated by digitization requests made to Columbia University Libraries by internal or external patrons. These requests cover, among other types of resources, single pages from books and periodicals, photographs, art objects, archival materials, and complete publications. While this metadata is not displayed to the public, the images are loaded into the Fedora Staff Collection Viewer, so the metadata has to meet a minimal level of consistency to allow for the management of these digital images. The metadata is created by staff in the Preservation and Digital Conversion Division. The Metadata Coordinator and the Digital Projects Librarian created a list of required elements and a data dictionary.

Documentation is located on the Metadata Wiki: https://wiki.cul.columbia.edu/display/metadata/Customer+Orders
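The consistency requirement described above could be checked automatically before images are loaded. The following is a minimal sketch only, assuming the order metadata is available as CSV text. The field names in REQUIRED_FIELDS are hypothetical placeholders, not the actual required elements, which are defined in the Customer Orders data dictionary.

```python
import csv
import io

# Hypothetical required elements, for illustration only; the real list is
# defined in the Customer Orders data dictionary on the Metadata Wiki.
REQUIRED_FIELDS = ("identifier", "title", "date_captured")


def find_missing_fields(row):
    """Return the required field names that are absent or blank in one row."""
    return [name for name in REQUIRED_FIELDS if not (row.get(name) or "").strip()]


def validate_orders(csv_text):
    """Map each failing row number (1-based, header excluded) to its missing fields."""
    reader = csv.DictReader(io.StringIO(csv_text))
    problems = {}
    for row_number, row in enumerate(reader, start=1):
        missing = find_missing_fields(row)
        if missing:
            problems[row_number] = missing
    return problems


# Invented sample data: the second row has no title.
SAMPLE = """identifier,title,date_captured
order_0001,Page 12 of a periodical,2013-05-01
order_0002,,2013-05-02
"""

if __name__ == "__main__":
    print(validate_orders(SAMPLE))
```

A check of this kind only confirms that required values are present; it does not replace review of the values themselves.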

Digital Projects:

Digital Projects are web initiatives that require advanced functionality (e.g., specialized searching, browsable indexes). The objects in these projects come from Columbia Libraries’ special collections. The G.E.E. Lindquist Native American Photographs http://lindquist.cul.columbia.edu/ or the recently launched Community Service Society Photographs http://css.cul.columbia.edu/ are two such examples.

Until recently digital projects used legacy metadata. However, a newly instituted process involves a metadata consultation between project stakeholders, the Metadata Coordinator, and the Digital Projects Librarian. The needed fields are identified and documented in a data dictionary. All fields are mapped to MODS and when work on the project is complete the MODS records are stored in Fedora. Documentation is located on the Metadata Wiki and on the Digital Collections Wiki (links generally connecting one to the other): https://wiki.cul.columbia.edu/display/metadata/CU+Digital+Projects+Metadata+Documentation
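To illustrate what mapping data dictionary fields to MODS can look like, here is a minimal sketch using Python's standard library. The local field names (title, photographer, date, item_id) and the sample values are hypothetical and are not taken from any Columbia data dictionary; the MODS element names and namespace are standard. It is not a description of the tools Columbia actually uses.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS_NS)

# Hypothetical local field names, each mapped to a path of MODS elements.
FIELD_TO_MODS = {
    "title": ("titleInfo", "title"),
    "photographer": ("name", "namePart"),
    "date": ("originInfo", "dateCreated"),
    "item_id": ("identifier",),
}


def build_mods(record):
    """Build a MODS element tree from a dict of local field values."""
    root = ET.Element(f"{{{MODS_NS}}}mods")
    for field, path in FIELD_TO_MODS.items():
        value = record.get(field)
        if not value:
            continue  # skip fields that are missing or empty
        parent = root
        for tag in path:
            parent = ET.SubElement(parent, f"{{{MODS_NS}}}{tag}")
        parent.text = value
    return root


if __name__ == "__main__":
    # Invented sample record.
    sample = {"title": "Untitled photograph", "date": "1920", "item_id": "demo_001"}
    print(ET.tostring(build_mods(sample), encoding="unicode"))
```

Keeping the mapping in a single table mirrors the role of the data dictionary: the mapping can be reviewed and changed without touching the conversion logic.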

Academic Commons:

The institutional repository, Academic Commons, uses MODS 3.4. Metadata received through the self-deposit form is being remediated either by the Academic Commons team in the Center for Digital Research and Scholarship (CDRS) or the Metadata Assistant in OSMC.

Metadata for digital resources not contributed through the self-deposit form is being manually created either in CDRS or OSMC. Columbia staff developed Hypatia, a MODS cataloging and Fedora ingest tool. For further details see the MODS Implementation Registry.

Web archiving:

Metadata for the Web Resources Collection Program is also created in technical services. MARC records are being created for the Human Rights portal in addition to the Dublin Core records residing in Archive-It. The Human Rights Web Archive is built through these MARC records. The metadata is being duplicated in Archive-It. While it is possible to export DC records from Archive-It, it is currently not possible to import MARC records.

Other Web resource collections are described only in Archive-It, using its native Dublin Core.

Documentation can be found in the Metadata Wiki: https://wiki.cul.columbia.edu/display/metadata/Web+Archiving, the Web Resources Collection Program (2013-): https://wiki.cul.columbia.edu/display/webresourcescollection/Policies+and+procedures, and the Mellon Project on Web Resources Collection Program Development (2013-) page:

https://wiki.cul.columbia.edu/display/webresourcescollection/Policies+and+procedures.

Archives:

EAD finding aids are created outside technical services. Archival collections can be found via the Archival Collections portal.

Hyacinth

Hyacinth is a proposed tool to support creation and editing of non-MARC metadata and will serve as a front end to Columbia’s Fedora repository. There is an agreed upon data dictionary from which data elements can be selected for specific application profiles. Records will be able to be output in a number of schemes, including, but not limited to, MODS. This tool will support work on Customer Orders, Omeka exhibitions, and digital projects. A plan of work was drafted in Oct. 2012. Implementation will proceed when development time allows.

Object Metadata

A task force has been put in place to investigate tools, standards, etc. for handling the metadata for art and cultural objects. A report is expected by the end of September 2012.

Cornell

RDMSG

When science metadata consultations are requested via the Research Data Management Service Group, work is directed via the RDMSG ticketing system to the Science Data and Metadata Librarian (SDML). Best practices and additional documentation (soon to include recommendations for generating read-me style documentation for science metadata) can be found on the RDMSG website (http://data.research.cornell.edu). Documentation for projects that require annual work (e.g., the Loon project) is maintained by the SDML and kept on a local SFS share, accessible only to the metadata librarians, the director of Cataloging and Metadata, and CUL personnel directly involved in the projects.

DCAPS workflows and documentation

Workflows are dependent on the project, and no documentation exists for generalized cross-project workflows or procedures related to metadata.

Metadata Consultation page

General information about metadata services offered at Cornell can be found on a publicly accessible wiki: http://lts.library.cornell.edu/metadata. In addition, records of non-MARC metadata consultations are kept on an internal Metadata services wiki page; this page was created recently and does not include historic consultations (https://confluence.cornell.edu/display/metaserv/non-MARC+Metadata+Consultations).

Visual Resources Cataloging

Older, fuller documentation for both PiCtor usage and VRA cataloging, from KVRF and CUL, is stored on Metadata servers; current training documentation is available via Confluence: https://confluence.cornell.edu/display/metaserv/Cataloging+in+PiCtor.

eCommons

eCommons@Cornell is Cornell University Library’s DSpace-based institutional repository that provides long-term access to the intellectual output of Cornell’s faculty, staff and students. Content is primarily user-submitted, and Dublin Core metadata are collected during deposit. Metadata guidelines are available for user submissions: http://ecommons2.library.cornell.edu/eCommons_Best_Practices.pdf

LUNA

Out-of-date documentation was recently deleted from web servers; no current documentation exists. The workflow for LUNA entails loading metadata from Excel spreadsheets; the Metadata Librarian either guides metadata capture or crosswalks metadata from other systems to LUNA.

SharedShelf

No publicly available documentation exists; metadata guidelines are written on a per-project basis, while a tip-sheet to facilitate SharedShelf metadata creation is shared and stored locally. The workflows for capturing metadata are determined per project, based on the subject knowledge of stakeholders, capacity issues and/or the existence of legacy metadata.

Kaltura

No current documentation exists. The workflows for Kaltura metadata include either capturing metadata directly in the XML template or in an Excel spreadsheet, which is then transformed to XML. We have not yet automated the Excel-to-XML transformation.
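As an illustration of how the spreadsheet-to-XML step might eventually be automated, here is a minimal sketch that assumes the Excel sheet has been saved as CSV. The root and item element names and the sample column names are hypothetical and do not reflect the actual Kaltura XML template. The sketch also assumes that every column header is a valid XML element name.

```python
import csv
import io
import xml.etree.ElementTree as ET


def rows_to_xml(csv_text, root_tag="items", item_tag="item"):
    """Turn each CSV row into one XML element with a child per non-empty column.

    root_tag and item_tag are placeholder names, not the Kaltura schema.
    Column headers are used as element names, so they must be valid XML names.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    root = ET.Element(root_tag)
    for row in reader:
        item = ET.SubElement(root, item_tag)
        for column, value in row.items():
            # Skip empty cells and any surplus cells that have no header.
            if column and value:
                ET.SubElement(item, column.strip()).text = value.strip()
    return root


if __name__ == "__main__":
    # Invented sample data with hypothetical column names.
    sample = "title,description\nLecture 1,Opening remarks\nLecture 2,\n"
    print(ET.tostring(rows_to_xml(sample), encoding="unicode"))
```

A production version would map spreadsheet columns explicitly to the elements of the real template rather than reusing the headers as element names.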

Locale (Greenstone)

Metadata are produced for some collections held in Locale, a Greenstone-based repository run by Mann Library (http://locale.mannlib.cornell.edu). Two guidance documents have been written for working with Greenstone (training, best practices) and are stored on a local SFS share, accessible only to the metadata librarians, the director of Cataloging and Metadata, and CUL personnel directly involved in the projects.

CULAR

Documentation for CULAR is not yet written; workflows and level/types of required metadata are still being decided. A CUL-only wiki is available that references metadata and there is a template to capture collection-level descriptive metadata.

CuLLR (https://confluence.cornell.edu/display/culwebdev/CuLLR)

The CuLLR (Curated List of Library Resources) team is charged with designing and implementing a process whereby print and electronic resources in the library catalog (in the form of metadata), and from other sources to be determined, are extracted according to the subject areas for a specific library. These resources may then be annotated to identify the subject areas for which each resource is useful, along with other attributes as determined by the project team.

Web Archiving

The web archiving service is nascent. As the initiative develops, workflows and policies will be created.

Project Euclid

Project Euclid (http://projecteuclid.org) offers access to a growing platform of high-quality, peer-reviewed journals, monographs, and conference proceedings in the field of theoretical and applied mathematics and statistics. The project was developed by Cornell and is jointly managed by Cornell and Duke University Press. Procedures for the creation and maintenance of Project Euclid Journal Issue metadata can be found at: http://lts.library.cornell.edu/lts/pp/mes/euc/index.

EAD

Encoded Archival Description practices and workflows vary depending on the repository creating the EAD guide; documentation and workflows reside within the repositories themselves.

RMC Patron Requests

Metadata creation for patron digitization requests from the collections of the Division of Rare and Manuscript Collections (RMC) is performed in RMC with guidance from the Metadata Librarian for Humanities & Special Collections; guidelines and workflows are stored locally.
