Baseline productivity numbers

Establish baseline productivity numbers for activities and projects at each institution to allow for future assessment of potential changes and development associated with 2CUL TSI.

Columbia

The statistics are highly dependent on what type of metadata work originates from other departments and how much support is needed for any given project; see report section on dependencies. The work of the Metadata Assistant is more readily quantifiable than the work of the Metadata Coordinator and the Digital Projects Librarian. The Metadata Assistant tracks her statistics, including the number of items and the number of days spent completing each project.
General examples of tasks completed since 2010 are listed below.

Digital Projects LDPD

Lindquist Collection of Native American Photographs (with Burke Archive)
Project duration: 2010-2012
Metadata work included consultations, creation of a data dictionary, MODS mapping, metadata creation for 1,868 images, and QC

Community Service Society (with Rare Book & Manuscript Library)
Project duration: 2011-2013
Metadata work included consultations, creation of a data dictionary, MODS mapping, metadata creation for 1,354 images, QC, authority work (44 new, 8 updates), and OCR correction

Timing:
"I would say that for an image without any OCR it does not take more than a minute to fill in the Borough and any topic subjects/subject names (which for the majority of the images is all that I am dealing with). If I have to look up authorities or research anything in the image, it could take a little longer. For the images with OCR, it takes at least 5 minutes and has taken up to 10 depending on how long the passage that needs to be corrected is." (student worker's estimate)
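The student worker's per-image figures can be turned into a rough range of hours for a batch. The following is a minimal sketch only: the function name, its parameters, and the number of OCR images in the example are hypothetical, since the report does not state how many of the 1,354 images carried OCR text. It also treats one minute as the figure for non-OCR images, although the quote gives this as an upper limit and notes that authority lookups can take longer.

```python
def estimate_hours(total_images, ocr_images,
                   plain_minutes=1.0, ocr_minutes_low=5.0, ocr_minutes_high=10.0):
    """Return a (low, high) estimate in hours for describing a batch of images.

    Per-image minutes default to the student worker's figures: about 1 minute
    for an image without OCR, and 5 to 10 minutes for an image whose OCR text
    needs correction.
    """
    if not 0 <= ocr_images <= total_images:
        raise ValueError("ocr_images must be between 0 and total_images")

    # Minutes spent on images that need subject fields only.
    plain_total = (total_images - ocr_images) * plain_minutes

    # OCR images are costed at both ends of the quoted range.
    low = (plain_total + ocr_images * ocr_minutes_low) / 60
    high = (plain_total + ocr_images * ocr_minutes_high) / 60
    return low, high


if __name__ == "__main__":
    # 1,354 images is from the report; 200 OCR images is an assumed value.
    low, high = estimate_hours(total_images=1354, ocr_images=200)
    print(f"Estimated effort: {low:.1f} to {high:.1f} hours")
```

Because the OCR share dominates the result, the estimate is only as good as the count of OCR images supplied to it.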

Durst Project (with Avery)
Project duration: 2012-
Metadata work includes consultations, legacy data analysis, data dictionaries, creation of test records, etc.

MODS Conversion of Legacy Data

Staff Collection Viewer & Remediation

Columbia recently built a Staff Collection Viewer in order to aggregate digital assets that were created for a variety of purposes such as customer orders of library material, digital projects, and online exhibitions. The existing descriptive metadata for these assets was largely created in a non-standard manner, making cross-collection searching nearly impossible. Therefore the first step in this process is the remediation of the legacy metadata using MODS, the Libraries' preferred schema for the description of digital objects.

This work is ongoing. To date, three large collections have been remediated:
1. Customer Orders – approximately 500 digital assets
2. Greene & Greene Architectural Drawings and Photographs – approximately 5,000 digital assets
3. Hugh Ferriss Architectural Drawings – 362 digital assets

Digital Projects CDRS

Women Film Pioneers
Project duration: 2012-
Metadata work includes taxonomy development and consultations, ca. 2 hours per week

Digital Projects Avery

Built Works Registry
Project duration: 2011-
Metadata work includes consultation and metadata remediation (14,000 records), in process

Omeka exhibits
Metadata work included consultations, creation of a data dictionary, metadata remediation, and QC for each exhibit

2010: 12 exhibits
2011: 6 exhibits
2012: 5 exhibits
2013: 3 exhibits published so far (several more in process)

Web Archiving
We have no consistent supply of new seed websites to use as a basis for estimates. Any collection may go months without new seeds or, as with Burke, gain hundreds in a short period. Much of our metadata work, both in CLIO and Archive-It, consists of updating or revising existing records rather than creating new ones. In retrospect, from 2010 to 2013 we created approximately 1,100 Archive-It seed-level metadata records, along with 679 MARC records (identified with clio kw= wayback.archive-it.org).

Academic Commons
Academic Commons cataloging done in technical services: between 30 and 50 per week, depending on format type.

Consultations
The number of general consultations is increasing as more people around the libraries/information services become aware of metadata in general. Topics can range from staff of other divisions wanting advice for a specific project to general questions on metadata or linked data. These consultations take about one to two hours each.

Cornell

Statistics at Cornell are infrequently captured for non-MARC metadata; additional digital projects and metadata-related projects are conducted by Cornell metadata librarians and are not referenced in the selection below.

Image Cataloging (VRA Core)
2011 July - 2012 June:

Image Records - 1110
Work Records - 717

2012 July - 2013 June:

Image Records - 3390
Work Records - 2513

Platform Development & Implementation
Metadata Librarians at Cornell are involved in the development and implementation of both vendor products and homegrown platforms; this includes schema and field selection / development, taxonomy creation or selection, mapping metadata crosswalks, etc.

Vendor-Cornell development collaboration

Kaltura: Long-term and considerable development of cross-campus and CUL-specific implementations
ARTstor SharedShelf: The hours spent working and consulting with ARTstor, both on developing the SharedShelf 'simple cataloging' tool and on guiding the development of the VRA Core complex cataloging interface, are considerable and difficult to estimate.

Home-grown platform development

Blacklight discovery interface
Discovery & Access Integration Layer
Cornell University Library Archival Repository (CULAR)

Consultations
The metadata librarians at Cornell handle a wide range of consultations, from single emails or phone calls to long-term consultations with faculty, library staff, and non-Cornell entities. This variety makes consultations challenging to quantify, and consistent statistics have not been captured in this area.

A&S Internal Grants: http://dcaps.library.cornell.edu/initiatives/asgrants/awards
This work is done with colleagues in DCAPS (http://dcaps.library.cornell.edu/) and involves consulting on the feasibility of projects before the application is written, composing budgets, and attending the A&S faculty board advisory committee meeting to provide background and field questions about the applications. During the grant term, the Metadata Librarian for Humanities and Special Collections consults on metadata issues, sets up projects in the content management system, provides metadata guidelines, trains metadata capturers, and performs post-capture metadata review and clean-up. Generally, this work includes significant contact with faculty members. Some projects are very large while others are relatively small; further, projects usually extend beyond the grant term.
2010-2011: 5 projects
2011-2012: 6 projects
2012-2013: 11 projects
2013-2014: 13 applications; winners to be publicly announced in late Summer 2013.

National Grant Applications
IMLS 2013-2014: 2 applications pending notification
NEH Preservation & Access 2012-2013 applications: 3 applications submitted; 1 awarded (http://staffweb.library.cornell.edu/node/3557)
NEH Preservation & Access 2013-2014: 2 re-submissions pending
NEH Office of Digital Humanities 2013-2014: 1 application in preparation
NYSERDA 2013-2014 (NYS Energy Research and Development Authority): 1 awarded

Data Modeling
The main data modeling project in the past year is the Freedom on the Move digital humanities project; this has occupied approximately 40 hours thus far. While the primary data modeling work has been completed, development of the project continues with the Metadata Librarian for Humanities and Special Collections.

Web Archiving
Crawling is performed by the Metadata Librarian for Humanities & Special Collections; a workflow for cataloging in Dublin Core is being investigated by the Administrative Supervisor for Original Cataloging. The web archiving pilot captures sites in four areas; the majority of sites have not yet been identified:
1. University Archives, including sites not on the cornell.edu domain (approximately 300 sites)
2. Special Collections, including sites of organizations Cornell collects (approximately 100 sites)
3. Digital Art for teaching (approximately 150-200 sites)
4. Topical general collections: currently beginning to capture sites related to hydrofracking in the Marcellus Shale (approximately 200 sites)

Upcoming Need
It is likely that we will be involved with a conversion of our digital collection metadata to RDF triples, potentially including but not limited to records from:
DLXS: http://ebooks.library.cornell.edu/
LUNA*: http://library24.library.cornell.edu (except for collections mimicked in SharedShelf)
Kaltura: https://media.library.cornell.edu/
SharedShelf*: http://www.sscommons.org/ (Collections> anything beginning with "Cornell:")
Mann Locale: http://locale.mannlib.cornell.edu/
Archive-it: http://www.archive-it.org/organizations/529
eCommons: http://ecommons.library.cornell.edu/
DigitalCommons@ILR: http://digitalcommons.ilr.cornell.edu/
Scholarship@Cornell Law: http://scholarship.law.cornell.edu/

*Note: each collection has different metadata standards and fields
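To illustrate what a conversion to RDF triples involves at the record level, here is a minimal sketch that serializes one flat descriptive record as N-Triples using Dublin Core terms as predicates. It is an assumption-laden example, not a description of any planned Cornell workflow: the function names, the example.org subject URI, and the sample record are all hypothetical. Because each collection uses different standards and fields, a real conversion would need a separate field-to-predicate mapping per collection.

```python
DCTERMS = "http://purl.org/dc/terms/"  # Dublin Core terms namespace


def escape_literal(value):
    """Escape a string for use as an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def record_to_ntriples(subject_uri, record):
    """Turn a flat record (field name -> value or list of values) into N-Triples lines.

    Field names are assumed to already be Dublin Core term names
    such as "title" or "subject".
    """
    lines = []
    for field, values in record.items():
        if isinstance(values, str):
            values = [values]  # treat a single value as a one-item list
        for value in values:
            lines.append(
                f'<{subject_uri}> <{DCTERMS}{field}> "{escape_literal(value)}" .'
            )
    return lines


if __name__ == "__main__":
    # Hypothetical record and identifier, for illustration only.
    sample = {"title": "Sample photograph", "subject": ["Architecture", "New York"]}
    for line in record_to_ntriples("http://example.org/record/1", sample):
        print(line)
```

Each repeated field value becomes its own triple, which is why a record with two subjects yields two subject statements.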
