
Charge (Adam C)

Problem statement

  • Digital collection identifiers (Rick)
    A long-standing issue in the presentation of our digital collections is the lack of a reliable identifier
    for the collections themselves. Collections move to new machines, change their delivery platform,
    and change their default behaviors. We need to be able to locate collections reliably over a long period of time and to discover aspects of collections so that they can interoperate with other collections.
  • Persistent IDs for digital preservation (Bill)
    • We are working on preserving the digital objects in two large collections, the Euclid journals and the arXiv.org preprints. In general, we can view each object as having at least one content file and an object descriptor file containing metadata about the object. Most digital objects contain multiple content and metadata files. We need to be able to identify and locate the files for a long time, regardless of where they are stored. Rather than changing the metadata in the descriptor file every time a file is moved, we would like to create persistent identifiers that can be mapped to the files' current locations.
    • The number of component files to be preserved will be several times larger than the number of digital objects. With processing efficiency in mind, we would prefer a solution that allows us to resolve the identifiers locally, without going out over the internet for each resolution request.
    • The digital objects' component files in our preservation system will not be directly accessible to the public; access will occur through a gated interface. The persistent identifier mechanism we use should be able to produce identifiers that are private and not discoverable.
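
To illustrate the preservation use case above, here is a minimal sketch of a local lookup table that maps persistent identifiers to current file locations. It is not a description of any existing CUL system; the class name, method names, PIDs, and paths are all hypothetical. It shows that moving a file changes only the mapping, so the identifier recorded in a descriptor file stays the same, and that resolution needs no network request.

```python
class LocalResolver:
    """In-memory table mapping persistent identifiers to file locations.

    Hypothetical sketch: a real system would persist this table and
    restrict who may read or change it.
    """

    def __init__(self):
        self._locations = {}  # PID -> current location

    def register(self, pid, location):
        # A PID is assigned once and never reused for another file.
        if pid in self._locations:
            raise ValueError(f"PID already registered: {pid}")
        self._locations[pid] = location

    def move(self, pid, new_location):
        # Only the mapping changes; the PID itself is untouched.
        if pid not in self._locations:
            raise KeyError(pid)
        self._locations[pid] = new_location

    def resolve(self, pid):
        # Resolved locally, with no request over the internet.
        return self._locations[pid]


resolver = LocalResolver()
resolver.register("pid:0001", "/archive/disk1/object1/content.pdf")
resolver.move("pid:0001", "/archive/disk2/object1/content.pdf")
current = resolver.resolve("pid:0001")
```

After the move, `current` holds the new path, while any descriptor file that records `pid:0001` needs no edit.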

Requirements for an implementation (John)

  • Backward compatible: we don't want to break any system currently used at CUL that uses persistent identifiers, such as the PURL server.
  • Provide a mechanism for persistent IDs for OAI-PMH.
  • Optionally resolvable only within a constrained environment. Secured nameservice. For archiving.  The individual
  • Can ensure confidentiality. The system should define a mechanism for client authentication/authorization to ensure data integrity and authority control.
  • It doesn't have dependencies on external systems in order to resolve local PIDs.
  • Every PID should be globally unique.
  • PID should be free of location semantics.
  • PID must be able to refer to multiple aspects, attributes, or behaviors of the digital object, but with a default aspect that conforms with conventional use.
  • Globally resolvable.
  • Fine-grained control of PIDs, so that groups can maintain their own sets, without having to maintain multiple PID resolvers/servers.
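
Two of the requirements above, a default aspect per PID and group-level control within a single resolver, can be sketched together as follows. This is an illustration under our own assumptions, not the data model of any of the systems under review: the `prefix/suffix` PID shape, the class names, the group names, and the example.org URLs are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PidRecord:
    """One PID with several named aspects and a designated default."""
    default_aspect: str
    aspects: dict = field(default_factory=dict)  # aspect name -> location


class Resolver:
    """A single resolver holding PIDs for several groups, keyed by prefix."""

    def __init__(self):
        self._records = {}  # full PID -> PidRecord
        self._owners = {}   # prefix -> owning group

    def add_prefix(self, prefix, group):
        self._owners[prefix] = group

    def register(self, group, pid, record):
        # Each group may create PIDs only under a prefix it owns.
        prefix = pid.split("/", 1)[0]
        if self._owners.get(prefix) != group:
            raise PermissionError(f"{group} does not own prefix {prefix}")
        if record.default_aspect not in record.aspects:
            raise ValueError("default aspect must be one of the aspects")
        self._records[pid] = record

    def resolve(self, pid, aspect=None):
        # With no aspect requested, fall back to the default one.
        record = self._records[pid]
        return record.aspects[aspect or record.default_aspect]


resolver = Resolver()
resolver.add_prefix("coll-a", "group-a")
resolver.register("group-a", "coll-a/0001", PidRecord(
    default_aspect="view",
    aspects={"view": "https://example.org/a/0001",
             "metadata": "https://example.org/a/0001.xml"},
))
default_target = resolver.resolve("coll-a/0001")
metadata_target = resolver.resolve("coll-a/0001", "metadata")
```

Keeping ownership at the prefix level is one way to let groups maintain their own sets of PIDs without running separate resolvers.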

Methodology  (Adam S)

  • Overview of PID
  • PURL
  • ARK
  • Handle
  • OpenURL
  • OAI-PMH requirements

   
Recommendations
          (embed the rationale for each recommendation)

  • CNRI Handle System  (George)
  • We recommend that CUL undertake a proof-of-concept implementation of the CNRI Handle System. We hope to test the system's ability to meet all of our requirements and to gain insight and skill in the technical and organizational demands of maintaining a persistent identifier system. CNRI provides free handle server software and documentation. We envision a local system that can be administered in a distributed manner, so that caretakers of different collections can make autonomous decisions about the identifiers their collections use. We also recognize that a proof-of-concept system might prove to be of limited or no use in the long run. During the period of experimentation we would ensure that the digital objects we give identifiers to can be remapped to their original URLs or to some other useful identifier system. (Bill)
  • Resource Requirements -- we don't know what the requirements are (development? a standalone machine? server space? maintenance?)  (John)
  • Evaluate the system and its use some time in the future. (Rick)
  • A deliverable: A Usage Document that explains how the system can be integrated into CUL collection building  (Adam C)
  • We recommend that, embedded in mapping metadata, there be an explicit statement of the estimated lifespan of the PID and of the object it represents. This practice will add value to the identifier and help enforce a best practice in lifecycle management of our digital objects. (Bill)
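
As one possible reading of the lifespan recommendation above, the sketch below stores two estimated end dates in each mapping record and lists the PIDs whose estimates have passed. The field names, dates, and review rule are our assumptions for illustration only; no metadata schema has been chosen.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Mapping:
    """Mapping metadata for one PID (hypothetical field names)."""
    pid: str
    location: str
    pid_expected_until: date     # estimated end of the PID's lifespan
    object_expected_until: date  # estimated end of the object's lifespan


def due_for_review(mappings, today):
    """Return the PIDs whose PID or object lifespan estimate has passed."""
    return [m.pid for m in mappings
            if m.pid_expected_until < today or m.object_expected_until < today]


mappings = [
    Mapping("pid:0001", "/archive/a", date(2030, 1, 1), date(2030, 1, 1)),
    Mapping("pid:0002", "/archive/b", date(2010, 1, 1), date(2030, 1, 1)),
]
stale = due_for_review(mappings, today=date(2020, 6, 1))
```

Here only `pid:0002` is flagged, because its PID estimate ended before the review date. A periodic check of this kind is one way the lifespan statement could support lifecycle management.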