Goals
Note: Questionable goals are in red
For Identifiers (just the complete URI and local part, not about mechanisms)
- Identifiers must be permanent
- Identifiers must permanently resolve to a digital object
- Simplicity (of description and implementation)
- Support for vitality checking
- Needs to support multiple repositories
- The identifier itself needs to support two parts: a local namespace prefix and a (fairly) arbitrary identifier part
- Makes sense in context of VIVO, arXiv, OAIS (CUL), Voyager Catalog, and WorldCat
- The local part should not be Cornell-branded, so that it is compatible with multi-institution collections where Cornell branding is inappropriate. Generally, these identifiers should be branded to Cornell through the use of a Cornell-based DNS name for the full URI/resolver. It should also be possible to use the local part with a non-Cornell resolver for consortia and other non-Cornell-branded needs (cf. DOI, Handle; not our PURL)
- Identifiers should be short (so use an alphabet of the 26 letters plus the digits)
- Identifiers should be easy to copy by hand - separate every 4 characters with a dash
- It must be possible to support opaque identifiers
- Must be resolvable through a web browser
- Must be unique within our PID system without the DNS name portion of the URL
- Recommended that newly created local identifier namespaces use identifiers that include a check "digit" (so that typing errors are not likely to result in a valid identifier).
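The identifier goals above (prefix plus local part, letters-and-digits alphabet, a dash every 4 characters, a check "digit") can be sketched as follows. This is only an illustration under stated assumptions: the ":" separator between prefix and local part, the lowercase 36-symbol alphabet, the "demo" prefix, and the choice of the Luhn mod N algorithm for the check character are all hypothetical; none of them is specified in these goals.

```python
import string

# Assumed alphabet: 10 digits plus 26 lowercase letters (36 symbols).
ALPHABET = string.digits + string.ascii_lowercase
N = len(ALPHABET)


def _luhn_sum(chars: str, first_factor: int) -> int:
    """Luhn mod N sum, walking from the rightmost character leftwards."""
    total, factor = 0, first_factor
    for ch in reversed(chars):
        addend = factor * ALPHABET.index(ch)
        total += addend // N + addend % N
        factor = 1 if factor == 2 else 2
    return total


def check_char(body: str) -> str:
    """Compute the check character for a local part (Luhn mod N)."""
    return ALPHABET[(N - _luhn_sum(body, 2) % N) % N]


def is_valid(raw: str) -> bool:
    """True if the last character of raw is the correct check character."""
    return _luhn_sum(raw, 1) % N == 0


def make_identifier(prefix: str, body: str) -> str:
    """Append the check character and group the local part in fours."""
    raw = body + check_char(body)
    grouped = "-".join(raw[i:i + 4] for i in range(0, len(raw), 4))
    return f"{prefix}:{grouped}"  # ":" separator is an assumption


def parse_identifier(identifier: str) -> tuple[str, str]:
    """Split into (prefix, body), rejecting typos with a readable error."""
    prefix, _, grouped = identifier.partition(":")
    raw = grouped.replace("-", "").lower()
    if not raw or any(c not in ALPHABET for c in raw):
        raise ValueError(f"{identifier!r}: local part has invalid characters")
    if not is_valid(raw):
        raise ValueError(f"{identifier!r}: check character does not match")
    return prefix, raw[:-1]
```

With a 36-symbol alphabet, Luhn mod N detects every single-character substitution, which is the "typing errors are not likely to result in a valid identifier" property. Dashes are stripped before validation, so they aid copying by hand without being significant.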
For Resolver and System
- Supports billions of identifiers with very fast resolution
- Robust architecture and implementation - a highly available system
- Support for "private" identifiers (e.g. for dark archive or internal digital objects) (this is about metadata stored with id and facilities provided depending on it)
- Support for OAI-ORE structuring
- Need to avoid unbounded generation of surrogate persistent identifiers
- Should support multiple delivery formats for an identifier (does this mean doing content negotiation at the resolver?)
- Must support splitting collections (what does this mean?)
- Need a lightweight understanding of identifier equivalence
- Need a way to integrate outside PIDs with Cornell (what does this mean? examples?)
- The identifiers and the associated content should be easily discoverable by Google
- The overall system should integrate well with the "web architecture"
- Should have a PID corresponding to every Cornell NetID and potentially other non-digital resources, not necessarily at Cornell?
- Should give a useful error when the check digit is wrong.
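One possible reading of the "private" identifier and multiple-delivery-format goals above is a resolver that stores a small record per identifier and picks a representation from the request's Accept header. The sketch below is a guess at that shape, not a design decision: the registry contents, the example.org URLs, the `authorized` flag, and the choice of status codes (303, 403, 404, 406) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Record:
    # media type -> location of that representation (hypothetical values)
    formats: dict[str, str]
    private: bool = False


# Hypothetical in-memory registry; a real system would use a scalable store.
REGISTRY = {
    "demo:0001": Record({
        "text/html": "https://repo.example.org/items/1",
        "application/rdf+xml": "https://repo.example.org/items/1.rdf",
    }),
    "demo:0002": Record({"text/html": "https://dark.example.org/items/2"},
                        private=True),
}


def parse_accept(header: str) -> list[str]:
    """Return media types from an Accept header, highest q-value first."""
    ranked = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        if not fields[0]:
            continue
        q = 1.0
        for param in fields[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        ranked.append((q, fields[0]))
    ranked.sort(key=lambda item: item[0], reverse=True)  # stable sort
    return [media_type for q, media_type in ranked if q > 0]


def resolve(identifier: str, accept: str = "text/html",
            authorized: bool = False) -> tuple[int, str]:
    """Return (HTTP status, redirect location or error message)."""
    record = REGISTRY.get(identifier)
    if record is None:
        return 404, "unknown identifier"
    if record.private and not authorized:
        return 403, "identifier exists but is private"
    for media_type in parse_accept(accept):
        if media_type == "*/*":
            return 303, next(iter(record.formats.values()))
        if media_type in record.formats:
            return 303, record.formats[media_type]
    return 406, "no representation in the requested format"
```

For example, `resolve("demo:0001", accept="application/rdf+xml;q=0.9, text/html;q=0.5")` would redirect to the RDF location, while an unauthorized request for `demo:0002` is refused without revealing the target.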
Out of scope / Not goals
- Any extra characters for error correction (as an extension of any possible error detection)
- Attempt to create a fixed-length or fixed-syntax form (that would aid recognition but cost in flexibility/length)