h1. For Identifiers (just the complete URI and local part, not about mechanisms)

...




h4.{anchor:permanent} 
The PID system should provide long-term support for resolving identifiers. The hope is that groups at Cornell and outside of Cornell will be able to use these identifiers to refer to digital items, concepts, and other things in a stable manner. The goal is to reduce the cost of building systems at Cornell that refer to things created by other systems.

h4.{anchor:permanently resolve to a digital object}
Identifiers should be permanently resolvable via HTTP to a digital resource.

Many of the items that a PID resolver at CUL would index are digital objects that should remain fixed. However, ensuring the fixity of digital resources is not a goal of the PID resolver system, and the resolver should also work with items outside this category.

We may want identifiers for digital objects that change constantly. We may also want identifiers for digital objects that represent the state of non-digital resources.

h4.{anchor:URL} 
The PIDs should conform to the URL syntax defined in [RFC 1738|http://www.ietf.org/rfc/rfc1738.txt].
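A rough conformance check could look like the sketch below. It is only an illustration: the function name is hypothetical, accepting https alongside http is an assumption, and the path pattern is a simplified reading of the RFC 1738 HTTP path grammar rather than a complete validator.

```python
import re
from urllib.parse import urlsplit

# Characters RFC 1738 allows unescaped in an HTTP path segment, plus %XX escapes.
# Simplified reading of the grammar, for illustration only.
_PATH_RE = re.compile(r"^(?:/(?:[A-Za-z0-9$\-_.+!*'(),;:@&=]|%[0-9A-Fa-f]{2})*)*$")


def looks_like_http_url(pid: str) -> bool:
    """Return True if pid has an HTTP(S) scheme, a host, and a well-formed path."""
    parts = urlsplit(pid)
    if parts.scheme not in ("http", "https"):  # accepting https is an assumption
        return False
    if not parts.hostname:
        return False
    return bool(_PATH_RE.match(parts.path))
```

A check like this would reject an identifier containing a raw space, for example, while accepting the partially opaque form http://hostname.com/170/2a33-ffff.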

h4.{anchor:resolvable through a web browser} 
Since the PIDs are URLs, they can be used as the href attribute of an <a> element in HTML. When requested via HTTP GET, the response should be a digital object or a digital surrogate. A wrinkle in this goal is that many web browsers do not respect the MIME type of responses. A digital object such as a PDF might need the part after the final slash to end in .pdf to be interpreted as a PDF file by a browser. Some browsers will attempt to open a result as a PDF file on a GET of a URL like http://resolver.org/234/b233.pdf that has a reply status of 400.




h4.{anchor:unique within our PID system without the DNS name portion of the} 
If the ID schema includes a DNS name, then all PIDs in the system should be unique even with the DNS name removed. This goal is trivially satisfied if the PID system uses only one DNS name. If the system uses more than one DNS name, care must be taken to ensure that this goal is met.
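When more than one DNS name is in use, the uniqueness goal could be checked with something like the sketch below. The function names and the example hostnames in the usage note are hypothetical, and the sketch assumes that "the DNS name removed" means comparing only the path portion of each PID.

```python
from urllib.parse import urlsplit


def local_part(pid: str) -> str:
    """Return the PID with the scheme and DNS name removed (the path only)."""
    return urlsplit(pid).path


def find_collisions(pids):
    """Map each local part used by more than one PID to the PIDs that share it."""
    seen = {}
    for pid in pids:
        seen.setdefault(local_part(pid), set()).add(pid)
    return {local: sorted(group) for local, group in seen.items() if len(group) > 1}
```

For instance, registering both http://a.example/170/x and http://b.example/170/x would be reported as a collision on /170/x.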

h4.{anchor:works with existing systems} 
The PIDs should work with existing systems such as VIVO, arXiv, OAIS (CUL), the Voyager Catalog, and WorldCat.

h4.{anchor:Simplicity}
Please add comments about this goal.

h4.{anchor:support for a local namespace prefix and an identifier part} 
The hope is that the URL can be of the form http://hostname.com/localNamespacePrefix/identifierPart or something similar. The localNamespacePrefix could also be called a collectionPrefix. The resolver system should place no additional restrictions on the identifierPart beyond conforming to URL syntax.
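The prefix and identifier split could be sketched as follows. The names PidParts and split_pid are hypothetical, and the sketch assumes the prefix is the first path segment and everything after it is the identifier part, left untouched. Query strings are ignored here.

```python
from typing import NamedTuple
from urllib.parse import urlsplit


class PidParts(NamedTuple):
    hostname: str
    prefix: str       # localNamespacePrefix (or collectionPrefix)
    identifier: str   # identifierPart, not interpreted further


def split_pid(pid: str) -> PidParts:
    """Split a PID into hostname, namespace prefix, and identifier part."""
    parts = urlsplit(pid)
    # The first path segment is taken as the prefix; the rest is kept as-is,
    # including any further slashes.
    prefix, _, identifier = parts.path.lstrip("/").partition("/")
    if not parts.hostname or not prefix or not identifier:
        raise ValueError(f"not of the form http://hostname/prefix/identifier: {pid!r}")
    return PidParts(parts.hostname, prefix, identifier)
```

Because the identifier part is returned without further parsing, this split places no restrictions on it beyond what the URL syntax already requires.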


h4.{anchor:support for opaque identifiers} 
To avoid problems when the labels associated with resources change, the PIDs should support partially opaque identifiers such as http://hostname.com/170/2a33-ffff instead of http://hostname.com/SuperMegaCollection/WalterCarlos1. Since the resolver system places no additional restrictions on the identifier part, we cannot stop systems from requesting new identifiers with a syntax that they attach meaning to.

For example, a collection administrator might register the following identifiers:
http://resolver.cornell.edu/170/article23332 -> http://collectionX.cornell.edu/article/23332
http://resolver.cornell.edu/170/article23332.pdf -> http://collectionX.cornell.edu/article/23332?format=pdf
http://resolver.cornell.edu/170/article23332.tex -> http://collectionX.cornell.edu/article/23332?format=tex

The resolver system will not attempt to parse these identifiers and will not record or track relationships between identifiers.
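One way to picture this behavior is an exact-match lookup table, as in the sketch below. The REGISTRY dictionary and the resolve function are hypothetical; the entries are the example registrations above, and how the resolver would actually deliver the target over HTTP is not specified here.

```python
from typing import Optional

# Each PID is registered independently, keyed by the full identifier string.
REGISTRY = {
    "http://resolver.cornell.edu/170/article23332":
        "http://collectionX.cornell.edu/article/23332",
    "http://resolver.cornell.edu/170/article23332.pdf":
        "http://collectionX.cornell.edu/article/23332?format=pdf",
    "http://resolver.cornell.edu/170/article23332.tex":
        "http://collectionX.cornell.edu/article/23332?format=tex",
}


def resolve(pid: str) -> Optional[str]:
    """Exact-match lookup: suffixes are not parsed and entries are not related."""
    return REGISTRY.get(pid)
```

Under this sketch, a request for an unregistered variant such as article23332.html simply finds nothing, because the resolver does not infer it from the registered entries.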


h4.{anchor:Local part should not be Cornell branded} 
To encourage use of the PIDs across institutions, the PIDs should not be branded.

h4.{anchor:may be surrogate for physical object}
The digital resource returned by the PID resolver might not be the thing identified by the PID, but a surrogate for a resource that cannot be transported easily via HTTP.

h4.{anchor:should be short} 
Please add comments about this goal.  How short? Should there be a length limit (which implies a limit to the number of things that can be resolved)? 

h4.{anchor:should be easy to copy by hand} 
Please add comments about this goal.  Adding a dash after every 4 digits of the identifier may be a way to improve the ease of copying the PIDs.
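The dash grouping could be sketched as below. Both function names are hypothetical, and the sketch assumes the underlying identifier contains no dashes of its own, so that the grouping can be reversed before lookup.

```python
def group_identifier(identifier: str, size: int = 4) -> str:
    """Insert a dash after every `size` characters to ease copying by hand."""
    return "-".join(identifier[i:i + size] for i in range(0, len(identifier), size))


def ungroup_identifier(grouped: str) -> str:
    """Remove the dashes again (assumes the raw identifier has none of its own)."""
    return grouped.replace("-", "")
```

With the default group size, an identifier such as 2a33ffff0012 would be displayed as 2a33-ffff-0012.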


h1. For Resolver and System
h4.{anchor:Supports billions of identifiers} 
Please add comments about this goal.

h4.{anchor:Robust architecture} 
Please add comments about this goal.

h4.{anchor:Robust implementation}
Please add comments about this goal.

h4.{anchor:ability to request metadata about the identifier} 
Please add comments about this goal.

h4.{anchor:lightweight understanding of identifier equivalence} 
Please add comments about this goal.

h4.{anchor:should be easily discoverable by Google} 
Please add comments about this goal.

h4.{anchor:integrate well with the web architecture} 
This should be satisfied by earlier goals.

h4.{anchor:vitality checking}
Please add comments about this goal.

h4.{anchor:Need to avoid unbounded generation of surrogate persistent identifiers} 
Please add comments about this goal.

h4.{anchor:PID corresponding to every Cornell NetID} 
Please add comments about this goal.

h4.{anchor:useful error when check digit is wrong}
Please add comments about this goal.


h1. Governance Issues

h4.{anchor:Requires payment to external organization} 
Please add comments about this goal.

h4.{anchor:Can continue resolve IDs in absence of external organization} 
Please add comments about this goal.