...

If we were to seek funding for expanding arXiv's features to support OA mandates & IR-arXiv interoperability, we may be able to group the metadata overhaul, assigning DOIs for data, and CULAR work.

Interoperability of arXiv with other institutional and subject repositories. One important factor in our sustainability efforts is enabling interoperability among repositories with related and complementary content, which reduces duplicated effort and creates efficiencies. We will investigate the interoperability requirements for communication and exchange between arXiv and institutional repositories (for instance, pushing copies of papers published by a scientist to the repository of their home institution). We formed a MAB subcommittee to identify needs and to assess whether and how arXiv can provide such functionality. We will also continue to exchange information with the publishers and societies represented in arXiv, especially to explore issues such as version of record and linking pre-prints to the formally published version.

...

& Public Access Mandate Support  

  • Add metadata fields for funding information, article status and migration of old content - There have been several requests for support for additional metadata. These include work to add funding information (requests from supporting members), for the ability to store version information (author manuscript, publisher version, etc.), and for publication information of migrated content (mainly for conference proceedings in computer science). These changes will require extensions of our internal metadata format and handling in appropriate submission interfaces, admin interfaces, moderator screens, search systems, and data export facilities. It may well be appropriate to generalize our models/code in some places.
  • Add linkages to datasets in data repositories - Based on our experience with the Data Conservancy pilot (http://arxiv.org/help/data_conservancy), a loose coupling to external data repositories seems more likely to be sustainable than close collaboration. This also has the benefit of allowing arXiv to work with many repositories, so that users can choose the data repository that best matches their needs, their community's expectations, etc.
  • Support arXiv-IR interoperability
  • Create tools and facilities to better integrate with CS conferences - Joe Halpern, based on discussion of upload experiences with Martijn de Jongh (of AUAI), would like to create a client-side application to ease the upload of proceedings (or other collections) by reducing the amount of custom programming required to submit proceedings via the SWORD interface. Work is required to scope such a project, and we might consider the possibility of a Cornell MEng team.
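The additional metadata described in the first item of this list (funding information, article status, and publication information for migrated content) could be modeled roughly as follows. This is only a sketch to make the scope concrete: every class and field name here is hypothetical and does not reflect arXiv's internal metadata format.

```python
from dataclasses import dataclass, field, asdict
from enum import Enum
from typing import List, Optional


class ArticleStatus(str, Enum):
    """Hypothetical vocabulary for what a given arXiv version represents."""
    AUTHOR_MANUSCRIPT = "author_manuscript"
    PUBLISHER_VERSION = "publisher_version"


@dataclass
class FundingInfo:
    # Assumed fields; a real schema would follow whatever funders require.
    funder_name: str
    award_number: Optional[str] = None


@dataclass
class PublicationInfo:
    # Intended for migrated content such as conference proceedings.
    venue: str
    year: Optional[int] = None


@dataclass
class ExtendedMetadata:
    arxiv_id: str
    version: int
    status: Optional[ArticleStatus] = None
    funding: List[FundingInfo] = field(default_factory=list)
    publication: Optional[PublicationInfo] = None

    def to_export(self) -> dict:
        """Flatten to a plain dict, as a data export facility might need."""
        record = asdict(self)
        record["status"] = self.status.value if self.status else None
        return record
```

Keeping the new fields optional means existing records remain valid without a bulk migration, which is one reason to prefer extending the model over replacing it.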

Modernize the User Interface and Alerting System:

  • Search ....
  • Replace and improve alerting system - Replace the email alert system to allow easy subscribe/unsubscribe via a web interface tied to user accounts, ensure scalability, and allow customization. The current code is very old and hard to maintain; the bulk of it should be rewritten.

The item "Create a client-side application to ease the bulk-submissions" was #14 (lowest) on the 2015 MAB+SAB prioritization questionnaire. More work is needed to scope the project. It is a good candidate for external funding, and Joe would likely be a great help in finding such funding.

Transfer arxiv.org domain registration from Paul Ginsparg to library - Working with MAB members to understand needs.

Was in 2006 MOU. 2011-04-06: Paul agreed to transfer domain name. Need to work through Paul's Network Solutions account to effect transfer. This is an important issue but not something we should draw public attention to. Simply requires Paul's engagement.

...

  • interface

...


...

...

  • Assign DOIs to data - We accept data as ancillary files (http://arxiv.org/help/ancillary_files) but offer relatively little support. It would be more helpful to assign DataCite DOIs from EZID to ancillary files, thus making them citable.

...


...

...

  • Ingest arXiv content into CUL Archival Repository - While arXiv adopts good practices for data backup and management, it is far from being an archival collection. As we increase our collaboration with other repositories and consider supporting public access mandates, we need to strengthen our preservation strategies. Work is required to script the creation of submission packages (SIPs) for the initial ingest (and regular incremental updates) of arXiv content to CULAR (Cornell University Library Archival Repository). Also, we'd like to explore the need for additional archival strategies (e.g., working with Portico or LOCKSS).
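To give a sense of what scripting SIP creation might involve, here is a minimal sketch that copies an article's files into a package directory and writes a checksum manifest. The package layout (a `data/` directory plus `manifest-sha256.txt`) and the function names are assumptions chosen for illustration; CULAR's actual packaging requirements would need to be confirmed before any real work.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_sip(source_dir: Path, sip_dir: Path) -> Path:
    """Copy every file under source_dir into sip_dir/data and write a manifest.

    Hypothetical layout: one manifest line per file, "<sha256>  data/<path>".
    """
    data_dir = sip_dir / "data"
    # Refuse to reuse an existing package directory.
    data_dir.mkdir(parents=True, exist_ok=False)
    lines = []
    for src in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        rel = src.relative_to(source_dir)
        dest = data_dir / rel
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)
        # Checksum the copy, so the manifest describes what is in the package.
        lines.append(f"{sha256_of(dest)}  data/{rel.as_posix()}")
    manifest = sip_dir / "manifest-sha256.txt"
    manifest.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return manifest
```

Incremental updates could reuse the same function on only the articles changed since the last run, with each run producing its own package directory.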

Modernize the User Interface and Alerting System:

Was #12 on 2015 MAB+SAB prioritization questionnaire.

Add a store of "first processed" versions of arXiv articles, in order to keep an archival copy of the processed version the submitter saw and to be able to understand any changes due to later reprocessing - The majority of arXiv submissions are made as TeX/LaTeX source files, which arXiv then processes to produce PDF and other readable formats. While we take every care to maintain our TeX system in a way that will reliably reprocess old submissions, we currently have no facility to keep a copy of the first processed version. Such a store, accessible only to admins in the first instance, should be added in parallel to the usual cache of processed versions (/cache/ps_cache). This store should be populated by a script which ensures that an existing processed version for a particular article-version is never overwritten by a new one.
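The never-overwrite requirement for the population script could be met along these lines. This is a sketch under assumptions: the store layout (`<store_root>/<article_id>/v<version>/<filename>`) and the function name are hypothetical, and nothing here reflects how /cache/ps_cache is actually organized.

```python
import os
import shutil
from pathlib import Path


def store_first_processed(processed_file: Path, store_root: Path,
                          article_id: str, version: int) -> bool:
    """Copy processed_file into the store unless a copy already exists.

    Returns True if the file was stored, False if an earlier copy was
    found and left untouched.
    """
    target_dir = store_root / article_id / f"v{version}"
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / processed_file.name
    try:
        # O_EXCL makes creation fail if the target already exists, so two
        # concurrent runs cannot both write the same article-version.
        fd = os.open(target, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o444)
    except FileExistsError:
        return False
    with os.fdopen(fd, "wb") as out, processed_file.open("rb") as src:
        shutil.copyfileobj(src, out)
    return True
```

Relying on exclusive creation, rather than checking for existence and then writing, avoids a race between the check and the write. A production version would likely also need to handle a copy interrupted partway through, for example by writing to a temporary name first.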

...

  • Modernize the search interface, add facets, include author identifiers - The arXiv search interface could be improved to follow current best practices using facets and better result ordering.

A good example is the recent work done on Project Euclid (https://projecteuclid.org/search). We should consider whether the scope of this work would also include revision of the arXiv API (http://arxiv.org/help/api), which relies upon the search system.

...
