Introduction to arXiv

Started in August 1991 by Paul Ginsparg, arXiv.org is internationally acknowledged as a pioneering digital archive and open access distribution service for research articles. The e-print repository, which moved to Cornell University in 2001, has transformed the scholarly communication infrastructure of multiple fields of physics and plays an increasingly prominent role in mathematics, computer science, quantitative biology, quantitative finance, and statistics. Many researchers worldwide rely on arXiv as an essential component of scientific communication: it lets them rapidly and widely disseminate their findings, establish priority of their discoveries, and seek feedback to help improve their work. arXiv has an international scope, with submissions and readership from around the world, and collaborations with U.S. and foreign professional societies and other international organizations. Please see the recent arXiv update for additional information.

The sustainability model lays out a business model for arXiv including anticipated expenses, potential revenue streams, value propositions, and communication strategies. The plan entails only the regular operation of arXiv—in other words, what we call “keeping the lights on.” It should be seen as a baseline operational budget, as it does not factor in additional expenses required for R&D or new development projects such as the next-gen arXiv (arXiv-NG) initiative.  

What is arXiv's Financial Model for 2018-2022?

Since 2010, Cornell’s sustainability planning initiative has aimed to reduce arXiv’s financial burden and dependence on a single institution, instead creating a broad-based, community-supported resource. We remain grateful for the support from the Simons Foundation that encouraged long-term community support by lowering arXiv membership fees and making participation affordable to a broad range of institutions. This model aims to ensure that the ultimate responsibility for sustaining arXiv remains with the research communities and institutions that benefit from the service most directly.

arXiv’s funding and governance for the current operation (Classic arXiv) are based on a membership program engaging libraries and research laboratories worldwide that represent the repository’s heaviest institutional users. The financial model for 2018–2022 entails four sources of revenue:

What is arXiv-NG, and how is it funded?

From the users’ perspective, arXiv continues to be a successful, prominent subject repository system serving the needs of many scientists around the world. However, under the hood, the service is facing significant pressures. With significant input from the Scientific Advisory Board (SAB) and the Member Advisory Board (MAB) during the 2015 annual meetings, the arXiv team concluded that, in addition to the current business model with its focus on maintenance, it needs to embark on a significant fundraising effort, pursuing grants and collaborations. Based on the conclusions of 10 months of planning activities in 2016, the arXiv team secured a $450,000 grant from the Sloan Foundation to initiate the next-generation arXiv (arXiv-NG) initiative. The 18-month project will enable us to plan and begin to improve the architecture of this critical scientific infrastructure. We anticipate that the multi-phase design and development of arXiv-NG will require approximately three years and an additional budget of $2 million or more, covering requirements specification, evaluation of alternative strategies and partnerships, design of a new system architecture, assessment, and deployment. As we develop arXiv-NG during the next three years, we will continue to rely on the existing system (Classic arXiv) and are committed to continuing its robust services.

Another source of funding in support of arXiv-NG came from the Allen Institute for Artificial Intelligence (AI2). An initial gift of $200,000 for 2017 will support a collaboration between Cornell University and Cornell Computing and Information Science (CIS). This donation will allow us to hire a Research and Innovation Fellow to collaborate with the arXiv team in designing and integrating a series of updated, research-oriented features for arXiv. The ultimate goal is to integrate tools emerging from research into the production system to improve the user and moderator experience.

In addition to supporting Classic arXiv through the membership model, Cornell University will continue to raise funds through grant proposals to agencies and foundations in order to fund the arXiv-NG initiative. Development reserve funds will also be drawn on as needed. Please see the recent arXiv update for additional information.

 

What is the tier model for 2018–2022?

 

| Tier | Previous Model Fee | Current Fee |
|---|---:|---:|
| Tier 1: 1–25 | $3,000 | $4,400 |
| Tier 2: 26–50 | $3,000 | $3,800 |
| Tier 3: 51–100 | $2,500 | $3,200 |
| Tier 4: 101–150 | $2,000 | $2,500 |
| Tier 5: 151–200 | $1,500 | $1,800 |
| Tier 6: 201+ | $1,500 | $1,000 |

 

Note: The goal of Tier 6 is to provide an equitable and affordable fee to allow participation. Although we focus our membership drive on the top 200 institutions, we would like to encourage other libraries to contribute.
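The tier table can be read as a simple lookup from an institution's usage rank to its fee. The sketch below is only an illustration of that reading: it assumes the rank is a 1-based position in the download-based ranking of institutions, and the names `TIER_UPPER_BOUNDS`, `TIER_FEES`, `tier_for_rank`, and `fee_for_rank` are hypothetical, not part of any arXiv system.

```python
from bisect import bisect_left

# Highest rank included in Tiers 1-5; any rank above 200 falls into Tier 6.
TIER_UPPER_BOUNDS = [25, 50, 100, 150, 200]

# "Current Fee" column for Tiers 1-6, in US dollars.
TIER_FEES = [4400, 3800, 3200, 2500, 1800, 1000]


def tier_for_rank(rank: int) -> int:
    """Return the tier number (1-6) for a 1-based usage rank (assumed input)."""
    if rank < 1:
        raise ValueError("rank must be 1 or greater")
    # bisect_left finds the first tier whose upper bound is >= rank.
    return bisect_left(TIER_UPPER_BOUNDS, rank) + 1


def fee_for_rank(rank: int) -> int:
    """Return the current fee in US dollars for a 1-based usage rank."""
    return TIER_FEES[tier_for_rank(rank) - 1]


if __name__ == "__main__":
    for example_rank in (1, 25, 26, 150, 201):
        print(example_rank, tier_for_rank(example_rank), fee_for_rank(example_rank))
```

Keeping the tier boundaries and fees in two parallel lists means a future fee revision would only touch the data, not the lookup logic.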

What is the purpose of the membership tier and fee revision?

arXiv has not adjusted its member fees, which currently represent about 45% of its operating income, since they were initially set in 2012. The rationale for increasing the arXiv membership fees is as follows:

What is the value proposition of the arXiv Institutional Membership Program for libraries?

User feedback tells us that arXiv continues to be the most successful and prominent subject repository system serving the needs of scientists worldwide. A user survey was conducted in April 2016 to seek input from the global user community about the current services and future directions. We were heartened to receive 36,000 responses, representing arXiv’s diverse community. The prevailing message is that users are happy with the service as it currently stands: 95% of survey respondents said that they are very satisfied or satisfied with arXiv. Furthermore, 72% of respondents indicated that arXiv should focus on its main purpose, which is to quickly make scientific papers available, and that this will be enough to sustain the value of arXiv in the future (see the survey findings).

As you assess the value proposition of supporting arXiv, please bear in mind that a Tier 2 contribution of $3,800 is roughly equivalent to the subscription cost of one highly ranked physics journal or to one article processing charge (APC). For comparison, the APC is $5,200 for Nature Communications, $2,080 for New Journal of Physics, and $2,900 for Physical Review X, and the annual subscription cost of Journal of Physics D: Applied Physics is $5,740. arXiv ranks #1 in the Ranking Web of Repositories, and arXiv sections rank as five of the top ten publications in Google Scholar’s Top Publications in Physics and Mathematics. Aside from the knowledge that you are part of a shared investment in a culturally embedded resource providing unambiguous value to a global network of scientific researchers, here are some other reasons we would like you to become members:

Have you considered basing the membership model on submissions instead of downloads?

One of the reasons for using download statistics as the basis of the arXiv tier model is the difficulty of tracking submissions from a particular institution to arXiv. The author metadata arXiv collects is not sufficiently consistent to support systematic submission analysis. Manual analysis of a single month of submissions in 2013 indicated a similar ranking in submission- and download-based data. During the arXiv-NG initiative, arXiv expects to undertake metadata remediation to improve the authorship data for existing submissions (including full ORCID implementation).

In the current version of arXiv, it would be difficult to support a reliable submission-based tier model. While we do require submitters to register their institutional affiliation when they register an account with us, and most submitters are authors of the papers they submit (though not necessarily first authors), the relationship between the submitter and the submission’s author metadata is still rather weak. We allow submitters to supply affiliation information in the author metadata, but it is neither enforced nor widely used. This makes it especially difficult to report on the institutional affiliation of co-authors. To implement a submission-based tier model, we will need, as a start, to expand arXiv’s data model to integrate better with ORCIDs. In some cases (such as large collaborations), it would still be difficult to expect submitters to provide complete authorship and affiliation data; here it would be helpful to have tools that extract this information from the full text automatically.

 

What are the expenses and revenue sources for 2018–2022?

The draft budget factors in support only for the routine, daily arXiv operation. It does not include resources needed for significant improvements, emergency interventions (such as server failure), special projects, or research and innovation. This budget does not cover the nascent arXiv-NG initiative, because that new development work needs to be supported by an additional fundraising effort. As we transition from Classic arXiv (the current system) to arXiv-NG in approximately 2020, the annual expenses will need to be adjusted to take into account the staffing needs of the new architecture.

 

 

| arXiv Summary Budget | 2017 | 2018 | 2019 | 2020 | 2021 | 2022 |
|---|---:|---:|---:|---:|---:|---:|
| Revenue | | | | | | |
| Member contributions | $400,000 | $500,400 | $500,400 | $500,400 | $500,400 | $500,400 |
| Simons Foundation Annual Commitment | $100,000 | $100,000 | $100,000 | $100,000 | $100,000 | $100,000 |
| Simons Foundation Matching Fund | $300,000 | $300,000 | $300,000 | $300,000 | $300,000 | $300,000 |
| Cornell Direct Contribution & Leadership (note 1) | $75,000 | $170,000 | $170,000 | $175,000 | $175,000 | $180,000 |
| Online donation program & gifts | $50,000 | $50,000 | $50,000 | $50,000 | $50,000 | $50,000 |
| Total Revenue | $925,000 | $1,120,400 | $1,120,400 | $1,125,400 | $1,125,400 | $1,130,400 |
| Expenses | | | | | | |
| Direct Expenses | $925,000 | $1,044,600 | $1,066,103 | $1,097,551 | $1,124,792 | $1,157,851 |
| Indirect Expenses | | | | | | |
| College and dept administration and staff support (note 2) | $240,500 | $271,596 | $277,187 | $285,363 | $292,446 | $301,041 |
| Facilities (note 3) | $101,750 | $114,906 | $117,271 | $120,731 | $123,727 | $127,364 |
| Total Indirect Expenses | $342,250 | $386,502 | $394,458 | $406,094 | $416,173 | $428,405 |
| Contribution to Reserve | $0 | $75,800 | $54,298 | $27,849 | $608 | -$27,451 |


Notes:

  1. In 2018, Cornell’s direct financial contribution will increase from $75,000 to $170,000, including the salary of the Program Director (0.40 FTE), as a bridging strategy until we identify additional revenue sources. The MAB recommended that the Program Director's salary be considered a direct cost due to the position's prominent and necessary role. Prior to 2018, this position was included with indirect costs.
  2. University-negotiated indirect cost rate for administration & staff support (HR, finance, facility & IT); based on direct costs (less Cornell in-kind staff contribution) at 26%. 
  3. University-negotiated indirect cost rate for maintenance, custodial & other facility-related costs; based on direct costs (less Cornell in-kind staff contribution) at 11%.
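
The indirect expense lines and the reserve line in the summary budget can be reproduced with a few lines of arithmetic. The sketch below is a reader-side check, not the budget's actual calculation method. It makes two assumptions: that the 26% and 11% rates are applied to the "Direct Expenses" line as published (the published figures match this, although notes 2 and 3 describe the base as direct costs less Cornell's in-kind staff contribution), and that the contribution to reserve equals total revenue minus direct expenses. All function and variable names are illustrative.

```python
# Figures copied from the summary budget, in US dollars.
TOTAL_REVENUE = {
    2017: 925_000, 2018: 1_120_400, 2019: 1_120_400,
    2020: 1_125_400, 2021: 1_125_400, 2022: 1_130_400,
}
DIRECT_EXPENSES = {
    2017: 925_000, 2018: 1_044_600, 2019: 1_066_103,
    2020: 1_097_551, 2021: 1_124_792, 2022: 1_157_851,
}

# Indirect cost rates from notes 2 and 3, as whole percentages.
ADMIN_RATE_PERCENT = 26
FACILITIES_RATE_PERCENT = 11


def percent_of(amount: int, rate_percent: int) -> int:
    """Return rate_percent % of amount, rounded half up to whole dollars."""
    # Integer arithmetic avoids floating-point rounding surprises.
    return (amount * rate_percent + 50) // 100


def indirect_expenses(year: int) -> tuple[int, int, int]:
    """Return (administration, facilities, total) indirect expenses for a year."""
    direct = DIRECT_EXPENSES[year]
    admin = percent_of(direct, ADMIN_RATE_PERCENT)
    facilities = percent_of(direct, FACILITIES_RATE_PERCENT)
    return admin, facilities, admin + facilities


def reserve_contribution(year: int) -> int:
    """Return total revenue minus direct expenses (assumed definition)."""
    return TOTAL_REVENUE[year] - DIRECT_EXPENSES[year]


if __name__ == "__main__":
    for year in sorted(DIRECT_EXPENSES):
        print(year, indirect_expenses(year), reserve_contribution(year))
```

Under these assumptions the computed indirect expenses match the published table for every year. The computed reserve contribution also matches, except for 2019, where this calculation gives $54,297 against the published $54,298; the one-dollar difference may come from rounding in the underlying figures.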

 

What is the staffing configuration for Classic arXiv?  

 

| Position Description | Staff | 2018 | 2019 | 2020 | 2021 | 2022 |
|---|---|---:|---:|---:|---:|---:|
| User support (management + staff + student) | 1 FTE Entwood, 1 FTE Weiskoff, 1 FTE Salguero, 0.5 FTE Goldweber, 0.6 FTE students | 4.1 | 4.1 | 4.1 | 4.1 | 4.1 |
| IT (management + staff) | 1 FTE Lessmeister, 1 FTE Caruso, 0.5 FTE Fielding, 0.5 FTE Perez, 0.5 FTE Barker (CAC) | 3.5 | 3.5 | 3.5 | 3.5 | 3.5 |
| Leadership: Program Director, Scientific Director, CTO | 0.4 FTE Rieger, 0.4 FTE Scientific Director, 0.3 FTE Payette | 1.1 | 1.1 | 1.1 | 1.1 | 1.1 |
| Membership, Finance, Policy (included in indirect) | 0.2 FTE McLaren, 0.1 FTE Steinhart, 0.15 FTE Bolduc | 0.45 | 0.45 | 0.45 | 0.45 | 0.45 |
| Total FTE | | 9.15 | 9.15 | 9.15 | 9.15 | 9.15 |

 

Green indicates staff lines that are not included in the budget as they are considered Cornell’s contributions.

Purple indicates a line that is assumed by Cornell (in addition to indirects); however, it would be reconsidered as we secure additional income (e.g., add a Society membership tier).

Please see the recent arXiv update for up-to-date information.

What funds are generated through the online giving campaigns?

arXiv has run three week-long online giving campaigns to date: the first, in September 2015, generated about $16,200 in donations; the second, in June–July 2016, about $31,475; and the third, in February–March 2017, about $31,700. We aim to raise at least $50,000 per year through online giving campaigns. A secondary goal of these campaigns is to remind arXiv users that although arXiv.org is free to use, it is not free to produce.