Oya Y. Rieger, arXiv Program Director, Cornell University Library, June 2016

Acknowledgements: Many individuals were involved in designing and testing the survey and helped out with the data analysis. Special thanks go to Deborah Cooper, Andrea Cruz, Jim Entwood, Martin Lessmeister, Leah McEwen, Chloe McLaren, Chris Myers, Vandana Shah, Gail Steinhart, Simeon Warner, and Jake Weiskoff. Also, we are grateful for the guidance from the arXiv’s Member Advisory Board and Scientific Advisory Board.

 

EXECUTIVE SUMMARY

 

As part of its 25th anniversary vision-setting process [insert link], the arXiv team at Cornell University Library conducted a user survey in April 2016 to seek input from the global user community about the current services and future directions. We were heartened to receive 36,000 responses, representing arXiv’s diverse community (see Appendix A). The prevailing message is that users are happy with the service as it currently stands: 95% of survey respondents said that they are very satisfied or satisfied with arXiv. Furthermore, 72% of respondents indicated that arXiv should focus on its main purpose, which is to quickly make scientific papers available, and that this will be enough to sustain the value of arXiv in the future. This theme was pervasively reflected in the open text comments. A significant number of respondents suggested keeping to the core mission and enabling arXiv’s partners and related service providers to continue to build new services and innovations on top of arXiv.

 

Many of the comments reflected deep satisfaction with and gratitude for arXiv. Several users referred to the significance of the service for their personal career development and expressed thanks for its continued existence; for example, a typical comment was: “Thanks for the hard work of many people over the years. My work life would be very different without your efforts.” arXiv also received many plaudits for advancing the dissemination of research through the open access system. One user referred to the service as, “a beacon for scientific communication.” Several commenters expressed how crucial arXiv has been for them personally in being able to quickly access the latest research in their field. There was an overall perception that arXiv was an important leader in the development of alternatives to traditional publishing. Independent researchers who are unaffiliated with large institutions and who might otherwise have delayed access to papers particularly emphasized the importance of arXiv for their work.

 

The combination of multiple choice responses and the extensive and thoughtful open text comments pinpointed areas that need to be upgraded and enhanced. Improving the search function emerged as a top priority, as users expressed a great deal of frustration with the limited search capabilities currently available, especially in author searches. Providing better support for submitting and linking research data, code, slides and other materials associated with papers emerged as another important service to expand. Regardless of their subject area, users agreed on the importance of continuing to implement quality control measures, including checking for text overlap, classifying submissions correctly, rejecting papers that have little scientific value, and asking authors to fix format-related problems. Several users commented on the need to randomize the order of new papers in announcements and mailings. There were several useful remarks about the need to improve the endorsement system and provide more information about the moderation process.

 

In regard to arXiv’s role in scientific publishing, some users encouraged the arXiv team to think boldly and further advance open access (non-traditional publishing) by adding features such as peer review and encouraging overlay journals.  On the other hand, many users strongly emphasized the importance of sticking to the main mission and not getting side-tracked into formal publishing. There was a similar divergence of opinion about encouraging an open review process by adding rating and annotation features. When it comes to adding new features to arXiv to facilitate open science, the prevailing opinion was that any such features need to be implemented very carefully and systematically without jeopardizing arXiv’s core values. 

 

While many respondents took the time to suggest future improvements or the finessing of current services, a distinct group of users were strident in their opposition to any changes. Throughout all of the suggestions and regardless of the topic, commenters unanimously urged vigilance when approaching any changes and cautioned against turning arXiv into a “social media” style platform. The feeling is that arXiv as it exists is working well and while there are some areas for improvement, too much change could potentially weaken the effectiveness and overall mission of arXiv.

 

KEY FINDINGS

 

Improving the Current arXiv Services

 

  • When asked about the importance of improving a specific range of services, over 70% of respondents said that improving search functions to allow more refined results was very important/important. This held across all groups by years of use, age, number of articles published, country, and subject area. Many commenters requested enhanced functions such as author search, date-limited searching, and better support for searching in non-English languages. Search was equally problematic regardless of whether the user was searching for a known paper, browsing a subject category, or looking for specific authors.
  • A series of questions asked users about improving the submission process, specifically (1) adding support for submitting research data, code, slides and other materials, (2) improving support for linking research data, code, slides, etc., with a paper, and (3) updating the TeX engine and various other enhancements. Respondents were strongly in favor of such enhancements, with about 40% rating each one as very important/important. The open text responses also displayed considerable interest in better support for supplemental materials, although respondents disagreed as to whether they should be hosted by arXiv or another party. Many respondents were supportive of integrating or linking to other services (especially GitHub), while a significant number also expressed concerns about long-term availability and link rot for content not hosted within arXiv. Some expressed concerns regarding the resources arXiv would require to do a better job with this. A minority expressed specific interest in including the data that underlie figures in arXiv papers.
  • Among other services and improvements recommended by respondents are:
    • Consistent inclusion of information and links about the published versions of the papers
    • More refined options for alerting, both email and RSS. Several respondents specifically requested email alerts for works by a particular author, and there was some interest in HTML-formatted email with live links.
    • The need to update and keep updated arXiv’s TeX engine and provide TeX templates or style files to make submission easier.
    • Linking papers to each other via citations and actionable links in bibliographies came up frequently.
    • Ability to submit a PDF, an increase in the file size limit (often with specific request to link to figures), and the ability to upload multiple files at once.
    • There were some requests for allowing submission directly from authoring platforms (such as Overleaf, Authorea).
    • A much larger percentage of recent arXiv users (5 years or less) selected the “no opinion” option about current service improvements. The same trend is visible for all the questions in this category: a higher percentage of recent users expressed no opinion, and this percentage decreased with each increase in years of use. Interestingly, the same trend is not visible by age group, i.e., our data do not show that a higher percentage of younger users have no opinion.

 

Importance of Quality Control Measures

 

  • arXiv’s users were asked a series of questions regarding quality control measures. Based on the 26,430 responses to specific controls, the most important of these (ranked very important/important) are:
    • Check papers for text overlap, i.e., plagiarism: 77%
    • Make sure submissions are correctly classified: 64%
    • Reject papers with no scientific value: 60%
    • Reject papers with self-plagiarism: 58%

 

  • A large percentage of all demographic groups found checking for plagiarism to be important, and a slightly smaller percentage found checking for self-plagiarism to be important. There was no discernible difference across demographic groups for the other measures. Self-plagiarism was also mentioned as another area for improvement. Some noted that context is key; for example, conference papers are a common and typical area where self-plagiarism could occur in an otherwise scientifically sound submission.
  • Several respondents said that they were unaware of exactly what quality control measures were already in place and felt that the process is too opaque. Others acknowledged the difficult balance between rejecting papers that are clearly unworthy—“crackpot”—and rejecting papers for other, perhaps less obvious, and anonymized reasons. However, even in the face of such criticisms there was a strong thread of satisfaction with arXiv’s current quality control process and users cautioned against going too far in the other direction.
  • Some users would prefer that arXiv embrace a more open peer review and/or moderation process, while others were adamant that current controls allow arXiv the freedom and speed of access that is otherwise unobtainable through traditional publishing.
  • Overall, the feeling was that quality control matters but user comments vary greatly in relation to how arXiv could practically achieve these goals. As one respondent wrote, “Judgment about quality control is a very relative issue.”

 

 

Adding New Subject Categories

 

  • 73% of the respondents are not interested in seeing additional subject categories added to arXiv. 26% of respondents would like to see new subject categories added and suggested chemistry (881), engineering (483), biology (429), economics (248), philosophy (220), and social sciences (106). There were also several smaller categories such as Machine Learning (82 responses) and Artificial Intelligence (27 responses).
  • A frequently repeated theme was that arXiv does not need to focus particularly on additional subjects but instead should focus on the refinement and addition of subfields and subcategories, especially in hep-th as well as mathematics.

 

Developing New Services

 

  • Users were asked to rate a range of proposed new services for arXiv. In the ranked responses, more than 63% of users rated adding direct links to papers in the references (reference extraction) as very important/important. Citation export in formats such as BibTeX and RIS was rated as very important/important by over 57% of users, and extraction of the BibTeX entry for the arXiv citation was similarly rated by more than 55% of respondents. Citation analysis tools in general were ranked as very important/important by almost 53% of respondents.
  • In the open text comments, opinions were divided on the need for enhanced citation analysis capabilities. While users were generally in favor of citation tools, many of the same users noted that other systems are already doing this, and that this was sufficient for their needs.
  • In the multiple choice survey responses the option to “offer a rating system so readers can recommend arXiv papers that they find valuable” was closely split between very important/important (36%) and not important/should not be doing this (36%). This matches the way the comments were closely split between those in favor and those less certain. Also, it was found valuable by 50% of recent users as compared to 28% of seasoned users. In addition, a larger percentage of younger users find it important (42% of those under 30 years), as compared to 28% of those 60 and above.  Opinions were divided in the open text comments but overall the responses were hesitant about the idea. Some users liked the rating feature “in an ideal world” setting, but did not think it was appropriate for arXiv; others expressed that it would dilute the mission of arXiv or simply appears unfeasible in arXiv’s current incarnation. However, even users directly in favor of a rating system raised issues about whether it would be open to the public, rated by peers, anonymous, etc.  Several respondents stressed that such a feature will need to be implemented very carefully.
  • Like the question about offering a rating system, the idea of adding an annotation feature to allow readers to comment on papers was almost evenly split, with 34.89% of users ranking it as very important/important and 34.08% as not important/should not be doing this. In the open text responses, the trend opposed the idea, and some of the responses reflected strongly negative feelings. Those in favor of or open to the idea of a commenting system often added a caveat, and in general there was a sense of caution even among those responding positively. A common theme of concern was that a moderated system and verifiable accounts would be necessary to prevent a free-for-all. Unlike the question about offering a rating system, there were no discernible differences in opinion based on different demographic characteristics.

 

Finding arXiv Papers

 

  • The vast majority of arXiv’s users access the papers directly from the homepage (79%), followed by using Google to search (50%) and Google Scholar (35%).  
  • Once on the homepage, reactions were mixed regarding the ease of use and navigation: 32% rated it as easy, 25% as somewhat easy, and 21.6% as somewhat difficult to use.
  • To discover content 63% of users go to the link for new or recent under a particular category and equally 63% of users use arXiv’s search engine and enter a specific arXiv ID, author name or search term. A small number of users, 14%, rely on the daily mailing list and then look for a particular article in the search field.
  • In the open text comments, opinion was divided about the user interface. The majority of respondents disliked the outdated style, but a definite subgroup appreciated the interface’s simplicity, which these users feel helps arXiv efficiently carry out its mission. The main issues mentioned besides the homepage’s look were the number of links, the layout, and finding submission information. The lack of hierarchy in the organization made arXiv’s navigation challenging to understand.
  • Requests for enhancements related to UX included greater personalization of arXiv for readers; for example, the ability to “favorite” papers, curate a personal library, and see recommendations when users visit the site. Other users mentioned the development of APIs to further facilitate the development of overlay journals. Some users also suggested the development of a mobile-friendly version.
  • Many commenters either described how they rely on other services to interact with arXiv content (site-specific searches, ADS, INSPIRE) or recommended features based on their experience with other information systems. Those frequently mentioned with praise were ADS, INSPIRE, Google Scholar, gitxiv.com and arxiv-sanity.com.

 

 

 

APPENDIX A: DEMOGRAPHICS OF RESPONDENTS

 

Q1 - I use arXiv in the following ways: (Please choose all that apply)

 

Answer | % | Count
I am an arXiv reader | 93% | 31862
I am an arXiv author | 53% | 18270
I am an arXiv submitter | 50% | 17189
I am an arXiv (other type of user): Please describe | 2% | 845

 

 

 

Q2 - The number of articles I have published/submitted on arXiv is:

Answer | % | Count
1 article | 11.99% | 2570
2 articles | 8.96% | 1920
3 - 4 articles | 15.19% | 3254
5 - 10 articles | 23.06% | 4941
More than 10 articles | 40.80% | 8743
Total | 100% | 21428



Q3 - My current occupation is:  (Please choose ALL that apply)

Answer | % | Count
I am an academic faculty member (professor) at a college or university | 27% | 8868
I am an academic staff member (researcher or postdoc) at a college or university | 22% | 7207
I am a researcher at a non-profit or governmental agency | 8% | 2707
I am a Masters/Ph.D. student | 30% | 9890
I am an undergraduate student | 5% | 1514
I am (please describe) | 13% | 4353

 

13% of respondents (4353) indicated a different occupation category. The top ones included researchers at a company or in industry (900), engineers (515), and retired individuals (478). There were also respondents who described themselves as science writers, editors, and freelance editors. Other response types included data scientists, self-described amateur researchers, self-described laypeople, the unemployed, teachers, and the generally curious (e.g., “a man doing research as hobby”).

 

 

 

 

 

 

 

 

 


 

Q4 - As a user, my main subject area of interest in arXiv is: (please choose all that apply)

 

Almost 2000 respondents checked the Other option to specify their main area of interest.  The top categories were astrophysics (726) and astronomy (653).

 

 

Q5 - I have been using arXiv for:

Answer | % | Count
0 - 2 years | 19.54% | 6470
3 - 5 years | 28.96% | 9592
6 - 10 years | 25.44% | 8425
11 or more years | 26.06% | 8632
Total | 100% | 33119

 

 

 

Q50 - My age is:

Answer | % | Count
younger than 30 years | 37.42% | 12364
30 - 39 years | 31.27% | 10332
40 - 49 years | 13.76% | 4545
50 - 59 years | 9.30% | 3073
60 - 69 years | 5.77% | 1908
70 years and over | 2.47% | 817
Total | 100% | 33039

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Q6 - My main place of work is located in:

Total 31,255 responses

 

 

Other Countries: 1% or less representation each from 113 countries

 

 

 

 

 
