Search version 0.1 (Released)

Start date:  ; release date:  

Source code:

Replaces the classic Lucene-backed search interface with a new Flask-based interface backed by Elasticsearch.

Features & Improvements

  • Elasticsearch. The most significant change in this release is the move from the classic system's search index, which was based on a 2008 version of Lucene, to an Elasticsearch cluster running in the cloud.

  • We gave the user interface a facelift. We wanted to preserve the austere look-and-feel of arXiv while adding a bit of modern styling to support readability and scanning.
  • Better support for author names. Searching by author name was one of the most challenging parts of this milestone. We feel good about the current implementation, but are cognizant that there is ample room for improvement. Please let us know if you see any oddities while searching by author name.

  • TeXisms. We’re trying TeXism-based search in title and abstract fields! To search for a TeXism, enclose the expression in dollar signs ($).

  • Better support for DOI. We’ve added DOI as a search field (exact match only). In response to beta tester feedback, we also made DOIs more prominent in the search results.

  • Other little things:
    • ACM and MSC classification codes are now handled as separate fields.
    • Better wildcard support for searches in title and abstract.
    • The tiny search box in the header now supports all of the fields that the new search interface supports.
    • Hit highlighting. The downside of supporting more fields for search is that it can be more difficult to ascertain why a result was included. We’ve added hit highlighting to indicate which fields and terms matched your query.
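
To make the TeXism, DOI, and hit-highlighting items above more concrete, here is a minimal sketch of how such a request could be assembled for Elasticsearch. It is not the code used in this release. The field names (title, abstract, doi) and the helper names split_texisms and build_search_body are assumptions made for illustration, and the exact-match DOI clause assumes the doi field is indexed as an unanalyzed keyword.

```python
import re

# Hypothetical field names; the real index mapping is not described in these notes.
TEXT_FIELDS = ["title", "abstract"]

# A TeXism is whatever the user encloses in dollar signs, e.g. $\alpha$.
TEXISM = re.compile(r"\$([^$]+)\$")


def split_texisms(query):
    """Split a query string into its plain text and its $...$ expressions."""
    texisms = TEXISM.findall(query)
    # Remove the TeXisms, then collapse the whitespace they leave behind.
    plain = " ".join(TEXISM.sub(" ", query).split())
    return plain, texisms


def build_search_body(query=None, doi=None):
    """Build an Elasticsearch request body as a plain dict (illustrative only)."""
    must = []
    if query:
        plain, texisms = split_texisms(query)
        if plain:
            must.append({"multi_match": {"query": plain, "fields": TEXT_FIELDS}})
        for expr in texisms:
            # Treat each TeX expression as a phrase so its tokens stay together.
            must.append(
                {"multi_match": {"query": expr, "fields": TEXT_FIELDS, "type": "phrase"}}
            )
    if doi:
        # Exact match only: a term query does not analyze its input.
        must.append({"term": {"doi": doi}})
    return {
        "query": {"bool": {"must": must}},
        # Ask Elasticsearch to return highlighted fragments for the text fields.
        "highlight": {"fields": {name: {} for name in TEXT_FIELDS}},
    }
```

Calling build_search_body(query="decay of $\\alpha$ particles") yields one clause for the plain words and one phrase clause for the TeX expression. The highlight section asks Elasticsearch to return the matching fragments for each listed field, which is the kind of information a results page needs to show why a result was included.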