Note that hands-on use of the HTRC portal and its tools requires a login. Please see the information linked from the section titled "The portal", below. Those who wish to try the tools on a collection of scholarly interest may want to construct one by following the tutorial referred to under the section called "Workset builder".

What is HathiTrust (HT)?

Why aren't all books viewable online?

Computational analysis must address the very real challenges of what can and cannot be legally shared digitally, so it helps to understand the realities that affect full-text viewability. Not all books in HathiTrust are viewable in full, although all are indexed in full. Viewability is determined by many factors, including copyright law (both US and international) and the stipulations of rights-holders (authors and/or publishers) and digitizing agents (like Google). Two attributes assigned to each volume affect viewability. The first describes a complex set of factors relating to copyright, digitizing agents and rights-holders, and is referred to as "rights" metadata. The second is a binary value ("allow/deny"), often referred to as "access" metadata. When a volume has no factors attached to it that would limit sharing, both attributes express this. Colloquially, the set of these volumes is referred to as the "open-open" set. What a researcher can do with a text is governed by these factors, and the least restricted uses are possible with the open-open set.
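As a rough illustration of how the two attributes combine, the following Python sketch models a volume with a "rights" value and an "access" value and selects the open-open set. It is a simplified sketch under stated assumptions, not HathiTrust's actual data model: the `Volume` class, the volume identifiers, and the rights labels "unrestricted" and "restricted" are hypothetical stand-ins, and the real rights metadata uses its own, richer vocabulary. Only the "allow/deny" access values come from the description above.

```python
from dataclasses import dataclass

# Hypothetical label for illustration only; the real rights metadata
# uses its own vocabulary of values.
UNRESTRICTED_RIGHTS = "unrestricted"


@dataclass(frozen=True)
class Volume:
    volume_id: str  # hypothetical identifier
    rights: str     # summary of copyright, digitizing-agent and rights-holder factors
    access: str     # binary value: "allow" or "deny"


def is_open_open(volume: Volume) -> bool:
    """A volume is open-open only when BOTH attributes express no restriction."""
    return volume.rights == UNRESTRICTED_RIGHTS and volume.access == "allow"


def open_open_set(volumes):
    """Return the volumes that belong to the open-open set."""
    return [v for v in volumes if is_open_open(v)]


sample = [
    Volume("vol-1", "unrestricted", "allow"),
    Volume("vol-2", "restricted", "deny"),
    Volume("vol-3", "unrestricted", "deny"),
]

open_ids = [v.volume_id for v in open_open_set(sample)]
print(open_ids)
```

In this sketch a volume whose rights value is unrestricted but whose access value is "deny" is still excluded, which mirrors the point that both attributes must agree before a volume counts as open-open.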

What is the HathiTrust Research Center (HTRC)?

What specific services does the HTRC offer scholars?

The HTRC's offerings are documented on the HTRC User Community Wiki, which links to services, user support documentation, meeting notes, e-list addresses and sign-up information, and FAQs.

The "portal"

Workset management

Algorithms

Data Capsule

Datasets

Bookworm