Preparation

  • This page is a companion to the guest lecture for WRIT2100.
  • Please bring a laptop. Use your own if you have one; if not, you can check one out at the Olin circulation desk. A laptop lets you take part in the exercises and get the most out of this exploration.
  • Please bring samples of text you want to explore. Accepted formats are plain text, Microsoft Word documents, PDFs, and URLs of online sources. We will load them into various tools and, with luck, see interesting results. Sharing is encouraged.
  • No special software will be needed. All exercises will be done through a Web browser, without any special plugins.

Agenda

Presentation

There will be a presentation during which your questions and comments are welcome. My aim is to discuss as much as is useful to you. Please feel free to chime in at any time.

Exercises

All exercises will be demonstrated, so no prior knowledge of the tools is required. The room is equipped with a jack for sharing your desktop on the screen, so if you discover something you would like to share and discuss, we can easily do so, and I encourage it.

Voyant

Voyant is a low-barrier text analysis tool that delivers a rich, interactive interface and a variety of visualizations (all of which are explained in the help file). Input can be plain text, a PDF (with OCR), an MS Word document, or a URL for HTML analysis. Please feel free to bring your own material to upload during the workshop, with the understanding that anything you upload is subject to the Voyant privacy policy. Sample texts and URLs for analysis are listed below for experimentation, in case you run low on ideas.
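For the curious, the short Python sketch below illustrates the kind of term-frequency count that many text analysis visualizations start from. It is not Voyant's code and is not needed for the workshop; the function name `top_terms`, the tiny stop-word list, and the sample sentence are all made up for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (assumption for this sketch);
# real tools use much longer lists.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it"}

def top_terms(text, n=5):
    """Return the n most frequent terms in text, ignoring stop words."""
    # Lowercase the text and pull out runs of letters and apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)

# Made-up sample text, not taken from any real corpus.
sample = "The whale is the sea. The sea is the whale, and the whale is white."
print(top_terms(sample, 3))
```

Counts like these are only a starting point; an interactive tool lets you move from the counts back to the passages where each term occurs.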

Google nGram Viewer

We will also explore Google's nGram Viewer. Google nGrams depict the frequency of a word or phrase by publication year. Many modifications can be made to refine the analysis, so please treat the links below as starting points. The syntax for refinement is described on the About page.
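To make "frequency by publication year" concrete, here is a minimal Python sketch that computes, for each year, the share of all same-length word sequences that match a given phrase. It is an illustration under stated assumptions, not Google's implementation: the toy corpus is invented, the function name `ngram_frequency_by_year` is hypothetical, and tokenization is a plain split on whitespace.

```python
from collections import defaultdict

# Invented toy corpus of (publication year, text) pairs. Not real data.
TOY_CORPUS = [
    (1900, "the steam engine changed the world"),
    (1900, "the engine room was hot"),
    (1950, "the jet engine changed travel"),
    (1950, "television changed the home"),
]

def ngram_frequency_by_year(corpus, phrase):
    """Return {year: share of that year's n-grams equal to phrase}."""
    target = tuple(phrase.lower().split())
    n = len(target)
    hits = defaultdict(int)    # matches of the phrase per year
    totals = defaultdict(int)  # all n-grams of the same length per year
    for year, text in corpus:
        tokens = text.lower().split()
        # Slide a window of n words across the text.
        grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        totals[year] += len(grams)
        hits[year] += grams.count(target)
    return {y: hits[y] / totals[y] for y in sorted(totals) if totals[y]}

print(ngram_frequency_by_year(TOY_CORPUS, "engine"))
print(ngram_frequency_by_year(TOY_CORPUS, "engine changed"))
```

Dividing by the year's total matters: without it, years with more published text would appear to use every word more often.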

Immersion

Immersion is a tool for discovering the connections in a corpus of email. It analyzes the flow data (information found in email headers) and represents it as a network of entities. The analysis is done in real time on the flow data of the account for which you provide credentials. The display is rich and interactive.
By design, Immersion collects only header information (From, To, Cc and Timestamp). However, using the actual flow data from your account may raise privacy concerns. Be sure to read the FAQs to understand what information you are granting access to and how it will be used. If you do not like the terms of the tool, you can try it with their demo data.
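As a rough picture of how header information alone can yield a network, the Python sketch below links every pair of people who appear together on a message and counts how often each pair co-occurs. This is one simple approach, not Immersion's actual method; the header records, the addresses, and the function name `build_network` are made up for illustration.

```python
from collections import Counter
from itertools import combinations

# Invented header records (From, To, Cc only). Not real data.
HEADERS = [
    {"from": "ana@example.org", "to": ["ben@example.org"], "cc": ["cy@example.org"]},
    {"from": "ben@example.org", "to": ["ana@example.org"], "cc": []},
    {"from": "ana@example.org", "to": ["cy@example.org"], "cc": []},
]

def build_network(headers):
    """Count, for each pair of addresses, the messages they share."""
    edges = Counter()
    for h in headers:
        # Everyone named on the message, de-duplicated and sorted so that
        # each pair is always stored in the same order.
        people = sorted({h["from"], *h["to"], *h["cc"]})
        for pair in combinations(people, 2):
            edges[pair] += 1
    return edges

for (a, b), weight in sorted(build_network(HEADERS).items()):
    print(a, "<->", b, "shared messages:", weight)
```

Note that no message body is read at any point; the network comes entirely from who wrote to whom.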