Google Search Appliance at Cornell

According to Uncle Ezra, it was in March 2006 that Cornell replaced the Inktomi search engine it had been using for University website searching with a Google Search Appliance. The Google Search Appliance (GSA) is administered by the Office of Web Communications. The GSA now supports the search you see on the Cornell Identity Banner on most Cornell University web pages.

...

The effect of these collections is to filter the results that would have come from the overall Cornell search, so that only the subset of results matching your list of domains is reported. The advantage is that the normal search dialog lets you specify only a single domain to filter on, whereas a collection can cover many domains. Adding a domain to your collection that is outside of the Cornell master list will not cause it to be indexed! You may be wondering what domains are in Cornell's master list.
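The filtering behaviour described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how the GSA is actually implemented: the function names, the prefix-matching rule, and the sample URLs are all assumptions made for the example.

```python
from urllib.parse import urlsplit

# Illustrative collection patterns: a host plus an optional path prefix.
# These are examples only, not the real contents of any GSA collection.
COLLECTION_PATTERNS = [
    "library.cornell.edu/",
    "mannlib.cornell.edu/",
    "www.ilr.cornell.edu/library/catherwood/",
]


def in_collection(url, patterns=COLLECTION_PATTERNS):
    """Return True if the URL's host and path start with any pattern."""
    parts = urlsplit(url)
    target = parts.netloc.lower() + (parts.path or "/")
    return any(target.startswith(pattern) for pattern in patterns)


def filter_results(master_results, patterns=COLLECTION_PATTERNS):
    """Keep only the results from the master index that match the collection."""
    return [url for url in master_results if in_collection(url, patterns)]


# Hypothetical results from the overall (master) search.
master = [
    "http://library.cornell.edu/hours",
    "http://www.cornell.edu/about",
    "http://www.ilr.cornell.edu/library/catherwood/guide",
]

print(filter_results(master))

# Adding a pattern for a site that is not in the master index changes nothing:
# the filter can only narrow what has already been crawled.
wider = COLLECTION_PATTERNS + ["www.digitalhimalaya.com/"]
print(filter_results(master, wider) == filter_results(master))
```

The last two lines show why adding an outside domain to a collection has no effect: a collection selects from pages that are already indexed and never adds new ones.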

Cornell University Library Websites Google Search Appliance Collection 

In September 2006 I asked Lisa Cameron-Norfleet at the Office of Web Communications to set up a Google Search Appliance Collection that I could use for searching Cornell University Library websites. She kindly (and quickly) created the collection and told me how to get into the administrator interface. I added all the Cornell University Library digital collections, the list of individual libraries, and a few other library websites. Here are the domains and paths currently in the collection.

...

What you can find with the Cornell Library GSA Collection

  • English words in web pages, like canoe
  • Words or phrases in UTF-8 characters, like ?????? (Unfortunately, Confluence does not play well with Japanese!)
  • Phrases in PDF documents linked to web pages, like Engr Math PSL Vet* ACCEL
  • Anything in DSpace, DLXS, or VIVO - like dog for example
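Searches like the ones listed above can also be sent to a collection by building the query URL directly. The sketch below uses the GSA search protocol parameters `q` (query), `site` (collection), `client` (front end), and `output`. The host name `search.example.edu` and the collection name `library` are placeholders, not the real Cornell values.

```python
from urllib.parse import urlencode


def build_search_url(query, collection,
                     host="search.example.edu",      # placeholder host
                     frontend="default_frontend"):   # assumed front end name
    """Build a GSA-style search URL restricted to one collection."""
    params = {
        "q": query,              # search terms; UTF-8 is percent-encoded
        "site": collection,      # name of the collection to search
        "client": frontend,      # front end used to serve the results
        "output": "xml_no_dtd",  # ask for XML results
    }
    return "http://" + host + "/search?" + urlencode(params)


# "library" is a hypothetical collection name.
print(build_search_url("canoe", "library"))
print(build_search_url("Engr Math PSL", "library"))
```

Because `urlencode` percent-encodes non-ASCII text as UTF-8, the same function should work for Japanese or other non-Latin search terms.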

Statistics from Library Collection

The Google Search Appliance Collection administration interface has a report and statistics section that can tell you things like 'How many pages are being crawled on each site?' or 'What were the top 100 search phrases in the month of March?'

http://www.google.com/enterprise/gsa/index.html

...

Cornell Google Search Appliance web page

Example Searches 

...

 
Maybe http://www.digitalhimalaya.com/ should be included in the collection:

...

Library Gateway Search for library hours 

Extra

I'm not sure what the 'Search Library Pages' link on the Library Gateway page is doing - it finds things in 'library.cornell.edu' and 'mannlib.cornell.edu' and 'www.ilr.cornell.edu/library/catherwood/'.