With students back for first-semester classes in our part of the world, Internet cafes are abuzz with students searching the literature for assignments, term papers, or thesis literature reviews. At the moment, the most popular search engine appears to be Google. As of June 14, 2008, the estimated size of Google’s index is about 20 billion web pages, making it the largest crawler-based search engine, based on reported numbers.

So you think that with an Internet search engine like Google or Google Scholar, you’ve done a comprehensive review of all available information, apart from those articles that are pay-per-view or for paid subscribers only. Think again. Studies have estimated that the hidden web holds as many as 500 billion web pages, roughly 25 times the reported size of Google’s index.

Search engines crawl only a small, shallow portion of the web. The “invisible web,” or deep web, refers to information that is available on the World Wide Web but is not accessible to general, all-purpose search engines. Materials hidden from the usual search engines include dynamic content, unlinked content, the private web, and limited-access content.
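
To see why such material stays hidden, here is a minimal sketch of a link-following crawl over a made-up, in-memory "web". The page names and the `TOY_WEB` structure are hypothetical, invented only for illustration; real search engine crawlers are far more elaborate. The sketch assumes that a crawler discovers pages only by following links from pages it already knows.

```python
from collections import deque

# Hypothetical toy web: each page maps to the pages it links to.
TOY_WEB = {
    "home": ["about", "catalog-search-form"],
    "about": ["home"],
    "catalog-search-form": [],    # results appear only after a query is typed in
    "catalog-result?q=rice": [],  # dynamic content: no static link points here
    "orphan-report": [],          # unlinked content
    "members-only": [],           # limited-access content
}

def crawl(web, start):
    """Return every page reachable from `start` by following links."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for link in web.get(page, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return seen

visible = crawl(TOY_WEB, "home")
invisible = sorted(set(TOY_WEB) - visible)

print("Reached by the crawler:", sorted(visible))
print("Never reached:", invisible)
```

In this toy example the crawler reaches only the three pages connected by links. The query result, the orphaned report, and the members-only page all exist but are never found, which is the situation the specialized databases below are meant to address.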

How to find the invisible web

To search the invisible web, here’s a list of some notable databases worth checking out (see Robert Lackie’s “Those Dark Hiding Places: Invisible Web Revealed”; Wendy Boswell):

  • Librarians’ Internet Index – websites you can trust
  • FindLaw – “The highest-trafficked legal Web site”
  • About.com
  • Direct Search site put together by Gary Price
  • Invisible Web Directory – put together by Gary Price and search guru Chris Sherman. This site is a directory of searchable databases, organized by subject
  • Resource Discovery Network – has resources mostly from the United Kingdom, and is extremely well-organized and very searchable
  • InfoMine – an incredible resource that at last count included over 100,000 links and access to hundreds, if not thousands, of databases
  • Virtual Library
  • Intute – a free online service providing access to the very best Web resources for education and research.
  • Internet Archive – a digital library of Internet sites and other cultural artifacts in digital form.
  • Beaucoup! – a search spot to help search the invisible web.
  • Digital Librarian – a librarian’s choice of the best of the web.
  • ScienceResearch.com – a portal allowing searchable access to numerous scientific journals and databases.
  • Agricola Database – provides citations to agriculture literature.
  • Energy Citations Database – provides free access to over 2.3 million science research citations.
  • Envirofacts – EPA’s one-stop source for environmental information.
  • Plants Database – provides standardized information about the vascular plants, mosses, liverworts, hornworts, and lichens of the U.S. and its territories.
  • PlantFacts – an international knowledge bank and multimedia learning center on plants.
  • Window to My Environment database – provides a wide range of federal, state, and local information about environmental conditions and features in an area of your choice.