The Deep Web

I made a post mentioning the deep web the other day. I just checked, and there is actually a lot of information on this subject. The deep web is the part of the web not (yet) readily available to most search engines. Or at least that used to be the case. Google is now working on indexing dynamic web page content and is therefore poking into the deep web. This article here talks about that and mentions some of the players in deep web searching, such as brightplanet.com and techdeepweb.com. Then there is copernic.com, which has a desktop application (actually, after installation it ended up as yet another search bar in my IE). The Copernic engine queries a large number of more or less specialized search engines, compares and ranks their search results, and presents them to you. So far, I haven’t found that this opens up more of the deep web than, say, Google.
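
To get a feel for what "compares and ranks their search results" could mean, here is a minimal sketch of merging ranked lists from several engines. It uses reciprocal rank fusion, which is my own stand-in and not necessarily what Copernic does. The function name merge_rankings, the constant k=60, and the example URLs are all made up for illustration.

```python
from collections import defaultdict


def merge_rankings(rankings, k=60):
    """Combine several ranked URL lists with reciprocal rank fusion.

    Assumption: each inner list is ordered best-first. The constant k
    dampens the influence of any single engine's top positions.
    """
    scores = defaultdict(float)
    for results in rankings:
        for rank, url in enumerate(results, start=1):
            scores[url] += 1.0 / (k + rank)
    # Highest combined score first; ties are broken alphabetically
    # so the output order is deterministic.
    return sorted(scores, key=lambda url: (-scores[url], url))


# Hypothetical result lists from three engines.
engine_a = ["a.example/1", "b.example/2", "c.example/3"]
engine_b = ["b.example/2", "a.example/1", "d.example/4"]
engine_c = ["b.example/2", "c.example/3"]

merged = merge_rankings([engine_a, engine_b, engine_c])
print(merged)
```

A page that several engines rank highly floats to the top, while a page only one engine knows about still shows up further down the list.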

This page on the techdeepweb site gives some interesting examples of how to use certain query terms to refine your Google searches and possibly reach into the deep web. I was wondering the other day whether the use of such terms is common or not.

To search databases, enter:

“searchterm” (database OR repository OR archive)

For reverse link searching (searching for pages that link to a certain page you already know), enter:

link:”your URL” searchterm

To search only for certain file types (e.g. PDF), enter:

filetype:pdf searchterm

Haven’t tried any of these, but I will.
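
For anyone who wants to script these, here is a small sketch that assembles the three query patterns above and turns them into a search URL. The helper names are my own, and the base URL with its q parameter is an assumption about the search engine's address format. Whether a given engine still honors each operator is something to check, since operators come and go.

```python
from urllib.parse import urlencode


def database_query(term):
    """Quote the term and ask for pages that mention a data collection."""
    return f'"{term}" (database OR repository OR archive)'


def link_query(url, term):
    """Reverse link search: pages linking to url that also mention term."""
    return f"link:{url} {term}"


def filetype_query(extension, term):
    """Restrict results to one file type, e.g. pdf."""
    return f"filetype:{extension.lstrip('.').lower()} {term}"


def search_url(query, base="https://www.google.com/search"):
    """Build a URL for the query (the base address is an assumption)."""
    return f"{base}?{urlencode({'q': query})}"


if __name__ == "__main__":
    for query in (
        database_query("deep web"),
        link_query("example.com", "deep web"),
        filetype_query("PDF", "deep web"),
    ):
        print(query)
        print(search_url(query))
```

urlencode takes care of escaping the quotes, colons, and spaces, so the printed URLs can be pasted straight into a browser.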

One Response to “The Deep Web”

  1. Might check out ISEN.org which is a method for cataloging and federating deep web databases.
