Locating deep web entry points


The content on the web has grown tremendously over the past decades. Yet a large pool of data, stored in databases and reachable only through search forms, remains unexplored by existing search engines. This valuable data, which is not part of the surface web, is called the deep web. This paper presents a strategy for locating the entry points to the deep web. The aim is to build a repository of database-driven web pages that expose deep web content. We have developed a tool that first discovers web pages and then classifies them as “deep web” or “surface web”. After classification, it adds the “deep web” pages to the database. To evaluate our technique, we present the results of our experiments and report performance measures for the tool.
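To make the classification step concrete, the following is a minimal illustrative sketch and not the tool described in this paper. It assumes a simple heuristic that the abstract does not specify: a page is treated as a candidate deep web entry point if it contains a form with at least one query-style control (a text or search input, or a select list) and no password field, so that login forms are excluded. The names `FormScanner` and `classify_page` are hypothetical and introduced only for this example.

```python
from html.parser import HTMLParser


class FormScanner(HTMLParser):
    """Collects, for each <form> on a page, the types of its controls."""

    def __init__(self):
        super().__init__()
        self.forms = []       # one list of control types per completed form
        self._current = None  # controls of the form being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = []
        elif self._current is not None:
            if tag == "input":
                # An <input> without a type attribute defaults to "text".
                self._current.append((attrs.get("type") or "text").lower())
            elif tag in ("select", "textarea"):
                self._current.append(tag)

    def handle_endtag(self, tag):
        if tag == "form" and self._current is not None:
            self.forms.append(self._current)
            self._current = None


def classify_page(html):
    """Return "deep web" or "surface web" using the assumed form heuristic."""
    scanner = FormScanner()
    scanner.feed(html)
    scanner.close()
    for controls in scanner.forms:
        has_query_field = any(c in ("text", "search", "select") for c in controls)
        is_login = "password" in controls  # assumption: login forms are not entry points
        if has_query_field and not is_login:
            return "deep web"
    return "surface web"


if __name__ == "__main__":
    page = '<form action="/search"><input type="text" name="q"><input type="submit"></form>'
    print(classify_page(page))
```

A heuristic this simple will misclassify some pages, for example site-local keyword search boxes that only index surface content, so it should be read as a starting point under the stated assumption rather than as a complete classifier.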

