COMPASS: A Concept-based Web Search Engine for HTML, XML, and Deep Web Data

Today’s web search engines still follow the paradigm of keyword-based search. Although this is the best choice for large-scale search engines in terms of throughput and scalability, it inherently limits their ability to accomplish more meaningful query tasks. XML query engines (e.g., based on XQuery or XPath), on the other hand, have powerful query capabilities, but their dedication to XML data with a global schema is also their weakness, because most web information is still stored in diverse formats and does not conform to common schemas. Typical web formats include static HTML pages and pages that are generated dynamically from underlying database systems and are accessible only through portal interfaces.

