Deeper: a data enrichment system powered by deep web


Data scientists often spend more than 80% of their time on data preparation. Data enrichment, the act of extending a local database with new attributes from external data sources, is among the most time-consuming of these tasks. Existing data enrichment approaches are resource-intensive: data-intensive when they rely on web tables or knowledge bases, monetarily intensive when they purchase entire datasets, or time-intensive when they fully crawl a web-based data source. In this work, we explore a more targeted alternative that uses resources (measured in web API calls) proportional to the size of the local database of interest. We build Deeper...

