- Eric Beaudoin
Estimating deep web data source size by capture-recapture method
This paper addresses the problem of estimating the size of a deep web data source that is accessible only through queries. Because most deep web data sources are noncooperative, the size of a data source can only be estimated by sending queries and analyzing the returned results. We propose an efficient estimator based on the capture-recapture method. First, we derive an equation relating the overlapping rate to the percentage of the data examined when random samples are retrieved from a uniform distribution. This equation is conceptually simple and leads to the derivation of an estimator for samples obtained by random queries.
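As a rough illustration of the capture-recapture idea the abstract builds on, the sketch below uses the classical two-sample Lincoln-Petersen estimator. This is the textbook form, not the estimator proposed in the paper. The function names `lincoln_petersen` and `simulate`, the simulated population, and the sample sizes are all assumptions made for this example. It also assumes every record is equally likely to be drawn, which matches the uniform-sampling case in the abstract but generally does not hold for real query-based sampling.

```python
import random


def lincoln_petersen(first_sample, second_sample):
    """Estimate population size from two samples of record identifiers.

    Classical estimator: N is approximately n1 * n2 / m, where m is the
    number of records that appear in both samples (the "recaptures").
    """
    first, second = set(first_sample), set(second_sample)
    recaptured = len(first & second)
    if recaptured == 0:
        raise ValueError("no overlap between samples; cannot estimate size")
    return len(first) * len(second) / recaptured


def simulate(true_size=10_000, sample_size=1_000, seed=0):
    """Draw two uniform random samples from a hypothetical data source.

    The data source is modeled as the integers 0..true_size-1 (an
    assumption for illustration only).
    """
    rng = random.Random(seed)
    population = range(true_size)
    first = rng.sample(population, sample_size)
    second = rng.sample(population, sample_size)
    return lincoln_petersen(first, second)


if __name__ == "__main__":
    print(f"estimated size: {simulate():.0f} (true size: 10000)")
```

The estimate becomes more stable as the overlap between the two samples grows. This reflects the kind of relationship between overlap and the fraction of data examined that the abstract describes, although the paper's own equation is not reproduced here.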