Crawler-Friendly Web Servers

In this paper we study how to make web servers (e.g., Apache) more crawler friendly. Current web servers offer the same interface to crawlers and regular web surfers, even though crawlers and surfers have very different performance requirements. We evaluate simple, easy-to-incorporate modifications to web servers that yield significant bandwidth savings. Specifically, we propose that web servers export meta-data archives describing their content.

