Scraping

Robert Brook

Our traffic’s gone up a bit. Nothing serious, but certainly noticeable. One source of that traffic is someone – or something – using HTTrack, presumably to download content from the site for offline use.

Which is, of course, fine. But if you are spidering or crawling our site, try to be polite: pace your requests rather than grabbing everything at once. Google’s and Yahoo’s spiders drop in now and then, and they don’t make trouble. And once you have the content you were looking for, do remember that it’s still covered by Parliamentary copyright.
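For anyone writing their own crawler, here is a rough sketch of what "polite" might look like in Python, using only the standard library. It is an illustration under assumptions, not a statement of our actual rules: the user-agent name, the five-second default pause, and the sample robots.txt text are all made up for the example, and `Crawl-delay` is a non-standard directive that not every site sets.

```python
import urllib.robotparser

# Hypothetical user-agent name, chosen for this example only.
USER_AGENT = "example-polite-crawler"


def make_robot_rules(robots_txt: str) -> urllib.robotparser.RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def polite_delay(parser, default_seconds: float = 5.0) -> float:
    """Seconds to wait between requests.

    Uses the site's Crawl-delay if it declares one, otherwise falls back
    to an assumed default (5 seconds is an arbitrary, cautious choice).
    """
    delay = parser.crawl_delay(USER_AGENT)
    return float(delay) if delay is not None else default_seconds


def crawl_plan(parser, urls):
    """Keep only the URLs that robots.txt allows this user agent to fetch."""
    return [url for url in urls if parser.can_fetch(USER_AGENT, url)]


if __name__ == "__main__":
    # Sample rules for illustration; a real crawler would fetch the
    # site's own /robots.txt first and sleep polite_delay() seconds
    # between requests.
    rules = make_robot_rules(
        "User-agent: *\nCrawl-delay: 10\nDisallow: /private/\n"
    )
    print(polite_delay(rules))
    print(crawl_plan(rules, [
        "https://example.org/a",
        "https://example.org/private/b",
    ]))
```

The idea is simply to check what the site permits before fetching, and to leave a pause between requests so the crawl looks like a trickle rather than a flood.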

That said, if you’re doing interesting stuff with our data, we’d love to see it.

Posted in: Historic Hansard
