Create a robots.txt if you haven’t yet
Kevin Day, February 5th, 2010
I have anecdotal evidence that Google is more likely to crawl and index your site if you have a robots.txt (set to allow googlebot) than if you don’t have any robots.txt file.
For Crunch Course, I’ve taken my time setting up a robots.txt file, partly because I didn’t think it would help much and partly because it takes a couple of steps to do in Django.
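For anyone wondering what an "allow everything" robots.txt looks like, here is a minimal sketch. It is not the actual Crunch Course setup: the `ROBOTS_TXT` body, the `allows` helper, and the example.com URL are all made up for illustration. It uses only Python's standard `urllib.robotparser` to check that the rules really do let Googlebot in.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body: an empty Disallow means every crawler,
# Googlebot included, may fetch everything.
ROBOTS_TXT = "User-agent: *\nDisallow:\n"


def allows(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Assumption: in a Django project, one way to serve this would be a view
# returning HttpResponse(ROBOTS_TXT, content_type="text/plain"), routed
# at /robots.txt in the URLconf.
if __name__ == "__main__":
    print(allows(ROBOTS_TXT, "Googlebot", "https://example.com/courses/"))
```

Checking the rules with a parser before deploying is cheap insurance, since a stray `Disallow: /` would tell crawlers to stay away from the whole site.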
As a result, Google’s cache of Crunch Course is about a month old.
I finally got around to adding a robots.txt yesterday, and I’m now getting a bunch of 404 errors from Googlebot for old URLs whose structure I’ve since changed. That suggests to me that Googlebot checks for a robots.txt frequently, but is much more likely to actually crawl the site once it has the green light to do so. Of course it could also just be a coincidence, but that doesn’t make for a good blog post.
So there you have it, indisputable evidence from one data point that Google is more likely to crawl your site if you have a robots.txt than if you don’t.
