
Multiple people have run controlled experiments like the one I described in http://www.mattcutts.com/blog/debunking-toolbar-doesnt-lead-...

The most common way such "secret" pages get crawled is that someone visits the secret page with referrers turned on and then goes to another page. For example, are you 100% positive that every person who ever visited that page had referrers turned off on every single browser (including mobile phones) they used to access it?



Are you sure that it is the referrer headers? PP clearly stated there were no outgoing links on the secret page. I think there's a much more mundane explanation: JavaScript downloaded from Google's CDN. People nowadays are so used to just plopping jQuery etc. into their web pages that they forget that this stuff has to come from somewhere. If it's from Google, I'm quite certain that their CDN loader phones home right before it gives up any of the good stuff.

EDIT: Confirmed, though I was wrong about the loader: there isn't one. Simply requesting jQuery from ajax.googleapis.com gives them a nice fresh Referer header pointing at your secret site for their spiders to crawl. Be mindful!
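
If you want to check your own page for this, here is a rough Python sketch that lists the third-party hosts a page pulls resources from. Each of those requests can carry a Referer header, though whether the browser sends the full URL or only the origin depends on its referrer policy. The names `SubresourceCollector` and `third_party_hosts`, the example.com URL, and the sample markup are all made up for illustration, and the list of tags is not exhaustive.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class SubresourceCollector(HTMLParser):
    """Collects URLs that a browser fetches on its own while rendering a page."""

    # (tag, attribute) pairs that trigger an automatic fetch. Not exhaustive.
    # Plain <a href> links are left out on purpose: they need a click.
    FETCHING_ATTRS = {
        ("script", "src"),
        ("img", "src"),
        ("iframe", "src"),
        ("link", "href"),
    }

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if value and (tag, name) in self.FETCHING_ATTRS:
                # Resolve relative and protocol-relative URLs against the page.
                self.urls.append(urljoin(self.page_url, value))


def third_party_hosts(page_url, html):
    """Return the sorted hosts, other than the page's own, that serve its subresources."""
    collector = SubresourceCollector(page_url)
    collector.feed(html)
    own_host = urlparse(page_url).hostname
    hosts = {urlparse(url).hostname for url in collector.urls}
    return sorted(h for h in hosts if h and h != own_host)


# Hypothetical "secret" page: no outgoing links, but jQuery comes from a CDN.
sample = (
    '<html><head>'
    '<script src="//ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js"></script>'
    '</head><body><img src="/logo.png"></body></html>'
)
print(third_party_hosts("http://example.com/secret.html", sample))
# prints ['ajax.googleapis.com']
```

Anything this prints is a host that learns about the page just from you loading it, links or no links. Hosting the script yourself makes that entry go away.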


I'm 100% sure. That page was for me and me alone. It was never accessed by anyone but me. I never shared the URL with anyone.

Referrers only get shared through links. There were no links to or from that page. Going to a page and then typing in a new URL does not provide a referrer.



