Alexa Namespace Pollution

Alexa, Amazon’s search engine, is apparently encouraging people to create info.txt files. I say apparently because I can’t find the page myself, but I trust the source.

URI namespace pollution is a bad thing!

We can’t do much about it in the case of robots.txt; that standard is too deep-rooted now. But we should be strongly discouraging anyone from following that same route.

Why? Spurious requests push bandwidth up. If you run a site, you will already be getting hundreds of requests a day for favicon.ico and robots.txt. Do you really want more?

This pollution is bad. It’s not extensible or open to pre-discovery. It’s rooted in a view of the web that no longer exists, where the root of a domain represents a website and the only website at that domain. The web doesn’t exist like that any more.

This kind of pollution is damaging and will eventually reduce the scalability of the web, as more protocols suffocate the reasonable URI namespace. There is no need for this. We have enough mechanisms to allow more fluid interactions to discover information that is needed, if present.
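One such mechanism is link-based discovery: a page declares where its metadata lives, and a client reads that declaration instead of guessing a fixed path at the domain root. What follows is only a rough sketch of the idea in Python. The rel value "site-info", the function name discover and the example URLs are all made up for illustration; nothing here is something Alexa or anyone else has specified.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkFinder(HTMLParser):
    """Collects href values of <link> elements whose rel contains a given token."""

    def __init__(self, rel):
        super().__init__()
        self.rel = rel.lower()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel is a space-separated list of tokens, so match on a single token
        rel_tokens = (attr.get("rel") or "").lower().split()
        if self.rel in rel_tokens and attr.get("href"):
            self.hrefs.append(attr["href"])


def discover(page_url, html, rel="site-info"):
    """Return absolute URLs the page itself advertises for the given rel.

    "site-info" is a hypothetical rel value used only for this sketch.
    """
    finder = LinkFinder(rel)
    finder.feed(html)
    # Resolve relative hrefs against the page's own URL, not the domain root
    return [urljoin(page_url, href) for href in finder.hrefs]


if __name__ == "__main__":
    page = '<html><head><link rel="site-info" href="about/info.rdf"></head></html>'
    print(discover("http://example.org/~alice/blog/", page))
```

Because the href is resolved against the page's own URL, a site living under http://example.org/~alice/blog/ can keep its metadata wherever it likes, and a site that publishes nothing receives no speculative requests at all.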

So, please, no info.txt files.

Actually, I agree and will go update my entry to reflect this. It wasn’t the fixed path and autodiscovery that I liked (and you’ve persuaded me that it’s actually bad); it was more the “store the data on your site and we’ll retrieve it” part.

Or put another way, DataLibre means owning your own URI space too.

I should have made it more clear that I agree with the idea itself, just not the implementation. DataLibre itself is a solid idea in principle, particularly notions like FOAF. If only it had a decent application to show off the potential.