I, For One, Welco -- er, hold on a second...


Via Newmark the Elder, I found this list of "disturbing" facts about Google. It focuses primarily on the privacy issues that arise from all of the data that Google collects on, well, everyone.

While a good portion of the list is unconvincing (to me) whinging about Google not being, essentially, open-source (they hire NSA people! their search algorithm is semi-secret! they use long-lasting cookies!), there is one item that I do find particularly bothersome:

7. Google's cache copy is illegal: Judging from Ninth Circuit precedent on the application of U.S. copyright laws to the Internet, Google's cache copy appears to be illegal. The only way a webmaster can avoid having his site cached on Google is to put a "noarchive" meta in the header of every page on his site. Surfers like the cache, but webmasters don't. Many webmasters have deleted questionable material from their sites, only to discover later that the problem pages live merrily on in Google's cache. The cache copy should be "opt-in" for webmasters, not "opt-out."

I have no idea about the charge of "illegal" (nor about the author's ability to claim that this is a "fact", disturbing or otherwise), but it strikes me as a bad practice in general. As the article notes, people who maintain websites often have to edit them to remove objectionable content, content that violates company practice, or material that has to go for any of a host of other reasons. The persistence of the material in the Google cache defeats this perfectly legal, and usually largely justifiable, activity. The fact that material might still be accessible even after a webmaster or company has explicitly tried to remove it circumvents those efforts, while still leaving the site open to whatever consequences may follow from having had the offending material up to begin with. (That is, the company cannot protect itself by editing its own website when it needs to if the information is no longer in its hands.)
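For anyone wondering what the "noarchive" opt-out from the quoted item looks like in practice, here is a rough sketch in Python of a check for whether a page carries it. It assumes the directive lives in a `<meta name="robots">` (or `name="googlebot"`) tag with a comma-separated `content` attribute; the class and function names are made up for illustration, and this is only a quick check, not a description of how Google itself reads the tag.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"|"googlebot"> tags.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            # Assumption: directives are comma-separated in "content".
            for directive in attr.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def has_noarchive(html):
    """Return True if the page opts out of cached copies via noarchive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noarchive" in parser.directives
```

Run over every page on a site, something like `has_noarchive(page_source)` would at least tell a webmaster which pages have not opted out yet, which is exactly the chore the opt-out model leaves them with.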

In general this seems to go along with a basic belief at Google that, after having worked for several years in the web sector, I find particularly problematic: assume everyone wants to be included, and then give them the chance to "opt-out". Google took the same tack with Google Print. This is functionally the same business model as spam: send mail to everyone, but make it possible to ask not to be included in the future. Google just seems to be much, much better at saying why a particular service is a good thing, or at not telling people that it's going on. ("Oh, hey, by the way we saved a version of your website that everyone can see but that you can't change. That's cool, right?")

Right now, the benefit of being included in Google's services -- of playing along with the company leading the way in the web's second largest online activity -- appears to outweigh whatever is gained by staying out. But if this approach goes unchecked, and if it spreads to consumers in a way that they don't like, there's always a chance for considerable backlash. (Imagine a combination of Google Video and Google's newest property, Riya: suddenly you can be identified and found online if you happen to have appeared in any photo or video that makes its way to the web, including images of your home, your children, your relatives, etc., all without having chosen to be part of the process.)


There is an upside to Google's cache. Less than a month ago a forum I use, Financial Sense, lost a good chunk of their files. I suggested that a clever search might recover some/all of their data from Google's cache. That's what they did.

They don't hire NSA spooks; they have one employee who spent a summer internship working with the DoD. The info on the site you are reading comes from Google-Watch, which is in no way objective and has had a bias from day one. (This is not to say you should ignore privacy claims regarding Google, but Google-Watch has had, and continues to have, ulterior motives.)



About this Entry

This page contains a single entry published on November 23, 2005 9:49 AM.
