Using Google searches to detect hacking

October 12, 2010

A link spam example. Note how the hacking uses random words for text links and for the text between the links. Note also that the words themselves are fairly innocuous.

Google searches should be part of every web administrator’s toolkit for spotting website intrusions. The basic recipe uses a combination of the “site:” and “OR” (must be capitalised) operators. Here’s a good example.

That query, entered into Google, lists every page on mysite.com that contains at least one of those suspicious terms. Give it a try: mysite.com exists, and it typically has hacked pages on it. As I write, Google accepts no more than 31 terms in a query like this, so you’ll need to be parsimonious. Although I’ve not had any sites hacked with non-English spam text, I have thrown in a few crude Spanish words just in case.
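A query of this shape can also be assembled programmatically. The following is only a minimal sketch: the function name `build_query` and the three sample terms are placeholders of my own, not the list used in the example, and the 31-term cap is simply the figure quoted above, which may have changed since.

```python
MAX_TERMS = 31  # limit quoted in this article; treat it as an assumption


def build_query(domain, terms, max_terms=MAX_TERMS):
    """Return a 'site:' query that ORs the given terms together."""
    if not terms:
        raise ValueError("at least one search term is required")
    if len(terms) > max_terms:
        raise ValueError(f"{len(terms)} terms given, limit is {max_terms}")
    # Wrap multi-word terms in quotes so they are searched as phrases.
    parts = [f'"{t}"' if " " in t else t for t in terms]
    return f"site:{domain} " + " OR ".join(parts)


# Placeholder terms for illustration only.
print(build_query("mysite.com", ["viagra", "casino", "payday loan"]))
# prints: site:mysite.com viagra OR casino OR "payday loan"
```

Refusing to build an over-long query, rather than silently truncating it, makes it obvious when the term list needs pruning.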

I recommend doing the following:

  1. Copy the example into a text editor and change the “mysite.com” bit in the search string to your own site’s domain name (leave out the “http://” and the “www”).
  2. Remove from the search string any words that might reasonably be expected to appear on your site. For example, a search string for a medical site obviously shouldn’t contain search terms related to human anatomy.
  3. Do a Google search on the resulting string. Hopefully no results appear!
  4. Save the resulting page as a bookmark in your browser. You may then open that bookmark whenever you wish to rerun the search.
  5. Create a Google Alert that emails you whenever that search string brings up new results. I’ve found that Google Alerts isn’t always reliable, so be sure to regularly run the manual Google search (by opening the bookmark created at step 4).
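Steps 1 to 4 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a finished tool: the helper names (`bare_domain`, `drop_expected`, `search_url`) are hypothetical, the terms are single-word placeholders, and the `https://www.google.com/search?q=...` address is the ordinary Google results URL that a browser would bookmark.

```python
from urllib.parse import urlencode, urlparse


def bare_domain(url):
    """Step 1: drop the scheme and a leading 'www.' from a site address."""
    # urlparse only fills in netloc when a scheme or '//' is present.
    host = urlparse(url if "//" in url else "//" + url).netloc
    return host[4:] if host.startswith("www.") else host


def drop_expected(terms, expected):
    """Step 2: remove terms that legitimately appear on the site."""
    expected_lower = {e.lower() for e in expected}
    return [t for t in terms if t.lower() not in expected_lower]


def search_url(domain, terms):
    """Steps 3 and 4: a Google search URL to open or save as a bookmark."""
    query = f"site:{domain} " + " OR ".join(terms)
    return "https://www.google.com/search?" + urlencode({"q": query})


domain = bare_domain("http://www.mysite.com/")
terms = drop_expected(["casino", "poker", "anatomy"], expected=["anatomy"])
print(search_url(domain, terms))
# prints: https://www.google.com/search?q=site%3Amysite.com+casino+OR+poker
```

The printed URL can be pasted into the browser and bookmarked; step 5, the Google Alert, still has to be set up by hand.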

Note that this method is not guaranteed to spot link spam, as a lot of link spam uses random text (see the above screenshot). But even if the link spam doesn’t favour suspicious words, link spam hackings typically introduce so much text that at least one of these search terms is likely to score a hit. This method also has a chance of spotting link spam that has been made visible only to the Googlebot, since you’re searching the pages as the Googlebot saw them. Even so, it certainly shouldn’t be your only means of spotting hacking: security on the web, as in the offline world, is about erecting multiple barriers and multiple alarms.