Google offers top tip to help beat bots
- 20 April, 2009 05:11
Google has put a new spin on the CAPTCHA, a way of helping Web sites distinguish between human visitors and bots: It wants people to tell it which way is up in a series of randomly rotated images, a task that humans find easy and computers difficult.
When spammers started using software to automatically create thousands of Web-mail accounts on services such as Hotmail and Gmail from which to send their spam, the Web-mail operators turned to the CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) to weed out automated applicants.
A typical CAPTCHA asks the visitor to look at an image containing a series of distorted letters and numbers set against a busy background, and to type in the sequence of characters they see. The idea is that a human can easily recognize the shapes of the letters, while a computer program will find it difficult to do so.
However, OCR (optical character recognition) software has become more sophisticated, forcing CAPTCHA developers to make their challenges so unreadable that many humans have trouble too.
Google's answer is to show visitors randomly rotated images and ask them which way is up.
Although the images will be randomly rotated, they will be carefully selected. First, Google will exclude images for which its own computers can readily identify the top, such as photographs showing landscapes with blue sky (easily detected), text (easily recognized) or portraits of people (there are many facial recognition applications on the market).
Then, it will screen out images that humans find too difficult to orient (for example, abstract art or overhead views that lack a readily identifiable top) by conducting a sort of opinion poll.
That poll is the key to the process, as it allows Google to create a pool of appropriate images for which people agree on the correct orientation.
By challenging visitors with a series of images for which it knows the orientation, and one for which it doesn't, Google can screen out the bots and steadily accumulate statistics about the unknown image: If visitors tend to agree which way is up, it's appropriate for inclusion, and if they disagree, the image may be too difficult.
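The screening loop described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Google's implementation: the names (`CandidateImage`, `check_response`), the 20-degree tolerance, the minimum of five votes, the 0.9 agreement threshold and the use of mean resultant length as the agreement measure are all assumptions made for the example. It also assumes that a vote on the unknown image is recorded only when the visitor gets the known images right.

```python
import math
from dataclasses import dataclass, field

# All three thresholds are illustrative assumptions, not values from the paper.
TOLERANCE_DEG = 20.0   # how far off an answer on a known image may be
MIN_VOTES = 5          # votes needed before judging an unknown image
MIN_AGREEMENT = 0.9    # required agreement (1.0 = every vote identical)


def angular_error(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


@dataclass
class CandidateImage:
    """An image whose correct orientation is not yet known (hypothetical type)."""
    votes: list = field(default_factory=list)

    def agreement(self) -> float:
        """Mean resultant length of the votes: near 1 if clustered, near 0 if scattered."""
        if not self.votes:
            return 0.0
        x = sum(math.cos(math.radians(v)) for v in self.votes)
        y = sum(math.sin(math.radians(v)) for v in self.votes)
        return math.hypot(x, y) / len(self.votes)

    def status(self) -> str:
        """'pending' until enough votes arrive, then 'accept' or 'reject'."""
        if len(self.votes) < MIN_VOTES:
            return "pending"
        return "accept" if self.agreement() >= MIN_AGREEMENT else "reject"


def check_response(known_answers, known_truth, candidate, candidate_answer) -> bool:
    """Grade a visitor on the known images; record their vote on the unknown one.

    The vote is stored only if the visitor passes, so that likely bots do not
    pollute the statistics (an assumption of this sketch).
    """
    if len(known_answers) != len(known_truth):
        raise ValueError("one answer is required per known image")
    passed = all(
        angular_error(answer, truth) <= TOLERANCE_DEG
        for answer, truth in zip(known_answers, known_truth)
    )
    if passed:
        candidate.votes.append(candidate_answer % 360.0)
    return passed
```

In use, each challenge would call `check_response` once with the visitor's answers. When `status()` on the unknown image eventually returns "accept", it could join the pool of known images; "reject" would mark it as too difficult for people to orient.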
Google employees Rich Gossweiler, Maryam Kamvar and Shumeet Baluja describe the image selection process in a paper, "What's Up CAPTCHA?", which they will present next Thursday at the WWW 2009 conference in Madrid.
They are not the first to consider using photographs in CAPTCHAs. For instance, in August 2007, Microsoft asked visitors to distinguish between cats and dogs in an attempt to stop the spam bots.
Recent advances in face recognition software, though, may have rendered that approach obsolete: Apple's iPhoto application can already identify cats' faces, but has trouble with dogs, suggesting that software capable of reliably distinguishing between the two may not be far off.
However clever Google gets, though, the spammers may be cleverer. There is evidence that they are employing humans to crack the current crop of CAPTCHAs, whether by presenting the challenges as a game, paying people in low-income countries to solve them, or reframing the CAPTCHAs as a way to gain access to porn sites. Against that kind of attack, there is little that Google can do.