What is reCAPTCHA

What is reCAPTCHA and how does reCAPTCHA work?

CAPTCHA was designed as a way to tell humans and bots apart. Coined in 2003, the term stands for Completely Automated Public Turing test to tell Computers and Humans Apart. ReCAPTCHA is Google’s variation of a CAPTCHA. The idea behind Google’s reCAPTCHA, and indeed behind all CAPTCHA technology, is that it should be easy for humans to solve but impossible for bots and other malicious software.

What is reCAPTCHA used for?

ReCAPTCHA has always been used to separate people from bots, but when it was first released in 2007 it had a secondary purpose: digitizing the archives of the New York Times and, once Google acquired the technology in 2009, digitizing books for Google Books. It effectively used people to transcribe text that optical character recognition software couldn’t process. Nowadays, reCAPTCHA is focused purely on preventing bots from automatically visiting website pages, filling out forms, and spamming forums or social media sites with comments. By identifying and blocking such bots, reCAPTCHA helps protect websites from spam, abuse, and worse behaviour.

How does reCAPTCHA work to detect malicious activity?

ReCAPTCHA v3 was released in 2018 and is currently the latest version of the technology. It uses a JavaScript API to return a score between 0 and 1 for every request to a particular page, without interrupting the user. A score of 0 means the request very likely comes from a bot; a score of 1 means it almost certainly comes from a human. So reCAPTCHA v3 doesn’t inherently stop malicious activity. It doesn’t even really detect malicious activity; it just returns a score. The idea is that you use this information to separate humans from bots with another solution. For example, you may decide to:

Require MFA for low scores on your login pages.
Limit the ability to send messages for low scores.
Closely monitor orders made from requests with low scores.
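
The decisions listed above could be wired up roughly as follows. This is a minimal sketch, not a recommended policy: the 0.5 threshold, the page names, the response labels, and the "challenge" fallback for unlisted pages are all hypothetical and would need tuning against your own traffic.

```python
# Hypothetical cut-off: scores below this value count as "low".
LOW_SCORE_THRESHOLD = 0.5

# Hypothetical mapping from page type to the safeguard applied on a low score.
LOW_SCORE_RESPONSES = {
    "login": "require_mfa",
    "message": "limit_messaging",
    "checkout": "flag_order_for_review",
}


def respond_to_score(page: str, score: float) -> str:
    """Pick a response for a request, given its reCAPTCHA v3 score.

    A score near 0 suggests a bot, a score near 1 suggests a human.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0 and 1, got {score}")
    if score >= LOW_SCORE_THRESHOLD:
        return "allow"
    # Pages without a specific safeguard fall back to a generic challenge
    # (an assumption made for this sketch).
    return LOW_SCORE_RESPONSES.get(page, "challenge")
```

For example, respond_to_score("login", 0.1) selects the MFA safeguard, while the same page with a score of 0.9 is simply allowed through.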

Alternatively, you can trigger a reCAPTCHA v2 challenge for any request that has a low score in reCAPTCHA v3. ReCAPTCHA v2 is the classic “I’m not a robot” checkbox that you’re most likely familiar with. This checkbox directly involves the user and serves to double-check if a request comes from someone real or not. How the reCAPTCHA technology works behind the scenes to determine a score or to figure out which request gets a challenge is a mystery for all but a very small number of Google engineers, because the technology is not open-source.

What triggers reCAPTCHA?

ReCAPTCHA v3 uses “actions” to distinguish real traffic from bot traffic. An action is a tag that lets you define key steps in your user journey, so the reCAPTCHA technology can learn what regular users do compared to bot traffic. That’s why Google recommends activating reCAPTCHA v3 across multiple, if not all, of the pages on your website. While reCAPTCHA v3 uses risk and behavioural analysis, reCAPTCHA v2 is simpler. Its image challenges require a contextual understanding of the world that most bots don’t currently possess, while its audio and text challenges warp sounds and words so that humans can still understand them but bots cannot. At least that’s the idea; reCAPTCHA v2 challenges aren’t always that easy to solve for people, and some have become easy to solve for bots.
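
On the server, the action tag can also serve as a sanity check. The sketch below assumes you already hold the parsed verification result as a dictionary with "success", "action", and "score" fields; the function name and the default minimum score of 0.5 are hypothetical.

```python
def is_trusted(result: dict, expected_action: str, min_score: float = 0.5) -> bool:
    """Return True only if verification succeeded, the action tag matches
    the step we expected, and the score meets the (hypothetical) minimum."""
    if not result.get("success", False):
        return False
    # A token issued for one step of the journey should not be accepted
    # for another step.
    if result.get("action") != expected_action:
        return False
    return float(result.get("score", 0.0)) >= min_score
```

A result tagged "signup" would be rejected on the login endpoint here even with a high score, because the tag doesn’t match the step being protected.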

Types of reCAPTCHA v2

Checkbox reCAPTCHA

In reCAPTCHA v2, you can choose between the checkbox reCAPTCHA and the invisible reCAPTCHA badge. The checkbox reCAPTCHA is the easiest option to integrate; it only requires two lines of HTML to render. It provides the user with the well-known “I’m not a robot” checkbox that users need to tick. Once the checkbox is ticked, the user will either pass immediately or, if reCAPTCHA is still uncertain about a request’s humanity, be presented with an image, text, or audio challenge.

Invisible reCAPTCHA Badge

The other reCAPTCHA v2 option is the invisible reCAPTCHA badge, which doesn’t require the user to click on a checkbox. Instead, an existing button on your website serves as the reCAPTCHA’s checkbox. This is the least intrusive option for users, but it requires a JavaScript callback, so it may be slightly slower. The only indication that reCAPTCHA is operating on a particular page with the invisible reCAPTCHA badge is the “powered by reCAPTCHA” logo near the button that has to be clicked.

reCAPTCHA v1

Google deprecated reCAPTCHA v1 in May 2016 and shut down the reCAPTCHA v1 API service on March 31, 2018. This broke reCAPTCHA on every website that still used the old API. If your website still relies on reCAPTCHA v1, its reCAPTCHA is broken, and you need to upgrade to version 2 immediately to continue to be protected from spam and abuse. Upgrading is relatively simple, and the updated reCAPTCHA service now has three options to choose from.

Advantages and Weaknesses of reCAPTCHA

ReCAPTCHA is easily the most-used CAPTCHA technology. Companies that use it include Facebook, Twitter, the BBC, and many others. While reCAPTCHA offers some advantages to its corporate users, it isn’t without its weaknesses.

Advantages of reCAPTCHA

ReCAPTCHA is essentially free. All you need to do is sign up for an API key pair for your website. Costs only come into play once you pass a million assessments per month across all your accounts and websites, making reCAPTCHA free for all but the biggest websites.
ReCAPTCHA v3 does not visibly worsen the user experience. Unlike v2, reCAPTCHA v3 is essentially invisible for users because it won’t serve up any challenges (unless you set them up yourself). The invisible reCAPTCHA badge from reCAPTCHA v2 is also fairly unobtrusive.
ReCAPTCHA blocks some bot traffic. Many companies use reCAPTCHA because it stops the most basic bots and prevents websites from being flooded with spam and abuse perpetrated by simple bots.
ReCAPTCHA is easy to implement. It only requires connecting to a JavaScript API or, in the case of reCAPTCHA v2, adding a few lines of HTML to the relevant pages. ReCAPTCHA also offers several web app plugins to ease the integration of the technology.

Weaknesses of reCAPTCHA

ReCAPTCHA v3 is hard for website admins. The v3 technology may be invisible to users, but it puts the onus on website admins to decide when and how to block bots. What constitutes a low score? At what point do you serve a challenge? These are extremely hard questions that reCAPTCHA v3 does not answer.
ReCAPTCHA doesn’t work that well against bots. Three researchers from Columbia University created a low-cost reCAPTCHA attack that solved 70.78% of image reCAPTCHA challenges. The most damaging bots no longer have any difficulty circumventing reCAPTCHAs, whether they do so with AI, their internal logic, or through CAPTCHA farms.
ReCAPTCHA v2 is frustrating and makes the web less accessible. A reCAPTCHA v2 challenge will stop a user dead in their tracks, often at a crucial point in their customer journey, like when they want to log in, make a purchase, or sign up for your newsletter. That interruption hurts the user experience. On top of that, CAPTCHAs make the web less accessible for people with disabilities.
ReCAPTCHA makes it hard to stay GDPR-compliant. The more data reCAPTCHA v3 collects, the better it works, but GDPR and other data privacy frameworks require a legal basis to process data, such as user consent or legitimate interest. Because there are reCAPTCHA alternatives that collect far less data and because reCAPTCHA uses tracking cookies, reCAPTCHA technology is problematic for any company that needs to adhere to a data privacy framework.
Privacy-conscious users receive lower reCAPTCHA scores. Two researchers from the University of Toronto discovered in 2019 that reCAPTCHA v3 gives lower scores to those who don’t have a Google account on their browser. If a user browses with a private browser or a VPN, their reCAPTCHA v3 scores are much lower than when they don’t use those tools. This means that privacy-conscious users are more likely to receive some kind of bot challenge while browsing.

ReCAPTCHA FAQs

How does reCAPTCHA know I’m not a robot?

ReCAPTCHA v3 uses behavioural and risk analysis to analyze every request and determine which request is likely to come from a robot and which one isn’t. ReCAPTCHA v2 issues a checkbox to all the requests that it’s uncertain about. If still unsure once the checkbox has been clicked, it can issue an image, audio, or text challenge to finalize its assessment of a request.

Can reCAPTCHA be fooled?

Bots fool reCAPTCHA all the time. It’s no longer difficult to create a bot that can either bypass or solve anything a reCAPTCHA throws at it, whether that’s just the passive monitoring of reCAPTCHA v3 or the image challenges of reCAPTCHA v2. Bots can do so with their internal logic, with the help of AI and machine learning, or through CAPTCHA farms where a human solves the CAPTCHA for them.

How do you use reCAPTCHA?

ReCAPTCHA v3 requires you to integrate a JavaScript API on all the pages where you want the technology to work. The more pages, the more efficient the technology. ReCAPTCHA v2 requires you either to add a few lines of HTML if you want the checkbox or to use a JavaScript API if you want the invisible reCAPTCHA badge.
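
The answer above covers the browser side. The token that the JavaScript API produces is then typically checked from your own server. The sketch below assumes Google’s publicly documented siteverify endpoint and its "secret" and "response" parameters; the function names and the five-second timeout are hypothetical, and error handling is left out for brevity.

```python
import json
import urllib.parse
import urllib.request

# Verification endpoint as documented by Google (not named in this article).
VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def build_verify_request(secret_key: str, token: str) -> urllib.request.Request:
    """Build the POST request that asks Google to verify a reCAPTCHA token."""
    body = urllib.parse.urlencode(
        {"secret": secret_key, "response": token}
    ).encode("ascii")
    return urllib.request.Request(VERIFY_URL, data=body, method="POST")


def verify_token(secret_key: str, token: str, timeout: float = 5.0) -> dict:
    """Send the verification request and return the parsed JSON reply."""
    request = build_verify_request(secret_key, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping the request builder separate from the network call makes the form encoding easy to check without contacting Google. The secret key belongs on the server only and should never be shipped to the browser.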