Breaking CAPTCHA or why you should be using reCAPTCHA V3

Jeffery Tay
Jul 21, 2021

CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) was created in the early 2000s to determine whether the actions being taken in a browser are performed by a real user and not by a spammer's bot.

Implementing Captcha

Early versions of captcha render a piece of text and apply various distortions to it (e.g. overlaid lines, character rotation, reduced spacing between characters). The purpose is to confound robots while still giving humans enough information to decipher the text.

Squiggly characters and lines
Adding a line to confound bots
Reducing space between characters

For Java systems, there is an existing library that does this nicely. You can refer to SimpleCaptcha and its documentation to create a myriad of Captcha images to confound the bots (and possibly the users as well!).

Sample Captcha images generated by SimpleCaptcha

With text-based captchas such as these, there is usually a trade-off between how hard the image is for a bot to read and how legible it remains for a human. It is certainly possible to make the Captcha more complex by adding fisheye and drop-shadow effects plus plenty of noise, but doing so will sometimes render the image almost indecipherable to humans too.

As a result, most implementations tend to gravitate towards simple distortions such as reducing character spacing, adding a noisy background or overlaying some elements on top of the characters.
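To make these simple distortions concrete, here is a minimal Python sketch of the layout step only. It is an illustration and not how SimpleCaptcha works internally: it picks a random challenge text, squeezes the characters together so they overlap, gives each one a random rotation, and chooses the endpoints of one overlaid line. The names plan_captcha and Glyph and all the pixel values are assumptions made for this example, and the actual drawing is left to whichever imaging library you prefer.

```python
import random
import string
from dataclasses import dataclass

# Lower-case letters and digits, the same character set used later in the article.
ALPHABET = string.ascii_lowercase + string.digits


@dataclass
class Glyph:
    char: str        # the character to draw
    x: int           # left edge of the character, in pixels
    rotation: float  # rotation to apply when drawing, in degrees


def plan_captcha(length=6, glyph_width=30, overlap=8,
                 max_rotation=25.0, height=60, rng=None):
    """Return (text, glyphs, line) describing a distorted captcha layout."""
    rng = rng or random.Random()
    text = "".join(rng.choice(ALPHABET) for _ in range(length))

    # Reduced spacing: each glyph starts before the previous one has ended.
    step = glyph_width - overlap
    glyphs = [
        Glyph(char, index * step, rng.uniform(-max_rotation, max_rotation))
        for index, char in enumerate(text)
    ]

    # One overlaid line running from the left edge to the right edge.
    total_width = step * (length - 1) + glyph_width
    line = ((0, rng.randint(0, height)), (total_width, rng.randint(0, height)))
    return text, glyphs, line
```

Raising overlap or max_rotation makes the image harder for OCR, but also harder for people, which is exactly the trade-off described above.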

Breaking SimpleCaptcha

For a start, we will attempt to solve a simple captcha implementation using the following code segment

To solve this, we will be using Tesseract OCR. Tesseract 4.0 introduces a new neural net (LSTM) based OCR engine which we will be using in this article.

The source code for this project can be found at my Tesseract-OCR GitHub repo.

1. First we set up a new console project using .NET Core

2. At the Target Framework screen, make sure you choose .NET 5.0 and then click Create

3. Now install Tesseract into your project using the following NuGet command

Install-Package Tesseract

4. You also need to pull down some trained data (these files are found inside the tessdata folder in my GitHub repo), or you can grab the latest off

For simplicity, a tessdata folder is created in the project, with each file set to “Copy if newer” so that it is copied to the compiled folder

tessdata subfolder configuration and loadout

5. A Tesseract config.txt file is also created with the following parameters. The most important one is tessedit_char_whitelist, which tells Tesseract to recognize only these characters

load_system_dawg false
load_freq_dawg false
tessedit_char_whitelist abcdefghijklmnopqrstuvwxyz0123456789
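To show what the whitelist buys us, the Python sketch below reads the same three settings (one "name value" pair per line) and uses the whitelist to sanity-check a candidate OCR result before it is submitted. This is only an illustration and not part of the project: parse_config and matches_whitelist are hypothetical helper names, and Tesseract itself applies the whitelist during recognition rather than afterwards.

```python
CONFIG_TEXT = """\
load_system_dawg false
load_freq_dawg false
tessedit_char_whitelist abcdefghijklmnopqrstuvwxyz0123456789
"""


def parse_config(text):
    """Parse 'name value' lines into a dictionary, skipping blank lines."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        name, _, value = line.partition(" ")
        settings[name] = value.strip()
    return settings


def matches_whitelist(candidate, whitelist):
    """True if the candidate is non-empty and uses only whitelisted characters."""
    return bool(candidate) and all(ch in whitelist for ch in candidate)
```

A result containing anything outside the 36 whitelisted characters cannot be a valid answer for this captcha, so a bot can discard it without spending an attempt.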

6. Now we put it all together in Program.cs.

Program.cs for TesseractCaptcha

Tesseract OCR in action

Running it gives the following output

OCR results using eng_best trained data, only 39%

That is not very confident, so we try again with a larger trained dataset (eng_default).

Higher confidence at 49%

This is still not good enough, but given that most captchas allow the image to be refreshed, one can always set up a bot to request a new captcha and repeat this process until it gets a higher-confidence response.
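The refresh-and-retry idea can be sketched in a few lines of Python. This is a hedged illustration rather than code from the project: solve_with_retries is a hypothetical helper, fetch_image and recognize are placeholders for "download a fresh captcha" and "run OCR and return the text with a confidence between 0 and 1", and the 0.8 threshold is an arbitrary assumption.

```python
from typing import Callable, Optional, Tuple


def solve_with_retries(
    fetch_image: Callable[[], bytes],
    recognize: Callable[[bytes], Tuple[str, float]],
    min_confidence: float = 0.8,
    max_attempts: int = 10,
) -> Optional[Tuple[str, float, int]]:
    """Request fresh captchas until the OCR confidence clears the threshold.

    Returns (text, confidence, attempt_number), or None if every attempt
    stayed below min_confidence.
    """
    for attempt in range(1, max_attempts + 1):
        image = fetch_image()                 # ask the site for a new image
        text, confidence = recognize(image)   # OCR the image
        if text and confidence >= min_confidence:
            return text, confidence, attempt
    return None
```

This is also why a traditional captcha needs rate limiting around it: a low per-image success rate is not much of a defence if refreshing the image costs the attacker nothing.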

Concluding thoughts

The code sample provided is intentionally bare, as the intent is not to demonstrate working samples for breaking captchas but rather to show that machine learning and trained models are now good enough to crack most traditional captcha systems without breaking a sweat.

Captchas are one of the methods recommended by OWASP to prevent automated and/or brute-force attacks.

With AI models increasingly built into systems, it is time for developers to explore and implement more user-friendly and secure captcha systems such as reCAPTCHA V3. Such implementations are transparent to users yet still do much to deter most web automation/bot systems.
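For readers who have not used it, reCAPTCHA V3 returns a score instead of showing a puzzle: the page obtains a token, and your server sends that token to Google's siteverify endpoint and decides what to do with the score that comes back. The Python sketch below shows only that server-side step, using the standard library. The function names, the 0.5 threshold and the "login" action are assumptions for illustration, so check the reCAPTCHA documentation before relying on it.

```python
import json
import urllib.parse
import urllib.request

VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def fetch_verification(secret, token, timeout=5.0):
    """POST the secret key and the client token, return the decoded JSON reply."""
    data = urllib.parse.urlencode({"secret": secret, "response": token}).encode()
    with urllib.request.urlopen(VERIFY_URL, data=data, timeout=timeout) as resp:
        return json.load(resp)


def is_probably_human(result, expected_action, min_score=0.5):
    """Accept only a valid token, issued for the expected action, with a high enough score."""
    return (
        result.get("success") is True
        and result.get("action") == expected_action
        and float(result.get("score", 0.0)) >= min_score
    )
```

A typical use would be is_probably_human(fetch_verification(secret, token), "login"), with low-scoring requests sent to a stricter path (for example an extra verification step) rather than simply blocked.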


