Security and Privacy

mCaptcha: Replacing Captchas with Rate Limiters to Improve Security and Accessibility

To counter the threat of AI-powered bots, captcha systems are growing increasingly complex, often sacrificing usability and accessibility. Can mCaptcha, a proof-of-work captcha system, effectively address these issues?

By Aravinth Manivannan, Sibi Chakkaravarthy Sethuraman, and Devi Priya Vimala Sudhakaran

Posted Sep 26 2024

For many years, publicly accessible Web applications have been protecting their services from bots and scripts by asking users to solve captchas (Completely Automated Public Turing tests to tell Computers and Humans Apart), puzzles designed to be challenging for machines to solve yet simple for humans, such as clicking on certain locations in an image or recognizing elongated characters or digits. Designed to stop robotic assaults like spamming, data scraping, and brute-force login attempts,¹ captchas act as a security precaution to determine whether a user is a human or a software program. Captcha techniques are employed in many different areas, including e-transactions, entering a website’s secure areas, gathering email signups, and ensuring that only humans vote when conducting polls and surveys. They are also used to hinder attackers and spammers from injecting malicious software into online registration forms. As such, captchas are also employed as a line of defense against threats such as DDoS attacks, dictionary attacks, malvertising, and botnet and spam attacks.

Key Insights

We developed a variable-difficulty-based proof-of-work captcha system, mCaptcha, that does not harm the Web user’s experience, cause accessibility issues, or jeopardize security.
We evaluated the system’s security and usability, and the results clearly demonstrate that it is effective at delivering reliable security protocols without sacrificing usability.
Further, we demonstrated that mCaptcha is capable of monitoring botnet attacks, as the botnet’s processing resources quickly ran out during the attack scenario.

Although captchas can add an extra layer of security, they often contain flaws that make them vulnerable to attacks. For example, South African threat actors Automated Libra bypassed captchas to open several GitHub accounts, eliminating the need for human intervention. A group of scientists from Arizona, Florida, and Georgia came up with an ML-based captcha decoder that they claim can solve 94.4% of the real-world captcha challenges in dark networks.² And a malware campaign has been uncovered that uses a creative captcha challenge to deceive visitors into installing the Gozi banking trojan.³ The appearance of attacks like these illustrates how challenging it is to create a safe captcha. But even outside of these high-tech attacks, captchas can be bypassed with the help of captcha farms,⁴ where a human’s skill in solving the puzzles can be obtained via an easy API call.

With advances in deep learning technology, bots, in particular, have been shown to be capable of accurately resolving captchas.⁵ The capabilities of these AI-powered bots highlight the limitations of traditional bots,⁶ which generally fail for a number of reasons:

Adaptability: As traditional bots are typically created with predetermined responses, it is challenging to scale them as the complexity of transactions and inquiries from users expands. Bots powered by AI are capable of evolving from new information, enabling them to cope with a broader range of inquiries and circumstances.
Reliability: Traditional bots demand human assistance to modify their responses and are unable to tackle emerging and shifting circumstances. Bots that use AI are capable of responding to variations in language acquisition, patterns, and user habits over the long term. They can benefit from additional insights and amend their reactions appropriately.
Human-like interactions: Traditional bots typically follow scripted interactions, providing predefined responses. AI-powered bots strive to mimic human-like communications, such as interpreting sentiment, voice, and feelings.
Natural language understanding (NLU): Traditional bots focus on simple content matching and lack the intelligence to interpret linguistic nuances. AI-powered bots leverage methods from natural language processing to comprehend and synthesize human-like content.
Complicated inquiries: Traditional bots struggle to produce precise responses when user queries become more complex and diversified. Bots powered by AI can comprehend complicated sentence patterns and semantics, allowing them to properly address a wider range of inquiries.

Due to these advances, captcha designers have been compelled to make their content more complex and difficult to solve. But this added complexity is having a detrimental impact on usability, often frustrating users. It also makes navigating the Web more difficult for those who have cognitive, visual, or auditory disabilities.

A further usability problem is rooted in the increasing dependence on Google accounts and the use of centralized captcha systems such as Google’s reCAPTCHA, which renders parts of the Internet unavailable for users within restrictive nations where Google cannot function. For instance, someone from China is unable to access any reCAPTCHA-protected services. AI-powered services like reCAPTCHA rely on massive datasets that are exclusive to large organizations such as Google and Cloudflare.²¹ Considering these datasets are private, even if a self-hosted version of the same system were to be developed, it wouldn’t be as efficient as the centralized versions maintained by these leading companies. We believe that centralization of the Web is detrimental to its proper functioning. Being dependent on large, remote data centers leads to unnecessary utilization of resources. And, as mentioned above, many websites are only useful to visitors who live in a specific region. Hosting websites geographically nearby will thereby improve resilience and service independence while consuming fewer resources. But as AI-enabled captchas cannot be self-hosted, we are unlikely to gain the benefits of decentralization with these existing large captcha systems. As we will discuss later, we believe that this can be achieved more effectively with a decentralized (that is, self-hostable) captcha solution.

Since the introduction of captcha, several variations have emerged, including text-based, image-based, math-based, time-based, and puzzle-based captchas.¹,⁷,⁸,⁹ Solving these captchas depends on a person’s capacity for generalization and the ability to detect new patterns that stem from past observations. Traditional bots, however, typically enter random characters or follow predetermined patterns and, in light of this limitation, are unlikely to choose the correct combinations. Another discrete variety of captcha is social media single sign-on (SSO),¹⁰ in which users must first log in to an account on social media, after which their data is subsequently filled out using the one-time sign-on feature. The registration form is easy to complete, and the user thereafter establishes their identity by conveying an active social network profile. Yet another captcha technique is Google’s reCAPTCHA, which was designed to combat sophisticated bots that might bypass traditional captcha testing. The end-user experience is also made simpler, as it takes only one click to verify that someone is not a bot.¹¹,¹² When a user ticks the “I’m not a robot” checkbox, Google’s system begins an intricate process that records user behavior. To determine whether a user is a bot, the technology examines the cookies maintained in the browser as well as the device’s event history. If it is unable to accurately determine that the user is human, a standard captcha is displayed. Finally, checkbox captcha is a simple, easy-to-use technique that can offer only low-level security against automated bots.²² From the visitor’s viewpoint, it resembles our system, mCaptcha. However, it is less safe than mCaptcha and other sophisticated captcha techniques because bots can get around it by mimicking human behavior. Therefore, websites and applications with higher security requirements are advised to opt for alternative captcha techniques.

A major issue with reCAPTCHA and other well-known captcha systems is their collection of user pattern data, which compromises users’ privacy. Google’s reCAPTCHA requires users to provide their data to Google’s database for examination. This poses privacy issues, as Google is able to collect data on visitors, potentially connecting it to their Google profiles or other Internet activities. If user confidentiality¹³ is a priority, a similar service, hCAPTCHA, can be used as an alternative to reCAPTCHA.

Among the other types of captcha systems found in the literature are proof-of-work (PoW) schemes,¹¹,¹³,¹⁴,¹⁵,¹⁶,¹⁷,¹⁸ which rely on one party proving to the other that a certain amount of computational work has been completed. Few have been deployed, though, as they require special client software that must be widely used in order to prevent access from clients who haven’t installed it.

This article outlines the design, execution, and assessment of a unique, variable-difficulty-based PoW captcha system, mCaptcha, that addresses the security, usability, and privacy issues with current captcha systems described here. Our broader goal is to stop attempting to distinguish between humans and robots and return to captcha’s original intent: providing denial of service (DoS) protection.

Related Work

PoW-based captchas impart an additional degree of security by making it more difficult for hostile bots to automate behaviors on a webpage. These systems demand a lot of computational resources, which makes it economically difficult for attackers to operate various bots. Google’s reCAPTCHA,¹⁹ a PoW-based captcha system, separates humans from bots using cutting-edge risk-evaluation methods. Based on how users behave on the website, it assigns a score for each user engagement. The hCAPTCHA¹³ system confirms human identity by combining sophisticated algorithms with user engagement. In contrast to other captchas, hCAPTCHA occasionally delivers challenges that are more difficult. Although this is meant to make it more challenging for automated programs to bypass the captcha, it also causes discomfort and inconvenience to users, particularly ones with disabilities or cognitive limitations. Coinhive, a controversial PoW-based captcha, used cryptocurrency mining as proof of work.²⁰ Visitors had to confirm their human identity by mining cryptocurrency (Monero) for a specified period. Due to misuse by hostile entities, however, it was forced to shut down in March 2019. With Solve Media’s PoW-based captcha, visitors must complete short riddles or specific activities in order to gain access to content. As it is dependent upon a set of predetermined riddles or tasks, this approach could prove less successful if automated systems or scripts can recognize and complete the puzzles.

Wang et al.²⁴ experimentally investigated the impact of current breaches of text-based captcha methods. They used a unified assault architecture that incorporated several attack modules and transfer learning algorithms to rigorously assess the performance of known assaults against 20 captcha schemes in terms of precision and efficacy. To get around the drawbacks of these conventional, character-based captchas, various designs, such as 3D-based and animated captchas,²³ have recently been suggested. Rendering 3D models to images is a key component of 3D captcha, but it has been observed that this strategy is vulnerable to attacks.²⁵,²⁶ With animated captchas, a time dimension is added, but advanced bots can use machine-learning algorithms to recognize and predict patterns in time-based challenges. Deep convolutional neural networks (CNNs) have been used in several captcha studies.²⁷,²⁸,²⁹,³⁰ But due to the severe distortions typically utilized in captchas, including distorted letters, background noise, overlapping segments, and camouflage techniques, deep CNN models struggle to correctly distinguish and categorize objects or characters in the rendering. It is challenging to produce a large and varied dataset of labeled captcha images, owing to the need for human labeling or the use of specialized technologies. This limitation consequently hinders the deep CNN models’ ability to effectively learn from the training data.³¹

Table 1 shows a comparison of our system, mCaptcha, with some of the other popular captchas discussed here.

Table 1. Comparison of mCaptcha with other popular captchas.

Parameters	reCAPTCHA	hCaptcha	Friendly Captcha	MTCaptcha	Forest/ pow-captcha	mCaptcha
IP-based	Yes	No	No	Yes	No	No
Works in Tor net	No	No	Yes	No	Yes	Yes
Idempotent Validation	No	No	Not open source and idempotency undocumented	Unknown	Yes	Yes
Analyzes User Agent State	Yes	Yes	Not open source and undocumented	Unknown	No	No
Transparent Validation Mechanism	No	No	Not open source and transparency is undocumented	Unknown	Yes. Software is open source; idempotency is verified	Yes
Bypass Methods	OCR, human-powered captcha farms	OCR, human-powered captcha farms	ASICs, very large botnets	OCR bots, ASICs	Large botnets	ASICs, very large botnets
Accessibility	Inaccessible to users with visual, auditory, or cognitive impairments	Inaccessible to users with visual, auditory, or cognitive impairments	Accessible from screen readers	Yes	Accessible from screen readers	Accessible to all

System Model

Figure 1a illustrates the overall design of the mCaptcha system, while Figure 1b shows the sequence of interactions between the components. The server hosting the mCaptcha instance (mCaptcha server) generates the challenge and verifies the proof of work. The website server represents an instance of server-side software that uses Hypertext Transfer Protocol (HTTP) to run the request and response model. It handles inbound HTTP requests from visitors/clients and serves resources over the Internet. The visitor uses the software component that enables communication with website servers. It allows applications to submit HTTP requests to the server and receive HTTP responses.³² The website administrator embeds the mCaptcha into the webpage or form through which resources can be accessed.

The visitor prepares HTTP requests and communicates them to the website server. These requests typically include the URL of the site, the method (GET/POST/PUT/DELETE), the request headers (such as message type, authorization credentials, and so on), and, in certain cases, the request body (for methods like POST/PUT). The HTTP request is parsed by the website server to extract essential information, including the method used for the request, the webpage being requested, request headers, and so on. The server then sends the mCaptcha widget for the requested webpage to the visitor. When the client browser initiates the mCaptcha, the mCaptcha server receives the challenge request. The challenge and a special string identifying the client are then enclosed in the challenge response. The client solves the challenge and forwards the PoW to the mCaptcha server, which then verifies its correctness and generates a single-use, time-bound, client-specific access token for authorization. Once the token has been received, the client sends it to the website server. The website server then communicates with the mCaptcha server to validate the token, and if the validation is successful, the client obtains the resource they requested.

mCaptcha Performance Parameters

Accessibility. To control access to a service, the mCaptcha server instance analyzes a website’s difficulty factor. When website traffic is average, a low-complexity challenge is offered to visitors, enabling them to access the resource quickly. When the traffic volume increases, the difficulty level is raised to prevent the website from receiving too many requests. The mCaptcha instance can be updated by the website administrator in accordance with the traffic. If necessary, the difficulty factor can be adjusted to impose delays as required.

Protection. Users of mCaptcha must complete a specified amount of computing labor, usually requiring sophisticated computations or procedures, before their request is approved.³³ The time required to finish the computational task is predetermined. Given that it takes a long time to complete each challenge, bots find this time restriction to be a further hurdle, as it slows down their capacity to quickly complete automated tasks. It is difficult for bots to rapidly compute a large number of challenges, as solving each challenge consumes considerable system resources, such as processor cycles, storage, and so on. Due to the limited resources with which bots normally operate, the expense of completing several captcha challenges increases. A maximum 10-second delay on an authentication endpoint for generating PoW is appropriate for genuine users. Bots, on the other hand, deliver multiple requests from a single device, necessitating a multitude of computational rounds to calculate the PoW for each request, which proves to be expensive. This asymmetric compute resource requirement effectively protects the majority of risk levels.
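The asymmetry described above can be made concrete with a little arithmetic. The sketch below is only an illustration: it borrows the median solve time reported for one difficulty factor in Table 3 and assumes, purely for the sake of the example, that a bot solves challenges one after another on a single core with hardware comparable to the surveyed devices. The function name is hypothetical.

```python
def total_solve_seconds(requests: int, seconds_per_challenge: float) -> float:
    """CPU time needed to solve one challenge per request, sequentially on one core."""
    return requests * seconds_per_challenge


# 50th-percentile solve time at difficulty factor 4,150,002 (taken from Table 3).
MEDIAN_SOLVE_SECONDS = 4.5232

# A genuine visitor solves a single challenge; a bot must solve one per request.
visitor_cost = total_solve_seconds(1, MEDIAN_SOLVE_SECONDS)
bot_cost = total_solve_seconds(1_000, MEDIAN_SOLVE_SECONDS)

print(f"one visitor request: {visitor_cost:.1f} s")
print(f"1,000 bot requests:  {bot_cost / 60:.1f} CPU-minutes")
```

Under these assumptions a single visitor waits a few seconds, while 1,000 automated requests cost roughly 75 CPU-minutes. Real attackers may parallelize or use faster hardware, so the figure is indicative only.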

Rate-limiting. To track the traffic generated by the visitors/clients, mCaptcha uses a customized leaky-bucket-powered cache. The algorithm operates via the metaphor of a bucket with an unlimited capacity and a small opening at the bottom.³⁴ As each request arrives, one unit is added to the bucket. Through the aperture at the bottom (specified as the cool-down period), the bucket leaks at a specified rate. This guarantees that the network traffic does not exceed the required rate, thus ensuring that there are no sudden surges in traffic that overwhelm the network, maintaining a steady and controlled flow.
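The leaky-bucket metaphor can be sketched in a few lines. This is a minimal illustration of the general technique as described in the paragraph above, not mCaptcha's actual cache: the class name, the `leak_interval` parameter (seconds per leaked unit), and the injectable clock are assumptions introduced for the example.

```python
import time


class LeakyBucketCounter:
    """Tracks recent requests; one unit drains every `leak_interval` seconds."""

    def __init__(self, leak_interval: float, clock=time.monotonic):
        self.leak_interval = leak_interval  # hypothetical name for the leak rate
        self.clock = clock                  # injectable so the sketch is testable
        self.level = 0                      # units currently in the bucket
        self.last_leak = clock()

    def _leak(self) -> None:
        """Drain the units that should have leaked since the last update."""
        elapsed = self.clock() - self.last_leak
        leaked = int(elapsed // self.leak_interval)
        if leaked > 0:
            self.level = max(0, self.level - leaked)
            self.last_leak += leaked * self.leak_interval

    def add_request(self) -> int:
        """Record one incoming request and return the current level."""
        self._leak()
        self.level += 1
        return self.level

    def current_level(self) -> int:
        """Return the level after applying any pending leakage."""
        self._leak()
        return self.level
```

The level returned by `add_request` is the traffic estimate that a difficulty-selection step could consume: it rises during a burst and decays back toward zero once requests stop.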

Site key. This refers to an identification function in mCaptcha. A site key is generated by the website owner or admin using the mCaptcha administrative dashboard when mCaptcha has been installed on the website. The site key is used to uniquely identify the mCaptcha configuration associated with the site.

Difficulty factor. The level of complexity or effort needed to complete particular activities on a website application is determined by its difficulty factor, which can be customized depending on the specific requirements and objectives of the website. This usually entails establishing an equilibrium between keeping activities accessible and achievable for human users while rendering them difficult for bots, to prevent automated misuse. Specific criteria could be used to assess the difficulty factor in accordance with the website’s objectives, intended viewers, and potential threats. A website’s visitor traffic statistics must be organized into tiers for mCaptcha to deploy appropriate difficulty factors. For instance, looking at the configuration in Table 2, a Level 1 difficulty factor (5,000) will be applied if the website receives 2,000 requests. The difficulty factor can be raised to level 2 (50,000) if 5,000 requests are received. The difficulty factor can significantly increase or decrease depending on the traffic stream received. Care must be taken when selecting the difficulty factor, as extremely high levels could make it impossible for users with slower devices to complete the captcha.

Table 2. Sample configuration for difficulty factor and threshold levels.

Level	Difficulty Factor	Visitor Threshold
1	5,000	2,000
2	50,000	5,000
3	500,000	10,000
4	5,000,000	15,000
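The tier lookup implied by Table 2 can be sketched as follows. The boundary handling is an assumption: the article says a level applies when the site "receives" the stated number of requests, so this sketch activates a level once its threshold is reached and falls back to Level 1 below the first threshold. The function and constant names are hypothetical.

```python
from bisect import bisect_right

# (visitor threshold, difficulty factor) pairs copied from Table 2.
LEVELS = [
    (2_000, 5_000),
    (5_000, 50_000),
    (10_000, 500_000),
    (15_000, 5_000_000),
]
THRESHOLDS = [threshold for threshold, _ in LEVELS]


def difficulty_for(visitors: int) -> int:
    """Return the difficulty factor of the highest level whose threshold is reached."""
    index = bisect_right(THRESHOLDS, visitors) - 1
    # Assumption: traffic below the first threshold still gets Level 1.
    index = max(index, 0)
    return LEVELS[index][1]


for count in (500, 2_000, 5_000, 12_000, 40_000):
    print(count, "->", difficulty_for(count))
```

Because the levels are sorted by threshold, a binary search keeps the lookup cheap even if an administrator defines many tiers.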

Proof of work (PoW). The PoW algorithm employed in mCaptcha (Figure 2) uses a SALT (a 32-bit random string) to bring more unpredictability or distinctiveness to the computational work. Salting adds random data to the hash function’s input, ensuring a unique hash even when the other inputs are the same.³⁵ This potentially makes it harder for bots to apply optimization techniques or precompute solutions. Consequently, the distinctive hash created by adding the salt defends our approach against several attack vectors, such as rainbow table attacks, dictionary attacks, and brute-force offline attacks.

Input: SALT, Si (SerializedInput), Df (DifficultyFactor)
Output: unsigned16-bit(Digest), nonce, Si
function mHash(SALT, Si, Df)
  nonce ← 0
  Start ← Concat(nonce, SALT, Si)
  Digest ← SHA256(Start)
  // keep searching until the digest meets the target the server checks (Digest ≥ Df)
  while (Digest < Df) do
    nonce ← nonce + 1
    Start ← Concat(nonce, SALT, Si)
    Digest ← SHA256(Start)
  end while
  return (unsigned16-bit(Digest), nonce, Si)
end function

Figure 2. mCaptcha: Proof-of-work algorithm.
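A runnable sketch of the nonce search may help make Figure 2 concrete. Several details are assumptions rather than statements about mCaptcha's implementation: the fields are concatenated as decimal and text strings, the first 16 bytes of the digest are read as a big-endian unsigned integer (following the text's "first 16 bytes"), and `threshold_for` is a hypothetical mapping from the difficulty factor to a numeric target, chosen so that roughly `difficulty_factor` attempts are needed on average. The loop stops once the digest meets the target, matching the server-side check that the digest is greater than or equal to the target.

```python
import hashlib

MAX_U128 = (1 << 128) - 1


def threshold_for(difficulty_factor: int) -> int:
    """Hypothetical mapping: a larger factor leaves a smaller accepted range."""
    return MAX_U128 - MAX_U128 // difficulty_factor


def score(salt: str, serialized_input: str, nonce: int) -> int:
    """SHA-256 of the concatenated fields; first 16 bytes as an unsigned integer."""
    start = f"{nonce}{salt}{serialized_input}".encode()
    return int.from_bytes(hashlib.sha256(start).digest()[:16], "big")


def m_hash(salt: str, serialized_input: str, difficulty_factor: int):
    """Search for a nonce whose score meets the target; return (digest, nonce, Si)."""
    target = threshold_for(difficulty_factor)
    nonce = 0
    digest = score(salt, serialized_input, nonce)
    while digest < target:
        nonce += 1
        digest = score(salt, serialized_input, nonce)
    return digest, nonce, serialized_input


digest, nonce, site = m_hash("example-salt", "example-site", 5_000)
print(f"solved after {nonce + 1} attempts")
```

Verification costs a single hash, whereas solving costs many, which is the asymmetry the scheme relies on.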

To identify each website that a client or visitor wants to connect to, SerializedInput, a distinct 32-bit string, is employed. Each website will have a distinct DifficultyFactor. The client responds to the challenge with the nonce, or the number of iterations it took the client to generate PoW. The first 16 bytes of the unsigned integer format of the digest are then designated as PoW and transmitted to the server. Although PoW has been criticized for being energy inefficient, its asymmetric computational power requirement offers an effective mechanism to thwart harmful behaviors. It also makes it possible for parties with fewer resources to build an effective defense against more powerful opponents.

Access token. Figure 3 illustrates access token generation in the mCaptcha server. The client or visitor presents their PoW for validation to the mCaptcha server. For the specific Si, the mCaptcha server retrieves the DifficultyFactor (Df) and the SALT value. It then examines whether the Digest from the client is greater than or equal to Df. If this is true, then the server further calculates the hash value of the nonce, SALT, and Si to compare with the PoW received from the client. If the PoW is valid, mCaptcha will provide a time-constrained, one-time-use access token. The website server verifies this access token with the mCaptcha server instance before allowing access to protected resources, and only if the access token is valid is access granted.

Input: unsigned16-bit(Digest), nonce, Si
Output: AccessToken
function mCAPTCHAServer(Digest, nonce, Si)
  SALT ← GET_SALT
  Df ← Get_Df(Si)
  // accept only a digest that meets the target and can be recomputed from the nonce
  if Digest ≥ Df then
    Calc_Digest ← unsigned16-bit(SHA256(nonce, SALT, Si))
    if Calc_Digest ≠ Digest then
      Valid ← false
    else
      Valid ← true
    end if
  else
    Valid ← false
  end if
  if Valid = true then
    AccessToken ← Generate_token()
  end if
  return AccessToken
end function
function Generate_token()
  Token ← random()
  return Token
end function
function Get_Df(Si)
  Df ← get Df from database for Si
  return Df
end function

Figure 3. mCaptcha: Server instance algorithm.
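The server-side steps, together with the single-use, time-bound token described in the text, can be sketched as below. This is an illustration under stated assumptions, not mCaptcha's code: the class and method names, the 30-second token lifetime, the in-memory token store, and the mapping from difficulty factor to numeric target are all hypothetical, and the digest is again taken as the first 16 bytes of SHA-256 read as a big-endian integer.

```python
import hashlib
import secrets
import time

MAX_U128 = (1 << 128) - 1
TOKEN_TTL_SECONDS = 30.0  # hypothetical token lifetime


def score(salt: str, serialized_input: str, nonce: int) -> int:
    """SHA-256 of the concatenated fields; first 16 bytes as an unsigned integer."""
    start = f"{nonce}{salt}{serialized_input}".encode()
    return int.from_bytes(hashlib.sha256(start).digest()[:16], "big")


class CaptchaServer:
    def __init__(self, salt, difficulty_factors, clock=time.monotonic):
        self.salt = salt
        self.difficulty_factors = difficulty_factors  # maps Si -> Df
        self.clock = clock                            # injectable for testing
        self.tokens = {}                              # token -> (Si, expiry time)

    def verify_pow(self, digest: int, nonce: int, serialized_input: str):
        """Return an access token if the submitted PoW is valid, otherwise None."""
        factor = self.difficulty_factors.get(serialized_input)
        if factor is None:
            return None
        target = MAX_U128 - MAX_U128 // factor  # hypothetical Df-to-target mapping
        if digest < target:
            return None  # digest does not meet the difficulty target
        if score(self.salt, serialized_input, nonce) != digest:
            return None  # digest cannot be recomputed from the nonce
        token = secrets.token_urlsafe(32)
        self.tokens[token] = (serialized_input, self.clock() + TOKEN_TTL_SECONDS)
        return token

    def validate_token(self, token: str, serialized_input: str) -> bool:
        """Called by the website server; each token is accepted at most once."""
        entry = self.tokens.pop(token, None)
        if entry is None:
            return False
        site, expiry = entry
        return site == serialized_input and self.clock() <= expiry
```

Removing the token from the store on its first lookup is what makes it single-use, and the stored expiry time makes it time-bound.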

Why SHA-256? In our approach, we use the SHA-256 hashing algorithm. As a standardized method, SHA-256 is supported by a wide range of applications, libraries, programming languages, and hardware elements. Because of its compatibility, SHA-256-based PoW is viable and frequently used. According to FIPS-180, SHA-256 can be broadly comprehended as featuring two basic phases: preprocessing, which involves padding, dividing the data into segments, and setting initialization values, and hash calculation, which is the procedure that generates hash values through a sequence of operations. These different hash values are taken into account to get the final 256-bit hash digest. Computing a hash needs a substantial amount of time and processing resources. This processing complexity represents a crucial component of PoW, since it makes sure that visitors have to spend an ample amount of time to solve the cryptographic challenge and determine a valid hash. This requirement offers security to the website by making it challenging and expensive for adversaries to tamper with or dominate the PoW computation. It is feasible to break the mCaptcha by computing the SHA-256 algorithm at a faster rate if you have almost unlimited resources and time. However, the cost of the resources required to launch such an initiative makes it practically unaffordable.

mCaptcha Analysis and Discussion

Experimental setup. The experimental layout for various analyses is as follows:

For security analysis: To secure an application, we deploy and set up mCaptcha, hosted on a server with a little-endian, x86_64 architecture and a 12-core CPU. We replicate Web user behavior, including DDoS traffic, using the Python-based open source framework Locust³⁶ on the server. Locust handles load testing, which examines the system’s tolerance and behavior under a predetermined predicted load.³⁷ It also tests websites for load capacity and the number of simultaneous visitors a system can support. The master instance controls the Web interface for Locust and instructs the workers when to start up and terminate visitors. The workers manage visitors and provide the master with analytics. However, no visitors run on the master instance natively.
For usability analysis: We use benchmarks to determine how well mCaptcha performs in terms of usability. To conduct the evaluation, we developed a real-world survey website that is automatically tailored for both PC and mobile clients. (The mCaptcha assessment’s visual representation and usability survey, shown in Figures 4 and 5, are available at https://ux-survey.mcaptcha.org/.) The user experience was created with touch displays and smaller screens in mind, and was examined to ensure that the computational requirements were appropriate for portable devices.

Figure 5. Sample instance of mCaptcha survey GUI.

Usability analysis. Prior to the usability evaluation, we carried out a pilot test with five volunteers, whose principal aim was to assess any technical or usability problems with the proposed user experience testing. Technical checks allowed us to determine whether the test was operating as planned, including being sufficiently easy to understand, and included all the required responses. Both checks turned up no defects or vulnerabilities. Each participant received information on the anonymity of the usability testing, that it was mCaptcha’s usability that was being evaluated, and that the data gathered would only be used for research.

We received 252 responses to our public-Internet-wide survey. After eliminating duplicate entries using the participants’ IP addresses and browser fingerprints, 169 responses remained. In the breakdown of user satisfaction in Figure 6, the largest group of participants, 76 in all, responded that they were very likely to recommend mCaptcha to others. This shows that a sizable fraction of the people surveyed have a strong positive predisposition toward mCaptcha. The notable rise in the number of participants giving higher ratings suggests that most participants are either satisfied or extremely satisfied.

We further analyzed the mCaptcha user experience by plotting the difficulty factor against time. The graph in Figure 7 depicts the link between the difficulty factors of captcha challenges and the amount of time taken by visitors to complete them. The x-axis represents threads; the y-axis represents the difficulty factor, which grows as the number of requests climbs; and the z-axis represents time. The mCaptcha process begins with a lower difficulty factor, gradually increasing its value. We assess usability by examining the completion rate and time to solve. Given that the visitor only needs to click the captcha, even people with disabilities, such as those with cognitive impairments, can comfortably finish the task. We designed a dependable difficulty-adjustment mechanism that analyzes user performance and adjusts the difficulty level to achieve an ideal balance between user satisfaction and security.

Table 3 gives an overview of the difficulty factors we tested and the corresponding user completion times. The 50th percentile represents the time within which 50% of users finish the mCaptcha challenge successfully. Completion times rise at every percentile as the difficulty factor increases, which means that higher difficulty factors require users to devote more time to completing the challenge.

Figure 6. Insights from usability analysis: Breakdown of responses received.

Figure 7. Difficulty factor versus time.

Table 3. Difficulty factor along with various percentile times.

Difficulty Factor	Percentile	Time (s)
1,069,993	25th percentile	0.3395
	50th percentile	0.5147
	75th percentile	0.6905
	90th percentile	1.0061
	99th percentile	2.9128
4,150,002	25th percentile	3.1445
	50th percentile	4.5232
	75th percentile	6.3923
	90th percentile	12.1736
	99th percentile	29.0995
6,550,004	25th percentile	3.9836
	50th percentile	5.7105
	75th percentile	8.2381
	90th percentile	16.3418
	99th percentile	34.8196
14,760,000	25th percentile	9.0467
	50th percentile	13.7065
	75th percentile	20.3819
	90th percentile	36.4154
	99th percentile	86.5972

Security analysis. Locust was initially configured with 12 CPU cores and a spawn rate (the rate at which virtual users are generated and begin to interact with the program under consideration) of 5 users per second until it reached 250 concurrent virtual users and an RPS (requests per second: the rate at which the target application is receiving requests) of 20 and a load factor of 6. The bot visitors are made to send requests to the application’s expensive endpoint by solving the mCaptcha, and thus gauge the amount of traffic the application receives. By changing the number of bots, we investigated the effect of the rate of requests.

The behavior of mCaptcha on a website is fully customizable by the site administrator. To raise the load factor, we increased the configuration to 25 RPS with 300 concurrent users, which produced a load factor of 10. With configuration 1 on the website, shown in Table 4, we observed that an attack took longer to be detected. When configuration 1 was changed to configuration 2 and the RPS was increased to 25, to maintain the highest sustainable output and a 30-second cool-down period (the appropriate detection threshold is 1/2 the cool-down period), the observed load factor was 12. The visitor count in configuration 2 was halved to further shorten the detection time; we also switched to an embedded cache to reduce network latency and load on the server.

We set the number of worker nodes to 11, on the presumption that this would give a stable number of concurrent users; the resulting load factor was 7. At that point, Locust crashed. As a result, we reduced the number of worker nodes to 10 and waited for the cool-down period. A stable curve was obtained after the service was restarted, showing that an attack had been detected. To make the curve climb, we limited the number of users during the attack to 20 at a spawn rate of 10 per second. As 20 bots/visitors produced only 2.5 requests per second, which is too little, we increased the number to 50 to verify whether it fell under the detection threshold. With eight worker nodes and a cool-down period of 30 seconds, the attack was detected and contained, and the resulting curve was reasonable. To test whether the difficulty factor would diminish, we reduced the number of concurrent users to 30, and the difficulty factor dropped as expected. We then raised the number of concurrent users to 1,000. This also resulted in the detection and containment of the attack. Table 5 lists the final test environment configuration.

The DDoS attack simulation result graph is shown in Figure 8. The number of concurrent users is manually increased and decreased over time to observe the proof-of-work algorithm’s behavior (Figure 8c). Figure 8a represents the total requests received per second and Figure 8b depicts the response time of visitors/users.

Table 4. Details of website configuration.

Configuration 1
Visitors	Difficulty Factor
100	5,000
650	50,000
670	500,000
Configuration 2
Visitors	Difficulty Factor
100	5,000
200	50,000
220	500,000

Table 5. The final test environment configuration.

mCaptcha Configuration
Visitors	Difficulty Factor
1,000	5,000
1,100	50,000
1,200	500,000
Locust Configuration
Item	Value
Number of Worker Nodes	8
Max. Load factor	10.9
Peak Concurrent Users (Attack Condition)	1,000
Min. Concurrent Users (Normal Condition)	30

Since mCaptcha demands a substantial amount of computational work, automating the solving process is difficult and financially unviable for attackers, which makes mCaptcha resistant to automated solving algorithms and techniques. The challenges are also generated with ample randomness, and variability is achieved by revising the difficulty factor according to the traffic rate. This prevents attackers from using a deterministic or static-task-creation approach to improve their algorithmic solutions. SHA-256 has 2^256 possible hash values, making it unlikely for two distinct documents to coincidentally have the same hash value. Salt is added in mCaptcha to further limit the potential for collision attacks. The rate at which challenge requests are processed is capped by the rate-limiting strategy enforced by the leaky bucket, which limits the number of requests that can be made within a given period. This can defend against brute-force attacks, which rely on rapidly sending many attempts.
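
The relationship between the salt, the SHA-256 hash, and the difficulty factor can be illustrated with a minimal sketch. It is an illustration under stated assumptions, not mCaptcha’s actual implementation: the function names, the string concatenation order, and the rule that the first 16 bytes of the digest must reach a target derived from the difficulty factor are all assumptions chosen so that a solver needs roughly as many hash attempts as the difficulty factor.

```python
import hashlib

U128_MAX = (1 << 128) - 1


def threshold(difficulty_factor: int) -> int:
    """Assumed target: a higher difficulty factor leaves fewer acceptable hashes."""
    return U128_MAX - U128_MAX // difficulty_factor


def score(salt: str, phrase: str, nonce: int) -> int:
    """Interpret the first 16 bytes of SHA-256(salt + phrase + nonce) as an integer."""
    digest = hashlib.sha256(f"{salt}{phrase}{nonce}".encode()).digest()
    return int.from_bytes(digest[:16], "big")


def solve(salt: str, phrase: str, difficulty_factor: int) -> int:
    """Client side: try nonces in order until one reaches the target."""
    target = threshold(difficulty_factor)
    nonce = 0
    while score(salt, phrase, nonce) < target:
        nonce += 1
    return nonce


def verify(salt: str, phrase: str, difficulty_factor: int, nonce: int) -> bool:
    """Server side: a single hash is enough to check the submitted nonce."""
    return score(salt, phrase, nonce) >= threshold(difficulty_factor)


if __name__ == "__main__":
    # Hypothetical salt and challenge phrase; 5,000 is the lowest factor in Table 5.
    nonce = solve("example-salt", "example-phrase", 5_000)
    print(nonce, verify("example-salt", "example-phrase", 5_000, nonce))
```

The asymmetry is the point of the design: the solver performs many hash attempts on average, while the verifier performs exactly one.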

Computing a challenge consumes computing resources such as processing cycles and memory accesses. Depending on the implementation and the specific hardware, these resources and their usage patterns will differ. This diversity makes it harder for adversaries to spot resource-utilization patterns or correlations that could be exploited in side-channel attacks. To verify whether the frequency of requests fluctuates, we altered the bot user count. The rate of requests initially escalates as the number of active bots rises, then suddenly declines. This demonstrates that the attack was detected via mCaptcha’s variable proof-of-work challenges, which then stiffened defenses by increasing the difficulty level. When the number of bot requests drops, the mCaptcha instance notices and lowers the difficulty level.
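
The way the difficulty rises and falls with traffic can be sketched as follows, using the visitor thresholds and difficulty factors from Table 5. Two assumptions are made for illustration and are not confirmed by the article: a level’s difficulty factor applies while the visitor count is at or below that level’s threshold (with the highest factor applying beyond the last threshold), and each request stops counting as a visitor once the cool down period has elapsed. The class and function names are hypothetical.

```python
from bisect import bisect_left
from collections import deque

# (visitor threshold, difficulty factor) pairs taken from Table 5.
LEVELS = [(1_000, 5_000), (1_100, 50_000), (1_200, 500_000)]


def difficulty_for(visitors: int, levels=LEVELS) -> int:
    """Pick the difficulty factor for the current visitor count (assumed rule)."""
    thresholds = [limit for limit, _ in levels]
    index = min(bisect_left(thresholds, visitors), len(levels) - 1)
    return levels[index][1]


class VisitorCounter:
    """Counts recent requests; each one expires after the cool down period."""

    def __init__(self, cooldown_s: float):
        self.cooldown_s = cooldown_s
        self._arrivals = deque()  # arrival timestamps, oldest first

    def add(self, now: float) -> int:
        self._expire(now)
        self._arrivals.append(now)
        return len(self._arrivals)

    def count(self, now: float) -> int:
        self._expire(now)
        return len(self._arrivals)

    def _expire(self, now: float) -> None:
        while self._arrivals and now - self._arrivals[0] >= self.cooldown_s:
            self._arrivals.popleft()


if __name__ == "__main__":
    counter = VisitorCounter(cooldown_s=30.0)
    for _ in range(1_050):  # burst of bot requests at t = 0
        counter.add(now=0.0)
    print(difficulty_for(counter.count(now=1.0)))   # during the burst
    print(difficulty_for(counter.count(now=31.0)))  # after the cool down
```

Under these assumptions, a burst pushes the visitor count past the first threshold and raises the difficulty factor; once the burst has aged out, the count and the difficulty fall back to the lowest level.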

Figure 8. Findings from the simulation of a DDoS attack targeting mCaptcha: a) Total requests per second; b) Response time of visitors/users; c) Visitors per second.

Threat vector analysis. We also identified the specific threat vectors attackers could use to undermine mCaptcha.

Vector 1. The attacker can make requests for PoW configurations until the maximum difficulty factor is reached. The attacker then starts sending invalid PoW (that is, random PoW data rather than performing the computation). The server cannot get overwhelmed with computation in this scenario because the PoW computational parameters must be present in the cache for the hash to be generated.
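
The cache check that blunts Vector 1 can be sketched as follows. This is an illustration under stated assumptions rather than mCaptcha’s implementation: the class name, the in-memory dictionary standing in for the cache, the one-time use of each challenge, and the hash-comparison rule are all hypothetical. The point it shows is that a submission referring to a challenge that is not in the cache is rejected before any hash is computed.

```python
import hashlib
import secrets

U128_MAX = (1 << 128) - 1


class ChallengeCache:
    """Issues PoW challenges and hashes only submissions that match a cached one."""

    def __init__(self, salt: str):
        self.salt = salt
        self._pending = {}        # challenge phrase -> difficulty factor
        self.hashes_computed = 0  # server-side work, tracked for illustration

    def issue(self, difficulty_factor: int) -> str:
        phrase = secrets.token_hex(16)
        self._pending[phrase] = difficulty_factor
        return phrase

    def verify(self, phrase: str, nonce: int) -> bool:
        # Assumption: each challenge is single-use, so it is removed on lookup.
        difficulty = self._pending.pop(phrase, None)
        if difficulty is None:
            return False  # unknown or replayed challenge: no hash is computed
        self.hashes_computed += 1
        digest = hashlib.sha256(f"{self.salt}{phrase}{nonce}".encode()).digest()
        target = U128_MAX - U128_MAX // difficulty
        return int.from_bytes(digest[:16], "big") >= target


if __name__ == "__main__":
    cache = ChallengeCache(salt="example-salt")
    print(cache.verify("made-up-challenge", 12345), cache.hashes_computed)
```

Random PoW data that refers to no cached challenge therefore costs the server a dictionary lookup instead of a hash computation.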

Vector 2. The attacker gathers valid PoW solutions, enabling them to pass the initial validation check and move on to the PoW validation check. Because the mCaptcha server must calculate the hash for each submission, this can overload it. To reduce the impact of this vector, we adopted IP-address-based scheduling, in which PoW-validation jobs are scheduled or prioritized according to the IP addresses of the entities that request them. To do so, we capture the IP addresses of the incoming PoW-validation requests. A nested (multi-level) queue is then maintained for the IP addresses, in which each element is itself a separate queue that can be enqueued and dequeued individually. In this way, an IP address that transmits requests frequently is considered only if no requests from other IP addresses are present.
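
One plausible reading of this nested queue is a round-robin over per-IP queues, sketched below. The class and method names are hypothetical, and the round-robin policy is an assumption; the article does not specify the exact dequeue order. Each IP address gets its own queue, and after one of its jobs is served, the address moves to the back of the line.

```python
from collections import OrderedDict, deque


class IpScheduler:
    """Outer queue of IP addresses; each entry holds that address's own job queue."""

    def __init__(self):
        self._queues = OrderedDict()  # ip -> deque of pending validation jobs

    def enqueue(self, ip: str, job: str) -> None:
        self._queues.setdefault(ip, deque()).append(job)

    def dequeue(self):
        """Serve one job from the IP at the front, then send that IP to the back."""
        if not self._queues:
            return None
        ip, jobs = self._queues.popitem(last=False)
        job = jobs.popleft()
        if jobs:
            self._queues[ip] = jobs  # re-inserted at the end of the outer queue
        return ip, job


if __name__ == "__main__":
    scheduler = IpScheduler()
    for i in range(3):
        scheduler.enqueue("192.0.2.1", f"flood-{i}")  # one address sends many jobs
    scheduler.enqueue("198.51.100.7", "single-a")
    scheduler.enqueue("203.0.113.9", "single-b")
    while (item := scheduler.dequeue()) is not None:
        print(item)
```

With this policy, the second and third jobs from the flooding address are served only after the other addresses have had their turn.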

Conclusion

People with particular cognitive, auditory, and visual impairments often lack access to various online resources because current captcha systems do not accommodate them. Existing captcha systems anticipate attacks but are not idempotent and lack precise documentation on how their prediction mechanisms operate. To provide equal access to information, truly idempotent captchas are necessary. To address these challenges, we designed and developed a rate-limiting, privacy-preserving, PoW-based system that makes captchas more secure, thereby thwarting denial-of-service attacks and maintaining a system’s or service’s overall performance and availability. Because computational challenges may cause additional delays and resource needs, it is crucial to take into account the possible influence of these delays on the user experience, which we addressed through our testing process. Current captchas are unreliable, so captcha alternatives deserve further study as replacements. We hope that our research offers a direction for future work on rate-limiting DDoS protection, including investigating sophisticated cryptanalysis techniques on PoW and developing countermeasures to keep up with adversaries.

Source Code

The real-time implementation and source code of mCaptcha are available at https://mcaptcha.org/.

Acknowledgments

The authors would like to thank the editors and reviewers. The authors would also like to thank S. V. Kota Reddy, vice chancellor of VIT-AP University; Jagadish Chandra Mudiganti, registrar of VIT-AP University; and the NLnet Foundation, Netherlands, for their support. Special thanks to Hari Seetha, director of the Centre of Excellence, Artificial Intelligence and Robotics (AIR) at VIT-AP University; Arnold Schrijver from the Social Coding project; and Loïc Dachary from the Forgefriends project, as well as Gusted, Avinash Kumar, Datta Adithya, and Chandra Kiran Reddy.

DOI

10.1145/3660628

October 2024 Issue

Vol. 67 No. 10

Pages: 70-80
