Wednesday, September 18, 2024

How to use Recaptcha V3 with a nodejs AWS lambda

Overview

I have a business site (https://x3audio.com) that features a contact form. 

I'm using AWS Cloudfront to deliver the site from an S3 bucket and wanted to include a contact form. To help me do this, I decided to use a lambda function, exposed as a function URL. 

This function URL is called by a JavaScript integration in the web page when a user completes and submits a very simple contact form. Once the Recaptcha token has been 'scored', the lambda uses SES to send me an email.

In the era of bots, spam engines and the like, I can't just naively expose the URL and 'hope' everything will be alright. Two security measures have been employed: Cloudfront Origin Access Control (OAC) and Recaptcha V3.

OAC was relatively easy to set up, but Recaptcha V3 proved a little problematic. I finally got it operational, and this post shares some of the issues I encountered. If you follow along, you will at least get the basic V3 flow working (the 'advanced' create assessment flow is not covered here).

V3 has distinct client and server aspects. The client side is straightforward enough, but the server side was less so for me.

Get the token to the lambda

I needed to capture and pass the token that Recaptcha V3 creates when a form is submitted, to the backend lambda that I implemented. Two stages really, first the button that is embedded in the contact form:
 <button class="w-100 btn btn-lg btn-primary g-recaptcha"  
     data-sitekey="YOUR SITE KEY"  
     data-callback='onSubmit'  
     data-action='submit'  
     type="submit"  
     id="contactformbutton">  
     Send  
 </button>  
I'm using Bootstrap v5, which accounts for some of the class names in the markup. The "YOUR SITE KEY" value can be found in the Recaptcha section of the Google Cloud console (you have to sign up to the V3 program) -- see below.


The data-callback attribute in the button invokes a very simple piece of JavaScript that sends the form to my lambda with a GET. I did not use POST, as this got quite complicated quite quickly:

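 async function onSubmit(token) {  
  // mx, ma, rem and email are assumed to hold the form values read elsewhere in the page  
  const query = new URLSearchParams({ mx: mx, ma: ma, rem: rem, e: email });  
  // The method property of a Request is read-only, so it is set in the constructor (GET is the default)  
  const cfr = new Request("https://x3audio.com/contact?" + query, {  
   method: "GET",  
   headers: { 'x-v3token': token, 'x-v3token-length': String(token.length) },  
  });  
  try {  
     const response = await fetch(cfr);  
     // handling of the response is omitted here  
  }  
  catch (ex) {  
     console.error('Contact request failed: ' + ex);  
  }  
 }  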
Recaptcha V3 calls the onSubmit function and supplies the 'token' it has derived for the current user's interaction with the page. This is the token we now pass to the AWS lambda for scoring (the lambda asks Google to score it via an https POST).

I'm using the web standard Fetch API, so I pass some data I want as query string parameters, but embed the V3 token in the request as a header (called x-v3token) and also set a header with the length of the token (x-v3token-length). 

This second header is not strictly necessary, but I wanted to check the size of the token at source and when received, as Cloudfront has a fairly obscure set of limits in play.
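To make that length check concrete, here is a minimal sketch of how the lambda side might compare the two headers. The helper name tokenArrivedIntact is hypothetical, and the sketch assumes the header names arrive in lower case, as they are read elsewhere in this post.

```javascript
// Hypothetical helper: compares the token the lambda received with the
// length the browser reported in the x-v3token-length header.
function tokenArrivedIntact(headers) {
  const token = headers["x-v3token"];
  const declared = Number.parseInt(headers["x-v3token-length"], 10);
  if (typeof token !== "string" || Number.isNaN(declared)) {
    return false; // one of the two headers is missing or malformed
  }
  return token.length === declared;
}
```

A false result would suggest the token was truncated or dropped somewhere between the browser and the lambda, which is worth logging before asking Google to score it.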

Use the Fetch API in the lambda to get a token scored

My AWS lambda is written in NodeJS, running in a Node 20.x runtime. So, for the recaptcha side of things, I need to extract the token from the headers of an inbound request, and ask Google to score it, using the Fetch API.

Easy, right??

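No. This caught me out. The Fetch API is available in nodejs 20.x, but the standard code editor in AWS does not recognise it as a known global. The fix is a comment that declares it, which only affects what the editor reports, not the runtime. You have to include this line at the top of your lambda: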
 /*global fetch*/  
Once you do that, you can use the Fetch API easily. What follows is an abbreviated lambda, having just the useful bits documented:

 export const handler = async (event) => {  
  const obj = await assess(event); 
  const response = {
    statusCode: 200
  };
  return response;
 };  
This is the lambda entry point. Obviously you return a response with a status and possibly a body, but here I'm just omitting most of the implementation and showing the call to assess which will do the Recaptcha v3 scoring.
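As a rough illustration of the omitted part, the sketch below turns the object returned by assess into a response with a status and a body. The helper name buildResponse, the 403 status for suspected bots and the JSON body are all my own illustrative choices, not part of the original lambda.

```javascript
// Hypothetical helper: maps the assessment object onto a function URL response.
// The 403 status and the { accepted } body are illustrative assumptions.
function buildResponse(assessment) {
  const blocked = assessment.is_bot;
  return {
    statusCode: blocked ? 403 : 200,
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ accepted: !blocked }),
  };
}
```

In the handler this would replace the fixed statusCode of 200, for example `return buildResponse(obj);`.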

The assess function looks like this:
 
 async function assess(event) {   
  let obj = {   
   recaptcha_score: -1,  
   recaptcha_error_codes: [],  
   is_bot: true,  
   party: event["rawQueryString"],  
   source_ip: event.headers["x-forwarded-for"],   
   rc_v3_token: event.headers["x-v3token"],  
  };  
  try {  
   const rc_result = await checkToken(obj.rc_v3_token, obj.source_ip);  
   obj.recaptcha_score = rc_result.score;  
   obj.recaptcha_error_codes = rc_result.error_codes;  
   obj.is_bot = obj.recaptcha_score < 0.7;  
  }  
  catch (ex) {   
   console.log('Late exception: ' + ex, ex.stack);  
  }  
  return obj;  
 }  
So I set up an object that I will use to record the v3 score, whether it seems to be a bot and some other detail (the raw query string). I extract the v3 token from the headers, where it was set by the Javascript integration on my site (see above).

The event argument to the function is the http integration event received by the lambda.
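For orientation, here is a cut-down, made-up event containing only the fields that assess reads, plus a hypothetical helper (readContactRequest) that unpacks them. The sample values are invented, and the sketch assumes x-forwarded-for may carry a comma-separated chain of addresses.

```javascript
// Illustrative event, trimmed to the fields assess() reads. All values are made up.
const sampleEvent = {
  rawQueryString: "mx=Jane&ma=Hello%20there&rem=1&e=jane%40example.com",
  headers: {
    "x-forwarded-for": "203.0.113.7, 198.51.100.2",
    "x-v3token": "token-from-the-browser",
  },
};

// Hypothetical helper: decodes the query string and picks out the client address.
function readContactRequest(event) {
  const params = new URLSearchParams(event.rawQueryString);
  // Take the first entry of the chain, conventionally the original client.
  const forwarded = event.headers["x-forwarded-for"] || "";
  return {
    fields: Object.fromEntries(params),
    clientIp: forwarded.split(",")[0].trim(),
    token: event.headers["x-v3token"],
  };
}
```

Decoding the query string this way yields the individual form values rather than the raw string that assess stores in party.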

There is a call to checkToken, which is the function (below) that sends the token to Google for scoring and returns the result to the assess function.
 async function checkToken(token, ip) {  
     let score = -1;  
     let error_codes = [];  
     try {  
      const url = 'https://www.google.com/recaptcha/api/siteverify?secret=YOUR-SECRET-KEY&response=' + token;  
      let response = await fetch(url, { method: 'POST' });  
      const json = await response.json();  
      score = json.success ? json.score : -1;  
      error_codes = json.success ? [] : (json["error-codes"] ?? []);  
     }  
     catch (ex) {  
         console.log('Failed to check token: ' + ex, ex.stack);  
         error_codes = [ ex.toString() ];  
     }  
     return { score: score, error_codes: error_codes };  
 }  
The token argument is sent to the Google Recaptcha endpoint (recaptcha/api/siteverify) along with the secret key from your Google Cloud account. The response can then be inspected to see if it succeeded and what Google thought of the user (based on their interaction with the site).

You must replace YOUR-SECRET-KEY with your own unique one.
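One possible variation, sketched here under a few assumptions, sends the secret and token as form fields in the POST body instead of the query string, and passes the caller's address as the optional remoteip field. The environment variable name RECAPTCHA_SECRET and the function name verifyToken are hypothetical, and the fetchImpl parameter exists only so the call can be stubbed.

```javascript
// Sketch of an alternative to checkToken. RECAPTCHA_SECRET is a hypothetical
// environment variable name; fetchImpl is injectable purely for stubbing.
async function verifyToken(token, ip, secret = process.env.RECAPTCHA_SECRET, fetchImpl = fetch) {
  // A URLSearchParams body is sent form-encoded, so the values are escaped safely.
  const body = new URLSearchParams({ secret: secret, response: token });
  if (ip) {
    body.set("remoteip", ip);
  }
  const res = await fetchImpl("https://www.google.com/recaptcha/api/siteverify", {
    method: "POST",
    body: body,
  });
  const json = await res.json();
  return {
    score: json.success ? json.score : -1,
    error_codes: json.success ? [] : (json["error-codes"] ?? []),
  };
}
```

Reading the secret from configuration keeps it out of the source code, and it returns the same { score, error_codes } shape that assess already expects.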

Can't find your secret key? Nor could I, until I pressed Use Legacy key.

 

Example result

Here is an example response from Google, showing a successful scoring request, what the score was (0.9, on a scale of 0.0 to 1.0 where higher means more likely a genuine user) and so on.

 {  
  success: true,  
  challenge_ts: '2024-09-17T20:22:45Z',  
  hostname: 'x3audio.com',  
  score: 0.9,  
  action: 'submit'  
 }  
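The response carries more than the score, so a slightly stricter check is possible. The sketch below, with the hypothetical name looksHuman, also compares the action and hostname against what the page sent. The defaults are taken from this post: 'submit' from the button's data-action, x3audio.com as the site, and 0.7 as the threshold used in assess.

```javascript
// Hypothetical helper: accepts a siteverify response only when every field agrees.
function looksHuman(result, { action = "submit", hostname = "x3audio.com", threshold = 0.7 } = {}) {
  return result.success === true
    && result.action === action
    && result.hostname === hostname
    && result.score >= threshold;
}
```

Checking the action and hostname helps guard against a token that was issued for a different form or a different site being replayed against the lambda.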
