Scraping Google Search Results with Python and AWS Part II - Logging and Alerting
In my previous blog post, I talked about scraping Google Search results with Python in the AWS ecosystem, using AWS Lambda and storing the results in DynamoDB. This was accomplished using SerpApi to scrape the results and get the data in JSON format.
In this blog post, let's dive one level deeper and explore how you can implement logging for the Lambda scraper function we created, and add alerting based on the results you obtain from SerpApi within AWS.
Logging The Response
To log the response from the SerpApi scraper Lambda function we wrote in the previous blog post, we will use AWS CloudWatch.
AWS CloudWatch provides an integrated solution to monitor and log Lambda function executions, offering valuable insights into performance metrics, errors, and overall function health.
To gain visibility into the function's execution, we can use CloudWatch to log all the events related to it in a central place. We can capture logs, set up custom metrics, track errors, and monitor execution duration. These logs will be invaluable in troubleshooting, understanding performance bottlenecks, and optimizing the function's behavior.
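For example, beyond plain log lines, the function can publish custom metrics of its own. Here's a minimal sketch of that idea, not part of the scraper from the previous post: the "SerpScraper" namespace and "SearchDurationSeconds" metric name are hypothetical placeholders, and the function's role would also need the cloudwatch:PutMetricData permission.

import boto3

cloudwatch = boto3.client('cloudwatch')

def record_search_duration(seconds):
    # Publish a custom metric that can be graphed and alarmed on in CloudWatch.
    # "SerpScraper" and "SearchDurationSeconds" are example names chosen for this sketch.
    cloudwatch.put_metric_data(
        Namespace='SerpScraper',
        MetricData=[{
            'MetricName': 'SearchDurationSeconds',
            'Value': seconds,
            'Unit': 'Seconds'
        }]
    )

# e.g. record_search_duration(results["search_metadata"]["total_time_taken"])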
After creating the Lambda function and testing it as we did in the previous blog post, a corresponding log group is created automatically in AWS CloudWatch. It appears like this:
When you click on the log group, you can access the log streams for all your previous Lambda runs. They look like this:
Without any additional configuration and changes, this will tell you if the function ran successfully. If it produced an error, you'll be able to see the error message here as well.
However, to make this more useful, you can add response metadata logging to your Lambda function. This will enable you to debug easily when needed. The search_metadata for a search conducted using our APIs looks something like this:
"search_metadata":
{
"id": "67bf89ae717ded2b923bf512",
"status": "Success",
"json_endpoint": "https://serpapi.com/searches/f8e83f39923b5492/67bf89ae717ded2b923bf512.json",
"created_at": "2025-02-26 21:37:50 UTC",
"processed_at": "2025-02-26 21:37:50 UTC",
"google_url": "https://www.google.com/search?q=Coffee&hl=en&gl=us&sourceid=chrome&ie=UTF-8",
"raw_html_file": "https://serpapi.com/searches/f8e83f39923b5492/67bf89ae717ded2b923bf512.html",
"total_time_taken": 1.32
}
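If you only want to keep a couple of these fields rather than the whole block, you can pull them out of the response dictionary individually. A quick sketch, assuming results already holds the parsed SerpApi response and logger is configured as in the function further below:

search_metadata = results["search_metadata"]

search_id = search_metadata["id"]                  # unique ID of this search
status = search_metadata["status"]                 # "Success", "Error", or "Processing"
time_taken = search_metadata["total_time_taken"]   # seconds SerpApi spent on the search

logger.info(f"Search {search_id} finished with status {status} in {time_taken}s")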
To include this in the logs for your searches, you'd need to add some basic logging steps to the existing Lambda function we wrote in the previous blog post.
Let's use the logging library to accomplish this. You can set the log level as you wish. It supports 6 log levels: NOTSET, DEBUG, INFO, WARNING, ERROR, and CRITICAL. Messages below the configured level are filtered out, as the short example below shows.
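This is standard Python logging behavior, independent of Lambda. A quick illustration:

import logging

logger = logging.getLogger()
logger.setLevel(logging.WARNING)

logger.debug("Not emitted: DEBUG is below WARNING")
logger.info("Not emitted: INFO is below WARNING")
logger.warning("Emitted: WARNING meets the configured level")
logger.error("Emitted: ERROR is above the configured level")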
Using this library to log the search metadata looks like this:
import json, os
from serpapi import GoogleSearch
import boto3
from datetime import datetime
import logging

# Get the value of the LAMBDA_LOG_LEVEL environment variable
log_level = os.environ.get('LAMBDA_LOG_LEVEL', 'INFO')

# Configure the logger
logger = logging.getLogger()
logger.setLevel(log_level)

def lambda_handler(event, context):
    dynamodb = boto3.resource('dynamodb')
    table = dynamodb.Table('SearchResults')

    params = {
        "q": "coffee",
        "api_key": os.environ.get('SERPAPI_API_KEY')
    }

    search = GoogleSearch(params)
    results = search.get_dict()

    # Log the search metadata so every run is traceable in CloudWatch
    logger.info(f'Search Metadata: {results["search_metadata"]}')

    organic_results = results["organic_results"]
    links = []
    for result in organic_results:
        links.append(result["link"])

    # Store the scraped links in DynamoDB, keyed by the SerpApi search ID
    table.put_item(
        Item={
            'search_id': results["search_metadata"]["id"],
            'timestamp': datetime.now().isoformat(timespec='seconds'),
            'links': json.dumps(links)
        }
    )

    return {
        'statusCode': 200,
        'body': links
    }
Add a LAMBDA_LOG_LEVEL environment variable to the Lambda function to configure the log level. In this case, I've set the default to "INFO". This will ensure I see all the informational logs that I send from this function. You can add the variable from the Lambda console, or programmatically, as in the sketch below.
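A minimal boto3 sketch for setting the variable from code. The function name here is a hypothetical placeholder, and note that this call replaces the function's entire environment variable map, so include every variable you need:

import boto3

lambda_client = boto3.client('lambda')

lambda_client.update_function_configuration(
    FunctionName='serp-scraper',  # hypothetical name; use your own function's name
    Environment={
        'Variables': {
            'LAMBDA_LOG_LEVEL': 'INFO',
            'SERPAPI_API_KEY': '<your SerpApi key>'  # re-include existing variables, as this replaces them all
        }
    }
)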
If we test the Lambda function now, we should see the search metadata logged in CloudWatch as well.
Here are the relevant logs in CloudWatch after I run the function above:
This will track the search status as well. I've explained more about what to expect for search statuses and error codes below.
You can add any other relevant fields to your logs as well if you want to keep a record of each search run. These logs are easily searchable, and you can even use features like CloudWatch Logs Anomaly Detection and Live Trail to catch particular errors when they happen.
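For instance, you can query the log group with CloudWatch Logs Insights from Python. A rough sketch, where the log group follows the standard /aws/lambda/<function name> convention and the function name is a hypothetical placeholder:

import time
import boto3

logs = boto3.client('logs')

# Find log lines containing the search metadata from the last hour
query = logs.start_query(
    logGroupName='/aws/lambda/serp-scraper',  # hypothetical function name
    startTime=int(time.time()) - 3600,
    endTime=int(time.time()),
    queryString='fields @timestamp, @message | filter @message like /Search Metadata/ | sort @timestamp desc | limit 20'
)

# Logs Insights queries run asynchronously, so poll until the query completes
results = logs.get_query_results(queryId=query['queryId'])
while results['status'] in ('Scheduled', 'Running'):
    time.sleep(1)
    results = logs.get_query_results(queryId=query['queryId'])

for row in results['results']:
    print(row)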
Search Status and Error Codes
All of our Search APIs use the same error response structure. This applies to all of our APIs except the Extra APIs (Location API, Account API, Search Archive API, etc.).
A search status is accessible through the search_metadata.status key. A search status begins as Processing, then resolves to either Success or Error. If a search has failed or contains empty results, the top-level error key will contain an error message.
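Based on this structure, a minimal defensive check could be dropped into the Lambda handler right after results = search.get_dict(). This is an illustrative sketch, not part of the original function:

status = results["search_metadata"]["status"]

if status == "Error" or "error" in results:
    # The top-level "error" key carries the human-readable message
    logger.error(f'Search failed: {results.get("error", "unknown error")}')
    return {
        'statusCode': 500,
        'body': 'Search failed'
    }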
You can find more details about this on our Search API Status and Errors page.
If you're using our APIs via an HTTP GET request, you may encounter some numbered error codes. SerpApi uses conventional HTTP response codes to indicate the success or failure of an API request. In general, a 200 code indicates success. Codes in the 4xx range usually indicate a request that failed given the information provided (e.g., a required parameter was omitted, you ran out of searches, etc.). Codes in the 5xx range usually indicate an error with SerpApi's servers.
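For illustration, here's a rough sketch of the same search made as a plain HTTP GET with the requests library, branching on the status code ranges described above:

import os
import requests

response = requests.get(
    "https://serpapi.com/search",
    params={
        "engine": "google",
        "q": "coffee",
        "api_key": os.environ.get("SERPAPI_API_KEY"),
    },
)

if response.status_code == 200:
    results = response.json()                         # success: parse the JSON body
elif 400 <= response.status_code < 500:
    print("Client-side problem:", response.text)      # e.g. bad parameter, out of searches
else:
    print("SerpApi server error:", response.status_code)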
You can find more details about this on our Status and Error Codes page.
Alerting on Results
If you're looking to alert based on results from our API, you can use AWS tools like Simple Notification Service (SNS). I'll be using SNS to send a notification to my email.
Here I'm going to demonstrate a simple example of setting up an alert to check whether a particular website appears in the top 10 organic search results from our Google Search API.
For this, I will modify the Lambda function to accept a domain from the user and look for that domain in the top 10 organic results we obtain. Here is what this looks like:
import json, os
from serpapi import GoogleSearch
import boto3
from datetime import datetime
import logging

# Get the value of the LAMBDA_LOG_LEVEL environment variable
log_level = os.environ.get('LAMBDA_LOG_LEVEL', 'INFO')

# Configure the logger
logger = logging.getLogger()
logger.setLevel(log_level)

sns_client = boto3.client('sns')

# A helper function to send the alert to SNS
def send_alert(sns_topic_arn, message):
    response = sns_client.publish(
        TopicArn=sns_topic_arn,
        Message=message,
        Subject="SERP Alert: Domain Not in Top 10"
    )
    return response

def lambda_handler(event, context):
    dynamodb = boto3.resource('dynamodb')
    table = dynamodb.Table('SearchResults')

    domain_to_search_for = event['domain_to_search_for']  # Accepted from the user

    params = {
        "q": event['q'],  # Accepted from the user
        "api_key": os.environ.get('SERPAPI_API_KEY')
    }

    search = GoogleSearch(params)
    results = search.get_dict()

    # Log the search metadata so every run is traceable in CloudWatch
    logger.info(f'Search Metadata: {results["search_metadata"]}')

    organic_results = results["organic_results"]
    links = []
    for result in organic_results:
        links.append(result["link"])

    # Check whether the user's domain appears in the top organic links
    domain_found_in_top_10 = False
    for link in links:
        if domain_to_search_for in link:
            domain_found_in_top_10 = True

    # Store the scraped links in DynamoDB, keyed by the SerpApi search ID
    table.put_item(
        Item={
            'search_id': results["search_metadata"]["id"],
            'timestamp': datetime.now().isoformat(timespec='seconds'),
            'links': json.dumps(links)
        }
    )

    # Notify via SNS if the domain didn't make the top 10
    if not domain_found_in_top_10:
        sns_topic_arn = 'arn:aws:sns:us-east-2:<Account ID>:<SNS Topic>'
        message = f"Alert: The domain {domain_to_search_for} is not in the top 10 results for the searched keyword."
        send_alert(sns_topic_arn, message)

    return {
        'statusCode': 200,
        'body': links,
        'domain_found': domain_found_in_top_10
    }
Before this can work, we need to create an SNS topic in the AWS account and give the Lambda function permission to access it.
Let's create the SNS topic:
Following that, we can add a "Subscription" for this SNS topic and select a preferred method of notification, such as email:
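Both steps can also be scripted with boto3. A quick sketch, where the topic name and email address are placeholders; the email subscription still has to be confirmed from your inbox:

import boto3

sns = boto3.client('sns')

# Create the topic (idempotent: returns the existing ARN if it already exists)
topic = sns.create_topic(Name='serp-alerts')  # hypothetical topic name

# Subscribe an email address; AWS sends a confirmation email that must be accepted
sns.subscribe(
    TopicArn=topic['TopicArn'],
    Protocol='email',
    Endpoint='you@example.com'
)

print(topic['TopicArn'])  # use this ARN in the Lambda function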
Let's now give the Lambda function permission to access this topic. You can do this by adding the AmazonSNSFullAccess policy to your Lambda's execution role on the AWS IAM page. This allows your Lambda to publish to the SNS topic and actually send the notification.
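If you'd rather attach the policy from code than from the IAM console, a minimal sketch looks like this, where the role name is a hypothetical placeholder for your Lambda's execution role:

import boto3

iam = boto3.client('iam')

iam.attach_role_policy(
    RoleName='serp-scraper-role',  # hypothetical: use your Lambda's execution role name
    PolicyArn='arn:aws:iam::aws:policy/AmazonSNSFullAccess'
)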
Following this, replace the <SNS Topic> placeholder in the Lambda code with the name of the topic you created and <Account ID> with your AWS account ID, and we're ready to test.
For the test, we can add the two fields that we are accepting from the user: the query q and the domain domain_to_search_for. Then click Test.
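If you prefer to trigger the test from code instead of the console, here's a rough equivalent using boto3, with the function name as a hypothetical placeholder:

import json
import boto3

lambda_client = boto3.client('lambda')

response = lambda_client.invoke(
    FunctionName='serp-scraper',  # hypothetical name; use your own function's name
    Payload=json.dumps({
        "q": "coffee",
        "domain_to_search_for": "example.com"
    })
)

# The response payload is a stream; read and parse it to see the links and domain_found flag
print(json.loads(response['Payload'].read()))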
Here I deliberately chose a domain which wouldn't be in the top 10 results for the query so I could test the email notification:
This adds the list of links to the DynamoDB table and also sends you an email notification like the one below if your domain is not in the top 10:
DynamoDB table entry created:
Notification sent via email:
Conclusion
You've successfully set up your Lambda function to scrape data from Google Search results and store the results in a DynamoDB table. You've also added relevant search logging, and sent an email notification at the click of a button!
I hope this blog post was helpful in understanding how to use AWS's powerful features in combination with our exceptional search APIs. If you have any questions, don't hesitate to reach out to me at sonika@serpapi.com.
Relevant Posts
You may be interested in reading more about our Google Search API, and other integrations we have.