Google search results are valuable to many different people. This data has a wide range of applications: SEO, data analysis, content creation, staying up to date with the latest news and trends, competitive research, online reputation management, and even making voice-activated devices smarter.

In this post, we will talk about scraping Google search results asynchronously using Node.js. Whether you're trying to improve your website's search ranking, train AI models, or analyze data trends, this guide is here to help.

Using SerpApi's Google Search API

Say goodbye to proxy management, captchas, and user agents, and say hello to a simplified and robust solution for your web scraping needs.

Benefits

Building and maintaining your own web scraper can be a complex and time-consuming endeavor. You must consider various factors such as handling proxies, solving captchas, managing user agents, and implementing delays to prevent getting blocked by Google. With SerpApi, all of these complexities are abstracted away. You can avoid the headache of developing and maintaining a scraper and instead concentrate on your core application logic.

SerpApi provides a ready-to-use Node.js library tailored for effortless integration into your applications. The library requires minimal setup and configuration, so you can get up and running quickly and save precious development time.

When you use SerpApi, you don't just receive raw HTML pages that you need to parse and structure. Instead, it directly delivers structured JSON data, making it easy to access and utilize the search results in your application. This streamlined approach enables you to focus on using the data rather than wasting time retrieving and formatting it.

Setup

If you haven't already signed up for a free SerpApi account, go ahead and do that first. Once you complete the process, you can retrieve your API key from your account's Dashboard at https://serpapi.com/manage-api-key.

You're now ready to install our Node.js package and start using it:

npm install serpapi

With the package installed, a basic search looks like this:

const { getJson } = require("serpapi");

getJson({
  engine: "google",
  api_key: API_KEY, // Get your API_KEY from https://serpapi.com/manage-api-key
  q: "coffee",
  location: "Austin, Texas",
}, (json) => {
  console.log(json["organic_results"]);
});

What we'll be scraping

We'll be scraping the organic results returned from Google for 200 topics. We'll collect the position, title, source, and actual link for each result. Please check our official Google Search API documentation for a detailed list of the available data returned in its response.
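Each record we collect is just a projection of four fields from an organic result. As a quick standalone sketch (the result object below is a hypothetical example, not real API output):

```javascript
// Keep only the fields we care about from an organic result.
function extractResult(result) {
    return {
        position: result.position,
        title: result.title,
        source: result.source,
        link: result.link
    };
}

// Hypothetical organic result; real responses contain many more fields.
const sample = extractResult({
    position: 1,
    title: "Artificial intelligence - Wikipedia",
    source: "Wikipedia",
    link: "https://en.wikipedia.org/wiki/Artificial_intelligence",
    snippet: "Extra fields in the response are simply ignored"
});

console.log(sample);
```

Any additional fields present in the API response are simply dropped by the projection.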

async parameter

We provide an async parameter that can be used with all of our APIs, including Google Search API. This parameter allows you to send a request to our API and not wait for a response to be returned. Your requests will be processed in parallel on our backend, and you can retrieve the data later.

This approach allows you to first send all of your requests, and then retrieve the data using the getJsonBySearchId() method. This method is a wrapper around our Search Archive API.
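To make the "submit everything, then poll" pattern concrete without hitting the real API, here is a minimal self-contained sketch. The functions submitSearch() and fetchById() are simulated stand-ins for getJson({ async: true, ... }) and getJsonBySearchId(); only the queueing logic mirrors what we do later.

```javascript
// Simulated backend: each submitted search needs one extra poll to finish.
const backend = new Map();

// Stand-in for getJson({ async: true }): returns immediately with an ID.
function submitSearch(query) {
    const id = `search-${backend.size + 1}`;
    backend.set(id, { query, pollsLeft: 1 });
    return { search_metadata: { id } };
}

// Stand-in for getJsonBySearchId(): "Processing" until the work is done.
function fetchById(id) {
    const entry = backend.get(id);
    if (entry.pollsLeft > 0) {
        entry.pollsLeft -= 1;
        return { search_metadata: { id, status: "Processing" } };
    }
    return {
        search_metadata: { id, status: "Success" },
        search_parameters: { q: entry.query }
    };
}

// 1. Submit every request up front.
const queue = ["coffee", "tea"].map(q => submitSearch(q));

// 2. Poll the queue, requeueing anything still processing.
const finished = [];
while (queue.length > 0) {
    const item = queue.shift();
    const result = fetchById(item.search_metadata.id);
    if (result.search_metadata.status === "Processing") {
        queue.push(item); // try again on a later pass
    } else {
        finished.push(result.search_parameters.q);
    }
}

console.log(finished);
```

The real code below follows the same shape, with the simulated functions replaced by actual SerpApi calls.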

Code

Alright, let's write some code!

Below is the complete code snippet for scraping Google Search results asynchronously. If you don't need any additional explanations about how it works, feel free to grab it and modify it based on your needs.

// Import necessary methods from the serpapi library
const { getJson, getJsonBySearchId } = require("serpapi");

// Your SerpApi API key obtained from https://serpapi.com/manage-api-key
const API_KEY = "YOUR_ACTUAL_API_KEY"

// Summarized list of search topics for readability
const topics = [
    "Artificial intelligence", "Climate change", "Space exploration", "Healthy recipes", "Virtual reality", "Cryptocurrency", "Photography tips", "Indoor plants care", "Mindfulness meditation", "Travel destinations", "DIY home decor", "Financial planning", "Time management techniques", "Historical events", "Fitness workouts", "Mobile photography"
]

// Initialize an array to store the search requests
const searchQueue = [];

// Function to execute requests to the Google Search API asynchronously
async function getResults(topic) {
    try {
        // Make a request to the Google Search API and push the response to the searchQueue
        const response = await getJson({
            engine: "google",
            q: topic,
            location: "Denver, Colorado, United States",
            google_domain: "google.com",
            hl: "en",
            gl: "us",
            async: true,
            api_key: API_KEY
        });
        searchQueue.push(response)
    } catch (error) {
        console.error('Error fetching data:', error);
        throw error;
    }
}

// Initialize an object to store the final results
const data = {}

// An array to store promises returned by getResults() for each topic
const searchPromises = [];

// Iterate through each topic and initiate the getResults() function for each
topics.forEach(topic => {
    // Create an empty array in the data object for each topic
    data[topic] = []
    
    // Get the promise returned by getResults() and push it to the searchPromises array
    const promise = getResults(topic);
    searchPromises.push(promise);
});

// Function to process organic results from a search query
async function processOrganicResults(searchQuery, searchData) {
    searchData.organic_results.forEach(result => {
        data[searchQuery].push({
            position: result.position,
            title: result.title,
            source: result.source,
            link: result.link
        });
    });
}

// Function to process the searchQueue and retrieve detailed data for each search
async function processSearchQueue() {
    // Wait for all promises in the searchPromises array to be resolved
    await Promise.all(searchPromises);

    // Process each search item in the searchQueue
    while (searchQueue.length > 0) {
        const searchItem = searchQueue.shift();

        try {
            // Retrieve detailed data for a search using its ID
            const searchItemData = await getJsonBySearchId(searchItem.search_metadata.id, { api_key: API_KEY });
            const searchId = searchItemData.search_metadata.id;
            const searchStatus = searchItemData.search_metadata.status;
            const searchQuery = searchItemData.search_parameters.q;

            // Handle different search statuses
            if (searchStatus === "Error") {
                console.log("#ERROR", searchItemData);
            } else if (searchStatus === "Processing") {
                // Requeue the search if it's still processing
                searchQueue.push(searchItemData);
                console.log(`Requeued Search with ID: ${searchId}`);
            } else {
                // Process the organic results for a successful search
                processOrganicResults(searchQuery, searchItemData);
            }
        } catch (error) {
            console.error('Error fetching data:', error);
            throw error;
        }
    }

    // Log the final data in a readable JSON format
    console.log(JSON.stringify(data, null, 2));
}

// Execute the processSearchQueue() function to initiate the processing of the queue
processSearchQueue()

Code breakdown

First, we need to import the getJson and getJsonBySearchId methods from the serpapi library. We also define our API key and the list of topics we'll search for.

// Import necessary methods from the serpapi library
const { getJson, getJsonBySearchId } = require("serpapi");

// Your SerpApi API key obtained from https://serpapi.com/manage-api-key
const API_KEY = "YOUR_ACTUAL_API_KEY"

// Summarized list of search topics for readability
const topics = [
    "Artificial intelligence", "Climate change", "Space exploration", "Healthy recipes", "Virtual reality", "Cryptocurrency", "Photography tips", "Indoor plants care", "Mindfulness meditation", "Travel destinations", "DIY home decor", "Financial planning", "Time management techniques", "Historical events", "Fitness workouts", "Mobile photography"
]

We then create a getResults() function that will execute the requests to the Google Search API using the async parameter. It accepts a topic as a parameter, which will be used as the search query.

We also initialize a searchQueue array, in which we'll push all of our queued requests. Later on, we'll use the information from this array to retrieve the actual data returned from our requests.

// Initialize an array to store the search requests
const searchQueue = [];

// Function to execute requests to the Google Search API asynchronously
async function getResults(topic) {
    try {
        // Make a request to the Google Search API and push the response to the searchQueue
        const response = await getJson({
            engine: "google",
            q: topic,
            location: "Denver, Colorado, United States",
            google_domain: "google.com",
            hl: "en",
            gl: "us",
            async: true,
            api_key: API_KEY
        });
        searchQueue.push(response)
    } catch (error) {
        console.error('Error fetching data:', error);
        throw error;
    }
}

We continue with defining a data object, which will hold the actual data we get from the results.

It's important to note that the getResults() function returns a promise. We create a searchPromises array that will hold all of the promises returned from getResults(). We need to resolve all of those promises before we start processing the queued requests in the searchQueue.

We also define a processOrganicResults() function, which will be used for processing the organic results from the response once the requests in the queue are processed.

// Initialize an object to store the final results
const data = {}

// An array to store promises returned by getResults() for each topic
const searchPromises = [];

// Iterate through each topic and initiate the getResults() function for each
topics.forEach(topic => {
    // Create an empty array in the data object for each topic
    data[topic] = []
    
    // Get the promise returned by getResults() and push it to the searchPromises array
    const promise = getResults(topic);
    searchPromises.push(promise);
});

// Function to process organic results from a search query
async function processOrganicResults(searchQuery, searchData) {
    searchData.organic_results.forEach(result => {
        data[searchQuery].push({
            position: result.position,
            title: result.title,
            source: result.source,
            link: result.link
        });
    });
}

Now, we're ready to start processing the queued requests. The processSearchQueue() function will do this job for us.

We start by waiting for all of the promises in the searchPromises array to resolve. Then, we process each item in the searchQueue by shifting it off the array and retrieving its data with getJsonBySearchId().

If the request has been processed successfully (its status is "Success" rather than "Processing" or "Error"), we call processOrganicResults() on it to extract the data we need. If a search is still "Processing", we push it back onto the queue to check again on a later pass.

We then log the serialized JSON data.

// Function to process the searchQueue and retrieve detailed data for each search
async function processSearchQueue() {
    // Wait for all promises in the searchPromises array to be resolved
    await Promise.all(searchPromises);

    // Process each search item in the searchQueue
    while (searchQueue.length > 0) {
        const searchItem = searchQueue.shift();

        try {
            // Retrieve detailed data for a search using its ID
            const searchItemData = await getJsonBySearchId(searchItem.search_metadata.id, { api_key: API_KEY });
            const searchId = searchItemData.search_metadata.id;
            const searchStatus = searchItemData.search_metadata.status;
            const searchQuery = searchItemData.search_parameters.q;

            // Handle different search statuses
            if (searchStatus === "Error") {
                console.log("#ERROR", searchItemData);
            } else if (searchStatus === "Processing") {
                // Requeue the search if it's still processing
                searchQueue.push(searchItemData);
                console.log(`Requeued Search with ID: ${searchId}`);
            } else {
                // Process the organic results for a successful search
                processOrganicResults(searchQuery, searchItemData);
            }
        } catch (error) {
            console.error('Error fetching data:', error);
            throw error;
        }
    }

    // Log the final data in a readable JSON format
    console.log(JSON.stringify(data, null, 2));
}

Finally, we execute processSearchQueue() to get everything going.

// Execute the processSearchQueue() function to initiate the processing of the queue
processSearchQueue()

Conclusion

We have seen how to scrape organic results from the Google Search Engine using the Google Search API. You can modify this template in any way you need to based on your use case.

I hope this tutorial was helpful and easy to follow. If you have any questions, feel free to contact me at martin@serpapi.com.