Introduction to Web Scraping Google Maps

Data can be a game-changer for businesses and individuals alike, and web scraping is a powerful way to collect it from online platforms. In this section, we look at what web scraping is and how it can be used to extract business information from Google Maps.

The Power of Data Extraction

Web scraping, in essence, is the automated process of extracting data from websites. It involves utilizing tools and scripts to navigate web pages, locate specific information, and then extract and organize that data for analysis or other purposes. Think of it as a digital collector, systematically gathering insights from the vast landscape of the internet.

The Role of Google Maps in Business Information

Google Maps has evolved beyond being a simple navigation tool; it has become a comprehensive repository of business information. From local bakeries to retail stores and service providers, Google Maps hosts a wealth of data about businesses worldwide. Web scraping enables us to tap into this goldmine, extracting details such as business names, addresses, contact information, reviews, and more.

Why this data is important and how it can be used

For local businesses, good data is essential for informed decision-making and strategic planning. One key use case is understanding the local business landscape: analyzing how businesses are distributed, which industries are popular, and what trends are emerging in specific areas. For instance, by aggregating data on the types and concentrations of businesses in a given area, stakeholders can spot emerging industries and position themselves strategically in the local market.

Competitor analysis

Another significant use case revolves around competitor analysis, providing businesses with a tool to understand the competitive landscape. This application involves identifying competing businesses, pinpointing their locations, and assessing their strengths and weaknesses. Through comprehensive competitor analysis, businesses can strategically position themselves by leveraging insights into competitor dynamics.

Market research

Market research, as a use case, plays a vital role in conducting in-depth analyses for specific industries. The application involves scrutinizing market saturation, identifying gaps, and exploring opportunities for new businesses. By conducting thorough market research, entrepreneurs can identify underserved industries, assess the demand for particular services, and strategically enter markets with unexplored opportunities.

Real estate analysis

Real estate analysis is crucial for assessing the value of properties and neighborhoods. This use case involves evaluating the proximity of businesses, amenities, and services to determine property values. Property developers and investors can make informed decisions by considering these factors when evaluating the value and potential growth of real estate in a specific neighborhood.

Using Node.js with Google Maps API

We'll iterate over a grid of coordinates. For each point in the grid, we'll scrape the business listings in the surrounding area.

const { getJson } = require("serpapi");

const API_KEY = "YOUR_API_KEY"

function generateGrid(centerLat, centerLon, gridSize, stepSize) {
    const grid = [];

    for (let lat = centerLat - gridSize; lat <= centerLat + gridSize; lat += stepSize) {
        for (let lon = centerLon - gridSize; lon <= centerLon + gridSize; lon += stepSize) {
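            // Round to 4 decimal places and store the rounded values
            const fixedLat = Number(lat.toFixed(4));
            const fixedLon = Number(lon.toFixed(4));
            grid.push({ lat: fixedLat, lon: fixedLon });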
        }
    }

    return grid;
}

// Area coordinates (Austin, Texas in this example)
const areaLat = 30.27242504697381;
const areaLon = -97.7431956403583;

const earthRadius = 6371; // Earth radius in kilometers

// Generate grid for the Area with a stepping point of 1 km
const stepSize = 1 / earthRadius * (180 / Math.PI);

const areaGrid = generateGrid(areaLat, areaLon, 0.08, stepSize);

const businessListings = [];

let remainingPoints = areaGrid.length;

function delay(ms) {
    return new Promise(resolve => {
        setTimeout(resolve, ms);
    });
}

async function searchArea(query, coordinates, start, pointNum, timeout = 0) {
    try {
        await delay(timeout);

        const response = await getJson({
            engine: "google_maps",
            q: query,
            ll: coordinates,
            start: start,
            type: "search",
            api_key: API_KEY
        });

        if (response.local_results) {
            response.local_results.forEach(result => {
                businessListings.push({
                    "placeId": result.place_id,
                    "name": result.title,
                    "type": result.type,
                    "types": result.types,
                    "address": result.address,
                    "phone": result.phone,
                    "website": result.website,
                });
            });
            // Request the next page of results for the same grid point
            const searchPromise = searchArea(query, coordinates, start + 20, pointNum);
            searchPromises.push(searchPromise);
        }
        else {
            remainingPoints--;
        }
    } catch (error) {
        console.log(error.message);
        remainingPoints--;
    }

}

const searchPromises = []

areaGrid.forEach((point, index) => {
    const searchPromise = searchArea("restaurant", `@${point.lat},${point.lon},17z`, 0, index, index * 12000);
    searchPromises.push(searchPromise);
});

async function processData() {
    while (remainingPoints > 0) {
        await delay(20000);
    }

    await Promise.all(searchPromises);

    // Keep only the first occurrence of each placeId
    const filteredListings = businessListings.filter((value, index, self) =>
        index === self.findIndex((t) => t.placeId === value.placeId)
    );
    
    return filteredListings;
}

processData();

Getting Started

First, we need to install the serpapi library, which we'll use to send requests to the Google Maps API. To install it, use the following command in your terminal:

npm install serpapi

The next step is to import the serpapi package and replace the API_KEY placeholder with your actual API key:

const { getJson } = require("serpapi");
const API_KEY = "YOUR_API_KEY"

To obtain your API key, you can sign up for a free SerpApi account here. Once you complete the process, you can retrieve it from your account's Dashboard.

Code breakdown

We create a function that will generate our grid points. The function takes a central latitude and longitude point and uses gridSize and stepSize parameters to create the grid.

function generateGrid(centerLat, centerLon, gridSize, stepSize) {
    const grid = [];

    for (let lat = centerLat - gridSize; lat <= centerLat + gridSize; lat += stepSize) {
        for (let lon = centerLon - gridSize; lon <= centerLon + gridSize; lon += stepSize) {
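            // Round to 4 decimal places and store the rounded values
            const fixedLat = Number(lat.toFixed(4));
            const fixedLon = Number(lon.toFixed(4));
            grid.push({ lat: fixedLat, lon: fixedLon });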
        }
    }

    return grid;
}
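Because the loops above repeatedly add stepSize to a floating-point value, rounding drift can occasionally drop the last row or column of the grid. A minimal sketch of one way around this derives each coordinate from an integer step index instead. The name generateGridByIndex is hypothetical and not part of the original script, and unlike generateGrid it always places a point exactly on the center.

```javascript
// Hypothetical variant of generateGrid: derive every coordinate from an
// integer index so floating-point error does not accumulate across steps.
function generateGridByIndex(centerLat, centerLon, gridSize, stepSize) {
    // Number of whole steps that fit on each side of the center point.
    // The small epsilon guards against the ratio landing just under an integer.
    const steps = Math.floor(gridSize / stepSize + 1e-9);
    const grid = [];

    for (let i = -steps; i <= steps; i++) {
        for (let j = -steps; j <= steps; j++) {
            grid.push({
                // Round to 4 decimal places (about 11 m of latitude)
                lat: Number((centerLat + i * stepSize).toFixed(4)),
                lon: Number((centerLon + j * stepSize).toFixed(4)),
            });
        }
    }

    return grid;
}

// Example: a half-width of 0.02 degrees with 0.01-degree steps gives a 5 x 5 grid
const sampleGrid = generateGridByIndex(30.2724, -97.7432, 0.02, 0.01);
console.log(sampleGrid.length); // 25
```
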

We use the coordinates of Austin, Texas, in this example. The earthRadius constant is used to convert a 1-kilometer step into degrees for stepSize. We then generate the grid with those parameters and a gridSize of 0.08 degrees, so the grid extends approximately 8.9 kilometers north and south of the center. By changing the stepSize and gridSize parameters, you can adjust the grid's density and size.

We define a businessListings array that will hold the information we extract. The remainingPoints variable is used later in the code to track when all the points in the grid have been processed.

// Area coordinates (Austin, Texas in this example)
const areaLat = 30.27242504697381;
const areaLon = -97.7431956403583;

// Earth radius in kilometers
const earthRadius = 6371; 

// Generate grid for the Area with a stepping point of 1 km
const stepSize = 1 / earthRadius * (180 / Math.PI);

const areaGrid = generateGrid(areaLat, areaLon, 0.08, stepSize);

const businessListings = [];

let remainingPoints = areaGrid.length;
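The stepSize formula converts a distance along the Earth's surface into degrees of arc: 1 km divided by the radius gives radians, and multiplying by 180 / π gives degrees. That conversion holds for latitude; a degree of longitude shrinks by the cosine of the latitude, so at Austin's latitude the same step in degrees covers roughly 0.86 km east to west. A minimal sketch of helpers that make this explicit follows. The names kmToLatDegrees and kmToLonDegrees are hypothetical, and a spherical Earth is assumed.

```javascript
const EARTH_RADIUS_KM = 6371; // mean Earth radius, spherical approximation

// Degrees of latitude spanned by a north-south distance in kilometers.
function kmToLatDegrees(km) {
    return (km / EARTH_RADIUS_KM) * (180 / Math.PI);
}

// Degrees of longitude spanned by an east-west distance at a given latitude.
// Circles of latitude shrink by cos(latitude), so more degrees are needed per km.
function kmToLonDegrees(km, latitudeDeg) {
    const latitudeRad = (latitudeDeg * Math.PI) / 180;
    return kmToLatDegrees(km) / Math.cos(latitudeRad);
}

const latStep = kmToLatDegrees(1);
const lonStep = kmToLonDegrees(1, 30.27242504697381);
console.log(latStep.toFixed(5)); // 0.00899
console.log(lonStep.toFixed(5)); // 0.01041
```

If evenly spaced points on the ground matter for your analysis, one option is to pass separate latitude and longitude steps to the grid generator rather than a single stepSize.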

The delay function is used to pause between requests.

function delay(ms) {
    return new Promise(resolve => {
        setTimeout(resolve, ms);
    });
}

The searchArea function retrieves the results from the Google Maps API. By default, Google Maps returns 20 results per page. We need to paginate through the results to retrieve all of them. This is done by recursively calling searchArea with an incremented start parameter. The recursion stops when no more local_results are returned in the response.

async function searchArea(query, coordinates, start, pointNum, timeout = 0) {
    try {
        await delay(timeout);

        const response = await getJson({
            engine: "google_maps",
            q: query,
            ll: coordinates,
            start: start,
            type: "search",
            api_key: API_KEY
        });

        if (response.local_results) {
            response.local_results.forEach(result => {
                businessListings.push({
                    "placeId": result.place_id,
                    "name": result.title,
                    "type": result.type,
                    "types": result.types,
                    "address": result.address,
                    "phone": result.phone,
                    "website": result.website,
                });
            });
            // Request the next page of results for the same grid point
            const searchPromise = searchArea(query, coordinates, start + 20, pointNum);
            searchPromises.push(searchPromise);
        }
        else {
            remainingPoints--;
        }
    } catch (error) {
        console.log(error.message);
        remainingPoints--;
    }

}
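The same pagination can also be written as a loop, which keeps the pages for one grid point in order and avoids tracking extra promises. The sketch below is not the article's implementation: it assumes a hypothetical fetchPage(start) callback that resolves to an object shaped like the response above, and it stubs that callback with made-up data so the example runs without an API key.

```javascript
const PAGE_SIZE = 20; // results per page, as described above

// Collect every page for one grid point. fetchPage is a hypothetical
// callback that takes a start offset and resolves to a response object.
async function collectAllPages(fetchPage) {
    const results = [];

    for (let start = 0; ; start += PAGE_SIZE) {
        const response = await fetchPage(start);

        // Stop when the response no longer contains local_results
        if (!response.local_results || response.local_results.length === 0) {
            break;
        }

        results.push(...response.local_results);
    }

    return results;
}

// Stub standing in for the real request: 45 fake listings served 20 at a time
const fakeListings = Array.from({ length: 45 }, (_, i) => ({ place_id: `place-${i}` }));

async function fakeFetchPage(start) {
    const page = fakeListings.slice(start, start + PAGE_SIZE);
    return page.length > 0 ? { local_results: page } : {};
}

const allPagesPromise = collectAllPages(fakeFetchPage);
allPagesPromise.then(results => console.log(results.length)); // 45
```

In the real script, fetchPage could wrap the getJson call shown above, passing the same query, coordinates, and API key along with the start offset.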

We use searchPromises to store the promises returned from searchArea. Then, we iterate over the grid, executing searchArea for each point. Since the points in the grid are spaced 1 kilometer apart, we pass a zoom value of 17z in the coordinates parameter, which covers a radius of ~700 meters around each point. Each call also receives a delay of index * 12000 milliseconds, so the requests for successive grid points start 12 seconds apart. Please note that if you change the stepSize, you also need to adjust the zoom value.

const searchPromises = []

areaGrid.forEach((point, index) => {
    const searchPromise = searchArea("restaurant", `@${point.lat},${point.lon},17z`, 0, index, index * 12000);
    searchPromises.push(searchPromise);
});

Finally, we need to process the retrieved results. The processData function first waits until no grid points remain to be processed, checking every 20 seconds. It then waits for all the promises returned from searchArea to resolve. Lastly, it filters out duplicate listings, which occur because the search areas of neighboring grid points overlap.

async function processData() {
    while (remainingPoints > 0) {
        await delay(20000);
    }

    await Promise.all(searchPromises);

    // Keep only the first occurrence of each placeId
    const filteredListings = businessListings.filter((value, index, self) =>
        index === self.findIndex((t) => t.placeId === value.placeId)
    );
    
    return filteredListings;
}

processData();
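The filter and findIndex combination compares every listing with every other one, so its cost grows quadratically with the number of results. For larger grids, a minimal sketch of a linear-time alternative keys a Map by placeId. The helper name dedupeByPlaceId is hypothetical, and the sample listings are made up for illustration.

```javascript
// Keep the first listing seen for each placeId.
function dedupeByPlaceId(listings) {
    const seen = new Map();

    for (const listing of listings) {
        if (!seen.has(listing.placeId)) {
            seen.set(listing.placeId, listing);
        }
    }

    // Map preserves insertion order, so the original ordering is kept
    return [...seen.values()];
}

const sampleListings = [
    { placeId: "a", name: "First Cafe" },
    { placeId: "b", name: "Second Diner" },
    { placeId: "a", name: "First Cafe (seen from another grid point)" },
];

const uniqueListings = dedupeByPlaceId(sampleListings);
console.log(uniqueListings.map(listing => listing.name));
// [ 'First Cafe', 'Second Diner' ]
```
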

Conclusion

Our journey through scraping business listings on Google Maps using Node.js and SerpApi has demonstrated the power of data for strategic insights. From understanding local business landscapes to competitor analyses, the extracted data proves invaluable for informed decision-making.

If you have any questions or would like to discuss any issues or matters, feel free to contact our team at contact@serpapi.com. We'll be more than happy to assist you and answer all of your questions!