Scrape Bing Search Results with Node.js

Bing search results are rich in data that can be invaluable for development, businesses, and academic research.

SerpApi provides APIs for scraping results pages from all of the largest search engines, including Bing. There are libraries supporting several of the most popular programming languages, including JavaScript, Python, Ruby, and PHP.

In this beginner-friendly tutorial, we will cover how to scrape Bing search results pages with Node.js, using SerpApi's JavaScript library.

To learn how to scrape Bing search results yourself, without using SerpApi, you can click here.

Click here for a tutorial using Ruby.

We will also cover how to paginate and get all of the available results for a Bing search.

Prerequisites

You will need to be somewhat familiar with Javascript.

You will need to have Node and Npm installed on your system. This tutorial should work with Node.js 7 and newer. If you don’t already have Node and Npm installed you can visit the following link for help with this process:

https://docs.npmjs.com/downloading-and-installing-node-js-and-npm

You will also need to sign up for a free SerpApi account at https://serpapi.com/users/sign_up.

Preparation

First we need to create a package.json file inside our project directory:

mkdir bing-scraper-nodejs
cd bing-scraper-nodejs
npm init

Npm will walk you through the process of creating a package.json file. We don't need to make any changes to the package.json file in this case, as we will be using the CommonJS syntax and import system.

Next, we need to install a few dependencies. For this tutorial, we will use the SerpApi Javascript library to get search results, and Dotenv for handling our API key.

npm install dotenv serpapi

If you haven’t already signed up for a free SerpApi account go ahead and do that now by visiting https://serpapi.com/users/sign_up and completing the signup process.

Getting Started

SerpApi provides starter code for all APIs, with examples in popular programming languages. You can find Node.js starter code not only for the Bing Search API but for any of SerpApi's APIs by visiting the documentation pages and selecting Javascript in the "Code to Integrate" window.

You will notice if you are signed into your account that your API key is already added to the parameters.

Note: I regenerated my API key after taking this screen shot to avoid API key theft. We don't recommend sharing your API key publicly, as this will allow someone else to use your SerpApi account without authorization

We can copy the example code using the clipboard icon to get started.

Then create a file called index.js, open it in your favorite IDE or editor, and paste the code.

You can run it by entering the command node index.js. You should see the JSON results printed in your terminal.

Protecting Your API Key

If you push your code to a public GitHub repository, you need to be careful not to expose your API key. A common way of handling this in Node.js is to create and add the API key as an environment variable and use the dotenv node module to retrieve it for use in your application. You can then edit your .gitignore to ensure the .env file won't be pushed to Github.

Create a .env file in your project directory:

touch .env

Open the .env file and add the API key as an environment variable:

SERPAPI_API_KEY="your-api-key-here"

Then we can use the dotenv module to retrieve it in our index.js. Underneath const { getJson } = require("serpapi"); add the following lines:

require('dotenv').config();
console.log(process.env);

Then run your application with node index.js to make sure your API_key is retrieved successfully. Comment out the call to console.log() after you do this, as we don't want to print out our environment variables every time we run our code.

The SerpApi library also includes a config object where you can store your API key so that you don't have to add it as a parameter in each request. To use config we need to add it to the import statement:

const { getJson, config } = require("serpapi");

We can then retrieve the API key using dotenv and store it in config:

const { getJson, config } = require("serpapi");
require('dotenv').config();
// console.log(process.env);
config.api_key = process.env.SERPAPI_API_KEY;

We can then remove the API_key from the parameter object in the getJSON() call. Your full code should now look like this:

const { getJson, config } = require("serpapi");
require('dotenv').config();
// console.log(process.env);
config.api_key = process.env.SERPAPI_API_KEY;

getJson({
  engine: "bing",
  q: "Coffee",
  cc: "US",
}, (json) => {
  console.log(json);
});

If you plan to push your code to a public GitHub repo, make sure you include .env in your .gitignore file.

Other Parameters

The starter code includes the basic parameters needed to perform a search. However, the Bing Search API supports some optional parameters that can help you fine-tune your results. Visit the documentation page to learn about them all.

Geographic Location

Location can be an important factor if you are using the Bing Search API to track SEO rankings, or want to emulate the search results you see when using Bing in your browser.

The Bing Search API supports the lat and lon parameters if you want to search for results from a point of origin as defined by GPS coordinates.

However, in this example, we will use the location parameter for simplicity. To add the location parameter, just add it to the parameter object. It accepts a string, and can include the name of a City, State, and/or Country:

getJson({
  engine: "bing",
  q: "Coffee",
  cc: "US",
  location: "Austin, Texas",
}, (json) => {
  console.log(json);
});

You can read more about location parameters, and how they affect your searches here:

Pagination

Bing searches return a limited number of results per page. Sending a request to SerpApi only returns the first page of results. To get more results we need to use the pagination parameters supported by SerpApi.

We can use the count parameter to increase the number of results per page. It has a maximum value of 50. However, Bing will not always provide a number of results per page equal to the value of count. Later pages seem to contain the number of results specified, but the first page may not show more than 35 results or so even if count=50.

To get as many results as possible, we set count=50:

getJson({
  engine: "bing",
  q: "Coffee",
  cc: "US",
  location: "Austin, Texas",
  count: 50,
}, (json) => {
  console.log(json);
});

We can use the first parameter to paginate. This is an offset, similar to Google's start except that it starts at 1 whereas start begins at 0.

So, for example, if 35 results were returned in response to the initial request, we can send another request with first=36:

getJson({
  engine: "bing",
  q: "Coffee",
  cc: "US",
  location: "Austin, Texas",
  count: 50,
  first: 36
}, (json) => {
  console.log(json);
});

To send multiple requests in a loop, we will need to check the response for the presence of results each time so that we know when to exit the loop. Since we can't check the response until we receive it, we will need to use the await keyword. To use the await keyword in CommonJs, we need to put our code inside an async function.

Typically we want to increment first by the number of results available per page with each pass of the loop, in order to get all available results. However, the number of results Bing returns per page may vary, so to do this with count=50 we would need to check how many results were returned, add one, and use that as the value for first.

Let's simplify for now by setting count=30, and incrementing first by 30 with each pass.

Instead of hardcoding the values for the search, let's also accept an object called "params" as a parameter in the function. You can comment out the call to getJson() we added earlier, and add the following code instead:

async function search(params){
  
  // get the first page
  let page = await getJson(params);
  let resultsPresent = true;
  
  // keep searching while there are results
  while(resultsPresent){
    
    // increment first by 30 to paginate
    params.first += 30;
    page = await getJson(params);
    if(!page?.organic_results){
      resultsPresent = false;
    }else{
      
       /* Print the current pagination offset and the title of the top 
       result on each page*/
      console.log("Result #" + page.search_parameters.first);
      console.log(page.organic_results[0].title);
    }
  }
}

Then declare a search_parameters object and call the search() method, passing in search_parameters:

const search_parameters = {
    engine: "bing",
    q: "Coffee",
    cc: "US",
    location: "Austin, Texas",
    mkt: "en-US",
    count: 50,
    first:1
  };

search(search_parameters);

Open your terminal window again and run your code with node index.js to to make sure everything is working correctly.

To reduce the number of calls it requires to paginate through all of the results, we could set count=50, and check the number of available results in each page, so that we know how much to increment first.

However, there is a cleaner way to paginate.

SerpApi Pagination

SerpApi provides a serpapi_pagination key for engines supporting pagination. With bing, and most other engines, we can use the serpapi_pagination.next key to get the URL for the next page.

We can use the url and querystring node modules to help us parse the parameters from the serpapi_pagination.next URL.

We need to import these modules, add the import statements at the top of index.js:

const url = require("url");
const qs = require("querystring");

The syntax to parse the pagination URL using these modules looks like:

const nextUrl = url.parse(page.serpapi_pagination.next);
const params = qs.parse(nextUrl.query);

We can just replace params.first += 30; with the lines above. Now we can set count to 50, and get the maximum number of results for each page.

Here is the full code:

const { getJson, config } = require("serpapi");
const url = require("url");
const qs = require("querystring");

require('dotenv').config();
// console.log(process.env);
config.api_key = process.env.SERPAPI_API_KEY;
config.timeout = 60000;


const search_parameters = {
    engine: "bing",
    q: "Coffee",
    cc: "US",
    location: "Austin, Texas",
    mkt: "en-US",
    count: 50,
    first:1
  };

async function search(params){
// get the first page
  let page = await getJson(params);
  let resultsPresent = true;
// keep searching while there are results
  while(resultsPresent){
    
    const nextUrl = url.parse(page.serpapi_pagination.next);
// Extract the request parameters
   const params = qs.parse(nextUrl.query);
// Get the next page
    page = await getJson(params);
    if(!page?.organic_results){
      resultsPresent = false;
    }else{
      // Print the current pagination offset and the title of the top result on each page
      console.log("Result #" + page.search_parameters.first);
      console.log(page.organic_results[0].title);
    }
  }
}

search(search_parameters);

Conclusion

We've seen how to scrape Bing Search Results using Node.js with SerpApi, and paginate through all of the available results.

I hope you found the tutorial informative and easy to follow. If you have any questions, feel free to contact me at ryan@serpapi.com.

APIs

Easy Integrations

Features

Use Cases

Prerequisites

Preparation

Getting Started

Protecting Your API Key

Other Parameters

Geographic Location

Pagination

SerpApi Pagination

Conclusion

Links

SerpApi:

Node modules:

Free Plan · 100 searches / month