If you are looking to scrape Google Images, we already have a blog post that covers it.

Google Lens is best known for its mobile app, widely available on iOS and Android. It is popular for its image recognition, which provides information and context about objects, text, and landmarks captured by a smartphone camera. It can identify and translate text, recognize plants and animals, provide information about products, and more, making it a valuable tool for visual search and augmented reality experiences.

The web version of Google Lens, on the other hand, is a great tool for reverse image search. The search results can include a knowledge graph for the subject, similar images found on the internet, and a handy feature for finding image sources.

Example search results of a giraffe image:

Google Lens search for giraffe. Its results include other species of giraffe, a knowledge graph, and a collection of similar images.

Unlike similar images, which generally show giraffe images of any kind, the image sources feature shows where the exact image has been posted on the internet: the same angle, the same color, the same pose, and so on, or at least something pretty close.

Here is a walkthrough on accessing the image sources page:


Google Lens search for giraffe photo (Using Chrome on a Mac)

At SerpApi, we have introduced a new Google Lens Image Sources API that enables you to scrape results from the Google Lens image sources page. We also have a playground where you can test the API. In the rest of this article, we will demonstrate how to use the API to scrape the Google Lens image sources page.

Why use an API?

  • No need to create a parser from scratch and maintain it.
  • Bypass blocks from Google, such as CAPTCHAs and IP blocks.
  • No need to pay for proxies and CAPTCHA solvers.
  • No need to use browser automation.

SerpApi takes care of everything mentioned above, with fast response times of ~2.47 seconds (~1.33 seconds with Ludicrous Speed) per request. The result is well-structured JSON data from a single API call.

Response times and success rates are shown on the SerpApi Status page.

Full code in JavaScript

Panda | Panda in China | George Lu | Flickr
https://live.staticflickr.com/8288/7708872342_b3b3b95813_b.jpg

This is the image we will be using for the demonstration. A cute little panda 🐼.

If you don't need an explanation, feel free to copy and paste the code into your favorite IDE and run it. Otherwise, you can follow the step-by-step code explanation to see what each line does. You can also check out our playground, which is a handy way to experiment with the API and get a sense of how things work.

import { getJson } from "serpapi"

const SERPAPI_KEY = "..." // Get your API_KEY from https://serpapi.com/manage-api-key

// 1. Get the image sources page token
const googleLensParams = {
  api_key: SERPAPI_KEY,
  engine: "google_lens",
  url: "https://live.staticflickr.com/8288/7708872342_b3b3b95813_b.jpg"
}
const googleLensResponse = await getJson(googleLensParams)

// 2. Use the page token to retrieve image sources
const imageSourcesParams = {
  api_key: SERPAPI_KEY,
  engine: "google_lens_image_sources",
  page_token: googleLensResponse["image_sources_search"]["page_token"]
}
const imageSourcesResponse = await getJson(imageSourcesParams)
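
If you want to reuse this flow for several images, the two requests above can be wrapped in a single function. The following is only a sketch: the function name fetchImageSources and the idea of passing the request function in as a parameter (getJsonFn) are assumptions made for illustration, so that the flow can be exercised with a stub instead of a live API call. It also assumes the image_sources key may be absent and falls back to an empty list.

```javascript
// Sketch: chain the two requests from the full code above in one function.
// `getJsonFn` is assumed to behave like serpapi's getJson:
// it takes a params object and resolves to the parsed JSON response.
async function fetchImageSources(imageUrl, apiKey, getJsonFn) {
  // 1. Ask the Google Lens API for the image sources page token
  const lensResponse = await getJsonFn({
    api_key: apiKey,
    engine: "google_lens",
    url: imageUrl,
  });

  const pageToken = lensResponse?.image_sources_search?.page_token;
  if (!pageToken) {
    throw new Error("No image_sources_search.page_token in the Google Lens response");
  }

  // 2. Use the page token to retrieve image sources
  const sourcesResponse = await getJsonFn({
    api_key: apiKey,
    engine: "google_lens_image_sources",
    page_token: pageToken,
  });

  // Assumption: fall back to an empty list if the key is absent
  return sourcesResponse?.image_sources ?? [];
}
```

With the serpapi package imported as in the full code, a call would look like fetchImageSources(imageUrl, SERPAPI_KEY, getJson).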

Prerequisite

Install library:

npm install serpapi

serpapi is the official SerpApi package for JavaScript. Follow this guide to get Node.js and npm installed if you haven’t done so.

Code Explanation

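First, we need to import the library. Note that the import syntax and the top-level await used in this code require the file to run as an ES module, for example a .mjs file or a project with "type": "module" in package.json.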

import { getJson } from "serpapi"

To retrieve the image sources, we need a page_token, which can be obtained from the Google Lens API.

We set the required parameters for the Google Lens API. A list of parameters and their detailed explanations can be found in our documentation.

const googleLensParams = {
  api_key: SERPAPI_KEY,
  engine: "google_lens",
  url: "https://live.staticflickr.com/8288/7708872342_b3b3b95813_b.jpg"
}

Once the parameters are defined, the library takes care of everything else, including making the actual API request. Calling getJson makes the request, and the returned JSON data is assigned to googleLensResponse.

const googleLensResponse = await getJson(googleLensParams)

Output from Google Lens:

...
"image_sources_search": {
  "page_token": "YzdkMmM0MDEtZDg3YS00NWE0LThhYTYtY2E3Y2U4ODQzZDJm",
  "serpapi_link": "https://serpapi.com/search.json?engine=google_lens_image_sources&page_token=YzdkMmM0MDEtZDg3YS00NWE0LThhYTYtY2E3Y2U4ODQzZDJm"
}
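
Since the next request depends on this token, it can help to read it defensively. The sketch below uses a hypothetical helper, getPageToken, which is not part of the serpapi package. It assumes the image_sources_search key could be missing from some responses and throws a descriptive error in that case rather than passing undefined along.

```javascript
// Hypothetical helper: returns the page token from a Google Lens API response,
// or throws a descriptive error if it is missing.
function getPageToken(googleLensResponse) {
  const token = googleLensResponse?.image_sources_search?.page_token;
  if (typeof token !== "string" || token.length === 0) {
    throw new Error("Google Lens response has no image_sources_search.page_token");
  }
  return token;
}

// Trimmed copy of the output shown above
const sampleResponse = {
  image_sources_search: {
    page_token: "YzdkMmM0MDEtZDg3YS00NWE0LThhYTYtY2E3Y2U4ODQzZDJm",
  },
};

console.log(getPageToken(sampleResponse));
```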

Once we have the page_token, we set the required parameters for the Google Lens Image Sources API and we are good to go. A list of parameters and their detailed explanations can be found in our documentation.

const imageSourcesParams = {
  api_key: SERPAPI_KEY,
  engine: "google_lens_image_sources",
  page_token: googleLensResponse["image_sources_search"]["page_token"]
}
const imageSourcesResponse = await getJson(imageSourcesParams)

Output:

The total number of results is about 400.

console.log(imageSourcesResponse)
"image_sources": [
  {
    "position": 1,
    "title": "Good News: Giant Pandas No Longer Endangered | Popular Science",
    "source": "Popular Science",
    "source_logo": "https://encrypted-tbn1.gstatic.com/favicon-tbn?q=tbn:ANd9GcQRm44rTc29eZEGLrhaKXvwyGf0F_BMMmL90m0Lx1bqDrlxWIGlEOEq5oQ7ixJoxJI3oukQXe_oY4R2otKzF8_ArHGvGwB2t4nGGzIdH2MnUAzLCQ",
    "link": "https://www.popsci.com/good-news-giant-pandas-no-longer-endangered/",
    "thumbnail": "https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcT2uqoIUY7lILHbon8I_OZs3vKeIX5gFA1wzzbd7zSVVuedvQxm",
    "actual_image_width": 785,
    "actual_image_height": 523,
    "date": "Sep 6, 2016"
  },
  {
    "position": 2,
    "title": "Panda | Panda in China | George Lu | Flickr",
    "source": "Flickr",
    "source_logo": "https://encrypted-tbn2.gstatic.com/favicon-tbn?q=tbn:ANd9GcTfrOu0sWKs3Pcw2QmgK3YJ56j2-ayOW-qRIJSQvaGvhVZy3tuULAELn-9WhCkJkgiSJmW8S2sgnoc74IbwPBg6aBaBJfMsZdN3QKOSPEQO0xq3oQ",
    "link": "https://www.flickr.com/photos/gzlu/7708872342",
    "thumbnail": "https://encrypted-tbn2.gstatic.com/images?q=tbn:ANd9GcQ7S-FJ5EFaoGqGi66z7FHC6ANKHqMTFCCjoO9Yzi7NWLHQrpXJ",
    "actual_image_width": 1024,
    "actual_image_height": 683
  },
  {
    "position": 3,
    "title": "Prepare Data for Machine Learning in Python with Pandas - MachineLearningMastery.com",
    "source": "Machine Learning Mastery",
    "source_logo": "https://encrypted-tbn0.gstatic.com/favicon-tbn?q=tbn:ANd9GcQcJkq_0r9l38eWZvFKOeogF5bbkCJawUVhf113plZi1TgDK3f9mjSLTYSL_gNus-7yrzwx-gxEMyzJLzYC7RZ6tg3_jop4F63bvDWzYRUfoLJ8kR3ZtbdnpHiQVDAp8g",
    "link": "https://machinelearningmastery.com/prepare-data-for-machine-learning-in-python-with-pandas/",
    "thumbnail": "https://encrypted-tbn1.gstatic.com/images?q=tbn:ANd9GcT3ERjSbsEayhYLl2WRh4BAu4uBh6JHCJuRjjByTuO9_NvYb18s",
    "actual_image_width": 640,
    "actual_image_height": 427,
    "date": "Aug 15, 2020"
  },
  ...
]
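
Once you have the image_sources array, you can post-process it like any other JSON data. As a small illustration, the sketch below picks the result with the largest pixel area, using a trimmed copy of the three results shown above. The helper name largestImage is hypothetical, and treating the highest-resolution match as the most likely original is only a heuristic, not something the API guarantees.

```javascript
// Hypothetical helper: returns the result with the largest pixel area,
// or null for an empty list. Missing dimensions are treated as 0.
function largestImage(imageSources) {
  let best = null;
  let bestArea = -1;
  for (const item of imageSources) {
    const area = (item.actual_image_width ?? 0) * (item.actual_image_height ?? 0);
    if (area > bestArea) {
      best = item;
      bestArea = area;
    }
  }
  return best;
}

// Trimmed copy of the three results shown above
const sampleSources = [
  {
    position: 1,
    source: "Popular Science",
    link: "https://www.popsci.com/good-news-giant-pandas-no-longer-endangered/",
    actual_image_width: 785,
    actual_image_height: 523,
  },
  {
    position: 2,
    source: "Flickr",
    link: "https://www.flickr.com/photos/gzlu/7708872342",
    actual_image_width: 1024,
    actual_image_height: 683,
  },
  {
    position: 3,
    source: "Machine Learning Mastery",
    link: "https://machinelearningmastery.com/prepare-data-for-machine-learning-in-python-with-pandas/",
    actual_image_width: 640,
    actual_image_height: 427,
  },
];

const largest = largestImage(sampleSources);
console.log(largest.source, largest.link);
```

For this sample, the largest image is the 1024 x 683 Flickr result, which is also the page the demonstration image comes from.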

Documentation

Conclusions

Google Lens is a great reverse image search tool, and its image sources page is a great way to find where on the internet the same or similar images have been posted. The data can be useful for analysis, artificial intelligence training, content creation, social media research, and more.

If you have any questions, please feel free to reach out to me.

