How to Scrape DuckDuckGo Knowledge Graph Results
In this series of blog posts on scraping Knowledge Graph results from major search engines, I have already shown how to scrape Google, Bing, and Yahoo Knowledge Graph results with SerpApi.
DuckDuckGo is a popular search engine that focuses on privacy. Like the others, it returns a Knowledge Graph result for certain keywords.
The source of the Knowledge Graph data is not public. Some search engines pull these results from Wikipedia, which is generally a source we can trust.
Setting up a SerpApi account
SerpApi offers a free plan for newly created accounts. Head to the sign-up page to register an account and complete your first search with our interactive playground. When you want to do more searches with us, please visit the pricing page.
Once you are familiar with the results, you can call the SERP APIs using your API key.
Scrape your first DuckDuckGo Knowledge Graph result with SerpApi
See the DuckDuckGo Knowledge Graph Results page in the SerpApi documentation for details.
In this tutorial, let's scrape some basic information about my favorite author: "Malcolm Gladwell". If you haven't picked up any of his books, try one. He's an excellent storyteller. The data contains: "name", "description", "thumbnail", "website", "education", and "notable_work". You can also scrape even more information with SerpApi!
First, you need to install the SerpApi client library.
pip install google-search-results
Set up the SerpApi credentials and search.
import serpapi

params = {
    'api_key': 'YOUR_API_KEY',  # your SerpApi API key
    'engine': 'duckduckgo',     # SerpApi search engine
    'q': 'malcolm gladwell'     # search query
}
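Hardcoding the key is fine for a quick test, but for code you share or commit you may prefer to read it from the environment. Here is a minimal sketch of that idea. The helper name build_params and the environment variable name SERPAPI_API_KEY are choices made for this example, not something the SerpApi client library requires.

```python
import os


def build_params(query, env_var="SERPAPI_API_KEY"):
    """Build the search parameters, reading the API key from the environment.

    `build_params` and the variable name SERPAPI_API_KEY are illustrative
    choices, not part of the SerpApi client library.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {
        "api_key": api_key,
        "engine": "duckduckgo",  # SerpApi search engine
        "q": query,              # search query
    }
```

The returned dictionary has the same three keys as the hand-written params, so it can be passed to the search call in the same way.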
To retrieve DuckDuckGo Knowledge Graph Results for a given search term, you can use the following code:
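# GoogleSearch is the client class in google-search-results; the 'engine' parameter above selects DuckDuckGo
results = serpapi.GoogleSearch(params).get_dict()['knowledge_graph']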
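Not every search term returns a Knowledge Graph, so it can help to check for one before reading any fields. The sketch below shows one way to do that on a plain response dictionary. The function name extract_knowledge_graph is hypothetical, and the two sample dictionaries are placeholders standing in for a parsed response, not real SerpApi output.

```python
def extract_knowledge_graph(response):
    """Return the 'knowledge_graph' dict from a response, or None if absent.

    `response` is assumed to be the plain dict produced by the search call.
    """
    graph = response.get("knowledge_graph")
    # Treat a missing or empty block the same way: nothing to scrape.
    return graph if isinstance(graph, dict) and graph else None


# Placeholder data for illustration only, not real API output.
with_graph = {"knowledge_graph": {"title": "Example Person"}}
without_graph = {"organic_results": []}

print(extract_knowledge_graph(with_graph))     # {'title': 'Example Person'}
print(extract_knowledge_graph(without_graph))  # None
```

Returning None lets the caller skip the export step for search terms that have no Knowledge Graph.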
You can store the DuckDuckGo Knowledge Graph JSON data in a database or export it to a CSV file.
import csv

header = ['name', 'description', 'thumbnail', 'website', 'education', 'notable_work']
facts = results.get('facts', {})  # nested facts such as education and notable work
with open('malcolm_gladwell.csv', 'w', encoding='UTF8', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(header)
    writer.writerow([
        results.get('title'), results.get('description'),
        results.get('thumbnail'), results.get('website'),
        facts.get('education'), facts.get('notable_work'),
    ])
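For the database route mentioned above, here is a minimal sketch using Python's built-in sqlite3 module. The table name knowledge_graph, its columns, the helper save_knowledge_graph, and the sample record are all assumptions made for illustration. The full result is also kept as a JSON string so that fields not mapped to a column are not lost.

```python
import json
import sqlite3


def save_knowledge_graph(conn, query, graph):
    """Store one Knowledge Graph result, keeping the raw JSON next to key fields."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS knowledge_graph ("
        "query TEXT PRIMARY KEY, title TEXT, description TEXT, raw_json TEXT)"
    )
    # INSERT OR REPLACE keeps a single, latest row per search query.
    conn.execute(
        "INSERT OR REPLACE INTO knowledge_graph VALUES (?, ?, ?, ?)",
        (query, graph.get("title"), graph.get("description"), json.dumps(graph)),
    )
    conn.commit()


# Placeholder record for illustration only, not real API output.
sample = {
    "title": "Example Person",
    "description": "An example description.",
    "facts": {"education": "Example University"},
}

conn = sqlite3.connect(":memory:")  # pass a file path instead to persist the data
save_knowledge_graph(conn, "example person", sample)
row = conn.execute(
    "SELECT title, raw_json FROM knowledge_graph WHERE query = ?",
    ("example person",),
).fetchone()
print(row[0])                                    # Example Person
print(json.loads(row[1])["facts"]["education"])  # Example University
```

Using the search query as the primary key means that running the same search again overwrites the earlier row instead of adding a duplicate.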
This example uses Python, but you can also use your favorite programming language, such as Ruby, NodeJS, Java, PHP, and more!
If you have any questions, please feel free to contact me.