Intro
In the previous SerpApi Async Requests with Pagination using Python blog post, we covered how to make async requests with SerpApi's pagination, and how to use the Search Archive API together with Queue.
In this blog post, we'll cover how to make direct requests to serpapi.com/search.json without using SerpApi's google-search-results Python client.
By making direct requests to SerpApi, we can get a slightly faster response time compared to the Python client's batch async search feature, which uses Queue.
In the following blog post, we'll cover how to add pagination to the code shown below.
Subject of test: YouTube Search Engine Results API.
Test includes: 50 async search queries.
Code
You can check the code example in the online IDE:
import aiohttp
import asyncio
import json
import time


async def fetch_results(session, query):
    params = {
        'api_key': '...',        # your serpapi api key: https://serpapi.com/manage-api-key
        'engine': 'youtube',     # search engine to parse data from
        'device': 'desktop',     # from which device to parse data
        'search_query': query,   # search query
        'no_cache': 'true'       # https://serpapi.com/search-api#api-parameters-serpapi-parameters-no-cache
    }

    async with session.get('https://serpapi.com/search.json', params=params) as response:
        results = await response.json()

    data = []

    if 'error' in results:
        print(results['error'])
    else:
        for result in results.get('video_results', []):
            data.append({
                'title': result.get('title'),
                'link': result.get('link'),
                'channel': result.get('channel', {}).get('name'),  # avoids AttributeError if 'channel' is missing
            })

    return data
async def main():
    # 50 queries
    queries = [
        'burly',
        'creator',
        'doubtful',
        'chance',
        'capable',
        'window',
        'dynamic',
        'train',
        'worry',
        'useless',
        'steady',
        'thoughtful',
        'matter',
        'rotten',
        'overflow',
        'object',
        'far-flung',
        'gabby',
        'tiresome',
        'scatter',
        'exclusive',
        'wealth',
        'yummy',
        'play',
        'saw',
        'spiteful',
        'perform',
        'busy',
        'hypnotic',
        'sniff',
        'early',
        'mindless',
        'airplane',
        'distribution',
        'ahead',
        'good',
        'squeeze',
        'ship',
        'excuse',
        'chubby',
        'smiling',
        'wide',
        'structure',
        'wrap',
        'point',
        'file',
        'sack',
        'slope',
        'therapeutic',
        'disturbed'
    ]

    data = []

    async with aiohttp.ClientSession() as session:
        tasks = []

        for query in queries:
            task = asyncio.ensure_future(fetch_results(session, query))
            tasks.append(task)

        start_time = time.time()
        results = await asyncio.gather(*tasks)
        end_time = time.time()

    data = [item for sublist in results for item in sublist]

    print(json.dumps(data, indent=2, ensure_ascii=False))
    print(f'Script execution time: {end_time - start_time} seconds')  # ~7.192448616027832 seconds


asyncio.run(main())
Code Explanation
Import libraries:
import aiohttp  # to make async requests
import asyncio  # to run and gather async tasks
import json     # for printing data
import time     # to measure execution time
In the fetch_results() function we:
- create search params that will be passed to SerpApi while making the request.
- make an async session request, passing params, waiting for the response, and storing it in the results variable.
- check for 'error' in the results, iterate over 'video_results', and store the extracted data in the data list.
- return the list with video data.
async def fetch_results(session, query):
    params = {
        'api_key': '...',        # your serpapi api key: https://serpapi.com/manage-api-key
        'engine': 'youtube',     # search engine to parse data from
        'device': 'desktop',     # from which device to parse data
        'search_query': query,   # search query
        'no_cache': 'true'       # https://serpapi.com/search-api#api-parameters-serpapi-parameters-no-cache
    }

    async with session.get('https://serpapi.com/search.json', params=params) as response:
        results = await response.json()

    data = []

    if 'error' in results:
        print(results['error'])
    else:
        for result in results.get('video_results', []):
            data.append({
                'title': result.get('title'),
                'link': result.get('link'),
                'channel': result.get('channel', {}).get('name'),  # avoids AttributeError if 'channel' is missing
            })

    return data
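Firing all 50 requests at once may run into your plan's rate limit. One common pattern, not part of the original code and shown here only as a sketch with a hypothetical limit of 10, is to cap concurrency with asyncio.Semaphore. The dummy fetch below stands in for the real session.get(...) call from fetch_results():

```python
import asyncio

# Sketch: cap the number of in-flight requests with a semaphore.
# The limit of 10 and the dummy fetch are assumptions for illustration;
# swap in the real session.get(...) call from fetch_results().
async def fetch_limited(semaphore, query):
    async with semaphore:
        await asyncio.sleep(0.01)  # stands in for the real network call
        return {'search_query': query}

async def run_all(queries):
    semaphore = asyncio.Semaphore(10)  # at most 10 concurrent requests
    tasks = [fetch_limited(semaphore, q) for q in queries]
    return await asyncio.gather(*tasks)

results = asyncio.run(run_all([f'query {i}' for i in range(50)]))
print(len(results))  # 50
```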
In the main() function we:
- create a list of queries. They could also come from a txt/csv/json file.
- open an aiohttp.ClientSession().
- iterate over the queries and create asyncio tasks.
- run all of the tasks with asyncio.gather(*tasks).
- flatten the nested list of results and store it in the data variable.
- print the data.
async def main():
    queries = [
        'burly',
        'creator',
        'doubtful',
        # ...
    ]

    data = []

    async with aiohttp.ClientSession() as session:
        tasks = []

        for query in queries:
            task = asyncio.ensure_future(fetch_results(session, query))
            tasks.append(task)

        start_time = time.time()
        results = await asyncio.gather(*tasks)
        end_time = time.time()

    data = [item for sublist in results for item in sublist]

    print(json.dumps(data, indent=2, ensure_ascii=False))
    print(f'Script execution time: {end_time - start_time} seconds')  # ~7.192448616027832 seconds


asyncio.run(main())
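As mentioned above, the queries don't have to be hard-coded; they can be read from a file. A minimal sketch, assuming a plain-text file with one query per line (queries.txt is a hypothetical name, and the file is created here only to make the demo self-contained):

```python
# Sketch: load queries from a plain-text file, one per line.
# The file name and contents are made up for this demonstration.
sample = 'burly\ncreator\ndoubtful\n'
with open('queries.txt', 'w') as f:
    f.write(sample)

with open('queries.txt') as f:
    queries = [line.strip() for line in f if line.strip()]

print(queries)  # ['burly', 'creator', 'doubtful']
```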
Conclusion
As you saw (and possibly tried yourself), this approach results in quite fast response times. Additionally, we can add pagination to it, which will be covered in the next blog post.