Intro

In the previous blog post, SerpApi Async Requests with Pagination using Python, we covered how to make async requests with SerpApi's pagination and how to use the Search Archive API and Queue.

In this blog post we'll cover how to make direct requests to serpapi.com/search.json without using SerpApi's google-search-results Python client.

Making a direct request to SerpApi this way gives a slightly faster response time than the Python client's batch async search feature, which uses Queue.
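To make "direct request" concrete, here is a minimal sketch of the URL that ends up being requested. The helper name build_search_url is hypothetical and used only for illustration. The parameter names are the ones used in the code later in this post, and the API key is a placeholder.

```python
from urllib.parse import urlencode

BASE_URL = 'https://serpapi.com/search.json'

def build_search_url(query, api_key='...'):
    """Hypothetical helper: return the GET URL for one YouTube search."""
    params = {
        'api_key': api_key,       # placeholder, use your own SerpApi key
        'engine': 'youtube',      # search engine to parse data from
        'search_query': query,    # search query
    }
    # urlencode escapes the values and joins them as key=value pairs
    return f'{BASE_URL}?{urlencode(params)}'

print(build_search_url('window', api_key='YOUR_API_KEY'))
# https://serpapi.com/search.json?api_key=YOUR_API_KEY&engine=youtube&search_query=window
```

An HTTP client such as aiohttp builds the same kind of query string when it is given a params dictionary, so the helper is not needed in practice. It only shows what goes over the wire.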

In the next blog post, we'll cover how to add pagination to the code shown below.


Subject of test: YouTube Search Engine Results API.

Test includes: 50 async search queries.

Code

You can check the code example in the online IDE:

import aiohttp
import asyncio
import json
import time

async def fetch_results(session, query):
    params = {
        'api_key': '...',      # your serpapi api key: https://serpapi.com/manage-api-key
        'engine': 'youtube',   # search engine to parse data from
        'device': 'desktop',   # from which device to parse data
        'search_query': query, # search query
        'no_cache': 'true'     # https://serpapi.com/search-api#api-parameters-serpapi-parameters-no-cache
    }
    
    async with session.get('https://serpapi.com/search.json', params=params) as response:
        results = await response.json()

    data = []

    if 'error' in results:
        print(results['error'])
    else:
        for result in results.get('video_results', []):
            data.append({
                'title': result.get('title'),
                'link': result.get('link'),
                'channel': (result.get('channel') or {}).get('name'),  # channel may be missing
            })

    return data

async def main():
    # 50 queries
    queries = [
        'burly',
        'creator',
        'doubtful',
        'chance',
        'capable',
        'window',
        'dynamic',
        'train',
        'worry',
        'useless',
        'steady',
        'thoughtful',
        'matter',
        'rotten',
        'overflow',
        'object',
        'far-flung',
        'gabby',
        'tiresome',
        'scatter',
        'exclusive',
        'wealth',
        'yummy',
        'play',
        'saw',
        'spiteful',
        'perform',
        'busy',
        'hypnotic',
        'sniff',
        'early',
        'mindless',
        'airplane',
        'distribution',
        'ahead',
        'good',
        'squeeze',
        'ship',
        'excuse',
        'chubby',
        'smiling',
        'wide',
        'structure',
        'wrap',
        'point',
        'file',
        'sack',
        'slope',
        'therapeutic',
        'disturbed'
    ]

    data = []

    async with aiohttp.ClientSession() as session:
        tasks = []
        for query in queries:
            task = asyncio.create_task(fetch_results(session, query))
            tasks.append(task)

        start_time = time.time()
        results = await asyncio.gather(*tasks)
        end_time = time.time()

        data = [item for sublist in results for item in sublist]

    print(json.dumps(data, indent=2, ensure_ascii=False))
    print(f'Script execution time: {end_time - start_time} seconds') # ~7.192448616027832 seconds

asyncio.run(main())

Code Explanation

Import libraries:

import aiohttp # to make a request
import asyncio
import json    # for printing data
import time    # to measure execution time

In the fetch_results() function we:

  1. create the search params that will be passed to SerpApi with the request.
  2. make an async request through the session, passing params, then await the response and store the parsed JSON in the results variable.
  3. check for 'error' in the results; if there is none, iterate over 'video_results' and store the extracted data in the data list.
  4. return the list with video data.

async def fetch_results(session, query):
    params = {
        'api_key': '...',      # your serpapi api key: https://serpapi.com/manage-api-key
        'engine': 'youtube',   # search engine to parse data from
        'device': 'desktop',   # from which device to parse data
        'search_query': query, # search query
        'no_cache': 'true'     # https://serpapi.com/search-api#api-parameters-serpapi-parameters-no-cache
    }
    
    async with session.get('https://serpapi.com/search.json', params=params) as response:
        results = await response.json()

    data = []

    if 'error' in results:
        print(results['error'])
    else:
        for result in results.get('video_results', []):
            data.append({
                'title': result.get('title'),
                'link': result.get('link'),
                'channel': (result.get('channel') or {}).get('name'),  # channel may be missing
            })

    return data
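The extraction step can also be tried without spending API credits. Below is a minimal sketch that moves the parsing into a separate function and runs it on a hand-written sample. The function name extract_videos and the sample dictionary are assumptions made for illustration, not real SerpApi output. Only the keys used in the code above ('video_results', 'title', 'link', 'channel', 'name') are taken from this post.

```python
def extract_videos(results):
    """Pull title, link and channel name out of a parsed search.json response."""
    videos = []
    for result in results.get('video_results', []):
        # Tolerate a missing or null 'channel' instead of raising AttributeError
        channel = result.get('channel') or {}
        videos.append({
            'title': result.get('title'),
            'link': result.get('link'),
            'channel': channel.get('name'),
        })
    return videos

# Hand-written sample for illustration only, not real API output
sample = {
    'video_results': [
        {'title': 'First video', 'link': 'https://example.com/1', 'channel': {'name': 'Some channel'}},
        {'title': 'Second video', 'link': 'https://example.com/2'},  # no 'channel' key
    ]
}

for video in extract_videos(sample):
    print(video)
```

Keeping the parsing separate from the network call makes it possible to check the edge cases, such as a result without channel data, against saved or made-up responses.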

In the main() function we:

  1. create a list of queries. It could also be loaded from a txt/csv/json file.
  2. open an aiohttp.ClientSession().
  3. iterate over the queries and create an asyncio task for each one.
  4. run all of the tasks concurrently with asyncio.gather(*tasks).
  5. flatten the list of per-query results and store it in the data variable.
  6. print the data and the execution time.

async def main():
    queries = [
        'burly',
        'creator',
        'doubtful',
        # ...
    ]

    data = []

    async with aiohttp.ClientSession() as session:
        tasks = []
        for query in queries:
            task = asyncio.create_task(fetch_results(session, query))
            tasks.append(task)

        start_time = time.time()
        results = await asyncio.gather(*tasks)
        end_time = time.time()

        data = [item for sublist in results for item in sublist]

    print(json.dumps(data, indent=2, ensure_ascii=False))
    print(f'Script execution time: {end_time - start_time} seconds') # ~7.192448616027832 seconds

asyncio.run(main())
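To see why gathering the tasks is fast, the same pattern can be run against a stand-in coroutine. In this minimal sketch fake_fetch is a hypothetical stub that sleeps instead of calling SerpApi, so it needs no API key. The 0.2 second delay is an arbitrary assumption.

```python
import asyncio
import time

async def fake_fetch(query, delay=0.2):
    """Hypothetical stand-in for fetch_results(): waits, then returns one item."""
    await asyncio.sleep(delay)          # simulates waiting for the HTTP response
    return [{'title': f'{query} video'}]

async def run_concurrently(queries):
    tasks = [asyncio.create_task(fake_fetch(query)) for query in queries]

    start = time.perf_counter()
    results = await asyncio.gather(*tasks)   # results keep the order of the tasks
    elapsed = time.perf_counter() - start

    # Flatten the list of lists into a single list
    data = [item for sublist in results for item in sublist]
    return data, elapsed

data, elapsed = asyncio.run(run_concurrently(['burly', 'creator', 'doubtful']))
print(data)
print(f'Elapsed: {elapsed:.2f} seconds')
```

The three waits overlap, so the total should stay close to a single delay rather than the sum of all three. With real requests the gain depends on network and server response times.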

Conclusion

As you saw (and possibly tried), this approach results in quite fast response times: the 50 queries above completed in about 7 seconds. Additionally, we can add pagination to it, which will be covered in the next blog post.
