This tutorial shows how to scrape search engine results with SerpApi's simple API, using PHP and cURL.
Are you new to cURL? Learn the basics of making requests with cURL in PHP.
In this tutorial, we're not going to use SerpApi's PHP library. Instead, we'll stick to a native implementation with cURL.
Here is the complete documentation of the Google Search API we're going to use: https://serpapi.com/search-api
Basic search tutorial in PHP
Step 1: Prepare cURL
Let's prepare our cURL code. Add the SerpApi endpoint as the URL value.
<?php
// Initialize cURL session
$ch = curl_init();
// Set cURL options
curl_setopt($ch, CURLOPT_URL, "https://serpapi.com/search");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
Step 2: Prepare the parameters
Feel free to adjust these parameters based on the search engine you want to query and any other options you need.
// Set the data fields for GET request
$fields = array(
    'api_key' => 'YOUR_API_KEY',
    'engine' => 'google',
    'q' => 'Coffee',
    'location' => 'Austin, Texas, United States',
    'google_domain' => 'google.com',
    'gl' => 'us',
    'hl' => 'en'
);
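To preview how these fields end up in the request URL, here is a minimal standalone sketch using PHP's built-in http_build_query(), the same function Step 3 relies on. The API key is a placeholder and the shortened field list is only for illustration.

```php
<?php
// Illustrative subset of the fields above; the API key is a placeholder.
$fields = array(
    'api_key'  => 'YOUR_API_KEY',
    'engine'   => 'google',
    'q'        => 'Coffee',
    'location' => 'Austin, Texas, United States',
);

// http_build_query() URL-encodes each value and joins the pairs with "&"
$queryString = http_build_query($fields);
$url = "https://serpapi.com/search?" . $queryString;

echo $url . PHP_EOL;
```

With the default encoding, spaces become + and commas become %2C, so the location is sent as Austin%2C+Texas%2C+United+States.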
Step 3: Execute cURL
Let's execute the code above. Don't forget to close the connection at the end.
// Build the URL query string
$queryString = http_build_query($fields);
curl_setopt($ch, CURLOPT_URL, "https://serpapi.com/search?" . $queryString);
// Execute cURL session
$response = curl_exec($ch);
// Check for errors
if (curl_errno($ch)) {
    echo 'Request Error:' . curl_error($ch);
}
// Close cURL session
curl_close($ch);
// Output the response
echo $response;
That's it. You should see the result when you run the script on a server. Try PHP's built-in server with php -S localhost:8000 (the exact command may depend on your OS).
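The script above prints the raw JSON body. If you want to work with individual fields instead, a minimal sketch along these lines may help. The sample response string is hypothetical and heavily trimmed; only the search_metadata and organic_results keys used elsewhere in this tutorial are assumed.

```php
<?php
// Hypothetical, heavily trimmed response body used only for illustration;
// a real response contains many more fields.
$response = '{"search_metadata":{"id":"abc123","status":"Success"},'
    . '"organic_results":[{"title":"Example coffee result"}]}';

// Decode the JSON body into an associative array
$data = json_decode($response, true);

$titles = [];
if (!is_array($data)) {
    echo "Could not decode response: " . json_last_error_msg() . PHP_EOL;
} else {
    // organic_results may be missing, so fall back to an empty list
    foreach ($data['organic_results'] ?? [] as $result) {
        $titles[] = $result['title'];
    }
}

print_r($titles);
```

In the real script, $response would come from curl_exec($ch) instead of the hard-coded string.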
Async search implementation in PHP
Sometimes, you need to perform a "bulk search" or search for multiple keywords at once. We can implement this using the async option.
The results are retrieved later through SerpApi's Search Archive API, so make sure you're familiar with its basic endpoint. It looks like this:
https://serpapi.com/searches/$searchID.json?api_key=SECRET_API_KEY
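As a small sketch of how that URL could be assembled in PHP, the helper below fills in the search ID and API key. The function name buildArchiveUrl and the sample ID abc123 are made up for illustration.

```php
<?php
// Hypothetical helper: build the Search Archive URL for one search ID.
function buildArchiveUrl(string $searchId, string $apiKey): string
{
    // rawurlencode() keeps the ID safe as a path segment
    $query = http_build_query(array('api_key' => $apiKey));
    return "https://serpapi.com/searches/" . rawurlencode($searchId) . ".json?" . $query;
}

// 'abc123' is a made-up ID; real IDs come from search_metadata.id
echo buildArchiveUrl('abc123', 'SECRET_API_KEY') . PHP_EOL;
```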
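Here is a basic example. It reuses the cURL handle $ch from Step 1, so run it before curl_close() is called:
- Add an array, $searchQueue, that stores each search's temporary data keyed by its search ID
- Set async to true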
// Replace with your SerpApi API key
$API_KEY = 'YOUR_API_KEY';
$searchQueue = [];
$items = ['green tea', 'black tea', 'oolong tea'];
// Batch search in async mode
foreach ($items as $item) {
    // Set the data fields for GET request
    $fields = array(
        'api_key' => $API_KEY,
        'engine' => 'google',
        'q' => $item,
        'location' => 'Austin, Texas, United States',
        'google_domain' => 'google.com',
        'gl' => 'us',
        'hl' => 'en',
        'async' => true
    );
    // Build the URL query string
    $queryString = http_build_query($fields);
    curl_setopt($ch, CURLOPT_URL, "https://serpapi.com/search?" . $queryString);
    // Execute cURL session
    $response = curl_exec($ch);
    // Check for errors
    if (curl_errno($ch)) {
        echo 'Request Error:' . curl_error($ch);
        // Skip this item; there is no search id to store
        continue;
    }
    // Save the search id for later retrieval
    $searchResult = json_decode($response, true);
    if (isset($searchResult['search_metadata']['id'])) {
        $searchQueue[$searchResult['search_metadata']['id']] = $searchResult;
    }
}
In the example above, only the keyword is dynamic. Feel free to adjust other parameters based on your use case.
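If several parameters vary per search, one option is to keep the shared defaults in a helper and pass only the differences. This is a sketch rather than part of the original script; buildSearchFields is a hypothetical name, and the override shown (hl set to es) is just an example.

```php
<?php
// Hypothetical helper: shared defaults plus per-search overrides.
function buildSearchFields(string $apiKey, string $query, array $overrides = []): array
{
    $defaults = array(
        'api_key'       => $apiKey,
        'engine'        => 'google',
        'q'             => $query,
        'location'      => 'Austin, Texas, United States',
        'google_domain' => 'google.com',
        'gl'            => 'us',
        'hl'            => 'en',
        'async'         => true,
    );
    // Later arrays win in array_merge(), so overrides replace defaults
    return array_merge($defaults, $overrides);
}

// Same defaults as the loop above, but with a different interface language
$fields = buildSearchFields('YOUR_API_KEY', 'green tea', array('hl' => 'es'));
echo http_build_query($fields) . PHP_EOL;
```

Inside the foreach loop, the $fields array could then be replaced by a single call to this helper.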
So far, we have only stored the temporary search information in the $searchQueue array. Now, how do we retrieve the results?
Retrieve the async result
The idea is:
- Keep looping over the queue (since a search can take some time to finish).
- Keep checking whether each search's status has been updated to Success.
- If so, remove the ID from the queue (to make sure we don't end up in an infinite loop).
- Otherwise, keep checking the search status.
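The steps above can be sketched without any network calls by passing the status check in as a callable. Everything here is illustrative: pollQueue, the $maxRounds limit, the stub, and the IDs are hypothetical, and any status other than Success or Cached is simply treated as "not finished".

```php
<?php
// Hypothetical helper: poll until every ID is finished or $maxRounds is reached.
// $fetchStatus is any callable that returns the status string for an ID;
// in real code it would call the Search Archive API with cURL.
function pollQueue(array $searchIds, callable $fetchStatus, int $maxRounds = 10): array
{
    $pending = array_fill_keys($searchIds, true);
    $done = [];

    for ($round = 0; $round < $maxRounds && !empty($pending); $round++) {
        foreach (array_keys($pending) as $id) {
            $status = $fetchStatus($id);
            if ($status === 'Success' || $status === 'Cached') {
                unset($pending[$id]);   // finished: remove from the queue
                $done[] = $id;
            }
        }
        // A real poller would pause here, e.g. sleep(1), before the next round
    }

    return array('done' => $done, 'pending' => array_keys($pending));
}

// Stub standing in for the API: each ID succeeds on its second check
$calls = [];
$fakeStatus = function (string $id) use (&$calls): string {
    $calls[$id] = ($calls[$id] ?? 0) + 1;
    return $calls[$id] >= 2 ? 'Success' : 'Processing';
};

$result = pollQueue(['a1', 'b2'], $fakeStatus);
print_r($result);
```

Capping the number of rounds is a design choice that guards against a search that never finishes; the loop in this tutorial instead runs until the queue is empty.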
while (!empty($searchQueue)) {
    // Each queue key is the search id saved earlier
    foreach ($searchQueue as $searchID => $queued) {
        $fields = array(
            'api_key' => $API_KEY
        );
        // Access archive API endpoint
        $queryString = http_build_query($fields);
        curl_setopt($ch, CURLOPT_URL, "https://serpapi.com/searches/$searchID.json?" . $queryString);
        $response = curl_exec($ch);
        $search = json_decode($response, true);
        // Treat a failed or undecodable response as "not done yet"
        $status = $search['search_metadata']['status'] ?? 'Unknown';
        echo $status;
        if ($status == 'Success' || $status == 'Cached') {
            // Remove the search from the queue based on the search id
            unset($searchQueue[$searchID]);
            echo $searchID . " is done";
            echo "<br>";
            echo "First result title : " . $search['organic_results'][0]['title'];
            echo "<br>";
        } elseif ($status == 'Error') {
            // Drop failed searches too, otherwise the loop never ends
            unset($searchQueue[$searchID]);
            echo $searchID . " failed";
            echo "<br>";
        } else {
            echo $searchID . " is not done yet";
            echo "<br>";
        }
    }
    // Pause before the next round of checks
    sleep(1);
}
echo "All searches are done";
// Close cURL session
curl_close($ch);
?>
When a search's status is Success, I print the first organic result's title. Feel free to do anything else inside this condition block, since the full result data is already available there.
That's it! Feel free to try!