Piping public company financial statements into a system for analysis is a common problem financial analysts all encounter. The professionals are fortunate enough to have Bloomberg terminals with all their bells and whistles, but us amateurs must make do with spreadsheet solutions. Fortunately, SerpApi offers an API for scraping Google Finance and with our Google Sheets Extension, you don't have to be a programmer to scrape this data.
In this tutorial, I'll walk you through the process of fetching the three financial statements Google Finance provides for each public company:
- Income Statement
- Balance Sheet
- Cash Flow
By the way if you are a programmer, you might find this Google Finance API with Python guide of interest:
Google Sheets Setup
First, you'll need a Google account and a SerpApi account and API key to follow along with this tutorial. If you have a SerpApi account already you can find your API key here. Otherwise, please create an account.
- Create a new Google Sheet via https://docs.google.com/spreadsheets
- From the Extensions menu, click Add-ons, and then Get add-ons.
- Search for the SerpApi add-on.
- Install and approve the prompts Google will show you.
- Copy your API key.
- Set your API key by navigating through
Extensions->SerpApi->Setting->Update your API Key
and pasting your key in the pop up.
- If all went well, you'll see the message below.
- Finally, we'll run a test search. The structure of a search is
=SERPAPI_RESULT(serpapi_query_here, field_selector_here)
whereserpapi_query_here
represents the parameters of your search query andfield_selector_here
indicates the path to the field you want to display. We'll get more practice with this later.
query: "engine=google_finance&q=" & ENCODEURL("JPM:NYSE")
selector: "summary.stock"
Note: We wrapped JPM:NYSE
in the ENCODEURL
function to ensure special characters like :
are properly encoded in the request URL.
Copy the following formula into a cell:
=SERPAPI_RESULT("engine=google_finance&q=" & ENCODEURL("JPM:NYSE"), "summary.stock")
If all went to plan, you should see JPM
appear in the cell you pasted the formula.
Finding Our Google Finance Data
Our next preparatory step is to locate our data. For this, we'll take a quick detour to our API Playground to understand what the API response from the Google Finance API looks like.
Loading the link above takes us to a Playground page with our Google Finance API fetching data for JPMorgan Chase. The query is set to JPM:NYSE
because JPM
is the ticker for JPMorgan, and NYSE
is the code for stocks on the New York Stock Exchange.
If you're unsure about a company's stock ticker, use Google Finance to look up the company and click on the best match in the results.
From the company's Google Finance page, you can find the stock ticker and which stock exchange they're listed on. Note that you must use the format used in the URL bar such as stock_ticker:stock_exchange
, not the one in the Google Finance search bar with reverse ordering of stock_exchange:stock_ticker
. Simply put, JPM:NYSE
is valid, but NYSE:JPM
is not.
Now we can pop JPM:NYSE
into a Playground search to scrape their Google Finance data.
In the bottom right of the Playground page, we see the full JSON response body. I recommend collapsing the JSON tree to show only the top level fields so we can better focus on where we need to drill down for our financial statements.
Scrolling down the Google Finance results, we see a "Financials" section that corresponds to a financials
key in our JSON response body. Click the [ ... ]
next to financials
to expand that part of the response.
Within financials
, we find an array containing data for Income Statement, Balance Sheet, and Cash Flow. The sample below omits all but one financial metric per statement to make it more readable. We'll figure out how to iterate through each metric later.
"financials": [
{
"title": "Income Statement",
"results": [
{
"date": "Mar 2024",
"table": [
{
"title": "Revenue",
"description": "The total amount of income generated by the sale of goods or services related to the company's primary operations",
"value": "40.07B",
"change": "11.07%"
},
...
]
}
]
},
{
"title": "Balance Sheet",
"results":
[
{
"date": "Mar 2024",
"table": [
{
"title": "Cash and short-term investments",
"description": "Investments that are relatively liquid and have maturities between 3 months and one year",
"value": "1.63T",
"change": "14.52%"
},
...
]
}
]
},
{
"title": "Cash Flow",
"results": [
{
"date": "Mar 2024",
"table": [
{
"title": "Net income",
"description": "Company’s earnings for a period net of operating costs, taxes, and interest",
"value": "13.42B",
"change": "6.31%"
},
...
]
}
]
}
]
Let's now turn our attention to fetching this data in Google Sheets.
Getting the Reports into Google Sheets
Finally, we can get to the main event.
First, we'll dedicate a cell to input any stock ticker plus its exchange to be used as a lookup reference. We'll also update our initial test API call from the Google Sheets Setup section to reference this instead of hard-coding JPM:NYSE
.
Our input cell is B2
and we never want this to change so we'll replace ENCODEURL("JPM:NYSE")
with ENCODEURL($B$2)
where the $
tells Google Sheets not to change the cell reference if a formula is dragged or copied elsewhere in the sheet.
=SERPAPI_RESULT("engine=google_finance&q=" & ENCODEURL($B$2), "summary.stock")
As previously mentioned, we're interested in scraping three financial statements contained in the financials
array. In order to scrape the correct financial statement, we'll need the right index value.
"financials": [
{
"title": "Income Statement", # index 0
...
},
{
"title": "Balance Sheet", # index 1
...
},
{
"title": "Cash Flow", # index 2
...
}
]
It would be nice to have a dropdown menu with a list of the financial statements we support. Next to that, we'll add conditional logic to populate a cell with the appropriate index value depending on which financial statement is being requested.
You can create a Google Sheets dropdown by typing @dropdown
in a cell and clicking Dropdowns
. We'll put this in cell A4
.
Then clicking New dropdown
.
Then in the resulting menu, we'll add our three financial statements. You can also add a color for each option, but I'll skip that.
Finally, we want our dropdown to dynamically to populate a cell with the index of that financial statement. If you recall from the JSON data above, "financials"
is an array where the first element (index 0) is Income Statement, the second element (index 1) is Balance Sheet, and the third element (index 2) is Cash Flow. We'll need those numerical indexes to select the appropriate values.
We can dynamically set the index values using the following conditional Google Sheet statement. We'll paste this in B4
; next to our dropdown menu.
=IFS(A4="Income Statement", 0, A4="Balance Sheet", 1, A4="Cash Flow", 2)
If we did that correctly, B4
will update with the appropriate index of whichever report was selected in the dropdown menu.
Let's shift gears back to our JSON data.
As seen above, the Income Statement has a results
array containing date
and another array, table
, containing the financial metrics in the statement. We'll start by pulling out the date of the report stored in the date
field with the below formula and storing this in cell B6
.
=SERPAPI_RESULT("engine=google_finance&q=" & ENCODEURL($B$2) & "", "financials."& $B$4 &".results.0.date")
Note: We're using our dynamic $B$4
statement index from our dropdown logic. If we didn't have this, the formula would simply be "financials.0.results.0.date"
.
We'll also add some headers to describe each column of data. We'll also be using these column headers for something else a little later to make our spreadsheet more dynamic.
Next, we'll populate our first row of data. We can modify the previous formula replacing date
with table
followed by 0
to grab the first element in the table
array. Then we'll select for title
.
=SERPAPI_RESULT("engine=google_finance&q=" & ENCODEURL($B$2) & "", "financials." & $B$4 & ".results.0.table.0.title")
We also want to select the value
and change
fields. We do that by simply changing the selector value of the above formula as follows:
=SERPAPI_RESULT("engine=google_finance&q=" & ENCODEURL($B$2) & "", "financials." & $B$4 & ".results.0.table.0.value")
=SERPAPI_RESULT("engine=google_finance&q=" & ENCODEURL($B$2) & "", "financials." & $B$4 & ".results.0.table.0.change")
With these three formulas, we have our first row of data.
Let's take a moment to quickly test our dropdown menu by selecting another financial statement type. If we did everything correctly, the first row should update automatically by selecting Cash Flow
in the dropdown.
Looks good! Now we can work on scraping more rows of data.
Looking at our selector "financials." & $B$4 & ".results.0.table.0.title"
and the remaining data, we see we need to increment the index value after table
to 1
for the second value, 2
for the third value, and so on. Doing this manually will be very time-consuming and error prone so we'll solve for this by creating an index next to our table.
Now we can use each index value in the the corresponding formulas for each row of data by referring to $D8
for the first row with index 0
, $D9
for the second row with index 1
, etc.
We can also utilize our column headers to dynamically set the field selectors where A$7
refers to title
, B$7
refers to value
, and C$7
refers to change
.
Full formulas for the first row:
=SERPAPI_RESULT("engine=google_finance&q=" & ENCODEURL($B$2) & "", "financials." & $B$4 & ".results.0.table." & $D8 & "." & A$7)
=SERPAPI_RESULT("engine=google_finance&q=" & ENCODEURL($B$2) & "", "financials." & $B$4 & ".results.0.table." & $D8 & "." & B$7)
=SERPAPI_RESULT("engine=google_finance&q=" & ENCODEURL($B$2) & "", "financials." & $B$4 & ".results.0.table." & $D8 & "." & C$7)
All this is admittedly a bit complicated, but it's a worthwhile investment because now we can drag these formulas down to populate all the other rows of data.
Brilliant! It works. Let's try two other financial reports.
All good here.
Oops, we get an error in the last row of the Cash Flow statement explaining the field is not found. This happened because the Cash Flow statement has six rows of data instead of seven like the other reports.
To clean this up, we'll add some conditional logic to show empty cells if the selected statement is Cash Flow. We'll do this by checking if $B$4
is equal to 2
, and returning an empty string of ""
if true. If false, we'll execute our scrape Google Finance query. Here are the formulas for all three cells:
=IF($B$4=2,"",SERPAPI_RESULT("engine=google_finance&q=" & ENCODEURL($B$2) & "", "financials." & $B$4 & ".results.0.table." & $D14 & "." & A$7))
=IF($B$4=2,"",SERPAPI_RESULT("engine=google_finance&q=" & ENCODEURL($B$2) & "", "financials." & $B$4 & ".results.0.table." & $D14 & "." & B$7))
=IF($B$4=2,"",SERPAPI_RESULT("engine=google_finance&q=" & ENCODEURL($B$2) & "", "financials." & $B$4 & ".results.0.table." & $D14 & "." & C$7))
And the result:
This looks a lot better, but we still have that unsightly index in column D. As an optional finishing touch, we'll simply hide that column so we don't have to look at it and make it harder to accidentally edit.
Wrapping Up
You now have the basic setup to scrape Google Finance data with Google Sheets. There is plenty you could do next to make this spreadsheet even better. Here are some ideas to get you started:
* Add recent news results about the company
* Show the current stock price
* Add additional metrics like P/E ratio, market cap, and dividend ratio
* Add the ability to compare one company to another
You can also review our Google Finance API documentation and play around in our API Playground to identify other data points you can scrape from Google Finance.
If you have any questions about how our Google Finance API or any of our other scraping APIs work, please don't hesitate to contact us at contact@serpapi.com. We're available 24 hours a day, 5 days a week.
Happy scraping!