In this post I’ll show you how to hack into tons of top secret government data using Excel.
Ok, by “top secret“, I actually mean a database API made publicly available.
And by “hack into“, I actually mean use Excel’s From Web Power Query functionality to access the API.
Governments have a lot of data covering agriculture to renewable energy, and the cool thing is anyone can access it and they provide it all for free with an API. Data is always a valuable resource to have at hand when you’re trying to learn Excel too, so learning how to fish for your own data is definitely worthwhile.
Video Tutorial
Get Your API Key
This API requires an API Key. This is just like a password that allows you to use the API.
The key is free, but you do have to sign up for it. Go here https://api.data.gov/signup/ and fill in your details and press the Signup button.
After signing up, you should receive an email fairly shortly. Take note of the API key it contains as we will be using it later to access the API.
There are tons of different API’s you can explore on your own once you have your key and know how to use it. You can find them all from this page here https://api.data.gov/.
Alternative Fuel Station API
In this post we’ll be exploring this API https://developer.nrel.gov/docs/transportation/alt-fuel-stations-v1/nearest/ which return a list of the nearest alternative fuel station from a given location.
Let’s try out our example web request from the email with our API key. Copy this URL from the email and paste it into the address bar of any browser (I’m using Chrome) then press enter.
Notice the &location=Denver+CO
at the end of the URL. This means the API will return a list of stations near Denver Colorado. In fact, you can pass many other parameters to the API by appending them to the URL in a similar fashion. All the possible parameters are nicely documented on the government website. We’ll try some of these out later.
After submitting this request URL it will return a whole bunch of data. The output might not be that readable to you if you’ve never seen JSON before.
JSON Data Output
The output of this API is in a JSON format. JSON stands for JavaScript Object Notation and it’s a syntax for storing and exchanging hierarchical data.
[
{
"Name": "Joe Smith",
"Address": {
"Street Number": 123,
"Street Name": "Fake Street",
"Post Code": "H0H 0H0"
},
"Email": {
"Work": "Joe@myworkemail.com,
"Personal": "Joe@mypersonalemail.com"
}
}
]
Above is a simple example of a JSON data object for a person’s contact information. This object is actually an array of 3 objects (name, address and email). The address and email objects themselves are also arrays of objects. The address contains a street name, number and postal code and the email contains a work and personal email.
The cool thing is Power Query has JSON parsing functionality built in. We can easily turn something like this into a more familiar and readable table in just a few clicks, so we won’t need to try and decipher our API’s output on our own.
Creating an API URL
Just for fun, I’m going to get a bit fancy with some of the parameters available in the API to find 5 electric fuel stations that are currently open to the public and nearest to Toronto, Ontario in Canada. For this, I’ll use the following parameters and values.
- location=Toronto+ON This means station is located near Toronto Ontario.
- status=E This means the station is open and carries alternative fuel.
- access=public This means the station is open to the public.
- fuel_type=ELEC This means the station provides electric fuel.
- country=CA This means the station is in Canada.
- limit=5 This parameter will limit the results to the 5 nearest.
Combining all these parameters into the proper API request URL format results in the above URL. Now we’re ready to use this in Excel.
Calling the API from Excel
Now that we’ve seen the API in use from the browser and have created a fancier URL to call, let’s use a From Web power query to call it in Excel.
Go to the Data tab in the ribbon and select From Web under the Get & Transform Data section.
This can also be found under Get Data in the From Other Sources menu.
We only need to use the Basic query so you can pop your URL into the field and press the OK button. We can go with the default settings in the next pop up so just press the Connect button.
When the query editor opens, we can see that Excel has recognized the output as JSON and applied the JSON.Document parsing to the result.
We see the meta data from the results of the API, this is the data about the resulting data such as the total number of records matching our query parameters (even though we’ve limited the results to 5 of these).
Notice the fuel_station contains a List. Left click on this list to expand and show the items in the list.
When you expand the list it will show 5 records since we limited the query to 5 results. We want to turn this list of records into a proper table. Right click on the list and select the To Table option from the menu. You can also select To Table from the Transform tab of the editor.
Select None under the Select or enter delimiter option and press the OK button.
Now that we have converted our output to a table, we can expand the records. Left click on the column heading filter toggle.
Select the columns you want to appear in the final output. You can also uncheck the column heading prefix option if desired. Then press the OK button.
Awesome! We now have our data from the API in a familiar table format. Just Close & Load the query from the Home tab of the editor and we’re good to go!
Hi there,
I’m seeing an issue in PowerQuery where only the first 5 rows are returned using this method. Strangely if I copy and past the same query into Power BI it returns the entire contents returned by the API. Am I missing something??
That is intentional. The URL I used has a limit parameter.
https://developer.nrel.gov/api/alt-fuel-stations/v1/nearest.json?api_key=XXXXXXXXX&location=Toronto+ON&status=E&access=public&fuel_type=ELEC&country=CA&
limit=5You may have just omitted this part of the URL in Power BI.
Hello John,
How could this work with a website that doesn’t have the parameters on the URL and instead works as a POST website?
This is an example that requires month, location and other things: https://app.cfe.mx/Aplicaciones/CCFE/Tarifas/TarifasCREIndustria/Tarifas/DemandaIndustrialSub.aspx
Thank you very much for this great article!
I tried, but unfortunately I was never able to get any POST type API to work.
You can try yourself by using the advanced options of the web query (advanced options are found when entering the URL).
Hi,
Thanks for the tutorial!
I managed to consume the JSON API to do just that for 1 product item.
Now, I would like to issue several requests (1 per product item that I have) and merge all the results in the same table with the same transformation (same header for all lines). I don’t have a way to do this in bulk (JSON API I’m using prevents that…)
Any idea about how to accomplish that?
Thanks.
I’ve got a post about getting data from multiple websites that might be useful.
https://www.howtoexcel.org/power-query/how-to-extract-data-from-multiple-webpages/
Does this only work on a Windows version of Excel?
Yep, power query is currently only available on Excel for PC.
I need help. I want a URL to report a single number from a cell in a spreadsheet I have made. I want to be able to put this URL in excel (API query from internet) or a google sheet or table master in wix and simply display this number…I’m just missing something and I’m certain it is very simple, yet I can’t get it to display in the table! Click the “current price” tab at the top of the page.
Hard to know exactly what the problem is without seeing it. Maybe you need to drill down to a single value? Right click on a value in the Power Query editor and choose Drill Down.
Hi Damon and John,
I have the same question, but will try to make it more explicit.
In your example you use a number of values: “Toronto+ON”, “E” and so on.
https://developer.nrel.gov/api/alt-fuel-stations/v1/nearest.json?api_key=XXXXXXXXX&location= Toronto+ON &status= E &access=public&fuel_type=ELEC&country=CA&limit=5
What I think Damon and I would like to achieve is to use a value in a cell in Excel to use in these places.
I have a link with a bunch of “Parameter1=5&Parameter2=3&….” and would like to have the numbers be input from Excel itself.
Is this possible?
Thanks in advance!
Check out this video on how to create a table of parameter values.
https://www.youtube.com/watch?v=sdR2BI2e5Y8