WARNING!!! This post is full of POWER.
Power Query is a great tool for importing and transforming data in Excel. But unfortunately it’s missing a key feature out of the box to make it truly useful.
In this post we’re going to find out how to import data from multiple files in a folder using Power Query. To do this we will need certain conditions to be true.
- We’ll need Power Query installed. Read this post to find out how to install it.
- The data structure in each file must be the same.
- Same number of columns with the same column headings.
- Data is on the same sheet name in each file.
- We will need all the files we want to import located in the same folder.
For our example, we’re going to import 4 Excel files all from this location.
C:\Users\John\Google Drive - Excel\Excel Website\Power Query\How To Import All Files In A Folder With Power Query\Example
Each file is structure the same way. The column headings are all the same and in the same order. The data is in a sheet called Sheet1 for each file. Only the data is different for each file.
Step 1: Setting up the import query for one file.
Now we will set up the import query for one file.
- Go to the Power Query tab in the ribbon.
- In the Get External Data section press the From File button.
- Choose From Excel in the drop down.
Now choose a file to import. You can choose any of the files in this step as we are creating an import query that will work for all the files and all the files are structurally the same.
- Navigate to your folder and select any file.
- Press the OK button.
A preview of your data will appear in the Navigator window.
- Select the sheet that contains your data. In our case Sheet1. Check the data preview to make sure you have the correct data.
- Press the Edit button.
Step 2: Edit the query.
Power Query will create the import code (in a custom language called M) behind the scenes, but we will need to edit it slightly.
- Go to the View tab.
- Press the Advanced Editor button.
- You will now see the code that has been created for your import.
We can now edit the M code.
- Add the red bits of text into your code.
let GetFiles=(Path,Name) => let Source = Excel.Workbook(File.Contents(Path&Name), null, true), Sheet1_Sheet = Source{[Item="Sheet1",Kind="Sheet"]}[Data], #"Promoted Headers" = Table.PromoteHeaders(Sheet1_Sheet), #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Customer ID", Int64.Type}, {"Order ID", Int64.Type}, {"Item", type text}, {"Quantity", Int64.Type}, {"Unit Price", Int64.Type}, {"Total", Int64.Type}}) in #"Changed Type" in GetFiles
- Press the Done button.
- Rename the query to functionGetFiles.
- Go to the Home tab.
- Press the Load & Close button.
We have just changed the query to a function that takes a file path and name as inputs then returns the data from that file path and name.
Step 3: Create a query on the folder containing your files.
Now we will create a query on the folder containing all our files.
- Go to the Power Query tab.
- Press the From File button.
- Choose From Folder in the drop down.
- Browse to or enter the path of the folder.
- Press the OK button.
This type of query outputs data about all the files in a folder such as the file name, location, date modified etc… A list of your files will be previewed and we can then press the Edit button if this preview looks ok.
Step 4: Add a column with your custom function.
Now we will add a column to this query.
- Go to the Add Column tab.
- Press the Add Custom Column button.
- Change the name of the column to GetData.
- Enter this formula into the Custom column formula.
=functionGetFiles([Folder Path],[Name])
- Press the OK button.
This will call the function we created in step 2.
Now we need to expand the GetData column we just created to show its results.
- Click on the expand and filter icon in the column heading of our new GetData column.
- Select the expand radio button.
- Press the OK button.
- Now go to the Home Tab.
- Press the Close & Load button.
Now our data will be imported from all the files in our folder and combined into one table. If we add files to our folder later on, we can import these also by simply refreshing the query. Wow, the POWER!
- Here is our data from all 4 files combined into the same table.
- If we add files to our folder and want to import these as well, go to the Data tab.
- Press the Refresh All button.
Can you do this with .csv files?
Yes, but in step 1 select From CSV instead of From Excel. The rest of the steps will be the same.
This doesn’t seem to work on Excel 2016. It asks to provoke a parameter, am I missing something?
Also, where does the Path&Name in this sequence Source = Csv.Document(File.Contents(“filepath\filename.csv”),[Delimiter=”,”, Columns=4, Encoding=1252, QuoteStyle=QuoteStyle.None]), go?
I appreciate it!
I don’t have access to 2016 at the moment but I’ll try it out with 2013 and CSV files and let you know. Try this method in the meantime instead How To Import Multiple Files With Multiple Sheets In Power Query.
I tried it out and it worked fine for me with CSV files. My Advice is start over and follow along step by step as sometimes the error that was made becomes a blind spot.
This was the M code I had for the function.
let GetFiles=(Path,Name) =>
let
Source = Csv.Document(File.Contents(Path&Name),[Delimiter=”,”, Columns=6, Encoding=1252, QuoteStyle=QuoteStyle.None]),
#”Promoted Headers” = Table.PromoteHeaders(Source),
#”Changed Type” = Table.TransformColumnTypes(#”Promoted Headers”,{{“Customer ID”, Int64.Type}, {“Order ID”, Int64.Type}, {“Item”, type text}, {“Quantity”, Int64.Type}, {“Unit Price”, Int64.Type}, {“Total”, Int64.Type}})
in
#”Changed Type”
in GetFiles
Hello, I am getting the following message after trying to enter the custom column:
“Expression.Error: The name ‘functionGetFiles’ wasn’t recognized. Make sure it’s spelled correctly.”
Yes, make sure it’s spelled exactly the same in the function query name as it is when you call it in the custom column. Use copy and paste to make sure it is the same.
Hello John,
I’ve followed your guide step by step 3 times (download till finish Step 4 > Expend > Close & Load), and I got those data, but error message also come : “Initialization of the data source failed. Check the database server or contact your database administrator. Make sure the external database is available, and then try the operation again. If you see this message again, create a new data source to connect to the database.”
PS: – Ms. Excel 2016
– i didn’t try M code that you replied on 2017-01-22 at 10:42 . and didn’t try another article : How To Import Multiple Files With Multiple Sheets In Power Query
It sounds like an issue with access to the folder you’re trying to use. Try copying the folder to your desktop and use that location to pull the files from instead.
First off, WOW!! Thank you so much. It is my first time really using the PowerBI ( though I have a book and some video lessons to follow up) and I feel my life is about to change, for the better. I did have to figure out that I should ignore the parameters, and a few other things but no big. I think it might be that things have been changing at lighting speed over at MS.
However, I do have a question about how to handle this with almost similar invoice, (a few columns are different) and therefore pull up as an error. How would I be able to almost do an if function or just a skip if not there?
Apologies when I mean invoices, its because it is the xml docs I am trying to convert. So really I should have said any documents that are slightly different columns/format.
Yes, power query is awesome and still amazes me with what it can do.
I’m not sure I really understand your problems. Power query does have an XML parser, so I take it you’re using that and then trying to combine the results from a bunch of these files but some are missing columns?
It definitely sounds like a good blog post topic. Unfortunately, I don’t know the solution off hand. I did recently write a post about a similar situation that you may find useful. I encountered missing columns but wasn’t dealing with XML or combining data from multiple files in a folder.
https://www.howtoexcel.org/power-query/how-to-deal-with-changing-data-formats-in-power-query/
WOW!
Impressive article. Thank you so much. I was able to load a big amount of data using get data From Folder but the automatic code generated is so complex that it was difficult for me to do any kind of transformation to the data afterwards.
This way is perfect. M code is simple and controlled and I can do any kind of transformations first to the single file import and after, to the whole input.
Yep, power query is awesome! Thanks Microsoft for making an amazing product.
I love the tool, Is there any way to get around the query failing when there is a file open in the folder? (one out of 180)
This was a great lesson in power query but i have a slightly different scenario that i would like to apply this to but can’t seem to get it.
i have a csv file for every day of the month name for the date in the same folder, for example 2018.09.01.csv, 2018.09.02.csv, etc… .
I would like to load the data from each one of these files into a different worksheet using the same query.
so my first worksheet, called Summary, in column A had the dates in the same format as the filename, so A2=2018.09.01, A3=2018.09.02, etc.. the next sheet is named 1 for first day of month, the next one named 2, etc…. My summary worksheet will have additional columns with functions that will summarize the data from all the other day of month worksheets
I would like to have 1 power query that uses the values in column A as the file source to populate all the other worksheets. so in a new month i just have to change the month in column A from 09 to 10 in this example. so the Worksheet for day of month 1, also named 1, will will use the value in A2 as the source filename in the query, worksheet 2 will use A3 as source filename, etc…
I understand the the queries may need to be unique in each day of month worksheet to reference the correct cell is summary and i don’t mind defining it once in each of the 31 worksheets but i don’t want to have to change all 31 worksheets power query every month, just change the month in column A of the summary once a month.
Any help is greatly appreciated.
You need to set up a parameter table. Create the query and in the last step filter on the correct day. Copy and paste that query 31 times and then reference each parameter for the filter.
I’m always curious why people separate data. Maybe a better approach is to use load all the data into a pivot table and then you can look at whatever part you want in a dynamic way?
Either way good luck with it!
Hi John, I am using Excel 2016. The section where you create a query on the folder containing all files. You are seeing:
Load, Edit and Cancel
For me, I am unable to see Edit, all I am seeing are:
Combine, Load, Transform and Cancel. There is no Edit so I am unable to filter out the unwanted data file of different type.
I tried selected Transform and it brings up the Power Query Window but not matter what I selected in the Power Query window, nothing happens, Query window seems to freeze and I have to kill the task with Ctrl-Alt-Delete
Do you know why?
Yes, they updated the UI since I wrote this post. Transform is the new Edit button. You can also click on the combine button and choose combine and edit (also new).