Imagine we are using power query to clean data and need to replace values as part of the transformation process.
For each value which we want to replace, we have to create a Replace Values step in the query.ย If we have tens or even hundreds of values to replace, then it can be quite tedious.
If we already have a list of values which we need to replace, then it would be easier if we could do it all in one step based on the list!
In this post we’re going to learn how to do a bulk find and replace in power query based on a list of values.
Video Tutorial
The Setup
In this example, we have a small table with one column called Job Title and the table has been named MyData.ย This is the data we’re going to transform with our find and replace function.
We also have a table with two columns called Find and Replace and it’s been named MyFindReplace. This is a table of the pairs of items to find and replace.
The columns can actually be named anything, as we won’t be referring to them by name. The important thing is the find column is on the left and the replace column is on the right.
We then need to import these tables into power query. Go to the Data tab then press the From Table/Range command to import them into the power query editor.
Using Replace Values In Power Query
When inside the power query editor, we can perform a Replace Values step from either the Transform tab or the Right Click menu.
Select the column which we want to replace values in and go to the Transform tab and press the Replace Values command.
We can also access this command by right clicking on the column heading then selecting Replace Values from the menu.
Either of these options will open up the Replace Values menu and we can then enter the Value To Find and the value to Replace With.
This will then replace every instance of this in the entire column.
= Table.ReplaceValue(#"Changed Type","Text to find","Text to replace",Replacer.ReplaceText,{"Job Title"})
If we look at the M code that is generated from this, we can see it is using the Table.ReplaceValue power query function to perform the replacement.
We will use this function for our solution as well, but there is no easy way to use this function based on a list of values. We will need to create our own custom query function for this.
M Code For The Query Function
let BulkReplace = (DataTable as table, FindReplaceTable as table, DataTableColumn as list) =>
let
//Convert the FindReplaceTable to a list using the Table.ToRows function
//so we can reference the list with an index number
FindReplaceList = Table.ToRows(FindReplaceTable),
//Count number of rows in the FindReplaceTable to determine
//how many iterations are needed
Counter = Table.RowCount(FindReplaceTable),
//Define a function to iterate over our list
//with the Table.ReplaceValue function
BulkReplaceValues = (DataTableTemp, n) =>
let
//Replace values using nth item in FindReplaceList
ReplaceTable = Table.ReplaceValue(
DataTableTemp,
//replace null with empty string in nth item
if FindReplaceList{n}{0} = null then "" else FindReplaceList{n}{0},
if FindReplaceList{n}{1} = null then "" else FindReplaceList{n}{1},
Replacer.ReplaceText,
DataTableColumn
)
in
//if we are not at the end of the FindReplaceList
//then iterate through Table.ReplaceValue again
if n = Counter - 1
then ReplaceTable
else @BulkReplaceValues(ReplaceTable, n + 1),
//Evaluate the sub-function at the first row
Output = BulkReplaceValues(DataTable, 0)
in
Output
in
BulkReplace
In a previous post about replicating Excel’s TRIM function in power query, we went through the steps to creating a query function. We will need to follow the same steps with the above M code.
This query function takes 3 arguments.
- DataTable is the table that contains the columns which we want to find and replace values in.
- FindReplaceTable is a two column table. The first column contains values to find and the second column contains values to replace them with. Each row in the table consists of one pair of find and replace values.
- DataTableColumn is a list of the column names which we want to find and replace values in.
The function then converts ourย FindReplaceTable to a list of find and replace pairs and we iterate through them and apply aย Table.ReplaceValue function to each pair.
How To Use This Function
How we use this query function will be a bit different than ourย TRIM functionย example. In that case, we applied the function to each row in a column by adding a custom column.
To use this function, we need to apply it to the entire column.
We need to add a query step. From the query we want to use this function in, we can click on the small fx icon to the right of the formula bar.
= fBulkReplace(#"Changed Type", MyFindReplace, {"Job Title"})
We can then enter the above function.
The first argument will reference the previous step in the query. In our case this was a step named Changed Type.
The second argument will reference the name of the query which contains the table of find and replace values. In our case this was called MyFindReplace.
The third argument will reference the name of the column we want to replace values in. It needs to be in a list format using curly braces. In our case this wasย {"Job Title"}
.
Note, that we could apply the find and replace to multiple columns using this argument. For example, placing {"Job Title", "Job Description"}
in the third argument would apply the find and replace to both the Job Title and Job Description columns.
Conclusions
With some ingenuity, we are able to create a function which can perform bulk find and replace steps based on a list of values.
The key is to create a recursive function based on the index of the list of find and replace values.
We can even use this solution to perform the find and replace on multiple columns simultaneously to save even more time.
Another great use of recursive query functions in power query!
Thanks for this. It is just great. However, I have in my data X and XY or XZ. I wanted the function to replace only the same cell content, hence only X and not the X in XY.
Would it be possible for the function to replace the entire cell content instead of the only part of the string?
Yes, I think it is, but haven’t tested it.
In the query function, replace any instance of Replacer.ReplaceText with Replacer.ReplaceValue.
Thanks John. I tried this, but might be because the table of “MyFindPlace” has about 1000 rows, and “MyData” has 2000 rows, the excel dead after I applyed the function in query. Is that because huge data reason or something else?
Yes, it that might be too many replacements. Recursive functions are not considered “best practice”.
Hi John
This is awesomeโฆ but how do I get data cleaned in my original table and not just create a new one?
Either adding a column with cleaned data to existing or replacing original data.
Is that possible at all?
(I do not see the same box when as you do when connecting in the end
What is your data source?
My Original data is an excel file cleaned via Power Query
Some Material IDs though are not correct, so I have to replace those and either add a new cleaned Column or update the original column
I really like your version where I can add future wrong Material IDs to be cleaned, so really want to use this to do
This should actual work:
= Table.AddColumn(#”Changed Type2″, “SP Material ID Cleaned”, each if [SP Supplier Material ID Original] = “11-0016” then 110016 else if [SP Supplier Material ID Original] = “20-4038” then “22-4038” else [SP Supplier Material ID Original])
And just adding manaully the text diff. It is an easy one
๐
My Original data is an excel file cleaned via Power Query
Some Material IDs though are not correct, so I have to replace those and either add a new cleaned Column or update the original column
I really like your version where I can add future wrong Material IDs to be cleaned, so really want to use this to do
Br
Jette
Hey John,
I love this! It has helped me out tremendously.
However, I am running into a bit of an issue…..
I am trying to FIND
BAR ALMOND BUTTER
BAR ALMOND BUTTER BOX 8CT
Replace W/
ALMOND BUTTER
However, after I conduct the load, only BAR ALMOND BUTTER changes…. No Changes are made to BAR ALMOND BUTTER BOX 8CT.
I found this post when trying to find an answer – https://www.thebiccountant.com/2016/05/22/multiple-replacements-in-power-bi-and-power-query/
Is there anyway you could apply what is found above to your formula?
Here is a sample of the table.
Find // Replace
BAR ALMOND BUTTER // ALMOND BUTTER
BAR ALMOND BUTTER BOX 8CT // ALMOND BUTTER
I can confirm your bulk replace function works whenever strings are no similar.
In your case I think you can just re-order the table. Try this as your table instead:
Find // Replace
BAR ALMOND BUTTER BOX 8CT // ALMOND BUTTER
BAR ALMOND BUTTER // ALMOND BUTTER
But a more fool-proof way is in the query function, replace any instance of Replacer.ReplaceText with Replacer.ReplaceValue. This will find only replace the full value and not just a part of the value.
HI John,
Thanks for the solution.
Can i use the same M Code for my data, i tried to work in the same way as you shown in video and pics.
But i didnt worked.
Awaiting for your response.
Regards,
pavan
It works, try again.
Hi John,
Thanks for the code. It worked like a charm replacing values on a specific column. However, when I want to apply the function to another column in the same query, the original column reverts back to the original values. How can I tweek the code so that the function can be applied on multiple columns at once? Thanks in advance.
Select two columns and perform a replace values from the UI. Look at the m code this generates and adjust the code based on that.
Hi John, many thanks for sharing your knowledge. I an suffering some issue. When I run the function an unexpected erro is showed: Expression .Error: function expects 5 argument (Details: Pattern= , Arguments=[List]Can you pl help? Thanks.
I can’t start to describe how I hate this post, lured me with what I wanted to achieve and then blahblahed into nothing, I hate it
Thank you! This worked like a charm.
Good to hear!