In this post we’re going to explore how to find and remove multiple text strings from within another string.
Excel has a great built in function called SUBSTITUTE which allows you to find one bit of text within another text string and substitute it for another bit of text.
Copy and paste this table into cell A1 in Excel
Text | Revised Text |
---|---|
Ge%t rid% of thes%e perce%nt sig%ns | =SUBSTITUTE(A2,"%","") |
In the above example we can use the SUBSTITUTE function to clean the text of all the percentage signs by replacing them with an empty string (“”) in the following formula.
=SUBSTITUTE(A2,"%","")
Copy and paste this table into cell A1 in Excel
Text | Revised Text |
---|---|
G$$e%t ri$d$% of thes$$$%e pe$rce%nt and dol$lar sig%ns | =SUBSTITUTE(SUBSTITUTE(A2,"$",""),"%","") |
Now if we had add in another rogue character we wanted to get rid of ($) we could get rid of both characters by using a nested SUBSTITUTE formula.
=SUBSTITUTE(SUBSTITUTE(A2,"$",""),"%","")
As we add more and more items we want to remove we need to nest more and more SUBSTITUTE functions and this will become unmanageable. So instead we will create a user defined function in VBA to simplify this.
Function REMOVETEXTS(strInput As String, rngFind As Range) As String
Dim strTemp As String
Dim strFind As String
strTemp = strInput
For Each cell In rngFind
strFind = cell.Value
strTemp = Replace(strTemp, strFind, "")
Next cell
REMOVETEXTS = strTemp
End Function
Copy and paste this table into cell A1 in Excel
Text | Revised Text |
---|---|
T%%his$$ te$xt go%t | =REMOVETEXTS(A2,$D$2:$D$5) |
m#esse#d u#p s%ome!!!how | =REMOVETEXTS(A3,$D$2:$D$5) |
N#$%o!t sure !$h$%ow $% | =REMOVETEXTS(A4,$D$2:$D$5) |
but I !nee%d $to fix it | =REMOVETEXTS(A5,$D$2:$D$5) |
With this user defined function we can easily take care of the mess with a simple looking formula without nesting multiple SUBSTITUTE functions.
=REMOVETEXTS(A2,$D$2:$D$5)
Where $D$2:$D$5 is a range containing the characters we want to remove (#, $, % and !).
this is awesome. But, how do you prevent it from making partial matches? Ie. if I wanted to remove “the” but not “the” from “them” or “there”. Another example would be “or” and “doctor.” With your script, if you wanted to exclude just “or” from a list of text strings, it would output “doct”.
nevermind, I found a workaround… wrap the words you want to be excluded in spaces. Ie. ” or ” instead of “or”.
Yes, wrapping the text with spaces should solve most problems. Thanks for the tip!