Extract all partial matches in Excel
This tutorial shows how to extract all partial matches in Excel, using the example below.
To extract all matches based on a partial match, you can use an array formula based on the INDEX and AGGREGATE functions, with support from ISNUMBER and SEARCH. In the example shown, the formula is entered in G5 and relies on the following named ranges: “search” = D5, “ct” = D8, “data” = B5:B55.
Note: this is an array formula, but it does not require control + shift + enter, since AGGREGATE can handle arrays natively.
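Putting together the pieces described in this tutorial, the formula in G5 can be sketched as follows. This is a reconstruction from the description, not a verified copy of the worksheet. In particular, using ROW($B$5) to make the row numbers relative assumes that “data” starts in B5, and column F is assumed to hold the counters 1, 2, 3, and so on.

```
=IF(F5>ct, "", INDEX(data, AGGREGATE(15, 6, (ROW(data)-ROW($B$5)+1)/ISNUMBER(SEARCH(search, data)), F5)))
```

Copied down the extract area, each row would then return the next matching value, or an empty string once the counter in column F exceeds “ct”.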
How this formula works
The core of this formula is the INDEX function, with AGGREGATE used to figure out the “nth match” for each row in the extract area:
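In outline, the structure looks like this, where nth_match_position is a hypothetical placeholder used only for illustration (it is not a real range or function):

```
=INDEX(data, nth_match_position)
```

AGGREGATE supplies the value that stands in for nth_match_position, so INDEX only has to return the item at that position in “data”.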
Almost all of the work is in figuring out which rows in “data” match the search string, and reporting the position of each matching value to INDEX. This is done with the AGGREGATE function configured like this:
The first argument, 15, tells AGGREGATE to behave like SMALL and return the nth smallest value. The second argument, 6, is an option to ignore errors. The third argument is an expression that generates an array of matching results (described below). The fourth argument, F5, acts like “k” in SMALL to specify the “nth” value.
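A sketch of that configuration, with the third argument abbreviated to array_expression (a hypothetical placeholder, not a real range or function):

```
=AGGREGATE(15, 6, array_expression, F5)
```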
AGGREGATE operates on arrays, and the expression below builds an array for the third argument inside AGGREGATE:
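One way to write such an expression, assuming the first cell of “data” is B5, so that subtracting ROW($B$5) and adding 1 yields relative row numbers that start at 1:

```
(ROW(data)-ROW($B$5)+1)/ISNUMBER(SEARCH(search, data))
```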
Here, the ROW function is used to generate an array of relative row numbers, and ISNUMBER and SEARCH are used together to match the search string against values in the data, which generates an array of TRUE and FALSE values.
The clever bit is to divide the row numbers by the search results. In a math operation like this, TRUE behaves like 1, and FALSE behaves like zero. As a result, row numbers associated with a positive match are divided by 1 and survive the operation, while row numbers associated with non-matching values are destroyed and become #DIV/0! errors. Because AGGREGATE is set to ignore errors, it skips the #DIV/0! errors and returns the “nth” smallest number among the remaining values, using the number in column F for “nth”.
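To make the division step concrete, here is a small trace using made-up values. The five data values and the search string “app” are hypothetical and chosen only for illustration; they do not come from the example worksheet.

```
data (hypothetical):      {"apple";"banana";"pineapple";"cherry";"grape"}
relative row numbers:     {1;2;3;4;5}
ISNUMBER(SEARCH("app")):  {TRUE;FALSE;TRUE;FALSE;FALSE}
rows / search results:    {1;#DIV/0!;3;#DIV/0!;#DIV/0!}

AGGREGATE(15, 6, ..., 1)  returns 1, so INDEX returns "apple"
AGGREGATE(15, 6, ..., 2)  returns 3, so INDEX returns "pineapple"
```

Only the row numbers of matching values survive as numbers, which is why the nth smallest surviving number is also the position of the nth match.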
Like all array formulas, this formula is “expensive” in terms of resources with a large data set. To minimize performance impacts, the entire INDEX and AGGREGATE formula is wrapped in IF like this:
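A sketch of the wrapper, with the INDEX and AGGREGATE portion abbreviated to index_aggregate_formula (a hypothetical placeholder, not a real range or function), and the empty string assumed as the value returned for rows beyond the last match:

```
=IF(F5>ct, "", index_aggregate_formula)
```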
where the named range “ct” (D8) holds this formula:
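One plausible formula for “ct”, offered as an assumption and not as the worksheet's actual formula, counts the matching values with COUNTIF and wildcards around the search string. Both COUNTIF and SEARCH ignore case, so for text values the count should agree with the number of matches found by the main formula.

```
=COUNTIF(data, "*"&search&"*")
```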
This check stops the INDEX and AGGREGATE part of the formula from running once all matching values have been extracted.
Array formula with SMALL
If your version of Excel does not have the AGGREGATE function, you can use an alternative formula based on SMALL and IF:
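A sketch of such a formula, built on the same assumptions as before (“data” starts in B5, column F holds the counter, and “ct” holds the number of matches). Here IF returns the relative row number for matching values and FALSE otherwise, and SMALL ignores the FALSE values when picking the nth smallest number:

```
=IF(F5>ct, "", INDEX(data, SMALL(IF(ISNUMBER(SEARCH(search, data)), ROW(data)-ROW($B$5)+1), F5)))
```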
Note: this is an array formula and must be entered with control + shift + enter.