Multiple matching between two vectors. Unlike R's native match function,
which returns only the first match even when there are several,
mmatch returns all of them.
mmatch(x, table, nomatch = NA_integer_)
A list of the same length as the input vector x. Each list item
contains the matching indices (similar to match).
Multiple matches can be useful in many cases, and there is no native R
function for this purpose. Users can write their own functions combining
lapply with match or %in%. In our experience, however, such
non-vectorized functions can be extremely slow, especially when the
x or table vector gets longer.
mmatch delegates the multiple-matching task to a C-level function,
which is optimized for speed. Internal benchmarking shows an improvement
of about a hundredfold: mmatch takes about 1/100 of the time used by a
pure-R implementation.
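For illustration, the kind of pure-R implementation referred to above might look like the following minimal sketch. The name mmatch_r is hypothetical and not part of the package. The sketch assumes that elements without any match receive the nomatch value, and it returns indices in ascending order, which may differ from the order returned by mmatch.

```r
## Hypothetical pure-R counterpart of mmatch (illustration only; this is
## not the C-level implementation that mmatch uses).
mmatch_r <- function(x, table, nomatch = NA_integer_) {
  res <- lapply(x, function(xi) {
    ## all positions in 'table' equal to the current element;
    ## %in% treats NA as matching NA, as match does
    idx <- which(table %in% xi)
    ## assumption: fall back to 'nomatch' when nothing matches
    if (length(idx) == 0L) nomatch else idx
  })
  names(res) <- x
  res
}

vec1 <- c("HSV", "BVB", "FCB", "HSV", "BRE", "HSV", NA, "BVB")
vec2 <- c("FCB", "FCN", "FCB", "HSV", "BVB", "HSV", "FCK", NA, "BRE", "BRE")
res <- mmatch_r(vec1, vec2)
res[[1]]  # positions of "HSV" in vec2
```

Because the function scans the whole table once per element of x, its running time grows with the product of the two vector lengths, which is consistent with the slowdown described above for longer inputs.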
match
vec1 <- c("HSV", "BVB", "FCB", "HSV", "BRE", "HSV", NA, "BVB")
vec2 <- c("FCB", "FCN", "FCB", "HSV", "BVB", "HSV", "FCK", NA, "BRE", "BRE")
mmatch(vec1, vec2)
#> $HSV
#> [1] 6 4
#>
#> $BVB
#> [1] 5
#>
#> $FCB
#> [1] 1 3
#>
#> $HSV
#> [1] 6 4
#>
#> $BRE
#> [1] 9 10
#>
#> $HSV
#> [1] 6 4
#>
#> $<NA>
#> [1] 8
#>
#> $BVB
#> [1] 5
#>
## compare to match
match(vec1, vec2)
#> [1] 4 5 1 4 9 4 8 5