strsplit {base} | R Documentation |
Split the elements of a character vector x into substrings according to the presence of substring split within them.
strsplit(x, split, extended = TRUE, fixed = FALSE, perl = FALSE)
x |
character vector, each element of which is to be split. Other inputs, including a factor, will give an error. |
split |
character vector (or object which can be coerced to such) containing regular expression(s) (unless fixed = TRUE) to use as “split”. If empty matches occur, in particular if split has length 0, x is split into single characters. If split has length greater than 1, it is recycled along x. |
extended |
logical. If TRUE, extended regular expression matching is used, and if FALSE, basic regular expressions are used. |
fixed |
logical. If TRUE, match split exactly as a literal string; otherwise use regular expressions. |
perl |
logical. Should Perl-compatible regular expressions be used? Takes priority over extended. |
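As a rough illustration of how x, split and fixed interact, the sketch below splits two strings that use different separators, relying on split being recycled along x. The input strings and variable names are made up for illustration; fixed = TRUE is used so that "." is treated literally rather than as a regular expression.

```r
# Hypothetical input: two strings that use different separators
x <- c("a.b.c", "d-e-f")

# One separator per element of x; fixed = TRUE treats each one literally
parts <- strsplit(x, c(".", "-"), fixed = TRUE)
print(parts)

# A length-one 'split' is recycled, so "-" is applied to both elements;
# the first string contains no "-" and is returned as a single piece
recycled <- strsplit(x, "-", fixed = TRUE)
print(recycled)
```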
Argument split will be coerced to character, so you will see uses with split = NULL to mean split = character(0), including in the examples below.
Note that splitting into single characters can be done via split = character(0) or split = ""; the two are equivalent as from R 1.9.0.
A missing value of split does not split the corresponding element(s) of x at all.
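The special values of split described above can be checked with a short sketch. The strings and variable names here are illustrative only, and NA_character_ stands in for a missing value of split.

```r
s <- "abc"

# split = NULL is coerced to character(0): split into single characters
chars_null <- strsplit(s, NULL)[[1]]

# split = "" behaves the same way
chars_empty <- strsplit(s, "")[[1]]

# A missing 'split' leaves the corresponding element of x unsplit
unsplit <- strsplit("a,b", NA_character_)[[1]]

print(chars_null)
print(identical(chars_null, chars_empty))
print(unsplit)
```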
A list of length length(x), the i-th element of which contains the vector of splits of x[i].
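A minimal sketch of the shape of the result, using made-up input: the list has one element per element of x, and each element is a character vector whose length depends on how many pieces that string produced.

```r
# Hypothetical input with a different number of words in each element
x <- c("one two", "three four five")
res <- strsplit(x, " ")

# One list element per element of x
print(length(res) == length(x))

# The pieces of x[2]
print(res[[2]])

# Number of pieces per element
n_pieces <- sapply(res, length)
print(n_pieces)
```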
The standard regular expression code has been reported to be very slow when applied to extremely long character strings (tens of thousands of characters or more): the code used when perl = TRUE seems much faster and more reliable for such usages.
The perl = TRUE option is only implemented for single-byte and UTF-8 encodings, and will warn if used in a non-UTF-8 multibyte locale.
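For a long input, switching to perl = TRUE only requires passing the argument. The sketch below builds an artificial string of roughly 85,000 characters (the content is invented for illustration) and splits it on commas and semicolons; it makes no claim about timings, which will vary by R version and platform.

```r
# Build an artificial long string: 5000 copies of a three-field record
long <- paste(rep("alpha,beta;gamma", 5000), collapse = ",")
print(nchar(long))

# Split on either separator using the Perl-compatible engine
pieces <- strsplit(long, "[,;]", perl = TRUE)[[1]]
print(length(pieces))
print(head(pieces, 3))
```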
See also paste for the reverse, grep and sub for string search and manipulation; further nchar, substr.
See regular expression for the details of the pattern specification.
noquote(strsplit("A text I want to display with spaces", NULL)[[1]])

x <- c(as = "asfef", qu = "qwerty", "yuiop[", "b", "stuff.blah.yech")
# split x on the letter e
strsplit(x, "e")

unlist(strsplit("a.b.c", "."))
## [1] "" "" "" "" ""
## Note that 'split' is a regexp!
## If you really want to split on '.', use
unlist(strsplit("a.b.c", "\\."))
## [1] "a" "b" "c"
## or
unlist(strsplit("a.b.c", ".", fixed = TRUE))

## a useful function: rev() for strings
strReverse <- function(x)
    sapply(lapply(strsplit(x, NULL), rev), paste, collapse = "")
strReverse(c("abc", "Statistics"))

## get the first names of the members of R-core
a <- readLines(file.path(R.home(), "AUTHORS"))[-(1:8)]
a <- a[(0:2) - length(a)]
(a <- sub(" .*", "", a))
# and reverse them
strReverse(a)