iconv {base}    R Documentation

Convert Character Vector between Encodings

Description

This uses system facilities to convert a character vector between encodings: the ‘i’ stands for ‘internationalization’.

Usage

iconv(x, from, to, sub=NA)

iconvlist()

Arguments

x A character vector.
from A character string describing the current encoding.
to A character string describing the target encoding.
sub A character string. If not NA, it is used to replace any non-convertible bytes in the input. (This would normally be a single character, but can be more.) If "byte", each non-convertible byte is indicated as "<xx>", where xx is the hex code of the byte.

Details

The names of encodings, and which ones are available (if indeed any are), are platform-dependent. On systems that support R's iconv you can use "" for the encoding of the current locale, as well as "latin1" and "UTF-8".
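As a minimal sketch of using "" for the current locale's encoding (the variable names are illustrative, and the first result depends on the locale in which the code runs):

```r
# Sketch only: the result of the first call depends on the current locale.
x <- "fa\xE7ile"                          # bytes in latin1
to_locale <- iconv(x, "latin1", "")       # "" means the current locale's encoding
from_locale <- iconv("abc", "", "UTF-8")  # pure ASCII input is expected to pass through
```

If the locale's encoding cannot represent the character, to_locale will be NA, as described later in this section.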

On many platforms iconvlist provides an alphabetical list of the supported encodings. On others, the information is on the man page for iconv(5) or elsewhere in the man pages (and beware that the system command iconv may not support the same set of encodings as the C functions R calls). Unfortunately, the names are rarely common across platforms.
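Because names differ across platforms, it can help to search the list rather than guess a name. The following is a sketch only: the search pattern is an assumption about how a Latin-2 encoding might be named, and the result may be empty on platforms where iconvlist provides no list.

```r
# List whatever encoding names this platform reports.
encs <- iconvlist()

# Look for names that resemble Latin-2 (the pattern is illustrative only).
latin2_names <- grep("8859[-_]?2$|latin2", encs,
                     ignore.case = TRUE, value = TRUE)
latin2_names
```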

Elements of x which cannot be converted (perhaps because they are invalid or because they cannot be represented in the target encoding) will be returned as NA unless sub is specified.
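One way to use this behaviour is to locate the elements that failed and reconvert only those with sub. This is a sketch assuming the platform's iconv reports failure for latin1 input that has no ASCII equivalent; the variable names are illustrative.

```r
x <- c("plain", "fa\xE7ile")            # second element has a non-ASCII latin1 byte

converted <- iconv(x, "latin1", "ASCII")
bad <- is.na(converted)                 # TRUE where conversion failed

# Reconvert only the failures, marking non-convertible bytes with "?".
converted[bad] <- iconv(x[bad], "latin1", "ASCII", sub = "?")
converted
```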

Some versions of iconv will allow transliteration by appending //TRANSLIT to the encoding given as to: see the examples.

Value

A character vector of the same length and the same attributes as x.
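A short sketch of what this means in practice, using a named vector (the names a and b are illustrative):

```r
x <- c(a = "fa\xE7ile", b = "plain")    # named character vector, latin1 bytes
y <- iconv(x, "latin1", "UTF-8")

length(y) == length(x)                  # same length as the input
names(y)                                # names are carried over from x
```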

Note

Not all platforms support these functions. See also capabilities("iconv").
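Code that must run on several platforms can guard the call. A minimal sketch, where falling back to the unconverted input is an assumption about what the caller wants:

```r
x <- "fa\xE7ile"                        # bytes in latin1

if (capabilities("iconv")) {
  y <- iconv(x, "latin1", "UTF-8")      # conversion is available
} else {
  y <- x                                # fall back to the unconverted input
}
```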

See Also

localeToCharset, file.

Examples

## Not run: 
iconvlist()

## convert from Latin-2 to UTF-8: two of the glibc iconv variants.
## (x is assumed here to be a character vector encoded in Latin-2.)
iconv(x, "ISO_8859-2", "UTF-8")
iconv(x, "LATIN2", "UTF-8")

## Both x below are in latin1 and will only display correctly in a
## latin1 locale.
(x <- "fa\xE7ile")
charToRaw(xx <- iconv(x, "latin1", "UTF-8"))
## in a UTF-8 locale, print(xx)

iconv(x, "latin1", "ASCII")          #   NA
iconv(x, "latin1", "ASCII", "?")     # "fa?ile"
iconv(x, "latin1", "ASCII", "")      # "faile"
iconv(x, "latin1", "ASCII", "byte")  # "fa<e7>ile"

# Extracts from R help files
(x <- c("Ekstr\xf8m", "J\xf6reskog", "bi\xdfchen Z\xfcrcher"))
iconv(x, "latin1", "ASCII//TRANSLIT")
iconv(x, "latin1", "ASCII", sub="byte")
## End(Not run)

[Package base version 2.2.1 Index]