Convert Australian state names and abbreviations into a consistent format

Usage

clean_state(
  x,
  to = "state_abbr",
  fuzzy_match = TRUE,
  max_dist = 0.4,
  method = "jw"
)

strayr(...)

Arguments

x

a (character) vector containing Australian state names or abbreviations or a (numeric) vector containing state codes (1 = NSW, 2 = Vic, 3 = Qld, 4 = SA, 5 = WA, 6 = Tas, 7 = NT, 8 = ACT).

to

what form should the state names be converted to? Options are "state_name", "state_abbr" (the default), "iso", "postal", "code" and "colour".

fuzzy_match

logical; either TRUE (the default) which indicates that approximate/fuzzy string matching should be used, or FALSE which indicates that only exact matches should be used.

max_dist

numeric, sets the maximum acceptable distance between your string and the matched string. Default is 0.4. Only relevant when fuzzy_match is TRUE.

method

the method used for approximate/fuzzy string matching. Default is "jw", the Jaro-Winkler distance; see `??stringdist-metrics` for more options.

...

all arguments to `strayr` are passed to `clean_state`

Value

a character vector of state names, abbreviations, or codes, in the format given by `to`.

Details

`strayr()` is a wrapper around `clean_state()` and is provided for backwards compatibility. `strayr()` is soft-deprecated, but will not be removed for the foreseeable future. New code should use `clean_state()`.

Examples


x <- c("western Straya", "w. A ", "new soth wailes", "SA", "tazz")

# Convert the above to state abbreviations
clean_state(x)
#> [1] "WA"  "WA"  "NSW" "SA"  "Tas"

# Convert the elements of `x` to state names

clean_state(x, to = "state_name")
#> [1] "Western Australia" "Western Australia" "New South Wales"  
#> [4] "South Australia"   "Tasmania"         

# Disable fuzzy matching; you'll get NAs unless exact matches can be found

clean_state(x, fuzzy_match = FALSE)
#> [1] NA   NA   NA   "SA" NA  

# You can use clean_state() in a dplyr mutate() call (requires the dplyr package)

x_df <- data.frame(state = x, stringsAsFactors = FALSE)

if (FALSE) { # not run
library(dplyr)
x_df %>% mutate(state_abbr = clean_state(state))
}
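
# A further sketch, not part of the original examples: per the description
# of `x`, numeric state codes (1 = NSW, ..., 8 = ACT) are also accepted as
# input. Output is not shown here.

clean_state(1:8, to = "state_name")

# Fuzzy matching can be made stricter by lowering max_dist. The value 0.1 is
# an arbitrary illustrative choice; some of the misspellings in `x` may then
# fail to match.

clean_state(x, max_dist = 0.1)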