NAME

Unicode::Semantics - Work around *the* Perl 5 Unicode bug

SYNOPSIS

use Unicode::Semantics;

$foo;      # could be anything
us($foo);  # force Unicode semantics

or:

us($foo) =~ s/\W/_/g;  # Upgrade and use immediately

DESCRIPTION

Perl uses Unicode semantics when the internal encoding for a string is UTF-8, but it uses ASCII semantics when the internal encoding is ISO-8859-1. This means that the non-ASCII part of the character set is ignored for the following operations:

* uc, lc, ucfirst, lcfirst, \U, \L, \u, \l
* \d, \s, \w, \D, \S, \W
* /.../i, (?i:...)
* /[[:posix:]]/

Because you shouldn't (and often don't) know what the internal encoding will be, it's hard to predict whether these operations will actually do what you want. Unicode::Semantics::us() gives you predictable results.
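To make the difference concrete, here is a minimal sketch using only core Perl. It assumes a script that does not enable the unicode_strings feature or use locale. It uses utf8::upgrade rather than us() so that it runs without this module installed; the variable names are illustrative only.

```perl
use strict;
use warnings;

# One character, U+00E9 (e with acute accent).
my $str = "\xE9";
utf8::downgrade($str);    # make sure the internal encoding is ISO-8859-1

# ASCII semantics: the character is neither a word character nor cased.
my $word_before = ($str =~ /\w/) ? 1 : 0;
my $uc_before   = ord(uc $str);

utf8::upgrade($str);      # internal encoding is now UTF-8

# Unicode semantics: same character, different answers.
my $word_after = ($str =~ /\w/) ? 1 : 0;
my $uc_after   = ord(uc $str);

printf "before: word=%d uc=U+%04X\n", $word_before, $uc_before;
printf "after:  word=%d uc=U+%04X\n", $word_after,  $uc_after;
# prints:
# before: word=0 uc=U+00E9
# after:  word=1 uc=U+00C9
```

The string holds the same single character before and after the upgrade; only the internal encoding, and with it the behaviour of the listed operations, changes.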

This module exports us, which upgrades your string to UTF-8 internally and returns the string.

You can also use utf8::upgrade, which does exactly the same thing, except that it does not return the string. This module was released because it's less typing in a large program :)
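As an illustration of the relationship between the two, a function with the described behaviour could be sketched as a thin wrapper around utf8::upgrade. The name my_us is hypothetical, and this is an assumption about how such a wrapper might look, not the module's actual source. It is declared as an lvalue sub so that the result can be modified in place, as in the SYNOPSIS.

```perl
use strict;
use warnings;

# Hypothetical stand-in for us(): upgrade the caller's variable in place
# (through the @_ alias) and hand that same variable back as an lvalue.
sub my_us($) :lvalue {
    utf8::upgrade($_[0]);
    $_[0];
}

my $foo = "caf\xE9 au lait";
my_us($foo) =~ s/\W/_/g;    # upgrade and use immediately

# The accented letter now counts as a word character, so only the
# spaces are replaced.
print(($foo eq "caf\xE9_au_lait") ? "ok\n" : "not ok\n");
```

With plain utf8::upgrade, the same effect takes two statements: one to upgrade $foo and one to run the substitution.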

BINARY STRINGS

Obviously, these broken text operations are no problem when you're dealing with bytes instead of characters. Don't upgrade your binary strings!

AUTHOR

Juerd Waalboer <#####@juerd.nl>
