NAME

Encode::UTF8Mac - "utf-8-mac" encoding, a variant of UTF-8 used by Mac OS X

SYNOPSIS

use Encode;
use Encode::UTF8Mac;

my $filename = Encode::encode('utf-8-mac', "\x{3054}\x{FA19}\x{4F53}");
# => \xE3\x81\x93\xE3\x82\x99\xEF\xA8\x99\xE4\xBD\x93
# note:
# Unicode utf-8(hex)    NFD()            Mac OS X
# U+3054  \xE3\x81\x94  U+3053 + U+3099  decomposed
# U+3053  \xE3\x81\x93  (no-op)
# U+3099  \xE3\x82\x99  (no-op)
# U+FA19  \xEF\xA8\x99  U+795E           not decomposed
# U+4F53  \xE4\xBD\x93  (no-op)

$filename = Encode::decode('utf-8-mac', $filename);
# => \x{3054}\x{FA19}\x{4F53}

DESCRIPTION

Encode::UTF8Mac provides an encoding called "utf-8-mac", the UTF-8 variant used on Mac OS X.

Mac OS X uses UTF-8 in Normalization Form D (characters are decomposed). However, it does not follow the specification exactly.

http://developer.apple.com/library/mac/#qa/qa2001/qa1173.html

Specifically, the following ranges are not decomposed.

U+2000-U+2FFF
U+F900-U+FAFF
U+2F800-U+2FAFF
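
To make the exception concrete, here is a minimal sketch that checks whether a code point falls in one of the ranges above and prints what plain NFD would do to it. The names `@EXCLUDED_RANGES` and `in_excluded_range` are hypothetical and are not part of this module; only Unicode::Normalize's `NFD` is a real API.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFD);

# The three ranges listed above, as [first, last] code point pairs.
my @EXCLUDED_RANGES = (
    [ 0x2000,  0x2FFF  ],
    [ 0xF900,  0xFAFF  ],
    [ 0x2F800, 0x2FAFF ],
);

# Hypothetical helper: returns 1 if the code point is in an excluded range.
sub in_excluded_range {
    my ($cp) = @_;
    for my $range (@EXCLUDED_RANGES) {
        return 1 if $cp >= $range->[0] && $cp <= $range->[1];
    }
    return 0;
}

# Compare plain NFD with the Mac OS X rule for the SYNOPSIS characters.
for my $cp (0x3054, 0xFA19, 0x4F53) {
    my $nfd = join ' ', map { sprintf 'U+%04X', ord($_) } split //, NFD(chr $cp);
    printf "U+%04X  plain NFD: %-15s  %s\n", $cp, $nfd,
        in_excluded_range($cp) ? 'kept as-is on Mac OS X' : 'normalized as usual';
}
```

U+FA19 is the interesting case: plain NFD maps it to U+795E, but because it lies in U+F900-U+FAFF it is left untouched under the Mac OS X rule.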

In the iconv bundled with Mac OS X, this encoding is available as "utf-8-mac".

This module adds the "utf-8-mac" encoding to Encode; it encodes and decodes text with that rule in mind. This helps when you decode file names on a Mac.

ENCODING

utf-8-mac
  • Encode::decode('utf-8-mac', $bytes)

    Decodes the bytes as UTF-8, then normalizes to form C using Unicode::Normalize, except for the special ranges.

  • Encode::encode('utf-8-mac', $unicode)

    Normalizes to form D using Unicode::Normalize, except for the special ranges, then encodes as UTF-8.
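
The two steps above can be sketched with core modules only. This is an illustration under stated assumptions, not necessarily how the module is implemented: `$NORMALIZABLE`, `sketch_decode` and `sketch_encode` are hypothetical names, and the approach (normalizing only runs of characters outside the special ranges) is one simple way to honor the rule.

```perl
use strict;
use warnings;
use Encode ();
use Unicode::Normalize qw(NFC NFD);

# Assumption: a "run" is one or more characters outside the special ranges.
my $NORMALIZABLE = qr/[^\x{2000}-\x{2FFF}\x{F900}-\x{FAFF}\x{2F800}-\x{2FAFF}]+/;

# Decode as UTF-8, then compose (NFC) everything except the special ranges.
sub sketch_decode {
    my ($bytes) = @_;
    my $text = Encode::decode('UTF-8', $bytes);
    $text =~ s/($NORMALIZABLE)/NFC($1)/ge;
    return $text;
}

# Decompose (NFD) everything except the special ranges, then encode as UTF-8.
sub sketch_encode {
    my ($text) = @_;
    $text =~ s/($NORMALIZABLE)/NFD($1)/ge;
    return Encode::encode('UTF-8', $text);
}

# Round trip with the string from the SYNOPSIS.
my $bytes = sketch_encode("\x{3054}\x{FA19}\x{4F53}");
my $text  = sketch_decode($bytes);
print join(' ', map { sprintf 'U+%04X', ord($_) } split //, $text), "\n";
```

Normalizing run by run keeps characters in the special ranges byte-for-byte identical in both directions, which is what lets U+FA19 survive the round trip instead of becoming U+795E.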

SEE ALSO

Encode, Encode::Locale, Unicode::Normalize

AUTHOR

Naoki Tomita <tomita@cpan.org>

LICENSE

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.