rfc9233.original   rfc9233.txt 
Network Working Group P. Faltstrom Internet Engineering Task Force (IETF) P. Fältström
Internet-Draft Netnod Request for Comments: 9233 Netnod
Intended status: Standards Track February 13, 2022 Category: Standards Track March 2022
Expires: August 17, 2022 ISSN: 2070-1721
IDNA2008 and Unicode 12.0.0 Internationalized Domain Names for Applications 2008 (IDNA2008) and
draft-faltstrom-unicode12-07 Unicode 12.0.0
Abstract Abstract
This document describes the changes between Unicode 6.0.0 and Unicode This document describes the changes between Unicode 6.0.0 and Unicode
12.0.0 in the context of IDNA2008. Some additions and changes have 12.0.0 in the context of the current version of Internationalized
been made in the Unicode Standard that affect the values produced by Domain Names for Applications 2008 (IDNA2008). Some additions and
the algorithm IDNA2008 specifies. IDNA2008 allows adding exceptions changes have been made in the Unicode Standard that affect the values
to the algorithm for backward compatibility; however, this document produced by the algorithm IDNA2008 specifies. IDNA2008 allows adding
does not add any such exceptions. This document provides the exceptions to the algorithm for backward compatibility; however, this
necessary tables to IANA to make its database consistent with Unicode document does not add any such exceptions. This document provides
12.0.0. the necessary tables to IANA to make its database consistent with
Unicode 12.0.0.
To improve understanding, this document describes systems that are To improve understanding, this document describes systems that are
being used as alternatives to those that conform to IDNA2008. being used as alternatives to those that conform to IDNA2008.
TO BE REMOVED AT TIME OF PUBLICATION AS AN RFC:
This document is discussed on the i18n-discuss@ietf.org mailing list
of the IETF.
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This is an Internet Standards Track document.
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months This document is a product of the Internet Engineering Task Force
and may be updated, replaced, or obsoleted by other documents at any (IETF). It represents the consensus of the IETF community. It has
time. It is inappropriate to use Internet-Drafts as reference received public review and has been approved for publication by the
material or to cite them other than as "work in progress." Internet Engineering Steering Group (IESG). Further information on
Internet Standards is available in Section 2 of RFC 7841.
This Internet-Draft will expire on August 17, 2022. Information about the current status of this document, any errata,
and how to provide feedback on it may be obtained at
https://www.rfc-editor.org/info/rfc9233.
Copyright Notice Copyright Notice
Copyright (c) 2022 IETF Trust and the persons identified as the Copyright (c) 2022 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of (https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Revised BSD License text as described in Section 4.e of the
the Trust Legal Provisions and are provided without warranty as Trust Legal Provisions and are provided without warranty as described
described in the Simplified BSD License. in the Revised BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 1. Introduction
2. Background . . . . . . . . . . . . . . . . . . . . . . . . . 4 2. Background
2.1. IDNA2008 Documents . . . . . . . . . . . . . . . . . . . 5 2.1. IDNA2008 Documents
2.2. Additional important IDNA2008-related documents . . . . . 6 2.2. Additional Important IDNA2008-Related Documents
2.3. Deployment . . . . . . . . . . . . . . . . . . . . . . . 6 2.3. Deployment
3. Notable Changes Between Unicode 6.0.0 and 12.0.0 . . . . . . 7 3. Notable Changes between Unicode 6.0.0 and 12.0.0
3.1. Changes between Unicode 6.0.0 and 7.0.0 . . . . . . . . . 7 3.1. Changes between Unicode 6.0.0 and 7.0.0
3.2. Changes between Unicode 7.0.0 and 10.0.0 . . . . . . . . 8 3.2. Changes between Unicode 7.0.0 and 10.0.0
3.3. Changes between Unicode 10.0.0 and 11.0.0 . . . . . . . . 9 3.3. Changes between Unicode 10.0.0 and 11.0.0
3.4. Changes between Unicode 11.0.0 and 12.0.0 . . . . . . . . 10 3.4. Changes between Unicode 11.0.0 and 12.0.0
4. U+111C9 SHARADA SANDHI MARK . . . . . . . . . . . . . . . . . 11 4. U+111C9 SHARADA SANDHI MARK
5. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . 11 5. Conclusion
6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 11 6. IANA Considerations
7. Security Considerations . . . . . . . . . . . . . . . . . . . 12 7. Security Considerations
8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 12 8. References
9. References . . . . . . . . . . . . . . . . . . . . . . . . . 12 8.1. Normative References
9.1. Normative References . . . . . . . . . . . . . . . . . . 12 8.2. Informative References
9.2. Non-normative references . . . . . . . . . . . . . . . . 13 Appendix A. Changes from Unicode 6.0.0 to Unicode 7.0.0
Appendix A. Changes from Unicode 6.0.0 to Unicode 7.0.0 . . . . 15 Appendix B. Changes from Unicode 7.0.0 to Unicode 8.0.0
Appendix B. Changes from Unicode 7.0.0 to Unicode 8.0.0 . . . . 21 Appendix C. Changes from Unicode 8.0.0 to Unicode 9.0.0
Appendix C. Changes from Unicode 8.0.0 to Unicode 9.0.0 . . . . 23 Appendix D. Changes from Unicode 9.0.0 to Unicode 10.0.0
Appendix D. Changes from Unicode 9.0.0 to Unicode 10.0.0 . . . . 24 Appendix E. Changes from Unicode 10.0.0 to Unicode 11.0.0
Appendix E. Changes from Unicode 10.0.0 to Unicode 11.0.0 . . . 26 Appendix F. Changes from Unicode 11.0.0 to Unicode 12.0.0
Appendix F. Changes from Unicode 11.0.0 to Unicode 12.0.0 . . . 27 Acknowledgments
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 29 Author's Address
1. Introduction 1. Introduction
The current version of Internationalized Domain Names for The current version of Internationalized Domain Names for
Applications (IDNA) was initiated in 2008, and despite not being Applications (IDNA) was initiated in 2008, and despite not being
completed until 2010, is widely known as "IDNA2008". It is specified completed until 2010, is widely known as "IDNA2008". It is specified
in the series of documents listed in Section 2.1. The IDNA2008 in the series of documents listed in Section 2.1. The IDNA2008
standard includes an algorithm by which a derived property value is standard includes an algorithm by which a derived property value is
calculated based on the properties defined from the Unicode Standard. calculated based on the properties defined in the Unicode Standard.
The derived property values that can be calculated are defined in RFC The derived property values that can be calculated are defined in RFC
5892 [RFC5892]. Below is a summary to aid in the reading of this 5892 [RFC5892]. Below is a summary to aid in the reading of this
document. For definition of the terms, please see RFC 5892 document. For definition of the terms, please see RFC 5892
[RFC5892]. [RFC5892].
o PROTOCOL VALID: Those that are allowed to be used in IDNs. Code PROTOCOL VALID: Those that are allowed to be used in IDNs. Code
points with this property value are permitted for general use in points with this property value are permitted for general use in
IDNs. However, that a label consists only of code points that IDNs. However, the fact that a label consists only of code points
have this property value does not imply that the label can be used with this property value does not imply that the label can be used
in DNS. The abbreviated term PVALID is used to refer to this in DNS. The abbreviated term PVALID is used to refer to this
value. value.
o CONTEXTUAL RULE REQUIRED: Some characteristics of the character, CONTEXTUAL RULE REQUIRED: Some characteristics of the character,
such as it being invisible in certain contexts or problematic in such as it being invisible in certain contexts or problematic in
others, require that it not be used in labels unless specific others, require that it not be used in labels unless specific
other characters or properties are present. The abbreviated term other characters or properties are present. The abbreviated term
CONTEXT is used to refer to this value. As explained in RFC 5892 CONTEXT is used to refer to this value. As explained in RFC 5892
[RFC5892] CONTEXT is in turn divided into CONTEXTJ and CONTEXTO. [RFC5892], CONTEXT is in turn divided into CONTEXTJ and CONTEXTO.
o DISALLOWED: Those that should clearly not be included in IDNs. DISALLOWED: Those that should clearly not be included in IDNs. Code
Code points with this property value are not permitted in IDNs. points with this property value are not permitted in IDNs.
o UNASSIGNED: Those code points that are not designated (i.e., are UNASSIGNED: Those code points that are not designated (i.e., are
unassigned) in the Unicode Standard. unassigned) in the Unicode Standard.
When the Unicode Standard is updated, new code points are assigned When the Unicode Standard is updated, new code points are assigned
and already-assigned code points can have their property values and already assigned code points can have their property values
changed. changed.
o Assigning code points can create problems if the newly-assigned * Assigning code points can create problems if the newly assigned
code points are compositions of existing code points and because code points are compositions of existing code points and the
of that the normalization relationships associated with those code normalization relationships associated with those code points
points should have been changed. should have been changed because of that.
o Changing properties for already-assigned code points can create * Changing properties for already assigned code points can create
problems if the property change results in changes to the derived problems if the property change results in changes to the derived
property value. This might make an earlier allowed code point property value. A previously allowed code point whose derived
whose derived property value is PVALID to then not be allowed property value is PVALID may now be prohibited if its derived
anymore if its derived property value changes to DISALLOWED. The property value changes to DISALLOWED. The problem can also happen
problem can also happen the other way around: a code point that the other way around: a code point that was not allowed (and thus
was not allowed (and thus is prohibited) can suddenly end up being was prohibited) can suddenly be allowed.
allowed.
o Problems can also be created if the properties assigned to those * Problems can also be created if the properties assigned to those
code points are inconsistent with IDNA2008 assumptions about how code points are inconsistent with IDNA2008 assumptions about how
properties are assigned and/or about how code points with those properties are assigned and/or about how code points with those
properties are used or behave. properties are used or behave.
There were three incompatible changes in the Unicode standard between There were three incompatible changes in the Unicode Standard between
Unicode 5.2.0 [Unicode-5.2.0] and Unicode 6.0.0 [Unicode-6.0.0]; they Unicode 5.2.0 [Unicode-5.2.0] and Unicode 6.0.0 [Unicode-6.0.0]; they
are described in RFC 6452 [RFC6452]. The code points U+0CF1 and are described in RFC 6452 [RFC6452]. The code points U+0CF1 and
U+0CF2 had a derived property value change from DISALLOWED to PVALID, U+0CF2 had a derived property value change from DISALLOWED to PVALID,
and the code point U+19DA had a change in derived property value from and the code point U+19DA had a change in derived property value from
PVALID to DISALLOWED. These changes were examined in great detail, PVALID to DISALLOWED. These changes were examined in great detail,
but the IETF concluded that these changes to the Unicode standard did but the IETF concluded that these changes to the Unicode Standard did
not warrant an update to RFC 5892 [RFC5892]. not warrant an update to RFC 5892 [RFC5892].
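The three changes described above can be expressed as transitions between derived property values. The following is a minimal Python sketch, not part of either version of the document. The function names and the labels "unchanged", "newly assigned", and "incompatible" are assumptions made for illustration and are not terms defined in RFC 5892.

```python
# Illustrative sketch only. The labels returned below are hypothetical
# and are not terminology from RFC 5892 or RFC 6452.

def classify_transition(old: str, new: str) -> str:
    """Classify a change in derived property value for one code point."""
    if old == new:
        return "unchanged"
    if old == "UNASSIGNED":
        # A code point that was assigned in the newer Unicode version.
        return "newly assigned"
    # An already assigned code point changed value, e.g. PVALID <-> DISALLOWED.
    return "incompatible"


# Derived property value changes between Unicode 5.2.0 and 6.0.0,
# as listed in the text above (described in RFC 6452).
CHANGES_5_2_TO_6_0 = {
    0x0CF1: ("DISALLOWED", "PVALID"),
    0x0CF2: ("DISALLOWED", "PVALID"),
    0x19DA: ("PVALID", "DISALLOWED"),
}


def incompatible_changes(changes: dict) -> list:
    """Return the sorted code points whose transition is incompatible."""
    return sorted(
        cp
        for cp, (old, new) in changes.items()
        if classify_transition(old, new) == "incompatible"
    )


if __name__ == "__main__":
    for cp in incompatible_changes(CHANGES_5_2_TO_6_0):
        old, new = CHANGES_5_2_TO_6_0[cp]
        print(f"U+{cp:04X}: {old} -> {new}")
```

Under this sketch, all three code points are reported as incompatible, whereas a change from UNASSIGNED to PVALID or DISALLOWED is reported only as a new assignment.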
As described in Section 3, more incompatible changes have been made As described in Section 3, more incompatible changes have been made
to code points between Unicode 6.0.0 and Unicode 12.0.0 to code points between Unicode 6.0.0 and Unicode 12.0.0
[Unicode-12.0.0]; however, the changes in the derived property values [Unicode-12.0.0]; however, the changes in the derived property values
do not result in exceptions (as defined in section 2.6 of RFC 5892 do not result in exceptions (as defined in Section 2.6 of RFC 5892
[RFC5892]) being added to RFC 5892 [RFC5892]. [RFC5892]) that would require an update to the "IDNA Contextual
Rules" registry (which would also be considered an update to RFC 5892
[RFC5892]).
Further, in 2015, the Internet Architecture Board (IAB) issued a Further, in 2015, the Internet Architecture Board (IAB) issued a
statement [IAB2005-1] that advised the community to avoid using any statement [IAB2005-1] that advised the community to avoid using any
of the potentially problematic code points and asked the IETF to of the potentially problematic code points and asked the IETF to
resolve the issues related to the code point ARABIC LETTER BEH WITH resolve the issues related to the code point ARABIC LETTER BEH WITH
HAMZA ABOVE (U+08A1) that was introduced in Unicode 7.0.0 HAMZA ABOVE (U+08A1) that was introduced in Unicode 7.0.0
[Unicode-7.0.0]. In February of that year, the statement was revised [Unicode-7.0.0]. In February of that year, the statement was revised
[IAB2005-2] to focus on the latter request. More details about the [IAB2005-2] to focus on the latter request. More details about the
problem of code point sequences not normalizing as one might expect problem of code point sequences not normalizing as one might expect
appear in a draft that was part of the discussion [IDNA7]. appear in a draft that was part of the discussion [IDNA7].
skipping to change at page 5, line 4 skipping to change at line 179
may have similar issues. While the affected code points remain may have similar issues. While the affected code points remain
PVALID in this document, identification of the problem resulted in a PVALID in this document, identification of the problem resulted in a
clarification of the review process for new Unicode versions. That clarification of the review process for new Unicode versions. That
clarification, which reinforces the original review plan to capture clarification, which reinforces the original review plan to capture
issues like these, was published as RFC 8753 [RFC8753]. Any review issues like these, was published as RFC 8753 [RFC8753]. Any review
of Unicode versions after 12.0.0 should be made according to RFC 8753 of Unicode versions after 12.0.0 should be made according to RFC 8753
[RFC8753]; an objective of this document is to ensure that a proper [RFC8753]; an objective of this document is to ensure that a proper
review of such versions after version 12.0.0 can be made. review of such versions after version 12.0.0 can be made.
2. Background 2. Background
2.1. IDNA2008 Documents 2.1. IDNA2008 Documents
IDNA2008 consists of the following documents. The documents in the IDNA2008 consists of the following documents. The documents in the
set have informal names. set have informal names.
o Internationalized Domain Names for Applications (IDNA): * "Internationalized Domain Names for Applications (IDNA):
Definitions and Document Framework [RFC5890], informally called Definitions and Document Framework" [RFC5890], informally called
"Defs" or "Definitions", contains definitions and other material "Defs" or "Definitions", contains definitions and other material
that are needed for understanding other documents in the set. that are needed for understanding other documents in the set.
o Internationalized Domain Names in Applications (IDNA): Protocol * "Internationalized Domain Names in Applications (IDNA): Protocol"
[RFC5891], informally called "Protocol", describes the core [RFC5891], informally called "Protocol", describes the core
IDNA2008 protocol and its operations. It needs to be interpreted IDNA2008 protocol and its operations. It needs to be interpreted
in combination with the Bidi document (described below). in combination with the Bidi document (described below). RFC 5891
[RFC5891] obsoletes RFC 3491 [RFC3491] and, in particular, the use
of the tables to which RFC 3491 [RFC3491] refers.
o The Unicode Code Points and Internationalized Domain Names for * "The Unicode Code Points and Internationalized Domain Names for
Applications (IDNA) [RFC5892], informally called "Tables", lists Applications (IDNA)" [RFC5892], informally called "Tables", lists
the categories and rules that identify the code points allowed in the categories and rules that identify the code points allowed in
a label written in native character form (called a "U-label"), and a label written in native character form (called a "U-label"), and
is based on Unicode 5.2.0 [Unicode-5.2.0] code point assignments is based on Unicode 5.2.0 [Unicode-5.2.0] code point assignments
and additional rules unique to IDNA2008. The Unicode-based rules and additional rules unique to IDNA2008. The Unicode-based rules
in RFC 5892 are expected to be stable across Unicode updates and in RFC 5892 are expected to be stable across Unicode updates and
hence independent of Unicode versions. RFC 5892 [RFC5892] hence independent of Unicode versions.
obsoletes RFC 3491 [RFC3491], and in particular the use of the
tables to which RFC 3491 [RFC3491] refers.
o Right-to-Left Scripts for Internationalized Domain Names for * "Right-to-Left Scripts for Internationalized Domain Names for
Applications (IDNA) [RFC5893], informally called "Bidi", specifies Applications (IDNA)" [RFC5893], informally called "Bidi",
special rules for labels that contain characters that are written specifies special rules for labels that contain characters that
from right to left. are written from right to left.
o Internationalized Domain Names for Applications (IDNA): * "Internationalized Domain Names for Applications (IDNA):
Background, Explanation, and Rationale [RFC5894], informally Background, Explanation, and Rationale" [RFC5894], informally
called "Rationale", provides an overview of the protocol and called "Rationale", provides an overview of the protocol and
associated tables, and gives explanatory material and some associated tables, and gives explanatory material and some
rationale for the decisions that led to IDNA2008. It also rationale for the decisions that led to IDNA2008. It also
contains advice for DNS registry operators and others who use contains advice for DNS registry operators and others who use
Internationalized Domain Names (IDNs). Internationalized Domain Names (IDNs).
o Mapping Characters for Internationalized Domain Names in * "Mapping Characters for Internationalized Domain Names in
Applications (IDNA) 2008 [RFC5895], informally called "Mapping", Applications (IDNA) 2008" [RFC5895], informally called "Mapping",
discusses the issue of mapping characters into other characters discusses the issue of mapping characters into other characters
and provides guidance for doing so when that is appropriate. RFC and provides guidance for doing so when that is appropriate. RFC
5895 provides advice only and is not a required part of IDNA. 5895 provides advice only and is not a required part of IDNA.
2.2. Additional important IDNA2008-related documents 2.2. Additional Important IDNA2008-Related Documents
There are other documents important for the understanding and There are other documents important for the understanding and
functioning of IDNA2008, for example, the following: functioning of IDNA2008, for example, the following:
o The Unicode Code Points and Internationalized Domain Names for * "The Unicode Code Points and Internationalized Domain Names for
Applications (IDNA) - Unicode 6.0 [RFC6452] describes some changes Applications (IDNA) - Unicode 6.0" [RFC6452] describes some
made to Unicode 6.0.0 [Unicode-6.0.0] that resulted in derived changes made to Unicode 6.0.0 [Unicode-6.0.0] that resulted in
property value change for the code points U+0CF1, U+0CF2 and derived property value changes for the code points U+0CF1, U+0CF2,
U+19DA. U+0CF1 and U+0CF2 changed from DISALLOWED to PVALID, and U+19DA. U+0CF1 and U+0CF2 changed from DISALLOWED to PVALID,
while U+19DA changed from PVALID to DISALLOWED. The IETF while U+19DA changed from PVALID to DISALLOWED. The IETF
concluded that no update to RFC 5892 [RFC5892] was needed based on concluded that no update to RFC 5892 [RFC5892] was needed based on
the changes made in Unicode 6.0.0 [Unicode-6.0.0]. As a result, the changes made in Unicode 6.0.0 [Unicode-6.0.0]. As a result,
the derived property value remained aligned with the Unicode the derived property value remained aligned with the Unicode
Standard. Specifically, no exception was added. Standard. Specifically, no exception was added.
2.3. Deployment 2.3. Deployment
There are many variations on the general IDNA model in use in the There are many variations on the general IDNA model in use in the
various parts of the community. The following lists some of the various parts of the community. The following lists some of the
strategies that implementations that claim to be IDNA compliant are strategies that implementations that claim to be IDNA compliant are
known to use, but it should be noted the list is not complete: known to use, but it should be noted the list is not complete:
o IDNA2003 as specified in RFC 3490 [RFC3490] and RFC 3491 * IDNA2003 as specified in RFC 3490 [RFC3490] and RFC 3491
[RFC3491]. Those specifications are dependent on case folding and [RFC3491]. Those specifications are dependent on case folding,
NFKC normalization and on tables that specify for each code point Normalization Form KC (NFKC), and on tables that specify for each
whether it is allowed to be used or not, with a distinction made code point whether it is allowed to be used or not, with a
between use for "stored strings" and "query strings". The tables distinction made between use for "stored strings" and "query
themselves are dependent on Unicode 3.2 [Unicode-3.2.0]. strings". The tables themselves are dependent on Unicode 3.2
[Unicode-3.2.0].
o A number of variations on IDNA2003, sometimes presented as * A number of variations on IDNA2003, sometimes presented as
"updated IDNA2003" or the like, which follow the principles of "updated IDNA2003" or the like, which follow the principles of
IDNA2003 as understood by the implementers but that use tables IDNA2003 as understood by the implementers but that use tables
that represent how the implementers believe Stringprep [RFC3454] that represent how the implementers believe Stringprep [RFC3454]
and Nameprep [RFC3491] would have evolved had the IETF not moved and Nameprep [RFC3491] would have evolved had the IETF not moved
in the direction of IDNA2008 instead. in the direction of IDNA2008 instead.
o A mix between IDNA2003 and IDNA2008 where code points assigned to * A mix between IDNA2003 and IDNA2008 where code points assigned to
Unicode after Unicode 3.2.0 [Unicode-3.2.0] have derived property Unicode after Unicode 3.2.0 [Unicode-3.2.0] have derived property
value calculated according to the algorithm specified in IDNA2008. value calculated according to the algorithm specified in IDNA2008.
o A mix between IDNA2003 and IDNA2008 according to the Unicode * A mix between IDNA2003 and IDNA2008 according to the Unicode
Technical Standard #46 [UTS-46]. Because that document specifies Technical Standard #46 [UTS-46]. Because that document specifies
different profiles, there are several variations that leave users different profiles, there are several variations that leave users
with no guarantee that two applications claiming conformance to with no guarantee that two applications claiming conformance to
UTS#46 will interoperate well with each other, much less with UTS#46 will interoperate well with each other, much less with
conforming IDNA2008 implementations. UTS#46 is ultimately based conforming IDNA2008 implementations. UTS#46 is ultimately based
on a normative table very much like the one used by Stringprep on a normative table very much like the one used by Stringprep
[RFC3454] but updated for each new version of Unicode. [RFC3454] but updated for each new version of Unicode.
o The (normative) IDNA2008 algorithm applied to whatever version of * The (normative) IDNA2008 algorithm applied to whatever version of
Unicode Standard exists in the operating system and/or libraries Unicode Standard exists in the operating system and/or libraries
used, independent of whatever version of tables appears in the used, independent of whatever version of tables appears in the
(non-normative) IANA database. (non-normative) IANA database.
In practice, the Unicode Consortium creates a maximum set of code In practice, the Unicode Consortium creates a maximum set of code
points by assigning code points in the Unicode Standard. The points by assigning code points in the Unicode Standard. The
IDNA2008 rules use the Unicode Standard to create a further subset of IDNA2008 rules use the Unicode Standard to create a further subset of
code points and context that are permitted in DNS labels associated code points and context that are permitted in DNS labels associated
with its PVALID, and CONTEXT (CONTEXTJ or CONTEXTO) derived property with its PVALID and CONTEXT (CONTEXTJ or CONTEXTO) derived property
values. DNS registries and other organizations that deal with IDNs values. DNS registries and other organizations that deal with IDNs
are supposed to create their own subsets from IDNA2008 for use by are supposed to create their own subsets from IDNA2008 for use by
those registries and organizations. those registries and organizations.
This progressive subsetting and narrowing of the repertoire of code This progressive subsetting and narrowing of the repertoire of code
points that can be used in labels is an implementation of the points that can be used in labels is an implementation of the
principles of being conservative when deciding what code points to principles of being conservative when deciding what code points to
include in such a subset. SAC-084 [SAC-084] and RFC 6912 [RFC6912] include in such a subset. SAC-084 [SAC-084] and RFC 6912 [RFC6912]
recommend to DNS registries and other organizations to be recommend to DNS registries and other organizations to be
conservative when creating their subsets, and to use the principle of conservative when creating their subsets and to use the principle of
creating subsets by inclusion. creating subsets by inclusion.
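The progressive subsetting described above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the table DERIVED is a small hand-picked excerpt rather than the IANA registry, and the names DERIVED, PERMITTED_VALUES, and registry_repertoire are hypothetical.

```python
# Illustrative sketch only. DERIVED is a tiny hand-picked excerpt of
# derived property values, not the IANA registry; all names are hypothetical.
DERIVED = {
    0x0061: "PVALID",      # LATIN SMALL LETTER A
    0x00F8: "PVALID",      # LATIN SMALL LETTER O WITH STROKE
    0x0041: "DISALLOWED",  # LATIN CAPITAL LETTER A
    0x200D: "CONTEXTJ",    # ZERO WIDTH JOINER
    0x19DA: "DISALLOWED",  # changed from PVALID in Unicode 6.0.0
}

# Values under which IDNA2008 permits a code point. CONTEXTJ and CONTEXTO
# code points additionally require their contextual rules to be satisfied,
# which this sketch does not check.
PERMITTED_VALUES = {"PVALID", "CONTEXTJ", "CONTEXTO"}


def registry_repertoire(approved: set, derived: dict) -> set:
    """Build a registry subset by inclusion.

    Only code points the registry explicitly approved AND that IDNA2008
    permits are included; everything else is excluded by default.
    """
    return {cp for cp in approved if derived.get(cp) in PERMITTED_VALUES}


if __name__ == "__main__":
    wanted = {0x0061, 0x0041, 0x00F8}
    print(sorted(f"U+{cp:04X}" for cp in registry_repertoire(wanted, DERIVED)))
```

The design choice here is that the result can never be larger than either input: a code point that is unknown to the table, or that the registry did not list, is left out.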
See also the Security Considerations section in this document. See also Security Considerations (Section 7) in this document.
3. Notable Changes Between Unicode 6.0.0 and 12.0.0 3. Notable Changes between Unicode 6.0.0 and 12.0.0
Among the changes between the Unicode versions, most code points that Among the changes between the Unicode versions, most code points that
change derived property value change from UNASSIGNED to PVALID or change derived property value change from UNASSIGNED to PVALID or
from UNASSIGNED to DISALLOWED. The interesting changes in derived from UNASSIGNED to DISALLOWED. The interesting changes in derived
property values include other changes. All changes between the major property values include other changes. All changes between the major
versions of Unicode can be found in Appendix A (6.0.0-7.0.0), versions of Unicode can be found in Appendix A (6.0.0-7.0.0),
Appendix B (7.0.0-8.0.0), Appendix C (8.0.0-9.0.0), Appendix D Appendix B (7.0.0-8.0.0), Appendix C (8.0.0-9.0.0), Appendix D
(9.0.0-10.0.0), Appendix E (10.0.0-11.0.0) and Appendix F (9.0.0-10.0.0), Appendix E (10.0.0-11.0.0), and Appendix F
(11.0.0-12.0.0). (11.0.0-12.0.0).
3.1. Changes between Unicode 6.0.0 and 7.0.0 3.1. Changes between Unicode 6.0.0 and 7.0.0
Change in number of characters in each category: Change in number of characters in each category:
PVALID changed from 97418 to 99867 (+2449) * PVALID changed from 97418 to 99867 (+2449)
UNASSIGNED changed from 865081 to 861509 (-3572) * UNASSIGNED changed from 865081 to 861509 (-3572)
CONTEXTJ did not change, at 2 * CONTEXTJ did not change, at 2
CONTEXTO did not change, at 25 * CONTEXTO did not change, at 25
DISALLOWED changed from 151586 to 152709 (+1123) * DISALLOWED changed from 151586 to 152709 (+1123)
TOTAL did not change, at 1114112 * TOTAL did not change, at 1114112
There are no changes made to Unicode between version 6.0.0 and There are no changes made to Unicode between version 6.0.0 and 7.0.0
7.0.0 that impact IDNA2008 calculation of the derived property that impact IDNA2008 calculation of the derived property values.
values.
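The counts listed above for Unicode 6.0.0 and 7.0.0 can be checked for internal consistency. The following Python sketch is an illustration only; the variable and function names are assumptions. It confirms that each column sums to the size of the Unicode code space and that the per-category differences cancel out.

```python
# Illustrative consistency check of the counts listed above.
# Variable and function names are hypothetical.
COUNTS_6_0_0 = {
    "PVALID": 97418,
    "UNASSIGNED": 865081,
    "CONTEXTJ": 2,
    "CONTEXTO": 25,
    "DISALLOWED": 151586,
}
COUNTS_7_0_0 = {
    "PVALID": 99867,
    "UNASSIGNED": 861509,
    "CONTEXTJ": 2,
    "CONTEXTO": 25,
    "DISALLOWED": 152709,
}

# The code space runs from U+0000 to U+10FFFF, i.e. 0x110000 code points.
TOTAL = 0x110000


def deltas(before: dict, after: dict) -> dict:
    """Per-category change in the number of code points."""
    return {name: after[name] - before[name] for name in before}


if __name__ == "__main__":
    print(TOTAL, sum(COUNTS_6_0_0.values()), sum(COUNTS_7_0_0.values()))
    print(deltas(COUNTS_6_0_0, COUNTS_7_0_0))
```

Because the total is fixed, every code point that leaves UNASSIGNED must appear in another category, so the differences sum to zero.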
The code points U+17B4 KHMER VOWEL INHERENT AQ and U+17B5 KHMER VOWEL The code points U+17B4 KHMER VOWEL INHERENT AQ and U+17B5 KHMER VOWEL
INHERENT AA both changed the general category from Cf (Format) to Mn INHERENT AA both changed the General Category from Cf (Format) to Mn
(Nonspacing_Mark), but that did not impact the calculation of the (Nonspacing_Mark), but that did not impact the calculation of the
derived property value, which stayed at DISALLOWED. derived property value, which stayed at DISALLOWED.
The character ARABIC LETTER BEH WITH HAMZA ABOVE (U+08A1) was The character ARABIC LETTER BEH WITH HAMZA ABOVE (U+08A1) was
introduced in Unicode 7.0.0. This was discussed extensively in the introduced in Unicode 7.0.0. This was discussed extensively in the
IETF, and by the IAB in their statement [IAB2005-1] requesting the IETF and also by the IAB in their statement [IAB2005-1] requesting
IETF to investigate the issue. Specifically, the IAB stated: the IETF to investigate the issue. Specifically, the IAB stated:
On the same precautionary principle, the IAB recommends that the | On the same precautionary principle, the IAB recommends that the
Internationalized Domain Names for Applications (IDNA) Parameters | Internationalized Domain Names for Applications (IDNA) Parameters
registry <https://www.iana.org/assignments/idna-tables/> not be | registry <https://www.iana.org/assignments/idna-tables/> not be
updated to Unicode 7.0.0 until the IETF has consensus on a | updated to Unicode 7.0.0 until the IETF has consensus on a
solution to this problem. | solution to this problem.
The discussion in the IETF concluded that although it is possible to The discussion in the IETF concluded that although it is possible to
create "the same" character in multiple ways, the issue with U+08A1 create "the same" character in multiple ways, the issue with U+08A1
is not unique. The character U+08A1 (ARABIC LETTER BEH WITH HAMZA is not unique. The character U+08A1 (ARABIC LETTER BEH WITH HAMZA
ABOVE) can be represented with the sequence ARABIC LETTER BEH ABOVE) can be represented with the sequence ARABIC LETTER BEH
(U+0628) and ARABIC HAMZA ABOVE (U+0654). This identical to LATIN (U+0628) and ARABIC HAMZA ABOVE (U+0654). This is identical to LATIN
SMALL LETTER O WITH STROKE (U+00F8), which can be represented with SMALL LETTER O WITH STROKE (U+00F8), which can be represented with
the sequence LATIN SMALL LETTER O (U+006F) followed by COMBINING the sequence LATIN SMALL LETTER O (U+006F) followed by COMBINING
SHORT SOLIDUS OVERLAY (U+0337). SHORT SOLIDUS OVERLAY (U+0337).
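As a non-normative illustration, the following Python sketch checks whether NFC normalization folds each combining sequence into its precomposed look-alike. It uses only the standard unicodedata module. The helper name nfc_composes_to and the contrast pair for U+0623 are assumptions introduced here, not part of the document, and the results depend on the Unicode data bundled with the interpreter.

```python
import unicodedata

def nfc_composes_to(sequence, precomposed):
    """Return True if NFC turns `sequence` into the single code point `precomposed`."""
    return unicodedata.normalize("NFC", sequence) == precomposed

# (precomposed code point, combining sequence that renders alike)
PAIRS = [
    ("\u08A1", "\u0628\u0654"),  # BEH WITH HAMZA ABOVE vs. BEH + HAMZA ABOVE
    ("\u00F8", "\u006F\u0337"),  # O WITH STROKE vs. o + SHORT SOLIDUS OVERLAY
    ("\u0623", "\u0627\u0654"),  # contrast: ALEF WITH HAMZA ABOVE has a canonical decomposition
]

for precomposed, sequence in PAIRS:
    print("U+%04X composed by NFC: %s" % (ord(precomposed),
                                          nfc_composes_to(sequence, precomposed)))
```

The first two pairs are not unified by NFC, so both spellings remain distinct strings; the third pair is included only to show what a canonically equivalent sequence looks like.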
Although the discussion about this specific code point resulted in Although the discussion about this specific code point resulted in
acceptance of the derived property value of PVALID, the underlying acceptance of the derived property value of PVALID, the underlying
problem with combining sequences is not understood fully. Therefore, problem with combining sequences is not understood fully. Therefore,
it cannot be claimed that this case can be extrapolated to other it cannot be claimed that this case can be extrapolated to other
situations and other code points. situations and other code points.
3.2. Changes between Unicode 7.0.0 and 10.0.0 3.2. Changes between Unicode 7.0.0 and 10.0.0
Change in number of characters in each category: Change in number of characters in each category:
Code points that changed derived property value: 0 * Code points that changed derived property value: 0
PVALID changed from 99867 to 122411 (+22544) * PVALID changed from 99867 to 122411 (+22544)
UNASSIGNED changed from 861509 to 837775 (-23734)
CONTEXTJ did not change, at 2 * UNASSIGNED changed from 861509 to 837775 (-23734)
CONTEXTO did not change, at 25 * CONTEXTJ did not change, at 2
DISALLOWED changed from 152709 to 153899 (+1190) * CONTEXTO did not change, at 25
TOTAL did not change, at 1114112 * DISALLOWED changed from 152709 to 153899 (+1190)
There are no changes made to Unicode between version 7.0.0 and * TOTAL did not change, at 1114112
10.0.0 that impact IDNA2008 calculation of the derived property
values. There are no changes made to Unicode between version 7.0.0 and 10.0.0
that impact IDNA2008 calculation of the derived property values.
3.3. Changes between Unicode 10.0.0 and 11.0.0 3.3. Changes between Unicode 10.0.0 and 11.0.0
Change in number of characters in each category: Change in number of characters in each category:
Code points that changed derived property value: 1 * Code points that changed derived property value: 1
PVALID changed from 122411 to 122734 (+323) * PVALID changed from 122411 to 122734 (+323)
UNASSIGNED changed from 837775 to 837091 (-684) * UNASSIGNED changed from 837775 to 837091 (-684)
CONTEXTJ did not change, at 2 * CONTEXTJ did not change, at 2
CONTEXTO did not change, at 25 * CONTEXTO did not change, at 25
DISALLOWED changed from 153899 to 154260 (+361) * DISALLOWED changed from 153899 to 154260 (+361)
TOTAL did not change, at 1114112 * TOTAL did not change, at 1114112
Georgian letters in the ranges U+10D0..U+10FA and U+10FD..U+10FF * Georgian letters in the ranges U+10D0..U+10FA and U+10FD..U+10FF
had their General Properties changed from Lo to Ll, to reflect had their General Category changed from Lo (Other_Letter) to Ll
their status as the lowercase of new Georgian case pairs. Case (Lowercase_Letter) to reflect their status as the lowercase of new
mappings were also added. Georgian case pairs. Case mappings were also added.
SHARADA SANDHI MARK (U+111C9) was changed from Po to Mn, and from * SHARADA SANDHI MARK (U+111C9) General Category was changed from Po
bc=L to bc=NSM. (Other_Punctuation) to Mn (Nonspacing_Mark), and the Bidi property
was changed from L (Left to Right) to NSM (Nonspacing Mark).
The properties for ZANABAZAR SQUARE VOWEL SIGN AI (U+11A07) and * The properties for ZANABAZAR SQUARE VOWEL SIGN AI (U+11A07) and
ZANABZAR SQUARE VOWEL SIGN AU (U+11A08) were corrected from Mc to ZANABAZAR SQUARE VOWEL SIGN AU (U+11A08) were corrected from Mc to
Mn. Mn.
SPHERICAL ANGLE OPENING UP (U+29A1) was changed to Bidi_M=N. * SPHERICAL ANGLE OPENING UP (U+29A1) was changed to Bidi Mirrored
to No.
These changes to the Unicode Standard have the following implications These changes to the Unicode Standard have the following implications
for these code points: for these code points:
o The newly assigned 684 characters are assigned a derived property * The newly assigned 684 characters are assigned a derived property
value as a result of applying the IDNA2008 algorithm. value as a result of applying the IDNA2008 algorithm.
o The Georgian letters in the ranges U+10D0..U+10FA and * The Georgian letters in the ranges U+10D0..U+10FA and
U+10FD..U+10FF existed before IDNA2008 was created. Applying the U+10FD..U+10FF existed before IDNA2008 was created. Applying the
IDNA2008 algorithm to the code points assigned the derived IDNA2008 algorithm to the code points assigned the derived
property value PVALID, and that value is unchanged even if the property value PVALID, and that value is unchanged even if the
underlying Unicode properties have changed. The newly encoded underlying Unicode properties have changed. The newly encoded
Mtavruli letters have general category "Lu" and are therefore Mtavruli letters have General Category Lu (Uppercase_Letter) and
DISALLOWED. are therefore DISALLOWED.
o The U+111C9 SHARADA SANDHI MARK was added to Unicode 8.0.0 * The U+111C9 SHARADA SANDHI MARK was added to Unicode 8.0.0
[Unicode-8.0.0]. Applying the IDNA2008 algorithm to the code [Unicode-8.0.0]. Applying the IDNA2008 algorithm to the code
point assigned the derived property value DISALLOWED. The changes point assigned the derived property value DISALLOWED. The changes
in the underlying properties in the Unicode Standard Version in the underlying properties in Unicode 11.0.0 [Unicode-11.0.0]
11.0.0 [Unicode-11.0.0] caused the derived property value to caused the derived property value to change to PVALID.
change to PVALID.
o The characters ZANABAZAR SQUARE VOWEL SIGN AI (U+11A07) and * The characters ZANABAZAR SQUARE VOWEL SIGN AI (U+11A07) and
ZANABZAR SQUARE VOWEL SIGN AU (U+11A08) were added to Unicode ZANABAZAR SQUARE VOWEL SIGN AU (U+11A08) were added to Unicode
10.0.0 [Unicode-10.0.0]. Applying the IDNA2008 algorithm to the 10.0.0 [Unicode-10.0.0]. Applying the IDNA2008 algorithm to the
code points assigned the derived property value PVALID, and that code points assigned the derived property value PVALID, and that
value is unchanged even if the underlying Unicode properties have value is unchanged even if the underlying Unicode properties have
changed. changed.
o SPHERICAL ANGLE OPENING UP (U+29A1) existed before IDNA2008 was * SPHERICAL ANGLE OPENING UP (U+29A1) existed before IDNA2008 was
created. Applying the IDNA2008 algorithm to the code point created. Applying the IDNA2008 algorithm to the code point
assigned the derived property value DISALLOWED, and that value is assigned the derived property value DISALLOWED, and that value is
unchanged even if the underlying Unicode properties have changed. unchanged even if the underlying Unicode properties have changed.
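The implications listed above can be traced with a deliberately reduced model of the RFC 5892 algorithm. The sketch below keeps only two of its rules (Unstable and LetterDigits) and takes the General Category as an input rather than reading it from a Unicode database, so it does not depend on any particular Unicode version. The function name simplified_value and its parameters are hypothetical; this is not the full algorithm and ignores exceptions, contextual rules, and ignorable code points.

```python
# General Categories that make up the LetterDigits set in RFC 5892.
LETTER_DIGITS = {"Ll", "Lu", "Lo", "Nd", "Lm", "Mn", "Mc"}

def simplified_value(general_category, unstable=False):
    """Reduced model: apply only the Unstable and LetterDigits rules, in that order."""
    if unstable:
        # Code points changed by NFKC/case folding are DISALLOWED.
        return "DISALLOWED"
    if general_category in LETTER_DIGITS:
        return "PVALID"
    return "DISALLOWED"

# (description, category before, category after, unstable under case folding)
CASES = [
    ("U+111C9 SHARADA SANDHI MARK", "Po", "Mn", False),
    ("U+11A07 ZANABAZAR SQUARE VOWEL SIGN AI", "Mc", "Mn", False),
    ("U+10D0 GEORGIAN LETTER AN", "Lo", "Ll", False),
    ("U+29A1 SPHERICAL ANGLE OPENING UP", "Sm", "Sm", False),
    ("Mtavruli letters (new, uppercase)", "Lu", "Lu", True),
]

for name, before, after, unstable in CASES:
    print(name, simplified_value(before, unstable), "->",
          simplified_value(after, unstable))
```

Under these assumptions only U+111C9 changes value, which matches the single changed code point reported for this version step.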
3.4. Changes between Unicode 11.0.0 and 12.0.0 3.4. Changes between Unicode 11.0.0 and 12.0.0
Change in number of characters in each category: Change in number of characters in each category:
Code points that changed derived property value: 0 * Code points that changed derived property value: 0
PVALID changed from 122734 to 123006 (+272) * PVALID changed from 122734 to 123006 (+272)
UNASSIGNED changed from 837091 to 836537 (-554) * UNASSIGNED changed from 837091 to 836537 (-554)
CONTEXTJ did not change, at 2 * CONTEXTJ did not change, at 2
CONTEXTO did not change, at 25 * CONTEXTO did not change, at 25
DISALLOWED changed from 154260 to 154542 (+282) * DISALLOWED changed from 154260 to 154542 (+282)
TOTAL did not change, at 1114112 * TOTAL did not change, at 1114112
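The per-category counts given in Sections 3.2 through 3.4 can be cross-checked with simple arithmetic: for each version the five categories should sum to the fixed total of 1114112 code points, and the per-step differences should cancel out. The Python sketch below does this using only the numbers quoted in this document; the names COUNTS and deltas are hypothetical.

```python
TOTAL = 1114112  # all code points, U+0000..U+10FFFF

# Counts per derived property value, copied from Sections 3.2 to 3.4.
COUNTS = {
    "7.0.0":  {"PVALID": 99867,  "UNASSIGNED": 861509, "CONTEXTJ": 2,
               "CONTEXTO": 25, "DISALLOWED": 152709},
    "10.0.0": {"PVALID": 122411, "UNASSIGNED": 837775, "CONTEXTJ": 2,
               "CONTEXTO": 25, "DISALLOWED": 153899},
    "11.0.0": {"PVALID": 122734, "UNASSIGNED": 837091, "CONTEXTJ": 2,
               "CONTEXTO": 25, "DISALLOWED": 154260},
    "12.0.0": {"PVALID": 123006, "UNASSIGNED": 836537, "CONTEXTJ": 2,
               "CONTEXTO": 25, "DISALLOWED": 154542},
}

def deltas(old, new):
    """Return the per-category change between two versions' counts."""
    return {category: new[category] - old[category] for category in old}

versions = list(COUNTS)
for version in versions:
    # Every version must account for every code point exactly once.
    assert sum(COUNTS[version].values()) == TOTAL, version

for previous, current in zip(versions, versions[1:]):
    step = deltas(COUNTS[previous], COUNTS[current])
    # Code points only move between categories, so the changes sum to zero.
    assert sum(step.values()) == 0
    print(previous, "->", current, step)
```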
4. U+111C9 SHARADA SANDHI MARK 4. U+111C9 SHARADA SANDHI MARK
As one can see in Section 3, an incompatible property change was made As one can see in Section 3, an incompatible property change was made
between Unicode 6.0.0 and 12.0.0, affecting the code point U+111C9. between Unicode 6.0.0 and 12.0.0, affecting the code point U+111C9.
Its derived property value thus changed from DISALLOWED to PVALID. Its derived property value thus changed from DISALLOWED to PVALID.
In situations like these, IDNA2008 allow for addition of rules to RFC In situations like these, IDNA2008 allows for addition of rules to
5892 [RFC5892] section 2.7. If the code point is accepted, it might RFC 5892 [RFC5892], Section 2.7. If the code point is accepted, it
still be rejected if validated by software based on older versions of might still be rejected if validated by software based on versions of
Unicode than 12.0.0. As the character is rarely used outside the Unicode older than 12.0.0. As the character is rarely used outside
group of Sharada specialists, and used in some records for indicating the group of Sharada specialists but is used in some records for
sandhi breaks, the conclusion is that it could either be added as an indicating sandhi breaks, the conclusion was that it could either be
exception or allowed to change its property value, as the use of the added as an exception or allowed to change its property value. As
code point is limited outside a special community. As including an including an exception would require implementation changes to
exception would require implementation changes in deployed deployments of IDNA2008, the IETF has decided not to add a
implementations of IDNA2008, the IETF has decided to not add a BackwardCompatible rule to IDNA2008 (i.e., Section 2.7 of RFC 5892
BackwardCompatible rule to IDNA2008 (i.e. Section 2.7 of RFC 5892 [RFC5892]) for this code point. This also ensures all sandhi marks
[RFC5892] for this code point. This also ensures all sandhi marks are treated equally.
being treated in an equal way.
5. Conclusion 5. Conclusion
As described in Section 3 and Section 4, changes have been made to As described in Sections 3 and 4, changes have been made to Unicode
Unicode between version 6.0.0 and 12.0.0. Some changes to specific between version 6.0.0 and 12.0.0. Some changes to specific
characters changed their derived property value, whereas other characters changed their derived property value, whereas other
changes did not. Given the deployment considerations described in changes did not. Given the deployment considerations described in
Section 2.3 and changes in the Unicode Standard described in Section 2.3 and changes in the Unicode Standard described in Sections
Section 3 and Section 4, including implications to normalization, the 3 and 4, including implications to normalization, the conclusion is
conclusion is to not add any exception rules to IDNA2008. not to add any exception rules to IDNA2008.
This document addresses only changes to Unicode between version 6.0.0 This document addresses only changes to Unicode between version 6.0.0
and version 12.0.0. Changes in future Unicode versions might result and version 12.0.0. Changes in future Unicode versions might result
in the conclusion that exception rules need to be added to IDNA2008 in the conclusion that exception rules need to be added to IDNA2008
after the review process explained in RFC 8753 [RFC8753]. Separately after the review process explained in RFC 8753 [RFC8753]. Separately
from any changes in Unicode, the IETF might conclude that updates to from any changes in Unicode, the IETF might conclude that updates to
RFC 5892 [RFC5892] or other IDNA2008 documents might become RFC 5892 [RFC5892] or other IDNA2008 documents might become
necessary; such updates might include changes to the algorithm necessary; such updates might include changes to the algorithm
specified in IDNA2008 as well as additional rules, categories, or specified in IDNA2008 as well as additional rules, categories, or
other forms of tuning, like the clarifications in RFC 8753 [RFC8753]. other forms of tuning, like the clarifications in RFC 8753 [RFC8753].
6. IANA Considerations 6. IANA Considerations
IANA is requested to update the IDNA Parameters registry [IANA-IDNA] IANA updated the "IDNA Rules and Derived Property Values" [IANA-IDNA]
of derived property values, after the expert reviewer validates that registry after the expert reviewer validated that the derived
the derived property values are calculated correctly. property values were calculated correctly.
7. Security Considerations 7. Security Considerations
This document makes recommendations regarding the use of the IDNA2008 This document makes recommendations regarding the use of the IDNA2008
algorithm for calculation of derived property values, based on algorithm for calculation of derived property values, based on
Unicode version 12.0.0. This recommendation does not say anything Unicode version 12.0.0. This recommendation does not say anything
about what recommendations to make for future versions of the Unicode about what recommendations to make for future versions of the Unicode
Standard. Standard.
Not following these recommendations can lead to various security Not following these recommendations can lead to various security
issues. Specifically, allowing confusable characters may lead to issues. Specifically, allowing confusable characters may lead to
various phishing attacks, as described in the Security Considerations various phishing attacks, as described in the Security Considerations
sections of the documents listed in Section 2.1. sections of the documents listed in Section 2.1.
8. Acknowledgements 8. References
Thanks to Harald Alvestrand, Marc Blanchet, Martin Duerst, Asmus
Freytag, Ted Hardie, John Klensin, Erik Nordmark, Pete Resnick, Peter
Saint-Andre, Michel Suignard, Andrew Sullivan and Suzanne Woolf for
input to this document.
9. References
9.1. Normative References 8.1. Normative References
[RFC3491] Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep [RFC3491] Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep
Profile for Internationalized Domain Names (IDN)", Profile for Internationalized Domain Names (IDN)",
RFC 3491, DOI 10.17487/RFC3491, March 2003, RFC 3491, DOI 10.17487/RFC3491, March 2003,
<https://www.rfc-editor.org/info/rfc3491>. <https://www.rfc-editor.org/info/rfc3491>.
[RFC5890] Klensin, J., "Internationalized Domain Names for [RFC5890] Klensin, J., "Internationalized Domain Names for
Applications (IDNA): Definitions and Document Framework", Applications (IDNA): Definitions and Document Framework",
RFC 5890, DOI 10.17487/RFC5890, August 2010, RFC 5890, DOI 10.17487/RFC5890, August 2010,
<https://www.rfc-editor.org/info/rfc5890>. <https://www.rfc-editor.org/info/rfc5890>.
skipping to change at page 13, line 10 skipping to change at line 557
[RFC5893] Alvestrand, H., Ed. and C. Karp, "Right-to-Left Scripts [RFC5893] Alvestrand, H., Ed. and C. Karp, "Right-to-Left Scripts
for Internationalized Domain Names for Applications for Internationalized Domain Names for Applications
(IDNA)", RFC 5893, DOI 10.17487/RFC5893, August 2010, (IDNA)", RFC 5893, DOI 10.17487/RFC5893, August 2010,
<https://www.rfc-editor.org/info/rfc5893>. <https://www.rfc-editor.org/info/rfc5893>.
[RFC6452] Faltstrom, P., Ed. and P. Hoffman, Ed., "The Unicode Code [RFC6452] Faltstrom, P., Ed. and P. Hoffman, Ed., "The Unicode Code
Points and Internationalized Domain Names for Applications Points and Internationalized Domain Names for Applications
(IDNA) - Unicode 6.0", RFC 6452, DOI 10.17487/RFC6452, (IDNA) - Unicode 6.0", RFC 6452, DOI 10.17487/RFC6452,
November 2011, <https://www.rfc-editor.org/info/rfc6452>. November 2011, <https://www.rfc-editor.org/info/rfc6452>.
9.2. Non-normative references 8.2. Informative References
[IAB2005-1] [IAB2005-1]
Internet Architecture Board, "IAB Statement on Identifiers Internet Architecture Board, "IAB Statement on Identifiers
and Unicode 7.0.0", IAB Statement on Identifiers and and Unicode 7.0.0", 27 January 2015,
Unicode 7.0.0
<https://www.iab.org/documents/correspondence-reports- <https://www.iab.org/documents/correspondence-reports-
documents/2015-2/iab-statement-on-identifiers-and-unicode- documents/2015-2/iab-statement-on-identifiers-and-unicode-
7-0-0/archive/>, January 2015. 7-0-0/archive/>.
[IAB2005-2] [IAB2005-2]
Internet Architecture Board, "IAB Statement on Identifiers Internet Architecture Board, "IAB Statement on Identifiers
and Unicode 7.0.0", IAB Statement on Identifiers and and Unicode 7.0.0", 11 February 2015,
Unicode 7.0.0
<https://www.iab.org/documents/correspondence-reports- <https://www.iab.org/documents/correspondence-reports-
documents/2015-2/iab-statement-on-identifiers-and-unicode- documents/2015-2/iab-statement-on-identifiers-and-unicode-
7-0-0/>, February 2015. 7-0-0/>.
[IANA-IDNA] [IANA-IDNA]
IANA, "IDNA Rules and Derived Property Values", IDNA Rules IANA, "IDNA Rules and Derived Property Values", February
and Derived Property Values 2022,
<https://www.iana.org/assignments/idna-tables-6.0.0/idna- <https://www.iana.org/assignments/idna-tables-12.0.0/>.
tables-6.0.0.xhtml>, April 2020.
[IDNA7] Klensin, J. and P. Faltstrom, "IDNA Update for Unicode 7.0 [IDNA7] Klensin, J. C. and P. Faltstrom, "IDNA Update for Unicode
and Later Versions", draft-klensin-idna-5892upd-unicode70 7.0 and Later Versions", Work in Progress, Internet-Draft,
<https://datatracker.ietf.org/doc/draft-klensin-idna- draft-klensin-idna-5892upd-unicode70-05, 8 October 2017,
5892upd-unicode70/>, October 2017. <https://datatracker.ietf.org/doc/html/draft-klensin-idna-
5892upd-unicode70-05>.
[RFC3454] Hoffman, P. and M. Blanchet, "Preparation of [RFC3454] Hoffman, P. and M. Blanchet, "Preparation of
Internationalized Strings ("stringprep")", RFC 3454, Internationalized Strings ("stringprep")", RFC 3454,
DOI 10.17487/RFC3454, December 2002, DOI 10.17487/RFC3454, December 2002,
<https://www.rfc-editor.org/info/rfc3454>. <https://www.rfc-editor.org/info/rfc3454>.
[RFC3490] Faltstrom, P., Hoffman, P., and A. Costello, [RFC3490] Faltstrom, P., Hoffman, P., and A. Costello,
"Internationalizing Domain Names in Applications (IDNA)", "Internationalizing Domain Names in Applications (IDNA)",
RFC 3490, DOI 10.17487/RFC3490, March 2003, RFC 3490, DOI 10.17487/RFC3490, March 2003,
<https://www.rfc-editor.org/info/rfc3490>. <https://www.rfc-editor.org/info/rfc3490>.
skipping to change at page 14, line 15 skipping to change at line 609
[RFC5895] Resnick, P. and P. Hoffman, "Mapping Characters for [RFC5895] Resnick, P. and P. Hoffman, "Mapping Characters for
Internationalized Domain Names in Applications (IDNA) Internationalized Domain Names in Applications (IDNA)
2008", RFC 5895, DOI 10.17487/RFC5895, September 2010, 2008", RFC 5895, DOI 10.17487/RFC5895, September 2010,
<https://www.rfc-editor.org/info/rfc5895>. <https://www.rfc-editor.org/info/rfc5895>.
[RFC6912] Sullivan, A., Thaler, D., Klensin, J., and O. Kolkman, [RFC6912] Sullivan, A., Thaler, D., Klensin, J., and O. Kolkman,
"Principles for Unicode Code Point Inclusion in Labels in "Principles for Unicode Code Point Inclusion in Labels in
the DNS", RFC 6912, DOI 10.17487/RFC6912, April 2013, the DNS", RFC 6912, DOI 10.17487/RFC6912, April 2013,
<https://www.rfc-editor.org/info/rfc6912>. <https://www.rfc-editor.org/info/rfc6912>.
[RFC8753] Klensin, J. and P. Faeltstroem, "Internationalized Domain [RFC8753] Klensin, J. and P. Fältström, "Internationalized Domain
Names for Applications (IDNA) Review for New Unicode Names for Applications (IDNA) Review for New Unicode
Versions", RFC 8753, DOI 10.17487/RFC8753, April 2020, Versions", RFC 8753, DOI 10.17487/RFC8753, April 2020,
<https://www.rfc-editor.org/info/rfc8753>. <https://www.rfc-editor.org/info/rfc8753>.
[SAC-084] The Security and Stability Advisory Committee, "SAC084", [SAC-084] The Security and Stability Advisory Committee, "SAC084",
SSAC Comments on Guidelines for the Extended Process SSAC Comments on Guidelines for the Extended Process
Similarity Review Panel for the IDN ccTLD Fast Track Similarity Review Panel for the IDN ccTLD Fast Track
Process <https://www.icann.org/en/system/files/files/sac- Process, August 2016,
084-en.pdf>, August 2016. <https://www.icann.org/en/system/files/files/sac-
084-en.pdf>.
[Unicode-3.2.0] [Unicode-3.2.0]
The Unicode Consortium, "The Unicode Standard, Version The Unicode Consortium, "The Unicode Standard, Version
3.2.0", The Unicode Standard, Version 3.2.0 ISBN 3.2.0", Mountain View: The Unicode Consortium,
0-201-61633-5, March 2002. ISBN 0-201-61633-5, March 2002,
<https://www.unicode.org/versions/Unicode3.2.0/>.
[Unicode-5.2.0] [Unicode-5.2.0]
The Unicode Consortium, "The Unicode Standard, Version The Unicode Consortium, "The Unicode Standard, Version
5.2.0", The Unicode Standard, Version 5.2.0 ISBN 5.2.0", Mountain View: The Unicode Consortium,
978-1-936213-00-9, October 2009. ISBN 978-1-936213-00-9, October 2009,
<https://www.unicode.org/versions/Unicode5.2.0/>.
[Unicode-6.0.0] [Unicode-6.0.0]
The Unicode Consortium, "The Unicode Standard, Version The Unicode Consortium, "The Unicode Standard, Version
6.0.0", The Unicode Standard, Version 6.0.0 ISBN 6.0.0", Mountain View: The Unicode Consortium,
978-1-936213-01-6, October 2011. ISBN 978-1-936213-01-6, October 2011,
<https://www.unicode.org/versions/Unicode6.0.0/>.
[Unicode-7.0.0] [Unicode-7.0.0]
The Unicode Consortium, "The Unicode Standard, Version The Unicode Consortium, "The Unicode Standard, Version
7.0.0", The Unicode Standard, Version 7.0.0 ISBN 7.0.0", Mountain View: The Unicode Consortium,
978-1-936213-09-2, June 2014. ISBN 978-1-936213-09-2, June 2014,
<https://www.unicode.org/versions/Unicode7.0.0/>.
[Unicode-8.0.0] [Unicode-8.0.0]
The Unicode Consortium, "The Unicode Standard, Version The Unicode Consortium, "The Unicode Standard, Version
8.0.0", The Unicode Standard, Version 8.0.0 ISBN 8.0.0", Mountain View: The Unicode Consortium,
978-1-936213-10-8, June 2015. ISBN 978-1-936213-10-8, June 2015,
<https://www.unicode.org/versions/Unicode8.0.0/>.
[Unicode-10.0.0] [Unicode-10.0.0]
The Unicode Consortium, "The Unicode Standard, Version The Unicode Consortium, "The Unicode Standard, Version
10.0.0", The Unicode Standard, Version 10.0.0 ISBN 10.0.0", Mountain View: The Unicode Consortium,
978-1-936213-16-0, June 2017. ISBN 978-1-936213-16-0, June 2017,
<https://www.unicode.org/versions/Unicode10.0.0/>.
[Unicode-11.0.0] [Unicode-11.0.0]
The Unicode Consortium, "The Unicode Standard, Version The Unicode Consortium, "The Unicode Standard, Version
11.0.0", The Unicode Standard, Version 11.0.0 ISBN 11.0.0", Mountain View: The Unicode Consortium,
978-1-936213-19-1, June 2018. ISBN 978-1-936213-19-1, June 2018,
<https://www.unicode.org/versions/Unicode11.0.0/>.
[Unicode-12.0.0] [Unicode-12.0.0]
The Unicode Consortium, "The Unicode Standard, Version The Unicode Consortium, "The Unicode Standard, Version
12.0.0", The Unicode Standard, Version 12.0.0 ISBN 12.0.0", Mountain View: The Unicode Consortium,
978-1-936213-22-1, March 2019. ISBN 978-1-936213-22-1, March 2019,
<https://www.unicode.org/versions/Unicode12.0.0/>.
[UTS-46] The Unicode Consortium, "Unicode Technical Standard #46, [UTS-46] The Unicode Consortium, "Unicode Technical Standard #46,
Version 12.0.0", UNICODE IDNA COMPATIBILITY Version 12.0.0", UNICODE IDNA COMPATIBILITY PROCESSING,
PROCESSING <https://www.unicode.org/reports/tr46/>, March March 2019,
2019. <https://www.unicode.org/reports/tr46/tr46-23.html>.
Appendix A. Changes from Unicode 6.0.0 to Unicode 7.0.0 Appendix A. Changes from Unicode 6.0.0 to Unicode 7.0.0
Changes from derived property value UNASSIGNED to either PVALID or Changes from derived property value UNASSIGNED to either PVALID or
DISALLOWED. DISALLOWED.
037F ; DISALLOWED # GREEK CAPITAL LETTER YOT 037F ; DISALLOWED # GREEK CAPITAL LETTER YOT
0528 ; DISALLOWED # CYRILLIC CAPITAL LETTER EN WITH LEFT HOOK 0528 ; DISALLOWED # CYRILLIC CAPITAL LETTER EN WITH LEFT HOOK
0529 ; PVALID # CYRILLIC SMALL LETTER EN WITH LEFT HOOK 0529 ; PVALID # CYRILLIC SMALL LETTER EN WITH LEFT HOOK
052A ; DISALLOWED # CYRILLIC CAPITAL LETTER DZZHE 052A ; DISALLOWED # CYRILLIC CAPITAL LETTER DZZHE
skipping to change at page 29, line 15 skipping to change at line 1306
1F9AE..1F9AF; DISALLOWED # GUIDE DOG..PROBING CANE 1F9AE..1F9AF; DISALLOWED # GUIDE DOG..PROBING CANE
1F9BA..1F9BF; DISALLOWED # SAFETY VEST..MECHANICAL LEG 1F9BA..1F9BF; DISALLOWED # SAFETY VEST..MECHANICAL LEG
1F9C3..1F9CA; DISALLOWED # BEVERAGE BOX..ICE CUBE 1F9C3..1F9CA; DISALLOWED # BEVERAGE BOX..ICE CUBE
1F9CD..1F9CF; DISALLOWED # STANDING PERSON..DEAF PERSON 1F9CD..1F9CF; DISALLOWED # STANDING PERSON..DEAF PERSON
1FA00..1FA53; DISALLOWED # NEUTRAL CHESS KING..BLACK CHESS KNIGHT-BISHOP 1FA00..1FA53; DISALLOWED # NEUTRAL CHESS KING..BLACK CHESS KNIGHT-BISHOP
1FA70..1FA73; DISALLOWED # BALLET SHOES..SHORTS 1FA70..1FA73; DISALLOWED # BALLET SHOES..SHORTS
1FA78..1FA7A; DISALLOWED # DROP OF BLOOD..STETHOSCOPE 1FA78..1FA7A; DISALLOWED # DROP OF BLOOD..STETHOSCOPE
1FA80..1FA82; DISALLOWED # YO-YO..PARACHUTE 1FA80..1FA82; DISALLOWED # YO-YO..PARACHUTE
1FA90..1FA95; DISALLOWED # RINGED PLANET..BANJO 1FA90..1FA95; DISALLOWED # RINGED PLANET..BANJO
Acknowledgments
Thanks to Harald Alvestrand, Marc Blanchet, Martin Dürst, Asmus
Freytag, Ted Hardie, John Klensin, Erik Nordmark, Pete Resnick, Peter
Saint-Andre, Michel Suignard, Andrew Sullivan, and Suzanne Woolf for
input to this document.
Author's Address Author's Address
Patrik Faltstrom Patrik Fältström
Netnod Netnod
Email: paf@netnod.se Email: paf@netnod.se
 End of changes. 118 change blocks. 
257 lines changed or deleted 259 lines changed or added
