Metadata-Version: 2.1
Name: idna
Version: 3.3
Summary: Internationalized Domain Names in Applications (IDNA)
Home-page: https://github.com/kjd/idna
Author: Kim Davies
Author-email: kim@cynosure.com.au
License: BSD-3-Clause
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: System Administrators
Classifier: License :: OSI Approved :: BSD License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
Classifier: Topic :: Internet :: Name Service (DNS)
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Utilities
Requires-Python: >=3.5
License-File: LICENSE.md

Internationalized Domain Names in Applications (IDNA)
=====================================================

Support for the Internationalized Domain Names in Applications
(IDNA) protocol as specified in `RFC 5891 <https://tools.ietf.org/html/rfc5891>`_.
This is the latest version of the protocol and is sometimes referred to as
“IDNA 2008”.

This library also provides support for Unicode Technical Standard 46,
`Unicode IDNA Compatibility Processing <https://unicode.org/reports/tr46/>`_.

This library is a suitable replacement for the ``encodings.idna`` module that
comes with the Python standard library, which supports only the
older, superseded IDNA specification (`RFC 3490 <https://tools.ietf.org/html/rfc3490>`_).

Basic usage is straightforward:

.. code-block:: pycon

    >>> import idna
    >>> idna.encode('ドメイン.テスト')
    b'xn--eckwd4c7c.xn--zckzah'
    >>> print(idna.decode('xn--eckwd4c7c.xn--zckzah'))
    ドメイン.テスト


Installation
------------

To install this library, you can use pip:

.. code-block:: bash

    $ pip install idna

Alternatively, you can install the package using the bundled setup script:

.. code-block:: bash

    $ python setup.py install


Usage
-----

For typical usage, the ``encode`` and ``decode`` functions take a domain
name argument and convert it to A-labels or U-labels respectively.

.. code-block:: pycon

    >>> import idna
    >>> idna.encode('ドメイン.テスト')
    b'xn--eckwd4c7c.xn--zckzah'
    >>> print(idna.decode('xn--eckwd4c7c.xn--zckzah'))
    ドメイン.テスト

You may also use the codec encoding and decoding methods by importing the
``idna.codec`` module:

.. code-block:: pycon

    >>> import idna.codec
    >>> print('домен.испытание'.encode('idna'))
    b'xn--d1acufc.xn--80akhbyknj4f'
    >>> print(b'xn--d1acufc.xn--80akhbyknj4f'.decode('idna'))
    домен.испытание

Conversions can be applied on a per-label basis using the ``ulabel`` or ``alabel``
functions if necessary:

.. code-block:: pycon

    >>> idna.alabel('测试')
    b'xn--0zwm56d'

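As a minimal sketch of the per-label functions, the following converts a single
label in both directions. It assumes the ``idna`` package is installed; the helper
name ``roundtrip_label`` is hypothetical and exists only for this illustration.

```python
import idna


def roundtrip_label(label: str) -> str:
    """Convert one U-label to its A-label and back again (illustrative helper)."""
    a_label = idna.alabel(label)   # bytes, e.g. b'xn--0zwm56d' for '测试'
    return idna.ulabel(a_label)    # back to the Unicode form of the label


print(roundtrip_label('测试'))
```

Both functions work on one label at a time; splitting a full domain name on its
dots is left to ``encode`` and ``decode``.
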
Compatibility Mapping (UTS #46)
+++++++++++++++++++++++++++++++

As described in `RFC 5895 <https://tools.ietf.org/html/rfc5895>`_, the IDNA
specification does not normalize the different ways in which a user
may enter a domain name. This functionality, known as a “mapping”, is
considered by the specification to be a local user-interface issue distinct
from IDNA conversion functionality.

This library provides one such mapping, which was developed by the Unicode
Consortium. Known as `Unicode IDNA Compatibility Processing <https://unicode.org/reports/tr46/>`_,
it provides both a regular mapping for typical applications and
a transitional mapping to help migrate from older IDNA 2003 applications.

For example, “Königsgäßchen” is not a permissible label as *LATIN CAPITAL
LETTER K* is not allowed (nor are capital letters in general). UTS 46 will
convert this into lower case prior to applying the IDNA conversion.

.. code-block:: pycon

    >>> import idna
    >>> idna.encode('Königsgäßchen')
    ...
    idna.core.InvalidCodepoint: Codepoint U+004B at position 1 of 'Königsgäßchen' not allowed
    >>> idna.encode('Königsgäßchen', uts46=True)
    b'xn--knigsgchen-b4a3dun'
    >>> print(idna.decode('xn--knigsgchen-b4a3dun'))
    königsgäßchen

Transitional processing provides conversions to help transition from the older
2003 standard to the current standard. For example, in the original IDNA
specification, the *LATIN SMALL LETTER SHARP S* (ß) was converted into two
*LATIN SMALL LETTER S* (ss), whereas in the current IDNA specification this
conversion is not performed.

.. code-block:: pycon

    >>> idna.encode('Königsgäßchen', uts46=True, transitional=True)
    b'xn--knigsgsschen-lcb0w'

Implementors should use transitional processing with caution, and only in the rare
cases where legacy labels must be converted to current labels
(i.e. for IDNA implementations that pre-date 2008). For typical applications
that just need to convert labels, transitional processing is unlikely to be
beneficial and could produce unexpected, incompatible results.

``encodings.idna`` Compatibility
++++++++++++++++++++++++++++++++

The ``idna.compat`` module maps function calls from Python's built-in
``encodings.idna`` module to their IDNA 2008 equivalents.
Simply change the ``import`` statement in your code to refer to the
new module name.

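A short sketch of that substitution, assuming the ``idna`` package is installed.
``ToASCII`` and ``ToUnicode`` are the function names used by ``encodings.idna``;
the sample label is taken from the per-label example earlier in this document.

```python
# Previously: from encodings.idna import ToASCII, ToUnicode
from idna.compat import ToASCII, ToUnicode

a_label = ToASCII('测试')      # IDNA 2008 conversion to an A-label (bytes)
u_label = ToUnicode(a_label)   # and back to the Unicode form

print(a_label, u_label)
```

Only the import line changes; the calls themselves keep their existing names.
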
Exceptions
----------

All errors that occur during conversion according to the specification
raise an exception derived from the ``idna.IDNAError`` base class.

More specific exceptions may be raised: ``idna.IDNABidiError``
when the error reflects an illegal combination of left-to-right and
right-to-left characters in a label; ``idna.InvalidCodepoint`` when
a specific codepoint is an illegal character in an IDN label (i.e.
INVALID); and ``idna.InvalidCodepointContext`` when the codepoint is
illegal based on its positional context (i.e. it is CONTEXTO or CONTEXTJ
but the contextual requirements are not satisfied).

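The hierarchy above suggests catching the specific classes first and the base
class last. The following is a minimal sketch under that assumption, with the
``idna`` package installed; ``describe_failure`` is a hypothetical helper, not
part of the library.

```python
import idna


def describe_failure(domain: str) -> str:
    """Report how encoding `domain` fails, or 'ok' if it succeeds."""
    try:
        idna.encode(domain)
    except idna.InvalidCodepoint as exc:
        # Most specific case first: a character that is never permitted.
        return 'invalid codepoint: {}'.format(exc)
    except idna.IDNAError as exc:
        # Base class: also covers IDNABidiError and InvalidCodepointContext.
        return 'other IDNA error: {}'.format(exc)
    return 'ok'


print(describe_failure('Königsgäßchen'))   # capital K is not permitted
print(describe_failure('example.com'))
```

Applications that do not need to distinguish the cases can catch
``idna.IDNAError`` alone.
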
Building and Diagnostics
------------------------

The IDNA and UTS 46 functionality relies upon pre-calculated lookup
tables for performance. These tables are derived by evaluating codepoints against
the eligibility criteria in the respective standards. They are
computed using the command-line script ``tools/idna-data``.

This tool will fetch relevant codepoint data from the Unicode repository
and perform the required calculations to identify eligibility. There are
three main modes:

* ``idna-data make-libdata``. Generates ``idnadata.py`` and ``uts46data.py``,
  the pre-calculated lookup tables used for IDNA and UTS 46 conversions. Implementors
  who wish to track this library against a different Unicode version may use this tool
  to manually generate a different version of the ``idnadata.py`` and ``uts46data.py``
  files.

* ``idna-data make-table``. Generates a table of the IDNA disposition
  (e.g. PVALID, CONTEXTJ, CONTEXTO) in the format found in Appendix B.1 of RFC
  5892 and the pre-computed tables published by `IANA <https://www.iana.org/>`_.

* ``idna-data U+0061``. Prints debugging output on the various properties
  associated with an individual Unicode codepoint (in this case, U+0061) that are
  used to assess the IDNA and UTS 46 status of a codepoint. This is helpful in debugging
  or analysis.

The tool accepts a number of arguments, described using ``idna-data -h``. Most notably,
the ``--version`` argument specifies the version of Unicode to use
in computing the table data. For example, ``idna-data --version 9.0.0 make-libdata``
will generate library data against Unicode 9.0.0.


Additional Notes
----------------

* **Packages**. The latest tagged release version is published in the
  `Python Package Index <https://pypi.org/project/idna/>`_.

* **Version support**. This library supports Python 3.5 and higher. As this library
  serves as a low-level toolkit for a variety of applications, many of which strive
  for broad compatibility with older Python versions, there is no rush to remove
  older interpreter support. Removing support for older versions should be well
  justified in that the maintenance burden has become too high.

* **Python 2**. Python 2 is supported by version 2.x of this library. While active
  development of the version 2.x series has ended, fixes for notable issues
  may be backported to 2.x. Use "idna<3" in your requirements file if you need this
  library for a Python 2 application.

* **Testing**. The library has a test suite based on each rule of the IDNA specification, as
  well as tests that are provided as part of the Unicode Technical Standard 46,
  `Unicode IDNA Compatibility Processing <https://unicode.org/reports/tr46/>`_.

* **Emoji**. It is an occasional request to support emoji domains in this library. Encoding
  of symbols like emoji is expressly prohibited by the technical standard IDNA 2008, and
  emoji domains are broadly phased out across the domain industry due to associated security
  risks. For now, applications that need to support these non-compliant labels may
  wish to consider trying the encode/decode operation in this library first, and then falling
  back to using ``encodings.idna``. See `the GitHub project <https://github.com/kjd/idna/issues/18>`_
  for more discussion.

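The fallback suggested in the Emoji note above could look roughly like the
following. This is a sketch, not a recommendation: it assumes the ``idna`` package
is installed, ``encode_with_fallback`` is a hypothetical helper name, and the
built-in codec used for the fallback follows the older IDNA 2003 rules, so the two
paths do not always agree on how a label is mapped.

```python
import idna


def encode_with_fallback(domain: str) -> bytes:
    """Try IDNA 2008 first; fall back to the standard library's IDNA 2003 codec."""
    try:
        return idna.encode(domain)
    except idna.IDNAError:
        # Non-compliant label (for example a symbol): use the built-in
        # encodings.idna codec, which implements the older RFC 3490.
        return domain.encode('idna')


print(encode_with_fallback('ドメイン.テスト'))   # handled by the idna package
print(encode_with_fallback('☃.example'))        # symbol label, handled by the fallback
```

The built-in codec can itself raise ``UnicodeError`` for input it rejects, so
callers may still need their own error handling around the helper.
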