Metadata-Version: 2.1
Name: bleach
Version: 5.0.1
Summary: An easy safelist-based HTML-sanitizing tool.
Home-page: https://github.com/mozilla/bleach
Maintainer: Will Kahn-Greene
Maintainer-email: willkg@mozilla.com
License: Apache Software License
Classifier: Development Status :: 5 - Production/Stable
Classifier: Environment :: Web Environment
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.7
License-File: LICENSE
Requires-Dist: six (>=1.9.0)
Requires-Dist: webencodings
Provides-Extra: css
Requires-Dist: tinycss2 (<1.2,>=1.1.0) ; extra == 'css'
Provides-Extra: dev
Requires-Dist: build (==0.8.0) ; extra == 'dev'
Requires-Dist: flake8 (==4.0.1) ; extra == 'dev'
Requires-Dist: hashin (==0.17.0) ; extra == 'dev'
Requires-Dist: pip-tools (==6.6.2) ; extra == 'dev'
Requires-Dist: pytest (==7.1.2) ; extra == 'dev'
Requires-Dist: Sphinx (==4.3.2) ; extra == 'dev'
Requires-Dist: tox (==3.25.0) ; extra == 'dev'
Requires-Dist: twine (==4.0.1) ; extra == 'dev'
Requires-Dist: wheel (==0.37.1) ; extra == 'dev'
Requires-Dist: black (==22.3.0) ; (implementation_name == "cpython") and extra == 'dev'
Requires-Dist: mypy (==0.961) ; (implementation_name == "cpython") and extra == 'dev'

======
Bleach
======

.. image:: https://github.com/mozilla/bleach/workflows/Test/badge.svg
   :target: https://github.com/mozilla/bleach/actions?query=workflow%3ATest

.. image:: https://github.com/mozilla/bleach/workflows/Lint/badge.svg
   :target: https://github.com/mozilla/bleach/actions?query=workflow%3ALint

.. image:: https://badge.fury.io/py/bleach.svg
   :target: http://badge.fury.io/py/bleach
|
|
|
|
Bleach is an allowed-list-based HTML sanitizing library that escapes or strips
|
|
markup and attributes.
|
|
|
|
Bleach can also linkify text safely, applying filters that Django's ``urlize``
|
|
filter cannot, and optionally setting ``rel`` attributes, even on links already
|
|
in the text.
|
|
|
|
Bleach is intended for sanitizing text from *untrusted* sources. If you find
|
|
yourself jumping through hoops to allow your site administrators to do lots of
|
|
things, you're probably outside the use cases. Either trust those users, or
|
|
don't.
|
|
|
|
Because it relies on html5lib_, Bleach is as good as modern browsers at dealing
|
|
with weird, quirky HTML fragments. And *any* of Bleach's methods will fix
|
|
unbalanced or mis-nested tags.
|
|
|
|
The version on GitHub_ is the most up-to-date and contains the latest bug
|
|
fixes. You can find full documentation on `ReadTheDocs`_.
|
|
|
|
:Code: https://github.com/mozilla/bleach
|
|
:Documentation: https://bleach.readthedocs.io/
|
|
:Issue tracker: https://github.com/mozilla/bleach/issues
|
|
:License: Apache License v2; see LICENSE file
|
|
|
|
|
|
Reporting Bugs
|
|
==============
|
|
|
|
For regular bugs, please report them `in our issue tracker
|
|
<https://github.com/mozilla/bleach/issues>`_.
|
|
|
|
If you believe that you've found a security vulnerability, please `file a secure
|
|
bug report in our bug tracker
|
|
<https://bugzilla.mozilla.org/enter_bug.cgi?assigned_to=nobody%40mozilla.org&product=Webtools&component=Bleach-security&groups=webtools-security>`_
|
|
or send an email to *security AT mozilla DOT org*.
|
|
|
|
For more information on security-related bug disclosure and the PGP key to use
|
|
for sending encrypted mail or to verify responses received from that address,
|
|
please read our wiki page at
|
|
`<https://www.mozilla.org/en-US/security/#For_Developers>`_.
|
|
|
|
|
|
Security
|
|
========
|
|
|
|
Bleach is a security-focused library.
|
|
|
|
We have a responsible security vulnerability reporting process. Please use
|
|
that if you're reporting a security issue.
|
|
|
|
Security issues are fixed in private. After we land such a fix, we'll do a
|
|
release.
|
|
|
|
For every release, we mark security issues we've fixed in the ``CHANGES`` in
|
|
the **Security issues** section. We include any relevant CVE links.
|
|
|
|
|
|
Installing Bleach
|
|
=================
|
|
|
|
Bleach is available on PyPI_, so you can install it with ``pip``::
|
|
|
|
    $ pip install bleach
|
|
|
|
|
|
Upgrading Bleach
|
|
================
|
|
|
|
.. warning::

   Before doing any upgrades, read through `Bleach Changes
   <https://bleach.readthedocs.io/en/latest/changes.html>`_ for backwards
   incompatible changes, newer versions, etc.
|
|
|
|
Bleach follows `semver 2`_ versioning. Vendored libraries will not
|
|
be changed in patch releases.
|
|
|
|
|
|
Basic use
|
|
=========
|
|
|
|
The simplest way to use Bleach is:
|
|
|
|
.. code-block:: python

    >>> import bleach

    >>> bleach.clean('an <script>evil()</script> example')
    'an &lt;script&gt;evil()&lt;/script&gt; example'

    >>> bleach.linkify('an http://example.com url')
    'an <a href="http://example.com" rel="nofollow">http://example.com</a> url'
|
|
|
|
|
|
Code of Conduct
|
|
===============
|
|
|
|
This project and repository are governed by Mozilla's code of conduct and
etiquette guidelines. For more details, please see the `CODE_OF_CONDUCT.md
</CODE_OF_CONDUCT.md>`_.
|
|
|
|
|
|
.. _html5lib: https://github.com/html5lib/html5lib-python
|
|
.. _GitHub: https://github.com/mozilla/bleach
|
|
.. _ReadTheDocs: https://bleach.readthedocs.io/
|
|
.. _PyPI: https://pypi.org/project/bleach/
|
|
.. _semver 2: https://semver.org/
|
|
|
|
|
|
Bleach changes
|
|
==============
|
|
|
|
Version 5.0.1 (June 27th, 2022)
|
|
-------------------------------
|
|
|
|
**Bugs**
|
|
|
|
* Add missing comma to tinycss2 require. Thank you, @shadchin!
|
|
|
|
* Add url parse tests based on wpt url tests. (#688)
|
|
|
|
* Support scheme-less urls if "https" is in allow list. (#662)
|
|
|
|
* Handle escaping ``<`` in edge cases where it doesn't start a tag. (#544)
|
|
|
|
* Fix reference warnings in docs. (#660)
|
|
|
|
* Correctly urlencode email address parts. Thank you, @larseggert! (#659)
|
|
|
|
|
|
Version 5.0.0 (April 7th, 2022)
|
|
-------------------------------
|
|
|
|
**Backwards incompatible changes**
|
|
|
|
* ``clean`` and ``linkify`` now preserve the order of HTML attributes. Thank
|
|
you, @askoretskly! (#566)
|
|
|
|
* Drop support for Python 3.6. Thank you, @hugovk! (#629)
|
|
|
|
* CSS sanitization in style tags is completely different now. If you're using
|
|
Bleach ``clean`` to sanitize css in style tags, you'll need to update your
|
|
code and you'll need to install the ``css`` extras::
|
|
|
|
    pip install 'bleach[css]'
|
|
|
|
See `the documentation on sanitizing CSS for how to do it
|
|
<https://bleach.readthedocs.io/en/latest/clean.html#sanitizing-css>`_. (#633)
|
|
|
|
**Bug fixes**
|
|
|
|
* Rework dev dependencies. We no longer have
|
|
``requirements-dev.in``/``requirements-dev.txt``. Instead, we're using
|
|
``dev`` extras.
|
|
|
|
See `development docs <https://bleach.readthedocs.io/en/latest/dev.html>`_
|
|
for more details. (#620)
|
|
|
|
* Add newline when dropping block-level tags. Thank you, @jvanasco! (#369)
|
|
|
|
Version 4.1.0 (August 25th, 2021)
|
|
---------------------------------
|
|
|
|
**Features**
|
|
|
|
* Python 3.9 support
|
|
|
|
**Bug fixes**
|
|
|
|
* Update sanitizer clean to use vendored 3.6.14 stdlib urllib.parse to
|
|
fix test failures on Python 3.9. (#536)
|
|
|
|
Version 4.0.0 (August 3rd, 2021)
|
|
--------------------------------
|
|
|
|
**Backwards incompatible changes**
|
|
|
|
* Drop support for unsupported Python versions <3.6. (#520)
|
|
|
|
**Security fixes**
|
|
|
|
None
|
|
|
|
**Features**
|
|
|
|
* fix attribute name in the linkify docs (thanks @CheesyFeet!)
|
|
|
|
Version 3.3.1 (July 14th, 2021)
|
|
-------------------------------
|
|
|
|
**Security fixes**
|
|
|
|
None
|
|
|
|
**Features**
|
|
|
|
* add more tests for CVE-2021-23980 / GHSA-vv2x-vrpj-qqpq
|
|
* bump python version to 3.8 for tox doc, vendorverify, and lint targets
|
|
* update bug report template tag
|
|
* update vendorverify script to detect and fail when extra files are vendored
|
|
* update release process docs to check vendorverify passes locally
|
|
|
|
**Bug fixes**
|
|
|
|
* remove extra vendored django present in the v3.3.0 whl (#595)
|
|
* duplicate h1 header doc fix (thanks Nguyễn Gia Phong / @McSinyx!)
|
|
|
|
Version 3.3.0 (February 1st, 2021)
|
|
----------------------------------
|
|
|
|
**Backwards incompatible changes**
|
|
|
|
* clean escapes HTML comments even when strip_comments=False
|
|
|
|
**Security fixes**
|
|
|
|
* Fix bug 1621692 / GHSA-m6xf-fq7q-8743. See the advisory for details.
|
|
|
|
**Features**
|
|
|
|
None
|
|
|
|
**Bug fixes**
|
|
|
|
None
|
|
|
|
Version 3.2.3 (January 26th, 2021)
|
|
----------------------------------
|
|
|
|
**Security fixes**
|
|
|
|
None
|
|
|
|
**Features**
|
|
|
|
None
|
|
|
|
**Bug fixes**
|
|
|
|
* fix clean and linkify raising ValueErrors for certain inputs. Thank you @Google-Autofuzz.
|
|
|
|
Version 3.2.2 (January 20th, 2021)
|
|
----------------------------------
|
|
|
|
**Security fixes**
|
|
|
|
None
|
|
|
|
**Features**
|
|
|
|
* Migrate CI to GitHub Actions. Thank you @hugovk.
|
|
|
|
**Bug fixes**
|
|
|
|
* fix linkify raising an IndexError on certain inputs. Thank you @Google-Autofuzz.
|
|
|
|
Version 3.2.1 (September 18th, 2020)
|
|
------------------------------------
|
|
|
|
**Security fixes**
|
|
|
|
None
|
|
|
|
**Features**
|
|
|
|
None
|
|
|
|
**Bug fixes**
|
|
|
|
* change linkifier to add rel="nofollow" as documented. Thank you @mitar.
|
|
* suppress html5lib sanitizer DeprecationWarnings (#557)
|
|
|
|
Version 3.2.0 (September 16th, 2020)
|
|
------------------------------------
|
|
|
|
**Security fixes**
|
|
|
|
None
|
|
|
|
**Features**
|
|
|
|
None
|
|
|
|
**Bug fixes**
|
|
|
|
* Update the ``html5lib`` dependency to version 1.1.0. Thank you Sam Sneddon.
|
|
* update tests_website terminology. Thank you Thomas Grainger.
|
|
|
|
Version 3.1.5 (April 29th, 2020)
|
|
--------------------------------
|
|
|
|
**Security fixes**
|
|
|
|
None
|
|
|
|
**Features**
|
|
|
|
None
|
|
|
|
**Bug fixes**
|
|
|
|
* replace missing ``setuptools`` dependency with ``packaging``. Thank you Benjamin Peterson.
|
|
|
|
Version 3.1.4 (March 24th, 2020)
|
|
--------------------------------
|
|
|
|
**Security fixes**
|
|
|
|
* ``bleach.clean`` behavior parsing style attributes could result in a
|
|
regular expression denial of service (ReDoS).
|
|
|
|
Calls to ``bleach.clean`` with an allowed tag with an allowed
|
|
``style`` attribute were vulnerable to ReDoS. For example,
|
|
``bleach.clean(..., attributes={'a': ['style']})``.
|
|
|
|
This issue was confirmed in Bleach versions v3.1.3, v3.1.2, v3.1.1,
|
|
v3.1.0, v3.0.0, v2.1.4, and v2.1.3. Earlier versions used a similar
|
|
regular expression and should be considered vulnerable too.
|
|
|
|
Anyone using Bleach <=v3.1.3 is encouraged to upgrade.
|
|
|
|
https://bugzilla.mozilla.org/show_bug.cgi?id=1623633
|
|
|
|
**Backwards incompatible changes**
|
|
|
|
* Style attributes with dashes, or single or double quoted values are
|
|
cleaned instead of passed through.
|
|
|
|
**Features**
|
|
|
|
None
|
|
|
|
**Bug fixes**
|
|
|
|
None
|
|
|
|
Version 3.1.3 (March 17th, 2020)
|
|
--------------------------------
|
|
|
|
**Security fixes**
|
|
|
|
None
|
|
|
|
**Backwards incompatible changes**
|
|
|
|
* Drop support for Python 3.4. Thank you, @hugovk!
|
|
|
|
* Drop deprecated ``setup.py test`` support. Thank you, @jdufresne! (#507)
|
|
|
|
**Features**
|
|
|
|
* Add support for Python 3.8. Thank you, @jdufresne!
|
|
|
|
* Add support for PyPy 7. Thank you, @hugovk!
|
|
|
|
* Add pypy3 testing to tox and travis. Thank you, @jdufresne!
|
|
|
|
**Bug fixes**
|
|
|
|
* Add relative link to code of conduct. (#442)
|
|
|
|
* Fix typo: curren -> current in ``tests/test_clean.py``. Thank you, timgates42! (#504)
|
|
|
|
* Fix handling of non-ascii style attributes. Thank you, @sekineh! (#426)
|
|
|
|
* Simplify tox configuration. Thank you, @jdufresne!
|
|
|
|
* Make documentation reproducible. Thank you, @lamby!
|
|
|
|
* Fix typos in code comments. Thank you, @zborboa-g!
|
|
|
|
* Fix exception value testing. Thank you, @mastizada!
|
|
|
|
* Fix parser-tags NoneType exception. Thank you, @bope!
|
|
|
|
* Improve TLD support in linkify. Thank you, @pc-coholic!
|
|
|
|
|
|
Version 3.1.2 (March 11th, 2020)
|
|
--------------------------------
|
|
|
|
**Security fixes**
|
|
|
|
* ``bleach.clean`` behavior parsing embedded MathML and SVG content
|
|
with RCDATA tags did not match browser behavior and could result in
|
|
a mutation XSS.
|
|
|
|
Calls to ``bleach.clean`` with ``strip=False`` and ``math`` or
|
|
``svg`` tags and one or more of the RCDATA tags ``script``,
|
|
``noscript``, ``style``, ``noframes``, ``iframe``, ``noembed``, or
|
|
``xmp`` in the allowed tags whitelist were vulnerable to a mutation
|
|
XSS.
|
|
|
|
This security issue was confirmed in Bleach version v3.1.1. Earlier
|
|
versions are likely affected too.
|
|
|
|
Anyone using Bleach <=v3.1.1 is encouraged to upgrade.
|
|
|
|
https://bugzilla.mozilla.org/show_bug.cgi?id=1621692
|
|
|
|
**Backwards incompatible changes**
|
|
|
|
None
|
|
|
|
**Features**
|
|
|
|
None
|
|
|
|
**Bug fixes**
|
|
|
|
None
|
|
|
|
Version 3.1.1 (February 13th, 2020)
|
|
-----------------------------------
|
|
|
|
**Security fixes**
|
|
|
|
* ``bleach.clean`` behavior parsing ``noscript`` tags did not match
|
|
browser behavior.
|
|
|
|
Calls to ``bleach.clean`` allowing ``noscript`` and one or more of
|
|
the raw text tags (``title``, ``textarea``, ``script``, ``style``,
|
|
``noembed``, ``noframes``, ``iframe``, and ``xmp``) were vulnerable
|
|
to a mutation XSS.
|
|
|
|
This security issue was confirmed in Bleach versions v2.1.4, v3.0.2,
|
|
and v3.1.0. Earlier versions are probably affected too.
|
|
|
|
Anyone using Bleach <=v3.1.0 is highly encouraged to upgrade.
|
|
|
|
https://bugzilla.mozilla.org/show_bug.cgi?id=1615315
|
|
|
|
**Backwards incompatible changes**
|
|
|
|
None
|
|
|
|
**Features**
|
|
|
|
None
|
|
|
|
**Bug fixes**
|
|
|
|
None
|
|
|
|
Version 3.1.0 (January 9th, 2019)
|
|
---------------------------------
|
|
|
|
**Security fixes**
|
|
|
|
None
|
|
|
|
**Backwards incompatible changes**
|
|
|
|
None
|
|
|
|
**Features**
|
|
|
|
* Add ``recognized_tags`` argument to the linkify ``Linker`` class. This
|
|
fixes issues when linkifying on its own and having some tags get escaped.
|
|
It defaults to a list of HTML5 tags. Thank you, Chad Birch! (#409)
|
|
|
|
**Bug fixes**
|
|
|
|
* Add ``six>=1.9`` to requirements. Thank you, Dave Shawley (#416)
|
|
|
|
* Fix cases where attribute names could have invalid characters in them.
|
|
(#419)
|
|
|
|
* Fix problems with ``LinkifyFilter`` not being able to match links
|
|
across ``&``. (#422)
|
|
|
|
* Fix ``InputStreamWithMemory`` when the ``BleachHTMLParser`` is
|
|
parsing ``meta`` tags. (#431)
|
|
|
|
* Fix doctests. (#357)
|
|
|
|
|
|
Version 3.0.2 (October 11th, 2018)
|
|
----------------------------------
|
|
|
|
**Security fixes**
|
|
|
|
None
|
|
|
|
**Backwards incompatible changes**
|
|
|
|
None
|
|
|
|
**Features**
|
|
|
|
None
|
|
|
|
**Bug fixes**
|
|
|
|
* Merge ``Characters`` tokens after sanitizing them. This fixes issues in the
|
|
``LinkifyFilter`` where it was only linkifying parts of urls. (#374)
|
|
|
|
|
|
Version 3.0.1 (October 9th, 2018)
|
|
---------------------------------
|
|
|
|
**Security fixes**
|
|
|
|
None
|
|
|
|
**Backwards incompatible changes**
|
|
|
|
None
|
|
|
|
**Features**
|
|
|
|
* Support Python 3.7. It supported Python 3.7 just fine, but we added 3.7 to
|
|
the list of Python environments we test so this is now officially supported.
|
|
(#377)
|
|
|
|
**Bug fixes**
|
|
|
|
* Fix ``list`` object has no attribute ``lower`` in ``clean``. (#398)
|
|
* Fix ``abbr`` getting escaped in ``linkify``. (#400)
|
|
|
|
|
|
Version 3.0.0 (October 3rd, 2018)
|
|
---------------------------------
|
|
|
|
**Security fixes**
|
|
|
|
None
|
|
|
|
**Backwards incompatible changes**
|
|
|
|
* A bunch of functions were moved from one module to another.
|
|
|
|
These were moved from ``bleach.sanitizer`` to ``bleach.html5lib_shim``:
|
|
|
|
* ``convert_entity``
|
|
* ``convert_entities``
|
|
* ``match_entity``
|
|
* ``next_possible_entity``
|
|
* ``BleachHTMLSerializer``
|
|
* ``BleachHTMLTokenizer``
|
|
* ``BleachHTMLParser``
|
|
|
|
These functions and classes weren't documented and aren't part of the
|
|
public API, but people read code and might be using them so we're
|
|
considering it an incompatible API change.
|
|
|
|
If you're using them, you'll need to update your code.
|
|
|
|
**Features**
|
|
|
|
* Bleach no longer depends on html5lib. html5lib==1.0.1 is now vendored into
|
|
Bleach. You can remove it from your requirements file if none of your other
|
|
requirements require html5lib.
|
|
|
|
This means Bleach will now work fine with other libraries that depend on
|
|
html5lib regardless of what version of html5lib they require. (#386)
|
|
|
|
**Bug fixes**
|
|
|
|
* Fixed tags getting added when using clean or linkify. This was a
|
|
long-standing regression from the Bleach 2.0 rewrite. (#280, #392)
|
|
|
|
* Fixed ``<isindex>`` getting replaced with a string. Now it gets escaped or
|
|
stripped depending on whether it's in the allowed tags or not. (#279)
|
|
|
|
|
|
Version 2.1.4 (August 16th, 2018)
|
|
---------------------------------
|
|
|
|
**Security fixes**
|
|
|
|
None
|
|
|
|
**Backwards incompatible changes**
|
|
|
|
* Dropped support for Python 3.3. (#328)
|
|
|
|
**Features**
|
|
|
|
None
|
|
|
|
**Bug fixes**
|
|
|
|
* Handle ambiguous ampersands correctly. (#359)
|
|
|
|
|
|
Version 2.1.3 (March 5th, 2018)
|
|
-------------------------------
|
|
|
|
**Security fixes**
|
|
|
|
* Attributes that have URI values weren't properly sanitized if the
|
|
values contained character entities. Using character entities, it
|
|
was possible to construct a URI value with a scheme that was not
|
|
allowed that would slide through unsanitized.
|
|
|
|
This security issue was introduced in Bleach 2.1. Anyone using
|
|
Bleach 2.1 is highly encouraged to upgrade.
|
|
|
|
https://bugzilla.mozilla.org/show_bug.cgi?id=1442745
|
|
|
|
**Backwards incompatible changes**
|
|
|
|
None
|
|
|
|
**Features**
|
|
|
|
None
|
|
|
|
**Bug fixes**
|
|
|
|
* Fixed some other edge cases for attribute URI value sanitizing and
|
|
improved testing of this code.
|
|
|
|
|
|
Version 2.1.2 (December 7th, 2017)
|
|
----------------------------------
|
|
|
|
**Security fixes**
|
|
|
|
None
|
|
|
|
**Backwards incompatible changes**
|
|
|
|
None
|
|
|
|
**Features**
|
|
|
|
None
|
|
|
|
**Bug fixes**
|
|
|
|
* Support html5lib-python 1.0.1. (#337)
|
|
|
|
* Add deprecation warning for supporting html5lib-python < 1.0.
|
|
|
|
* Switch to semver.
|
|
|
|
|
|
Version 2.1.1 (October 2nd, 2017)
|
|
---------------------------------
|
|
|
|
**Security fixes**
|
|
|
|
None
|
|
|
|
**Backwards incompatible changes**
|
|
|
|
None
|
|
|
|
**Features**
|
|
|
|
None
|
|
|
|
**Bug fixes**
|
|
|
|
* Fix ``setup.py`` opening files when ``LANG=``. (#324)
|
|
|
|
|
|
Version 2.1 (September 28th, 2017)
|
|
----------------------------------
|
|
|
|
**Security fixes**
|
|
|
|
* Convert control characters (backspace particularly) to "?" preventing
|
|
malicious copy-and-paste situations. (#298)
|
|
|
|
See `<https://github.com/mozilla/bleach/issues/298>`_ for more details.
|
|
|
|
This affects all previous versions of Bleach. Check the comments on that
|
|
issue for ways to alleviate the issue if you can't upgrade to Bleach 2.1.
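
The idea behind this fix can be sketched in a few lines. This is a simplified illustration, not Bleach's own code: ``CONTROL_CHARS_RE`` and ``replace_control_chars`` are hypothetical names, and the exact set of characters replaced is an assumption of the sketch.

```python
import re

# Assumed character set for this sketch: ASCII control characters except
# tab, newline and carriage return.
CONTROL_CHARS_RE = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def replace_control_chars(text, replacement="?"):
    """Replace control characters so they cannot hide text from the reader."""
    return CONTROL_CHARS_RE.sub(replacement, text)
```

A backspace in pasted text then shows up as a visible ``?`` rather than silently changing what the reader sees.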
|
|
|
|
|
|
**Backwards incompatible changes**
|
|
|
|
* Redid versioning. ``bleach.VERSION`` is no longer available. Use the string
|
|
version at ``bleach.__version__`` and parse it with
|
|
``pkg_resources.parse_version``. (#307)
|
|
|
|
* clean, linkify: linkify and clean should only accept text types; thank you,
|
|
Janusz! (#292)
|
|
|
|
* clean, linkify: accept only unicode or utf-8-encoded str (#176)
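
With ``bleach.VERSION`` gone, version checks start from the string form. The sketch below is a dependency-free stand-in for the ``pkg_resources.parse_version`` call mentioned above, assuming you only need to compare plain dotted numeric releases. ``parse_simple_version`` is a hypothetical helper, not part of Bleach.

```python
def parse_simple_version(version):
    """Turn a plain dotted version string such as "2.1" into a tuple of ints.

    Hypothetical helper: it handles only numeric dotted releases, not
    pre-release or local version suffixes.
    """
    return tuple(int(part) for part in version.split("."))


# In real code the string would come from ``bleach.__version__``.
installed = "2.1"
needs_new_callbacks = parse_simple_version(installed) >= (2, 0)
```

Tuples compare element by element, so ``(2, 1)`` sorts before ``(2, 1, 1)``. For anything beyond simple release numbers, a real version parser is the safer choice.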
|
|
|
|
|
|
**Features**
|
|
|
|
|
|
**Bug fixes**
|
|
|
|
* ``bleach.clean()`` no longer unescapes entities including ones that are missing
|
|
a ``;`` at the end which can happen in urls and other places. (#143)
|
|
|
|
* linkify: fix http links inside of mailto links; thank you, sedrubal! (#300)
|
|
|
|
* clarify security policy in docs (#303)
|
|
|
|
* fix dependency specification for html5lib 1.0b8, 1.0b9, and 1.0b10; thank you,
|
|
Zoltán! (#268)
|
|
|
|
* add Bleach vs. html5lib comparison to README; thank you, Stu Cox! (#278)
|
|
|
|
* fix KeyError exceptions on tags without href attr; thank you, Alex Defsen!
|
|
(#273)
|
|
|
|
* add test website and scripts to test ``bleach.clean()`` output in browser;
|
|
thank you, Greg Guthe!
|
|
|
|
|
|
Version 2.0 (March 8th, 2017)
|
|
-----------------------------
|
|
|
|
**Security fixes**
|
|
|
|
* None
|
|
|
|
|
|
**Backwards incompatible changes**
|
|
|
|
* Removed support for Python 2.6. (#206)
|
|
|
|
* Removed support for Python 3.2. (#224)
|
|
|
|
* Bleach no longer supports html5lib < 0.99999999 (8 9s).
|
|
|
|
This version is a rewrite to use the new sanitizing API since the old
|
|
one was dropped in html5lib 0.99999999 (8 9s).
|
|
|
|
If you're using 0.9999999 (7 9s) upgrade to 0.99999999 (8 9s) or higher.
|
|
|
|
If you're using 1.0b8 (equivalent to 0.9999999 (7 9s)), upgrade to 1.0b9
|
|
(equivalent to 0.99999999 (8 9s)) or higher.
|
|
|
|
* ``bleach.clean`` and friends were rewritten
|
|
|
|
``clean`` was reimplemented as an html5lib filter and happens at a different
|
|
step in the HTML parsing -> traversing -> serializing process. Because of
|
|
that, there are some differences in clean's output as compared with previous
|
|
versions.
|
|
|
|
Amongst other things, this version will add end tags even if the tag in
|
|
question is to be escaped.
|
|
|
|
* ``bleach.clean`` and friends attribute callables now take three arguments:
|
|
tag, attribute name and attribute value. Previously they only took attribute
|
|
name and attribute value.
|
|
|
|
All attribute callables will need to be updated.
|
|
|
|
* ``bleach.linkify`` was rewritten
|
|
|
|
``linkify`` was reimplemented as an html5lib Filter. As such, it no longer
|
|
accepts a ``tokenizer`` argument.
|
|
|
|
The callback functions for adjusting link attributes now take a namespaced
attribute.
|
|
|
|
Previously you'd do something like this::
|
|
|
|
    def check_protocol(attrs, is_new):
        if not attrs.get('href', '').startswith(('http:', 'https:')):
            return None
        return attrs
|
|
|
|
Now it's more like this::
|
|
|
|
    def check_protocol(attrs, is_new):
        if not attrs.get((None, u'href'), u'').startswith(('http:', 'https:')):
            #            ^^^^^^^^^^^^^^^
            return None
        return attrs
|
|
|
|
Further, you need to make sure you're always using unicode values. If you
|
|
don't then html5lib will raise an assertion error that the value is not
|
|
unicode.
|
|
|
|
All linkify filters will need to be updated.
|
|
|
|
* ``bleach.linkify`` and friends had a ``skip_pre`` argument--that's been
|
|
replaced with a more general ``skip_tags`` argument.
|
|
|
|
Before, you might do::
|
|
|
|
    bleach.linkify(some_text, skip_pre=True)
|
|
|
|
The equivalent with Bleach 2.0 is::
|
|
|
|
    bleach.linkify(some_text, skip_tags=['pre'])
|
|
|
|
You can skip other tags, too, like ``style`` or ``script`` or other places
|
|
where you don't want linkification happening.
|
|
|
|
All uses of linkify that use ``skip_pre`` will need to be updated.
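
The two callback signature changes in this list can be sketched side by side. This is a minimal illustration, not code from Bleach: ``allow_web_href`` and ``open_in_new_tab`` are hypothetical names, and both are plain Python functions so you can try them without Bleach installed.

```python
def allow_web_href(tag, name, value):
    """Attribute callable using the three-argument form: tag, name, value.

    Hypothetical filter: keep only ``href`` on ``a`` tags, and only when
    the value is an http(s) URL.
    """
    if tag == "a" and name == "href":
        return value.startswith(("http:", "https:"))
    return False


def open_in_new_tab(attrs, new=False):
    """Linkify callback; attribute keys are (namespace, name) tuples."""
    href = attrs.get((None, "href"), "")
    if not href.startswith(("http:", "https:")):
        # Returning None drops the link, as in check_protocol above.
        return None
    attrs[(None, "target")] = "_blank"
    return attrs
```

With Bleach installed, these would be passed as ``bleach.clean(text, tags=['a'], attributes=allow_web_href)`` and ``bleach.linkify(text, callbacks=[open_in_new_tab])``.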
|
|
|
|
|
|
**Changes**
|
|
|
|
* Supports Python 3.6.
|
|
|
|
* Supports html5lib >= 0.99999999 (8 9s).
|
|
|
|
* There's a ``bleach.sanitizer.Cleaner`` class that you can instantiate with your
|
|
favorite clean settings for easy reuse.
|
|
|
|
* There's a ``bleach.linkifier.Linker`` class that you can instantiate with your
|
|
favorite linkify settings for easy reuse.
|
|
|
|
* There's a ``bleach.linkifier.LinkifyFilter`` which is an html5lib filter that
|
|
you can pass as a filter to ``bleach.sanitizer.Cleaner`` allowing you to clean
|
|
and linkify in one pass.
|
|
|
|
* ``bleach.clean`` and friends can now take a callable as an attributes arg value.
|
|
|
|
* Tons of bug fixes.
|
|
|
|
* Cleaned up tests.
|
|
|
|
* Documentation fixes.
|
|
|
|
|
|
Version 1.5 (November 4th, 2016)
|
|
--------------------------------
|
|
|
|
**Security fixes**
|
|
|
|
* None
|
|
|
|
**Backwards incompatible changes**
|
|
|
|
* clean: The list of ``ALLOWED_PROTOCOLS`` now defaults to http, https and
|
|
mailto.
|
|
|
|
Previously it was a long list of protocols something like ed2k, ftp, http,
|
|
https, irc, mailto, news, gopher, nntp, telnet, webcal, xmpp, callto, feed,
|
|
urn, aim, rsync, tag, ssh, sftp, rtsp, afs, data. (#149)
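
What a protocol allow list does can be sketched with the standard library. This is a simplified illustration, not Bleach's sanitizer logic: ``DEFAULT_PROTOCOLS`` and ``has_allowed_protocol`` are hypothetical names, and rejecting URLs that have no scheme is an assumption of the sketch.

```python
from urllib.parse import urlsplit

# Default named in the entry above: http, https and mailto.
DEFAULT_PROTOCOLS = frozenset({"http", "https", "mailto"})


def has_allowed_protocol(url, protocols=DEFAULT_PROTOCOLS):
    """Return True when the URL's scheme is in the allow list.

    Simplified on purpose: URLs without a scheme are rejected here, which
    is an assumption of this sketch rather than documented Bleach behavior.
    """
    scheme = urlsplit(url.strip()).scheme  # urlsplit lowercases the scheme
    return scheme in protocols
```

Under the new default, an ``ftp:`` or ``data:`` URL that the old, longer list accepted no longer passes unless you add its protocol back yourself.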
|
|
|
|
**Changes**
|
|
|
|
* clean: Added ``protocols`` to arguments list to let you override the list of
|
|
allowed protocols. Thank you, Andreas Malecki! (#149)
|
|
|
|
* linkify: Fix a bug involving periods at the end of an email address. Thank you,
|
|
Lorenz Schori! (#219)
|
|
|
|
* linkify: Fix linkification of non-ascii ports. Thank you, Alexandre Macabies!
|
|
(#207)
|
|
|
|
* linkify: Fix linkify inappropriately removing node tails when dropping nodes.
|
|
(#132)
|
|
|
|
* Fixed a test that failed periodically. (#161)
|
|
|
|
* Switched from nose to py.test. (#204)
|
|
|
|
* Add test matrix for all supported Python and html5lib versions. (#230)
|
|
|
|
* Limit to html5lib ``>=0.999,!=0.9999,!=0.99999,<0.99999999`` because 0.9999
|
|
and 0.99999 are busted.
|
|
|
|
* Add support for ``python setup.py test``. (#97)
|
|
|
|
|
|
Version 1.4.3 (May 23rd, 2016)
|
|
------------------------------
|
|
|
|
**Security fixes**
|
|
|
|
* None
|
|
|
|
**Changes**
|
|
|
|
* Limit to html5lib ``>=0.999,<0.99999999`` because of impending change to
|
|
sanitizer api. #195
|
|
|
|
|
|
Version 1.4.2 (September 11, 2015)
|
|
----------------------------------
|
|
|
|
**Changes**
|
|
|
|
* linkify: Fix hang in linkify with ``parse_email=True``. (#124)
|
|
|
|
* linkify: Fix crash in linkify when removing a link that is a first-child. (#136)
|
|
|
|
* Updated TLDs.
|
|
|
|
* linkify: Don't remove exterior brackets when linkifying. (#146)
|
|
|
|
|
|
Version 1.4.1 (December 15, 2014)
|
|
---------------------------------
|
|
|
|
**Changes**
|
|
|
|
* Consistent order of attributes in output.
|
|
|
|
* Python 3.4 support.
|
|
|
|
|
|
Version 1.4 (January 12, 2014)
|
|
------------------------------
|
|
|
|
**Changes**
|
|
|
|
* linkify: Update linkify to use etree type Treewalker instead of simpletree.
|
|
|
|
* Updated html5lib to version ``>=0.999``.
|
|
|
|
* Update all code to be compatible with Python 3 and 2 using six.
|
|
|
|
* Switch to Apache License.
|
|
|
|
|
|
Version 1.3
|
|
-----------
|
|
|
|
* Used by Python 3-only fork.
|
|
|
|
|
|
Version 1.2.2 (May 18, 2013)
|
|
----------------------------
|
|
|
|
* Pin html5lib to version 0.95 for now due to major API break.
|
|
|
|
|
|
Version 1.2.1 (February 19, 2013)
|
|
---------------------------------
|
|
|
|
* ``clean()`` no longer considers ``feed:`` an acceptable protocol due to
|
|
inconsistencies in browser behavior.
|
|
|
|
|
|
Version 1.2 (January 28, 2013)
|
|
------------------------------
|
|
|
|
* ``linkify()`` has changed considerably. Many keyword arguments have been
|
|
replaced with a single callbacks list. Please see the documentation for more
|
|
information.
|
|
|
|
* Bleach will no longer consider unacceptable protocols when linkifying.
|
|
|
|
* ``linkify()`` now takes a tokenizer argument that allows it to skip
|
|
sanitization.
|
|
|
|
* ``delinkify()`` is gone.
|
|
|
|
* Removed exception handling from ``_render``. ``clean()`` and ``linkify()`` may
|
|
now throw.
|
|
|
|
* ``linkify()`` correctly ignores case for protocols and domain names.
|
|
|
|
* ``linkify()`` correctly handles markup within an <a> tag.
|
|
|
|
|
|
Version 1.1.5
|
|
-------------
|
|
|
|
|
|
Version 1.1.4
|
|
-------------
|
|
|
|
|
|
Version 1.1.3 (July 10, 2012)
|
|
-----------------------------
|
|
|
|
* Fix parsing bare URLs when parse_email=True.
|
|
|
|
|
|
Version 1.1.2 (June 1, 2012)
|
|
----------------------------
|
|
|
|
* Fix hang in style attribute sanitizer. (#61)
|
|
|
|
* Allow ``/`` in style attribute values.
|
|
|
|
|
|
Version 1.1.1 (February 17, 2012)
|
|
---------------------------------
|
|
|
|
* Fix tokenizer for html5lib 0.9.5.
|
|
|
|
|
|
Version 1.1.0 (October 24, 2011)
|
|
--------------------------------
|
|
|
|
* ``linkify()`` now understands port numbers. (#38)
|
|
|
|
* Documented character encoding behavior. (#41)
|
|
|
|
* Add an optional target argument to ``linkify()``.
|
|
|
|
* Add ``delinkify()`` method. (#45)
|
|
|
|
* Support subdomain whitelist for ``delinkify()``. (#47, #48)
|
|
|
|
|
|
Version 1.0.4 (September 2, 2011)
|
|
---------------------------------
|
|
|
|
* Switch to SemVer git tags.
|
|
|
|
* Make ``linkify()`` smarter about trailing punctuation. (#30)
|
|
|
|
* Pass ``exc_info`` to logger during rendering issues.
|
|
|
|
* Add wildcard key for attributes. (#19)
|
|
|
|
* Make ``linkify()`` use the ``HTMLSanitizer`` tokenizer. (#36)
|
|
|
|
* Fix URLs wrapped in parentheses. (#23)
|
|
|
|
* Make ``linkify()`` UTF-8 safe. (#33)
|
|
|
|
|
|
Version 1.0.3 (June 14, 2011)
|
|
-----------------------------
|
|
|
|
* ``linkify()`` works with 3rd level domains. (#24)
|
|
|
|
* ``clean()`` supports vendor prefixes in style values. (#31, #32)
|
|
|
|
* Fix ``linkify()`` email escaping.
|
|
|
|
|
|
Version 1.0.2 (June 6, 2011)
|
|
----------------------------
|
|
|
|
* ``linkify()`` supports email addresses.
|
|
|
|
* ``clean()`` supports callables in attributes filter.
|
|
|
|
|
|
Version 1.0.1 (April 12, 2011)
|
|
------------------------------
|
|
|
|
* ``linkify()`` doesn't drop trailing slashes. (#21)
|
|
* ``linkify()`` won't linkify 'libgl.so.1'. (#22)
|