Bleach is a whitelist-based HTML sanitization and text linkification library. It is designed to take untrusted user input that may contain some HTML and make it safe to display.
Because Bleach uses html5lib to parse document fragments the same way browsers do, it is much more resilient to unknown attacks than regular-expression-based sanitizers.
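As a minimal sketch of the sanitizing behavior described above, assuming Bleach is installed and its `clean` function is called with default settings: markup outside the default whitelist is escaped, while whitelisted tags pass through. The input string is made up for illustration.

```python
import bleach

# Hypothetical untrusted input: <b> is on Bleach's default whitelist,
# <script> is not.
untrusted = 'a <b>bold</b> claim <script>evil()</script>'

# With default settings, clean() keeps whitelisted tags and escapes
# the rest so it renders as text instead of running.
cleaned = bleach.clean(untrusted)
print(cleaned)
```

The exact whitelist and the options for changing it depend on the Bleach version you have installed, so check its documentation before relying on the defaults.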
Bleach’s linkify function is highly configurable and can be used to find, edit, and filter links that most other auto-linkers can’t handle.
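A minimal sketch of the simplest case, assuming Bleach is installed and `linkify` is called with its defaults: a bare URL in plain text is wrapped in an anchor tag. The sample text is made up for illustration, and the configuration hooks mentioned above are not shown because their form varies between versions.

```python
import bleach

# Hypothetical plain-text input containing a bare URL.
text = 'see http://example.com for details'

# With default settings, linkify() wraps the URL in an <a> tag
# and adds rel="nofollow" to it.
linked = bleach.linkify(text)
print(linked)
```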
The version of Bleach on GitHub is always the most up-to-date, and the master branch should always work.
Bleach is available on PyPI, so you can install it with pip:
$ pip install bleach
Or with easy_install:
$ easy_install bleach
Or by cloning the repo from GitHub:
$ git clone https://github.com/jsocol/bleach.git
Then change into the cloned directory and install it by running:
$ cd bleach
$ python setup.py install