webhelpers2.html.builder¶
HTML/XHTML/HTML 5 tag builder.
HTML Builder provides:
an HTML object that creates (X)HTML tags in a Pythonic way.
a literal class used to mark strings containing intentional HTML markup.
a smart escaping mechanism that preserves literals but escapes other strings that may accidentally contain markup characters (“<”, “>”, “&”, ‘"’, “'”) or malicious JavaScript tags. Escaped strings are returned as literals to prevent them from being double-escaped later.
The builder uses markupsafe and follows Python’s unofficial .__html__ protocol, which Mako, Chameleon, Pylons, and some other packages also follow. These features are explained in the next section.
Literals¶
- class webhelpers2.html.builder.literal(s, encoding=None, errors='strict')¶
An HTML literal string, which will not be further escaped.
I’m a subclass of markupsafe.Markup, which itself is a subclass of unicode in Python 2 or str in Python 3. The main difference from ordinary strings is the .__html__ method, which allows smart escapers to recognize it as a “safe” HTML string that doesn’t need to be escaped.
All my string methods preserve literal arguments and escape plain strings. However, in expressions you must pay attention to which value “controls” the expression. I seem to be able to control all combinations of the + operator, but with % and .join I must be on the left side. So these all work:
"A" + literal("B")
literal(", ").join(["A", literal("B")])
literal("%s %s") % (16, literal("kg"))
But these return plain strings which are vulnerable to double-escaping later:
"\n".join([literal("<span>A</span>"), literal("Bar!")])
"%s %s" % (literal("16"), literal("<>"))
- static __new__(cls, base='', encoding=None, errors='strict')¶
Constructor.
I convert my first argument to a string like str() does. However, I convert None to the empty string, which is usually what’s desired in templates. (In contrast, raw Markup(None) returns "None".)
Examples:
>>> literal("A") # => literal("A")
>>> literal(">") # => literal(">")
>>> literal(None) # => literal("")
>>> literal(11) # => literal("11")
>>> literal(datetime.date.today()) # => literal("2014-08-31")
The default encoding is “ascii”.
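The constructor and operator rules described for the literal class can be illustrated with a small standard-library sketch. SafeStr below is a hypothetical stand-in, not the library's class, and its escaper is simplified (it uses html.escape, whose quote entities differ from markupsafe's). It only aims to show why a safe string can take control of + from either side, while % and .join are controlled by whatever sits on the left.

```python
import html


class SafeStr(str):
    """Hypothetical stand-in for a literal; not the library's class."""

    def __new__(cls, base=""):
        # Mirror the documented constructor: None becomes the empty string.
        if base is None:
            base = ""
        return super().__new__(cls, base)

    def __html__(self):
        return self

    @classmethod
    def escape(cls, value):
        # Simplified escaper: safe values pass through, others are escaped.
        if hasattr(value, "__html__"):
            return cls(value.__html__())
        return cls(html.escape(str(value)))

    def __add__(self, other):
        return SafeStr(str.__add__(self, self.escape(other)))

    def __radd__(self, other):
        # Python tries this first for `plain + SafeStr`, because SafeStr
        # subclasses str and defines the reflected method.
        return SafeStr(str.__add__(self.escape(other), self))

    def __mod__(self, args):
        if isinstance(args, tuple):
            args = tuple(self.escape(a) for a in args)
        else:
            args = self.escape(args)
        return SafeStr(str.__mod__(self, args))

    def join(self, items):
        return SafeStr(str.join(self, [self.escape(i) for i in items]))


# The safe string controls these expressions, so plain parts get escaped.
safe_sum = "<" + SafeStr("<b>")
safe_join = SafeStr(", ").join(["<A>", SafeStr("<b>")])
safe_fmt = SafeStr("%s %s") % (16, SafeStr("kg"))

# The plain string on the left controls these, so the result is a plain str.
plain_join = "\n".join([SafeStr("<span>A</span>"), SafeStr("Bar!")])
plain_fmt = "%s %s" % (SafeStr("16"), SafeStr("<>"))
```

In the last two expressions the built-in str methods run, so the safe marker is lost and a later escaping pass would escape the markup again.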
- classmethod escape(s)¶
Escape the argument and return a literal.
This is a class method. The result depends on the argument type:
literal: return unchanged.
an object with an .__html__ method: call it and return the result. The method should take no arguments and return the object’s preferred HTML representation as a string.
plain string: escape any HTML markup characters in it, and wrap the result in a literal to prevent double-escaping later.
non-string: call str(), escape the result, and wrap it in a literal.
None: convert to the empty string and return a literal.
If the argument has an .__html__ method, I call it and return the result. This causes literals to pass through unchanged, and other objects with an .__html__ method return their preferred HTML representation. If the argument is a plain string, I escape any HTML markup characters and wrap the result in a literal to prevent further escaping. If the argument is a non-string, I convert it to a string, escape it, and wrap it in a literal.
Examples:
>>> literal.escape(">") # => literal("&gt;")
>>> literal.escape(literal(">")) # => literal(">")
>>> literal.escape(None) # => literal("")
I call markupsafe.escape_silent(). It escapes double quotes as “&#34;”, single quotes as “&#39;”, “<” as “&lt;”, “>” as “&gt;”, and “&” as “&amp;”.
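The dispatch order described above can be sketched with the standard library alone. This is an illustration under assumptions, not the library's implementation: Lit, Badge, and this escape function are hypothetical names, and html.escape emits different quote entities than markupsafe does.

```python
import html


class Lit(str):
    """Hypothetical minimal literal: a str subclass flagged as safe HTML."""

    def __html__(self):
        return self


class Badge:
    """Hypothetical object that knows its preferred HTML representation."""

    def __init__(self, text):
        self.text = text

    def __html__(self):
        return '<span class="badge">%s</span>' % html.escape(self.text)


def escape(value):
    """Sketch of the documented dispatch; always returns a Lit."""
    if value is None:
        # None becomes the empty string rather than "None".
        return Lit("")
    if hasattr(value, "__html__"):
        # Literals pass through; other objects supply their own markup.
        return Lit(value.__html__())
    # Plain strings and non-strings: stringify, escape, wrap.
    return Lit(html.escape(str(value)))


escaped_plain = escape(">")
passed_through = escape(Lit(">"))
from_object = escape(Badge("a&b"))
```

Because every branch returns a Lit, passing a result through escape a second time leaves it unchanged, which is how double-escaping is avoided.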
- lit_join(iterable)¶
Like the .join string method but don’t escape elements in the iterable.
- striptags() → str¶
unescape() the markup, remove tags, and normalize whitespace to single spaces.
>>> Markup("Main &raquo; <em>About</em>").striptags()
'Main » About'
- unescape() → str¶
Convert escaped markup back into a text string. This replaces HTML entities with the characters they represent.
>>> Markup("Main &raquo; &lt;em&gt;About&lt;/em&gt;").unescape()
'Main » <em>About</em>'
The HTML generator¶
The HTML global is an instance of the HTMLBuilder class. Normally you use the global rather than instantiating it yourself.
- class webhelpers2.html.builder.HTMLBuilder¶
An HTML tag generator.
- __call__(*args, **kw)¶
Escape the string args, concatenate them, and return a literal.
This is the same as literal.escape(s) but accepts multiple strings. Multiple arguments are useful when mixing child tags with text, such as:
html = HTML("The king is a >>", HTML.strong("fink"), "<<!")
Keyword args:
nl
If true, append a newline to the value. (Default False.)
lit
If true, don’t escape the arguments. (Default False.)
- __getattr__(attr)¶
Same as the tag method but using attribute access. HTML.a(...) is equivalent to HTML.tag("a", ...).
- tag(tag, *args, **kw)¶
Create an HTML tag.
tag is the tag name. The other positional arguments become the content for the tag, and are escaped and concatenated.
Keyword arguments are converted to HTML attributes, except for the following special arguments:
c
Specifies the content. This cannot be combined with content in positional args. The purpose of this argument is to position the content at the end of the argument list to match the native HTML syntax more closely. Its use is entirely optional. The value can be a string, a tuple, or a tag.
_closed
If present and false, do not close the tag. Otherwise the tag will be closed with a closing tag or an XHTML-style trailing slash.
_nl
If present and true, insert a newline before the first content element, between each content element, and at the end of the tag.
Note that this argument has a leading underscore while the same argument to __call__ doesn’t. That’s because this method has so many other complex arguments, and for backward compatibility.
_bool
Additional HTML attributes to consider boolean beyond those listed in .boolean_attrs. See “Class Attributes” below.
Other keyword arguments are converted to HTML attributes after undergoing several transformations:
Ignore attributes whose value is None.
Delete trailing underscores in attribute names. (‘class_’ -> ‘class’).
Replace non-trailing underscores with hyphens. (‘data_foo’ -> ‘data-foo’).
If the attribute is a boolean attribute (e.g. “defer”, “disabled”, “readonly”), render it as an HTML 5 boolean attribute. If the value is true, copy the attribute name to the value. If the value is false, don’t render the attribute at all. See self.boolean_attrs and _bool to customize which attributes are considered boolean.
If the attribute is known to be list- or set-valued, e.g. “class” (or “class_”), “style”, “rel”, and the value is a list or tuple, convert the value to a string by joining the values. A separator appropriate to the attribute will be used to separate the values within the string. (E.g. “class” is space-separated, “style” is semicolon-separated.) If the value is an empty list or tuple, don’t render the attribute at all. If the value contains elements that are 2-tuples, the first subelement is the string item, and the second subelement is a boolean flag; render only subelements whose flag is true. This allows users to programmatically set the parts of a composable attribute in a template without extra loops or logic code. See self.compose_attrs to customize which attributes have list/tuple conversion and what their delimiter is.
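The transformations listed above can be sketched as a single pass over the keyword arguments. This is an illustrative reconstruction, not the library's code: transform_attrs, BOOLEAN_ATTRS, and COMPOSE_ATTRS are hypothetical names, and the two tables hold only a small subset of what the real defaults would contain.

```python
# Illustrative subsets; the library's defaults are larger.
BOOLEAN_ATTRS = {"defer", "disabled", "readonly"}
COMPOSE_ATTRS = {"class": " ", "style": "; ", "rel": " "}


def transform_attrs(attrs, extra_bool=()):
    """Apply the documented attribute rules to a dict of keyword args."""
    booleans = BOOLEAN_ATTRS | set(extra_bool)
    result = {}
    for name, value in attrs.items():
        if value is None:
            continue  # attributes set to None are ignored
        # 'class_' -> 'class', then 'data_foo' -> 'data-foo'
        name = name.rstrip("_").replace("_", "-")
        if name in booleans:
            if not value:
                continue  # false boolean attributes are not rendered
            value = name  # true ones repeat the attribute name
        elif name in COMPOSE_ATTRS and isinstance(value, (list, tuple)):
            parts = []
            for item in value:
                if isinstance(item, tuple) and len(item) == 2:
                    text, flag = item
                    if not flag:
                        continue  # drop parts whose flag is false
                    item = text
                parts.append(str(item))
            if not parts:
                # Empty list: documented. All flags false: an assumption.
                continue
            value = COMPOSE_ATTRS[name].join(parts)
        result[name] = value
    return result


classes = transform_attrs({"class_": [("a", False), ("b", True)]})
styles = transform_attrs({"style": ["color:black", "border:solid"]})
```

The extra_bool parameter plays the role of the _bool argument: names passed there are treated as boolean for that call only.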
Examples:
>>> HTML.tag("div", "Foo", class_="important")
literal(u'<div class="important">Foo</div>')
>>> HTML.tag("div", "Foo", class_=None)
literal(u'<div>Foo</div>')
>>> HTML.tag("div", "Foo", class_=["a", "b"])
literal(u'<div class="a b">Foo</div>')
>>> HTML.tag("div", "Foo", class_=[("a", False), ("b", True)])
literal(u'<div class="b">Foo</div>')
>>> HTML.tag("div", "Foo", style=["color:black", "border:solid"])
literal(u'<div style="color:black; border:solid">Foo</div>')
>>> HTML.tag("br")
literal(u'<br />')
>>> HTML.tag("input", disabled=True)
literal(u'<input disabled="disabled" />')
>>> HTML.tag("input", disabled=False)
literal(u'<input />')
To generate opening and closing tags in isolation:
>>> HTML.tag("div", _closed=False)
literal(u'<div>')
>>> HTML.tag("/div", _closed=False)
literal(u'</div>')
- comment(*args)¶
Wrap the content in an HTML comment.
Escape and concatenate the string arguments.
Example:
>>> HTML.comment("foo", "bar")
literal(u'<!-- foobar -->')
- cdata(*args)¶
Wrap the content in a “<![CDATA[ … ]]>” section.
Plain strings will not be escaped because CDATA itself is an escaping syntax. Concatenate the arguments:
>>> HTML.cdata(u"Foo")
literal(u'<![CDATA[Foo]]>')
>>> HTML.cdata(u"<p>")
literal(u'<![CDATA[<p>]]>')
- render_attrs(attrs)¶
Format HTML attributes into a string of ‘ key="value"’ pairs.
You don’t normally need to call this because the tag method calls it for you. However, it can be useful for lower-level formatting in string templates like this:
Click <a href="http://example.com/"{attrs1}>here</a> or maybe <a{attrs2}>here</a>.
attrs is a list of attributes. The values will be escaped if they’re not literals, but no other transformation will be performed on them.
The return value will have a leading space if any attributes are present. If no attributes are specified, the return value is the empty string literal. This allows it to render prettily in the interpolations above regardless of whether attrs contains anything.
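The leading-space convention is easiest to see in a small stand-alone sketch. The function below is a hypothetical reimplementation using only the standard library. It assumes attrs is an iterable of (name, value) pairs, which may not match the library's actual input shape, and it uses html.escape rather than markupsafe.

```python
import html


def render_attrs(attrs):
    """Sketch: format (name, value) pairs as ' name="value"' chunks."""
    chunks = []
    for name, value in attrs:
        if hasattr(value, "__html__"):
            value = value.__html__()  # already-safe values are kept as-is
        else:
            value = html.escape(str(value))
        # Each chunk carries its own leading space.
        chunks.append(' {}="{}"'.format(name, value))
    return "".join(chunks)  # empty input gives the empty string


template = (
    'Click <a href="http://example.com/"{attrs1}>here</a> '
    "or maybe <a{attrs2}>here</a>."
)
rendered = template.format(
    attrs1=render_attrs([("class", "ext")]),
    attrs2=render_attrs([]),
)
```

Because the space belongs to each chunk rather than to the template, the second link renders as a bare `<a>` with no stray whitespace.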
The following class attributes are literal constants:
- EMPTY¶
The empty string as a literal.
- SPACE¶
A single space as a literal.
- TAB2¶
A 2-space tab as a literal.
- TAB4¶
A 4-space tab as a literal.
- NL¶
A newline ("\n") as a literal.
- NL2¶
Two newlines as a literal.
- BR¶
A literal consisting of one “<br />” tag.
- BR2¶
A literal consisting of two “<br />” tags.
The following class attributes affect the behavior of the tag method:
- void_tags¶
The set of tags which can never have content. These are rendered in self-closing style; e.g., ‘<br />’. See About XHTML and HTML below.
- boolean_attrs¶
The set of attributes which are rendered as booleans. E.g., disabled=True renders as ‘disabled="disabled"’, while disabled=False is not rendered at all.
The default set contains all attributes designated as boolean by the current HTML 5.1 draft specification.
- compose_attrs¶
A dict of attributes whose value may have string-delimited components. The keys are attribute names and the values are delimiter literals. The default configuration supports all attributes designated as being set- or list-valued by the HTML 5.1 draft specification.
- literal¶
The literal class that will be used internally to generate literals. Changing this does not automatically affect the constant attributes (EMPTY, NL, BR, etc).
About XHTML and HTML¶
This builder always produces tags that are valid as both HTML and XHTML.
“Void” tags – those which can never have content like <br> and <input> – are written like <br />, with a space and a trailing /.
Only void tags get this treatment. The library will never, for example, produce <script src="..." />, which is invalid HTML and which legacy browsers misinterpret as still being open. Instead the builder will produce <script src="..."></script>.
The W3C HTML validator validates these constructs as valid HTML Strict. It does produce warnings, but those warnings concern the ambiguity that arises if the same XML-style self-closing tags are used for HTML elements that are allowed to take content (<script>, <textarea>, etc). This library never produces markup like that.
Rather than add options to generate different kinds of behavior, we felt it was better to create markup that could be used in different contexts without any real problems and without the overhead of passing options around or maintaining different contexts, where you’d have to keep track of whether markup is being rendered in an HTML or XHTML context.
If you _really_ want void tags without trailing slashes (e.g., <br>), you can abuse _closed=False to produce them.
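The closing rules in this section reduce to a three-way choice, sketched below. render_tag and VOID_TAGS are hypothetical names, the set is only a small subset of the void tags, and attrs is assumed to be a pre-rendered attribute string.

```python
VOID_TAGS = {"br", "hr", "img", "input"}  # illustrative subset


def render_tag(name, content="", attrs="", closed=True):
    """Sketch of the closing rules; attrs is a pre-rendered string."""
    if not closed:
        # Opening (or closing) tag in isolation, no trailing slash.
        return "<%s%s>%s" % (name, attrs, content)
    if name in VOID_TAGS:
        # Void tags: a space and a trailing slash, never any content.
        return "<%s%s />" % (name, attrs)
    # Everything else always gets an explicit closing tag.
    return "<%s%s>%s</%s>" % (name, attrs, content, name)


void_tag = render_tag("br")
script_tag = render_tag("script", attrs=' src="app.js"')
bare_br = render_tag("br", closed=False)
```

Only names in the void set ever take the self-closing form, so an empty `<script>` still gets its closing tag.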
Functions¶
- webhelpers2.html.builder.escape(s)¶
Same as literal.escape(s).
- webhelpers2.html.builder.lit_sub(*args, **kw)¶
Literal-safe version of re.sub. If the string to be operated on is a literal, return a literal result. All arguments are passed directly to re.sub.
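One way such a wrapper could work is sketched below, using a hypothetical Lit class in place of the real literal. It assumes the subject string arrives either as the third positional argument or as the string keyword, matching the signature of re.sub.

```python
import re


class Lit(str):
    """Hypothetical minimal literal: a str subclass flagged as safe HTML."""

    def __html__(self):
        return self


def lit_sub(*args, **kw):
    """Sketch: call re.sub and re-wrap the result if the input was safe."""
    # re.sub(pattern, repl, string, ...): the subject is the third argument.
    subject = args[2] if len(args) > 2 else kw.get("string")
    result = re.sub(*args, **kw)
    if hasattr(subject, "__html__"):
        # Preserve the safe type so the result is not escaped later.
        return type(subject)(result)
    return result


from_literal = lit_sub(r"\s+", " ", Lit("<b>a   b</b>"))
from_plain = lit_sub(r"\s+", " ", "a   b")
```

The re-wrapping step is the point: re.sub on its own returns a plain string, which would be escaped the next time it passes through a smart escaper.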
- webhelpers2.html.builder.url_escape(s, safe='/')¶
Urlencode the path portion of a URL. This is the same function as urllib.quote in the Python standard library (urllib.parse.quote in Python 3). It’s exported here with a name that’s easier to remember.
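The quoting behavior can be tried directly with the standard library. The wrapper below is a stand-in with the same signature as the documented function, written against Python 3's urllib.parse.quote. The paths are made-up examples.

```python
from urllib.parse import quote


def url_escape(s, safe="/"):
    """Stand-in with the documented signature, delegating to quote()."""
    return quote(s, safe)


# Slashes are kept by default because they separate path segments.
path = url_escape("docs/My File & notes.txt")

# With safe="" the slash is encoded as well.
segment = url_escape("a/b c", safe="")

# Non-ASCII characters are encoded as UTF-8 percent escapes.
accented = url_escape("café")
```

This is meant for the path portion only. Query strings have their own rules (for example urllib.parse.urlencode) and are not covered by this function.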
The markupsafe package has a function soft_unicode which converts a string to Unicode if it’s not already. Unlike the Python builtin unicode(), it will not convert Markup (literal) to plain Unicode, to avoid overescaping. This is not included in webhelpers2 but you may find it useful.