webhelpers2.text
Functions that output text (not HTML).
Helpers for filtering, formatting, and transforming strings.
- webhelpers2.text.chop_at(s, sub, inclusive=False)
Truncate string s at the first occurrence of sub. If inclusive is true, truncate just after sub rather than at it.
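The truncation rule can be illustrated with a minimal sketch. The function name chop_at_sketch is hypothetical and this is not the library's source; in particular, returning s unchanged when sub is absent is an assumption, since the description does not cover that case.

```python
def chop_at_sketch(s, sub, inclusive=False):
    """Hypothetical stand-in for chop_at, written from its description."""
    pos = s.find(sub)      # index of the first occurrence, or -1 if absent
    if pos == -1:
        return s           # assumption: leave s untouched when sub is missing
    if inclusive:
        pos += len(sub)    # cut just after sub instead of just before it
    return s[:pos]


print(chop_at_sketch("plutocratic brats", "rat"))        # plutoc
print(chop_at_sketch("plutocratic brats", "rat", True))  # plutocrat
```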
- webhelpers2.text.collapse(string, character=' ')
Removes specified character from the beginning and/or end of the string and then condenses runs of the character within the string.
Based on Ruby’s stringex package (http://github.com/rsl/stringex/tree/master)
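As a rough illustration of the two steps (trim the edges, then condense interior runs), here is a sketch built on the standard re module. The name collapse_sketch is hypothetical and the regular expressions are one possible reading of the description, not the library's code.

```python
import re


def collapse_sketch(string, character=" "):
    """Hypothetical stand-in for collapse, written from its description."""
    esc = re.escape(character)
    # Step 1: drop the character from the beginning and end of the string.
    trimmed = re.sub(rf"^(?:{esc})+|(?:{esc})+$", "", string)
    # Step 2: condense each interior run of two or more into a single one.
    # A function replacement avoids backslash processing of `character`.
    return re.sub(rf"(?:{esc}){{2,}}", lambda m: character, trimmed)


print(collapse_sketch("  foo   bar  "))    # foo bar
print(collapse_sketch("--a---b-", "-"))    # a-b
```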
- webhelpers2.text.convert_accented_entities(string)
Converts HTML entities into the respective non-accented letters.
Examples:
>>> convert_accented_entities("&aacute;")
'a'
>>> convert_accented_entities("&ccedil;")
'c'
>>> convert_accented_entities("&egrave;")
'e'
>>> convert_accented_entities("&icirc;")
'i'
>>> convert_accented_entities("&oslash;")
'o'
>>> convert_accented_entities("&uuml;")
'u'
Note: This does not do any conversion of Unicode/ASCII accented-characters. For that functionality please use unidecode.
Based on Ruby’s stringex package (http://github.com/rsl/stringex/tree/master)
- webhelpers2.text.convert_misc_entities(string)
Converts HTML entities (taken from common Textile formatting) into plain text formats.
Note: This isn’t an attempt at complete conversion of HTML entities, just those most likely to be generated by Textile.
Based on Ruby’s stringex package (http://github.com/rsl/stringex/tree/master)
- webhelpers2.text.excerpt(text, phrase, radius=100, excerpt_string='...')
Extract an excerpt from the text, or '' if the phrase isn't found.
phrase
Phrase to excerpt from text
radius
How many surrounding characters to include
excerpt_string
Characters surrounding entire excerpt
Example:
>>> excerpt("hello my world", "my", 3)
'...lo my wo...'
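The radius arithmetic behind the example can be sketched as follows. The name excerpt_sketch is hypothetical, and two details are assumptions rather than documented behavior: the match is case-sensitive, and excerpt_string is added only on a side where text was actually cut off.

```python
def excerpt_sketch(text, phrase, radius=100, excerpt_string="..."):
    """Hypothetical stand-in for excerpt, written from its description."""
    pos = text.find(phrase)            # assumption: case-sensitive search
    if pos == -1:
        return ""                      # phrase not found
    start = max(pos - radius, 0)
    end = min(pos + len(phrase) + radius, len(text))
    piece = text[start:end]
    # Assumption: mark a side only when characters were dropped there.
    if start > 0:
        piece = excerpt_string + piece
    if end < len(text):
        piece += excerpt_string
    return piece


print(excerpt_sketch("hello my world", "my", 3))   # ...lo my wo...
```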
- webhelpers2.text.lchop(s, sub)
Chop sub off the front of s if present.
>>> lchop("##This is a comment.##", "##")
'This is a comment.##'
The difference between lchop and s.lstrip is that lchop strips only the exact prefix, while s.lstrip treats the argument as a set of leading characters to delete regardless of order.
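The contrast with s.lstrip is easiest to see side by side. The helper lchop_sketch below is a hypothetical stand-in written from the description, not the library's code; str.lstrip is the standard string method.

```python
def lchop_sketch(s, sub):
    """Hypothetical stand-in for lchop: remove sub only as an exact prefix."""
    return s[len(sub):] if sub and s.startswith(sub) else s


title = "abcabc-report"
print(lchop_sketch(title, "abc"))   # abc-report  (one exact prefix removed)
print(title.lstrip("abc"))          # -report     (every leading a, b or c removed)
```

On Python 3.9 and later, the built-in str.removeprefix gives the same exact-prefix behavior.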
- webhelpers2.text.plural(n, singular, plural, with_number=True)
Return the singular or plural form of a word, according to the number.
If with_number is true (default), the return value will be the number followed by the word. Otherwise the word alone will be returned.
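A minimal sketch of the selection logic follows. The name plural_sketch is hypothetical, and treating only n == 1 as singular is an assumption; the description does not say how zero or negative numbers are handled.

```python
def plural_sketch(n, singular, plural, with_number=True):
    """Hypothetical stand-in for plural, written from its description."""
    word = singular if n == 1 else plural   # assumption: only 1 is singular
    return f"{n} {word}" if with_number else word


print(plural_sketch(1, "ox", "oxen"))                      # 1 ox
print(plural_sketch(2, "ox", "oxen"))                      # 2 oxen
print(plural_sketch(2, "ox", "oxen", with_number=False))   # oxen
```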
- webhelpers2.text.rchop(s, sub)
Chop sub off the end of s if present.
>>> rchop("##This is a comment.##", "##")
'##This is a comment.'
The difference between rchop and s.rstrip is that rchop strips only the exact suffix, while s.rstrip treats the argument as a set of trailing characters to delete regardless of order.
- webhelpers2.text.remove_formatting(string)
Simplify HTML text by removing tags and several kinds of formatting.
If the unidecode package is installed, it will also transliterate non-ASCII Unicode characters to their nearest pronunciation equivalent in ASCII.
Based on Ruby’s stringex package (http://github.com/rsl/stringex/tree/master)
- webhelpers2.text.replace_whitespace(string, replace=' ')
Replace runs of whitespace in string.
Defaults to a single space but any replacement string may be specified as an argument. Examples:
>>> replace_whitespace("Foo       bar")
'Foo bar'
>>> replace_whitespace("Foo       bar", "-")
'Foo-bar'
Based on Ruby’s stringex package (http://github.com/rsl/stringex/tree/master)
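The same effect can be approximated with the standard re module. The name replace_whitespace_sketch is hypothetical, and matching runs with \s+ (spaces, tabs and newlines alike) is an assumption about what counts as whitespace here.

```python
import re


def replace_whitespace_sketch(string, replace=" "):
    """Hypothetical stand-in for replace_whitespace."""
    # Each run of one or more whitespace characters becomes one `replace`.
    # A function replacement avoids backslash processing of `replace`.
    return re.sub(r"\s+", lambda m: replace, string)


print(replace_whitespace_sketch("Foo   \t  bar"))        # Foo bar
print(replace_whitespace_sketch("Foo   \t  bar", "-"))   # Foo-bar
```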
- webhelpers2.text.series(*items, **kw)
Join strings using commas and a conjunction such as “and” or “or”.
The conjunction defaults to “and”. Pass ‘conj’ as a keyword arg to change it. Pass ‘strict=False’ to omit the comma before the conjunction.
Examples:
>>> series("A", "B")
'A and B'
>>> series("A", "B", conj="or")
'A or B'
>>> series("A", "B", "C")
'A, B, and C'
>>> series("A", "B", "C", strict=False)
'A, B and C'
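The joining rules shown in the examples can be sketched as below. The name series_sketch is hypothetical; it spells out conj and strict as keyword-only parameters instead of **kw, and its handling of zero or one item is an assumption.

```python
def series_sketch(*items, conj="and", strict=True):
    """Hypothetical stand-in for series, written from its examples."""
    items = [str(item) for item in items]
    if len(items) <= 1:
        return "".join(items)             # assumption: nothing to join
    if len(items) == 2:
        return f"{items[0]} {conj} {items[1]}"
    comma = "," if strict else ""         # strict keeps the serial comma
    return f"{', '.join(items[:-1])}{comma} {conj} {items[-1]}"


print(series_sketch("A", "B", "C"))                 # A, B, and C
print(series_sketch("A", "B", "C", strict=False))   # A, B and C
```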
- webhelpers2.text.strip_leading_whitespace(s)
Strip the leading whitespace in all lines in s.
This deletes all leading whitespace. textwrap.dedent deletes only the whitespace common to all lines.
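The difference from textwrap.dedent shows up as soon as lines have unequal indentation. In this sketch, strip_leading_whitespace_sketch is a hypothetical stand-in, and limiting it to spaces and tabs is an assumption; textwrap.dedent is the standard library function.

```python
import re
import textwrap


def strip_leading_whitespace_sketch(s):
    """Hypothetical stand-in: delete leading spaces and tabs on every line."""
    return re.sub(r"(?m)^[ \t]+", "", s)


source = "    if ready:\n        launch()\n"
print(repr(strip_leading_whitespace_sketch(source)))  # 'if ready:\nlaunch()\n'
print(repr(textwrap.dedent(source)))                  # 'if ready:\n    launch()\n'
```

dedent keeps the relative indentation of the second line, while the sketch flattens every line to column zero.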
- webhelpers2.text.truncate(text, length=30, indicator='...', whole_word=False)
Truncate text with replacement characters.
length
The maximum length of text before replacement
indicator
If text exceeds the length, this string will replace the end of the string
whole_word
If true, shorten the string further to avoid breaking a word in the middle. A word is defined as any string not containing whitespace. If the entire text before the break is a single word, it will have to be broken.
Example:
>>> truncate('Once upon a time in a world far far away', 14)
'Once upon a...'
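The example implies that the indicator counts toward length (11 characters of text plus a 3-character indicator gives 14). The sketch below follows that reading; truncate_sketch is a hypothetical name, the backward scan for whole_word is one way to implement the description, and the sketch assumes length is at least len(indicator).

```python
def truncate_sketch(text, length=30, indicator="...", whole_word=False):
    """Hypothetical stand-in for truncate, written from its description."""
    if len(text) <= length:
        return text
    cut = length - len(indicator)       # room left for the text itself
    if not whole_word:
        return text[:cut] + indicator
    i = cut
    while i >= 0 and not text[i].isspace():   # back out of the broken word
        i -= 1
    while i >= 0 and text[i].isspace():       # skip the whitespace before it
        i -= 1
    if i < 0:
        # Everything before the break is a single word, so break it anyway.
        return text[:cut] + indicator
    return text[: i + 1] + indicator


story = "Once upon a time in a world far far away"
print(truncate_sketch(story, 16))                    # Once upon a t...
print(truncate_sketch(story, 16, whole_word=True))   # Once upon a...
```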
- webhelpers2.text.urlify(string)
Create a URI-friendly representation of the string.
Can be called manually in order to generate a URI-friendly version of any string.
If the unidecode package is installed, it will also transliterate non-ASCII Unicode characters to their nearest pronunciation equivalent in ASCII.
Examples:
>>> urlify("Mighty Mighty Bosstones")
'mighty-mighty-bosstones'
Based on Ruby’s stringex package (http://github.com/rsl/stringex/tree/master)
Changed in WebHelpers 1.2: urlencode the result in case it contains special characters like “?”.
- webhelpers2.text.wrap_long_lines(text, width=72)
Wrap all long lines in a text string to the specified width.
width may be an int or a textwrap.TextWrapper instance. The latter allows you to set other options besides the width, and is more efficient when wrapping many texts.
Unlike wrap_paragraphs(), this splits individual lines and does not look at the paragraph context. Thus it never joins lines. This is safer if the text might contain preformatted lines (tables, poetry, headers) in the middle of paragraphs. However, it could lead to splitting a line just before the last word or two, putting the orphan words on a separate line, in the middle of a paragraph.
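The line-by-line approach can be sketched with the standard textwrap module. The name wrap_long_lines_sketch is hypothetical, and this version drops a trailing newline, which may differ from the library's behavior.

```python
import textwrap


def wrap_long_lines_sketch(text, width=72):
    """Hypothetical stand-in: wrap each overlong line on its own."""
    # Accept either an int or a ready-made TextWrapper, as described above.
    if isinstance(width, textwrap.TextWrapper):
        wrapper = width
    else:
        wrapper = textwrap.TextWrapper(width=width)
    out = []
    for line in text.splitlines():
        if len(line) > wrapper.width:
            out.extend(wrapper.wrap(line))   # split this line only
        else:
            out.append(line)                 # short and blank lines stay as is
    return "\n".join(out)


sample = "Title\nalpha beta gamma delta epsilon\n\nend"
print(wrap_long_lines_sketch(sample, 16))
```

Only the second line exceeds the width, so it alone is split; the heading, the blank line and the last line pass through untouched and no lines are joined.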
- webhelpers2.text.wrap_paragraphs(text, width=72)
Wrap all paragraphs in a text string to the specified width.
width may be an int or a textwrap.TextWrapper instance. The latter allows you to set other options besides the width, and is more efficient when wrapping many texts.
This is intended only to split lines that are too long. It keeps short lines intact, including at the beginning of paragraphs. If a paragraph starts with short lines and then a long line, it will keep the initial short lines as is, and wrap from the long line until the end of the paragraph (a blank line, a line containing only whitespace, or the end of the document). This is intended to preserve preformatted text (tables, poetry, headers), but occasionally it may preserve short lines you wanted to join.