= Problem =

You have a chunk of arbitrary data taken from user input, from a file, or from a database.
You wish to include it in the HTML output generated by a Quixote handler.  This means
that '<', '>', and '&' characters in the data need to be turned into the HTML entities
'&lt;', '&gt;', and '&amp;'.

= Solution =

You can do the escaping manually, putting in calls to `quixote.html.html_quote()` as needed:

{{{
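from quixote.html import html_quote

def f(request):
    # Fetch the 'para' form field, falling back to a default string.
    paragraph = request.get_form_var('para', 'No text provided')
    # Escape the untrusted value before interpolating it into markup.
    return '<p>%s</p>' % html_quote(paragraph)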
}}}

Or you can use the `htmlescape()` function to get
an instance of the `htmltext` type; this type takes care
of escaping ordinary strings when they're combined with it:

{{{
from quixote.html import htmlescape, htmltext

def f(request):
    paragraph = request.get_form_var('para', 'No text provided')
    # The literals are marked as safe markup; the user data is escaped.
    return htmltext('<p>') + htmlescape(paragraph) + htmltext('</p>')
}}}


= Discussion =

Most non-Quixote applications take the manual approach: every use of data
that needs escaping is wrapped in a function that performs the required substitution.
The major problem is that this is error-prone: it's easy to forget the call in one location.
Worse, you won't notice the error until the data actually contains an
HTML tag or an ampersand.  This opens your application to a "cross-site scripting" attack, in which
an attacker injects JavaScript into your pages and uses it to steal data or cause damage.
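
To see why a forgotten call goes unnoticed, here is a minimal sketch that needs only the Python standard library.  It uses `html.escape()` as a stand-in for `html_quote()`; that substitution, and the function names `render_unsafe()` and `render_safe()`, are assumptions made for illustration and are not part of Quixote.

```python
from html import escape  # stdlib stand-in for quixote.html.html_quote


def render_unsafe(comment):
    # Bug: the user-supplied value is interpolated without escaping.
    return '<p>%s</p>' % comment


def render_safe(comment):
    # quote=False limits escaping to '&', '<' and '>'.
    return '<p>%s</p>' % escape(comment, quote=False)


harmless = 'Nice page'
hostile = '<script>alert(1)</script>'

# With harmless input the two versions produce identical output,
# so ordinary testing does not reveal the missing call.
print(render_unsafe(harmless) == render_safe(harmless))

# With hostile input only the escaped version is safe to send to a browser.
print(render_unsafe(hostile))
print(render_safe(hostile))
```

The two functions differ only when the input contains markup, which is exactly the case an attacker will supply and a casual tester usually won't.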

A second problem is the opposite error: quoting too many times, so that users end up seeing `<p>&amp;...`
on your pages.  In a complicated application you'll have utility functions that format a specific type of object
or generate some frequently used text.  Should these utility functions return already-escaped data, or should escaping be
the caller's responsibility?  It's easy to get it wrong and run the data through `html_quote()` twice.
Unlike forgetting to escape, this error is merely embarrassing; it doesn't open any security holes.
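
The double-quoting mistake can be reproduced with a short sketch.  As before, `html.escape()` stands in for `html_quote()`, and `format_name()` and `page()` are hypothetical names invented for this example.

```python
from html import escape  # stdlib stand-in for quixote.html.html_quote


def format_name(name):
    # Hypothetical utility function that returns already-escaped markup.
    return '<b>%s</b>' % escape(name, quote=False)


def page(name):
    # Bug: the caller assumes format_name() returns raw data
    # and escapes its result a second time.
    return '<p>%s</p>' % escape(format_name(name), quote=False)


print(format_name('Smith & Sons'))
print(page('Smith & Sons'))
```

The second line of output contains `&amp;amp;` and `&lt;b&gt;`, so a browser would show the visitor literal tags and entities instead of bold text.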

The `htmltext` type simplifies the problem: if a value is an `htmltext` instance, it has already been quoted
and is safe to output.  `htmlescape()` is a counterpart of `html_quote()` that returns `htmltext` instead of a
regular Python string, so data can be passed through `htmlescape()` any number of times but will only be escaped once.
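
The idea of escaping only once can be sketched in plain Python.  The `SafeText` class and `safe_escape()` function below are hypothetical stand-ins written for this page; they are not Quixote's implementation of `htmltext` and `htmlescape()`, and only mimic the behaviour described above.

```python
from html import escape


class SafeText:
    """Hypothetical stand-in for htmltext: wraps a string that is safe to output."""

    def __init__(self, s):
        # As with the htmltext constructor, the caller vouches for s.
        self.s = s

    def __add__(self, other):
        # Anything that is not already SafeText is escaped before joining.
        return SafeText(self.s + safe_escape(other).s)

    def __radd__(self, other):
        # Handles plain_string + SafeText.
        return SafeText(safe_escape(other).s + self.s)

    def __str__(self):
        return self.s


def safe_escape(value):
    # Already-safe values pass through untouched, so repeated calls are harmless.
    if isinstance(value, SafeText):
        return value
    return SafeText(escape(str(value), quote=False))


once = safe_escape('a < b & c')
twice = safe_escape(once)
print(str(once) == str(twice))
print(SafeText('<p>') + 'x & y' + SafeText('</p>'))
```

Because the type itself records whether a value has been escaped, a utility function and its caller can both call `safe_escape()` without having to agree on who is responsible.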

If you decide to use the `htmltext` constructor, be sure to use it only with string literals, or with data you're ''very'' sure 
is safe, because passing a string to the constructor says that the string is safe to output.

You can either create `htmltext` instances manually, or you can use the HTML template type in PTL, which
turns the second example above into the following PTL code:

{{{
def f [html] (request):
    paragraph = request.get_form_var('para', 'No text provided')
    '<p>'
    paragraph
    '</p>'
}}}

In an HTML template, string literals in the code are automatically turned into `htmltext` instances.
Properly written HTML templates will never need to use the `htmltext()` constructor explicitly.

Consult the [[http://www.mems-exchange.org/software/quixote/doc/PTL.html|doc/PTL]] file in 
the Quixote source distribution for more details about PTL and the `htmltext` type.

----
CategoryCookbook