Auto-escaping and output formats

This is a detailed tutorial on auto-escaping and related concepts; for the bare minimum, read this instead.

Note:

The kind of automatic escaping described here requires at least FreeMarker 2.3.24. If you have to use an earlier version, use the deprecated escape directive instead.

Output formats

Each template has an associated output format (a freemarker.core.OutputFormat instance). The output format dictates the escaping rules, which are applied to all ${...}-s (and #{...}-s) that aren't inside a string literal. It also specifies a MIME type (e.g. "text/html") and a canonical name (e.g. "HTML") that the embedding application/framework can use for its own purposes.

It's the programmer's responsibility to associate an output format with each template. Furthermore, it's recommended that FreeMarker be configured so that templates with the ftlh and ftlx file extensions are automatically associated with the HTML and XML output formats, respectively.

The predefined output formats are:

Name | Description | MIME type | Default implementation (freemarker.core.*)
HTML | Escapes <, >, &, ", ' as &lt;, &gt;, &amp;, &quot;, &#39; | text/html | HTMLOutputFormat.INSTANCE
XHTML | Escapes <, >, &, ", ' as &lt;, &gt;, &amp;, &quot;, &#39; | application/xhtml+xml | XHTMLOutputFormat.INSTANCE
XML | Escapes <, >, &, ", ' as &lt;, &gt;, &amp;, &quot;, &apos; | application/xml | XMLOutputFormat.INSTANCE
RTF | Escapes {, }, \ as \{, \}, \\ | application/rtf | RTFOutputFormat.INSTANCE
undefined | Doesn't escape. Prints markup output values (concept explained later) from other output formats as is. The default output format used when no output format was explicitly set in the configuration. | None (null) | UndefinedOutputFormat.INSTANCE
plainText | Doesn't escape. | text/plain | PlainTextOutputFormat.INSTANCE
JavaScript | Doesn't escape. | application/javascript | JavaScriptOutputFormat.INSTANCE
JSON | Doesn't escape. | application/json | JSONOutputFormat.INSTANCE
CSS | Doesn't escape. | text/css | CSSOutputFormat.INSTANCE

Programmers can add their own output formats, so this may not be the full list of output formats available in your application.
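To make the escaping rules in the table above concrete, here is a minimal sketch of them in plain Java. It is a toy re-implementation for illustration only, not FreeMarker's own code; the class name EscapingRulesSketch and its method names are hypothetical.

```java
/** Toy re-implementation of the escaping rules in the table; not FreeMarker code. */
final class EscapingRulesSketch {

    /** HTML and XHTML rules: the apostrophe becomes &#39;. */
    static String escapeHtml(String text) {
        return escapeMarkup(text, "&#39;");
    }

    /** XML rules: same as HTML, except that the apostrophe becomes &apos;. */
    static String escapeXml(String text) {
        return escapeMarkup(text, "&apos;");
    }

    /** RTF rules: the two curly braces and the backslash get a backslash prefix. */
    static String escapeRtf(String text) {
        StringBuilder out = new StringBuilder(text.length() + 8);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '{' || c == '}' || c == '\\') {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }

    private static String escapeMarkup(String text, String apostrophe) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '<':  out.append("&lt;");     break;
                case '>':  out.append("&gt;");     break;
                case '&':  out.append("&amp;");    break;
                case '"':  out.append("&quot;");   break;
                case '\'': out.append(apostrophe); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeHtml("<b>test</b>"));
        System.out.println(escapeXml("'"));
        System.out.println(escapeRtf("Foo's bar {}"));
    }
}
```

The formats listed with "Doesn't escape." would correspond to returning the text unchanged.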

Overriding the output format in templates

Especially in legacy applications, you will often find that the output format is undefined (you can check that with ${.output_format}), and so no automatic escaping is happening. In other cases, a common output format (like HTML) is set for all templates, but a few templates need a different output format. In any case, the output format of a template can be enforced in the ftl header:

<#ftl output_format="XML">
${"'"}  <#-- Prints: &apos; -->

Above, the output format was referred to by its name as shown in the earlier table (it is actually looked up via Configuration.getOutputFormat(String name)).

Note:

If escaping doesn't happen after adding the above ftl header, then <#ftl output_format="XML" auto_esc=true> might help (and that means that FreeMarker was configured to use the "disable" auto-escaping policy, which is generally not recommended).

The output format can also be applied to only a section of a template, like:

<#-- Let's assume we have "HTML" output format by default. -->
${"'"}  <#-- Prints: &#39; -->
<#outputformat "XML">
  ${"'"}  <#-- Prints: &apos; -->
</#outputformat>
${"'"}  <#-- Prints: &#39; -->

Basically, each position in a template has an associated output format, and as you saw above, it might not be the same everywhere in the template. This association sticks to the positions and won't change as the template executes. So if, for example, you call a macro from inside an outputformat block and the called macro is defined outside that block, the macro won't get the output format of that block. Or, if you have a macro that's defined in a template with HTML output format, then no matter where you call it from, that macro will always execute with HTML output format. It's as if you colored each character of the template files by output format in a text editor, and later, when the templates are executed, only the color of the statement being executed is considered. This gives you firm control over the output format and hence escaping; you don't have to consider the possible execution paths that can lead to a point.
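The "coloring" rule above is essentially lexical binding: the output format is fixed where the code is written, not where it is called from. The following plain Java sketch models that with a closure. It is only an analogy under simplified assumptions, not how FreeMarker is implemented; LexicalFormatSketch, defineMacro and callInsideBlock are hypothetical names.

```java
import java.util.function.UnaryOperator;

/** Toy analogy for position-bound output formats; not FreeMarker code. */
final class LexicalFormatSketch {

    enum Format {
        HTML("&#39;"), XML("&apos;");

        private final String apostrophe;

        Format(String apostrophe) {
            this.apostrophe = apostrophe;
        }

        String escape(String text) {
            // & must be replaced first so that the inserted entities aren't re-escaped.
            return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
                       .replace("\"", "&quot;").replace("'", apostrophe);
        }
    }

    /** Models a macro definition: the format at the definition site is captured once. */
    static UnaryOperator<String> defineMacro(Format formatAtDefinition) {
        return formatAtDefinition::escape;
    }

    /** Models calling the macro from inside a block that has another output format. */
    static String callInsideBlock(Format blockFormat, UnaryOperator<String> macro, String arg) {
        // blockFormat is deliberately ignored: the call site doesn't "recolor" the macro body.
        return macro.apply(arg);
    }

    public static void main(String[] args) {
        UnaryOperator<String> macro = defineMacro(Format.HTML);
        // The macro still escapes as HTML although it is called from an "XML block".
        System.out.println(callInsideBlock(Format.XML, macro, "'"));
        // An interpolation written directly inside the XML block escapes as XML.
        System.out.println(Format.XML.escape("'"));
    }
}
```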

Disabling auto escaping

For a single interpolation you can disable auto-escaping with ?no_esc:

<#-- Let's assume we have "HTML" output format by default. -->
${'<b>test</b>'}  <#-- prints: &lt;b&gt;test&lt;/b&gt; -->
${'<b>test</b>'?no_esc}  <#-- prints: <b>test</b> -->

You can also disable auto escaping for a whole section with the noautoesc directive:

${'&'}  <#-- prints: &amp; -->
<#noautoesc>
  ${'&'}  <#-- prints: & -->
  ...
  ${'&'}  <#-- prints: & -->
</#noautoesc>
${'&'}  <#-- prints: &amp; -->

Just like outputformat, this only applies to the part that's literally inside the block ("coloring" logic).

Auto-escaping can also be disabled for the whole template in the ftl header. It can then be re-enabled for a section with the autoesc directive:

<#ftl auto_esc=false>
${'&'}  <#-- prints: & -->
<#autoesc>
  ${'&'}  <#-- prints: &amp; -->
  ...
  ${'&'}  <#-- prints: &amp; -->
</#autoesc>
${'&'}  <#-- prints: & -->

You can also force escaping for an individual interpolation when escaping is disabled, with ?esc:

<#ftl auto_esc=false>
${'&'}  <#-- prints: & -->
${'&'?esc}  <#-- prints: &amp; -->

Naturally, both autoesc and ?esc work inside noautoesc blocks too.

"Markup output" values

In FTL, values have a type, like string, number, boolean, etc. One such type is called "markup output". A value of that type is a piece of text that's already in the output format (like HTML), and hence needs no further escaping. We have already produced such values earlier:

  • s?esc creates a markup output value out of a string value by escaping all special characters in it.

  • s?no_esc creates a markup output value out of a string value by assuming that the string already stores markup and so needs no further escaping.

These can be useful outside ${...} too. For example, here the caller of the infoBox macro can decide if the message is plain text (hence needs escaping) or HTML (hence it mustn't be escaped):

<#-- We assume that we have "HTML" output format by default. -->

<@infoBox "Foo & bar" />
<@infoBox "Foo <b>bar</b>"?no_esc />

<#macro infoBox message>
  <div class="infoBox">
    ${message}
  </div>
</#macro>

Output:

  <div class="infoBox">
    Foo &amp; bar
  </div>
  <div class="infoBox">
    Foo <b>bar</b>
  </div>

Another case where you get a markup output value is output capturing:

<#-- We assume that we have "HTML" output format by default. -->
<#assign captured><b>Test</b></#assign>
Just a string: ${"<b>Test</b>"}
Captured output: ${captured}

Output:

Just a string: &lt;b&gt;Test&lt;/b&gt;
Captured output: <b>Test</b>

Because the captured output is markup output, it wasn't auto-escaped.

Note that markup output values aren't strings, and aren't automatically coerced to strings. Thus ?upper_case, ?starts_with, etc. will give an error with them. You won't be able to pass them to Java methods for String parameters either. But sometimes you need the markup that's behind the value as a string, which you can get as markupOutput?markup_string. Be sure you know what you are doing though. Applying string operations on markup (as opposed to on plain text) can result in invalid markup. Also, there's the danger of unintended double escaping.

<#-- We assume that we have "HTML" output format by default. -->

<#assign markupOutput1="<b>Test</b>"?no_esc>
<#assign markupOutput2="Foo & bar"?esc>

As expected:
${markupOutput1}
${markupOutput2}

Possibly unintended double escaping:
${markupOutput1?markup_string}
${markupOutput2?markup_string}

Output:

As expected:
<b>Test</b>
Foo &amp; bar

Possibly unintended double escaping:
&lt;b&gt;Test&lt;/b&gt;
Foo &amp;amp; bar
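The difference between a string and a markup output value, and the way ?markup_string can lead to double escaping, can be modeled in a few lines of plain Java. This is a simplified sketch that assumes HTML as the only output format; it is not FreeMarker's implementation, and MarkupValueSketch, esc, noEsc and interpolate are hypothetical names standing in for ?esc, ?no_esc and ${...}.

```java
/** Toy model of strings versus markup output values under HTML auto-escaping. */
final class MarkupValueSketch {

    /** Stand-in for a markup output value: text that is already HTML. */
    static final class Markup {
        private final String content;

        private Markup(String content) {
            this.content = content;
        }

        /** Models ?markup_string: hands the markup back as an ordinary string. */
        String markupString() {
            return content;
        }
    }

    /** Models s?esc: escapes plain text and marks the result as markup. */
    static Markup esc(String plainText) {
        return new Markup(escapeHtml(plainText));
    }

    /** Models s?no_esc: trusts that the string already contains markup. */
    static Markup noEsc(String markup) {
        return new Markup(markup);
    }

    /** Models an auto-escaped interpolation: strings are escaped, markup is printed as is. */
    static String interpolate(Object value) {
        if (value instanceof Markup) {
            return ((Markup) value).content;
        }
        return escapeHtml(String.valueOf(value));
    }

    static String escapeHtml(String text) {
        // & must be replaced first so that the inserted entities aren't re-escaped.
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
                   .replace("\"", "&quot;").replace("'", "&#39;");
    }

    public static void main(String[] args) {
        Markup markupOutput2 = esc("Foo & bar");
        System.out.println(interpolate(markupOutput2));                // printed as is
        System.out.println(interpolate(markupOutput2.markupString())); // escaped a second time
    }
}
```

Because markupString() returns an ordinary string, interpolating it goes through escaping again, which is the double escaping shown in the example above.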

Further details and tricky cases

Non-markup output formats

An output format is said to be a non-markup format if it defines no escaping rules. Examples of such output formats are the undefined format and the plainText format.

These formats produce no markup output values, hence you can't use ?esc or ?no_esc when they are the current format. You can use output capturing (like <#assign captured>...</#assign>), but the resulting value will be a string, not a markup output value.

Furthermore, you aren't allowed to use the autoesc directive or <#ftl auto_esc=true> when the current output format is non-markup.

Using constructs that aren't supported by the current output format will give a parse-time error.

Inserting markup output values from other markups

Each markup output value has an associated output format. When a markup output value is inserted with ${...} (or #{...}), it has to be converted to the current output format at the point of insertion (if they differ). As of this writing (2.3.24), such output format conversion will only be successful if the value to convert was created by escaping plain text:

<#-- We assume that we have "HTML" output format by default. -->

<#assign mo1 = "Foo's bar {}"?esc>
HTML: ${mo1}
XML:  <#outputformat 'XML'>${mo1}</#outputformat>
RTF:  <#outputformat 'RTF'>${mo1}</#outputformat>

<#assign mo2><p>Test</#assign>
HTML: ${mo2}
XML:  <#attempt><#outputformat 'XML'>${mo2}</#outputformat><#recover>Failed</#attempt>
RTF:  <#attempt><#outputformat 'RTF'>${mo2}</#outputformat><#recover>Failed</#attempt>

Output:

HTML: Foo&#39;s bar {}
XML:  Foo&apos;s bar {}
RTF:  Foo's bar \{\}

HTML: <p>Test
XML:  Failed
RTF:  Failed
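One way to picture why only values created by escaping plain text can be converted: such a value can still be traced back to its plain text, which can then be escaped again for the target format, while raw markup has no such source. The plain Java sketch below models this under that assumption. It is an illustration, not FreeMarker's actual code; FormatConversionSketch, sourcePlainText and convert are hypothetical names.

```java
/** Toy model of converting markup output values between output formats. */
final class FormatConversionSketch {

    enum Format {
        HTML, XML, RTF;

        String escape(String plainText) {
            if (this == RTF) {
                // The backslash must be handled first so that added backslashes aren't doubled.
                return plainText.replace("\\", "\\\\").replace("{", "\\{").replace("}", "\\}");
            }
            String apostrophe = (this == HTML) ? "&#39;" : "&apos;";
            return plainText.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
                            .replace("\"", "&quot;").replace("'", apostrophe);
        }
    }

    static final class Markup {
        final Format format;
        final String sourcePlainText; // null if the value was created from raw markup
        final String content;

        Markup(Format format, String sourcePlainText, String content) {
            this.format = format;
            this.sourcePlainText = sourcePlainText;
            this.content = content;
        }
    }

    /** Models ?esc: the plain text is remembered next to the escaped markup. */
    static Markup esc(Format format, String plainText) {
        return new Markup(format, plainText, format.escape(plainText));
    }

    /** Models ?no_esc and output capturing: only the markup is known. */
    static Markup noEsc(Format format, String markup) {
        return new Markup(format, null, markup);
    }

    /** Converts by re-escaping the source plain text; fails if there is none. */
    static Markup convert(Markup value, Format target) {
        if (value.format == target) {
            return value;
        }
        if (value.sourcePlainText == null) {
            throw new IllegalStateException(
                    "Can't convert " + value.format + " markup to " + target);
        }
        return esc(target, value.sourcePlainText);
    }

    public static void main(String[] args) {
        Markup mo1 = esc(Format.HTML, "Foo's bar {}");
        System.out.println(convert(mo1, Format.XML).content);
        System.out.println(convert(mo1, Format.RTF).content);
        try {
            convert(noEsc(Format.HTML, "<p>Test"), Format.XML);
        } catch (IllegalStateException e) {
            System.out.println("Failed");
        }
    }
}
```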

But an output format can also choose to insert pieces of other output formats as is, without converting them. Among the standard output formats, undefined is like that, which is the output format used for templates for which no output format was specified in the configuration:

<#-- We assume that we have "undefined" output format here. -->

<#outputformat "HTML"><#assign htmlMO><p>Test</#assign></#outputformat>
<#outputformat "XML"><#assign xmlMO><p>Test</p></#assign></#outputformat>
<#outputformat "RTF"><#assign rtfMO>\par Test</#assign></#outputformat>
HTML: ${htmlMO}
XML:  ${xmlMO}
RTF:  ${rtfMO}

Output:

HTML: <p>Test
XML:  <p>Test</p>
RTF:  \par Test

Markup output values and the "+" operator

As you certainly know, if one of the sides of the + operator is a string then it does concatenation. If there's a markup output value on one side, the other side gets promoted to a markup output value of the same output format (if it's not already that) by escaping its string value, and finally the two markups are concatenated to form a new markup output value. Example:

<#-- We assume that we have "HTML" output format by default. -->
${"<h1>"?no_esc + "Foo & bar" + "</h1>"?no_esc}

Output:

<h1>Foo &amp; bar</h1>

If the two sides of the + operator are markup values of different output formats, the right side operand is converted to the output format of the left side. If that's not possible, then the left side operand is converted to the output format of the right side. If that isn't possible either, that's an error. (See the limitations of conversions here.)
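The promotion rule for + can be sketched in plain Java as follows. The sketch assumes a single output format (HTML) and therefore doesn't model the conversion between different output formats; it is not FreeMarker's implementation, and MarkupConcatSketch, plus and noEsc are hypothetical names.

```java
/** Toy model of the + operator when markup output values are involved (HTML only). */
final class MarkupConcatSketch {

    /** Stand-in for an HTML markup output value. */
    static final class Markup {
        final String content;

        Markup(String content) {
            this.content = content;
        }
    }

    /** Models s?no_esc. */
    static Markup noEsc(String markup) {
        return new Markup(markup);
    }

    /**
     * Models left + right. If either side is markup, the other side is promoted
     * to markup by escaping it, and the result is markup. Otherwise it is a
     * plain string concatenation.
     */
    static Object plus(Object left, Object right) {
        if (left instanceof Markup || right instanceof Markup) {
            return new Markup(toMarkup(left) + toMarkup(right));
        }
        return String.valueOf(left) + String.valueOf(right);
    }

    private static String toMarkup(Object value) {
        if (value instanceof Markup) {
            return ((Markup) value).content;
        }
        return escapeHtml(String.valueOf(value));
    }

    static String escapeHtml(String text) {
        // & must be replaced first so that the inserted entities aren't re-escaped.
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
                   .replace("\"", "&quot;").replace("'", "&#39;");
    }

    public static void main(String[] args) {
        Object result = plus(plus(noEsc("<h1>"), "Foo & bar"), noEsc("</h1>"));
        System.out.println(((Markup) result).content);
    }
}
```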

${...} inside string literals

When ${...} is used inside string expressions (e.g., in <#assign s = "Hello ${name}!">), it's just shorthand for using the + operator (<#assign s = "Hello " + name + "!">). Thus, ${...}-s inside string expressions aren't auto-escaped, but of course when the resulting concatenated string is printed later, it may be auto-escaped.

<#-- We assume that we have "HTML" output format by default. -->
<#assign name = "Foo & Bar">

<#assign s = "<p>Hello ${name}!">
${s}
<p>Hello ${name}!

To prove that s didn't contain the value in escaped form:
${s?replace('&', 'and')}

Output:

&lt;p&gt;Hello Foo &amp; Bar!
<p>Hello Foo &amp; Bar!

To prove that s didn't contain the value in escaped form:
&lt;p&gt;Hello Foo and Bar!

Combined output formats

Combined output formats are output formats that are created ad-hoc from other output formats by nesting them into each other, so that the escaping of both output formats is applied. They are discussed here...