Charset issues
FreeMarker, like most Java applications, works with "UNICODE text" (UTF-16). Nonetheless, there are situations when it must deal with charsets, because it has to exchange data with the outer world, which may use various other charsets.
The charset of the input
When FreeMarker has to load a template file (or an unparsed text file), it must know the charset of the file, since files are just raw byte sequences. You can use the encoding setting to specify the charset. This setting takes effect only when FreeMarker loads a template (parsed or unparsed) with the getTemplate method of Configuration. Note that the include directive uses this method internally, so the value of the encoding setting is significant for an already loaded template if the template contains an include directive call.
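To illustrate why the charset of a file has to be known, here is a minimal sketch using only the Java standard library (it does not call FreeMarker). The class name DecodeDemo and its decode helper are hypothetical names chosen for this example. It decodes the same two bytes once with the charset they were written in and once with a different one.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of FreeMarker.
final class DecodeDemo {
    // Decodes raw bytes into text using the given charset.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent) stored as UTF-8 occupies two bytes: 0xC3 0xA9.
        byte[] raw = "\u00e9".getBytes(StandardCharsets.UTF_8);

        String right = decode(raw, StandardCharsets.UTF_8);
        String wrong = decode(raw, StandardCharsets.ISO_8859_1);

        System.out.println(right.equals("\u00e9"));       // true: the original character
        System.out.println(wrong.equals("\u00c3\u00a9")); // true: two unrelated characters
    }
}
```

The byte sequence is identical in both calls; only the assumed charset differs, which is why a wrong encoding setting typically shows up as garbled non-ASCII characters in the loaded template.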
The getter and setter methods of the encoding setting are special in the first (configuration) layer. The getter method guesses the return value based on a Locale passed as a parameter; it looks up the encoding in a table that maps locales to encodings (called the encoding map), and if the locale is not found there, it returns the default encoding. You can fill the encoding map with the setEncoding(Locale locale, String encoding) method of the configuration; the encoding map is initially empty. The default encoding is initially the value of the file.encoding system property, but you should always set a default encoding with the setDefaultEncoding method rather than relying on that. For new projects, a popular default encoding is utf-8.
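The lookup described above can be modeled roughly as follows. This is a simplified stand-alone sketch, not FreeMarker's actual implementation: the class EncodingMapSketch is hypothetical, it only mirrors the method names mentioned in the text, and the real Configuration may apply additional matching rules that are not shown here.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical model of the "encoding map" behavior; not FreeMarker code.
final class EncodingMapSketch {
    // Initially empty, as the text describes.
    private final Map<Locale, String> encodingMap = new HashMap<>();
    private String defaultEncoding;

    EncodingMapSketch(String defaultEncoding) {
        this.defaultEncoding = defaultEncoding;
    }

    void setDefaultEncoding(String encoding) {
        this.defaultEncoding = encoding;
    }

    // Registers the encoding to use for templates of the given locale.
    void setEncoding(Locale locale, String encoding) {
        encodingMap.put(locale, encoding);
    }

    // Returns the mapped encoding, or the default when the locale is unknown.
    String getEncoding(Locale locale) {
        return encodingMap.getOrDefault(locale, defaultEncoding);
    }

    public static void main(String[] args) {
        EncodingMapSketch cfg = new EncodingMapSketch("UTF-8");
        cfg.setEncoding(Locale.JAPAN, "Shift_JIS");

        System.out.println(cfg.getEncoding(Locale.JAPAN));  // Shift_JIS
        System.out.println(cfg.getEncoding(Locale.FRANCE)); // UTF-8
    }
}
```

Setting the default explicitly in the constructor reflects the advice above: the result no longer depends on the file.encoding system property of the machine the application happens to run on.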
You can give the charset directly by overriding the encoding setting in the template layer or runtime environment layer. (When you specify an encoding as the parameter of the getTemplate method, you override the encoding setting in the template layer.) If you don't override it, the effective value will be what the configuration.getEncoding(Locale) method returns for the effective value of the locale setting.
Also, instead of relying on this charset guessing mechanism, you can specify the charset of the template in the template file itself, with the ftl directive, like <#ftl encoding="utf-8">.
Note that the charset of the template is independent of the charset of the output that the template generates (unless the enclosing software deliberately sets the output charset to the same as the template charset).
The charset of the output
The output_encoding setting/variable and the url built-in are available since FreeMarker 2.3.1. They don't exist in 2.3.
In principle, FreeMarker does not deal with the charset of the output, since it writes the output to a java.io.Writer. Since the Writer is created by the software that encapsulates FreeMarker (such as a Web application framework), the output charset is controlled by the encapsulating software. Still, FreeMarker has a setting called output_encoding (starting from FreeMarker version 2.3.1). The enclosing software should set this setting to the charset that the Writer uses, to inform FreeMarker what charset is used for the output (otherwise FreeMarker can't find it out). Some features, such as the url built-in and the output_encoding special variable, utilize this information. Thus, if the enclosing software doesn't set this setting, then FreeMarker features that need to know the output charset can't be used.
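The reason URL escaping needs the output charset can be seen with the standard java.net.URLEncoder class. This is only an illustration of the dependency, not a description of how the url built-in is implemented; the class UrlEscapeDemo and its escape helper are hypothetical names, and the URLEncoder.encode(String, Charset) overload assumes Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of FreeMarker.
final class UrlEscapeDemo {
    // Percent-encodes the text using the bytes it has in the given charset.
    static String escape(String text, Charset outputCharset) {
        return URLEncoder.encode(text, outputCharset);
    }

    public static void main(String[] args) {
        String text = "\u00e9"; // e with acute accent

        // Same character, different escapes depending on the charset.
        System.out.println(escape(text, StandardCharsets.UTF_8));      // %C3%A9
        System.out.println(escape(text, StandardCharsets.ISO_8859_1)); // %E9
    }
}
```

Because the escaped form differs per charset, a feature that produces such escapes has no correct answer until it is told which charset the output uses.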
If you write software that will use FreeMarker, you may wonder what output charset you should choose. Of course it depends on the consumer of the FreeMarker output, but if the consumer is flexible regarding this question, then the common practice is either using the charset of the template file for the output, or using UTF-8. Using UTF-8 is usually the better practice, because arbitrary text may come from the data-model, which may then contain characters that can't be encoded with the charset of the template.
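The encodability concern can be checked with the standard java.nio.charset API. The following is a small sketch under the assumption that the template charset is ISO-8859-1 and that the data-model supplies a euro sign; the class EncodableCheck and its helper are hypothetical names for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of FreeMarker.
final class EncodableCheck {
    // Reports whether every character of the text can be encoded in the charset.
    static boolean canEncode(String text, Charset charset) {
        return charset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String fromDataModel = "\u20ac 10"; // euro sign followed by " 10"

        System.out.println(canEncode(fromDataModel, StandardCharsets.ISO_8859_1)); // false
        System.out.println(canEncode(fromDataModel, StandardCharsets.UTF_8));      // true
    }
}
```

UTF-8 can encode any valid Unicode text, so choosing it for the output avoids this class of problem regardless of what the data-model contains.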
FreeMarker settings can be set for each individual template processing if you use Template.createProcessingEnvironment(...) plus Environment.process(...) instead of Template.process(...). Thus, you can set the output_encoding setting for each template execution independently:
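// The Writer encodes the output with outputCharset (a charset name such as "UTF-8")
Writer w = new OutputStreamWriter(out, outputCharset);
Environment env = template.createProcessingEnvironment(dataModel, w);
// Tell FreeMarker which charset the Writer uses
env.setOutputEncoding(outputCharset);
env.process();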