[Dancer-users] [ proposal ] charset discussion

damien krotkine dkrotkine at gmail.com
Tue Dec 14 13:59:55 CET 2010


On 14 December 2010 11:36, Alex Kapranoff <kapranoff at gmail.com> wrote:

> On Tue, Dec 14, 2010 at 13:10, damien krotkine <dkrotkine at gmail.com>wrote:
>
>> I was willing to fix a minor thing (the fact that the feature provided in
>> the commit works only with utf8, not any other encoding), then I realize
>> that there are 2 concepts of charsets that are mixed in Dancer (at least I
>> think) :
>>
>> - the 'charset' setting is used to set the response headers' encoding and
>> charset (see
>> https://github.com/sukria/Dancer/blob/master/lib/Dancer/Handler.pm#L129 )
>> - the 'charset' setting is used to read files (with this new open_file
>> feature)
>>
>> These 2 concepts are legitimate, but having them bound to the same setting
>> is error prone : let say you have latin1 template files, and you want your
>> Dancer application to provide UTF8 encoding, you can't. And I can see people
>> trying to fix that by mixing the charset setting, and the ENCODING options
>> of D::P::TemplateToolkit for instance. That will be messy.
>>
>
> As the original author the 'charset' setting I definitely intended it to
> mean "outgoing chatset" — the one to generate HTTP responses in.
> I absolutely agree that using it to mean different things might be very
> confusing.
>
>
>> The Dancer documentation says that the 'charset' setting is "The default
>> charset of outgoing content". I think that it should be kept this way. the
>> 'charset' setting should not be used for local file reading/writing.
>>
>> Also, Dancer::Renderer::templates provides a default layout template (
>> https://github.com/sukria/Dancer/blob/master/lib/Dancer/Renderer.pm#L200), but set its charset as UTF-8 by default, which contradicts the 'charset'
>> documentation : "Default value is empty which means don't do anything. HTTP
>> responses without charset will be interpreted as ISO-8859-1 by most
>> clients.". But maybe I missed something, especially the cases
>> where Dancer::Renderer::templates is used.
>>
>> Anyway, I can see the current situation bite us in the back. Unless I have
>> misunderstood the code, in this case, let me know :)
>>
>> These steps could fix the situation :
>>
>> - open_file() should not refer to the same thing used by the Handler to
>> set Dancer headers
>> - there should be a way to indicate to Dancer that local files read /
>> written are in a given charset
>> - also, my personal opinion is that the 'charset' setting could be utf8 by
>> default, as utf8 is cool :) But that's not mandatory to fix the issues
>>
>> Sadly, I don't know how to do that without adding a new setting, something
>> like 'local_charset', or 'files_charset'... which would be used y open_file,
>> the 'charset' be left as it is now, and used only for Dancer responses.
>>
>
> I would strongly suggest limiting the support to UTF-8 only in as many
> places as possible. I don't see any need to support templates in any other
> encoding in 2010, for example. Dancer doesn't have the burden of huge legacy
> systems and it should use this as its advantage. I would also suggest
> defaulting 'charset' to UTF-8 but that will break things for people not
> careful enough with their 8bit data.
>

Well, regarding file i/o, as long as we allow users to set their encoding,
we are not limited to utf8. I suspect that a lot of linux operating system
are still not using utf8. And Windows variants are not. As for what content
Dancer returns, I agree that utf8 could be the only charset supported. But I
wouldn't want to make that call by myself.

In addition, there is a catch : there is no strict separation between what
is read from the file system and what is served by Dancer : static files are
served directly, while templates, being generated by templating system, can
produce files in different charsets.

So it seems difficult to draw a line between what 'charset' does and
'file_io_charset' would do.

It's probably easier to consider that the files are in the same charset as
the charset that Dancer outputs
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.backup-manager.org/pipermail/dancer-users/attachments/20101214/7f276b26/attachment.htm>


More information about the Dancer-users mailing list