On Tue, Dec 14, 2010 at 13:10, damien krotkine <dkrotkine@gmail.com> wrote:
I was willing to fix a minor thing (the fact that the feature provided in the commit works only with utf8, not any other encoding), then I realize that there are 2 concepts of charsets that are mixed in Dancer (at least I think) :
- the 'charset' setting is used to read files (with this new open_file feature)
These 2 concepts are legitimate, but having them bound to the same setting is error prone : let say you have latin1 template files, and you want your Dancer application to provide UTF8 encoding, you can't. And I can see people trying to fix that by mixing the charset setting, and the ENCODING options of D::P::TemplateToolkit for instance. That will be messy.
As the original author the 'charset' setting I definitely intended it to mean "outgoing chatset" — the one to generate HTTP responses in.
I absolutely agree that using it to mean different things might be very confusing.
The Dancer documentation says that the 'charset' setting is "The default charset of outgoing content". I think that it should be kept this way. the 'charset' setting should not be used for local file reading/writing.
Also, Dancer::Renderer::templates provides a default layout template (
https://github.com/sukria/Dancer/blob/master/lib/Dancer/Renderer.pm#L200 ), but set its charset as UTF-8 by default, which contradicts the 'charset' documentation : "Default value is empty which means don't do anything. HTTP responses without charset will be interpreted as ISO-8859-1 by most clients.". But maybe I missed something, especially the cases where Dancer::Renderer::templates is used.
Anyway, I can see the current situation bite us in the back. Unless I have misunderstood the code, in this case, let me know :)
These steps could fix the situation :
- open_file() should not refer to the same thing used by the Handler to set Dancer headers
- there should be a way to indicate to Dancer that local files read / written are in a given charset
- also, my personal opinion is that the 'charset' setting could be utf8 by default, as utf8 is cool :) But that's not mandatory to fix the issues
Sadly, I don't know how to do that without adding a new setting, something like 'local_charset', or 'files_charset'... which would be used y open_file, the 'charset' be left as it is now, and used only for Dancer responses.
I would strongly suggest limiting the support to UTF-8 only in as many places as possible. I don't see any need to support templates in any other encoding in 2010, for example. Dancer doesn't have the burden of huge legacy systems and it should use this as its advantage. I would also suggest defaulting 'charset' to UTF-8 but that will break things for people not careful enough with their 8bit data.