Hello dear clojurians,
I just started using lacinia-pedestal and was wondering how to configure the character encoding of returned data. I modified the example on https://github.com/walmartlabs/lacinia-pedestal and changed the schema to:
{:queries
 {:hello
  {:type 'String
   :resolve (constantly "Motörhead")}}}
and using curl to query for hello messes up the ö:
HTTP/1.1 200 OK
Date: Thu, 23 Jan 2020 14:01:33 GMT
Content-Type: application/json
Transfer-Encoding: chunked
{"data":{"hello":"Mot?rhead"}}
Same thing with lacinia’s built-in GraphiQL “IDE”.
Is this a bug or am I missing something?

@lennart.buit it all depends on the “correct setup” of your JVM. As I said, the file.encoding system property must be set to UTF-8, either by starting the JVM with the option -Dfile.encoding=UTF-8 or by having the UNIX environment where the JVM is launched set up with an appropriate locale, like LC_ALL=en_US.UTF-8. At least this is what I have figured out so far.
I would not consider this a bug in any of these components (lacinia, pedestal, cheshire, jackson), but you have to be careful to set things up correctly. I will at least put a warning in my code if the file.encoding system property is not set to UTF-8.
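Such a startup warning could be sketched like this (a hypothetical helper, not part of lacinia or pedestal; shown in Java since the default charset is a JVM-level concern — Charset.defaultCharset() reflects the effective file.encoding at JVM startup):

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingCheck {

    // True when the JVM's default charset is UTF-8.
    static boolean defaultIsUtf8() {
        return StandardCharsets.UTF_8.equals(Charset.defaultCharset());
    }

    public static void main(String[] args) {
        if (!defaultIsUtf8()) {
            System.err.println("WARNING: default charset is " + Charset.defaultCharset()
                + "; restart the JVM with -Dfile.encoding=UTF-8");
        }
        System.out.println(Charset.defaultCharset().name());
    }
}
```

Note that file.encoding is only read once at JVM startup, so setting the property at runtime has no effect; it has to go on the launch command line or come from the locale environment.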
@orestis I think you don’t need a custom interceptor to fix this. See my comments above.
Thanks for the mention - I’ll try that.
When returning JSON via pedestal with a handler like

(defn respond-hello [request]
  {:status 200
   :headers {"Content-Type" "application/json"}
   :body (clojure.data.json/write-str {"Hello" "Motörhead"})})
the umlaut ö is encoded correctly:
{"Hello":"Mot\u00f6rhead"}
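The \u00f6 output comes from clojure.data.json escaping non-ASCII characters, which keeps the payload pure ASCII and therefore immune to transport-encoding mixups. A minimal sketch of that escaping strategy (a hypothetical helper, shown in Java; clojure.data.json's actual implementation differs):

```java
public class JsonEscape {

    // Escapes every character above 0x7F as \uXXXX, so the resulting
    // string survives any ASCII-compatible transport encoding unchanged.
    static String escapeNonAscii(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c > 0x7F) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeNonAscii("Motörhead")); // Mot\u00f6rhead
    }
}
```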
I remember having a similar problem with pedestal generally, and this helped (if using jetty): https://stackoverflow.com/a/16747545
Yes, I’m using jetty running on macOS and the encoding is not okay.
lacinia-pedestal uses cheshire as its JSON generator, and cheshire seems to have different defaults than clojure.data.json. The latter does escaping; cheshire emits UTF-8. In an isolated little test the ö is encoded as two bytes, whereas in the lacinia-pedestal program the ö is encoded as just one byte. So it could be that the output there is encoded in something like “Latin-1” and not “UTF-8”. I will investigate whether this is a jetty configuration issue. Thanks for the hint, @isak…
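The one-byte-vs-two-byte observation can be reproduced directly on the JVM: ö (U+00F6) is a single byte (0xF6) in Latin-1 but two bytes (0xC3 0xB6) in UTF-8, which is why counting bytes in the response body reveals the effective encoding:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class UmlautBytes {

    // Number of bytes a string occupies in the given charset.
    static int byteLength(String s, Charset cs) {
        return s.getBytes(cs).length;
    }

    public static void main(String[] args) {
        System.out.println(byteLength("ö", StandardCharsets.UTF_8));      // 2 (0xC3 0xB6)
        System.out.println(byteLength("ö", StandardCharsets.ISO_8859_1)); // 1 (0xF6)
    }
}
```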
I remember exactly the same with pedestal and Lacinia; I ended up replacing the lacinia interceptor with one of my own.
I would also like at some point to hook up Transit, since I am consuming this in ClojureScript.
Not sure at the moment why, but when using lacinia-pedestal you need to set the Java property file.encoding to UTF-8, which fixes the issue. Start your JVM with -Dfile.encoding=UTF-8 or use clj -J-Dfile.encoding=UTF-8 …
Again, thanks @isak for the hint. I guess I will dig into the lacinia-pedestal code base to see if they configure jetty or the JVM to not use UTF-8 as file.encoding, or derive some other properties from the file encoding. A simple pedestal-only server works without setting file.encoding explicitly. Strange.
@orestis yes, Transit or EDN would be fine when using Clojure on both ends.
I have isolated the file.encoding issue and it is neither pedestal nor jetty. It is https://github.com/FasterXML/jackson that obviously relies on file.encoding being set to UTF-8. Or use the UNIX environment:
export LC_ALL=en_US.UTF-8
However this is strange, since JSON SHALL be encoded as UTF-8/16/32 … why isn’t that enforced by jackson/cheshire? Or, when the encoding is not UTF-?, use escaping as clojure.data.json/write-str does?

Hmm, strange, I just returned a unicode emoji from a lacinia resolver and I am finding the correct byte sequence in my curl response.
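An emoji coming through intact is actually a strong signal: a character outside the Basic Multilingual Plane (e.g. 😀, U+1F600, used here purely as an illustration) occupies a surrogate pair in a Java string and four bytes in UTF-8, so any single-byte-encoding mixup would mangle it immediately. A quick check:

```java
import java.nio.charset.StandardCharsets;

public class EmojiBytes {
    public static void main(String[] args) {
        String emoji = "😀"; // U+1F600, outside the BMP
        System.out.println(emoji.length());                                // 2 UTF-16 code units (surrogate pair)
        System.out.println(emoji.getBytes(StandardCharsets.UTF_8).length); // 4 bytes in UTF-8
    }
}
```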