graphql

defa 2020-01-23T14:01:15.006600Z

Hello dear clojurians, I just started using lacinia-pedestal and was wondering how to configure the character encoding of data returned. I modified the example on https://github.com/walmartlabs/lacinia-pedestal and changed the schema to:

{:queries
 {:hello
  {:type    'String
   :resolve (constantly "Motörhead")}}}
and using curl to query for hello messes up the ö:
HTTP/1.1 200 OK
Date: Thu, 23 Jan 2020 14:01:33 GMT
Content-Type: application/json
Transfer-Encoding: chunked

{"data":{"hello":"Mot?rhead"}}
Same thing with lacinia’s built-in GraphiQL “IDE”. Is this a bug or am I missing something?
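For reference, a request along these lines reproduces it (port 8888 and the /graphql path are the lacinia-pedestal sample defaults; adjust to your setup):

curl -i http://localhost:8888/graphql \
     -H "Content-Type: application/json" \
     -d '{"query": "{ hello }"}'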

defa 2020-01-24T08:35:02.012400Z

@lennart.buit it all depends on the “correct setup” of your JVM. As I said, the file.encoding system property must be set to UTF-8, either by starting the JVM with the option -Dfile.encoding=UTF-8 or by launching the JVM from a UNIX environment with an appropriate locale, like LC_ALL=en_US.UTF-8. At least this is what I have figured out so far. I would not consider this a bug in any of these components (lacinia, pedestal, cheshire, jackson), but you have to be careful to set things up correctly. I will at least put a warning in my code if the file.encoding system property is not set to UTF-8.

👍 2
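A minimal sketch of such a warning, assuming nothing beyond the standard system-property lookup (the function name is made up):

(defn warn-on-non-utf8-file-encoding!
  "Print a warning to stderr when the JVM default file.encoding is not UTF-8."
  []
  (let [enc (System/getProperty "file.encoding")]
    (when-not (= "UTF-8" enc)
      (binding [*out* *err*]
        (println "WARNING: file.encoding is" (pr-str enc)
                 "- JSON responses may be mis-encoded."
                 "Start the JVM with -Dfile.encoding=UTF-8.")))))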
defa 2020-01-24T09:08:12.013Z

@orestis I think you don’t need a custom interceptor to fix this. See my comments above.

orestis 2020-01-24T09:08:45.013600Z

Thanks for the mention - I’ll try that.

defa 2020-01-23T14:49:18.007200Z

When returning JSON via pedestal with a handler like

(require '[clojure.data.json :as json])

(defn respond-hello [request]
  {:status  200
   :headers {"Content-Type" "application/json"}
   :body    (json/write-str {"Hello" "Motörhead"})})
The umlaut ö is encoded correctly:
{"Hello":"Mot\u00f6rhead"}

isak 2020-01-23T16:25:50.007500Z

I remember having a similar problem with pedestal generally, and this helped: https://stackoverflow.com/a/16747545

👍 1
isak 2020-01-23T16:26:15.007800Z

(if using jetty)

defa 2020-01-23T18:37:47.008Z

Yes, I’m using jetty running on macOS and the encoding is not okay.

defa 2020-01-23T18:39:05.008200Z

lacinia-pedestal uses cheshire as its JSON generator, and cheshire seems to have different default behavior than clojure.data.json: the latter escapes non-ASCII characters, while cheshire emits them as-is. In an isolated little test the ö is encoded as two bytes, whereas in the lacinia-pedestal program the ö is encoded in just one byte. So the output there could be encoded in something like “Latin-1” and not “UTF-8”. I will investigate whether this is a jetty configuration issue. Thanks for the hint, @isak
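A sketch of that kind of isolated test: count the bytes the same generated string produces under an explicit charset versus the platform default (the one-byte-ö result assumes the default charset is Latin-1-like, as on the misbehaving setup):

(require '[cheshire.core :as cheshire])

(let [s (cheshire/generate-string {"hello" "ö"})] ; s is {"hello":"ö"}, 13 chars
  [(count (.getBytes s "UTF-8")) ; 14 bytes: ö becomes 0xC3 0xB6
   (count (.getBytes s))])       ; 13 bytes under a Latin-1 default: ö becomes 0xF6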

orestis 2020-01-23T19:12:55.009600Z

I remember exactly the same with pedestal and Lacinia, ended up replacing the lacinia interceptor with one of my own.
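For the record, a hypothetical replacement along those lines: serialize the result yourself and state the charset explicitly in the Content-Type header. The name below is made up, and this is a plain map that pedestal can coerce into an interceptor:

(require '[cheshire.core :as cheshire])

(def utf8-json-response
  {:name ::utf8-json-response
   :leave (fn [context]
            (update context :response
                    (fn [response]
                      (-> response
                          (update :body cheshire/generate-string)
                          (assoc-in [:headers "Content-Type"]
                                    "application/json; charset=UTF-8")))))})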

orestis 2020-01-23T19:13:33.010600Z

I would also like at some point to hook up Transit, since I am consuming this in ClojureScript

defa 2020-01-23T19:20:33.010800Z

Not sure at the moment why, but when using lacinia-pedestal you need to set the Java system property file.encoding to UTF-8, which fixes the issue. Start your JVM with -Dfile.encoding=UTF-8 or use clj -J-Dfile.encoding=UTF-8 …
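One way to bake that into a project instead of typing it every time is an alias with :jvm-opts in deps.edn (the alias name is arbitrary):

;; deps.edn
{:aliases
 {:utf8 {:jvm-opts ["-Dfile.encoding=UTF-8"]}}}

;; then: clj -A:utf8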

defa 2020-01-23T19:23:09.011Z

Again, thanks @isak for the hint. Guess I will dig into the lacinia-pedestal code base to see if they configure jetty or the JVM not to use UTF-8 as file.encoding, or derive some other properties from the file encoding. A simple pedestal-only server works without setting file.encoding explicitly. Strange.

🙂 1
defa 2020-01-23T19:24:07.011200Z

@orestis yes, Transit or EDN would be fine when using Clojure on both ends.

defa 2020-01-23T20:50:58.011600Z

I have isolated the file.encoding issue and it is neither pedestal nor jetty: it is https://github.com/FasterXML/jackson that apparently relies on file.encoding being set to UTF-8. Alternatively, set the locale in the UNIX environment:

export LC_ALL=en_US.UTF-8
However, this is strange: JSON SHALL be encoded as UTF-8/16/32 … so why isn’t that enforced by jackson/cheshire? Or, when the encoding is not a UTF variant, why not use escaping the way clojure.data.json/write-str does?
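One way to sidestep the default-charset dependence when generating JSON yourself is to construct the Writer with an explicit charset before handing it to cheshire (a sketch; this is not what lacinia-pedestal does internally):

(require '[cheshire.core :as cheshire]
         '[clojure.java.io :as io])

(with-open [w (io/writer "out.json" :encoding "UTF-8")]
  (cheshire/generate-stream {"hello" "Motörhead"} w))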

lennart.buit 2020-01-23T22:20:29.012200Z

Hmm, strange: I just returned a unicode emoji from a lacinia resolver and I am finding the correct byte sequence in my curl response.