onetom 2019-09-21T08:57:44.000700Z

Is there anyone actively using Clojure in HK? Or maybe anyone who would like to use it at work?

onetom 2019-09-21T09:11:07.006600Z

I've done some small projects in Clojure again over the past few months, and I would be interested in discussing the latest Clojure news or maybe trying out some libraries together, stuff like that... For example, I was doing some web scraping with clj-soup and pumped the cleaned-up results into an MS SQL server, using seancorfield/next.jdbc. I had never touched an MS SQL server before in my life, so I was pleasantly surprised by how seamless the experience was. It wasn't easy to find next.jdbc though; I practically had to discover the whole evolution of JDBC and its various Clojure interfaces/wrappers... Then, just the other week I saw something about

onetom 2019-09-21T09:15:13.009400Z

Then I've done some work with JSON and CSV/TSV files, and the state of the libraries there is pretty messy too. I was especially surprised by how big the performance differences are between the various implementations. E.g. with the org.clojure/data.csv and org.clojure/data.json libs it might take several seconds to read files in the 10 MB size range, but with alternatives like clojure-csv + semantic-csv and metosin/jsonista it takes less than a second to decode the same files.
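A minimal sketch of how such a comparison could be timed, using only clojure.core. The names `elapsed-ms` and `compare-decoders` are hypothetical helpers, not part of any of the libraries mentioned, and a single run like this is only a rough indication (no JIT warm-up, no repeated samples).

```clojure
(defn elapsed-ms
  "Calls the zero-argument function f once.
  Returns [result elapsed-milliseconds]."
  [f]
  (let [start  (System/nanoTime)
        result (f)]
    [result (/ (- (System/nanoTime) start) 1e6)]))

(defn compare-decoders
  "decoders is a map of label -> one-argument decode function.
  Returns a map of label -> milliseconds spent decoding input once.
  NOTE: a decoder that returns a lazy seq should be wrapped in doall
  by the caller, otherwise only the setup cost is measured."
  [decoders input]
  (into {}
        (map (fn [[label decode]]
               [label (second (elapsed-ms #(decode input)))]))
        decoders))

;; Hypothetical usage, assuming both libraries are on the classpath and
;; "big.json" is some local test file:
;; (compare-decoders {:data.json clojure.data.json/read-str
;;                    :jsonista  jsonista.core/read-value}
;;                   (slurp "big.json"))
```

For anything more serious than a quick sanity check, a proper benchmarking library would give more trustworthy numbers than one-shot timings.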

onetom 2019-09-21T09:23:35.013Z

Working with TOML files is even worse. Reading a 1.7 MB TOML file with the toml library took 8 seconds. Writing TOML in a specific format also took seconds. I ended up using org.tomlj/tomlj directly and just tailored it to the specifics of my data to achieve reasonable performance...

(ns ...
  (:import (org.tomlj Toml)))

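(defn map-vals
  "Applies f to every value of map m, keeping the keys."
  [f m]
  (reduce-kv #(assoc %1 %2 (f %3)) {} m))

(defn decode
  "NOTE: Only supports 2-level nested maps with string leaf-values"
  [^String toml-str]
  (->> toml-str
       Toml/parse          ; parse result, which is itself a TomlTable
       .toMap (into {})    ; top-level java.util.Map -> Clojure map
       (map-vals #(into {} (.toMap %)))))  ; each nested table -> Clojure map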

onetom 2019-09-21T09:28:16.016300Z

I was also looking for a consistent way to control the serialisation of CSV/TSV, EDN, JSON and TOML. I was expecting to find some generic approach that lets me control line wrapping and choose between string representations and map key conversions, but every library has its own approach... 😕 I know there is but it's not the most straightforward thing to use
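To illustrate the kind of generic approach meant here, a rough sketch of a single entry point with shared options, using only clojure.core and clojure.string. Everything in it is hypothetical: the `encode` multimethod, the `:key-fn` and `:columns` options, and the two toy back-ends are assumptions for illustration, not an existing library. The TSV back-end does no escaping of tabs or newlines inside values.

```clojure
(require '[clojure.string :as str])

(defmulti encode
  "Hypothetical facade: serialises a sequence of maps in the given format.
  All formats share the same options map."
  (fn [fmt _data _opts] fmt))

(defn convert-keys
  "Applies key-fn to every key of map m."
  [key-fn m]
  (into {} (map (fn [[k v]] [(key-fn k) v])) m))

;; EDN back-end: only the shared :key-fn option applies.
(defmethod encode :edn
  [_ data {:keys [key-fn] :or {key-fn identity}}]
  (pr-str (mapv #(convert-keys key-fn %) data)))

;; TSV back-end: :columns fixes the column order, :key-fn renders the header.
(defmethod encode :tsv
  [_ data {:keys [key-fn columns] :or {key-fn identity}}]
  (let [header (map key-fn columns)
        rows   (map (fn [row] (map #(str (get row %)) columns)) data)]
    (str/join "\n" (map #(str/join "\t" %) (cons header rows)))))

;; Example calls:
;; (encode :edn [{:a 1}] {:key-fn name})
;; (encode :tsv [{:a 1 :b 2}] {:columns [:a :b] :key-fn name})
```

JSON or TOML back-ends could be added as further `defmethod`s that translate the same options into whatever each underlying library expects.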