malli

https://github.com/metosin/malli :malli:
2020-11-10T07:18:54.492Z

@ikitommi for what it's worth, i still got a huge performance increase by actually caching the validators as well.

(crit/with-progress-reporting
  (crit/quick-bench (m/validate schema value)))
;; => Execution time mean : 297.880813 ms

(def schema' (m/schema schema))
(crit/with-progress-reporting
  (crit/quick-bench (m/validate schema' value)))
;; => Execution time mean : 533.885193 µs

(def validator (m/validator schema))
(crit/with-progress-reporting
  (crit/quick-bench (validator value)))
;; => Execution time mean : 1.830348 µs
so it looks like roughly a 560x improvement from caching the schema, and then another ~290x improvement from caching the validator

2020-11-10T07:21:25.493100Z

i suspect in your specific benchmark, the schema is fairly simple so then a larger share of the benchmark is actually about performing the validation

ikitommi 2020-11-10T09:18:57.494Z

@lmergen there was a cljs-issue, just merged the cached satisfies. could you retry with the latest master?

2020-11-10T09:19:12.494500Z

sure!

2020-11-10T09:19:29.495100Z

1 minute

ikitommi 2020-11-10T09:20:40.496400Z

there is still a lot of room for improvement for maps (`-parse-entries` is really slow) and for handling property-based registries. I would guess we can make schema creation 2-5 times faster. But then again, once malli is used to validate schema properties & children, it will slow things down again.

2020-11-10T09:29:35.498200Z

(crit/with-progress-reporting
  (crit/quick-bench (m/validate schema value)))
;; before: => Execution time mean : 297.880813 ms
;; after:  => Execution time mean : 12.194964 ms

(def schema' (m/schema schema))
(crit/with-progress-reporting
  (crit/quick-bench (m/validate schema' value)))
;; before: => Execution time mean : 533.885193 µs
;; after:  => Execution time mean : 517.890217 µs


(def validator (m/validator schema))
(crit/with-progress-reporting
  (crit/quick-bench (validator value)))
;; before: => Execution time mean : 1.830348 µs
;; after:  => Execution time mean : 1.952607 µs
so while m/validate got ~24x faster (297.9 ms -> 12.2 ms), caching the actual validator (~2 µs per call) is still much, much faster
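
A minimal sketch of one generic way to get the cached-validator speed without holding on to a `def` per schema, assuming it is acceptable to key the cache on the schema form itself. `cached-validator` and `validate-cached` are hypothetical names for illustration, not part of malli; only `m/validator` is the library's own function. Note that `memoize` keeps every schema it has seen, so the cache is unbounded.

```clojure
(require '[malli.core :as m])

;; Hypothetical helper (not a malli API): build the validator once per
;; distinct schema form and reuse it on later calls.
(def cached-validator
  (memoize (fn [schema] (m/validator schema))))

(defn validate-cached
  "Like m/validate, but reuses a previously built validator for `schema`."
  [schema value]
  ((cached-validator schema) value))

(validate-cached [:map [:x int?]] {:x 1})   ; first call builds the validator
(validate-cached [:map [:x int?]] {:x "1"}) ; later calls reuse it
```

The lookup still hashes the schema form on every call, so this will probably land somewhere between the `m/validate` and the pre-built `validator` numbers above rather than matching the latter.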

2020-11-10T09:40:09.001200Z

i'm caching the explainers in my own defn macro, but it requires quite a bit of macro magic to make this work, so i was looking for a more generic way to make this happen -- possibly some kind of registry

ikitommi 2020-11-10T09:43:50.002200Z

what should be in the registry? validator + explainer + generator + decoder(s) + encoder(s) + …?

2020-11-10T09:44:18.003300Z

if possible, i'd say all of them yes

ikitommi 2020-11-10T09:44:23.003500Z

in reitit, there is a Coercion protocol to cache things relevant there: https://github.com/metosin/reitit/blob/master/modules/reitit-malli/src/reitit/coercion/malli.cljc

2020-11-10T09:45:20.004600Z

right, so then you lazily cache things

2020-11-10T09:45:31.004900Z

which would be the best middle-ground

ikitommi 2020-11-10T09:47:00.006300Z

one option would be to add a wrapper Schema impl that is returned from the registry instead of the real one. That impl would have a cache -> the first call to -validate would store the validator.
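
The lazy-caching idea can be sketched without touching the Schema protocol at all, assuming a plain map of `delay`s is good enough for illustration. `lazy-cache`, `validate*` and `explain*` are hypothetical names, not malli functions. This is not the wrapper Schema impl described above, so `m/validate` cannot be called on it directly; it only shows the "create on first use, then reuse" behaviour.

```clojure
(require '[malli.core :as m])

;; Hypothetical helper (not a malli API): wrap a schema together with
;; lazily created, cached derived functions.
(defn lazy-cache
  [schema]
  (let [s (m/schema schema)]
    {:schema    s
     ;; each delay runs at most once, on first deref
     :validator (delay (m/validator s))
     :explainer (delay (m/explainer s))}))

(defn validate*
  "Validates `value`, building the validator on the first call only."
  [cache value]
  (@(:validator cache) value))

(defn explain*
  "Explains `value`, building the explainer on the first call only."
  [cache value]
  (@(:explainer cache) value))

(def person (lazy-cache [:map [:name string?]]))

(realized? (:validator person))    ; nothing has been built yet
(validate* person {:name "kikka"}) ; builds and caches the validator
(realized? (:validator person))    ; now cached
```

Generators, decoders and encoders could be added as further `delay` entries in the same way; they are left out here to keep the sketch short.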

2020-11-10T09:47:30.007100Z

oh i see

ikitommi 2020-11-10T09:47:30.007200Z

could be just an option to the registry to return caching proxies instead of normal ones…

2020-11-10T09:47:41.007400Z

yes

2020-11-10T09:47:45.007600Z

this would be very effective

2020-11-10T09:48:20.008400Z

i'll experiment with this approach

ikitommi 2020-11-10T09:50:15.009500Z

… actually, just a new option key that m/schema uses would do fine (to wrap the returned thing if the option is present)

2020-11-10T10:02:14.011900Z

so then you cache it inside the actual schema, rather than a wrapper around it ?

ikitommi 2020-11-10T10:06:09.014500Z

I would wrap it outside, i.e. the return value would be wrapped

2020-11-10T10:06:47.015400Z

ah, right -- and the option to m/schema would then tell it whether to return the wrapped schema or the "regular" schema

ikitommi 2020-11-10T10:09:01.018900Z

yes. Or there could be a memoized-schema etc. as a separate fn? (-> :string m/schema m/memoized)

2020-11-10T10:09:12.019100Z

yes

2020-11-10T10:09:18.019500Z

well that's a detail

2020-11-10T10:09:38.020Z

let me experiment with creating that memoized / cached schema in the first place

2020-11-10T16:52:29.020900Z

(crit/with-progress-reporting
  (crit/quick-bench (m/validate schema value)))
;; => Execution time mean : 11.782073 ms

(def schema' (memoized-schema (m/schema schema)))
(crit/with-progress-reporting
  (crit/quick-bench (m/validate schema' value)))
;; => Execution time mean : 2.095245 µs
@ikitommi conceptually it seems to be working like a charm

2020-11-10T16:55:43.021900Z

exactly which schema am i supposed to wrap here -- it's just the regular malli.core/Schema, right ? the into-schema is meant more for building a hierarchy of parent/child schemas ?

ikitommi 2020-11-10T16:57:08.023200Z

yes, Schema. IntoSchema is the factory protocol for creating a Schema out of the Schema AST; each Schema is responsible for its own props & children.

2020-11-10T16:57:15.023500Z

right

2020-11-10T16:57:17.023700Z

awesome

2020-11-10T16:57:35.024300Z

i'll send a PR once i have all the functions working