malli

https://github.com/metosin/malli :malli:
ikitommi 2020-09-03T05:23:30.038Z

@jeroenvandijk wanted to test the lazy registries.

(require '[malli.core :as m])
(require '[malli.registry :as mr])
Given a data-source that can map names to schemas:
(def schema-provider
  {"int" :int
   "map" [:map [:x "int"]]
   "maps" [:vector "map"]})
We can compose a registry that uses both local and lazy/external resolving:
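(defn LazyRegistry [default-registry]
  (let [cache* (atom {})       ; name -> schema instance, filled on demand
        registry* (atom nil)]  ; holds the composite registry so the lazy part can refer to it
    (reset!
      registry*
      (mr/composite-registry
        default-registry
        (reify
          mr/Registry
          (-schema [_ name]
            ;; return the cached schema, or load it from the provider on first use
            (or (@cache* name)
                (do (println "loading" (pr-str name))
                    (when-let [schema (schema-provider name)]
                      (swap! cache* assoc name (m/schema schema {:registry @registry*}))
                      schema))))
          ;; only the schemas loaded so far are listed
          (-schemas [_] @cache*))))))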

(def registry (LazyRegistry m/default-registry))
Using the registry (either swap the m/default-registry or pass it as an argument):
(count (mr/-schemas registry))
; => 125

(m/validate "map" {:x 1} {:registry registry})
;loading "map"
;loading "int"
; => true

(m/validate "map" {:x 1} {:registry registry}) ;; cached
; => true

(count (mr/-schemas registry))
; => 127

(m/validate "maps" [{:x 1}] {:registry registry})
;loading "maps"
; => true

(count (mr/-schemas registry))
; => 128
Schemas are first-class :refs:
(m/schema "map" {:registry registry})
; => "map"

(m/-deref (m/schema "map" {:registry registry}))
; => [:map [:x "int"]]
Hope this helps.

2020-09-03T07:56:28.040800Z

@ikitommi Thanks for sharing. I think it’s almost what I need. I’m puzzling over how to deal with the (lazy) dispatch on a map key. In clojure.spec I would use multimethods and multi-spec:

(require '[clojure.spec.alpha :as s])

(defmulti resource-type :Type)

(s/def :aws.cfn/resource (s/multi-spec resource-type :Type))

;; Some random examples
(defmethod resource-type "AWS::AmazonMQ::Broker" [_] :aws.amazon-mq/broker)
(defmethod resource-type "AWS::AmazonMQ::Configuration" [_] :aws.amazon-mq/configuration)
(defmethod resource-type "AWS::ApiGateway::Account" [_] :aws.api-gateway/account)
(defmethod resource-type "AWS::ApiGateway::ApiKey" [_] :aws.api-gateway/api-key)
...
If I can do this dispatch somehow, then with your suggestion I think I have all I need.

2020-09-03T08:02:49.041400Z

I’ll study the :multi schema and see if that is the missing piece

ikitommi 2020-09-03T08:08:46.042800Z

s/multi-spec is open & mutable, :multi is closed & immutable.
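To make the "closed & immutable" side concrete, here is a minimal sketch of a regular :multi schema. The Shape name and its branches are made up for illustration; the point is only that every branch is written into the schema up front, unlike defmethod on a multi-spec.

(require '[malli.core :as m])

;; All branches are listed when the schema is written;
;; there is no way to register another one afterwards.
(def Shape
  [:multi {:dispatch :type}
   [:sized [:map [:type keyword?] [:size int?]]]
   [:human [:map [:type keyword?] [:name string?]]]])

(m/validate Shape {:type :sized, :size 10})
; => true

(m/validate Shape {:type :robot, :id 1})
; => false (:robot is not one of the declared branches)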

ikitommi 2020-09-03T08:09:40.043500Z

so here, I think a lazy multi variant would be needed.

ikitommi 2020-09-03T08:12:15.045700Z

a) lazy multi, with immutable values

[:multi {:dispatch :type, :children children-fn}]
b) mutable multi, backed by a custom (mutable) multimethod:
[:multi {:dispatch :type, :children my-multimethod}]

ikitommi 2020-09-03T08:12:58.046500Z

… actually would be the same code, it’s in user-space whether to allow overriding the keys.
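A rough illustration of why both variants could share one code path: in Clojure a map (or plain function) and a multimethod can both be called with a dispatch value. The :children option is only a proposal in this thread, and children-map, children-multi and resolve-child are hypothetical names used for the sketch.

;; a) immutable lookup: a plain map used as a function
(def children-map
  {"map1" [:map [:x :int]]
   "map2" [:map [:y :int]]})

;; b) mutable lookup: a multimethod, which can gain branches later
(defmulti children-multi identity)
(defmethod children-multi "map1" [_] [:map [:x :int]])
(defmethod children-multi :default [_] nil)

;; Hypothetical helper: the consuming code only ever calls (children k).
(defn resolve-child [children k]
  (children k))

(resolve-child children-map "map1")   ; => [:map [:x :int]]
(resolve-child children-multi "map1") ; => [:map [:x :int]]
(resolve-child children-multi "map2") ; => nil

;; Whether overriding is allowed is up to the user: with a multimethod
;; a new branch can simply be added later.
(defmethod children-multi "map2" [_] [:map [:y :int]])
(resolve-child children-multi "map2") ; => [:map [:y :int]]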

ikitommi 2020-09-03T08:13:16.047Z

should not be many LOC to implement

2020-09-03T08:29:37.048700Z

Thanks. Makes sense. I’ll try to adapt https://github.com/metosin/malli/blob/master/src/malli/core.cljc#L796

ikitommi 2020-09-03T08:40:55.051Z

If you make a PR, I would like the default case (i.e. no :children key set) not to slow down: there the entry parsing would still happen at schema creation time. For the case of dynamic children, it would happen at runtime.

ikitommi 2020-09-03T08:41:49.052100Z

one question is: what happens if you create a validator, explainer or generator out of that schema? Should the current children be used, or should those be dynamic too?

ikitommi 2020-09-03T08:42:28.052800Z

e.g. if you add a branch after creating a validator, will the validators created before that see it or not?
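The two possible answers can be sketched in plain Clojure, without malli. This is not how malli builds validators; branches*, snapshot-validator and dynamic-validator are hypothetical names that only show the difference between reading the branches once and reading them on every call.

;; Mutable table of branches: dispatch value -> predicate.
(def branches* (atom {"map1" #(int? (:x %))}))

;; Snapshot: branches are read once, when the validator is created.
(defn snapshot-validator [branches*]
  (let [branches @branches*]
    (fn [x]
      (if-let [valid? (branches (:type x))]
        (boolean (valid? x))
        false))))

;; Dynamic: branches are looked up on every call.
(defn dynamic-validator [branches*]
  (fn [x]
    (if-let [valid? (@branches* (:type x))]
      (boolean (valid? x))
      false)))

(def snapshot (snapshot-validator branches*))
(def dynamic (dynamic-validator branches*))

;; Add a branch after both validators exist.
(swap! branches* assoc "map2" #(int? (:y %)))

(snapshot {:type "map2", :y 1}) ; => false, the new branch is not seen
(dynamic {:type "map2", :y 1})  ; => true, the new branch is seen

The snapshot version does less work per call, while the dynamic version pays for a lookup in the atom each time.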

2020-09-03T08:45:38.055400Z

With clojure.spec I have one spec that contains all types. This gives you a suggestion in case the dispatch on type fails. E.g.

(s/def :cfn.all/Type #{"AWS::AmazonMQ::Broker" "AWS::AmazonMQ::Configuration" "AWS::ApiGateway::Account" "AWS::ApiGateway::ApiKey" "AWS::ApiGateway::Authorizer" "AWS::ApiGateway::BasePathMapping" "AWS::ApiGateway::ClientCertificate" .....})
This is not ideal either because it doesn’t have spell-check functionality. But to answer your question, I don’t think everything has to be dynamic, at least for my use case.

2020-09-03T14:39:40.056300Z

@ikitommi The start of this seems to be simple indeed: https://gist.github.com/jeroenvandijk/59d22a726cda2158c01b9d63790aec50#file-malli_lazy-clj-L80 I’ve only added the validator part; I’m not sure whether the transformers and explainers will make things more painful.

ikitommi 2020-09-03T15:57:13.057300Z

@jeroenvandijk just to make sure: you do know all the possible dispatch keys in advance?

ikitommi 2020-09-03T15:57:33.058Z

(if so, there might be a simpler solution)

2020-09-03T16:17:16.060300Z

Yeah, all the dispatch types are known in this case. The raw schema data is close to 1 MB, so that’s the main reason to do it lazily.
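One way to avoid paying for all of the raw data up front is to keep each schema as unparsed text and only read the entries that are actually requested. A minimal sketch, assuming the raw data is available as EDN strings keyed by name; raw-schemas and this schema-provider are hypothetical stand-ins, not the real data source.

(require '[clojure.edn :as edn])

;; Unparsed schema text per name; nothing is read until it is asked for.
(def raw-schemas
  {"map1" "[:map [:type [:= \"map1\"]] [:x :int]]"
   "map2" "[:map [:type [:= \"map2\"]] [:y :int]]"})

;; Parses a single entry on request; returns nil for unknown names.
(defn schema-provider [name]
  (some-> (raw-schemas name) edn/read-string))

(schema-provider "map1") ; => [:map [:type [:= "map1"]] [:x :int]]
(schema-provider "nope") ; => nil

A function like this has the shape that the name-to-schema lookup inside LazyRegistry expects.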

ikitommi 2020-09-03T19:12:49.062200Z

@jeroenvandijk This would be a small change in :ref impl:

(defn LazyRegistry [default-registry f]
  (let [cache* (atom {})
        registry* (atom nil)]
    (reset!
      registry*
      (mr/composite-registry
        default-registry
        (reify
          mr/Registry
          (-schema [_ name]
            (or (@cache* name)
                (do (println "loading" (pr-str name))
                    (when-let [schema (f name)]
                      (swap! cache* assoc name (m/schema schema {:registry @registry*}))
                      schema))))
          (-schemas [_] @cache*))))))

(def registry
  (LazyRegistry
    m/default-registry
    {"map1" [:map [:type [:= "map1"]] [:x :int]]
     "map2" [:map [:type [:= "map2"]] [:y :int]]
     "map3" [:map [:type [:= "map3"]] [:z :int]]}))

(m/validate
  [:multi {:dispatch :type}
   ["map1" [:ref "map1"]]
   ["map2" [:ref "map2"]]
   ["map3" [:ref "map3"]]]
  {:type "map3", :z 1}
  {:registry registry
   ::m/lazy-refs true})
;loading "map3"
;=> true

ikitommi 2020-09-03T19:14:40.063800Z

a new option :malli.core/lazy-refs would control whether the :refs are checked eagerly or lazily

ikitommi 2020-09-03T19:15:10.064300Z

or there could be a :lazy variant of :ref to make things explicit.

ikitommi 2020-09-03T19:16:23.065200Z

or a new property :lazy to :ref to mark it being lazy:

[:ref "map1"]

[:ref {:lazy true} "map1"]

ikitommi 2020-09-03T19:16:50.065700Z

I think that’s actually good.

ikitommi 2020-09-03T19:23:36.065900Z

https://github.com/metosin/malli/pull/252

ikitommi 2020-09-03T20:05:51.072Z

actually, we can push all the changes from the user API (e.g. schema props) into the extender API (here: the lazy registry impl). This allows writing fully lazy multis:

[:multi {:dispatch :type}
 "AWS::AmazonMQ::Broker"
 "AWS::AmazonMQ::Configuration"
 "AWS::ApiGateway::Account"
 "AWS::ApiGateway::ApiKey"
 "AWS::ApiGateway::Authorizer"]

ikitommi 2020-09-03T20:07:53.074600Z

(`:multi` uses the entry syntax, like :map, which allows single-value elements if they are valid schema reference types. Right now those are just qualified keywords; strings should be allowed too.)