I want to construct one or more relatively expensive NLP pipelines and use them in all of my tests, located in different test namespaces. How do I best accomplish this?
what does an NLP pipeline look like? e.g. invoking a discrete expensive function, or connecting to a database, or ...
In this case it’s a processing function backed by a multi-gigabyte in-memory model.
… and the model takes a little while to load. The function itself is relatively quick to run.
what's the problem in sharing it across namespaces like any other helper defn? concurrency perhaps?
The main problem is the cost of instantiating. The way I tend to write unit tests, I have a lot of namespaces. I was just wondering if there was a typical pattern for reusing a fixture in tests like this.
The model could be instantiated in a single ns:
(def the-model (delay (initialize-model)))
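arbitrary defns could use this var by dereferencing it (`@the-model`). Only the first consumer initializes it; later derefs return the cached value. Delay's deref is synchronized, so there is no race even if several threads hit it at once (https://github.com/clojure/clojure/blob/0df3d8e2e27fb06fa53398754cac2be4878b12d1/src/jvm/clojure/lang/Delay.java#L35)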
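Here's a minimal sketch of how that could look, assuming a shared helper ns that every test ns requires. The ns names (`myapp.test-support`, `myapp.tokenizer-test`), the `process` fn, `load-count`, and the body of `initialize-model` are all made up for illustration; the real `initialize-model` would load the actual model.

```clojure
;; Hypothetical shared helper ns (would normally live in its own file,
;; e.g. test/myapp/test_support.clj).
(ns myapp.test-support
  (:require [clojure.string :as str]))

;; Counts how often the "model" is loaded, only so the sketch can show
;; that initialization happens once.
(def load-count (atom 0))

(defn initialize-model
  "Stand-in for the expensive multi-gigabyte load (assumption)."
  []
  (swap! load-count inc)
  {:name "stand-in-model"})

;; Nothing is loaded at require time; the body runs on the first deref.
(def the-model (delay (initialize-model)))

(defn process
  "Illustrative processing fn. Derefs the delay, so the first caller pays
  the load cost and every later caller reuses the cached model."
  [text]
  (let [model @the-model]
    {:model  (:name model)
     :tokens (str/split text #"\s+")}))

;; One of many test namespaces; each just requires the helper ns.
(ns myapp.tokenizer-test
  (:require [clojure.test :refer [deftest is]]
            [myapp.test-support :as support]))

(deftest process-splits-on-whitespace
  (is (= ["hello" "world"]
         (:tokens (support/process "hello world")))))

(deftest model-is-loaded-once
  (support/process "first call")
  (support/process "second call")
  (is (= 1 @support/load-count)))
```

Since all test namespaces run in the same JVM, they share the one `the-model` var, so the load cost is paid once per test run rather than once per ns.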
Does that sound like a good start?
yeah, that makes sense.
thanks!