I'm working on a biginteger generator for test.check, presumably to be used by spec as well (directly or indirectly); the main decision to make is what the distribution should be. Here's what I've got so far, interested in any comments. https://gist.github.com/gfredericks/b6b59f1c531dc36017e45f2f0beeff9e
let’s say i want to test some stateful thing by generating a sequence of actions to perform
how do i use state to inform future generated actions?
simple example: let’s say i want to test a growable array class with three methods: getLength, append, and getNth
that was easy to test, but as soon as i wanted to test setNth, i was at a loss
how do i make it so that setNth can check getLength first?
my current hack is to just do modulo math at the time of applying the action - but that only works for this simple example & i have more complex stuff i want to test
my best guess is to try defining generators recursively, deferred by an fmap or bind or something - but wasn’t sure if that was a hack or recommended or what
ie parameterize them based on a model of the state
the general approach is to first build a (random) model, then produce the action list via fmap/bind
not 100% sure i follow. is there a super simple example somewhere?
There's a whole lib for this kind of thing I think
i’m looking at two such libraries now, trying to see if they clarify things for me: stateful-check and states
I haven't used either, but I've reviewed stateful-check a bit and it seems solid
there’s a non-trivial amount of code in this lib. i’ll study it, but i’m hoping to identify the essence of it.
@bbloom do you understand the idea of modeling the whole interaction ahead of time?
it’s been a while since I’ve looked at collection-check but I know it does a lot of this operation kind of thing (can’t remember if it’s stateful though)
so i haven’t used test-check in anger at all, but i did successfully roll my own generative/simulation testing thing in go for a wire protocol. but that was a hierarchical temporal Markov model, i didn’t do shrinking, and i did all the validation later on the log file - worked mostly as a stress test for a long running system
to get multiple tests, i didn’t have a backtracking generator type thing - instead just ran a bunch in parallel, since they were also testing limited hardware resources (ie physical devices)
I think the shrinking model is the biggest reason that the generate-everything-up-front approach works best
so i have a bit of a grasp on the concepts, but no knowledge of the specific apis
sooo now back to your question: i’m not sure what you mean by “modeling the whole interaction ahead of time”
the only viable alternative I've seen is the python/hypothesis approach which is extremely imperative and wildly different
@alexmiller: i’m looking at collection-check, as what i’m doing is quite similar. thanks
@bbloom I mean the generator generates the entire intended interaction, e.g.
- insert a
- insert b
- read 0, expect a
- read 1, expect b
- etc.
I believe this is essentially the idea behind stateful-check
it does all the wiring-up for you
yeah, so i’m trying to test some java code that is stateful & i’m successfully generating a sequence of actions & applying them
the tricky bit is the bit that stateful-check seems to address: using the state of the model in order to inform the generators
"Why can't I call generators in the middle of my test?" is a common thing people run into; currently test.check doesn't try to give you a way to do that, but I'm not 100% convinced it can't be done
I've thought about it a lot; the shrinking is the hard part
the fact that it’s asked for a lot leads me to wonder: is it a misunderstanding that leads to it being wanted? or is it just hard to provide?
ie should i be thinking about it differently?
We've done a number of sim testing projects at Cognitect that pushed this kind of thing really far
But there you don't really care about shrinking at all
glad to hear that - since i had no use for shrinking in my sim test and wondered if i was just missing out
I'd ask @luke about some of those ideas - he has some libs that I think are oss that do statistical / stateful generative stuff
this is the first time i wanted to test something that felt complex enough to justify test check and not complex enough to justify distributed sim testing
it should be possible to write a thingamajigger to call generators in the test deterministically, without doing any shrinking
In general with sim testing you want some simple (ish) model how users use the system such that you can generate realistic (ish) streams of random activity
yeah - in my case, i simulated a police officer on patrol. turning cameras on and off, watching videos, driving to and from places with or without wifi, etc
worked nicely
Yeah, exactly
Our domains were a lot more complicated :)
i’d imagine
Cognitect does arch consult gigs, just saying :)
🙂
...arch?
architecture
like...CPU chips? big buildings?
large-scale computer systems?
That
Really any scale :)
Architecture review
i don’t work on that any more, but the team seems quite happy with the architecture i built for them - which is largely clojure/hickey-inspired, thank you very much 😉
@gfredericks i don’t quite understand the api well enough yet to know if that code is useful to me
@bbloom I added an example
the idea is you generate one of these f's and then you can call it from the body of your property as many times as you want with arbitrary generators
it's a stateful function, whose state is randomness derived from the normal test.check source of randomness, so the whole thing should be deterministic as long as you don't call the function in nondeterministic ways
i’m not sure i need something that sophisticated
so e.g., if you have your stateful java collection and want to generate a valid index, you'd call (f (gen/choose 0 (dec (.getLength thingamajig))))
and that call would return an index between 0 and (dec (.getLength thingamajig))
the sophistication is mostly to satisfy my own ideals of reproducibility
if you couldn't care less about that you can just call gen/generate and not bother with any of this
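A minimal sketch of the "just call gen/generate" option mentioned above. The helper name `random-valid-index` is hypothetical, and a `java.util.List` stands in for the stateful collection with `.getLength`. Calling `gen/generate` without a seed draws fresh randomness each time, so a failure found this way is not reproducible and nothing is shrunk.

```clojure
(require '[clojure.test.check.generators :as gen])

(defn random-valid-index
  "Hypothetical helper: picks a valid index into a non-empty java.util.List
  by running a generator on the spot, outside test.check's control."
  [^java.util.List xs]
  (gen/generate (gen/choose 0 (dec (.size xs)))))

;; Usage inside a test body, against the live object:
(let [xs (java.util.ArrayList. [10 20 30])]
  (.get xs (int (random-valid-index xs))))
```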
so i don’t actually need to call .getLength on the real stateful object during test execution
i just want to track some state in the model & use that to inform which actions to generate
looking at collection-check, they seem to use clojure’s types as a model
but zach does basically what i mentioned earlier: he generates wild indexes for nth/assoc and then fixes them up during execution
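A sketch of the "generate wild indexes, fix them up during execution" approach, assuming a `java.util.ArrayList` as the growable array. The action shapes and the `apply-action!` helper are made up for illustration and are not collection-check's code.

```clojure
(require '[clojure.test.check.generators :as gen])

;; Actions carry an unconstrained index; the generator knows nothing
;; about the current length of the collection.
(def gen-action
  (gen/one-of
    [(gen/tuple (gen/return :append) gen/int)
     (gen/tuple (gen/return :set-nth) gen/nat gen/int)
     (gen/tuple (gen/return :get-nth) gen/nat)]))

(defn apply-action!
  "Applies one action to the list, wrapping wild indexes into range with
  mod. Index-based actions on an empty list are skipped. Returns the list."
  [^java.util.ArrayList xs [op a b]]
  (let [n (.size xs)]
    (case op
      :append  (.add xs a)
      :set-nth (when (pos? n) (.set xs (int (mod a n)) b))
      :get-nth (when (pos? n) (.get xs (int (mod a n)))))
    xs))
```

The generator stays simple and shrinks well, at the cost that the recorded action no longer says exactly which index was touched.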
how much am i expected to know about the rose tree stuff in order to make effective use of test-check? this stateful-check thing manipulates it quite a bit, plus has some gen-do monadic bind syntax etc that seems redundant w/ fmap, gen/let, etc
users don't normally need to care about the rose tree
my guess is that stateful-check's low-level internals are an attempt to shrink better than the naive shrinking you get by making a pile of binds
gen/bind is pretty dumb about shrinking
and stateful-check might be able to make assumptions that bind can't
maybe a simple example of generating a model with actions would be helpful
gimme a minute
thanks for your help. greatly appreciated
you too @alexmiller
i’m still reading and trying to make sense of stateful-check, but some things concern me
for example, command argument generation seems to attempt to implicitly and recursively lift values to be generators
which seems like magic i don’t want, at least not before i am comfortable w/ the monadic api
meanwhile, i’m failing to extract the essence from the core of the generate-commands* routine
yeah I've never liked the values-can-be-generators idea
in practice it's probably not too confusing
....maybe
….maybe with a type system? 😛
yeah 🙂
could fancy it up to add the intermediate state at each step, or generate random assertions if you don't want to just compare the whole map
note that the gen/lets are where bind is happening
there might be stack problems with this; maybe that's another reason for stateful-check's complexity
e.g., if you wanted to generate 10000 actions you might have problems
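For reference, a sketch of what a bind-based model-and-actions generator can look like. This is not the example shared in the channel; the names `step`, `gen-action`, `gen-actions` and `gen-session` are hypothetical, and the model is a plain vector standing in for the growable array. The action count is generated up front and threaded through the recursion.

```clojure
(require '[clojure.test.check.generators :as gen])

(defn step
  "Applies an action to the model (a plain vector)."
  [model [op a b]]
  (case op
    :append  (conj model a)
    :set-nth (assoc model a b)))

(defn gen-action
  "Generator for one action that is valid against the current model."
  [model]
  (let [append (gen/tuple (gen/return :append) gen/int)]
    (if (empty? model)
      append
      (gen/one-of
        [append
         (gen/tuple (gen/return :set-nth)
                    (gen/choose 0 (dec (count model)))
                    gen/int)]))))

(defn gen-actions
  "Generator for a vector of n actions; each gen/let binding is a bind,
  so later actions depend on the model produced by earlier ones."
  [model n]
  (if (zero? n)
    (gen/return [])
    (gen/let [action (gen-action model)
              more   (gen-actions (step model action) (dec n))]
      (into [action] more))))

;; Pick the number of actions first, then generate that many.
(def gen-session
  (gen/bind (gen/choose 0 20) #(gen-actions [] %)))
```

The recursion depth grows with the number of actions, which is where the stack concern comes from.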
yeah - so i tried something like this and got a stackoverflow
but thanks, this example is much closer to my intuition
I just generated 20,000 of these up to the normal max-size of 200
and got no exceptions
looks like the sizes don't get above 20 though
so that might be bad
that's probably why it's not SOing 🙂
😛
to get larger you'd use gen/frequency instead of gen/one-of and tweak the weights
alternately you could generate a size up front and pass that through the recursion
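A small sketch of swapping gen/one-of for gen/frequency so that additions outnumber removals. The 4:1 weights and the action shapes are arbitrary choices for illustration.

```clojure
(require '[clojure.test.check.generators :as gen])

;; gen/one-of would pick :assoc and :dissoc equally often; with
;; gen/frequency the model grows, because :assoc is four times as likely.
(def gen-map-action
  (gen/frequency
    [[4 (gen/tuple (gen/return :assoc) gen/keyword gen/int)]
     [1 (gen/tuple (gen/return :dissoc) gen/keyword)]]))
```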
well you had equal assoc/dissoc frequency
yeah
i wasn’t doing any dissoc to reduce the size, so it grew faster
I think stack problems would be related to action-count, not the size of the state
oh, hmm that’s right
not sure what i did, since it’s like 5000 undos ago 🙂
with respect to shrinking: how well does that work as things grow more complex?
i’m somewhat skeptical it can work well for stateful tests - since each removed operation can potentially invalidate the remainder of the operations
that's exactly the problem
if you're using the naive bind structure in my example, what would happen is if test.check tries to shrink an earlier action it will generate a totally new set of following actions
it might be that in practice this isn't as bad as it sounds
I get almost 0 complaints about this
which of course could mean 1) it’s perfect 2) nobody is using it or 3) people don’t understand it well enough to complain about it without embarrassing themselves like i’m doing now 🙂
I made a ticket about it at least, but don't know if it will lead anywhere https://dev.clojure.org/jira/browse/TCHECK-112
yeah exactly
2 isn't plausible; people are definitely using bind, at least via gen/let
in fact gen/let tends to encourage overusing bind
i.e., using bind to combine two independent generators in a way that you could do with gen/tuple
does tuple do “parallel” bind? ie it shrinks from any position fairly?
and bind is “left biased”?
that...might be right
tuple shrinks each thing independently
you can always rewrite tuple with bind but not vice versa
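To make the gen/let versus gen/tuple point concrete, here are two generators that produce the same shape of value from two independent parts. The names are made up for illustration.

```clojure
(require '[clojure.test.check.generators :as gen])

;; Works, but each binding is a bind, so x and y are treated as if y
;; depended on x even though it does not.
(def gen-point-let
  (gen/let [x gen/int
            y gen/int]
    {:x x :y y}))

;; Same values; tuple keeps the two parts independent, so each one can
;; shrink on its own.
(def gen-point-tuple
  (gen/fmap (fn [[x y]] {:x x :y y})
            (gen/tuple gen/int gen/int)))
```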
gotcha - so tuple is applicative
that's probably true
ok - so i guess i’ve confirmed 1) i understand this and 2) i still have no idea how to use the API 😛 thanks
i’ll keep playing with it
will let you know how it turns out
oh well; do let me know if you have more questions or accusations
heh, i hope the accusations thing was a joke and/or not about me. let me know if i failed to politely convey my gratitude!
oh yeah sorry, definitely a joke
so shrinking only removes elements from the reproduction, right? no rewriting in any way?
i’m attempting a “loose” approach to action generation - will leave state out of that part & use state in the code that “runs” the actions. that will screw up shrinking probably, but i’ll deal with that when i have trouble pinpointing an issue in the future
so like if i were testing that random access stack example from before, actions may just be like “do 5 pushes, then randomly swap elements 8 times, then do two pops, etc….”
and see how that does for me
is there a generator that will (with high likelihood?) test boundary cases such as max/min integers? - the large-integer generator doesn’t seem to explicitly select those, so i guess i need to just use one-of or frequency to ensure those cases get covered (which is relevant to me b/c the code under test does some low level binary fiddling sometimes & i want to make sure it doesn’t screw that up)
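One way to get the boundary cases covered, along the lines suggested in the question: mix a handful of hand-picked values into gen/large-integer with gen/frequency. The 1:9 weighting and the choice of extra values are assumptions to tune for the code under test.

```clojure
(require '[clojure.test.check.generators :as gen])

(def gen-long-with-boundaries
  (gen/frequency
    [[1 (gen/elements [Long/MIN_VALUE Long/MAX_VALUE -1 0 1])]
     [9 gen/large-integer]]))
```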
i’ve gotta say - the learning curve is relatively steep with this stuff
but that might just be b/c i’m not comfortable using stuff w/o understanding it - and the monadic stuff makes it a bit opaque
for example, it wasn’t immediately clear what a “property” really was under the hood (just a generator that produces a particular shape of data) or that it was safe to use side effecting asserts in such a property
the readme shows an example that returns a boolean
but if i tested each property individually, my tests would be far too slow
spotted http://blog.colinwilliams.name/blog/2015/01/26/alternative-clojure-dot-test-integration-with-test-dot-check/ - which seems like an improvement
while i’m rambling to myself (or whoever is listening?):
i’m not quite happy with my attempts to generate action sequences yet - i’m getting more variability in parameters and less variability in the sequence of actions - i’m not sure how to influence that properly yet
to use the stack example again, i’m getting a lot of tests that try just using different elements to push on the stack, but really what i want is a lot more tests that have longer sequences of pushes/pops/etc
the internal generation strategy doesn’t appear to be documented anywhere. is it a top-down strategy? or a bottom-up one? or some hybrid? i can’t really tell easily, nor what the implications of that would be
Hi
https://clojurians.slack.com/archives/C0JKW8K62/p1493238965936535 If you're talking about something like the event modeling code I shared earlier, new events CAN be generated during shrinking
how does that work?
So bind generates a value from one generator, then uses the user supplied function to create a new generator and generates a value from that
That much is straightforward?
yeah
There are two ways to shrink from there
You can shrink the thing generated in the second step - that's the easy way
The hard way is to shrink the thing generated in the first step, because then you're obligated to call the user function with the new smaller value, which returns a potentially entirely new generator that may have nothing to do with the original second-step value
In any case you can't assume they're related
So the only way to proceed is to generate something entirely new and probably unrelated from that new generator
I suppose you could at least use the same randomness, just in case that helps. I think test.check already does that
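A small bind example to make the two steps concrete: the first step generates a non-empty vector, and the user function builds a second generator whose range depends on it. The name `gen-vec-and-index` is illustrative.

```clojure
(require '[clojure.test.check.generators :as gen])

(def gen-vec-and-index
  (gen/bind (gen/not-empty (gen/vector gen/int))   ; first step
            (fn [v]                                 ; user-supplied function
              (gen/tuple (gen/return v)             ; second step depends on v
                         (gen/choose 0 (dec (count v)))))))
```

Shrinking the index alone is the easy case. Shrinking the vector means calling the function again with the smaller vector, and the index generator it returns may no longer even include the old index.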
i guess i don’t understand the shrinking process at all then - b/c what you’re saying here doesn’t make sense to me
is it just that shrinking re-runs the generators in some “shrink mode” and can do whatever the hell it wants?
So shrinking is not an independent operation on generated values -- it's essentially done at generation time, by each generator in the composition
This is a difference from haskell as I understand it
so is shrinking just “re-generate, but smaller”?
No
i would have expected generate & shrink to be two methods on a protocol
It's not - a generator returns a Very Large lazy tree of values
The root is the generated value; each node's children are ways to shrink from the value at that node
The shrinking algorithm walks that tree
Higher order generators combine these trees
E.g., gen/fmap calls rose/fmap
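A toy rose tree in plain Clojure to illustrate the idea. This is a sketch under assumptions and not test.check's internal representation: the map shape, `int-tree`, `tree-fmap` and `shrink` are all hypothetical names, and the halving strategy is only one plausible way to propose smaller integers.

```clojure
(defn int-tree
  "Rose tree for a non-negative integer n. The root is n; the lazy
  children are trees for smaller candidates (0 first, then values
  approaching n by halving the distance)."
  [n]
  {:root n
   :children (lazy-seq
               (->> (iterate #(quot % 2) n)
                    (take-while pos?)
                    (map #(int-tree (- n %)))))})

(defn tree-fmap
  "Applies f to every value in the tree, keeping the tree's shape."
  [f tree]
  {:root (f (:root tree))
   :children (map #(tree-fmap f %) (:children tree))})

(defn shrink
  "Greedy walk: move to the first child whose root still fails, and
  stop when no child fails."
  [fails? tree]
  (if-let [child (first (filter #(fails? (:root %)) (:children tree)))]
    (recur fails? child)
    (:root tree)))

;; Shrinking strings without knowing how to shrink strings: the tree of
;; integers is mapped through str, and the walk happens on the result.
(shrink #(>= (count %) 2) (tree-fmap str (int-tree 42)))
;; => "10"
```

Because the mapped tree keeps the shape of the integer tree, the string generator gets its shrinks without a separate shrink function for strings.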
ah, see, my mental model was one of producer/consumer
emitting generated values
totally wrong apparently 😛
I can't remember if reid came up with it himself, but people don't seem to expect it
I think maybe he guessed the impl based on erlang's API
So perhaps john hughes is the source
My talk Purely Random is about a lot of these details and about how I converted test.check to use an immutable rng
i might have to watch that
I wish I knew how to get slack to notify me of any activity in this channel
Maybe I just did that
so if i do (gen/sample (gen/list gen/int)) for a large sample size - and take the max of the lengths of the lists, it seems to stop at 99
i have no idea what controls that value
and i have no idea how to go about figuring it out lol
There's a doc page in the repo on growth and shrinking
The other thing people don't usually expect is that growth and shrinking are not related
the word “grow” does not appear in any of the .md files in the doc directory 😛
Should be in one of the titles
Growth
nope - maybe it’s somewhere else? i’ve read all the docs in the repo that i could find, heh
Um
I'll find it
https://github.com/clojure/test.check/blob/master/doc/growth-and-shrinking.md
It's new, maybe you didn't have it locally
ah, sorry - my bad. my last pull failed, i have code from december
will read that
lol, i had code checked out from ~2 days before you added that
Haha
ok - that doc gave me a much better idea of how growth & size works. thank you!
however, it didn’t give me much insight in to shrinking
other than the fact that shrinking is unrelated to growth 😛
also, i’m still not clear on how size is related to whether or not a generator is run
for example, if i do this: (gen/sample (gen/vector (gen/resize 0 gen/int)))
i get stuff like [0] multiple times, which i’d expect - but it suggests that the frequency of generator execution is unrelated to the size of the range of the generator
in order to have any confidence… i guess i’m going to need to understand this rose tree thing….
found this explanation: http://reiddraper.com/writing-simple-check/
gen/generate can give you better control than sample for playing with sizing
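A sketch for poking at sizing with gen/generate, which takes an explicit size as its second argument. The helper `lengths-at-size` is a made-up name. The second form is the resize experiment from above: the vector's length follows the outer size, while every element is generated at size 0.

```clojure
(require '[clojure.test.check.generators :as gen])

(defn lengths-at-size
  "Lengths of n lists, each generated at the given fixed size."
  [size n]
  (repeatedly n #(count (gen/generate (gen/list gen/int) size))))

;; Compare how long the lists get at a small and a large size.
(apply max (lengths-at-size 5 100))
(apply max (lengths-at-size 150 100))

;; Elements are pinned to size 0, but the vector length is not.
(gen/sample (gen/vector (gen/resize 0 gen/int)) 10)
```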
ok, i think i understand the rose tree / shrinking now….
Reid’s rationale suggests getting a shrinker “for free” from the generator, but i’m skeptical of that approach vs the two-method approach that apparently haskell uses
You're thinking a generator is an implementation of a protocol with one method for generating a value and another for shrinking it?
i did think that, which is apparently true in QuickCheck according to Reid’s blog post
Consider how to implement fmap then
(gen/fmap str gen/nat), e.g.
How do you shrink "42" given a function that can shrink 42
same way it does now? seems like it just divides by two in some complex round-about fashion
It's a string, not a number
ah
sorry, i misread
(defn fmap [f x] (reify IGenerator (gen [this ctx] (f (gen x ctx))) (shrink [this ctx value] (f (shrink x ctx)))))
😛
er sub that second value with x
or second x with value
you get the idea
but yeah, i get that my intuition doesn’t match reality
but as far as i can tell, the public API of test.check doesn’t expose the rose tree stuff
all the constructors for generators assume you return a value, not a rose tree
which i guess explains why stateful-check mucks around with internals a bit