clojure-spec

About: http://clojure.org/about/spec Guide: http://clojure.org/guides/spec API: https://clojure.github.io/spec.alpha/clojure.spec.alpha-api.html
Alex Whitt 2021-02-22T18:06:49.001800Z

Would anyone like to jump on this discussion thread I created on the subreddit? (I'd prefer to keep it there so it doesn't disappear behind Slack's paywall) https://www.reddit.com/r/Clojure/comments/lpv8ok/spec_vs_malli/

Alex Whitt 2021-02-22T21:41:13.002600Z

The sentiment from the community so far has been leaning towards Malli. I'd love to get some alternative viewpoints in there.

seancorfield 2021-02-22T21:50:39.005200Z

@alex.joseph.whitt All I'll say is that I try hard to use "official" Cognitect stuff where it is available and we've been very happy with Spec for several years at work, although we have recently started using exoscale/coax in addition to Spec so that we can tease apart/replace our "coercing web specs" and use a more standard approach (i.e., separate coercion of strings, such as form input data, to stuff like long, bool, date... from the actual specs themselves).

alexmiller 2021-02-22T21:54:06.005500Z

I think your summary and the comments there are fair

seancorfield 2021-02-22T21:57:41.011400Z

I have not looked at Malli in any depth but we did use Schema for a while (and abandoned it -- actually twice -- because it felt non-idiomatic with its syntax and it was only as good as your testing strategy, without generative testing: we were happy with our switch to Spec in comparison to Schema). We also tried Typed Clojure (again, twice) and abandoned that for different reasons. Spec hits a sweet spot for us -- and we use it in a lot of different ways per https://corfield.org/blog/2019/09/13/using-spec/

👍 1
alexmiller 2021-02-22T21:58:02.012100Z

if I could wave a magic wand and have a final design and impl for spec 2 I would, but it's tackling some hard problems (beyond what we tried to tackle in spec 1). I know it feels stalled but work really does continue on it and I have hope that we will pop the stack of other work in progress back to it "soon".

👍 9
2021-02-23T16:06:06.067800Z

I'm glad y'all are taking your time. It's worth it.

seancorfield 2021-02-22T21:58:35.012700Z

Your comment about Malli embracing coercion would make me want to avoid it -- we went down that path with our own library and it was a mistake, hence our shift to Spec + coax now.

borkdude 2021-02-22T21:59:18.013Z

@seancorfield Can you explain why it was a mistake?

seancorfield 2021-02-22T22:00:06.014Z

Complecting coercions with validations is just messy and it can hide errors.

seancorfield 2021-02-22T22:00:30.014700Z

Our specs are simpler now, and don't need custom generators as often.

seancorfield 2021-02-22T22:01:05.015800Z

The coercions are clear and separate. We can test coercing code and validating code separately.
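
A minimal sketch of that separation, using only clojure.spec.alpha and hand-written coercion functions rather than exoscale/coax (whose API is not shown in this thread). The spec names and the helpers `->long`, `->bool` and `coerce-signup` are hypothetical, chosen for illustration:

```clojure
(require '[clojure.spec.alpha :as s])

;; Validation only: these specs describe the already-typed values.
(s/def ::age (s/and int? #(<= 0 % 150)))
(s/def ::active? boolean?)
(s/def ::signup (s/keys :req-un [::age ::active?]))

;; Coercion only: hypothetical helpers that turn form strings into values
;; and pass anything they cannot parse through unchanged.
(defn ->long [x]
  (if (string? x)
    (try (Long/parseLong x)
         (catch NumberFormatException _ x))
    x))

(defn ->bool [x]
  (case x
    "true"  true
    "false" false
    x))

(defn coerce-signup [params]
  (-> params
      (update :age ->long)
      (update :active? ->bool)))

;; Coerce first, then validate the result against the plain spec.
(s/valid? ::signup (coerce-signup {:age "42" :active? "true"}))  ;; => true
(s/valid? ::signup (coerce-signup {:age "abc" :active? "true"})) ;; => false
```

Because unparseable input is passed through untouched, the failure shows up in validation instead of being hidden by the coercion step, and each half can be tested on its own.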

seancorfield 2021-02-22T22:02:04.016700Z

I was one of those arguing in favor of having specs that coerced values for a long time. I was wrong about that.

💯 1
borkdude 2021-02-22T22:02:53.017400Z

I would love to add spec to babashka btw. But I don't feel certain about the alpha suffix and where it's going next and how long it's going to be. It is a recurring question from users. Since it's part of the current clojure people sometimes expect it to be just there. clojure.test.check is already in there, as a preparation step

borkdude 2021-02-22T22:04:59.018600Z

I would also love to hear some insights from @ikitommi about the coercion philosophy in malli. Maybe it's one of those easy vs simplicity things, where a lot of people really just want the easy?

borkdude 2021-02-22T22:09:55.021100Z

@seancorfield Do you use clojure.spec to validate incoming web requests, in an open world way? @dominicm recently pointed out that this can be a security issue. This problem will be solved with spec2 which supports closed.

seancorfield 2021-02-22T22:15:09.024400Z

I disagree that it's a problem with spec. If you have a context where you must only accept a certain subset of keys, that's what select-keys is for.

💯 1
seancorfield 2021-02-22T22:16:53.026400Z

We use spec as the "source of truth" for various things and we have situations where we derive the set of keys from the spec and then either explicitly restrict the data we process to that subset or we simply trim the keys present down to that subset, depending on whether we want to flag unwanted keys or not. But that trimming/checking is independent of how we validate incoming web requests.
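
One way this could look, as a sketch only: the set of keys is read back out of the spec with `s/form`, then used either to trim or to flag extra keys. The helper `spec-keys` and the `::profile` spec are hypothetical, and the helper assumes a simple `s/keys` spec with flat `:req-un`/`:opt-un` vectors (no `or`/`and` groups, no merged specs):

```clojure
(require '[clojure.spec.alpha :as s])

(s/def ::name string?)
(s/def ::email string?)
(s/def ::profile (s/keys :req-un [::name] :opt-un [::email]))

(defn spec-keys
  "Returns the set of unqualified keys named by a simple s/keys spec."
  [spec]
  (let [{:keys [req-un opt-un]} (apply hash-map (rest (s/form spec)))]
    (into #{} (map (comp keyword name)) (concat req-un opt-un))))

(def incoming {:name "Ada" :email "ada@example.com" :admin? true})

;; Option 1: silently trim the data down to the keys the spec knows about.
(select-keys incoming (spec-keys ::profile))

;; Option 2: flag the unwanted keys instead of dropping them.
(remove (spec-keys ::profile) (keys incoming))
```

Either step runs independently of `s/valid?`, so the spec itself stays open.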

borkdude 2021-02-22T22:18:24.027700Z

If it's not a problem, then why is spec2 introducing closed specs again? 🤔

seancorfield 2021-02-22T22:19:05.028300Z

Because people whined about it not being in Spec 1 🙂

seancorfield 2021-02-22T22:20:03.030Z

You could certainly do closed spec checking with Spec 1 -- you just have to do it manually.
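
A naive sketch of such a manual check, assuming a hand-maintained set of allowed keys (the `::item` spec and `item-keys` are hypothetical names). The open `s/keys` spec is combined with a predicate that rejects any key outside the set:

```clojure
(require '[clojure.spec.alpha :as s])

(s/def ::id int?)
(s/def ::title string?)

;; The usual open spec: extra keys are allowed.
(s/def ::item (s/keys :req-un [::id ::title]))

;; The keys we are willing to accept, maintained by hand in this sketch.
(def item-keys #{:id :title})

;; Closed variant: valid as an ::item and containing no other keys.
(s/def ::closed-item
  (s/and ::item #(every? item-keys (keys %))))

(s/valid? ::item        {:id 1 :title "x" :extra true}) ;; => true
(s/valid? ::closed-item {:id 1 :title "x" :extra true}) ;; => false
(s/valid? ::closed-item {:id 1 :title "x"})             ;; => true
```

The explain output for the anonymous predicate is not very readable, which is the price of keeping the sketch this short.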

borkdude 2021-02-22T22:20:05.030200Z

I think what @dominicm was hinting at was that when you allow any spec to be validated in a web request, this may trigger functions you did not expect to be executed, or something, which may lead to DoS, or whatever

seancorfield 2021-02-22T22:21:10.030700Z

That's a lot of hand-waving 🙂

borkdude 2021-02-22T22:21:59.031500Z

I was asking questions, I did not take a position in this. Just wondering if you had considered it

seancorfield 2021-02-22T22:22:05.031700Z

I think it's a strawman argument to "blame" spec for something like that, frankly.

borkdude 2021-02-22T22:22:50.032Z

ok

seancorfield 2021-02-22T22:24:55.033700Z

If he has a specific, realistic scenario where just using Spec 1 to validate an incoming web request would cause a DoS, I'd be very interested to see that.

seancorfield 2021-02-22T22:29:35.038Z

This is what Dominic said in #yada when asked why Yada does not have integration with Spec: "Because it gives attackers access to all your spec keys. Some of which might be very slow (and not run in production in normal circumstances). This opens you up to DOS attacks, the guidance from Alex on this was to validate data before passing it to spec. On top of that, you usually want to restrict keys in the web context as you're going to pass them into the database." -- I've covered the latter part (above) and I would expect in real-world code what you put in the database isn't necessarily going to be anything like the set of keys that you get passed in a web request because the persistence model and the API model aren't likely to be a 1:1 match anyway.

robertfw 2021-02-22T22:30:09.039200Z

I've had cases where the data I wanted to spec had to enforce a closed nature, namely, working with an HTTP API that throws an error if you give it data it wasn't expecting. In that case we want a spec that validates that we're not including anything extra. The same would apply for a spec describing incoming data on a system that likewise wants to error out if the user sends unexpected fields (e.g., so users get warnings if they typo a parameter name or send data they think is doing something when it is actually being ignored)

borkdude 2021-02-22T22:31:35.040700Z

I guess you could ask the question: why should web APIs validate differently than functions?

seancorfield 2021-02-22T22:31:43.040900Z

I don't find the first part very convincing: if you have "very slow" specs that don't run as part of production code, why have them in your production code at all? Why not have them in test code (and if they're "very slow" they're not even going to be part of your unit tests).

seancorfield 2021-02-22T22:34:15.044700Z

As for Robert's case: I covered that above -- if you do want to fail validation on additional/unknown keys, that is possible with Spec 1. It's just not directly built in. I don't see Spec 2's "closed checking" as any sort of admission that Spec 1 was wrong -- I see it much more as a convenience that lets you write a little less code in certain situations. I think most people made a big ol' mountain out of wanting closed specs when it's really just a molehill.

seancorfield 2021-02-22T22:34:49.045800Z

(I think this is absolutely one of those cases where people pushed for "easy" rather than "simple")

borkdude 2021-02-22T22:35:50.047500Z

Well, sometimes UX matters

ikitommi 2021-02-22T22:36:49.048900Z

@seancorfield just to clarify: malli separates coercion from validation, for the reasons you mentioned. You first "coerce" a value, then validate and explain (and humanize the explanation) if needed.

seancorfield 2021-02-22T22:38:18.049600Z

Good to know -- that wasn't clear to me from @alex.joseph.whitt’s comment.

2021-02-22T22:51:00.049800Z

Tried following the link to the yada channel to see the conversation but perhaps it happened a while ago. Curious what validation was suggested before passing to spec.

seancorfield 2021-02-22T23:05:51.050100Z

I didn't dig far enough into the Zulip archives to see what Alex had actually said about that -- I'm pretty sure Alex didn't literally say folks should "validate data before passing it to spec" but he has talked about doing coercion on input data before passing it to spec (a separate thing).

seancorfield 2021-02-22T23:06:51.050300Z

Slack has a 10,000 message limit on the free plan but most channels are mirrored to Zulip http://clojurians.zulipchat.com which has an unlimited archive and search facility (for the open source plan we're on there).

2021-02-22T23:15:40.050500Z

Got it thanks 👍

Alex Whitt 2021-02-22T23:51:48.054600Z

Well, I don't know if we're any closer to a decision... so many good points brought up on both sides. I'd love to know if the Cognitect team ever considered implementing spec with a data-driven architecture like Malli's, and if so, what factors led to deciding against it. For some context, I've been an avid spec user for years, and only recently became aware of Malli. As we're starting a new project, I guess I'm experiencing some FOMO regarding some of Malli's features, but I can't escape thinking that a lot of Spec's design decisions were really on-point and forward-thinking.

mpenet 2021-02-23T09:45:56.057800Z

spec is generally very opinionated, but once you "get" the design choices it's hard to disagree with them. It fits quite well with the datomic, "Maximal Graph" & co approach to dealing with data, which imho is spot on (I am not a datomic or pathom user but I agree with the general ideas). Right now it's also quite bare bones but there are a few libraries that fill the important gaps (coercion, error messages, etc). For me at least, spec2 would fix most of the important problems I have with v1 (mostly schema+select, or whatever these will be called), but even without that spec1 is very usable. Most of the problems stated about spec 1 imho are very often overblown or just a misunderstanding of how to use spec or of its goals. But I get the frustration of the malli authors who wanted spec2 now (I do too, but I'd rather have something carefully designed than a rushed draft) and the amount of work they pour into malli earns respect. Personally I do not agree with some of their design choices and, most importantly, with the fact that this will cause yet more fragmentation in that space. I would rather have had all that firepower put into spec libraries. I also had to deal with code bases that invested in spec-tools and broke some teeth in the process, so I prefer to be very conservative when I choose a library to fill that spot now.

mpenet 2021-02-23T09:50:36.058Z

About closed specs & data: for me these are quite minor issues (if issues at all). I had to generate specs from another DSL in work I did recently and it was not a big deal at all. For more advanced composition I can't remember a case where I was blocked by the current API; in extreme cases you can rely on eval/macros to get what you want, but that's extremely rare, in my experience at least. That said, if spec2 improves that, great. For closed specs it's the same: writing a strict-keys version of s/keys is quite easy. A naive version with poor error reporting is a couple of lines; a more involved one with nice error messages is more work, but it's under 50 LOC.
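
For illustration, the eval route for generating specs from data might look like the sketch below. Since `s/def` and `s/keys` are macros, the spec form is built as data and then evaluated. The function `define-keys-spec!`, the shape of its input map and the `:acct/...` names are all hypothetical and not taken from any DSL mentioned here:

```clojure
(require '[clojure.spec.alpha :as s])

(s/def :acct/id int?)
(s/def :acct/email string?)

(defn define-keys-spec!
  "Registers an s/keys spec described by plain data.
   Key names must be namespace-qualified keywords, as s/keys requires."
  [{:keys [spec-name required optional]}]
  (eval `(s/def ~spec-name
           (s/keys :req-un ~(vec required) :opt-un ~(vec optional)))))

;; The description could come from a file, a database or another DSL.
(define-keys-spec! {:spec-name :acct/account
                    :required  [:acct/id]
                    :optional  [:acct/email]})

(s/valid? :acct/account {:id 1 :email "a@example.com"}) ;; => true
(s/valid? :acct/account {:email "a@example.com"})       ;; => false
```

Using `eval` on forms built from untrusted input would be unsafe, so a sketch like this only makes sense for descriptions you control.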

mpenet 2021-02-23T09:53:51.058500Z

having multiple choices might be good, that could be a sign we have a "healthy" community

borkdude 2021-02-23T10:47:38.059300Z

About the data driven nature of malli vs spec: I think it's fair to say that malli schemas are a better fit if you want to have a programmatic way of creating/changing schemas (in spec you end up with macros for this usually) and when you want to have (de)serialization of your schemas. That's not to say that you can't do this with spec, but the UX of malli is probably better for this kind of use.