Yet another question about spec.
This time about the semantics of the sequence designators such as s/alt
, s/+
, s/*
etc.
an element such as (s/+ even?)
means (if I understand correctly, a sequence of 1 or more objects for which the even?
predicate is true.
but I can substitute a spec name for even?
and say (s/+ ::big-even)
and that will be a sequence of 1 or more objects each of which validates the ::big-even
spec.
However, what does (s/+ (s/alt :x even? :y prime?))
mean. I would guess that it means a sequence of 1 or more elements each of which either satisfy even?
or satisfy prime?
But if I define
(s/def ::even-or-prime (s/alt :x even? :y prime?))
what does this mean? (s/+ ::even-or-prime)
It is a sequence of 1 or more sequences (each of which contains elements which are either even or prime), or is it a sequence of 1 or more integers each of which is either even or prime?I'm pretty sure spec has a way to designate both, as they are both possible things the user might want to express. right?
@jimka.issy s/alt
is for regexes, use s/or
for normal predicates
(s/+ (s/alt ....))
means a sequence of one or more alternatives
it doesn't mean each, this can be expressed with s/every
or s/coll
Sorry, my question too convoluted. Let me ask simpler.
If ::foo
is a spec defined somewhere using s/def
, what does (s/+ ::foo)
mean? Does it mean a sequence of 1 or more objects which each validate ::foo
?
yes
So it is not referntially transparent?
I.e. substituting the expression for ::foo in place, changes the meaning?
I don't see why that would be the case?
suppose
(s/def ::foo (s/alt :x even? :y prime?))
Now
(s/+ (s/alt :x even? :y prime?))
matches a sequence of 1 or more integers each of which are either even or prime, right?
but (s/+ ::foo)
matches a sequence of 1 or more objects which each satisfy :foo
, i.e., a sequence of sequences each of which contain even or prime integers.@jimka.issy Please give a working example. E.g.:
user=> (s/def ::foo (s/or :x even? :y neg?))
:user/foo
user=> (s/conform (s/+ ::foo) [2 4 -11])
[[:x 2] [:x 4] [:y -11]]
user=> (s/conform (s/+ (s/or :x even? :y neg?)) [2 4 -11])
[[:x 2] [:x 4] [:y -11]]
A spec is only applied to one thing, not to all elements of a collection, unless you use s/every
or s/coll-of
.
user=> (s/def ::nums-or-strings (s/or :nums (s/every number?) :strings (s/every string?)))
:user/nums-or-strings
user=> (s/conform ::nums-or-strings [1 2 3])
[:nums [1 2 3]]
user=> (s/conform ::nums-or-strings ["foo" "bar"])
[:strings ["foo" "bar"]]
yes, but you answer avoids my question. I'm not asking how to write a good spec. I'm asking about the semantics.
And I asked you about a counter-example of referential transparency. A working one without imaginary functions like prime?
which does not exist in core.
Ok here is a very simple one.
I mean you don’t really need prime?
source to know its semantics
(s/+ ::foo)
matches a sequence for which all the elements are either even or negative.
looks right
Your words are ambiguous to me. Do you mean: every element is a ..., or on a case by case basis?
lets stick with integers and sequences thereof.
Now define
(s/def ::foo2 (s/+ ::foo))
what does (s/+ ::foo2)
mean. It does not mean a sequence of 1 or more objects which of which match ::foo2
What does ::foo
mean in your code here?
some kind of integer that's either this or that?
(s/def ::foo (s/or :x even? :y neg?))
then ::foo2 means multiple even or negs, but they can be mixed. it's not all or nothing
I would. say (s/+ ::foo2)
is a seq of 1 or more objects each of which is is ::foo2
, why would you say it isn’t that?
Wait. I think I'm confused. Let me work out a more consistent example. And come back to you in 5 minutes. OK?
ok
taking this conversation to a thread. I hope everyone follows.
(s/def ::foo (s/or :x even? :y neg?))
(s/def ::foo2 (s/+ ::foo))
Now ::foo matches even and negative integers
(s/+ ::foo)
matches a non-empty sequence of even and negative integers
sounds right
but (s/+ ::foo2)
does not match a non-empty sequence of objects which match ::foo2
ah, I think that’s because of how regex ops nest
yess!!!!!!!!
this was my original question (s/+ x)
does not match a non empty sequence of things which match x
@jimka.issy If you want to do this, you need to use s/spec
around the ::foo2
This has to do with regex nesting indeed, not with referential transparency. Read the spec guide, which explains this
> When regex ops are combined, they describe a single sequence. If you need to spec a nested sequential collection, you must use an explicit call to spec to start a new nested regex context.
(s/valid? (s/+ (s/spec ::foo2)) [[2 -3 4] [2 -3 4] [2 -3 4]]) => true
I have read down https://clojure.org/guides/spec#_sequences.
Did I miss the paragraph about when regex ops are combined.
I assume you know how Ctrl-f works ;)
Besides, I'm confused about what "regex" operator means.
regex operators are things like s/+
, s/?
sometimes it means string regexps. and some times it means sequence operations.
in the context of spec, regex means a spec which describes a sequence of things
The idea of spec is based on this paper: http://matt.might.net/papers/might2011derivatives.pdf where that terminology comes from
so does this mean that if a sequence operator is directly within another sequence operator it has this special modal/merging meaning. otherwise it has its normal meaning?
(or maybe not that one, but the paper where that paper is based on)
> When regex ops are combined, they describe a single sequence. If you need to spec a nested sequential collection, you must use an explicit call to spec to start a new nested regex context.
s/alt
returns a regex op, s/or
is for combining predicates
so s/or
is not a regex op.
have you seen this paper? https://www.lrde.epita.fr/dload/papers/newton.18.meta.pdf
I haven't
and this one, where I introduced the derivate for implementing something quite similar to spec in common lisp https://www.lrde.epita.fr/dload/papers/newton.16.els.pdf
the system I devised (in 2016) and after, is called RTE (regular type expressions).
cool!
What i'm trying to do now is figure how similar or different it is from spec at a theoretical leve.
So I'm trying to see if I can "compile" a spec into an RTE.
if so, then an RTE can me validated in linear time with no backtracking.
as I understand it, a spec takes exponential time worst case, so users avoid worst cases.
however, since I have found no academic papers on spec, and I am not an expert of spec, I'm just guessing. Maybe I'm completely wrong about my supposition
@jimka.issy I'm interested in that for clj-kondo as well. Clj-kondo has a tiny type system in which I try to avoid backtracking of type checking arguments. Currently by just not allowing sophisticated constructs :)
(https://github.com/borkdude/clj-kondo/blob/master/doc/types.md)
interesting*
I submitted a paper recently to the Dynamic Languages Symposium where I introduced a Simple Type System for use in dynamic languages. The paper was rejected. One of the reasons for rejection was the revewers didn't see practical applications.
a simple type system allows you to reason about types (where type is defined as any set of values) . Furthermore intersections, unions, and complements of types are also types.
If your type logic can conclude that a type is empty, then you've usually found a bug in the user's code.
It sounds like it could be very useful for clojure. Are you aware of the work done in core.typed?
Thats Ambrose?
yes
He's looked at my work and is interested in helping out. but everyone is busy.
I'd also love to have a student researcher interested in working with clojure.
It looks like your work could help improve clj-kondo's type system since linear checking is necessary for performance
what's clj-kondo written in? is it written in clojure?
yes
are you more of a hacker, or more of an academic?
or more of an engineer? I don't mean that in any way insulting. sorry if it sounds as such....
haha. I do have a CS master degree, but not a Phd in type systems ;)
clj-kondo focuses on usefulness though, whereas something like core.typed is maybe more an academic project
I don't care about publishing papers, I want the tool to help me
I'm not sure why my next best step should be.
For the moment I have an implementation of genus, (the simple type system in clojure) and RTE (regular type expressions) which allows genus to represent sets of sequences which match regular expressions in terms of types. Genus claims to be extensible.
For the moment my experimental task is see if I can indeed extend genus (in user space) to incorporate the spec
type. I.e., given a spec, the type will be the set of all objects which validate the spec.
spec types will be treated as opaque types such as arbitrary predicate. except in the case where I can compile a spec into more fundamental types in which case the system can reason about them.
Another tool I made recently also uses clojure.spec for searching code: https://github.com/borkdude/grasp This could potentially made faster Another project which tries to use spec at compile time: https://github.com/arohner/spectrum It's experimental in nature.
for example it will be able to decide whether the set of objects satisfying spec1 and also spec2 is empty. or decide whether two given specs are equivelent.
in order for RTE to represent the regular type expression as a deterministic finite automaton, it needs to be able to determine whether two given types are disjoint. This is currently not possible with spec in the most general case, but could be possible in many particular cases.
As I understand spec's internal data structure uses non-deterministic finite automata, thus the exponential behavior in the worst case.
ANYWAY, maybe now you understand a bit better the reason for my strange beginner questions about the semantics of spec ???
yes, thanks for explaining!
The authors of spec usually emphasise that spec is not a replacement for a type system, but a runtime validation system.
in my opinion there is no difference
What might be of interest, I have a collection of specs for core functions here: https://github.com/borkdude/speculative
As in, spec could be expressed as a case of dependent typing?
what is speculative? I didn't immediately understand by reading the intro.
> a collection of specs for core functions
For me a type is a set of values. And a type system allows programmers to designate and reason about certain types.
that's what spec does.
e.g. a spec for clojure.core/cons
:
https://github.com/borkdude/speculative/blob/4e773794a4065a84bdadd997516e52c76ab51b1f/src/speculative/core.cljc#L17
yes, if you define it like that, it makes sense. but this set of values can often not be known at compile time, which makes the difference between static compile time checking
what is s/fdef
?
clojure.spec supports defining specs for checking function arguments and return values using fdef
yes many times you can indeed know a lot about a type at compile time. in those cases, compilers like the SBCL common lisp compiler can create more efficient code, or tell the user about errors or unreachable code.
so an fdef decribes what a valid function call site looks like?
yes
I can see how the predicate based assumption of spec, could be a bottle neck for a code analyzer like kondo.
I initially used spec in clj-kondo, but I found the performance not good enough
one think that I've noticed is that many many many of these predicates can be rewritten as a simple type check. And I have code which automates this.
ahhh, so you have abandoned that piste now?
the format now uses something more explicit: you provide the seq of types per arity. so you don't have to do any backtracking to to match a seq based on some s/alt expression. which makes more sense to me considering the structure of multi-arity functions.
and a seq of types is usually a fixed number of things or a fixed number of thing + a variable amount of things of the same type
I think this matches 99% of cases how people call functions in clojure
for example
(gns/canonicalize-type '(satisfies int?))
==> (or Long Integer Short Byte)
(or ...) is a union type. when intersected and unioned with other types, it can be arranged so that redundant checks are eliminated.clj-kondo also supports union types, it's just #{:int :string}
it also tries to infer returns types and threads types through let expressions
Example:
so it has type inferencing?
It has inferencing yes, but limited to the things it knows, else it's any? and it won't complain
I wrote something pretty similar some years ago for SKILL++ which was a proprietary lisp used by the company I worked for for many years.
in my system, called "loathing" it would see that foo
returns type string (because of the definition), and the call site inc expects an integer. So the type at that point is (and integer string) which is the empty-type.
clj-kondo does something similar. but it just has a simple graph of things that are reachable or not: you can never go from string to int, so this is an error
but you can go from any to int, or any to string, so this is not an error
however, if inc had input type (or integer string), then the intersection would be (and string (or string integer)) = string != empty-set
yep, same thing in kondo
e.g.: this will not give a warning:
(defn foo [x]
(if (int? x)
x
(str x)))
(inc (foo 1))
You mentioned that you are not interested in publications. But I need to make publications. It's part of my job.
Well, I meant that this is not the goal of clj-kondo.
I need to get several publications out of this work.
It's not the product of academic research
I have nothing against making publications
I do have my name on one publication ;)
and I think there are several interesting results to publish. But i'm a very poor publicist. I don't do well at convincing people of the interest or novelty of my work.
If you could get practical benefit out of your work by e.g. improving clj-kondo or a similar tool, I'm sure it will convince people
@jimka.issy I'm sharing this for fun, but it's also related: https://borkdude.github.io/re-find.web/ Find functions based on example in and outputs
This uses the speculative specs to search for matches
the clojure community is friendly for the most part. although I do see some snide comments now and then.
So it is usally pretty easy to get questions answered.
I would love to have some spec examples for test cases.
I would say the clojure community is better in that respect than some others
Could I solicit you to give me some spec examples, simple and complex. some which are trivial, and some which are very slow to validate?
For the moment I'm just worried about sequences, not functions or maps or data structures. My code is not advanced enough to handle those.
I want to try to compile these to RTE and see if I can come up with some cases where RTE is faster.
if RTE is never faster, then i'm in trouble
You could take a look at speculative, which has more than hundred specs
I don't have a particular example of a slow spec, but I guess you can fabricate one based one which does backtracking
when analyzing code statically, i suppose you are usually looking at nested sequences, not really at maps and such.
This tool: https://github.com/borkdude/grasp searches through code. It applies the spec at every sub-tree of an s-expression
So performance improvements would be really noticeable there
I have to go now. speak to you later
ok, thanks. talk later
BTW i filed an issue for kondo, was it helpful?
another spec question. Are all spec definitions global? If I'm writing test cases, I'd love to introduce local definitions that don't pollute the global space.
Yes, thank you. I'm not sure if clj-kondo will be able to support this, as it sees one file as a standalone unit and in-ns
doesn't really match with this model. So either using in-ns
with :refer
on the original namespace requires you to help clj-kondo with some annotations or config, or clj-kondo has to make non-trivial changes.
I will think about this for a while
gocha
@jimka.issy Yes, spec has one global registry (compare spec keywords to RDF uris for example). Malli (#malli) which aims to be an alternative schema library has local registries as well.
I suppose for test cases I can use ::xyzzy to keep the symbols in my local namespace.
Do I understand correctly that there is no intersection equivalent of s/alt
for regex semantics. I.e., s/or
is to s/alt
as s/and
is to what?
s/&
ahh, it's not included in https://clojure.org/guides/spec#_sequences, at least not in the table with the others
syntactically does s/&
work like s/alt
and s/cat
in that I need to tag the components? or like s/and
where no tags are used?
@jimka.issy In the guide, search for: > Spec also defines one additional regex operator, &, which takes a regex operator and constrains it with one or more additional predicates. to see an example
ahh so s/&
is not really analogous to s/alt
it is yet a third syntax and semantic.
I thought you were asking analogous to s/and
.
> I.e., s/or is to s/alt as s/and is to what?
s/alt
and s/&
are supposed to be used to build regexes, s/and
and s/or
are just general ways of combining predicates
but sometimes they do the same thing
It looks (from the documentation) that s/&
is not analgous to s/and
in the same way that s/alt
is analogous to s/or
. if they were analogous I'd expect s/&
to work like this.
(s/* (s/alt :x (s/cat :a neg? :b even?) :y (s/cat :c odd? :d pos?)))
(s/* (s/& (s/cat :a neg? :b even?) (s/cat :c odd? :d pos)))
but instead it works like this
(s/* (s/& (s/cat :a neg? :b even?) #(even? (count %))))
See the difference here:
user=> (s/conform (s/alt :foo int? :bar string?) [1])
[:foo 1]
user=> (s/conform (s/or :foo int? :bar string?) [1])
:clojure.spec.alpha/invalid
user=> (s/conform (s/or :foo int? :bar string?) 1)
[:foo 1]
yes s/or matches an object, and s/alt matches a sequence of objects
So why doesn't this return true?
(s/valid? (s/* (s/& (s/cat :a neg? :b even?)
(s/cat :c odd? :d pos?)))
[-3 4 -5 2])
it is both a sequence of neg even occurring 0 or more times, and also a sequence of odd pos occuring 0 or more timesIf I replace s/& with s/alt (and insert the required keys), it works as expected.
(s/valid? (s/* (s/alt :x (s/cat :a neg? :b even?)
:y (s/cat :c odd? :d pos?)))
[-3 4 -5 2])
It looks to me (on first and second reading of the specification) that (s/& pattern predicate)
means that the subsequence which matches the pattern must as a sequence match the predicate. So the following
(s/cat :a (s/& (s/* string?) #(even? (count %)))
:b (s/& (s/+ int?) #(odd? (count %))))
matches a sequence which begins with an even number of strings and ends with an odd number of integers.
Is my interpretation correct?
I don't think these are really regular expressions any more. I have to think about it, but my suspicion is that this cannot be solved without using a stack.I don't think s/& is a regular expression operation. It doesn't seem regular. as far as I can tell, it requires the implementation to remember the sequence being tested. a regular expression demands that every decision be based only one the current object of the sequence, and a state. whereas only a finite number of states are allowed. In this case if there are N states, you cannot test a sequence of length N+1.
How can I recognize an object such as the one returned by
(s/* (s/alt :x (s/cat :a neg? :b even?)
:y (s/cat :c odd? :d pos?)))
s/spec?
returns false
. type
returns clojure.lang.PersistentArrayMap
.
s/get-spec
returns nil
.isn’t there a regex?
so regex?
returns non-nil. is that one I should use?
Yeah
seems to be any object for which ::op returns boolean true.
where ::op is :clojure-spec-alpha/op
fair enough