honeysql

Discussion of https://github.com/seancorfield/honeysql :slightly_smiling_face:
2021-03-16T01:21:26.002100Z

Here is the latest v2 "puzzle": The SQL I'm trying to generate is:

LEAD(g) OVER (PARTITION BY fid ORDER BY m)
I've tried this:
(sql.helpers/over [[:lead :g]
                   (-> (sql.helpers/partition-by :fid) (order-by :m))])
which generates:
[[:over [[:lead :g] {:partition-by [:fid], :order-by [:m]}]]]
which sql/format chokes on

seancorfield 2021-03-16T01:42:40.002500Z

Can you provide a bit more context? This works:

user=> (sql/format (-> (select (over [[:lead :g] (-> (h/partition-by :fid) (order-by :m))]))))
["SELECT LEAD(g) OVER (PARTITION BY fid ORDER BY m ASC)"]

seancorfield 2021-03-16T01:45:03.002900Z

Here’s a more complex example from the docs:

user=> (sql/format (-> (select :id
  #_=>                                (over [[:avg :salary] (-> (h/partition-by :department) (order-by :designation)) :Average]
  #_=>                                      [[:max :salary] :w :MaxSalary]))
  #_=>                        (from :employee)
  #_=>                        (window :w (h/partition-by :department))))
["SELECT id, AVG(salary) OVER (PARTITION BY department ORDER BY designation ASC) AS Average, MAX(salary) OVER w AS MaxSalary FROM employee WINDOW w AS (PARTITION BY department)"]

2021-03-16T02:27:50.004900Z

I definitely looked at the docs that I could find in https://github.com/seancorfield/honeysql/blob/v2/doc/clause-reference.md The broader context is that the snipped above is part of a with CTE, here is the raw version of it (that works):

[:E [:raw "(SELECT DISTINCT ON (fid) fid, id AS a, id + 1 AS b, ST_3DDistance(ST_MakeLine(g, LEAD(g) OVER (PARTITION BY fid ORDER BY m)), pca_raw) AS dist FROM D ORDER BY fid, 4)"]]

2021-03-16T02:32:16.007100Z

Here is what I tried:

[:E (-> (sql.helpers/select-distinct-on :fid
                                        :fid
                                        [:id :a]
                                        [[:raw "id + 1"] :b]
                                        [[:ST_3DDistance [:ST_MakeLine :g
                                                                        (sql.helpers/over [[:lead :g]
                                                                                          (-> (sql.helpers/partition-by :fid)
                                                                                              (order-by :m))])] :pca_raw] :dist])
        (from :D)
        (order-by [:fid 4]))]

seancorfield 2021-03-16T03:22:38.010800Z

[I was wrong: HoneySQL treats :over as special syntax so it should work in any expression]

seancorfield 2021-03-16T03:29:23.011600Z

Oh, I see the issue @dcj :select-distinct-on expects the first argument to be a sequence of column names per the docs: "Similar to :select-distinct above but the first element in the sequence should be a sequence of columns for the DISTINCT ON clause"

seancorfield 2021-03-16T03:32:43.012600Z

Also, the order-by should have a direction (`:asc` or :desc) instead of 4 -- can you explain what you're trying to do there?

seancorfield 2021-03-16T03:34:18.012900Z

Maybe this is what you're aiming for:

dev=> (sql/format (-> (h/select-distinct-on [:fid]
 #_=>  #_=>                                         :fid
 #_=>  #_=>                                         [:id :a]
 #_=>  #_=>                                         [[:+ :id 1] :b]
 #_=>  #_=>                                         [[:ST_3DDistance [:ST_MakeLine :g
 #_=>  #_=>                                                                         (h/over [[:lead :g]
 #_=>  #_=>                                                                                  (-> (h/partition-by :fid)
 #_=>  #_=>                                                                                      (h/order-by :m))])] :pca_raw] :dist])
 #_=>  #_=>                 (h/from :D) (h/order-by :fid [[:inline 4]])))
["SELECT DISTINCT ON(fid) fid, id AS a, id + ? AS b, ST_3DDISTANCE(ST_MAKELINE(g, (LEAD(g) OVER (PARTITION BY fid ORDER BY m ASC))), pca_raw) AS dist FROM D ORDER BY fid ASC, 4 ASC" 1]
Not sure whether the id + 1 is better for you as an expression rather than raw like that?

2021-03-16T16:16:23.015400Z

1. Thank you for the syntax fix! 2. Clearly I need to read the docs more carefully 3. Is there some way I can run my proposed query DSL through Spec, to veriify it is correct? 4. Thank you for [[:+ :id 1] :b] that is way better

2021-03-16T16:23:58.019300Z

Also, the order-by should have a direction (:asc  or :desc) instead of 4 -- can you explain what you're trying to do there?
The short answer is that I can't explain this, the query I am working on was created by an extremely experienced PostGIS developer, and I don't understand how it works 😞 That being said, I checked the Postgres docs on order-by: https://www.postgresql.org/docs/9.5/queries-order.html Note this:
A sort_expression can also be the column label or number of an output column, as in:

    SELECT a, max(b) FROM table1 GROUP BY a ORDER BY 1;
So, at this point, here is what I have:
(pprint tqe)

{:select-distinct-on
 [[:fid]
  :fid
  [:id :a]
  [[:raw "id + 1"] :b]
  [[:ST_3DDistance
    [:ST_MakeLine
     :g
     [[:over [[:lead :g] {:partition-by [:fid], :order-by [:m]}]]]]
    :pca_raw]
   :dist]],
 :from [:D],
 :order-by [[[:raw "fid, 4"]]]}

(println (sql/format tqe {:pretty true}))

[
SELECT DISTINCT ON(fid) fid, id AS a, id + 1 AS b, ST_3DDISTANCE(ST_MAKELINE(g, (LEAD(g) OVER (PARTITION BY fid ORDER BY m ASC))), pca_raw) AS dist
FROM D
ORDER BY fid, 4 ASC
]
I haven't tested this yet, need to head to a meeting, back in a couple hours

seancorfield 2021-03-16T17:03:39.020Z

I showed an order-by that would work for :fid, 4: (h/order-by :fid [[:inline 4]]) — you don’t need :raw here.

seancorfield 2021-03-16T17:05:13.021800Z

Re: #3 — there’s an issue on GitHub for a Spec for the DSL but I’m not sure how tractable that really is since it would need to pretty much match the entire recursive descent “formatter”: SQL is extremely complex as a language and Spec isn’t really suited to describing language grammars.

2021-03-16T19:39:37.022200Z

Yes, this is working, thank you! Now, pne last giant blob of :raw to convert....

2021-03-16T19:41:35.022400Z

Could each helper fn check its args? Not saying this would enable checking everything.... I will try and give this some thought

seancorfield 2021-03-16T20:21:59.022600Z

Not really: the helpers are nearly all generic — they mostly just assoc/merge arguments into the underlying data structure (they’re nearly all just calls to generic under the hood).

seancorfield 2021-03-16T20:22:41.022800Z

And the problem is that the syntax of SQL is recursive: SELECT expressions can contain SELECT expressions in all sorts of complex ways.

seancorfield 2021-03-16T20:23:03.023Z

Any Spec that I end up writing is only going to catch the most egregious mistakes.

seancorfield 2021-03-16T20:24:26.023200Z

There are a few clauses where the structure required is unique enough that Spec could catch some errors (`:select-distinct-on` is a particular case where it expects a sequence where the first element must be a sequence — but you already got an error from that… just not a very informative one).

seancorfield 2021-03-16T20:25:22.023400Z

I can try to add more “specific error checking” in cases where, if you pass in bad data today you get a generic error (either an NPE or a CCE), but many of those would be hard to predict…

2021-03-16T20:35:18.024600Z

Understood! I'm thinking more about the "user error" case....

2021-03-16T20:53:49.028500Z

Today's v2 puzzle, trying to convert this from :raw

FROM
    (SELECT fid, pca, dca FROM P) P
    INNER JOIN
    (SELECT fid, m, ST_3DDistance(g,pca) AS dist FROM D WHERE (fid, id) IN (SELECT fid, a FROM E)) A ON P.fid = A.fid
    INNER JOIN
    (SELECT fid, m, ST_3DDistance(g,pca) AS dist FROM D WHERE (fid, id) IN (SELECT fid, b FROM E)) B ON P.fid = B.fid
My approach to these tasks is to try and build "up" from an innermost. So in this case, I started trying to write:
(SELECT fid, m, ST_3DDistance(g,pca) AS dist FROM D WHERE (fid, id) IN (SELECT fid, b FROM E)) B
The part I can't figure out is WHERE (fid, id) IN (SELECT fid, b FROM E), specifically the IN part...

seancorfield 2021-03-16T20:58:43.029400Z

user=> (sql/format {:where [:in [:composite :fid :id] {:select [:fid :b] :from :e}]})
["WHERE (fid, id) IN (SELECT fid, b FROM e)"]

seancorfield 2021-03-16T20:59:14.029900Z

(assuming you want the data structure version, not the helper version)

seancorfield 2021-03-16T20:59:57.030100Z

user=> (sql/format (where [:in (composite :fid :id) (-> (select :fid :b) (from :e))]))
["WHERE (fid, id) IN (SELECT fid, b FROM e)"]

2021-03-16T21:00:14.030600Z

Nice! Thank you again!

seancorfield 2021-03-16T21:01:19.031400Z

It’s probably worth pointing out that this is one of those places where the V2 support for symbols instead of keywords makes for a more readable piece of code

user=> (sql/format '{where [in [composite fid id] {select [fid b] from e}]})
["WHERE (fid, id) IN (SELECT fid, b FROM e)"]

seancorfield 2021-03-16T21:01:51.032100Z

(that was not possible in V1 although symbols were accepted in a few places)

seancorfield 2021-03-16T21:02:17.033Z

Obviously you can’t use a quoted structure if you have Clojure variables in the structure but…

seancorfield 2021-03-16T21:03:40.034400Z

(although you could use named parameters for that)

enn 2021-03-16T21:04:38.034800Z

perfect use case for quasi-quote

2021-03-16T22:46:57.037600Z

Recall my "today's V2 puzzle" above... Here is what I've got now:

(-> (select :fid :pca :dca)
    (from :P)
    (inner-join [(-> (select :fid :m [[:ST_3DDistance :g :pca] :dist])
                     (from :D)
                     (where [:in (composite :fid :id) (-> (select :fid :a) (from :E))]))
                 :A]
                [:= :P.fid :A.fid])
    (inner-join [(-> (select :fid :m [[:ST_3DDistance :g :pca] :dist])
                     (from :D)
                     (where [:in (composite :fid :id) (-> (select :fid :b) (from :E))]))
                 :B]
                [:= :P.fid :B.fid]))
Which results in this:
[
SELECT fid, pca, dca
FROM P
INNER JOIN (SELECT fid, m, ST_3DDISTANCE(g, pca) AS dist FROM D WHERE (fid, id) IN (SELECT fid, a FROM E)) AS A ON P.fid = A.fid INNER JOIN (SELECT fid, m, ST_3DDISTANCE(g, pca) AS dist FROM D WHERE (fid, id) IN (SELECT fid, b FROM E)) AS B ON P.fid = B.fid
]
This seems really close, but I can't figure out how to get the first SELECT to end with AS P It is supposed to be:
(SELECT fid, pca, dca FROM P) P

seancorfield 2021-03-16T23:03:59.038100Z

You don’t need it to be aliased tho’ since the table name is already P, right?

seancorfield 2021-03-16T23:06:39.040500Z

I’m not even sure how to get an alias on that without nesting it in a SELECTSELECT * FROM (SELECT fid, pca, dca FROM P) AS P which would be (-> (select :*) (from [(-> (select :fid :pca :dca) (from :p)) :p]) ...) I think @dcj?

seancorfield 2021-03-16T23:07:27.040700Z

user=> (sql/format (-> (select :*) (from [(-> (select :fid :pca :dca) (from :p)) :p])))
["SELECT * FROM (SELECT fid, pca, dca FROM p) AS p"]

seancorfield 2021-03-16T23:09:40.041400Z

(SELECT .. FROM ..) AS alias INNER JOIN .. is certainly not legal SQL as far as MySQL is concerned.

2021-03-16T23:12:21.043400Z

AFAICT, it is supposed to be:

(SELECT some-stuff FROM OTHER_P) P
INNER JOIN ...
INNER JOIN ...
AFAIK this is legal Postgres, my raw query above works fine

2021-03-16T23:15:48.044100Z

Had to edit the above 3 times to get it right

seancorfield 2021-03-16T23:19:23.044800Z

There’s no way to generate that with HoneySQL — the closest you’ll get is the nested select I showed above.

seancorfield 2021-03-16T23:19:46.045200Z

I can’t even think how to write that syntax in the DSL.

seancorfield 2021-03-16T23:20:09.046Z

SELECT some-stuff FROM OTHER_P AS P would be “equivalent” though, wouldn’t it?

seancorfield 2021-03-16T23:20:21.046400Z

(and that is expressible in the DSL)

2021-03-16T23:20:41.046600Z

That makes me feel better since I spent the last couple hours trying and failing

2021-03-16T23:22:34.048Z

I also think your point that "you already have P, so why alias it again" bears some thought, seems like I ought to be able to make my query work without that

seancorfield 2021-03-16T23:27:05.048600Z

I can’t see how it’s legal based on the Postgres docs either but, hey, maybe PG accepts syntax that isn’t documented?

2021-03-16T23:28:49.050Z

Don;t know if it matters, but all this ends up in a from

seancorfield 2021-03-16T23:31:49.050700Z

The FROM can alias the result — but it would be the entire result, not just that select/`from` part.

seancorfield 2021-03-16T23:35:26.052Z

Oh, I see what you mean. So it is (-> .. (from [ (-> (select :fid :pca :dca) (from :other_p)) :p ]) (inner-join .. ) )

seancorfield 2021-03-16T23:37:00.053100Z

because overall your query is select stuff from (select fid, pca, dca from p) as p ... right?

2021-03-16T23:38:46.053800Z

yes the original P was part of a with/CTE above

2021-03-16T23:43:19.054600Z

but the inner joins apply to the select/from within the larger from

seancorfield 2021-03-16T23:58:33.056Z

I’d have to see the whole raw query to be able to help further at this point.