datomic

Ask questions on the official Q&A site at https://ask.datomic.com!
nando 2020-10-04T00:03:23.206800Z

I'm getting an inconsistent result using the sum aggregate function on dev-local. If I include only the sum function, the result is much less than it should be. If I add the count function to the same query, the result of the sum function is then correct. Here's the query with only the sum function. There are multiple batch items per batch and I need the total weight of all batch items.

[:find ?e ?formula-name ?doses ?date (sum ?weight) 
         :keys e formula-name doses date total-weight 
         :in $ ?e
         :where [?e :batch/formula ?fe]
         [?fe :formula/name ?formula-name]
         [?e :batch/doses ?doses]
         [?e :batch/date ?date]
         [?bi :batch-item/batch ?e]
         [?bi :batch-item/weight ?weight]]
=> :total-weight 1027800

nando 2020-10-04T00:12:39.209700Z

Here's the query with both the sum and count aggregate functions:

[:find ?e ?formula-name ?doses ?date (sum ?weight) (count ?bi)
         :keys e formula-name doses date total-weight count
         :in $ ?e
         :where [?e :batch/formula ?fe]
         [?fe :formula/name ?formula-name]
         [?e :batch/doses ?doses]
         [?e :batch/date ?date]
         [?bi :batch-item/batch ?e]
         [?bi :batch-item/weight ?weight]]
=> :total-weight 2009250, :count 45 I've confirmed that 2009250 is the correct amount. What am I not understanding here?

Joe Lane 2020-10-04T00:14:31.210700Z

[:find ?e ?formula-name ?doses ?date (sum ?weight)
         :keys e formula-name doses date total-weight
         :with ?bi
         :in $ ?e
         :where [?e :batch/formula ?fe]
         [?fe :formula/name ?formula-name]
         [?e :batch/doses ?doses]
         [?e :batch/date ?date]
         [?bi :batch-item/batch ?e]
         [?bi :batch-item/weight ?weight]]

Joe Lane 2020-10-04T00:14:37.211Z

What does that return?

nando 2020-10-04T00:15:00.211400Z

Reading now .... so duplicates are being excluded from the sum?

nando 2020-10-04T00:17:24.211900Z

It's correct now!

nando 2020-10-04T00:18:33.212500Z

That's subtle. Thanks @lanejo01

Joe Lane 2020-10-04T00:19:02.213Z

Does the concept of a set vs a bag make sense to you from the docs?

nando 2020-10-04T00:21:10.214600Z

I understood immediately that duplicates might be excluded from (sum ...) when I saw the example, but that's not what one would expect from a sum function.

nando 2020-10-04T00:21:52.215600Z

2 + 2 = 2 ???

nando 2020-10-04T00:23:07.217Z

So I think it might be good to point this out in the sum section of the documentation (if it isn't there already)

Joe Lane 2020-10-04T00:24:19.217800Z

It's not related to the sum aggregate though, it's related to whether or not you want a bag vs a set of the ?bi lvar.

Joe Lane 2020-10-04T00:24:41.218300Z

It's a more general concept.

nando 2020-10-04T00:24:43.218400Z

;; query
[:find (sum ?count) 
 :with ?medium
 :where [?medium :medium/trackCount ?count]]
I see an example is in there, but I didn't understand the signficance.

nando 2020-10-04T00:27:32.220900Z

I understand it doesn't only apply to the sum aggregate. I'm only saying that if it has a non-obvious impact on a specific aggregate function, it might be helpful for beginners like me to point that out.

seancorfield 2020-10-04T00:30:22.222200Z

Interesting. I hadn't learned enough about Datomic to realize it specifically deals in sets by default instead of bags...

nando 2020-10-04T00:34:35.223100Z

It is still quite vague to me when a query would return a set.

nando 2020-10-04T00:36:42.224600Z

I guess it has to be kept always in mind, because as the example in the documentation on With Clauses shows, it isn't only an issue with some aggregate functions.

nando 2020-10-04T01:03:02.229800Z

@lanejo01 Here's a specific suggestion for the docs that might help to make this more clear for beginners. In the subsection on sum where it says "The following query uses sum to find the total number of tracks on all media in the database." You might change that to something like "The following query uses sum to find the total number of tracks on all media in the database. Note carefully the use of the with-clause in the query so that all trackCounts are summed. If the with-clause is excluded, only unique trackCounts will be summed."