i'm sure it's for a very good reason and so i'm just curious: if Ion lambdas proxy requests to the compute/query groups, then what is the reason for running them in a JVM runtime rather than something with a quicker cold start?
i'm also asking because quite a few of my ion lambdas are synchronous where response time matters, and there are even some hard limits in AWS (for example Cognito has a fixed 5 second timeout on its post confirmation lambda trigger). i can use lambda concurrency to solve the problem at a price 🙂
have you tried http direct?
i have yes, and it works really well. in this case i'm referring to (ion) lambdas that should be lambdas by design: handling Cognito triggers, glueing together Step Functions and pipelines, handlers for AppSync resources etc.
unless i'm missing something and http direct can help there?
ok, I assumed you meant web… but nevermind 😄
glad to hear http direct works well, I’ve yet to try it out
I’ve been wondering about this myself. HTTP direct required the prod topology because of the NLB requirement, but would it be possible to spin up an NLB manually and use it with solo?
So I can use
[?entity1 ?attrname1 ?attrval]
[?entity2 ?attrname2 ?attrval]
in a :where clause to get ?entity1 and ?entity2 where there exists an ?attrval that matches?
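For context, a minimal sketch of a full query using that shape (the attribute names :a/one and :a/two are hypothetical placeholders):

```clojure
;; ?entity1 and ?entity2 unify on a shared ?attrval: any pair of
;; entities that have at least one value in common on these attributes.
;; Attribute names here are hypothetical.
(d/q '[:find ?entity1 ?entity2
       :where
       [?entity1 :a/one ?attrval]
       [?entity2 :a/two ?attrval]]
     db)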
---
I'm using
[(q '[:find (seq ?attrval)
:in $ ?entity ?attrname
:where [?entity ?attrname ?attrval]]
db ?entity1 ?attrname1) [[?attrvals1]]]
[(q '[:find (seq ?attrval)
:in $ ?entity ?attrname
:where [?entity ?attrname ?attrval]]
db ?entity2 ?attrname2) [[?attrvals2]]]
(not-join [?attrvals1 ?attrvals2]
[(seq ?attrvals1) [?element ...]]
(not [(contains? ?attrvals2 ?element)]))
to get ?entity1 and ?entity2 where all ?attrvals for ?entity1 exist for ?entity2. Is there a more performant way to do this?
(This feels like a directional "and" to the implicit "or" being applied to each ?attrval matching in the first case)
What should I make of finding tx-ids with no associated txInstant?
(:db/txInstant (d/entity db (d/t->tx t-time))) ;; => nil
Hey Cameron, I’m on mobile now so please forgive the brevity and any possible misunderstanding. Instead of two subqueries, try putting both of the where clauses from each subquery into one top-level query and then adding a final clause of [(not= ?attrval1 ?attrval2)]. I believe there is no need for the nested subqueries, the not-join, nor the boxing and unboxing via seq and [?element ...]. I’ll try and double-check this when I get back at a computer.
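If I'm reading the suggestion correctly, the flattened single query would be something like this (a sketch; the variable names and :in bindings are mine, not from the original message):

```clojure
;; One top-level query in place of the two subqueries: both where
;; clauses live at the same level and a final predicate clause filters
;; on the two values. Attribute names are passed in as hypothetical
;; inputs here.
(d/q '[:find ?entity1 ?entity2
       :in $ ?attrname1 ?attrname2
       :where
       [?entity1 ?attrname1 ?attrval1]
       [?entity2 ?attrname2 ?attrval2]
       [(not= ?attrval1 ?attrval2)]]
     db attrname1 attrname2)
```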
sweet! already got rid of the subqueries I think
I hope I’m understanding it correctly haha
the double not is used to produce an and essentially
so I am not sure how it would be achieved with only the not=
also tried
(not-join [?cat ?dog]
[?cat :cat/paws ?cat-paw]
(not-join [?paw ?dog]
[?dog :dog/paws ?dog-paw]
[?cat-paw :paws/smaller-than ?dog-paw]))
but it's timing out with large numbers of paws 😉
and ?cat and ?dog need to be bound, where they didn't need to be in the subqueries...
the above query testing that all the paws on the cat have a :paws/smaller-than relationship with any paw on the dog
hmm. we appear to be missing txInstants on the large majority of tx entities:
#_(let [end-t (d/basis-t db) ;; => current basis-t: 104753910
missing-tx-instant? #(nil? (:db/txInstant (d/entity db (d/t->tx %))))]
(count (filter missing-tx-instant? (range 0 end-t))))
;; => 84492058
this seems to be 10x slower than the original
Can I see the actual, full query you're trying to run?
Range probably isn't what you want. The contract is that T
is guaranteed to be increasing, not that it always increases by exactly 1.
ha! damn. You know I kept wondering about that assumption of mine. Thank you!!!
sheesh :face_palm: hence datomic.api/next-t
sure one sec
Internally there is a single T counter incremented for newly-minted entity ids (when a tempid needs a new entity id). transaction temp ids are just one of the consumers of that counter
so there is an invariant that for all entity ids in the system, none share a T
thanks for the inside scoop!
We haven't upgraded to have qseq, so I'm having to break a tx-ids query into a set of smaller ranges. I was calculating these smaller ranges with simple arithmetic -- is that still acceptable, or do I need to ensure that the start-t and end-t handed to tx-ids are bona fide t-times?
It should be fine, but why not something like
(->> (d/seek-datoms db :aevt :db/txInstant (d/t->tx start-t))
     (map :e)
     (map d/tx->t)
     (take-while #(< % end-t))
     (partition-all 10000))
i.e., just seek part of the :db/txInstant index to get the tx ids?
Ahh, nice. If I understand your implication, all that would be left is for me to get the start and end of each partition as follows:
I like this approach much better, thanks! Just to be clear though, it should be okay to fabricate a non-existent t-time near the target time when in doubt? It always appeared to work for me, but then maybe I was just being sloppy.
It depends on what you’re doing
d/tx-range, d/seek-datoms are ok with it, because they’re using your number to do array bisection
d/as-of and d/since are ok with it because they’re using it for filtering.
I was feeding each range to this ^
yeah should be fine
Thanks for the help!!
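To make the distinction above concrete, a quick sketch of using a fabricated t with those APIs (assumes a peer connection `conn`; `approx-t` is a hypothetical number that may not correspond to any actual transaction):

```clojure
;; A fabricated t near the target point is fine for these calls:
;; d/tx-range bisects the log by t, and d/as-of / d/since use t only
;; as a filter boundary, so neither requires t to name a real tx.
(def approx-t 104000000) ;; hypothetical, possibly non-existent as a tx

(d/as-of db approx-t)                  ;; db view as of that point
(d/since db approx-t)                  ;; db view since that point
(d/tx-range (d/log conn) approx-t nil) ;; log entries from that point on
```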
this would be more efficient without query though
why not use d/tx-range directly?
Sure, I'm not opposed to it. Would it just benefit readability or something more than that?
query needs to realize and retain the intermediate result sets. that’s why you were chunking in the first place, right?
d/tx-range is lazy
bingo. no qseq required
not sure qseq would help
it did in my testing
if I pass a month range to the function i shared above I run out of memory before I can process it
the same didn't occur with qseq
that’s surprising because qseq doesn’t AFAIK defer any processing except pull
qseq still needs to realize and retain the intermediate result sets like @favila is saying, it just supports lazy transformations (like pull) which can consume an enormous amount of memory when done eagerly
well, then I didn't test what I thought I was testing.
Your query is the same as this:
(->> (d/tx-range log start-t end-t)
(mapcat :data)
(filter #(contains? attr-ids (:a %)))
(map :e)
(distinct))
except this is evaluated lazily and incrementally, so memory use is bounded
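For example, the lazy pipeline can be consumed one id at a time instead of realizing the whole sequence up front (a sketch; `process!` is a hypothetical side-effecting function):

```clojure
;; doseq consumes the lazy seq incrementally, so nothing forces the
;; full result set into memory at once. Note `distinct` still retains
;; the set of ids seen so far. `process!` is hypothetical.
(doseq [eid (->> (d/tx-range log start-t end-t)
                 (mapcat :data)
                 (filter #(contains? attr-ids (:a %)))
                 (map :e)
                 (distinct))]
  (process! eid))
```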
Welp, color me doubly embarrassed then. I must have been testing with a larger range of time when I was using d/q than when I tested d/qseq