datascript doesn't let you query old versions anyway, because it's all in memory
why can't you just transact new maps into entities as they become available?
could https://github.com/alandipert/intension help?
@thedavidmeister the data is basically a dump of a couple database tables. right now, it is just a snapshot of the tables at a point in time. the difference between the current version and the previous version might include updated entities, and without just having the delta of updated and created entities, we'd need to look at every map and create-or-update, no?
if you have no sense of "identity" for each map
every list of maps is a new list
there is no meaningful concept of "update"
they have IDs
ah cool
well then can you map the IDs to :db/id in datascript?
i don't know why i didn't think of that
but lol yes, i could
try it
i was adding tempids to everything like a bozo
i mean, i don't know if datascript supports your ids
they're ints
but if they're numbers i don't see why it wouldn't work
ints yeah
although it's probably technically not supported
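but something like this should work, roughly (untested sketch, made-up row data):

(require '[datascript.core :as d])

(def conn (d/create-conn {}))

(defn rows->tx [rows]
  ;; reuse each row's :id as its entity id so re-imports upsert instead of duplicating
  (map #(assoc % :db/id (:id %)) rows))

(d/transact! conn (rows->tx [{:id 1 :name "a"} {:id 2 :name "b"}]))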
what about across different tables though? if one of them has a colliding ID
again...
yeah
whatever your definition of identity is
if your definition of identity is the ID, and the IDs collide, then they are the same entity, by definition
i suppose i could get clever and do something like use an offset for each table to prevent collision
table 1: (+ (:id {:id 1}) 10000)
table 2: (+ (:id {:id 1}) 20000)
etc.
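e.g. a rough sketch (made-up offsets and table names; keeps ids positive, since negative ids read as tempids in datascript):

(def table-offsets {:table-1 10000000 :table-2 20000000})

(defn eid [table row]
  ;; assumes every per-table id stays below 10,000,000
  (+ (get table-offsets table) (:id row)))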
yeah
datascript does something like that internally to differentiate between eids and txn ids
but really, this is pretty contextual at this point
there's probably no perfect answer
though i do have 100s of thousands of some of these maps, so it might need to be more clever than that
that should still be fine
yes, basically im looking to try and fit a whole lot of batched data into a tight, queryable package
js max safe int is 2^53 - 1, about 9 * 10^15
can it be two different dbs?
per table, or are you talking about splitting and querying across dbs?
i will need to do joins occasionally across the imported data
this is getting to where i'm not 100% sure
i have a feeling it's possible to stick multiple dbs into a query
but i haven't done it
yes, i knew this was possible in datomic
not sure if datascript follows that
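if it does, it'd look roughly like this (unverified sketch, made-up attributes; assumes :order/user-id stores the user's eid):

(d/q '[:find ?name ?total
       :in $ $2
       :where [$  ?u :user/name     ?name]
              [$2 ?o :order/user-id ?u]
              [$2 ?o :order/total   ?total]]
     users-db orders-db)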
any thoughts on speeding up the import?
right now im just doing a (d/db-with (d/empty-db) (concat stuff more-stuff))
in a background thread, and swapping the value
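i.e. roughly (sketch, assuming jvm clojure, made-up names):

(def app-db (atom (d/empty-db)))

(future
  (let [db (d/db-with (d/empty-db) (concat stuff more-stuff))]
    ;; swap in the freshly-built immutable db value
    (reset! app-db db)))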
hmmm
so you're building a new db from scratch every time
yes, and so i guess what you mentioned above would be worth trying.
why not just (transact! conn [...])
provided you have the db/id bit working, it should ignore anything that is the same
i have no idea which is faster
but transact! certainly seems more idiomatic
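quick way to sanity-check the "ignores anything that is the same" part (sketch):

(def conn (d/create-conn {}))
(d/transact! conn [{:db/id 1 :name "a"}])
;; re-transacting the identical map should report no new datoms
(:tx-data (d/transact! conn [{:db/id 1 :name "a"}]))
;; => [] if nothing changed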
finding out now
loading 150k maps takes ~42 seconds
basically the same with d/db-with
it's probably doing similar things under the hood
what about incremental updates?
is it any faster than the initial load?
i only added one record, but it didn't appear to be any faster, so no
it took 42 seconds to add 1 record?
oh, no, what im saying is i had a collection, and i changed a record in it but left its ID the same, and then ran transact! over the whole collection again
right
i was curious to see if it'd be able to quickly tell it didn't need to do anything with most of the maps
i wonder how long it takes to compare those maps outside datascript
maybe you can do the comparison yourself and just transact the diff
yeah
i also wonder if the types on the value side of a map have anything to do with performance
like, clojure.set/difference
or something
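e.g. (sketch, made-up names):

(require '[clojure.set :as set])

(defn changed-rows [old-rows new-rows]
  ;; whole-map set difference: new and modified rows show up here,
  ;; deletions would need the reverse difference plus retractions
  (set/difference (set new-rows) (set old-rows)))

(d/transact! conn (map #(assoc % :db/id (:id %)) (changed-rows old-dump new-dump)))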
like are instants more expensive?
maybe, i'd imagine iso8601 strings are faster to compare, but i don't know
I can probably slim some of these maps down.
I bet that would have some appreciable impact
yeah i mean, at some point you just run into the limits of comparing lots of data
From the datascript tutorial i'm reading:
> Datom has (-hash) and (-equiv) methods overloaded so only [e a v] take part in comparisons. It helps to guarantee that each datom can be added to the database only once.
Does this mean that a retracted datom cannot be added after retraction?
Are retraction and excision synonymous?
> When datom is removed from a DB, there's no trace of it anywhere. Retracted means gone.
That makes it sound like excision and retraction are the same, but perhaps I'm just confused by the role of :added in light of the comment about [e a v] comparisons.
datascript has no history
i think a lot of the datascript API is to line up with datomic API rather than achieve a specific goal for datascript itself
if you wanted to sync datascript with datomic via transaction reports you'd need :added i think
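e.g. via listen!, where each datom in the report carries :added (sketch):

(d/listen! conn :log
  (fn [tx-report]
    (doseq [datom (:tx-data tx-report)]
      ;; :added is true for an assertion, false for a retraction
      (println (if (:added datom) "assert" "retract")
               (:e datom) (:a datom) (:v datom)))))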
@devn There is also conn-from-datoms
in case you want to manually import. It's pretty fast. I use it to bootstrap my DB on CLJS.
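e.g. (sketch, made-up datoms):

(def bootstrapped-conn
  (d/conn-from-datoms
    ;; builds the indexes directly instead of going through transact!
    [(d/datom 1 :name "a")
     (d/datom 2 :name "b")]))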
Datomic maintains history; you could rewind and look at the database at a given time, meaning all datoms retracted between that previous time and now would be visible again. Retraction is basically adding new information: "This datom no longer holds true; it used to up 'til now, but now it does not." Excision is much more dangerous, because it removes any trace that the datom was ever there, both currently and historically. Datascript maintains no history, as thedavidmeister said, so it doesn't necessarily make that distinction
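so in datascript the retraction syntax mirrors datomic's, the datom just disappears for good, e.g. (sketch):

(d/transact! conn [[:db/retract 1 :name "a"]])
;; or drop an entire entity:
(d/transact! conn [[:db.fn/retractEntity 1]])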