The basic architecture of most of the Clojure web apps I work on, starting from the bottom up, looks like this:
- Database layer -- next.jdbc / HugSQL functions.
- Service layer -- functions that wrangle the data the database layer returns into a shape that's more useful to consumers (often entity-id->entity maps or something like that).
- Handler layer -- Ring handlers that call the service layer and wrap the return values into Ring response maps. The main handler often has Ring middleware that adds a database transaction into a Ring request map. That then gets propagated all the way down to the database level.
This is nice and simple and generally works quite well. Integration tests generally use a dedicated test database and rollback transactions, so the testing story is pretty good, too.
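A minimal sketch of the three layers described above, assuming an atom as a stand-in for the database so that the example runs without any dependencies. The names `fake-db`, `select-users`, `users-by-id`, `wrap-transaction`, and `run-in-tx` are made up for illustration and are not from any library.

```clojure
;; Stand-in for a real database: an atom holding rows.
;; In the setup described above, `tx` would be a next.jdbc transaction instead.
(def fake-db (atom [{:id 1 :name "ada"} {:id 2 :name "grace"}]))

;; Database layer: returns raw rows.
(defn select-users [tx]
  @tx)

;; Service layer: reshapes the rows into an entity-id->entity map.
(defn users-by-id [tx]
  (into {} (map (juxt :id identity)) (select-users tx)))

;; Handler layer: calls the service layer and wraps the result in a Ring response map.
(defn list-users-handler [{:keys [tx]}]
  {:status 200
   :body   (users-by-id tx)})

;; Middleware that adds a transaction to the Ring request map.
;; `run-in-tx` is a hypothetical function that takes a one-argument function
;; and calls it with a transaction object.
(defn wrap-transaction [handler run-in-tx]
  (fn [request]
    (run-in-tx (fn [tx] (handler (assoc request :tx tx))))))

(def app
  (wrap-transaction list-users-handler (fn [f] (f fake-db))))

(app {})
```

With next.jdbc, `run-in-tx` could plausibly wrap `next.jdbc/with-transaction`, and a test fixture could pass the `:rollback-only` option so that nothing is committed. Note how `tx` travels from the middleware all the way down to `select-users`.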
However, one issue with this approach is that the database abstraction leaks all the way up to the Ring handler layer. In practice, it hasn't been a huge problem, but I'm curious about alternative solutions. One option would be to add something like a Repository interface using protocols and inject that into the Ring request map instead of the database transaction object.
The problem there, though, is that Clojure editors generally don't support navigating from protocol definitions to implementations and vice versa, so moving around becomes a bit of a pain. Also, I feel somewhat averse to using protocols for things other than interop. Any thoughts regarding alternative solutions?
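For reference, the Repository idea could look roughly like this. It is only a sketch: `UserRepository`, `InMemoryUserRepository`, `wrap-repository`, and the `:user-repo` request key are hypothetical names, and a production implementation would talk to the database instead of an atom.

```clojure
;; The logical interface the handlers depend on.
(defprotocol UserRepository
  (find-user [repo id] "Returns the user with the given id, or nil.")
  (save-user! [repo user] "Stores the user and returns it."))

;; An in-memory implementation; a SQL-backed record would implement the same protocol.
(defrecord InMemoryUserRepository [state]
  UserRepository
  (find-user [_ id]
    (get @state id))
  (save-user! [_ user]
    (swap! state assoc (:id user) user)
    user))

;; Middleware that injects the repository instead of a transaction object.
(defn wrap-repository [handler repo]
  (fn [request]
    (handler (assoc request :user-repo repo))))

;; The handler only knows about the protocol, not about the database.
(defn get-user-handler [{:keys [user-repo params]}]
  (if-let [user (find-user user-repo (:id params))]
    {:status 200 :body user}
    {:status 404 :body {:error "not found"}}))
```

The handler no longer sees anything database-specific, at the cost of one more layer to navigate.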
I think if you want to make progress on addressing the issue, you really have to unpack "However, one issue with this approach is that the database abstraction leaks all the way up to the Ring handler layer".
- How does the leakiness manifest?
- What kind of logical interface is required?
- What kind of logical interface does the database provide?
- Which parts of the code base are incidental to the domain problem? What are they doing?
- etc.
I guess I'm trying to gauge whether I'm seeing a problem where there actually isn't one. I'm also curious to hear what the basic architecture of other Clojure web apps is like in this respect.
The leakiness manifests simply in that the Ring handler layer must be aware of the database layer specifically because it must create the database transaction object.
@flowthing my 2 cents: you’ve correctly identified a problem, but I think you’re not going to solve it as long as you’re using a traditional SQL database. Ultimately the job of most backends (and yours sounds like part of the majority, nothing wrong with that) is pretty simple: move some data from point A to point B and from B to A, while applying some transformations along the way. Adding more abstraction layers to that transformation is, IMHO, a fool’s errand; it might feel like work or like solving a problem, but it probably isn’t. From my little experience with Datomic-like databases (Datomic, DataScript, et al.) I can say that if you choose to use them, you will eliminate most of that complexity. Of course, choosing them might have other trade-offs (cost, operational complexity, correctness, etc.), but I believe this is where the core of the problem lies: you have some implicit or explicit data model shape in Clojure, most likely a tree or a graph, and by using a traditional SQL database you’re trying to shove that shape into what’s ultimately a relatively rigid structure of 2-dimensional tables. It’s possible to do, but it is cumbersome. Now, if you design your whole database model from scratch and enforce a structure that is directly and repeatably translatable into your data model in Clojure and vice versa, you might be able to eliminate the complexity. I’ve tried doing that with some reasonable success in the past, see https://github.com/raspasov/immutable-sql (I don’t advocate using this library since I haven’t used it in some time and haven’t updated it).
Are your web apps super standard CRUD apps where your database tables more or less map to what's shown in the UI, i.e. table / detail view? If so, stay lean. But if you are doing some weird/complex stuff, you may want to experiment with building a domain core.
@raspasov Thanks! Yeah, the question of whether I’d gain anything by adding some kind of repository abstraction is the main point. I have no experience with Datomic et al, sadly, so I have no opinion on whether they’d help here. I have no doubt they might, though.
@enricoteterra Well, I don’t know about super standard, but I guess there’s nothing too weird going on. So I guess it’s more of the former. What do you mean by a domain core?
I would say, unlikely, unless you design the whole thing from scratch with that in mind; and then again; if you decide at some point that you don’t like the abstraction tower you’ve built-up, you’re kinda screwed… so use caution 🙂 I arrived at what’s in that lib, immutable-sql, after quite some iterations… I would say the ideas are very much worth a look (even if the library itself is somewhat dated in terms of drivers and it’s Postgres specific)
I will take a look, thanks. 🙂
Effectively, it allows you to have “immutable” tables… and you use a function to update them, very much in the spirit of Clojure’s (swap! …)
So it’s a very direct mapping of Clojure data model to what’s in your database… the piece that’s really missing is the whole tree/graph part where if you have {:username “raspasov” :address {:street “123 Main”}} … it will not automatically insert those in two tables (something that Datomic/DataScript graph database allows you to do easily)
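To make the missing piece concrete: with plain SQL, something has to split the nested map into one row per table before inserting. A naive sketch of that manual step, where `split-nested` and the `:row` / `:children` keys are made up for illustration (this is not how immutable-sql or Datomic work, and it ignores foreign keys, insert ordering, and collections of children):

```clojure
(defn split-nested
  "Splits an entity map into its scalar attributes (:row) and its
  nested maps (:children), keyed by the attribute that held them."
  [entity]
  (let [child? (fn [[_ v]] (map? v))]
    {:row      (into {} (remove child?) entity)
     :children (into {} (filter child?) entity)}))

(split-nested {:username "raspasov" :address {:street "123 Main"}})
;; => {:row {:username "raspasov"}, :children {:address {:street "123 Main"}}}
```

Each piece would then need its own insert, plus the reverse transformation when reading the data back.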
It’s all possible in SQL, but I haven’t seen a good solution that’s not a whole bunch of ORM insanity
IMO, the big win of something like Datomic is that it almost completely gets rid of the transformation part for a standard CRUD backend… you have some data, in Clojure, which goes to the database a-la carte, saves, updates, etc - all taken care of; It ALSO comes back in the same shape! Almost no transformation needed.
The downside is, of course, that it's a proprietary product, there are fewer resources online, etc.
I’ve wanted to build something that does this automagically in SQL, but once you get going, there’s so many SQL edge cases…
And there’s different tricks that are available for each DB, so it’s hard to make it general + performant. If it’s Postgres or MySQL specific perhaps easier to make nice to use + fast.
@flowthing which SQL database do you use?
MariaDB (ugh) and SQL Server.
Haven’t used either… Have used MySQL (a lot, but some years ago), and Postgres (more recently)
Does either of those have the ability to:
- add columns/indices on the fly without locking up the whole table?
- create partial indices?
That was a big difference when I moved from MySQL to Postgres 🙂
I haven't had the need for either, so I'm not sure.
(but I’ve heard that MySQL has recently gained some of those abilities)
Years ago, a 100% guaranteed way to crash your system was adding a column to a big and hot MySQL table lol
Would take minutes… and the whole table gets locked for writing.
The solution was… Add a new table! And link them by ID…. Lol
You end up with a bunch of user_x, user_y, user_whatever tables.
I don't know when I'd ever need to add a new attribute to a relation other than in a migration on startup. 🙂
So your domain/model is pretty static? Doesn’t actively grow I assume.
Not so actively that I'd need to be doing something like that, no. 🙂
Got it.
But yeah, even the thought of doing something like that makes me want to reach for something other than a SQL database.
Or perhaps consider using the Postgres JSON datatype or something.
It is a trade-off… I really think Datomic has so much potential, but databases are one of the most conservative areas of a tech stack (and for good reasons) - people are veeeery slow to adopt new things nowadays since there’s a bunch of very well tested options.
Yes, Postgres has some escape hatches like that, but ultimately, it’s a hack… If you start shoving everything into JSON you might end-up with a NoSQL mess….
Sure. If there's something I've learned, it's that everything is a trade-off. 🙂 And yeah, that's exactly why I'm not sure I'd even have the option of picking Datomic even if I wanted to.
“Everything is a trade-off” Amen to that 🙂
@flowthing a domain core is part of domain-driven design (DDD), but DDD is only advised if you have a complex enough problem. For example: my business has thousands of container ships and I need an application that coordinates which freight goes where worldwide (an example from the book).
I see, thanks.
Two things come to mind reading this thread. The first is an approach like component (or mount or integrant), which allows you to handle dependencies cleanly. This, of course, does not magically solve the "leaking" part. The other is the onion (or hexagonal or clean) architecture, where you basically reverse the dependency. Protocols could indeed be one way to implement this in Clojure.
I'm using Clip, which helps, but is not sufficient to solve the issue, as far as I can see -- I'd still need a repository abstraction (or equivalent).
One of the projects I maintain uses the Clean architecture, and I must say it's immensely painful to work on.
I've seen a number of heated discussions about protocols (also in the context of component) but I think they can be very useful.
That's not a condemnation of the architecture pattern itself -- I'm sure it could've been implemented differently. But the current implementation is chock full of protocols and their implementations, and navigating the codebase without any of the aids that are there with statically-typed languages is extremely laborious.
Ah, the joys of dependency injection and magical layers. Dan North would probably recommend to "just write simple code" (cf. https://dannorth.net/2021/03/16/cupid-the-back-story/)
@flowthing One approach i've taken is to use a "system map"
this is done in Component and other libraries, but you don't need those to apply the concept
just at the start of the server, make and initialize a map like
{:db    ...db pool instance ...
 :redis ... redis connection pool ...
 :etc   ... some service object ...}
you can then either
1. Inject it into your ring handlers and use components explicitly
(defn handler [{:keys [system]} request]
  (db/in-transaction (:db system) .... request ....))
2. Inject it into your ring handlers and pull out components explicitly, but delegate to a "service layer" namespace
(defn handler [{:keys [system]} request]
  (s-stuff (:db system) ...request...))
3. Inject it into your ring handlers and pass it opaquely to a "service layer" namespace
(defn handler [{:keys [system]} request]
  (s-stuff system ...request...))
4. Bind it to a dynamic or constant var at startup and have other parts of your code look for the global
(def ^:dynamic *system* nil)
(alter-var-root #'*system* ...)
(defn handler [request]
  (s-stuff ... implicitly looks at *system* and pulls out what it wants ...))
it's DI in that any given component only cares about the keys it is looking for
in the last 2, the database abstraction doesn't leak. With the opaque map the abstraction is "big blob of stuff" and with the (dynamic?) var it's all hidden
@emccue That's interesting, thanks! I'm using Clip, so I do have a system map, but it hadn't occurred to me to inject the whole thing. I guess I could even use middleware to add the whole system map to every Ring request and just destructure it from there. 🤪
I usually don't really go for dynamic vars, but some variation of 1–3 might be worth considering.
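The middleware variant mentioned above could be sketched like this. The system map contents, `wrap-system`, `greeting`, and the `:system` request key are placeholders for illustration; with Clip or Component the map would hold real connection pools.

```clojure
;; Placeholder system map; real values would be connection pools, clients, etc.
(def system
  {:db    {:kind :fake-db-pool}
   :redis {:kind :fake-redis-pool}})

;; Middleware that adds the whole system map to every Ring request.
(defn wrap-system [handler system]
  (fn [request]
    (handler (assoc request :system system))))

;; Service layer: receives the opaque system and pulls out only what it needs.
(defn greeting [system user-name]
  (let [{:keys [db]} system]
    (str "hello, " user-name " via " (name (:kind db)))))

;; The handler passes the system along without looking inside it (option 3).
(defn greet-handler [{:keys [system params]}]
  {:status 200
   :body   (greeting system (:name params))})

(def app (wrap-system greet-handler system))

(app {:params {:name "ada"}})
;; => {:status 200, :body "hello, ada via fake-db-pool"}
```

The handler never mentions the database; only `greeting` knows which keys of the system it depends on.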
yeah - it feels somewhat obvious afterwards
at least to me
> The problem there, though, is that Clojure editors generally don't support navigating from protocol definitions to implementations and vice versa
I think cursive does
And I don't think implementing it in Emacs would take more than a couple hours (perhaps 1 day if you're not much into CIDER hacking)
A good recipe that works for me is implementing it in pure Clojure first and then creating thin glue for bridging said code with emacs
cider-nrepl-sync-request:eval is your friend :)
and rewrite-clj or clj-kondo's analysis might also save some work
also #lsp folks might have a recipe or even be willing to implement the feature (which you could either use directly, or programmatically if you're not much into LSP)
Cursive didn’t the last time I used it. clojure-lsp would be the correct place to implement something like that.
I think if implementing it were that easy, it would’ve been done already, but I could be wrong. 🙂 Or maybe people just don’t use protocols enough to have made a big enough fuss about it.
A lot of things are relatively easy, but contributors' time/energy are (rightfully) the bottleneck. And of course every new feature increases the maintenance burden. Personally I have quite a lot of features, bugfixes, etc. implemented in my personal clojure-mode, cider, clj-refactor (.el code) or cider, compliment, refactor-nrepl (.clj code) forks. You might be surprised at how accessible that code actually is (you don't need to permanently fork; that's more of an odd choice of mine).
> https://github.com/cursive-ide/cursive/issues/437 remembered badly then, sorry
I’ve been writing an interactive development environment for Clojure from scratch for the past year, so I’ve fiddled with some things in that domain. Not protocols specifically, though, so dunno about those. Some things really are surprisingly easy, as you say, but for things like finding function usages there’s no built-in API. The hack that I think CIDER has doesn’t find all usages, for instance, I think.
Anyway, this is getting somewhat off-topic. 🙂
> Anyway, this is getting somewhat off-topic.
Tooling is a fairly relevant consideration because, as hinted in the OP, people can decide go/no-go depending on it. The funny part is that it's not that much work... someone simply has to do it.
Here's protocol detection using tools.analyzer (which clj-refactor uses): https://github.com/jonase/eastwood/blob/7672782d748bcc15c0ea5bf1d913c3d47c9e0810/src/eastwood/linters/unused.clj#L66 (tools.analyzer works wonderfully well for small-to-medium projects... then it generally starts to become too slow to be practical).
Since you mentioned it, here's 'function usages' detection using clj-kondo (borkdude shared the original snippet I derived this from): https://github.com/reducecombine/.lein/blob/3c539770447a599b1ef1f9432e961cf9e4c808a4/scripts/vemv/usages.clj
And here is an issue that would allow implementing something analogous for defprotocols: https://github.com/clj-kondo/clj-kondo/issues/405
Thanks! Well, let’s hope people can get their hands on that some day. 🙂 I’ll definitely check out tools.analyzer.
Looks like your comment in that issue already got the wheels rolling. Exciting stuff! I think it’d be a real boon to every LSP-using editor if that got implemented. 👍
Feel like I’m coming a bit late to this but we have 110K lines of Clojure at work and a lot of the core of it is essentially fancied-up CRUD. We do not run a transaction for every request — we hardly use transactions at all. We use Component and have the whole “Application” Component injected into the Ring request by middleware so it’s an opaque blob passed down to the system/services layer that needs to peel it apart to get at parts of it for DB access, caches, various API services etc.
But we have also taken a pretty pragmatic approach that a lot of simple DB access (such as next.jdbc.sql friendly function calls) might as well just be at the edges of the handlers since, well, it’s all tied to the DB anyway so we gain very little benefit from trying to abstract it away.
We’ve tried various approaches of abstracting away the persistence stuff and it’s mostly just not worth the effort: it complicates the code, it makes certain things a lot more work and/or takes a lot more code. Because we build the database connections via Component at startup, it’s easy to have throwaway dev/test DBs locally. We have multiple connection pools and three schemas — “databases” in MySQL — and some apps bridge between our staging content DB and our production content DB.
At one point I wrote https://github.com/seancorfield/engine to try to abstract out “sources” and “sinks” (specifically Queryable and Committable) as a way to try to have pure business logic and be able to test that code without needing actual access to a database or 3rd party APIs etc. The code ended up being very monadic / non-idiomatic for Clojure and just being far more work than any gains we got from it. The README talks about that.
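The general idea of keeping business logic pure and pushing writes to the edge can be sketched as follows. This is not Engine's actual API: `rename-user`, `commit!`, and the `[table id changes]` update format are invented here, and an atom stands in for the sink.

```clojure
;; Pure business logic: takes data in, returns the result plus a
;; description of the writes it wants, without performing any I/O.
(defn rename-user [user new-name]
  {:result  (assoc user :name new-name)
   :updates [[:users (:id user) {:name new-name}]]})

;; The edge: applies the described updates to a sink.
;; Here the sink is an atom shaped like {table {id row}}.
(defn commit! [sink updates]
  (doseq [[table id changes] updates]
    (swap! sink update-in [table id] merge changes)))

(def sink (atom {:users {1 {:id 1 :name "ada"}}}))

(let [user                     (get-in @sink [:users 1])
      {:keys [result updates]} (rename-user user "Ada L.")]
  (commit! sink updates)
  result)
;; => {:id 1, :name "Ada L."}
```

`rename-user` can be tested with plain maps, but every operation now has to thread its updates around, which hints at how the extra ceremony could add up in a larger code base.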
@seancorfield Thank you, that's very informative! It also mirrors the experiences I've had. I rewrote one of the apps I maintain to have a repository-like protocol thing at one point, but I eventually ended up throwing it away because I felt the benefits just weren't there. Injecting the system into the Ring request does indeed feel like a smart and idiomatic way of going about it. I think I'll give it a shot. Whether everything needs a transaction is also something I'll need to think about.
Coincidentally, I wonder if there are any blog posts or articles on this sort of thing. As in, how to architect your Clojure web app. I feel like there must be, but I just haven't come across any. 🤷
In any case, thanks to everyone for your thoughts! Discussions like this are immensely useful for a solo developer. 👍
What I like about protocols is that one can create a test-only implementation that was non-IO based and therefore sped up / simplified test setup significantly.
...that is of course to be weighed against the cost of a SQL abstraction. Probably many of us have gotten that wrong at some point :)
with honeysql being a thing in particular, I think it's reasonably possible to create a SQL façade that can replace jdbc with pure functions (e.g. in tests, if you receive the exact honeysql {:select ... input, return the fixed [[ ... output)
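A rough sketch of that façade idea, relying only on the fact that HoneySQL queries are plain maps. `canned-query-fn` and `active-user-names` are hypothetical names; in production the `run-query` argument might be something like a function that formats the map with `honey.sql/format` and runs it with `next.jdbc/execute!`.

```clojure
(defn canned-query-fn
  "Returns a function that answers a query map with the fixed rows
  registered for exactly that map, and throws on anything else."
  [responses]
  (fn [query]
    (if (contains? responses query)
      (get responses query)
      (throw (ex-info "Unexpected query" {:query query})))))

;; Code under test depends on a `run-query` function, not on jdbc directly.
(defn active-user-names [run-query]
  (mapv :name
        (run-query {:select [:name]
                    :from   [:users]
                    :where  [:= :active true]})))

;; In a test, the exact query map is paired with fixed output rows.
(def run-query
  (canned-query-fn
    {{:select [:name] :from [:users] :where [:= :active true]}
     [{:name "ada"} {:name "grace"}]}))

(active-user-names run-query)
;; => ["ada" "grace"]
```

The test is coupled to the exact shape of the query map, so any refactoring of the query breaks it even when the SQL semantics stay the same.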
If you have a DB-heavy app, you have to have a pretty enormous protocol — or a lot of little protocols — and it just isn’t worth it, IMO. If you have only a few tables and a handful of standard access patterns, maybe. We have about 300 database tables…
It might depend? I agree that creating Widget, Invoice, Thing protocols, one per 'model', and each with 4 crud protocol methods might not be that great.
But a single protocol abstracting the whole jdbc+honeysql combo, generically, might be cheap enough
That was essentially what Engine did through Queryable and Committable. Wasn’t worth it on a complex app.
In the applications I maintain, at least, tests that operate on the database are plenty fast enough when running them in a rollback transaction.
Yes there's definitely a productive and pragmatic path in that direction 👍 Personally I tend to think along these lines https://widdindustries.com/cross-platform/ e.g. I should be able to write a backend app in .cljc, even if I don't intend to use node.js as an actual backend. Protocols help there (as would a pure-functional architecture). Is it a bit wasteful? Yes. Is it a good line of research with some tangible benefits at scale? Also yes :)
@seancorfield Your work on engine reminded me of this project https://github.com/rafd/tada/ that also tries to have a declarative approach to the query/command->transform->mutate workflow. Do you think there's a space for this kind of solution or is it just over-engineering?
@avocabio Hard to say. re-frame is basically an event-based system like that (and can be used on the server side with some limitations). Prior to Engine, I worked on several event-based libraries (mostly in other tech, pre-Clojure) and they seem good for some types of problem and a poor fit for others. Pretty much all the “reactive” stuff falls into this category, although not always in such a declarative form as tada. I kind of like it as an approach conceptually but have never been comfortable with it in the real world as system complexity scales: these systems are often hard to debug, in my experience.
https://github.com/oakes/odoyle-rules has a similar approach. There are a few examples with user interfaces. I tried to use it, but it sorta feels like building your own wonky database.
The concept of interceptors from pedestal is a similar idea: an "engine" that orchestrates flow, but with a smaller footprint. It's used to handle HTTP requests, but also re-frame events and malli schema decoders and encoders. What I find interesting is that all of those projects implemented the concept themselves, and they all end up being ~20 loc for the engine part.
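To give a feel for how small such an engine can be, here is a deliberately simplified sketch. It is not Pedestal's, re-frame's, or malli's implementation: it assumes interceptors are maps with optional `:enter` and `:leave` functions over a context map, and it leaves out early termination, error handling, and async.

```clojure
(defn execute
  "Threads ctx through every :enter function in order, then through
  every :leave function in reverse order."
  [ctx interceptors]
  (let [entered (reduce (fn [c {:keys [enter]}] (if enter (enter c) c))
                        ctx
                        interceptors)]
    (reduce (fn [c {:keys [leave]}] (if leave (leave c) c))
            entered
            (reverse interceptors))))

;; Two interceptors that record when they run, plus a terminal "handler".
(def a {:enter #(update % :trace conj :a-enter)
        :leave #(update % :trace conj :a-leave)})
(def b {:enter #(update % :trace conj :b-enter)
        :leave #(update % :trace conj :b-leave)})
(def handler {:enter #(assoc % :response {:status 200})})

(execute {:trace []} [a b handler])
;; => {:trace [:a-enter :b-enter :b-leave :a-leave], :response {:status 200}}
```

The trace shows the onion-like ordering: interceptors enter in sequence and leave in reverse.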