architecture

flowthing 2021-03-23T06:55:16.003700Z

The basic architecture of most of the Clojure web apps I work on, starting from the bottom up, looks like this:

- Database layer -- next.jdbc / HugSQL functions.
- Service layer -- functions that wrangle the data the database layer returns into a shape that's more useful to consumers (often entity-id->entity maps or something like that).
- Handler layer -- Ring handlers that call the service layer and wrap the return values into Ring response maps.

The main handler often has Ring middleware that adds a database transaction into a Ring request map. That then gets propagated all the way down to the database level.

This is nice and simple and generally works quite well. Integration tests generally use a dedicated test database and rollback transactions, so the testing story is pretty good, too.

However, one issue with this approach is that the database abstraction leaks all the way up to the Ring handler layer. In practice, it hasn't been a huge problem, but I'm curious about alternative solutions.

One option would be to add something like a Repository interface using protocols and inject that into the Ring request map instead of the database transaction object. The problem there, though, is that Clojure editors generally don't support navigating from protocol definitions to implementations and vice versa, so moving around becomes a bit of a pain. Also, I feel somewhat averse to using protocols for things other than interop.

Any thoughts regarding alternative solutions?
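
For readers who want to picture the three layers and the transaction middleware described in the message above, here is a minimal sketch. It is not the poster's code: `run-in-tx`, `select-users`, `users-by-id`, `list-users-handler` and `wrap-tx` are all hypothetical names, and `run-in-tx` is only a stand-in for a real transaction facility such as next.jdbc's `with-transaction`.

```clojure
;; Hypothetical stand-in for a real transaction facility.
;; It simply calls f with a fake transaction object.
(defn run-in-tx [datasource f]
  (f {:datasource datasource :in-tx? true}))

;; Database layer: a real app would run SQL against tx here.
(defn select-users [tx]
  [{:id 1 :name "Ada"} {:id 2 :name "Grace"}])

;; Service layer: reshape the rows into an entity-id->entity map.
(defn users-by-id [tx]
  (into {} (map (juxt :id identity)) (select-users tx)))

;; Handler layer: call the service layer and wrap the result
;; in a Ring response map.
(defn list-users-handler [request]
  {:status 200
   :body   (users-by-id (:tx request))})

;; Middleware: open a transaction per request and add it to the
;; request map under the (assumed) :tx key.
(defn wrap-tx [handler datasource]
  (fn [request]
    (run-in-tx datasource
               (fn [tx] (handler (assoc request :tx tx))))))

(def app (wrap-tx list-users-handler ::fake-datasource))
```

Calling `(app {:request-method :get :uri "/users"})` returns a response whose body is keyed by user id. Note that the handler has to know about the `:tx` key, which is the leak the message is asking about.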

phronmophobic 2021-03-23T07:09:56.003800Z

I think if you want to make progress on addressing the issue, you really have to unpack "However, one issue with this approach is that the database abstraction leaks all the way up to the Ring handler layer".

• How does the leakiness manifest?
• What kind of logical interface is required?
• What kind of logical interface does the database provide?
• Which parts of the code base are incidental to the domain problem? What are they doing?
• etc.

flowthing 2021-03-23T07:16:42.004100Z

I guess I'm trying to gauge whether I'm seeing a problem where there actually isn't one. I'm also curious to hear what the basic architecture of other Clojure web apps is like in this respect.

flowthing 2021-03-23T07:21:03.004300Z

The leakiness manifests simply in that the Ring handler layer must be aware of the database layer specifically because it must create the database transaction object.

raspasov 2021-03-23T08:04:40.004500Z

@flowthing my 2 cents: you’ve correctly identified a problem, but I think you’re not going to solve it as long as you’re using a traditional SQL database. Ultimately the job of most backends (and yours sounds like a part of the majority, nothing wrong with that) is pretty simple: move some data from point A to point B and from B to A, while applying some transformations along the way. Adding more abstraction layers into that transformation is, IMHO, a fool’s errand; it might feel like work or solving a problem, but it probably isn’t.

From my little experience with Datomic-like databases (Datomic, DataScript, et al) I can say that if you choose to use them, you will eliminate most of that complexity. Of course, choosing to use them might have other trade-offs (cost, operational complexity, correctness, etc etc) but I believe this is where the core of the problem lies: ultimately you have some implicit or explicit data model shape in Clojure, which is most likely a tree, or a graph. By using a traditional SQL database, you’re trying to shove that data model shape into what’s ultimately a relatively rigid structure of 2-dimensional tables. It’s possible to do, but it is cumbersome.

Now if you design your whole database model from scratch and enforce a structure that is directly and repeatably translatable into your data model in Clojure and vice versa, you might be able to eliminate the complexity. I’ve tried doing that with some reasonable success in the past, see https://github.com/raspasov/immutable-sql (I don’t advocate using this library since I haven’t used it in some time and haven’t updated it).

2021-03-23T08:05:51.005Z

are your web apps super standard crud apps where your database tables more or less map to what's shown in the UI? ie table / detail view? if so stay lean. but if you are doing some weird/complex stuff you may want to experiment with building a domain core.

💯 1
flowthing 2021-03-23T08:08:02.005200Z

@raspasov Thanks! Yeah, the question of whether I’d gain anything by adding some kind of repository abstraction is the main point. I have no experience with Datomic et al, sadly, so I have no opinion on whether they’d help here. I have no doubt they might, though.

flowthing 2021-03-23T08:08:48.005400Z

@enricoteterra Well, I don’t know about super standard, but I guess there’s nothing too weird going on. So I guess it’s more of the former. What do you mean by a domain core?

raspasov 2021-03-23T08:10:15.005600Z

I would say, unlikely, unless you design the whole thing from scratch with that in mind; and then again; if you decide at some point that you don’t like the abstraction tower you’ve built-up, you’re kinda screwed… so use caution 🙂 I arrived at what’s in that lib, immutable-sql, after quite some iterations… I would say the ideas are very much worth a look (even if the library itself is somewhat dated in terms of drivers and it’s Postgres specific)

flowthing 2021-03-23T08:10:40.005800Z

I will take a look, thanks. 🙂

raspasov 2021-03-23T08:11:07.006Z

Effectively, it allows you to have “immutable” tables… and you use a function to update them, very much in the spirit of Clojure’s (swap! …)

👀 1
raspasov 2021-03-23T08:13:09.006300Z

So it’s a very direct mapping of Clojure data model to what’s in your database… the piece that’s really missing is the whole tree/graph part where if you have {:username "raspasov" :address {:street "123 Main"}} … it will not automatically insert those in two tables (something that Datomic/DataScript graph database allows you to do easily)
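
To make the "two tables" point concrete, here is a small sketch of the manual step a SQL-backed app has to do with the nested map from the message above. It does not use immutable-sql or any database library; the helper `user->rows`, the table keys `:users` and `:addresses`, and the `:user-id` foreign key are all assumptions made for illustration.

```clojure
;; Hypothetical helper: split one nested user map into rows for two
;; assumed tables, :users and :addresses, linked by :user-id.
(defn user->rows [user-id user]
  (let [address (:address user)]
    {:users     [(-> user (dissoc :address) (assoc :id user-id))]
     :addresses (if address
                  [(assoc address :user-id user-id)]
                  [])}))

(user->rows 1 {:username "raspasov" :address {:street "123 Main"}})
;; => {:users     [{:username "raspasov", :id 1}]
;;     :addresses [{:street "123 Main", :user-id 1}]}
```

Every nested entity needs a helper like this (and its inverse when reading), which is the transformation work the message says a graph-shaped database can take over.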

raspasov 2021-03-23T08:13:27.006500Z

It’s all possible in SQL, but I haven’t seen a good solution that’s not a whole bunch of ORM insanity

raspasov 2021-03-23T08:17:00.006700Z

IMO, the big win of something like Datomic is that it almost completely gets rid of the transformation part for a standard CRUD backend… you have some data, in Clojure, which goes to the database a-la carte, saves, updates, etc - all taken care of; It ALSO comes back in the same shape! Almost no transformation needed.

raspasov 2021-03-23T08:17:24.006900Z

The downside is, of course, proprietary product, less resources online, etc etc.

raspasov 2021-03-23T08:19:54.007100Z

I’ve wanted to build something that does this automagically in SQL, but once you get going, there’s so many SQL edge cases…

raspasov 2021-03-23T08:21:07.007300Z

And there’s different tricks that are available for each DB, so it’s hard to make it general + performant. If it’s Postgres or MySQL specific perhaps easier to make nice to use + fast.

raspasov 2021-03-23T08:22:45.007500Z

@flowthing which SQL database do you use?

flowthing 2021-03-23T08:23:12.007700Z

MariaDB (ugh) and SQL Server.

raspasov 2021-03-23T08:23:43.007900Z

Haven’t used either… Have used MySQL (a lot, but some years ago), and Postgres (more recently)

raspasov 2021-03-23T08:24:34.008100Z

Do either of those have the ability to: - add columns/indices on the fly without locking up the whole table? - partial indices?

raspasov 2021-03-23T08:24:46.008300Z

That was a big difference when I moved from MySQL to Postgres 🙂

flowthing 2021-03-23T08:25:04.008500Z

I haven't had the need for either, so I'm not sure.

raspasov 2021-03-23T08:25:15.008800Z

(but I’ve heard that MySQL has recently gained some of those abilities)

raspasov 2021-03-23T08:25:50.009Z

Years ago, a 100% guaranteed way to crash your system was adding a column to a big and hot MySQL table lol

raspasov 2021-03-23T08:26:10.009200Z

Would take minutes… and the whole table gets locked for writing.

raspasov 2021-03-23T08:26:54.009400Z

The solution was… Add a new table! And link them by ID…. Lol

raspasov 2021-03-23T08:27:46.009700Z

You end up with a bunch of user_x, user_y, user_whatever tables.

flowthing 2021-03-23T08:28:19.009900Z

I don't know when I'd ever need to add a new attribute to a relation other than in a migration on startup. 🙂

raspasov 2021-03-23T08:28:52.010100Z

So your domain/model is pretty static? Doesn’t actively grow I assume.

flowthing 2021-03-23T08:29:08.010300Z

Not so actively that I'd need to be doing something like that, no. 🙂

raspasov 2021-03-23T08:29:24.010500Z

Got it.

flowthing 2021-03-23T08:29:55.010700Z

But yeah, even the thought of doing something like that makes me want to reach for something other than a SQL database.

flowthing 2021-03-23T08:30:31.011Z

Or perhaps consider using the Postgres JSON datatype or something.

raspasov 2021-03-23T08:31:12.011200Z

It is a trade-off… I really think Datomic has so much potential, but databases are one of the most conservative areas of a tech stack (and for good reasons) - people are veeeery slow to adopt new things nowadays since there’s a bunch of very well tested options.

raspasov 2021-03-23T08:32:19.011400Z

Yes, Postgres has some escape hatches like that, but ultimately, it’s a hack… If you start shoving everything into JSON you might end-up with a NoSQL mess….

flowthing 2021-03-23T08:32:33.011600Z

Sure. If there's something I've learned, it's that everything is a trade-off. 🙂 And yeah, that's exactly why I'm not sure I'd even have the option of picking Datomic even if I wanted to.

raspasov 2021-03-23T08:33:05.011800Z

“Everything is a trade-off” Amen to that 🙂

2021-03-23T08:48:10.012Z

@flowthing a domain core is part of domain driven design (DDD), but it's only advised to go with DDD if you have a complex enough problem, like for example my business has thousands of container ships and i need an application that coordinates where which freight goes worldwide (example from the book)

flowthing 2021-03-23T08:48:45.012200Z

I see, thanks.

2021-03-23T10:59:43.012500Z

Two things come to mind reading this thread. The first is an approach like component (or mount or integrant), which allows you to handle dependencies cleanly. This, of course, does not magically solve the "leaking" part. The other is the onion (or hexagonal or clean) architecture, where you basically reverse the dependency. Protocols could indeed be one way to implement this in Clojure.
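
As an illustration of the protocol-based dependency reversal mentioned above, here is a minimal sketch. The protocol `UserRepository`, the record `InMemoryUserRepository` and the function `rename-user` are hypothetical names, and the in-memory atom stands in for whatever a real implementation would do with a database.

```clojure
;; The business logic depends only on this protocol, not on a database.
(defprotocol UserRepository
  (find-user [repo id])
  (save-user [repo user]))

;; One possible implementation, backed by an atom holding an id->user map.
;; A production implementation would wrap database calls instead.
(defrecord InMemoryUserRepository [state]
  UserRepository
  (find-user [_ id]
    (get @state id))
  (save-user [_ user]
    (swap! state assoc (:id user) user)
    user))

;; Business logic: knows nothing about transactions or SQL.
(defn rename-user [repo id new-name]
  (when-let [user (find-user repo id)]
    (save-user repo (assoc user :name new-name))))
```

With `(->InMemoryUserRepository (atom {1 {:id 1 :name "Ada"}}))` as the repository, `(rename-user repo 1 "Grace")` returns the updated user map. The cost raised elsewhere in the thread still applies: jumping from `find-user` to its implementations depends on editor support.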

flowthing 2021-03-23T11:00:34.012700Z

I'm using Clip, which helps, but is not sufficient to solve the issue, as far as I can see -- I'd still need a repository abstraction (or equivalent).

flowthing 2021-03-23T11:01:12.012900Z

One of the projects I maintain uses the Clean architecture, and I must say it's immensely painful to work on.

2021-03-23T11:01:33.013100Z

I've seen a number of heated discussions about protocols (also in the context of component) but I think they can be very useful.

flowthing 2021-03-23T11:02:33.013300Z

That's not a condemnation of the architecture pattern itself -- I'm sure it could've been implemented differently. But the current implementation is chock full of protocols and their implementations, and navigating the codebase without any of the aids that are there with statically-typed languages is extremely laborious.

2021-03-23T11:05:57.013700Z

Ah, the joys of dependency injection and magical layers. Dan North would probably recommend to "just write simple code" (cf. https://dannorth.net/2021/03/16/cupid-the-back-story/)

emccue 2021-03-23T13:14:50.014Z

@flowthing One approach i've taken is to use a "system map"

emccue 2021-03-23T13:15:24.014200Z

this is done in Component and other libraries, but you don't need those to do the concept

emccue 2021-03-23T13:15:42.014400Z

just at the start of the server, make and initialize a map like

emccue 2021-03-23T13:16:34.014600Z

{:db    ...db pool instance ...
 :redis ... redis connection pool ...
 :etc   ... some service object ...}

emccue 2021-03-23T13:21:19.014800Z

you can then either

1. Inject it into your ring handlers and use components explicitly

(defn handler [{:keys [system]} request]
  (db/in-transaction (:db system) .... request ....))

2. Inject it into your ring handlers and pull out components explicitly, but delegate to a "service layer" namespace

(defn handler [{:keys [system]} request]
  (s-stuff (:db system) ...request...))

3. Inject it into your ring handlers and pass it opaquely to a "service layer" namespace

(defn handler [{:keys [system]} request]
  (s-stuff system ...request...))

4. Bind it to a dynamic or constant var at startup and have other parts of your code look for the global

(def ^:dynamic *system* nil)

(alter-var-root #'*system* ...)

(defn handler [request]
  (s-stuff ... implicitly looks at *system* and pulls out what it wants ...))

emccue 2021-03-23T13:21:53.015Z

it's DI in that any given component only cares about the keys it is looking for

emccue 2021-03-23T13:23:50.015200Z

in the last 2, the database abstraction doesn't leak. With the opaque map the abstraction is "big blob of stuff" and with the (dynamic?) var it's all hidden
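
A runnable sketch of the difference between passing a component explicitly and passing the system map opaquely might look like the following. All names here (`system`, `user-by-id`, `lookup-user`, `make-user-handler`) are hypothetical, the atom stands in for a real connection pool, and the handler is built as a closure over the system rather than with the exact argument shape used in the messages above.

```clojure
;; Hypothetical system map; an atom stands in for a real connection pool.
(def system
  {:db (atom {1 {:id 1 :name "Ada"}})})

;; Explicit style: the service function receives only the component
;; it needs.
(defn user-by-id [db id]
  (get @db id))

;; Opaque style: the service function receives the whole system and
;; pulls out what it needs, so the handler never touches :db.
(defn lookup-user [system id]
  (user-by-id (:db system) id))

;; The handler closes over the system and passes it down untouched.
(defn make-user-handler [system]
  (fn [request]
    (if-let [user (lookup-user system (get-in request [:params :id]))]
      {:status 200 :body user}
      {:status 404 :body "not found"})))
```

`((make-user-handler system) {:params {:id 1}})` returns a 200 response with the user map, and the handler contains no reference to the database key.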

flowthing 2021-03-23T13:46:49.015400Z

@emccue That's interesting, thanks! I'm using Clip, so I do have a system map, but it hadn't occurred to me to inject the whole thing. I guess I could even use middleware to add the whole system map to every Ring request and just destructure it from there. 🤪
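
The middleware idea in the message above could be sketched roughly as follows. This is an assumption-laden illustration, not Clip's or Component's API: `wrap-system`, `greeting-handler`, `greeting-app` and the `:system` request key are made-up names, and the system map is a plain literal.

```clojure
;; Middleware that adds the whole system map to every Ring request
;; under the (assumed) :system key.
(defn wrap-system [handler system]
  (fn [request]
    (handler (assoc request :system system))))

;; The handler destructures only the part of the system it cares about.
(defn greeting-handler [{{:keys [config]} :system}]
  {:status 200
   :body   (:greeting config)})

(def greeting-app
  (wrap-system greeting-handler {:config {:greeting "hello"}}))
```

`(greeting-app {:uri "/"})` returns `{:status 200, :body "hello"}`. In tests, the same handler can be called directly with a request map that already contains a stub `:system`.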

flowthing 2021-03-23T13:48:18.015600Z

I usually don't really go for dynamic vars, but some variation of 1–3 might be worth considering.

emccue 2021-03-23T14:10:24.015900Z

yeah - it feels somewhat obvious afterwards

emccue 2021-03-23T14:10:40.016100Z

at least to me

vemv 2021-03-23T14:18:17.016300Z

> The problem there, though, is that Clojure editors generally don't support navigating from protocol definitions to implementations and vice versa

I think cursive does

And I don't think implementing it in Emacs would take more than a couple hours (perhaps 1 day if you're not much into CIDER hacking)

A good recipe that works for me is implementing it in pure Clojure first and then creating thin glue for bridging said code with emacs

cider-nrepl-sync-request:eval is your friend :) and rewrite-clj or clj-kondo's analysis might also save some work

also #lsp folks might have a recipe or even be willing to implement the feature (which you could either use directly, or programmatically if you're not much into LSP)

flowthing 2021-03-23T14:42:32.019300Z

Cursive didn’t the last time I used it. clojure-lsp would be the correct place to implement something like that.

flowthing 2021-03-23T14:47:35.023200Z

I think if implementing it were that easy, it would’ve been done already, but I could be wrong. 🙂 Or maybe people just don’t use protocols enough to have made a big enough fuss about it.

flowthing 2021-03-23T14:51:52.023500Z

https://github.com/cursive-ide/cursive/issues/437

vemv 2021-03-23T14:51:59.023800Z

A lot of things are relatively easy, but contributors' time/energy are (rightfully) the bottleneck. And of course every new feature increments the maintenance burden.

Personally I have quite a lot of features, bugfixes, etc implemented in my personal clojure-mode, cider, clj-refactor (.el code) or cider, compliment, refactor-nrepl (.clj code) forks. You might be surprised at how accessible that code actually is (you don't need to permanently fork; that's more of an odd choice of mine)

vemv 2021-03-23T14:52:30.024Z

> https://github.com/cursive-ide/cursive/issues/437

remembered badly then, sorry

flowthing 2021-03-23T15:00:05.030800Z

I’ve been writing an interactive development environment for Clojure from scratch for the past year, so I’ve fiddled with some things in that domain. Not protocols specifically, though, so dunno about those. Some things really are surprisingly easy, as you say, but for things like finding function usages there’s no built-in API. The hack that I think CIDER has doesn’t find all usages, for instance, I think.

flowthing 2021-03-23T15:00:41.031600Z

Anyway, this is getting somewhat off-topic. 🙂

vemv 2021-03-23T15:09:15.031800Z

> Anyway, this is getting somewhat off-topic.

tooling is a fairly relevant consideration because as hinted in the OP, people can decide go/no-go depending on it. The funny part being that it's not that much work... simply someone has to do it

Here's protocol detection using tools.analyzer (which clj-refactor uses https://github.com/jonase/eastwood/blob/7672782d748bcc15c0ea5bf1d913c3d47c9e0810/src/eastwood/linters/unused.clj#L66) (tools.analyzer works wonderfully well for small-to-medium projects... then generally it starts to become too slow to be practical)

Given that you mentioned it, here's 'function usages' detection using clj-kondo (borkdude shared the original snippet I derived this from) https://github.com/reducecombine/.lein/blob/3c539770447a599b1ef1f9432e961cf9e4c808a4/scripts/vemv/usages.clj

and here is an issue that would allow implementing something analogous for defprotocols https://github.com/clj-kondo/clj-kondo/issues/405

flowthing 2021-03-23T15:16:17.033700Z

Thanks! Well, let’s hope people can get their hands on that some day. 🙂 I’ll definitely check out tools.analyzer.

flowthing 2021-03-23T15:35:34.036100Z

Looks like your comment in that issue already got the wheels rolling. Exciting stuff! I think it’d be a real boon to every LSP-using editor if that got implemented. 👍🏻

✌️ 1
seancorfield 2021-03-23T16:05:15.036400Z

Feel like I’m coming a bit late to this but we have 110K lines of Clojure at work and a lot of the core of it is essentially fancied-up CRUD. We do not run a transaction for every request — we hardly use transactions at all. We use Component and have the whole “Application” Component injected into the Ring request by middleware so it’s an opaque blob passed down to the system/services layer that needs to peel it apart to get at parts of it for DB access, caches, various API services etc.

👍 3
seancorfield 2021-03-23T16:07:05.036700Z

But we have also taken a pretty pragmatic approach that a lot of simple DB access (such as next.jdbc.sql friendly function calls) might as well just be at the edges of the handlers since, well, it’s all tied to the DB anyway so we gain very little benefit from trying to abstract it away.

seancorfield 2021-03-23T16:10:31.036900Z

We’ve tried various approaches of abstracting away the persistence stuff and it’s mostly just not worth the effort: it complicates the code, it makes certain things a lot more work and/or takes a lot more code. Because we build the database connections via Component at startup, it’s easy to have throwaway dev/test DBs locally. We have multiple connection pools and three schemas — “databases” in MySQL — and some apps bridge between our staging content DB and our production content DB.

❤️ 1
seancorfield 2021-03-23T16:12:33.037100Z

At one point I wrote https://github.com/seancorfield/engine to try to abstract out “sources” and “sinks” (specifically Queryable and Committable) as a way to try to have pure business logic and be able to test that code without needing actual access to a database or 3rd party APIs etc — and the code ended up being very monadic / non-idiomatic for Clojure and just being far more work than any gains we got from it. The README talks about that.

flowthing 2021-03-23T16:22:08.037500Z

@seancorfield Thank you, that's very informative! It also mirrors the experiences I've had. I rewrote one of the apps I maintain to have a repository-like protocol thing at one point, but I eventually ended up throwing it away because I felt the benefits just weren't there.

Injecting the system into the Ring request does indeed feel like a smart and idiomatic way of going about it. I think I'll give it a shot. Whether everything needs a transaction is also something I'll need to think about.

Incidentally, I wonder if there are any blog posts or articles on this sort of thing. As in, how to architect your Clojure web app. I feel like there must be, but I just haven't come across any. 🤷🏻‍♂️

In any case, thanks to everyone for your thoughts! Discussions like this are immensely useful for a solo developer. 👍🏻

vemv 2021-03-23T16:23:03.037700Z

What I like about protocols is that one can create a test-only implementation that is non-IO based and therefore speeds up / simplifies test setup significantly.

...that is of course to be weighed against the cost of a SQL abstraction. Probably many of us have gotten that wrong at some point :)

with honeysql being a thing in particular, I think it's reasonably possible to create a SQL façade that can replace jdbc with pure functions (e.g. in tests, if you receive the exact honeysql {:select ... input, return the fixed [[ ... output)
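
One way to read the façade idea above is sketched below, under several assumptions. The code passes a `query` function around instead of a protocol, `active-user-names` and `canned-query` are hypothetical names, and the query map is only HoneySQL-shaped data; no HoneySQL or JDBC call is made here. In production, `query` would presumably format the map with HoneySQL and execute it through the JDBC library.

```clojure
;; Code under test: depends only on a `query` function that takes a
;; HoneySQL-style map and returns rows.
(defn active-user-names [query]
  (->> (query {:select [:name]
               :from   [:users]
               :where  [:= :active true]})
       (mapv :name)))

;; Test double: returns fixed rows for exactly the query maps it knows
;; about and throws on anything else.
(defn canned-query [responses]
  (fn [sql-map]
    (if (contains? responses sql-map)
      (get responses sql-map)
      (throw (ex-info "Unexpected query" {:query sql-map})))))

(def fake-query
  (canned-query
   {{:select [:name] :from [:users] :where [:= :active true]}
    [{:name "Ada"} {:name "Grace"}]}))

(active-user-names fake-query)
;; => ["Ada" "Grace"]
```

Because query maps are plain data, equality does the matching. The trade-off is that the test pins the exact shape of the query, so any change to the map requires updating the canned responses.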

seancorfield 2021-03-23T16:30:41.037900Z

If you have a DB-heavy app, you have to have a pretty enormous protocol — or a lot of little protocols — and it just isn’t worth it, IMO. If you have only a few tables and a handful of standard access patterns, maybe. We have about 300 database tables…

vemv 2021-03-23T16:33:48.038100Z

It might depend? I agree that creating Widget, Invoice, Thing protocols, one per 'model', and each with 4 crud protocol methods might not be that great. But a single protocol abstracting the whole jdbc+honeysql combo, generically, might be cheap enough

seancorfield 2021-03-23T16:36:27.038300Z

That was essentially what Engine did through Queryable and Committable. Wasn’t worth it on a complex app.

👀 1
flowthing 2021-03-23T16:49:42.041100Z

In the applications I maintain, at least, tests that operate on the database are plenty fast enough when running them in a rollback transaction.

vemv 2021-03-23T16:56:16.041300Z

Yes there's definitely a productive and pragmatic path in that direction 👍

Personally I tend to think in these lines https://widdindustries.com/cross-platform/ e.g. I should be able to write a backend app in .cljc, even if I don't intend to use node.js as an actual backend. Protocols help there (as would a pure-functional architecture).

Is it a bit wasteful? Yes. Is it a good line of research with some tangible benefits at scale? Also yes :)

Panel 3000 2021-04-02T22:16:34.063400Z

@seancorfield Your work on engine reminded me of this project https://github.com/rafd/tada/ that also has a declarative approach to the query/command->transform->mutate workflow. Do you think there's a space for this kind of solution or is it just over-engineering?

seancorfield 2021-04-02T22:22:58.063600Z

@avocabio Hard to say. re-frame is basically an event-based system like that (and can be used on the server side with some limitations). Prior to Engine, I worked on several event-based libraries (mostly in other tech, pre-Clojure) and they seem good for some types of problem and a poor fit for others. Pretty much all the “reactive” stuff falls into this category, although not always in such a declarative form as tada. I kind of like it as an approach conceptually but have never been comfortable with it in the real world as system complexity scales: these systems are often hard to debug, in my experience.

phronmophobic 2021-04-02T22:41:39.063800Z

https://github.com/oakes/odoyle-rules has a similar approach. There are a few examples with user interfaces. I tried to use it, but it sorta feels like building your own wonky database.

Panel 3000 2021-04-03T02:27:04.064300Z

The concept of interceptors from pedestal is a similar idea, an "engine" that orchestrates flow, but with a smaller footprint. It's used to handle http requests but also re-frame events and malli schema decoders and encoders. What I find interesting is that all of those projects implemented the concept themselves, and they all end up being ~20 loc for the engine part.
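
To give a feel for how small such an "engine" can be, here is a deliberately simplified sketch of the interceptor idea. It is not Pedestal's, re-frame's or malli's implementation: `execute`, `trace-interceptor` and `handler-interceptor` are hypothetical names, and the sketch assumes interceptors are plain maps with optional `:enter` and `:leave` functions. Error handling and early termination, which the real libraries provide, are left out.

```clojure
;; Run every :enter function in order, then every :leave function in
;; reverse order, threading a context map through all of them.
(defn execute [ctx interceptors]
  (let [entered (reduce (fn [c {:keys [enter]}]
                          (if enter (enter c) c))
                        ctx
                        interceptors)]
    (reduce (fn [c {:keys [leave]}]
              (if leave (leave c) c))
            entered
            (reverse interceptors))))

;; Records that it ran on the way in and on the way out.
(def trace-interceptor
  {:enter #(update % :trace conj :trace-enter)
   :leave #(update % :trace conj :trace-leave)})

;; Plays the role of the handler: only has an :enter stage.
(def handler-interceptor
  {:enter #(assoc % :response {:status 200})})

(execute {:trace []} [trace-interceptor handler-interceptor])
;; => {:trace [:trace-enter :trace-leave], :response {:status 200}}
```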