data-oriented-programming

Spread the word among the global developer community about Data-Oriented programming https://en.wikipedia.org/wiki/Draft:Data-oriented_programming
Yehonathan Sharvit 2021-03-11T16:56:01.006700Z

I’d like to come back to the topic of loose coupling as promoted by Rich Hickey. I think I finally grasped what Rich meant by loose coupling. Let’s take the example of a 3rd party library that gives access to the Google Calendar API. Let’s assume this lib provides a function createEvent to create an event. Now let’s compare how we pass the information about the event we’d like to create:
1. In a statically typed approach (FP or OOP), we need to pass an Event record with the information about the event.
2. In a data-oriented approach, we pass a map with the appropriate field names.
There is a huge difference between the two approaches:
1. In the statically typed approach, the information is passed in a way that depends on the implementation of createEvent (Rich calls it a data concretion). The only way for my code to generate the information that createEvent expects is to use the Event record. There is a tight coupling between my code and the library code.
2. In the data-oriented approach, I am free to generate the information as I want. The only constraints that I need to respect are the field names in the map. There is a loose coupling between my code and the library code.
@me1740, @cgrand what do you think?
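
A minimal Python sketch of the two call styles compared above. Everything in it is hypothetical: `Event`, `create_event_typed`, `create_event_data` and the field names `title` and `start` are made up for illustration and are not the API of any real Google Calendar library.

```python
from dataclasses import dataclass


# Hypothetical library code: nothing here is a real Google Calendar API.
@dataclass(frozen=True)
class Event:
    title: str
    start: str  # ISO 8601 timestamp, kept as a string for brevity


def create_event_typed(event: Event) -> dict:
    # The caller has to import Event from the library to build the argument.
    if not isinstance(event, Event):
        raise TypeError("expected an Event instance")
    return {"title": event.title, "start": event.start}


def create_event_data(event: dict) -> dict:
    # The caller only has to agree on the field names.
    missing = {"title", "start"} - set(event)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return {"title": event["title"], "start": event["start"]}


# The map can come from anywhere: parsed JSON, a database row, a literal.
row = {"title": "Standup", "start": "2021-03-11T17:00:00Z", "room": "A"}
payload_from_map = create_event_data(row)
payload_from_record = create_event_typed(Event("Standup", "2021-03-11T17:00:00Z"))
```

Both calls still have to agree on `title` and `start`; the only thing the map version drops is the need to share the `Event` class itself.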

cgrand 2021-03-11T17:19:07.007500Z

bad example as it ends as a network call

cgrand 2021-03-11T17:24:53.009900Z

it has been a long time since I listened to RH talks. that’s dependency coupling: two modules that share data shouldn’t have to share code, only agree on a spec (general sense).

Yehonathan Sharvit 2021-03-11T17:25:41.010200Z

I don’t get why you say it’s a bad example

cgrand 2021-03-11T17:29:32.011100Z

it’s a JSON msg in the end

benoit 2021-03-11T17:48:41.017400Z

I’m not seeing the difference between the two to be honest (coupling-wise). In the Clojure case, your code still needs to pass a map with the right keywords. The main advantage of using maps over classes like Event is that you can reuse all the functions that work on maps to manipulate your event data record. Instead of creating a new abstraction for each type of record you manipulate, you reuse the same abstraction (Map). This is related to Perlis’ quote: “It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures.” So I agree with you that to represent static information, reusing the same abstraction integrated in your language is much better than creating a new class every time like you do in Java. But I wouldn’t “sell this” as a reduction of coupling.
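
A small Python sketch of that reuse argument, under stated assumptions: `select_keys` and `rename_keys` are hypothetical helpers loosely modeled on Clojure's `select-keys` and `clojure.set/rename-keys`, and the two records are invented sample data.

```python
def select_keys(m: dict, keys) -> dict:
    # Keep only the requested keys that are actually present.
    return {k: m[k] for k in keys if k in m}


def rename_keys(m: dict, renames: dict) -> dict:
    # Rename keys listed in `renames`; leave every other key unchanged.
    return {renames.get(k, k): v for k, v in m.items()}


# Two unrelated kinds of record, both represented as plain maps.
event = {"title": "Standup", "start": "2021-03-11T17:00:00Z", "room": "A"}
customer = {"name": "Ada", "email": "ada@example.com", "plan": "pro"}

# The same two functions work on both, with no Event or Customer class.
event_summary = select_keys(event, ["title", "start"])
customer_contact = rename_keys(
    select_keys(customer, ["name", "email"]), {"email": "contact"}
)
```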

Yehonathan Sharvit 2021-03-11T18:00:52.018700Z

@me1740 Don’t you see a difference in terms of coupling between being forced to pass an instance of class vs. passing a map?

Yehonathan Sharvit 2021-03-11T18:01:03.019Z

How would you define loose coupling then?

2021-03-11T18:18:54.021200Z

There are older systems like CORBA and COM that I have not used, but my understanding is they much more tightly coupled the language used in the hosts with the messages being passed over the network. JSON / XML / EDN / etc. are already a way to help avoid that coupling -- you are passing values between hosts over the network. Any semantics of them being RPC calls or not is up to the application to define (or not).

2021-03-11T18:20:04.022300Z

JSON / XML / EDN don't by themselves impose any restrictions on what map keys are required, or optional, and what kinds of values they are expected to have. CORBA/COM/etc. style things do.

2021-03-11T18:20:33.023Z

(That is my very basic understanding of things like CORBA/COM/etc. from a 10,000-foot view, never having used them myself -- corrections welcome)

2021-03-11T18:22:50.024100Z

Rich Hickey mentions and at least briefly discusses CORBA / COM in his talks "Value of Values" and "Language of the System", found quickly by grep'ing through his talk transcripts here: https://github.com/matthiasn/talk-transcripts

2021-03-11T18:24:27.025600Z

I don't have an example handy off the top of my head, but I'm nearly certain there are straightforward examples of using JSON for over-the-network communication between hosts, and still having other forms of tight coupling, different than the ones CORBA/COM get you into.

benoit 2021-03-11T18:28:31.029200Z

For me "coupling" means the tendency for 2 pieces of code to change together. When a function calls another function, there is some coupling because if you change the spec of the function you might have to change the caller. But this coupling is visible and is fine. Ideally you still want to minimize the coupling as much as possible because you want to be able to modify the callee without impacting the caller. The type of coupling that is bad though, is hidden coupling. It is when the same design decision is reflected in multiple parts of the code. If you change one part of the code, you will also have to change the other one. But because the coupling is not visible (unlike in the function-calling example), you might forget to update the other code, and bugs happen.

Yehonathan Sharvit 2021-03-11T18:31:18.030Z

Would you say that a frontend and a backend that communicate over HTTP via JSON are tightly coupled or loosely coupled?

2021-03-11T18:33:35.030900Z

I'd say that isn't enough information to tell.

2021-03-11T18:33:55.031600Z

I'm thinking of an integer. Is it even or odd? You don't know yet.

2021-03-11T18:35:07.033200Z

My takeaway from the Hickey talk (the same one, I think) was that JSON/HTTP services are already loosely coupled, and that pure functions with immutable data seek to replicate that looseness.

2021-03-11T18:36:28.034800Z

I don't know if this is a good example or not -- interested to hear from others. Suppose your HTTP server checked the incoming JSON messages from the clients to ensure that they had keys x, y, and z, but gave an error if any others were present. Some would prefer to do this as a form of error checking / validation, but it does mean that as you want to extend the data model over time (assume you do), you need to change that error check at the boundary of the server, in addition to internal implementation changes related to the new data.

2021-03-11T18:38:28.036700Z

If the "only has keys in this set" check is in some HTTP handling code far from the core of your application that uses those keys, then there is at least some level of coupling between the error checking part of the code, and the other parts that actually use it. If you have 10 software components between the network interface and the part of the HTTP server that uses the data, and they all do closed world assumption checks that reject keys outside of a valid set, then all of those must be changed whenever that set grows.
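
A minimal Python sketch of the scenario described above, assuming a data model with the keys `x`, `y` and `z`. The names `validate_open` and `validate_closed` are hypothetical and only stand in for whatever boundary checks a real server would run.

```python
ALLOWED_KEYS = {"x", "y", "z"}  # hypothetical data model from the example


def validate_open(msg: dict) -> dict:
    # Open-world check: require the known keys, ignore any extras.
    missing = ALLOWED_KEYS - set(msg)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return msg


def validate_closed(msg: dict) -> dict:
    # Closed-world check: require the known keys and reject anything else.
    validate_open(msg)
    unknown = set(msg) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"unexpected keys: {sorted(unknown)}")
    return msg


# The data model grows a new key "w".
extended = {"x": 1, "y": 2, "z": 3, "w": 4}

open_result = validate_open(extended)  # still accepted unchanged

try:
    validate_closed(extended)
    closed_rejected = False
except ValueError:
    closed_rejected = True  # every closed check must be edited to allow "w"
```

With ten components each running their own `validate_closed`, adding `w` means ten edits; with `validate_open`, only the code that actually uses `w` changes.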

benoit 2021-03-11T21:06:57.037900Z

From a logical point of view, the coupling is the same as passing a map or a Customer instance. Both sides have to agree on a representation for the message, whether it is a Clojure map, Java instance, or JSON document. But components interact in other ways. By separating the system into a client and server we got rid of a lot of other interactions: certain kinds of errors no longer propagate, the two sides don't rely on shared state, the client can time out if the server fails to respond... In this sense we reduced the coupling between the two components because we prevented certain kinds of changes or errors from propagating between the two.

benoit 2021-03-11T21:09:44.038100Z

For me that's an example of hidden coupling. The design decision (what is the set of required keys) is replicated in multiple places in the code. Every time you change one part, you will have to change the other parts as well. We should definitely not do that 🙂

cgrand 2021-03-11T22:08:26.041700Z

with a map (or an http message) it’s easy to have a custom header. It’s safe because things are namespaced (ok X-MyHeader is not that great) and because the thing is built on “must-ignore”. You can have many intermediaries and still get your custom header in the end. Had all intermediaries modeled and stored only the finite set of headers they knew or cared about when they were designed, you wouldn’t get your custom header.
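
A rough Python sketch of the “must-ignore” point, with messages modeled as plain dicts. The two intermediaries and the keys `proxy/hop-count` and `myapp/trace-id` are hypothetical and are not taken from any real HTTP stack.

```python
def pass_through_proxy(msg: dict) -> dict:
    # "Must-ignore" style: add one namespaced key, forward everything else as-is.
    return {**msg, "proxy/hop-count": msg.get("proxy/hop-count", 0) + 1}


KNOWN_FIELDS = ("host", "path")


def remodeling_proxy(msg: dict) -> dict:
    # Rebuilds the message from the fixed set of fields it was designed for,
    # so anything it does not know about is silently dropped.
    return {k: msg[k] for k in KNOWN_FIELDS if k in msg}


request = {"host": "example.com", "path": "/", "myapp/trace-id": "abc123"}

# Two must-ignore hops: the custom key survives.
via_must_ignore = pass_through_proxy(pass_through_proxy(request))

# One remodeling hop anywhere in the chain: the custom key is lost.
via_remodeling = pass_through_proxy(remodeling_proxy(request))
```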

benoit 2021-03-12T15:33:34.045100Z

For the discussion, let's agree that coupling is about the amount of knowledge pieces of code need to have about each other. I think it is related to saying that changes propagate, because if A needs to know something about B, then when this something in B changes, A needs to change as well. So I don't think we're too far off in our definitions 🙂 Now, if you want to be precise, you need to explain what it is that you don't need to know in the case of maps or a data interface. Or at least give an example of knowledge that is required in one case and not the other.

benoit 2021-03-12T16:13:34.045300Z

Another way to think about it: in your Java version, if you passed an instance of a Map with the customer data as key/value pairs, would you say that it is comparable to the Clojure map in terms of coupling, or different?

cgrand 2021-03-12T17:14:55.045500Z

Mostly, with two differences (most significant first):
1. The immutability brings you peace of mind that the map isn’t going to change under your feet (which an immutable facade doesn’t bring you). It’s temporal decoupling: you can stash the map for later use without having to care about the object lifecycle (e.g. a pooled/reused object or a mutable event in a UI).
2. Namespaced keys (which you can emulate in Java).
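
A short Python sketch of the facade-versus-value distinction. Python is used only for illustration: it has no built-in persistent map like Clojure's, so a private copy wrapped in `types.MappingProxyType` stands in for an immutable value here, and `pooled_event` is a hypothetical object reused by its producer.

```python
from types import MappingProxyType

# Mutable object that a hypothetical UI layer recycles for every event.
pooled_event = {"type": "click", "x": 10}

# Read-only facade: no writes through it, but it shares the underlying dict.
facade = MappingProxyType(pooled_event)

# Stand-in for an immutable value: a read-only view over a private copy.
# The copy is shallow, which is enough here because the values are scalars.
snapshot = MappingProxyType(dict(pooled_event))

# Later, the producer reuses the pooled object for the next event.
pooled_event["x"] = 99

facade_x = facade["x"]      # sees the mutation made behind the facade
snapshot_x = snapshot["x"]  # still holds the value from when it was stashed
```

The facade blocks writes from the consumer but gives no guarantee about what the producer does later, which is why only the snapshot is safe to stash.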

benoit 2021-03-12T18:04:11.045700Z

Yes, definitely, the Clojure map has many advantages. It is also not a type casting nightmare to get a value out of the map 🙂

Yehonathan Sharvit 2021-03-16T05:02:57.000500Z

In a sense “just use maps” is an application of the Dependency Inversion Principle (DIP) to data: instead of passing instances of a data class or of an abstract data class, we pass generic data represented by immutable maps.

benoit 2021-03-11T22:37:15.041900Z

By reusing an abstraction you can extend it and implement features that you can reuse everywhere. You can also augment your map with metadata… This is the leverage you get from using a uniform data structure/abstraction. I would not call this flexibility “loose coupling” though.