other-languages

here be heresies and things we have to use for work
Yehonathan Sharvit 2021-04-23T13:25:42.004700Z

A question regarding the importance of data immutability on the Node.js server side. Consider a typical scenario where a Node.js app reads data from some data sources, applies business logic, and returns data as JSON.

Yehonathan Sharvit 2021-04-23T13:26:26.005500Z

What kind of issues could arise if we don’t use immutable data?

Yehonathan Sharvit 2021-04-23T13:28:35.006200Z

I mean: the state is external to the app. So why is data immutability important in that specific case?

borkdude 2021-04-23T13:30:23.006900Z

immutability leads to better local reasoning. if you have an immutable thing, you don't have to fear that some other part of the code will modify it from under you

Yehonathan Sharvit 2021-04-23T13:31:41.008Z

I know, but it might sound theoretical to Node.js devs. I forgot to mention that the context of this question is a talk about the value of data immutability that I am going to give at a Node.js meetup next week

Yehonathan Sharvit 2021-04-23T13:32:12.009Z

It is easier (at least for me) to articulate the value of immutability when the app has an inner state

borkdude 2021-04-23T13:32:14.009200Z

Maybe it's good to read the READMEs from several immutable JS libraries like immutable.js

Yehonathan Sharvit 2021-04-23T13:32:21.009500Z

E.g. in the frontend

borkdude 2021-04-23T13:32:24.009600Z

because they exist for a reason

borkdude 2021-04-23T13:33:10.010400Z

ah you mean nodeJS as in backend JS apps, yeah, not sure if immutable JS libs are used a lot there, interesting question

Yehonathan Sharvit 2021-04-23T13:33:24.010700Z

I am gonna re-read the Immutable.js README. By the way, did you know that Immutable.js is kind of dead?

borkdude 2021-04-23T13:33:43.011Z

no, again? which lib has arisen now

Yehonathan Sharvit 2021-04-23T13:33:51.011200Z

immer

Yehonathan Sharvit 2021-04-23T13:34:05.011500Z

also lodash fp

Yehonathan Sharvit 2021-04-23T13:34:08.011700Z

and ramda

borkdude 2021-04-23T13:34:41.012500Z

I don't follow all that BS and hype anymore. Just use CLJS :P

borkdude 2021-04-23T13:35:22.013500Z

A book written by Fogus about functional programming in JS in 2013 is now probably stale. While the book he wrote earlier in 2010 still runs with Clojure 1.11

borkdude 2021-04-23T13:35:33.013800Z

But maybe you could read that book for inspiration as well

Yehonathan Sharvit 2021-04-23T13:35:54.014300Z

Good idea!

Yehonathan Sharvit 2021-04-23T13:36:02.014500Z

Something else: I understood recently that all the libs that provide immutable data manipulation on top of native JS objects are efficient only with records but not with associative arrays

Yehonathan Sharvit 2021-04-23T13:37:42.015100Z

@ericnormand do you have a take on the relevance of immutable data in nodejs backend side?

borkdude 2021-04-23T13:39:23.015800Z

It's interesting why TypeScript became popular while it's not immutable by default, whereas Clojure has the opposite: dynamic typing + immutability

borkdude 2021-04-23T13:39:39.016200Z

probably marketing though, TypeScript is pushed by M$FT

ericnormand 2021-04-23T13:41:37.016600Z

I think the statelessness of HTTP helps a lot

ericnormand 2021-04-23T13:41:49.017Z

each request is handled largely independently, using its own state

ericnormand 2021-04-23T13:42:00.017200Z

it reduces the amount of sharing

ericnormand 2021-04-23T13:42:09.017500Z

so even if you use mutable data, it’s very local

ericnormand 2021-04-23T13:42:24.017900Z

until, of course, your code grows and it still gets out of hand

ericnormand 2021-04-23T13:42:52.018200Z

that’s all very general, though

ericnormand 2021-04-23T13:43:06.018600Z

I don’t have experience using Node.js

ericnormand 2021-04-23T13:43:26.019Z

related: i think that’s one of the hidden values of microservices

ericnormand 2021-04-23T13:43:47.019500Z

the services don’t share memory

ericnormand 2021-04-23T13:44:07.020Z

they make copies of anything that needs to be shared (by serializing and deserializing)

Yehonathan Sharvit 2021-04-23T13:44:25.020400Z

Yeah. My question is not specific to nodejs

Yehonathan Sharvit 2021-04-23T13:44:33.020700Z

Feel free to address the broader question

ericnormand 2021-04-23T13:45:06.021400Z

the fact that each HTTP request is handled with very little sharing really helps

ericnormand 2021-04-23T13:45:48.022200Z

so, for instance, you copy data out of the DB, you mess with it all you want, then send a copy back to the client

ericnormand 2021-04-23T13:46:03.022600Z

no other request had access to that copy

ericnormand 2021-04-23T13:46:30.023Z

another thing that helps is that most apps are partitioned by user

Yehonathan Sharvit 2021-04-23T13:46:47.023400Z

what do you mean?

ericnormand 2021-04-23T13:47:33.024100Z

race conditions are rare because, even with millions of users, they’re all reading and writing to different rows in the database

ericnormand 2021-04-23T13:48:00.024500Z

it’s a very rare case where you’ve got two windows open and quickly clicking buttons in both

ericnormand 2021-04-23T13:48:23.024900Z

that has more to do with DB concurrency than in-memory data structures

ericnormand 2021-04-23T13:48:47.025400Z

but from what I have seen, most web apps do not have concurrent access done right

ericnormand 2021-04-23T13:49:19.026100Z

in practice, though, because I’m modifying my documents and you’re modifying your documents, there isn’t much concurrency anyway

Yehonathan Sharvit 2021-04-23T13:49:53.026600Z

In a Google Docs-like scenario, there is concurrency

ericnormand 2021-04-23T13:50:07.026900Z

but if I logged in on a few phones and started messing with it, I’d probably find some bugs

ericnormand 2021-04-23T13:50:23.027300Z

yes, and in those cases, they are well-built

ericnormand 2021-04-23T13:50:49.028Z

the whole google doc is a concurrent data structure

ericnormand 2021-04-23T13:51:03.028200Z

it’s not crud

Yehonathan Sharvit 2021-04-23T13:51:18.028500Z

What other concurrent use cases do we have out there

Yehonathan Sharvit 2021-04-23T13:51:18.028700Z

?

Yehonathan Sharvit 2021-04-23T13:51:30.029Z

less large-scale than Google Docs

ericnormand 2021-04-23T13:51:38.029300Z

chat rooms?

ericnormand 2021-04-23T13:51:45.029500Z

games?

Yehonathan Sharvit 2021-04-23T13:51:57.029800Z

let’s focus on chat rooms

Yehonathan Sharvit 2021-04-23T13:52:43.030600Z

One could implement a chat room with websockets. So it’s a good use case for Node.js, I guess

ericnormand 2021-04-23T13:52:57.030900Z

yes

ericnormand 2021-04-23T13:53:04.031300Z

you could have the chat log in memory

Yehonathan Sharvit 2021-04-23T13:53:22.031600Z

you could or you should?

ericnormand 2021-04-23T13:53:37.031900Z

could

ericnormand 2021-04-23T13:53:53.032200Z

i’m just trying to avoid using a DB in this scenario

ericnormand 2021-04-23T13:54:02.032500Z

most apps push their concurrency into the DB

Yehonathan Sharvit 2021-04-23T13:54:11.032700Z

I know

Yehonathan Sharvit 2021-04-23T13:54:34.033200Z

That’s why I am looking for a use case where it makes a lot of sense to have the state in memory

ericnormand 2021-04-23T13:55:07.033500Z

sessions are another one, but they are partitioned by user as well

Yehonathan Sharvit 2021-04-23T13:58:28.034100Z

what kind of concurrency issues would we have if we don’t use immutable data in a chat app?

orestis 2021-04-23T14:52:37.036Z

@viebel I can give you a nightmare example of lack of immutability in a Node.js app. We use mongo and mongo has queries represented as data. So you construct a query based on various request parameters and send it off to mongo to execute. There's a bunch of middleware that goes between the original query and the execution, each of which will modify the query.

orestis 2021-04-23T14:53:40.037Z

The problem with mutable data here is that during development, you can't know what's going on. Once you pass the original query off for execution, you can't reuse it to do a second execution.

orestis 2021-04-23T14:54:25.037800Z

We have had dozens of subtle bugs where people assumed the query was the original one and tried to extract parameters from it, reuse it, log it -- but instead they were dealing with a mutated one.

orestis 2021-04-23T14:55:35.038900Z

In fact, in a relatively big codebase, once you pass in that query to any function, all bets are off. Even if the function says that it will give you a new query back, there's no way to know unless you go in and review every step of the way.

orestis 2021-04-23T14:55:57.039400Z

Which, in a nutshell, is a manifestation of the local reasoning that @borkdude mentioned.

orestis 2021-04-23T14:56:54.039900Z

(add on top of all this the async nature of JS, and it can be a nightmare to figure out who's mutating what)

orestis 2021-04-23T14:57:50.040800Z

In the end, to debug such bugs I had to add console.log every step of the way to capture the values of the query in an immutable place (the stdout).

Yehonathan Sharvit 2021-04-23T14:58:34.041Z

Very interesting @orestis

Yehonathan Sharvit 2021-04-23T14:58:59.041700Z

Could you elaborate a bit about the bunch of middlewares that modify the query?

borkdude 2021-04-23T15:00:18.043400Z

@viebel Imagine if the maps that go through ring middleware were mutable. That would be a nightmare

orestis 2021-04-23T15:02:08.045300Z

Say that you get a query that says "give me all the posts". So you have a mongo query that looks naively like {} -> matches all the documents. But then the business logic kicks in and says, "all the posts for this user means all the posts in the teams they are members of". So it adds {channel_id: {$in: [x, y, z]}}. Then another middleware adds "don't show drafts unless it's your own posts" so it adds {$or: [{status: "published"}, {author_id: foo}]}... and so on.
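
For comparison, here is a rough sketch of how the same steps could be written so that each middleware returns a new query instead of editing the one it received. The function names and query shapes are made up for illustration and are not from the codebase described above; only the `$in` and `$or` operators are real MongoDB query syntax.

```javascript
// Hypothetical middleware steps; names and query shapes are illustrative only.
// Each step returns a NEW query object and leaves its input untouched.
const restrictToTeams = (query, channelIds) => ({
  ...query,
  channel_id: { $in: [...channelIds] },
});

const hideForeignDrafts = (query, userId) => ({
  ...query,
  $or: [{ status: "published" }, { author_id: userId }],
});

// "Give me all the posts": frozen so an accidental write cannot change it.
const original = Object.freeze({});
const scoped = restrictToTeams(original, ["x", "y", "z"]);
const finalQuery = hideForeignDrafts(scoped, "foo");

console.log(original);   // still the empty query
console.log(scoped);     // only the team restriction
console.log(finalQuery); // team restriction plus the draft rule
```

Because every intermediate value survives, each stage can still be logged, reused, or executed a second time.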

orestis 2021-04-23T15:02:56.045700Z

The way I write it, it sounds manageable, but in reality it's not 🙂

Yehonathan Sharvit 2021-04-23T15:03:09.045900Z

I see what you mean.

orestis 2021-04-23T15:03:48.046700Z

E.g. in this legacy codebase, we have a function that is named querySchema.validate. You would expect that this will, well, validate the query. But it actually mutates it.

orestis 2021-04-23T15:06:50.049800Z

It's nothing that a little discipline can't fix (that's what Uncle Bob would say). But diving into a new codebase without any systemic guarantees... good luck.

Yehonathan Sharvit 2021-04-23T15:07:03.050100Z

When data is immutable, you can store in a variable each step of the process and inspect it or replay it as you wish. Libraries like https://github.com/vvvvalvalval/scope-capture cannot work in a mutable environment.

Yehonathan Sharvit 2021-04-23T15:08:51.052Z

@orestis I’d like to claim that there are two approaches to embrace immutability in JavaScript: 1. Using a lib like Immutable.js => immutability at the level of the data structures 2. Using a lib like Lodash FP, Ramda or Immer => immutability at the level of the way we manipulate data
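
To make approach #2 concrete, here is a minimal sketch in plain JavaScript of a non-mutating nested update on native objects. `setIn` is a hypothetical helper written for this illustration; it is not the API of Lodash FP, Ramda, or Immer, although those libraries offer functions in the same spirit.

```javascript
// Hypothetical helper: returns a copy of `obj` with the value at `path` replaced.
// Only the objects along the path are copied; everything else is shared.
function setIn(obj, [key, ...rest], value) {
  const child = rest.length === 0 ? value : setIn(obj[key] ?? {}, rest, value);
  return Array.isArray(obj)
    ? Object.assign([...obj], { [key]: child })
    : { ...obj, [key]: child };
}

const post = { id: 1, author: { name: "foo", roles: ["admin"] } };
const renamed = setIn(post, ["author", "name"], "bar");

console.log(post.author.name);                          // foo
console.log(renamed.author.name);                       // bar
console.log(renamed.author.roles === post.author.roles); // true
```

Nothing stops other code from assigning to `post` directly, which is the enforcement gap discussed in this thread.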

Yehonathan Sharvit 2021-04-23T15:09:19.052900Z

The problem with approach #1 is that it requires non-native objects

orestis 2021-04-23T15:09:35.053600Z

I'm not sure if the typesystem could help you here. Does Typescript have a concept of immutable function arguments?

Yehonathan Sharvit 2021-04-24T18:30:03.115600Z

@orestis in what sense is the guarantee not that strong?

orestis 2021-04-26T07:13:03.116600Z

You can find numerous ways to work around it (based on that article)

Yehonathan Sharvit 2021-04-23T15:09:43.053800Z

The problem with approach #2 is that it is hard to enforce + it doesn’t scale well

Yehonathan Sharvit 2021-04-23T15:10:10.054400Z

Do you think that approach #2 would have solved the problems you encountered in your Node.js app?

Yehonathan Sharvit 2021-04-23T15:10:32.054900Z

I don’t think so. @borkdude?

orestis 2021-04-23T15:10:35.055100Z

No, not unless the original developers who put the system together understood the problems of mutability 😄

Yehonathan Sharvit 2021-04-23T15:10:48.055300Z

Why?

borkdude 2021-04-23T15:10:59.055700Z

I actually don't know TypeScript

Yehonathan Sharvit 2021-04-23T15:11:05.056Z

I mean if you forbid object field assignment

Yehonathan Sharvit 2021-04-23T15:11:15.056400Z

What could go wrong?

orestis 2021-04-23T15:11:17.056500Z

How would you forbid it?

orestis 2021-04-23T15:11:28.057Z

(I'm not familiar so much with those libraries either)

Yehonathan Sharvit 2021-04-23T15:11:32.057100Z

Either by convention or with Object.freeze (deep)
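
`Object.freeze` on its own is shallow, so the "(deep)" part needs a helper. A minimal sketch, assuming plain objects and arrays; `deepFreeze` is a hypothetical name, not a built-in.

```javascript
// Hypothetical helper: recursively freezes plain objects and arrays.
// The WeakSet guards against cycles in the data.
function deepFreeze(value, seen = new WeakSet()) {
  if (value === null || typeof value !== "object" || seen.has(value)) {
    return value;
  }
  seen.add(value);
  for (const key of Reflect.ownKeys(value)) {
    deepFreeze(value[key], seen);
  }
  return Object.freeze(value);
}

const shallow = Object.freeze({ filter: { status: "published" } });
const deep = deepFreeze({ filter: { status: "published" }, tags: ["a"] });

console.log(Object.isFrozen(shallow.filter)); // false
console.log(Object.isFrozen(deep.filter));    // true
console.log(Object.isFrozen(deep.tags));      // true
```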

orestis 2021-04-23T15:11:55.057400Z

Right, so back to discipline 🙂

Yehonathan Sharvit 2021-04-23T15:12:17.057900Z

Yeah. But it’s much easier to catch during a PR

Yehonathan Sharvit 2021-04-23T15:12:38.058600Z

I imagine one could write a linter that checks that (js-kondo @borkdude?)

orestis 2021-04-23T15:12:44.058800Z

My opinion based on what I've seen in this codebase is that if things are possible, people will do it.

orestis 2021-04-23T15:13:12.059500Z

So any time you have a plain JS object, you cannot know that someone will not mutate it.

orestis 2021-04-23T15:13:35.060300Z

Perhaps the current team is disciplined and consistent. What about a 3rd-party library?

Yehonathan Sharvit 2021-04-23T15:13:43.060600Z

Unless you call Object.freeze

orestis 2021-04-23T15:14:01.061100Z

Well they will try to modify it and then it will throw at runtime, right? Marginally better but not ideal.
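
One detail worth knowing here: a write to a frozen object throws a `TypeError` only in strict mode (which includes ES modules and class bodies). In non-strict code the write is silently ignored. A small sketch of the strict case, with a hypothetical `tryToPublish` function:

```javascript
// Strict mode is enabled per function here so the behavior does not depend on
// how the surrounding file is loaded.
function tryToPublish(post) {
  "use strict";
  try {
    post.status = "published"; // write to a frozen object
    return "no error";
  } catch (err) {
    return err.name;
  }
}

const post = Object.freeze({ status: "draft" });
const outcome = tryToPublish(post);

console.log(outcome);     // TypeError
console.log(post.status); // draft
```

Either way the object stays unchanged, but only strict-mode callers find out that their write was rejected.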

orestis 2021-04-23T15:14:40.062Z

Using immutable.js actually is a proper API contract. The moment you leave immutable.js land (e.g. to use said 3rd-party library) you know you are entering the danger zone.

orestis 2021-04-23T15:15:06.062400Z

Which is the point of having this immutability baked in the language. There's no danger zone 🙂

orestis 2021-04-23T15:15:30.063Z

I need to run, thanks for giving me a soap box to vent my frustrations at this legacy codebase. Fortunately the transition to Clojure is going well 😄

Yehonathan Sharvit 2021-04-23T15:15:51.063300Z

Before you run, save this link for later https://github.com/tc39/proposal-record-tuple

Yehonathan Sharvit 2021-04-23T15:16:13.063900Z

One day JavaScript will have immutability at the level of the language

Yehonathan Sharvit 2021-04-23T15:16:22.064200Z

Thank you @orestis for sharing your insights

2021-04-23T15:21:52.065400Z

Defaults in a language matter. As someone mentioned above, you can be disciplined on a single project, if everyone agrees, to avoid mutability, but as team members change, the project grows, etc., it becomes very difficult to enforce over time.

2021-04-23T15:23:23.067200Z

I have worked on single-threaded large C code bases with fairly extensive data structures kept in memory between client requests, and it becomes fear-inducing to look at some code that is 5 levels deep in the function call tree, with 10 more levels beneath you, to have any kind of assurance which functions modify what, even in single-threaded code. Reasoning about correctness is very non-local -- you pretty much need to understand the whole code base in order to understand whether a change is correct (or whether the current code is correct)

Yehonathan Sharvit 2021-04-23T15:43:31.068200Z

Could you get into more details about why reasoning about a local function correctness is non-local when data is mutable?

Elliot Stern 2021-04-26T00:38:58.115900Z

var valid = validate(list);
foo(list);
bar(list);
var valid2 = validate(list);
// valid could be true and valid2 could be false
// it entirely depends on the implementation of foo and bar

Elliot Stern 2021-04-26T00:39:28.116100Z

By contrast, if list were immutable, you know that valid2 is true iff valid is true.

Elliot Stern 2021-04-26T00:40:40.116300Z

If you want to change list, it also has to be done explicitly, making reasoning about what the code is doing easier.
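
A runnable version of the snippet above, with made-up implementations of `validate`, `foo`, and `bar` (the originals are not shown in the thread, so these are assumptions chosen to trigger the problem):

```javascript
// Hypothetical implementations, for illustration only.
const validate = (list) => list.every((n) => n > 0);
const foo = (list) => { list.push(-1); }; // mutates its argument
const bar = (list) => list.length;        // only reads

const list = [1, 2, 3];
const valid = validate(list);
foo(list);
bar(list);
const valid2 = validate(list);
console.log(valid, valid2); // true false

// With a frozen list, a function that wants a changed list must build a new one.
const frozenList = Object.freeze([1, 2, 3]);
const fooPure = (list) => [...list, -1];
const valid3 = validate(frozenList);
const changed = fooPure(frozenList);
const valid4 = validate(frozenList);
console.log(valid3, valid4, validate(changed)); // true true false
```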

emccue 2021-04-23T15:47:09.068500Z

There are different kinds of locality

emccue 2021-04-23T15:47:51.069400Z

multithreading on the jvm means that "changes from under you" produces undefined behavior

emccue 2021-04-23T15:48:30.070100Z

but in a single threaded context there are still logical boundaries

emccue 2021-04-23T15:51:11.072800Z

// `execute` stands for whatever actually runs the query.
const execute_lazy = (query) => {
   return () => {
      return execute(query);
   };
}

const query_a = { select: '*', from: 'table' }
query_a['where'] = 'field > 0 && field < 100';
const results_a = execute_lazy(query_a);
console.log(results_a()); // runs before the next mutation
query_a['where'] = 'field > 100';
const results_b = execute_lazy(query_a);
console.log(results_b());

emccue 2021-04-23T15:51:18.073100Z

so this would work and produce no bugs

emccue 2021-04-23T15:51:44.073700Z

const query_a = { select: '*', from: 'table' }
query_a['where'] = 'field > 0 && field < 100';
const results_a = execute_lazy(query_a);
query_a['where'] = 'field > 100';
const results_b = execute_lazy(query_a);

// both closures hold the same object, so both see 'field > 100'
console.log(results_a());
console.log(results_b());

emccue 2021-04-23T15:51:50.073900Z

but this would not

emccue 2021-04-23T15:52:55.074700Z

anything that "stores" what it is given to refer to later is a potential boundary

emccue 2021-04-23T15:53:30.075Z

either closures or objects or wtvr
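
One way to defend such a boundary in plain JavaScript is to copy the data at the moment it is stored. A minimal sketch, with a stand-in `execute` that just builds a string (an assumption for illustration, not a real query runner):

```javascript
// Stand-in for a real query runner; hypothetical.
const execute = (query) =>
  `SELECT ${query.select} FROM ${query.from} WHERE ${query.where}`;

const executeLazy = (query) => {
  // Shallow copy taken at the boundary. It is enough here because every value
  // is a string; nested data would need a deep copy.
  const snapshot = { ...query };
  return () => execute(snapshot);
};

const query = { select: "*", from: "table", where: "field > 0 AND field < 100" };
const resultsA = executeLazy(query);
query.where = "field > 100";
const resultsB = executeLazy(query);

console.log(resultsA()); // SELECT * FROM table WHERE field > 0 AND field < 100
console.log(resultsB()); // SELECT * FROM table WHERE field > 100
```

This works only if every storing boundary remembers to copy, which is again a matter of discipline.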

emccue 2021-04-23T15:54:03.075600Z

and in node you still have concurrent processes, so they can share data

emccue 2021-04-23T15:54:59.076500Z

so say you have some piece of mutable data you put in a middleware shared between route handlers

emccue 2021-04-23T15:55:28.077300Z

state updates to that can cross the boundary into other "processes" when you await some request or whatever
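
A minimal sketch of that situation, with a hypothetical `shared` object standing in for the middleware state and a zero-delay timer standing in for an awaited request:

```javascript
// Hypothetical shared object, as a middleware might hand to every handler.
const shared = { currentUser: null };

// Stand-in for awaiting a database call or HTTP request.
const tick = () => new Promise((resolve) => setTimeout(resolve, 0));

async function handle(user) {
  shared.currentUser = user;
  await tick();              // other handlers get to run here
  return shared.currentUser; // may no longer be `user`
}

async function main() {
  // Two "requests" in flight at the same time on a single thread.
  return Promise.all([handle("alice"), handle("bob")]);
}

main().then((results) => console.log(results)); // [ 'bob', 'bob' ]
```

There are no threads involved: the first handler simply resumes after the second one has overwritten the shared field.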

Yehonathan Sharvit 2021-04-23T15:56:20.078300Z

Sounds very interesting @emccue. Unfortunately, I gotta run 😞. Keep writing and I’ll read and respond later

emccue 2021-04-23T15:57:06.078400Z

yes, it has readonly

Yehonathan Sharvit 2021-04-23T15:57:27.078600Z

Cool!

2021-04-23T16:02:40.081800Z

"Could you get into more details about why reasoning about a local function correctness is non-local when data is mutable?" Imagine you have some graph data structures with nodes and edges in memory, mutable, and a single-threaded program handling requests and updating that graph data structure. It has a particular schema, and it is big. The code for modifying that graph in memory is not in a single function. You have a call tree of C functions with a single top level entry point, but the full call tree is a decent size tree with up to 10 levels of calls deep. Some of those functions only read things in the graph, but a large fraction of those functions can insert nodes, add edges, or mutate existing nodes or edges. If you have a picture on the board or in your head of exactly which of those hundred or so functions modify exactly what, and under what conditions, you can reason about how a certain change to the code will behave. If you do not have that knowledge in your head, then you are not sure whether a change to one of those functions will violate assumptions in 1, 2, or 7 other functions in those hundred.

2021-04-23T16:03:38.082700Z

I mean, with a large enough code base and immutable data, you could potentially also create something where local reasoning breaks down, but it breaks down in different ways. Mutation increases the number of ways you can be wrong.

2021-04-23T16:05:05.083500Z

Immutability at the very least lets you answer this question very quickly and easily: "If I call function foo and pass it these parameters, will it mutate those parameters, or anything they reference?" because the answer is always "no".

2021-04-23T16:05:30.084100Z

In a program where mutation is common and expected, that question can be extremely difficult to answer correctly.

orestis 2021-04-23T17:32:07.085400Z

Looks like it’s not that strong of a guarantee https://basarat.gitbook.io/typescript/type-system/readonly

orestis 2021-04-23T17:32:56.086700Z

And of course as all Typescript, the guarantees go away at runtime. So again the 3rd party library story isn’t covered.

2021-04-23T17:38:33.087800Z

Even just a simple example like this breaks my brain

x == a; // true
// a changes here in some multi-thread environment
y == a; // true
x == y; // false...
err...

2021-04-23T17:40:52.088700Z

There's a nice section in, I think, The Joy of Clojure, where the author talks about equality and how you can't really have equality in an environment where you have concurrency and mutation.

2021-04-23T17:42:30.090600Z

At best all your equality statements need qualifiers, i.e. x and y were equal, where equal means they both have the same value within a specific period of time, but how do you define that period of time? What if your values have some sort of STM, do you have to qualify equality with something like x and y were equal within a certain time period, and we don't care if x or y were in the process of a transaction that would result in a value where they weren't equal?

2021-04-23T18:42:27.091300Z

I read an article on some proposed new programming language where they discussed ideas for equality, and proposed that equals on mutable values should be explicitly called something different that could be read "equals now"

2021-04-23T18:44:05.092200Z

Yes, it was this paper: https://www.researchgate.net/publication/310823923_The_left_hand_of_equals. They didn't advocate going all immutable in the end for their programming language, but I like the idea of calling something "equals now"

2021-04-23T18:44:44.092700Z

Baker's EGAL operation they call, to contrast it, "equals always", which is what equals on immutable values is.

2021-04-23T18:45:40.092800Z

Yes. I think that makes sense: if any two things are equal at any point in time then they are equal at all points in time.

raspasov 2021-04-23T23:37:39.094700Z

Mutability: everybody has a plan until it punches them in the face.

raspasov 2021-04-23T23:43:12.095400Z

;Start node CLJS REPL
;clj -Sdeps '{:deps {org.clojure/clojurescript {:mvn/version "RELEASE"}}}' -M -m cljs.repl.node

(defn mutable-danger-101 []
 (let [obj #js{:x 42}]


  (js/setTimeout
   (fn []
    (set! (.-x obj) :boom))
   (rand 1000))

  (js/setTimeout
   (fn []
    (println "What am I?" (.-x obj)))
   (rand 1000))))

(dotimes [i 100]
 (mutable-danger-101))

raspasov 2021-04-23T23:43:32.095800Z

This will randomly print either What am I? :boom … or … What am I? 42

raspasov 2021-04-23T23:46:27.099Z

Sorry to come off the high ropes like that, but to me this is the truth: If a person doesn’t understand the problem of the example above, they haven’t tried doing quality UI or backend development. I can only point them to the number of Rich Hickey talks out there; He explains the problems of mutability very well. I think that in order to really see the problem, you must have experienced the pain, and messed up a codebase 1+ time (while you really cared, and wanted to do good work).