beginners

Getting started with Clojure/ClojureScript? Welcome! Also try: https://ask.clojure.org. Check out resources at https://gist.github.com/yogthos/be323be0361c589570a6da4ccc85f58f.
2020-10-30T02:55:53.378Z

This clears it up well, thanks Sean!

seancorfield 2020-10-27T00:23:06.071800Z

(POST "/emails" [] (fn [req] ...) ; emails-handler as anon fn -- do you mean (fn [req] (emails-handler req)) so that there's an indirection in there? I would just do (POST "/emails" [] #'emails-handler) for that indirection.

👍 1
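
To see why the var gives the indirection, here is a minimal sketch in plain Clojure, with no Compojure involved. The handler bodies and the names direct and via-var are made up for illustration. The point is only that a var is looked up on every call, while a bare symbol hands over the function value it holds at that moment.

```clojure
;; Hypothetical handler, version 1.
(defn emails-handler [req] {:status 200 :body "v1"})

;; Captures the function object that emails-handler holds right now.
(def direct emails-handler)

;; Captures the var itself; invoking a var derefs it on each call.
(def via-var #'emails-handler)

;; Re-evaluate the defn, as you would at the REPL during development.
(defn emails-handler [req] {:status 200 :body "v2"})

(:body (direct {}))   ;; => "v1" (still the old function)
(:body (via-var {}))  ;; => "v2" (sees the redefinition)
```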
seancorfield 2020-10-27T00:23:13.072Z

^ @stopachka

Ben Sless 2020-10-27T00:38:49.072100Z

It's from the reitit doc but the idea is similar https://cljdoc.org/d/metosin/reitit/0.5.5/doc/advanced/dev-workflow

❤️ 1
Dave Nguyen 2020-10-27T03:20:16.074Z

Hi everyone, is there a framework or a set of libraries that can resemble next.js functionality in Clojurescript world?

2020-10-27T03:30:59.075500Z

that's probably a better question for #clojurescript but I think reagent is the most popular react wrapper, and last I heard luminus came closest to that pre-bundled out of the box experience (these are the things used in the excellent "web development in clojure" book, which is fairly up to date)

2020-10-27T03:31:48.076200Z

luminus doesn't do cljs / reagent / re-frame out of the box, but can be convinced to add those features via flags

Joel 2020-10-27T04:30:36.078Z

to start a repl with cursive or calva, it seems i need a deps.edn. I'm trying to tack on clojure to an existing maven project. is there a way to just have the deps.edn simply refer to the local pom.xml for dependencies, or do i need to copy them in to the deps.edn? More specifically I want other modules from the pom.xml accessible in the repl.

seancorfield 2020-10-27T04:44:04.079600Z

@joel380 I would sort of expect Cursive to know about pom.xml files, given that it's a Java IDE under the hood, but in general for Clojure projects, yes, you'll need a deps.edn that has the same :deps {...} content as the pom.xml file.

seancorfield 2020-10-27T04:44:44.080500Z

At least once you've done that, you can keep deps.edn as the source of truth for the project and re-sync the pom.xml <dependencies> from deps.edn via clojure -Spom.

👍 1
Joel 2020-10-27T04:48:14.082200Z

if cursive can figure out from pom.xml i'd prefer that... i'm having trouble figuring out how to specify a pom module in the deps.edn (specifying 3rd party no issue, but a submodule doesn't seem to work)

practicalli-john 2020-10-27T04:57:15.082300Z

If the maven project is a Java project (or other JVM library), I'd suggest creating a separate Clojure project and include the jar from the maven project as a dependency on the Clojure project.

seancorfield 2020-10-27T05:56:46.083100Z

Ah, yeah, you can't use BOM-style deps in Clojure -- you have to specify all the individual components directly.

pez 2020-10-27T07:03:15.085300Z

@joel380 if you can start the repl some whatever way, then you can connect to it with Calva. You'll need to provide the cider-nrepl dependencies to the running app to get things like peek definition and completions going.

phoenixjj 2020-10-27T07:54:43.087Z

(for [x *command-line-args*]
  (print x))
If I save the above two lines to pa.clj and run
$ clj pa.clj Clojure is even more fun
why does it not print anything to the console?

phoenixjj 2020-10-27T07:56:04.087900Z

if I put (println *command-line-args*) above the for, it prints the list of args.

vlaaad 2020-10-27T08:05:39.088200Z

for is lazy

vlaaad 2020-10-27T08:05:56.088600Z

use doseq instead
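
A small sketch of the difference, capturing output with with-out-str so the result is easy to inspect. The args vector stands in for *command-line-args* and is only an illustration.

```clojure
;; Stand-in for *command-line-args*.
(def args ["Clojure" "is" "even" "more" "fun"])

;; for only builds a lazy seq. Nothing consumes it here,
;; so print is never called and nothing is written.
(def lazy-output
  (with-out-str
    (for [x args] (print x))))

;; doseq runs its body eagerly, purely for side effects.
(def eager-output
  (with-out-str
    (doseq [x args] (print x))))

lazy-output   ;; => ""
eager-output  ;; => "Clojureisevenmorefun"
```

At the REPL the same for form does appear to print, because the REPL realizes the sequence in order to print its value. A script loaded from a file discards the value, so the lazy seq is never realized.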

phoenixjj 2020-10-27T08:42:04.089100Z

thanks @vlaaad.

solf 2020-10-27T09:56:47.091900Z

I have a map in a -> thread, with multiple assoc/updates. I want to assoc/update a key using the value of another key from the object being threaded. Is there an idiomatic way that keeps it thread-like? Here's an illustration:

(-> {:foo 10}
    (update :foo inc)
    ;; A better way to do this while keeping it inside the threading macro
    ((fn [m] (assoc m :bar (:foo m)))) 
    )
;; => {:foo 11, :bar 11} 

borkdude 2020-10-27T10:28:18.092300Z

@dromar56 I would maybe use as-> for this:

(as-> {:foo 10} $
  (update $ :foo inc)
  (assoc $ :bar (:foo $)))

borkdude 2020-10-27T10:28:48.092800Z

but probably I would just use let and not try to force everything in a thread. let over thread :)
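
For comparison, a minimal sketch of the let version, reusing the map from the question. The names m and result are arbitrary.

```clojure
;; Name the intermediate map, then refer to it twice.
(def result
  (let [m (update {:foo 10} :foo inc)]
    (assoc m :bar (:foo m))))

result  ;; => {:foo 11, :bar 11}
```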

vlaaad 2020-10-27T10:31:36.093500Z

I usually use as-> inside -> :

(-> {:foo 10}
    (update :foo inc)
    (as-> $ (assoc $ :bar (:foo $))))

solf 2020-10-27T10:34:35.093600Z

I think that's the "cleanest" way in my use case, thanks

Jim Newton 2020-10-27T10:52:41.095500Z

I have yet another question about semantics of Java classes and interfaces. The class clojure.lang.ISeq is an interface, and the class java.lang.Number is an abstract class. Does this mean it is possible to have an object which is both a sequence and a Number? If not, why? More precisely, according to refl/type-reflect we see that clojure.lang.ISeq has flags #{:interface :public :abstract} and we see that java.lang.Number has flags #{:public :abstract} . I am currently ignoring the fact that these are :public, does that matter?

bronsa 2020-10-27T11:08:51.096100Z

> Does this mean it is possible to have an object which is both a sequence and a Number
yes

bronsa 2020-10-27T11:10:49.096700Z

user=> (def x (proxy [java.lang.Number clojure.lang.ISeq] []))
#'user/x
user=> (number? x)
true
user=> (seq? x)
true

2020-10-27T11:24:26.097600Z

Possible surprises me, as it is definitely unusual. But apparently there are no JVM rules against it. Interesting.

bronsa 2020-10-27T11:30:41.098400Z

definitely unusual, not completely unreasonable. but if you ever needed to do something like this, you'd make it a seqable number, not a seq and a number

bronsa 2020-10-27T11:30:53.098700Z

(i.e. implement clojure.lang.Seqable not clojure.lang.ISeq )
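
A rough sketch of what a seqable number could look like, using proxy as in the example above. The name digit-number is hypothetical, and the digit logic assumes a non-negative integer. It is only meant to show the Seqable route, not a recommended design.

```clojure
;; Hypothetical constructor: wraps a non-negative integer n.
(defn digit-number [n]
  (proxy [Number clojure.lang.Seqable] []
    ;; The four abstract methods of java.lang.Number.
    (intValue [] (int n))
    (longValue [] (long n))
    (floatValue [] (float n))
    (doubleValue [] (double n))
    ;; Seqable: expose the decimal digits as a seq.
    (seq [] (seq (map #(- (int %) (int \0)) (str n))))))

(def x (digit-number 123))

(number? x)     ;; => true
(seq? x)        ;; => false, it is not itself an ISeq
(seqable? x)    ;; => true
(seq x)         ;; => (1 2 3)
(.longValue x)  ;; => 123
```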

2020-10-27T11:32:06.099900Z

Sure, like a JVM String, making a number seqable would perhaps make sense in order to get a seq of its decimal digits, or something like that.

bronsa 2020-10-27T11:32:12.100100Z

yeah

bronsa 2020-10-27T11:32:36.100600Z

or the number as a list of bits

bronsa 2020-10-27T11:32:47.100900Z

or the number as a range

bronsa 2020-10-27T11:33:03.101200Z

not that I would ever do it like this

bronsa 2020-10-27T11:33:15.101500Z

but I've seen it done in the wild (not in clojure tho)

Jim Newton 2020-10-27T11:58:46.102100Z

I was thinking that the justification might be that someone could create a number which is a finite sequence of digits or of bits.

2020-10-27T12:10:42.106100Z

That might be a reason someone would want to do it, but I suspect that there are very few reasons that the JVM disallows a class from implementing two JVM interfaces, or disallows extending a JVM abstract class and also implementing any number of JVM interfaces, whether doing so in any particular case makes sense to a person or not.

vlaaad 2020-10-27T12:11:14.106400Z

you can represent average as both number and sequence

vlaaad 2020-10-27T12:12:02.107200Z

e.g. “loss-less” average produced from n numbers, which is a number with metadata about its source

2020-10-27T12:14:32.107300Z

Here is an article I found from a quick Google search, since I don't feel like reading the Java Language Spec to see what it says on which combinations are legal, and which are disallowed by the JVM: https://www.geeksforgeeks.org/two-interfaces-methods-signature-different-return-types/

bronsa 2020-10-27T12:14:48.107800Z

> the JVM disallows a class from implementing two JVM interfaces

bronsa 2020-10-27T12:14:59.108100Z

@andy.fingerhut the only thing that the JVM disallows is extending multiple classes

bronsa 2020-10-27T12:15:04.108300Z

you can implement as many interfaces as you want

2020-10-27T12:15:57.109300Z

I put in a thread to my previous message one article showing a case where trying to implement two interfaces with same method name but different return value type seems not to be allowed, so there are at least a few rules restricting some kinds of combinations, if that article is correct.

bronsa 2020-10-27T12:16:45.110Z

right but that doesn't have much to do with the class implementing two interfaces

bronsa 2020-10-27T12:17:17.110700Z

it has to do with the JVM disallowing a class to implement methods with polymorphic return types

2020-10-27T12:17:21.110900Z

agreed that absent such kinds of corner cases, a class can implement an arbitrary number of interfaces

Jim Newton 2020-10-27T12:25:40.116700Z

if you're willing to accept numbers as ranges, then it would also make sense to consider matrices and vectors as numbers. However, it seems this is not the intent of java.lang.Number . Rather, a number seems to be something which implements the following members. In particular, numbers have an integer value and a double value, which a range does not, but a bit stream representing the bit representation does. It appears from this definition that a bignum would not qualify as a number.

{:bases #{java.lang.Object java.io.Serializable},
 :flags #{:public :abstract},
 :members
 #{{:name byteValue,
    :return-type byte,
    :declaring-class java.lang.Number,
    :parameter-types [],
    :exception-types [],
    :flags #{:public}}
   {:name java.lang.Number,
    :declaring-class java.lang.Number,
    :parameter-types [],
    :exception-types [],
    :flags #{:public}}
   {:name floatValue,
    :return-type float,
    :declaring-class java.lang.Number,
    :parameter-types [],
    :exception-types [],
    :flags #{:public :abstract}}
   {:name longValue,
    :return-type long,
    :declaring-class java.lang.Number,
    :parameter-types [],
    :exception-types [],
    :flags #{:public :abstract}}
   {:name shortValue,
    :return-type short,
    :declaring-class java.lang.Number,
    :parameter-types [],
    :exception-types [],
    :flags #{:public}}
   {:name serialVersionUID,
    :type long,
    :declaring-class java.lang.Number,
    :flags #{:private :static :final}}
   {:name intValue,
    :return-type int,
    :declaring-class java.lang.Number,
    :parameter-types [],
    :exception-types [],
    :flags #{:public :abstract}}
   {:name doubleValue,
    :return-type double,
    :declaring-class java.lang.Number,
    :parameter-types [],
    :exception-types [],
    :flags #{:public :abstract}}}}

Eric Ihli 2020-10-27T12:27:36.119Z

I have some data that's 144mb in EDN. It's taking a significant amount of time to serialize/deserialize (60+ seconds as EDN). I'm trying Nippy now. Nippy appears to have serialized it just fine with nippy/freeze-to-file, but when I try (nippy/thaw-from-file) I get an error: GC overhead limit exceeded. It also churns 100% CPU for over a minute before throwing that error. I thought Nippy was faster to deserialize than EDN, so this is surprising. My assumptions are that 144mb is not an unreasonable size and Nippy should be faster than EDN and it shouldn't throw this error. Are my assumptions wrong? If my assumptions are right, any thoughts on how to troubleshoot?

2020-10-27T12:28:43.120Z

For what reason do you say that it appears from this definition that a bignum would not qualify as a number?

Jim Newton 2020-10-27T12:30:36.121300Z

w.r.t. a class implementing two different interfaces, is it possible to look at the members of two interfaces in search of contradictory type definitions? For example, if one interface requires member foo to have type int and another interface requires a member of the same name to have type double, does that mean it is impossible to have a class which lists both interfaces in its ancestors set?

Jim Newton 2020-10-28T12:20:05.227700Z

I started looking into this question of whether interfaces are disjoint. and now I don't know how to figure out whether two members are the same.

Jim Newton 2020-10-28T12:25:19.228Z

apparently a java class (or interface) can have multiple members of the same name. So when two different interfaces both have a method of the same name, there might be a conflict or not. Is it the arity that determines whether two members are the same or different, or is it both the arity and somehow the parameter types? For example, java.util.List has many methods named of, three of which are:

{:name of,
  :return-type java.util.List,
  :declaring-class java.util.List,
  :parameter-types [],
  :exception-types [],
  :flags #{:public :static}}
 {:name of,
  :return-type java.util.List,
  :declaring-class java.util.List,
  :parameter-types [java.lang.Object<>],
  :exception-types [],
  :flags #{:varargs :public :static}}
 {:name of,
  :return-type java.util.List,
  :declaring-class java.util.List,
  :parameter-types
  [java.lang.Object
   java.lang.Object
   java.lang.Object
   java.lang.Object
   java.lang.Object],
  :exception-types [],
  :flags #{:public :static}}
if a second interface has a member named of with different arity, I suppose it is NOT a conflict. If it has the same arity and same types, I suppose it IS a conflict. But if it has the same arity, and related types, such as types which are all subtypes or all supertypes, is that allowed?

2020-10-28T13:26:42.241Z

I don't know the precise rules for conflict off the top of my head, but I suspect that even methods with the same name will not arise very often in interfaces defined in Clojure's implementation, since those interfaces were all created by a single author.

2020-10-28T13:27:46.241200Z

I'd recommend for now throwing an exception, or returning some kind of "unknown-answer" result, if you actually come across two interfaces where you don't already know the rule.

2020-10-28T13:34:58.241600Z

It is arity and the parameter types, but I don't know the rule Java uses for distinguishing parameter types when they are both classes or interfaces.

2020-10-28T14:14:22.243500Z

You could also ask in the #java channel here on Clojurians Slack, and someone else may have a good reference to an official answer, or know it.

Jim Newton 2020-10-28T14:37:11.244700Z

I experimented with one of my students. It seems that if two interfaces have a method with the same name and the same input parameter types, it is an error, regardless of the return types. That's pretty easy to check.

Jim Newton 2020-10-28T14:38:06.244900Z

e.g. interface I1 can have method m(int) and interface I2 can have method m(double) and there's no conflict.

Jim Newton 2020-10-28T14:39:47.245100Z

but if they both have a method m(int) there's a conflict

Jim Newton 2020-10-28T14:43:43.245300Z

cool! I didn't know there was a #java channel on Clojurians

2020-10-27T12:31:52.121400Z

I posted a message recently above, linking to an article that seems to say "yes" to your question: https://www.geeksforgeeks.org/two-interfaces-methods-signature-different-return-types/

2020-10-27T12:32:30.122Z

For what reason do you say above: "It appears from this definition that a bignum would not qualify as a number."

Jim Newton 2020-10-27T12:33:09.122100Z

@andy.fingerhut I'm just trying to understand, but it seems a bignum cannot have a member intValue of type int , as the largest int is 2^31 - 1, whereas the largest bignum is larger than that.

2020-10-27T12:33:31.122300Z

Those methods are allowed to return useless values in such cases.

2020-10-27T12:33:42.122500Z

or, e.g. the least significant N bits of the bignum value
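
A quick sketch of that narrowing behaviour with java.math.BigInteger. The value 2^32 + 257 is chosen only so that each narrower view is easy to read.

```clojure
;; 2^32 + 257 = 4294967553, too large for a 32-bit int.
(def big (biginteger 4294967553))

(.longValue big)  ;; => 4294967553, still fits in a 64-bit long
(.intValue big)   ;; => 257, the low-order 32 bits
(.byteValue big)  ;; => 1, the low-order 8 bits
```

So the value still qualifies as a java.lang.Number; the narrower accessors simply lose information.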

2020-10-27T12:35:47.122700Z

In most cases I know of where they can lose information, they do so in a fashion that is at least semi-useful.

Jim Newton 2020-10-27T12:36:18.122900Z

hmmm. useless values, I laughed when I read that. 😉 I wouldn't have guessed that to be the case

2020-10-27T12:36:41.123100Z

It wasn't a carefully thought out answer -- the later ones are closer to my opinion 🙂

2020-10-27T12:37:53.123300Z

If those methods cannot lose information, then the only type that can implement all of those methods at the same time is an 8-bit integer, because byteValue is in the list of methods.

2020-10-27T12:41:41.123500Z

There are at least a few syntactic rules that the JVM enforces about which classes can be sub-classes of others, and which combinations of interfaces that a class can implement, but except for those, the language itself makes no restrictions. That allows a lot of combinations that a person might never think to do, because they have little or no utility.

2020-10-27T12:42:23.123700Z

Trying to look at method names and determining from the method signatures alone which combinations are allowed and which are not is going on too little information.

2020-10-27T12:48:05.123900Z

I haven't used Nippy, so don't have any information for you specific to that library, unfortunately. Creating an issue on its Github repository, and/or looking through existing issues, might glean some useful information.

2020-10-27T12:49:24.124100Z

You can try increasing your JVM's max heap size with a java command line option such as -Xmx1g to see if that lets it go through for you, but if it is a bug where the library is trying to allocate an unlimited amount of memory, it would only lead to GC overhead limit exceeded later, rather than earlier. Still, it might be a useful bit of information in your investigation.

2020-10-27T12:51:01.124300Z

There is a commercial tool called YourKit Java Profiler that has a 15-day free trial version, and I believe free licenses for use on open source projects, that can attach to a JVM started with a couple of extra command line options it gives you, that can analyze the set of allocated objects in various ways, e.g. by class, sorted by most memory occupied, that might help. There are free tools for doing that, too.

Eric Ihli 2020-10-27T13:06:19.124500Z

Thanks for the ideas. I think I found it in a closed issue from last year. https://github.com/ptaoussanis/nippy/issues/117

sova-soars-the-sora 2020-10-27T13:51:50.125400Z

super duper om computer

sova-soars-the-sora 2020-10-27T14:01:29.125700Z

Yeah my first guess is what andy said -- increase the jvm heap size via the command line option or in your project.clj ... but I don't know .. maybe Nippy is not paging data ... but isn't that supposed to be the job of the JVM?

Eric Ihli 2020-10-27T14:53:43.126200Z

Hmm. I don't think the heap size is the problem. It might be "a" problem. But it's not "the" problem. The EDN file is ~144 mb and gets processed by read-string in about a minute. Nippy is supposedly faster. Nippy serializes the EDN in about 10 seconds to about 80 mb. That all seems reasonable. It just blows up when deserializing. Maybe something specific about the type of data. I'll open an issue.

2020-10-27T15:05:23.126700Z

It looks like some Java developers run into this restriction when they have actual desires to have a single class implement multiple interfaces with conflicting method signatures, and find workarounds for that: https://stackoverflow.com/questions/2598009/java-method-name-collision-in-interface-implementation

2020-10-27T15:06:02.127Z

That StackOverflow answer shows that the C# language has a way to disambiguate which method you are implementing, by also supplying the interface name in the definition of the method. Java does not have that.

Eric Ihli 2020-10-27T15:52:04.127500Z

I tried a few more things trying to make this easily reproducible and I think I've narrowed it down to something specific about the 140mb of edn data that I have. If I generate similar-looking data, I don't experience any issues. https://github.com/ptaoussanis/nippy/issues/136

Jim Newton 2020-10-27T17:31:33.129100Z

Actually the question I'm trying to solve is given two java classes (which might be :abstract, :interface, or :final) decide whether the set of all objects of one class intersects with the set of all objects of the other class. For example the set of Serializables intersects the set of Comparables. However the set of Strings is disjoint from the set of Integers.

Jim Newton 2020-10-27T17:32:25.129300Z

Currently my rule is that if neither class is final, and neither is explicitly Object, then they are not disjoint.

Jim Newton 2020-10-27T17:32:57.129500Z

But in light of this recent discussion, there are times when two interface classes are necessarily disjoint.

2020-10-27T17:44:26.129700Z

If you ignore Java interfaces, then the answer is true if one class is a sub-class of the other (either directly, or through a chain of sub-class relationship).

2020-10-27T17:46:02.129900Z

If you are trying to answer the question "given two java classes, is there an interface that they both implement?", that is also pretty straightforward to answer, but I doubt that is the question you are trying to answer.

Jim Newton 2020-10-27T17:47:12.130100Z

yes if 1 class is a subclass of another they are not disjoint, that's evident

2020-10-27T17:47:39.130300Z

I should mention that in the JVM, an interface shows up in the output of the Clojure reflection API (a sample of which you copied and pasted above) with the keyword :interface in its flags. In Java source code and documentation, there is usually a sharper distinction made between Java classes versus Java interfaces than that.

2020-10-27T17:49:31.130500Z

That is, for something that has the :interface keyword in it, most people familiar with Java would never call that a class. They would call it an interface.

Jim Newton 2020-10-27T17:49:32.130700Z

no, the question is: given two classes, are they disjoint? where classes includes any non-nil value which find-class returns, i.e. anything for which class? returns true.

2020-10-27T17:51:14.130900Z

If you want to know for two interfaces, are there any two JVM classes that both implement that interface, then the only way to check that, that I know of, is to somehow iterate over all class definitions and check whether there are two of them that both implement that interface.

2020-10-27T17:51:51.131100Z

The JVM is dynamic enough in creation of classes and interfaces at run time, that the answer can change over time. e.g. Clojure's compiler creates new classes when you eval a defn form, and some others.

Jim Newton 2020-10-27T17:52:28.131300Z

By class I mean any value for which class? returns true

2020-10-27T17:52:45.131500Z

There are libraries I have searched for, and found, that attempt to iterate through all classes currently defined in a running JVM, but there are many caveats in how complete they can be.

2020-10-27T17:53:27.131800Z

class? returns true for things that most people call Java interfaces, as well as things most people call Java classes.

2020-10-27T17:53:45.132Z

classes always form a tree structure of sub-class relationships, with java.lang.Object at the root.

Jim Newton 2020-10-27T17:53:48.132200Z

exactly. I am calling those classes, in keeping with the class? function

Jim Newton 2020-10-27T17:54:54.132500Z

what does java call the set of all classes union with the set of all interfaces?

2020-10-27T17:55:05.132700Z

I don't know if that has a name.

2020-10-27T17:55:20.132900Z

Every JVM object has one and only one class, determined when it is created.

2020-10-27T17:55:31.133100Z

Such a class is never a Java interface.
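
A short sketch of that distinction at the REPL. Only core functions and java.lang.Class methods are used; the name list-class is arbitrary.

```clojure
;; class? is true for interfaces and for ordinary classes alike.
(class? clojure.lang.ISeq)        ;; => true
(class? Number)                   ;; => true

;; java.lang.Class can tell the two apart.
(.isInterface clojure.lang.ISeq)  ;; => true
(.isInterface Number)             ;; => false

;; The class of an actual object is never an interface,
;; even though the object is an instance of one.
(def ^Class list-class (class '(1 2 3)))

(.isInterface list-class)               ;; => false
(instance? clojure.lang.ISeq '(1 2 3))  ;; => true
```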

Jim Newton 2020-10-27T17:55:40.133300Z

for my purposes I don't care about the distinction. But to be sure, anytime I talk to a java person they get upset about this.

2020-10-27T17:55:57.133500Z

I am not upset. I am trying to determine if the question you are trying to answer makes sense or not.

2020-10-27T17:56:13.133700Z

It might not be a well-formed question.

Jim Newton 2020-10-27T17:56:14.133900Z

😉 yes you're cool headed.

Jim Newton 2020-10-27T17:56:59.134100Z

but the people in my office always start talking about the differences in the two, and they've never convinced me that it is something I need to care about.

2020-10-27T17:57:34.134300Z

You said the question you are trying to answer is this: "Actually the question I'm trying to solve is given two java classes (which might be :abstract, :interface, or :final) decide whether the set of all objects of one class intersects with the set of all objects of the other class."

Jim Newton 2020-10-27T17:57:36.134500Z

Ok. let me test a definition on you.

2020-10-27T17:58:03.134700Z

Suppose I pick two Java interfaces, I1, and I2, as the Java classes you want to ask and answer that question about.

2020-10-27T17:58:44.134900Z

So the more restricted question becomes: "given two java interfaces I1 and I2, decide whether the set of all objects of one class intersects with the set of all objects of the other class"

2020-10-27T17:59:13.135100Z

What do you mean by "the set of all objects of class I1"? Do you mean the set of all objects whose (non-interface) class X implements I1?

Jim Newton 2020-10-27T17:59:24.135300Z

now set s1 = the set of all objects, whose class has I1 in its ancestors list union with the singleton set containing the class itself.

Jim Newton 2020-10-27T17:59:41.135500Z

now set s2 = the set of all objects, whose class has I2 in its ancestors list union with the singleton set containing the class itself.

Jim Newton 2020-10-27T17:59:59.135700Z

is s1 guaranteed to be disjoint from s2 ?

Jim Newton 2020-10-27T18:00:32.136100Z

or is it possible that those sets have a non-empty intersection?

2020-10-27T18:00:52.136300Z

So here is how I might restate that question, which seems like it might have the same answer as your question. Given interface I1, are there two Java classes C1 and C2, such that C1 implements I1, and C2 implements I1?

Jim Newton 2020-10-27T18:01:05.136500Z

Currently my code concludes that if I1 and I2 are interfaces, then it is not guaranteed that s1 is disjoint from s2.

Jim Newton 2020-10-27T18:01:35.136700Z

no, that's not the same question, but it is indeed similar.

2020-10-27T18:01:36.136900Z

OK, my question was not as close to yours as I thought. Let me try again.

Jim Newton 2020-10-27T18:01:49.137100Z

can I restate your question back to you?

2020-10-27T18:02:22.137300Z

I am pretty sure now my question is different than yours, but you are welcome to go ahead.

Jim Newton 2020-10-27T18:02:37.137500Z

change "are there two java classes" to "are there guaranteed to never be two java classes"

Jim Newton 2020-10-27T18:02:46.137700Z

either now or at some point in the future

2020-10-27T18:03:27.137900Z

It is always possible to define new interfaces and new classes at run time in the JVM, barring some kind of JVM security manager restrictions or running out of memory. Definitely not the common case if you are running Clojure.

2020-10-27T18:03:50.138100Z

So it depends upon what you are willing to allow happening in the future.

Jim Newton 2020-10-27T18:04:10.138300Z

every time I define a clojure Record I define a new java class, right?

Jim Newton 2020-10-27T18:04:18.138500Z

at run time.
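
A minimal check of that: defrecord does generate a new Java class, and so does defn. Point and square are made-up names for illustration.

```clojure
;; defrecord generates a new class named after the record.
(defrecord Point [x y])

(class? Point)                   ;; => true
(= Point (class (->Point 1 2)))  ;; => true

;; defn compiles the function into its own class as well.
(defn square [n] (* n n))

(class? (class square))              ;; => true
(instance? clojure.lang.IFn square)  ;; => true
```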

2020-10-27T18:04:29.138700Z

If you are allowing the future to define new classes, except under fairly unusual circumstances of two interfaces having very similar incompatible method signatures, it is always possible to define a new class that implements both interfaces.

Jim Newton 2020-10-27T18:05:18.138900Z

so my code is currently supposing that it cannot guarantee interfaces are disjoint.

2020-10-27T18:05:23.139100Z

whether it makes sense to a human to productively do so is an entirely separate question

Jim Newton 2020-10-27T18:06:18.139300Z

The new information from our discussion today leads me to believe that I can tighten that restriction. If two interfaces declare the same member with two different?/incompatible? types, then we know the interfaces are disjoint regardless of future events.

Jim Newton 2020-10-27T18:06:56.139500Z

indeed, i'm not asking what is useful, I'm asking what is possible

Jim Newton 2020-10-27T18:07:10.139700Z

my program does not judge the usefulness of your application

2020-10-27T18:07:52.140100Z

So continuing in my habit of distinguishing classes vs. interfaces, you cannot guarantee interfaces are disjoint. Similarly, if you allow the future to define new sub-classes C2 of a given class C1, you cannot guarantee that an interface I1 and a class C1 are disjoint, unless C1 is final (in which case sub-classes of C1 cannot be created).
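
The rules discussed so far could be sketched roughly as below. This is not Jim's actual code: the names final-class?, related? and provably-disjoint? are hypothetical. The third rule (two unrelated non-interface classes) is an extra assumption drawn from the earlier remark that a class can extend only one class. The sketch ignores primitives, arrays, and the conflicting-method-signature case for interfaces.

```clojure
(import 'java.lang.reflect.Modifier)

(defn final-class? [^Class c]
  (Modifier/isFinal (.getModifiers c)))

;; True when one class or interface is a supertype of the other.
(defn related? [^Class a ^Class b]
  (or (.isAssignableFrom a b)
      (.isAssignableFrom b a)))

;; Returns true only when no object can ever be an instance of both.
;; Returns false when an overlap exists or cannot be ruled out.
(defn provably-disjoint? [^Class a ^Class b]
  (cond
    ;; A subtype relationship means the sets already overlap.
    (related? a b) false
    ;; A final class has no subclasses, so nothing new can bridge them.
    (or (final-class? a) (final-class? b)) true
    ;; Assumption: two unrelated non-interface classes can never share
    ;; a subclass, because a class extends exactly one class.
    (and (not (.isInterface a)) (not (.isInterface b))) true
    ;; At least one side is an interface and neither side is final.
    :else false))

(provably-disjoint? String Integer)                   ;; => true
(provably-disjoint? java.io.Serializable Comparable)  ;; => false
(provably-disjoint? Number clojure.lang.ISeq)         ;; => false
(provably-disjoint? String clojure.lang.ISeq)         ;; => true
```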

Jim Newton 2020-10-27T18:08:39.140300Z

agree. however, Object IS FINAL

2020-10-27T18:08:56.140500Z

It isn't clear to me what the use of knowing this disjointness property of two Java classes/interfaces is, but I imagine you have ideas on that.

Jim Newton 2020-10-27T18:08:57.140700Z

I'm not sure if that's a one-off or whether there are other final classes we can inherit from.

2020-10-27T18:09:15.140900Z

The class java.lang.Object is final?

Jim Newton 2020-10-27T18:09:43.141100Z

correct me if I'm wrong

2020-10-27T18:10:10.141300Z

If it were, then no other classes could be created that are sub-classes of it, and the class hierarchy would consist of only that class java.lang.Object

2020-10-27T18:10:44.141500Z

unless I am completely misremembering the meaning of final as it applies to Java classes...

Jim Newton 2020-10-27T18:11:02.141700Z

I think you're right. it is :public but not :final and not :abstract. I was remembering wrong

Jim Newton 2020-10-27T18:11:57.141900Z

clojure-rte.rte-core> (:flags (refl/type-reflect Object))
#{:public}
clojure-rte.rte-core> 

2020-10-27T18:12:20.142100Z

no worries. If it is any consolation, I was quite surprised to learn that the Clojure and Java reflection APIs treat interfaces as classes with an extra :interface attribute.

2020-10-27T18:12:35.142300Z

(years ago, when I first encountered that fact)

Jim Newton 2020-10-27T18:12:38.142500Z

I may have to revisit my code to see what I'm doing with classes which are not :final, and not :abstract, and not :interface

2020-10-27T18:13:18.142800Z

Is it easy to explain why you want to prove whether two classes are disjoint?

Jim Newton 2020-10-27T18:14:50.143Z

let me try in one paragraph.

2020-10-27T18:15:02.143200Z

Given how common it is for Clojure functions to care more about the interfaces that an argument or return value implements, rather than its Java class (often called the concrete class in discussion you and I have had in the past couple of days with others), as soon as you associate a Clojure parameter or return value as "should implement this interface", it becomes possibly-overlapping with almost every other object that implements another interface.

timo 2020-10-27T18:16:04.144300Z

I am having a project using Clojure but there is Java in it as well. How do I compile the java with clojure tools?

Jim Newton 2020-10-27T18:16:14.144400Z

Imagine you have a regular expression, not of characters but of types. e.g., (:* Number) means a sequence of 0 or more numbers.

2020-10-27T18:16:34.144900Z

And actually, my earlier statement about two classes being related as either one is a subclass of the other, or not, if they are not final, then a future sub-class of both of them could implement a common interface, so they cannot be guaranteed disjoint either

Jim Newton 2020-10-27T18:16:54.145500Z

(:* (:cat Number String)) is a sequence of alternating Number followed by String. 0 or more of them

2020-10-27T18:17:01.145800Z

So the only things you could ever guarantee disjoint are two final classes that don't implement a common interface I1, or maybe a handful of other cases.

seancorfield 2020-10-27T18:17:12.146200Z

@timok Are you using Leiningen or Clojure CLI? If the latter, the answer is: compile the Java manually, put the classes on your classpath. Or put the Java in its own project and build a lib JAR you can depend on from Clojure.

2020-10-27T18:17:28.146600Z

stopping talking now, and reading instead, sorry .... 🙂

seancorfield 2020-10-27T18:17:40.147Z

Leiningen has a facility to compile Java code and incorporate it into your Clojure project.

timo 2020-10-27T18:17:48.147400Z

mvn compile then?

Jim Newton 2020-10-27T18:17:59.147900Z

I think any two final classes are disjoint. There is never an object (as I understand) which is an instance of two different final classes. correct me if I'm wrong.

seancorfield 2020-10-27T18:18:00.148100Z

(but it shells out to javac so it's not exactly magic)

timo 2020-10-27T18:18:02.148200Z

Correct, it is clojure cli tools

2020-10-27T18:18:23.148400Z

but if those two final classes both implement interface I1, do you consider them disjoint, or not?

timo 2020-10-27T18:18:35.148900Z

I mean clojure cli tools create a pom anyway so I use maven correct?

Jim Newton 2020-10-27T18:19:06.149100Z

they are disjoint. The set of objects of class-a is disjoint from the set of objects of class-b, even if the classes share an interface.

Jim Newton 2020-10-27T18:19:23.149300Z

if class-a and class-b are both final

Jim Newton 2020-10-27T18:19:33.149500Z

and class-a != class-b
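The rule stated above (two distinct final classes share no instances) can be sketched as a small check. The helper names `final-class?` and `disjoint-final-classes?` are hypothetical, and the check is deliberately conservative: a false result means "not proven disjoint by this rule", not "overlapping".

```clojure
(import 'java.lang.reflect.Modifier)

(defn final-class?
  "True when class c is declared final."
  [^Class c]
  (Modifier/isFinal (.getModifiers c)))

(defn disjoint-final-classes?
  "Hypothetical helper: true only for two distinct final classes."
  [^Class a ^Class b]
  (and (not= a b) (final-class? a) (final-class? b)))

(disjoint-final-classes? String Long)    ; true: both final, different classes
(disjoint-final-classes? String String)  ; false: same class
(disjoint-final-classes? Number Long)    ; false: Number is not final
```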

Jim Newton 2020-10-27T18:20:00.149700Z

note that String = java.lang.String

2020-10-27T18:20:13.149900Z

ok

2020-10-27T18:20:16.150100Z

sure

Jim Newton 2020-10-27T18:20:18.150300Z

back to the discussion

Jim Newton 2020-10-27T18:21:00.150500Z

(:* (:cat Number (:or String (:not Long))))

Jim Newton 2020-10-27T18:21:30.150700Z

represents the set of sequences of Number followed by either a String or a Number which is not a Long.

Jim Newton 2020-10-27T18:21:32.150900Z

etc etc etc.

Jim Newton 2020-10-27T18:21:44.151100Z

These can be combined ad infinitum.

2020-10-27T18:21:44.151300Z

So Clojure spec has similar regex-of-objects-satisfying-predicates kind of capabilities, so this looks familiar in that sense
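For comparison, a rough clojure.spec analogue of the `(:* (:cat Number String))` example. This is a sketch using spec's regex ops, not the RTE library being discussed; the spec name `::num-str-pairs` is made up, and `number?` stands in for the Java class `Number`.

```clojure
(require '[clojure.spec.alpha :as s])

;; Zero or more Number/String pairs, analogous to (:* (:cat Number String)).
(s/def ::num-str-pairs
  (s/* (s/cat :n number? :s string?)))

(s/valid? ::num-str-pairs [1 "a" 2.5 "b"]) ; true
(s/valid? ::num-str-pairs [])              ; true: zero repetitions
(s/valid? ::num-str-pairs [1 "a" 2])       ; false: trailing number has no string
```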

Jim Newton 2020-10-27T18:22:26.151500Z

the so-called regular type expression can be represented as a deterministic finite automaton, whose transitions are labeled with disjoint types.

2020-10-27T18:22:44.151700Z

It is only trying to do dynamic checks so far, no kinds of static analysis that I am aware of, although a few people have investigated in that direction.

2020-10-27T18:23:02.151900Z

OK, I see where the disjointness guarantee can come in there

Jim Newton 2020-10-27T18:23:09.152100Z

if any state in the finite automaton has two transitions labeled with types which fail to be disjoint, then the automaton fails to be deterministic

2020-10-27T18:23:20.152300Z

If you want to avoid nondeterministic finite automata

Jim Newton 2020-10-27T18:23:35.152500Z

if the automaton is deterministic then we can answer lots of questions which we cannot answer if it is non-deterministic.

Jim Newton 2020-10-27T18:24:10.152700Z

For example, we can ask is there a sequence which is recognized by two different automata

2020-10-27T18:24:41.152900Z

I mean, every non-deterministic finite automaton can be transformed into a deterministic one, but since there is a potential exponential size blowup in the number of states, that fact is often not a practically useful one.

Jim Newton 2020-10-27T18:24:49.153100Z

also given a sequence, we can ask "does it match the regular-type-expression" and we can perform the computation in LINEAR time WITHOUT backtracking
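A minimal sketch of linear-time matching without backtracking, using a hand-written deterministic automaton for `(:* (:cat Number String))`. The map layout and the names `dfa`, `step`, and `matches?` are assumptions for illustration; the actual RTE implementation discussed here is not shown in the chat.

```clojure
;; Each state maps to a vector of [class next-state] transitions.
;; The classes leaving any one state must be disjoint for this to be deterministic.
(def dfa
  {:start       0
   :accepting   #{0}
   :transitions {0 [[Number 1]]
                 1 [[String 0]]}})

(defn step
  "Returns the next state for input x, or nil when no transition applies."
  [dfa state x]
  (some (fn [[cls next-state]] (when (instance? cls x) next-state))
        (get-in dfa [:transitions state])))

(defn matches?
  "Consumes xs once, left to right; each element is examined exactly one time."
  [dfa xs]
  (let [final (reduce (fn [state x]
                        (if-some [next-state (step dfa state x)]
                          next-state
                          (reduced ::fail)))
                      (:start dfa)
                      xs)]
    (contains? (:accepting dfa) final)))

(matches? dfa [1 "a" 2.0 "b"]) ; true
(matches? dfa [1 2])           ; false
```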

Jim Newton 2020-10-27T18:25:22.153300Z

correction: every finite automaton over a finite alphabet.

Jim Newton 2020-10-27T18:25:34.153500Z

the set of types is infinite.

2020-10-27T18:26:09.153700Z

sure. I often forget there are people studying infinite alphabets, as they seem a bit fairy-tale in practical uses, but I suppose in this case their use seems a bit more realistic.

Jim Newton 2020-10-27T18:27:00.153900Z

the set of clojure objects is an infinite set. so if you wanted to make a finite automaton to recognize arbitrary sequences you'd need an infinite alphabet and consequently infinitely many transitions.

Jim Newton 2020-10-27T18:27:35.154100Z

my research handles a special case of infinite alphabets,

2020-10-27T18:27:44.154300Z

I know regular expressions can be transformed into a nondeterministic finite automaton (with epsilon transitions) with a linear number of states and transitions, in the size of the regular expression (at least for finite alphabets), but I thought that sometimes the deterministic finite automaton was exponential in the size of the regular expression?

Jim Newton 2020-10-27T18:27:55.154500Z

alphabets which can be partitioned into recursively enumerable sets where the membership function is always decidable.

Jim Newton 2020-10-27T18:28:37.154700Z

So I have imposed a type-calculus over the Java type system, written in clojure.

Jim Newton 2020-10-27T18:29:08.154900Z

a student of mine is writing the same thing in Scala

Jim Newton 2020-10-27T18:29:31.155100Z

my PhD thesis was implementing this in Common Lisp, which already was equipped with such a type system.

Jim Newton 2020-10-27T18:29:42.155300Z

so I didn't have to implement the type system.

Jim Newton 2020-10-27T18:30:15.155500Z

I hope that sheds some light on my weird questions all the time...

2020-10-27T18:30:48.155700Z

I found a reference to this Wikipedia article which might back up my belief that there are regular expressions whose minimal DFAs have exponential size blowup: https://en.wikipedia.org/wiki/Regular_expression#Expressive_power_and_compactness

2020-10-27T18:32:33.156300Z

It does, and Rich Hickey when giving an early talk on Clojure spec mentioned the future possibility of comparing specs to each other as a function's API changed over time, to determine subset relationships between Clojure specs.

2020-10-27T18:32:51.156500Z

To my knowledge, no one has implemented such a thing yet.

2020-10-27T18:34:51.156700Z

If you are curious for that quote in context, search for "solved problem" in this talk transcript: https://github.com/matthiasn/talk-transcripts/blob/master/Hickey_Rich/ClojureSpec.md

Jim Newton 2020-10-27T18:34:54.157Z

questions of inclusion (subsetness) and disjointness can always be answered with RTE (regular type expressions) provided the questions can be answered on the leaf level types.

Jim Newton 2020-10-27T18:35:22.157300Z

do you know Rich?

2020-10-27T18:35:35.157500Z

I knew that there were algorithms for doing this over finite alphabets, but had not considered whether infinite alphabets, or Java interfaces vs. classes, changed how practical such decision algorithms might be.

2020-10-27T18:36:20.157700Z

Not personally, no. We have had the occasional on-line and in-person interaction, but probably only a few hours over 10 years.

Jim Newton 2020-10-27T18:36:28.157900Z

i'm sure there are issues/problems/errors in my theory and implementation. But I'd like to get it into a state where I can release it or publish it.

2020-10-27T18:37:18.158100Z

So does the inclusion problem remain solvable if you don't know the disjointness properties of pairs of types?

Jim Newton 2020-10-27T18:37:42.158300Z

one important property (at least I think it is important) of my type system is that it is extensible. The hope is to create a type which represents the set of all objects which match a given spec

Jim Newton 2020-10-27T18:37:46.158500Z

but I don't yet know how to do that

2020-10-27T18:39:14.158700Z

Or maybe a more practically useful question for software developers: suppose you consider a collection of classes and interfaces that are defined right now, and leave out of consideration possible future classes and interfaces that might be created. That seems potentially more practically useful to answer such inclusion questions for RTEs, if the answer for most pairs of interfaces is "can't be proved disjoint" (when arbitrary future classes are allowed)

Jim Newton 2020-10-27T18:39:49.159Z

Good question indeed. I believe the answer is yes. Why, given two types A and B, if you don't know their subtype relation nor their disjoint relation, then you can partition (union A B) into {(and (not A) B) (and A (not B)) (and A B)}. but any of those three might be empty, we just don't know whether it is.
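The three-way partition described above can be sketched for two Java types by classifying an object into the cell it falls in. The function name `partition-cell` and the cell keywords are made up for illustration:

```clojure
(defn partition-cell
  "Returns which cell of the partition of (union A B) the object x belongs to,
  or nil when x is in neither A nor B."
  [^Class a ^Class b x]
  (let [in-a (instance? a x)
        in-b (instance? b x)]
    (cond
      (and in-a in-b) :a-and-b
      in-a            :a-not-b
      in-b            :b-not-a
      :else           nil)))

(partition-cell Comparable Number 1)     ; :a-and-b (Long is both)
(partition-cell Comparable Number "x")   ; :a-not-b (String is Comparable only)
(partition-cell Comparable Number
                (java.util.concurrent.atomic.AtomicLong. 0)) ; :b-not-a
```

Exactly one branch is taken for any object, which is what keeps the resulting transitions disjoint even when nothing is known about how A and B relate.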

2020-10-27T18:40:36.159400Z

the cli tools have no opinion or restriction on how you use java

2020-10-27T18:41:16.160200Z

There are a whole bunch of Java classes that aren't final, and imagining that some one some day might create a sub-class that implements some interface somewhere, isn't a practical future possibility.

Jim Newton 2020-10-27T18:41:23.160500Z

such a partition causes the automaton to be larger and potentially have transitions which will never be taken. For example, if A is a subset of B, then (and A (not B)) is always false.

2020-10-27T18:41:42.161100Z

you could use an ant project and import a jar made via deps.edn for example

2020-10-27T18:42:03.161600Z

but the fact that deps.edn can make a pom (and uses the maven ecosystem regardless) does make using maven convenient

2020-10-27T18:42:55.161700Z

And if people knew that a system answered certain questions for the current set of classes and interfaces, and you gave them an example of how the answer might change if certain classes were defined later, they could either re-run the algorithms later as the software changed with new class/interface definitions, or at least use their judgement about the software to determine how likely such future classes would be.

Jim Newton 2020-10-27T18:42:58.161900Z

you make a good point. a practical issue I'd like to avoid is that someone loads my code, then loads his application using it. that user application invisibly builds an automaton which assumes interface-A and interface-B are disjoint (because no class currently inherits from both).

Jim Newton 2020-10-27T18:43:46.162100Z

then the user loads a third library which declares a class which inherits from both. That means the automaton is now invalid. and there is nothing to trigger it to be re-computed.

2020-10-27T18:44:25.162300Z

Right, hence my phrase about informing them about the assumptions on which the answers are based, and what could change them in the future.

2020-10-27T18:45:11.162500Z

Many software developers can understand and handle such qualifications, especially with examples.

2020-10-27T18:46:08.162700Z

I am just wondering if your type system, and how common it is for Clojure functions not to rely on particular Java classes, but on Java interfaces, will commonly lead your RTEs to never be deterministic, or only rarely.

Jim Newton 2020-10-27T18:46:12.162900Z

it's a good point. i'm open to it, but cautious because i'm not sure the gain is worth the risk.

2020-10-27T18:47:12.163100Z

There are a lot of commonly used Clojure functions that take something that implements clojure.lang.ISeq, and return something implementing that, for example.

2020-10-27T18:47:23.163300Z

but make no promises about which class
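A quick REPL illustration of that point: several common seq-producing calls all return something implementing `clojure.lang.ISeq`, yet the concrete classes differ (and the exact classes may vary between Clojure versions).

```clojure
(def examples
  [(map inc [1 2 3])   ; a lazy seq
   (seq [1 2 3])       ; a seq over a vector
   (list 1 2 3)        ; a list
   (range 3)])         ; a range

(every? #(instance? clojure.lang.ISeq %) examples) ; true
(map class examples)                               ; several different concrete classes
```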

Jim Newton 2020-10-27T18:47:36.163500Z

in my system they are always deterministic, but may contain unsatisfiable transitions.

Jim Newton 2020-10-27T18:48:32.163700Z

any state in the automaton has some number of transitions leading to that number of next states.

2020-10-27T18:48:35.163900Z

if your system says "clojure.lang.ISeq is an :interface, therefore is not provably disjoint from something implementing another thing that is an :interface, therefore I can't say much useful about it", then I suspect most Clojure functions will lead to that situation.

Jim Newton 2020-10-27T18:48:54.164100Z

the transitions are labeled with disjoint types (some of which might be empty types).

Jim Newton 2020-10-27T18:49:59.164300Z

I compile the transitions into a data structure sort of a decision network which is guaranteed to never make the same type test twice.

Jim Newton 2020-10-27T18:50:46.164500Z

so given an object at runtime I ask whether it is Comparable, and then possibly whether it is serializable, then possibly whether it is a Number etc.

2020-10-27T18:51:05.164700Z

I've got to go do some other work for a bit, earn my pay and all, but been interesting chatting. I would be interested to see what kinds of results you end up with.

Jim Newton 2020-10-27T18:51:16.164900Z

and depending on the answers to those questions transfer control to the appropriate next state.

Jim Newton 2020-10-27T18:51:49.165100Z

Thanks for the chat. Can you suggest the best way for me to present my code to the public? There are no conferences anymore

2020-10-27T18:52:16.165300Z

especially if it results in code I could run that compares two specs (or a useful subset of them) to see if one is a subset of the other.

2020-10-27T18:54:00.165500Z

If you had something that could be used to answer that kind of question about Clojure specs, I suspect the Clojure conj conference might be interested, although it is more developer-focused and less theoretical. I do not know which of the academic conferences might be interested, but suspect there would be some; perhaps you know better than I. I thought several conferences were still having on-line presentations?

2020-10-27T18:54:10.165700Z

Only a subset, I know, but still some.

Jim Newton 2020-10-27T18:55:39.165900Z

many thanks for your valuable feedback, especially finding that weird bug with ldiff

Jim Newton 2020-10-27T18:56:02.166100Z

i'm going to eat supper now

timo 2020-10-27T18:58:55.166600Z

ok thanks to you both! Have it working.

2020-10-27T19:16:49.167100Z

Is there any sanctioned way to make new reader macros in clojure?

2020-10-27T19:17:01.167400Z

(Ideally without having to fork the language anyway)

teodorlu 2020-10-27T19:19:24.167800Z

@leif do https://clojure.org/reference/reader#tagged_literals work for you?

alexmiller 2020-10-27T19:25:32.167900Z

intentionally, no

alexmiller 2020-10-27T19:26:01.168100Z

for data, you can use tagged literals

alexmiller 2020-10-27T19:26:13.168300Z

https://clojure.org/guides/faq#reader_macros

2020-10-27T19:31:51.169400Z

@teodorlu It looks like tagged literals can hold values, but not expressions (or macro references), yes?

2020-10-27T19:32:25.169800Z

Basically, I'm trying to do this in clojure: https://arxiv.org/abs/2010.12695

2020-10-27T19:34:40.171Z

It seems like tagged literals (understandably) cannot expand like macros, is this correct?

2020-10-27T19:34:49.171200Z

(Basically I'm trying to do this in clojure: https://arxiv.org/abs/2010.12695 )

alexmiller 2020-10-27T19:38:59.171700Z

that's correct

alexmiller 2020-10-27T19:39:45.172500Z

I mean they could theoretically hold code (which is data) and do something with that but I'm not sure you'd be happy with where you'd end up.
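A small sketch of what a tagged literal does and does not do. The tag `#my/point` and the function `read-point` are made up for illustration. The reader function receives data that has already been read, so any code inside it arrives as an unevaluated list rather than being expanded like a macro call:

```clojure
(require '[clojure.edn :as edn])

(defn read-point
  "Hypothetical reader function for the made-up tag #my/point."
  [[x y]]
  {:x x :y y})

(def opts {:readers {'my/point read-point}})

(edn/read-string opts "#my/point [1 2]")
;; the vector [1 2] is passed to read-point as plain data

(edn/read-string opts "#my/point [(+ 1 1) 2]")
;; here :x is the list (+ 1 1), held as data and never evaluated
```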

2020-10-27T19:43:10.173900Z

Okay, another option then is a regular (textual) macro, whose invocations can be turned into source positions and let the IDE hide them.

2020-10-27T19:43:41.174600Z

That 'might' be doable with attached metadata....maybe...I'll have to think about that.

2020-10-27T19:48:47.175Z

Actually, ya, that would work...is there any way to 'tag' arbitrary expressions in clojure?

2020-10-27T19:48:55.175300Z

Like you can with functions.

2020-10-27T19:49:44.176Z

Maybe something like (func ^meta-info args)?

2020-10-27T19:50:13.176600Z

It's okay if func has to be a macro and can even specify that it requires ^meta-info

2020-10-27T19:50:30.176900Z

Basically I'm just looking for something an IDE can hook into.

teodorlu 2020-10-27T19:51:09.177300Z

You can attach metadata to any var: https://clojure.org/reference/metadata

2020-10-27T19:52:29.177900Z

@teodorlu In this case though, I'm looking to attach the metadata to the variable's use, rather than just its definition.

teodorlu 2020-10-27T19:53:28.178200Z

I see. Not sure how you'd achieve that.

2020-10-27T19:54:10.178400Z

Alright, thanks.

2020-10-27T19:54:19.178700Z

I'll look into this more.

2020-10-27T19:55:20.179900Z

I do admit I am sort of stretching the language here, although I would like to avoid contorting the language too much... 😉

alexmiller 2020-10-27T19:55:58.180400Z

if func is a macro, you don't need metadata there - you can just pass whatever and the macro can do whatever it wants with it
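A minimal sketch of that idea: a macro that takes the extra information as an ordinary first argument. The macro name `annotated` and the key `:editor/widget` are hypothetical; what an IDE would actually look for is not specified in the chat.

```clojure
(defmacro annotated
  "Hypothetical macro: `info` is a literal map visible in the source (and at
  macroexpansion time); the call simply expands to `expr`."
  [info expr]
  (when-not (map? info)
    (throw (IllegalArgumentException. "info must be a literal map")))
  expr)

(annotated {:editor/widget :slider} (+ 1 2)) ; evaluates (+ 1 2)
```

A tool reading the source file sees the whole `(annotated {...} ...)` form, so it can locate the info map without any reader extension.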

teodorlu 2020-10-27T19:57:12.180800Z

Good luck!

2020-10-27T19:57:33.181300Z

@alexmiller good point.

2020-10-27T19:57:40.181600Z

I think I'm starting to see how I might go about doing this.

2020-10-27T19:57:48.181700Z

Thanks. 🙂

Louis Kottmann 2020-10-27T21:11:23.193400Z

hello, can anyone explain to me why regexes as keys get duplicated in a map? i.e:

myapp.core> (assoc (assoc cmds #"cake" "hello") #"cake" "hello")
=> {#"cake" "hello", #"cake" "hello"}
myapp.core> (assoc (assoc cmds "cake" "hello") "cake" "hello")
=> {"cake" "hello"}

Louis Kottmann 2020-10-27T21:13:50.193800Z

should I avoid using regexes as keys if I want them to be unique within the map?

alexmiller 2020-10-27T21:14:16.194100Z

regexes don't compare for equality

alexmiller 2020-10-27T21:15:23.194900Z

one workaround is to use the regex string instead (which has pros and cons)
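A sketch of the string-key workaround, assuming the keys are only needed as patterns at match time. Strings compare by value, so the second `assoc` replaces the first entry instead of adding a new one:

```clojure
(def cmds
  (-> {}
      (assoc (str #"cake") "hello")
      (assoc (str #"cake") "hello")))

(count cmds) ; 1: a single "cake" key

;; Rebuild a Pattern from the key when it is time to match.
(re-find (re-pattern (key (first cmds))) "cheesecake") ; "cake"
```

One cost is recompiling the pattern on each use unless the compiled `Pattern` is cached somewhere, for example as part of the map's value.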

Louis Kottmann 2020-10-27T21:17:07.195300Z

sooo, I should avoid them for this use?

alexmiller 2020-10-27T21:19:16.195800Z

well they won't work well as keys in a map or values in a set for sure

alexmiller 2020-10-27T21:19:49.196400Z

this is one of very few things in Clojure that don't have equality as value (function instances being another)

Louis Kottmann 2020-10-27T21:21:12.196900Z

mmmh when I try:

(= #'message #'message)
=> true

2020-10-27T21:21:36.197600Z

that isn't a regex

Louis Kottmann 2020-10-27T21:21:36.197700Z

do you mean 2 same functions with same name, arguments and body don't compare?

Louis Kottmann 2020-10-28T08:09:19.225800Z

I see, I just do not understand how the basic case of 2 regexes written exactly the same way are not equal. The problem of finding out if 2 regexes written differently evaluate to the same string rules is indeed very hard and I would not expect that equality to work out of the box ^^

dpsutton 2020-10-27T21:21:43.198100Z

(= #"message" #"message")

2020-10-27T21:21:44.198200Z

that isn't a function

2020-10-27T21:21:53.198500Z

#' is var quote

2020-10-27T21:22:05.198800Z

it means give me the var with this name

Louis Kottmann 2020-10-27T21:22:38.199100Z

I see

Louis Kottmann 2020-10-27T21:22:46.199300Z

thank you

bronsa 2020-10-27T21:22:54.199600Z

correct

bronsa 2020-10-27T21:22:58.199800Z

function equality is undecidable

2020-10-27T21:23:29.200500Z

they are sort of functions, because you can invoke vars as functions, similar to how you can call keywords or collections as functions

2020-10-27T21:23:35.200700Z

but they are not function objects
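A short REPL sketch of the distinction, using a throwaway function named `message` to mirror the example above:

```clojure
(defn message [] "hi")

(var? #'message)             ; true: #'message is the Var itself
(= #'message (var message))  ; true: both forms name the same Var object
(fn? #'message)              ; false: a Var is not a function object
(ifn? #'message)             ; true: but it can be invoked
(#'message)                  ; "hi": invoking the Var calls its current value
```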

2020-10-27T21:24:20.201800Z

and to say regexes don't compare for equality is not entirely accurate, you can call = on them, just like you can call = on function objects

2020-10-27T21:24:33.202300Z

but they have identity equality, not value equality

alexmiller 2020-10-27T21:24:39.202600Z

they compare with identity is what I meant

2020-10-27T21:24:46.202800Z

basically you are comparing pointers

Louis Kottmann 2020-10-27T21:25:26.203200Z

I'll try a few things on the REPL to play with regexes equality

Louis Kottmann 2020-10-27T21:25:38.203600Z

but I'll remove them as keys of the map and find a better way

Louis Kottmann 2020-10-27T21:25:56.204500Z

thanks for the info, this is a nice community

dpsutton 2020-10-27T21:25:57.204600Z

there's a very simple logic to it. if they are the exact same object they are equal, else they are not equal

2020-10-27T21:26:14.205Z

every usage of a regex literal creates a different regex object, and different regex objects will never be equal regardless of the value of the regex
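This is easy to confirm at the REPL:

```clojure
(= #"cake" #"cake")           ; false: two literals, two Pattern objects
(identical? #"cake" #"cake")  ; false, for the same reason
(let [re #"cake"]
  (= re re))                  ; true: one object compared with itself
```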

2020-10-27T21:26:29.205300Z

#"..." being a regex literal

2020-10-27T21:30:20.207500Z

it is a kind of interesting connection between regexes and functions, because they can both define something that produces the same result in more than one way, and we tend to think of equivalence for them as based on their results

2020-10-27T21:30:25.207700Z

Is there any good way to get source locations out of read?

2020-10-27T21:30:38.208200Z

(And if not, is there a better way to read a file and get source locations?)

2020-10-27T21:31:22.209100Z

if you use a clojure.lang.LineNumberingPushbackReader with read it will add position metadata to forms it reads (certain forms cannot have metadata attached)
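A minimal sketch of that approach, reading every top-level form from a string and pairing it with its metadata. The function name `read-with-positions` is made up, and `read` should only be used on trusted input:

```clojure
(import '(clojure.lang LineNumberingPushbackReader)
        '(java.io StringReader))

(defn read-with-positions
  "Reads all forms from string s, returning [form metadata] pairs.
  Forms that cannot carry metadata (numbers, keywords, ...) get nil."
  [s]
  (let [rdr (LineNumberingPushbackReader. (StringReader. s))]
    (loop [acc []]
      (let [form (read {:eof ::eof} rdr)]
        (if (= form ::eof)
          acc
          (recur (conj acc [form (meta form)])))))))

(read-with-positions "(+ 1 2)\n(foo bar)")
;; each list form carries :line and :column keys in its metadata
```

For a file, wrapping `(clojure.java.io/reader path)` in the `LineNumberingPushbackReader` works the same way.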

dpsutton 2020-10-27T21:31:37.209500Z

check out (source source) clojure.repl/source

2020-10-27T21:32:07.210600Z

clojure.tools.reader (a clojure reader written in clojure) is a thing that also exists

2020-10-27T21:32:38.211400Z

https://github.com/clojure/tools.reader

Eric Ihli 2020-10-27T21:33:28.212700Z

This is an image of me starting a REPL session in Emacs/Cider, loading a namespace, letting it sit around for ~30 minutes, then calling (def data (read-string (slurp "resources/some-data.edn"))) some-data.edn is only 140mb. Can anyone explain why my heap jumps by almost 2gb when slurping/reading a 140mb edn file?

Eric Ihli 2020-10-28T12:40:51.228500Z

Thanks everyone. This has been a huge help. Great to have cljol in my toolbox!

2020-10-27T21:33:39.213200Z

Okay. Do any of these also work in clojurescript, or are they exclusive to clojure?

bronsa 2020-10-27T21:34:05.213400Z

tools.reader works in both clojure and clojurescript

2020-10-27T21:34:32.214100Z

@bronsa cool, thanks

bronsa 2020-10-27T21:35:23.214500Z

this is the equivalent to LineNumberingPushbackReader http://clojure.github.io/tools.reader/#clojure.tools.reader.reader-types/indexing-push-back-reader

Louis Kottmann 2020-10-27T21:37:43.214800Z

that makes more immediate sense to me than regex equality

2020-10-27T21:37:51.215200Z

Cool. Although these only seem to give line numbers and not also column numbers, is that correct?

Louis Kottmann 2020-10-27T21:38:02.215300Z

as 2 identical regexes cast to String should be comparable on their String representation

Louis Kottmann 2020-10-27T21:38:12.215700Z

but I'm probably missing something

bronsa 2020-10-27T21:38:15.216Z

it gives both

2020-10-27T21:38:24.216300Z

Ah, okay, thanks.

bronsa 2020-10-27T21:39:30.216800Z

the data is attached to the object as metadata (if possible)

bronsa 2020-10-27T21:39:44.217200Z

but you can also ask the reader object manually for line/col/file info before/after reading

bronsa 2020-10-27T21:40:21.217700Z

using those protocol functions http://clojure.github.io/tools.reader/#clojure.tools.reader.reader-types/IndexingReader

bronsa 2020-10-27T21:40:44.217800Z

well

bronsa 2020-10-27T21:41:19.218200Z

regex equality is not the same as a regex's representation equality

2020-10-27T21:41:22.218400Z

AHHH, okay, thanks.

2020-10-27T21:41:32.218800Z

(Sorry, still a bit new to the clojure docs. Thanks again.)

alexmiller 2020-10-27T21:55:58.218900Z

well obviously, it needed 2 gb of objects to read that file

alexmiller 2020-10-27T21:56:43.219100Z

whether it's 2 gb of data is a separate question - you might have a lot of garbage that can be collected (or you might not). hard to say without knowing what the data is

alexmiller 2020-10-27T21:58:14.219300Z

or invoking gc at this point and seeing what the used heap is afterwards
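A sketch of that check from the REPL. The helper name `used-heap-mb` is made up, and `System/gc` is only a request to the JVM, so the number is an approximation:

```clojure
(defn used-heap-mb
  "Approximate used heap in megabytes, from the JVM Runtime counters."
  []
  (let [rt (Runtime/getRuntime)]
    (quot (- (.totalMemory rt) (.freeMemory rt))
          (* 1024 1024))))

(System/gc)     ; ask the JVM to collect garbage
(used-heap-mb)  ; what remains is roughly the live data
```

Comparing the value before loading the data and after loading plus a GC gives a rough idea of how much memory the retained data actually occupies.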

Louis Kottmann 2020-10-27T22:07:17.219500Z

Two regexes could have the same string representation ?

bronsa 2020-10-27T22:17:36.219700Z

no, if they have the same representation then they are the same regex

bronsa 2020-10-27T22:17:54.219900Z

but two regex with different representation could be equal

bronsa 2020-10-27T22:18:07.220100Z

so equality of representation is not equality of regex

Eric Ihli 2020-10-27T22:18:28.220300Z

Ah. I see. I assumed 140mb of edn would take up ~140mb of memory. I'm gathering now, that's not the case. nil might get encoded as edn to a certain number of bytes but take up a different number in memory. Same thing with datetimes, etc... And in the process of coercing, a bunch of intermediate objects get created as well.

phronmophobic 2020-10-27T22:19:09.220500Z

this is really neat! I thought there was something familiar and I realized I had also seen your talk, "Movies as Programs". I think there's a lot of interesting approaches that have yet to be tried.

bronsa 2020-10-27T22:25:06.220800Z

in general regex equality is decidable, but since most regex implementations (like java's) have extended operator support (e.g. backreferencing), the complexity of equality is at least NP-complete, and I believe possibly PSPACE

alexmiller 2020-10-27T22:41:54.221900Z

All objects have overhead on the order of dozens of bytes

Eric Ihli 2020-10-27T23:10:08.222400Z

An update for future readers: this appears to be a heap size issue. Confusing, because I never saw my used heap approach max heap, and it would usually churn indefinitely without ever throwing an OOM exception, and I was only reading into memory from a few files of < 150mb and didn't expect (incorrectly, I now presume) that process to require 4+ gigs of heap. But running my repl with -Xmx6g let me deserialize the file that was previously causing the issue.
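For a Clojure CLI project, one way to pass that flag is a deps.edn alias. This is a sketch; the alias name `:big-heap` is made up, and an editor-launched REPL (such as Cider jack-in) needs to be told to use the alias separately:

```clojure
;; deps.edn (sketch): the alias name :big-heap is a made-up example.
;; Start a REPL with it via: clj -A:big-heap
{:aliases
 {:big-heap {:jvm-opts ["-Xmx6g"]}}}
```

Evaluating `(.maxMemory (Runtime/getRuntime))` in the running REPL shows whether the larger limit actually took effect.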

2020-10-27T23:48:08.222800Z

Clojure strings are JVM strings (in Clojure/Java), and tend to take about 40 bytes per string overhead, plus the storage for the characters in the string, which is often 8-bits per ASCII character if a string is all ASCII, but 16-bit per character if any of them are not ASCII