clojars

http://clojars.org discussion and “support”, see http://status.clojars.org for status.
2021-02-24T00:23:40.006500Z

I'd like to write up an announcement about the verification plan and timeline first, just so we have a place to point folks to as word of the changes percolates through the community. I'll try to get that out this weekend.

seancorfield 2021-02-24T01:20:56.006700Z

Sounds good! I won't update my READMEs or clj-new until after that is up.

2021-02-24T01:27:11.006900Z

Thanks!

borkdude 2021-02-24T11:48:59.007500Z

@tcrawley food for thought (in screenshot, conversation from #tools-deps) https://github.com/borkdude/deps-infer

borkdude 2021-02-24T13:14:58.008700Z

@tcrawley I think for the purpose of libs like these, it would be super awesome if clojars had some kind of index of jars + the list of files in each jar, as EDN, or transit, which refreshed every so often (daily, weekly, monthly)

2021-02-25T13:43:08.012300Z

I tried to run it on the repo cached on the server last night, but realized my recollection of how we build the maven index was wrong - we pull down the poms, not the jars for indexing :( However, I think we could:
• pull down the jars once and index those, then store the index in s3
• index new jars as they are deployed, then merge with the existing index
This should work since existing releases are immutable. We could also store the index as many timestamped files - that would allow clients to be able to cache the index, pulling down new files and merging them. I suspect the full index file will be pretty large.
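
The merge step described here can be sketched as follows. This is only an illustration, assuming the index has the shape shown elsewhere in this channel (namespace symbol mapped to a vector of entry maps). `merge-indexes` and the `example.*` data are hypothetical and not part of clojars or deps-infer.

```clojure
;; Sketch only: merges an existing index with indexes built for newly
;; deployed jars. All names and data here are hypothetical.

(defn merge-indexes
  "Merges any number of namespace indexes. Entries for the same namespace
  are concatenated and exact duplicate entries are dropped."
  [& indexes]
  (reduce (fn [acc index]
            (merge-with (fn [old new] (vec (distinct (concat old new))))
                        acc
                        index))
          {}
          indexes))

;; Index built once from the jars that already exist (made-up data).
(def existing-index
  '{example.core [{:mvn/version "0.1.0"
                   :file "example/core.clj"
                   :group-id "example"
                   :artifact "example-lib"}]})

;; Index built for a single newly deployed release (made-up data).
(def new-release-index
  '{example.core [{:mvn/version "0.2.0"
                   :file "example/core.clj"
                   :group-id "example"
                   :artifact "example-lib"}]
    example.util [{:mvn/version "0.2.0"
                   :file "example/util.clj"
                   :group-id "example"
                   :artifact "example-lib"}]})

(def merged (merge-indexes existing-index new-release-index))
```

Because releases are immutable, entries are only ever added, so a client that caches older timestamped files could apply the same function to whatever new files it downloads.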

borkdude 2021-02-25T13:46:03.012500Z

yeah, those are good ideas

borkdude 2021-02-25T13:47:07.012700Z

I like the second idea

borkdude 2021-02-25T13:47:19.012900Z

then we can just pull only the latest files

2021-02-25T13:52:21.013100Z

Good deal. We should probably open an issue at https://github.com/clojars/clojars-web/issues/new/choose and continue this discussion there

borkdude 2021-02-25T13:55:00.013300Z

https://github.com/clojars/clojars-web/issues/793

2021-02-25T13:55:21.013700Z

Thanks!

borkdude 2021-02-27T09:19:09.024900Z

I think it might be better to have one file per namespace actually, since the number of namespaces to check is usually small and downloading the entire index would be wasteful in that case. Just one http request per namespace would be ideal.
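
A rough sketch of what one file per namespace could look like, assuming the same index shape as above. The file naming scheme (`<namespace>.edn`) and the functions `ns-file-name`, `write-ns-files!` and `lookup-ns` are assumptions made for illustration, not an agreed layout.

```clojure
(require '[clojure.edn :as edn]
         '[clojure.java.io :as io])

(defn ns-file-name
  "Hypothetical naming scheme: one EDN file named after the namespace."
  [ns-sym]
  (str ns-sym ".edn"))

(defn write-ns-files!
  "Writes one EDN file per namespace in `index` into `out-dir`."
  [index out-dir]
  (doseq [[ns-sym entries] index]
    (let [f (io/file out-dir (ns-file-name ns-sym))]
      (io/make-parents f)
      (spit f (pr-str entries)))))

(defn lookup-ns
  "Reads the entries for a single namespace. `base` may be a directory
  path or a URL prefix, since `slurp` accepts both. Throws if the
  namespace has no file."
  [base ns-sym]
  (edn/read-string (slurp (str base "/" (ns-file-name ns-sym)))))
```

With files like these hosted somewhere, a client would need one request per namespace, for example `(lookup-ns base-url 'accountant.core)`, where `base-url` is whatever location ends up hosting them.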

borkdude 2021-02-27T09:19:25.025100Z

If you agree, I can change the code to produce those files

2021-02-24T14:15:21.008900Z

I think that would be great! I'm focused on adding group validation currently, but we could tackle this afterward. Do you have code already that will generate the index for a single jar?

borkdude 2021-02-24T14:17:44.009100Z

@tcrawley Yeah, this code is in https://github.com/borkdude/deps-infer We could work on this together if you want. The part I do not control is the "ops" side, but I can write the "script" that produces the index from a dir of jars

2021-02-24T14:22:04.009500Z

A script to process a sparse maven repo dir would do the trick. "sparse" meaning it is in the correct shape (`group-name/artifact-name/0.1.0/artifact-name-0.1.0.jar`), but has no pom files. The repo is in s3, but we sync down all of the jar files nightly in order to generate the maven-style indexes for tooling, and could generate this index as part of that process.
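
For reference, a minimal sketch of recovering coordinates from a jar path in that shape. It assumes the standard Maven repository layout, where dots in a group id become nested directories. `path->coordinates` is a hypothetical helper, not code from clojars or deps-infer.

```clojure
(require '[clojure.string :as str])

(defn path->coordinates
  "Parses a repo-relative jar path into a coordinates map. Returns nil
  when the path is not of the form group/artifact/version/artifact-version.jar,
  which also skips classifier jars such as *-sources.jar."
  [rel-path]
  (let [segments (str/split rel-path #"/")]
    (when (>= (count segments) 4)
      ;; Read the path from the end: file, version, artifact, then group dirs.
      (let [[file version artifact & group-rev] (reverse segments)]
        (when (= file (str artifact "-" version ".jar"))
          {:group-id (str/join "." (reverse group-rev))
           :artifact artifact
           :mvn/version version})))))

(path->coordinates "group-name/artifact-name/0.1.0/artifact-name-0.1.0.jar")
;; => {:group-id "group-name", :artifact "artifact-name", :mvn/version "0.1.0"}
```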

2021-02-24T14:24:41.009800Z

We could then upload these ns indexes to s3 alongside the feeds/jar lists: https://github.com/clojars/clojars-web/wiki/Data#list-of-jars-and-versions-in-leiningen-syntax

borkdude 2021-02-24T14:27:42.010Z

Sounds excellent

borkdude 2021-02-24T15:06:32.010200Z

@tcrawley Right now I have some code which walks over a dir with .jar files and produces one huge map:

{accountant.core
 [{:mvn/version "0.2.5",
   :file "accountant/core.cljs",
   :group-id "venantius",
   :artifact "accountant"}],
 adzerk.boot-cljs
 [{:mvn/version "2.1.5",
   :file "adzerk/boot_cljs.clj",
   :group-id "adzerk",
   :artifact "boot-cljs"}],
 adzerk.boot-cljs-repl
 [{:mvn/version "0.4.0",
   :file "adzerk/boot_cljs_repl.clj",
   :group-id "adzerk",
   :artifact "boot-cljs-repl"}],
 adzerk.boot-cljs.impl
 [{:mvn/version "2.1.5",
   :file "adzerk/boot_cljs/impl.clj",
   :group-id "adzerk",
   :artifact "boot-cljs"}],
 adzerk.boot-cljs.js-deps
 [{:mvn/version "2.1.5",
   :file "adzerk/boot_cljs/js_deps.clj",
   :group-id "adzerk",
   :artifact "boot-cljs"}],
 adzerk.boot-cljs.middleware
 [{:mvn/version "2.1.5",
   :file "adzerk/boot_cljs/middleware.clj",
   :group-id "adzerk",
   :artifact "boot-cljs"}],
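
To illustrate how entries like the ones above relate a file to a namespace, here is a small sketch. It assumes the namespace is derived from the file path using the usual Clojure convention (`/` becomes `.`, `_` becomes `-`). The actual deps-infer code may work differently, and `file->ns` and `index-jar-files` are hypothetical names.

```clojure
(require '[clojure.string :as str])

(defn file->ns
  "Derives a namespace symbol from a .clj, .cljs or .cljc path inside a
  jar. Returns nil for any other file. Note that top-level files such as
  project.clj would also be mapped to a symbol."
  [file]
  (when-let [[_ base] (re-matches #"(.+)\.clj[sc]?" file)]
    (symbol (-> base
                (str/replace "/" ".")
                (str/replace "_" "-")))))

(defn index-jar-files
  "Builds an index fragment for one jar, given its coordinates map and
  the list of file names it contains."
  [coordinates files]
  (into {}
        (for [file files
              :let [ns-sym (file->ns file)]
              :when ns-sym]
          [ns-sym [(assoc coordinates :file file)]])))

(index-jar-files {:mvn/version "0.2.5" :group-id "venantius" :artifact "accountant"}
                 ["accountant/core.cljs" "META-INF/MANIFEST.MF"])
;; => {accountant.core [{:mvn/version "0.2.5", :group-id "venantius",
;;                       :artifact "accountant", :file "accountant/core.cljs"}]}
```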

borkdude 2021-02-24T15:07:04.010400Z

Perhaps it would be better to partition this into multiple files

borkdude 2021-02-24T15:08:16.010600Z

For my local .m2 dir the file is 130822 lines long

borkdude 2021-02-24T15:10:13.010800Z

@tcrawley I have this code here: https://github.com/borkdude/deps-infer/blob/main/src/deps_infer/clojars.clj It prints to stdout. You can run it with `clojure -M -m deps-infer.clojars > /tmp/index.edn`

borkdude 2021-02-24T15:11:10.011200Z

This file takes 200ms to parse to EDN on my machine which is still quite ok

borkdude 2021-02-24T15:11:19.011400Z

But for the entire clojars it might get a little bit bloated

borkdude 2021-02-24T15:57:54.011700Z

You can change the location of the dir it scans for .jar files with --repo

2021-02-24T15:59:08.011900Z

Thanks! I'll see if I can find some time today to kick this off on the server to see how long it takes and how large of a file it produces.

borkdude 2021-02-24T16:01:46.012100Z

I produced both an .edn and .transit file and zipped both, here's how it looks on my machine:

$ ls -la /tmp/index*
-rw-r--r--  1 borkdude  wheel  4363922 Feb 24 16:07 /tmp/index.edn
-rw-r--r--  1 borkdude  wheel   214482 Feb 24 17:00 /tmp/index.edn.zip
-rw-r--r--  1 borkdude  wheel  3594066 Feb 24 16:59 /tmp/index.transit.json
-rw-r--r--  1 borkdude  wheel   393184 Feb 24 17:01 /tmp/index.transit.zip
Funnily enough, the zipped edn looks better than the zipped transit.