Is there a database of the dependencies for a given artifact? I can't find it in the API. I want to build a website that prints the dependency tree of a given library at a given version, and also tracks issues with the latest JDK. Is this data available via the API or as a JSON dump, or do I need to crawl it?
@xtreak29 Dependencies are part of the Maven data: http://repo.clojars.org/metosin/eines/0.0.9/eines-0.0.9.pom (look for <dependencies>)
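For anyone following along, here is a minimal sketch of pulling the direct dependencies out of a POM's <dependencies> element. It assumes the usual Maven layout (a top-level <dependencies> holding <dependency> entries with groupId, artifactId, and version). The function name `parse_dependencies` and the sample POM are made up for illustration and are not taken from the eines POM linked above.

```python
import xml.etree.ElementTree as ET


def _local(tag):
    """Strip an XML namespace prefix like '{ns}name' down to 'name'."""
    return tag.rsplit("}", 1)[-1]


def parse_dependencies(pom_xml):
    """Return the direct dependencies declared in a POM as a list of dicts.

    Only the top-level <dependencies> element is read; entries under
    <dependencyManagement> or <profiles> are ignored in this sketch.
    """
    root = ET.fromstring(pom_xml)
    deps = []
    for child in root:
        if _local(child.tag) != "dependencies":
            continue
        for dep in child:
            if _local(dep.tag) != "dependency":
                continue
            # Map each child element name to its trimmed text.
            fields = {_local(f.tag): (f.text or "").strip() for f in dep}
            deps.append({
                "group": fields.get("groupId"),
                "artifact": fields.get("artifactId"),
                "version": fields.get("version"),
                "scope": fields.get("scope"),  # None when the POM omits it
            })
    return deps


# Hypothetical sample POM, for illustration only.
SAMPLE_POM = (
    '<project xmlns="http://maven.apache.org/POM/4.0.0">'
    "<dependencies><dependency>"
    "<groupId>org.clojure</groupId>"
    "<artifactId>clojure</artifactId>"
    "<version>1.8.0</version>"
    "</dependency></dependencies>"
    "</project>"
)

for d in parse_dependencies(SAMPLE_POM):
    print(d["group"], d["artifact"], d["version"])
```

Matching on the local tag name instead of a fixed namespace keeps the sketch working for POMs that declare the Maven namespace and for those that leave it out.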
Thanks. Is the POM available as JSON, or better yet as a dump? That would help me avoid scraping.
Sorry, I clicked the link and it rendered in a browser. The XML output is helpful. Still, a database would be very helpful; otherwise I need to download all the POMs and construct a database myself. I am sure someone has done this already.
As far as I remember, Clojars doesn't have the dependency information in the DB
Thanks, that's a good pointer. It's just that I don't want to scrape and load the server if the data is already available, since http://clojars.org/repo/all-poms.txt.gz lists around 153k entries.
With each file around 5KB it will be a lot of data to scrape 😕
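A rough back-of-the-envelope check of that estimate, as a sketch. It assumes the gzipped listing holds one POM path per line (an assumption about the file format, not confirmed in this thread) and uses the approximate 5KB average mentioned above. The helper names are hypothetical.

```python
import gzip


def count_entries(path):
    """Count non-empty lines in a gzip-compressed text listing."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return sum(1 for line in fh if line.strip())


def estimate_total_mb(n_files, avg_kb=5):
    """Rough total download size in MB, taking 1 MB = 1000 KB."""
    return n_files * avg_kb / 1000


# Using the figures quoted in the thread: ~153k POMs at ~5KB each.
print(estimate_total_mb(153_000))  # 765.0
```

With a local copy of the listing, `estimate_total_mb(count_entries("all-poms.txt.gz"))` would give the same kind of estimate from the real entry count.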
Hmm, I think there is a way to load all the pom files, for the development environment
Yes, there is an rsync option, I think
Or maybe I have just used rsync. You could probably tell rsync to only load pom files, not jars.
> If you want to use the actual repo from http://clojars.org, you can grab it via rsync.
> Note that this setup task isn't perfect - SNAPSHOTS won't have version-specific metadata (which won't matter for the operation of clojars, but may matter if you try to use the resulting repo as a real repo), and versions will be listed out of order on the project pages, but it should be good enough to test with. https://github.com/clojars/clojars-web
I am OK with non-snapshot data, but there is no information about the URL for rsync and so on.
Hmm? Wiki shows the rsync command
Yes, got it, thanks: https://github.com/clojars/clojars-web/wiki/Data#rsync-the-whole-repository
But as you said I need only the poms and not the jars
--exclude '*.jar'
might work, or --exclude '*' --include '*.pom'
Thanks a lot. I will try that.
I tried rsync -av --delete clojars.org::clojars my-wonderful-copy-of-clojars --include="*/" --include="*.pom" --exclude="*"
It creates the folders, around 42MB of empty folders, but the pom files are printed on the screen and not downloaded. I cancelled in the middle. Am I missing something? I am trying to run the command against less data.
Maybe change the include/exclude order, so that the include is done after the exclude
Is there a way to only do 10% of rsync or something?
Gets me an empty directory
➜ ~ rsync -av --delete clojars.org::clojars my-wonderful-copy-of-clojars --exclude="*" --include="*/" --include="*.pom"
receiving incremental file list
./
sent 53 bytes received 55 bytes 30.86 bytes/sec
total size is 0 speedup is 0.00
rsync -av --delete clojars.org::clojars my-wonderful-copy-of-clojars --include="*/" --include="*.pom" --exclude="*"
works. It seems there are duplicate pom files that I need to clean up. Thanks a lot.
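Why the order matters: rsync checks each name against the filter rules in the order given, and the first rule that matches decides. It also never descends into a directory that was excluded. The sketch below is a simplified Python model of that behaviour, not rsync's actual implementation. It handles only slash-free patterns with an optional trailing "/" (meaning "directories only"), and the function names are hypothetical.

```python
from fnmatch import fnmatchcase


def first_match(name, is_dir, rules):
    """Decide 'include' or 'exclude' for one path component.

    Rules are (action, pattern) pairs checked in order; the first
    pattern that matches wins. A pattern ending in '/' applies to
    directories only.
    """
    for action, pattern in rules:
        if pattern.endswith("/"):
            if not is_dir:
                continue
            pattern = pattern[:-1]
        if fnmatchcase(name, pattern):
            return action
    return "include"  # nothing matched: included by default


def is_transferred(path, rules):
    """True if the file at `path` survives the filters.

    Every parent directory must be included too, because an
    excluded directory is never descended into.
    """
    parts = path.split("/")
    for i, part in enumerate(parts):
        is_dir = i < len(parts) - 1
        if first_match(part, is_dir, rules) == "exclude":
            return False
    return True


# The order that worked: includes first, catch-all exclude last.
WORKING = [("include", "*/"), ("include", "*.pom"), ("exclude", "*")]
# The order that produced an empty directory: the catch-all exclude wins first.
BROKEN = [("exclude", "*"), ("include", "*/"), ("include", "*.pom")]

pom = "metosin/eines/0.0.9/eines-0.0.9.pom"
jar = "metosin/eines/0.0.9/eines-0.0.9.jar"
print(is_transferred(pom, WORKING))  # True
print(is_transferred(jar, WORKING))  # False
print(is_transferred(pom, BROKEN))   # False
```

In the broken ordering the top-level directory already matches the leading exclude of "*", so nothing below it is ever examined, which is consistent with the empty result shown earlier.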
@xtreak29 there is also https://github.com/clojars/clojars-web/wiki/Data which might have what you need; if not, open an issue and we might be able to add it