Did I do a good job here: https://github.com/RoelofWobben/clojure_ground_up/blob/main/src/ground_up/chapter7.clj
do you have time to teach me what you mean
I can answer a few questions
you were saying I could make my code better
step 1 was to find an answer to this question
how to find the maximum element of a lazy sequence with O(1) memory
right
and you were talking about reduce
but then I needed to sleep
do you mean I have to change map to reduce?
the map in most-prevalent is fine
maybe just try to write a function, find-max, that takes a sequence as an argument and returns the maximum value
oke
with respect to most-prevalent, the goal is to replace:
(sort-by :prevalance)
(take-last 10)
eventually with something like:
(map :prevalence)
(find-largest-elements 10)
do I now need to use reduce or can I just use the method max?
max would work, but the next goal is for find-max to be able to return the two largest elements rather than just the largest
oke, I was thinking
(defn find-max [sequence]
  (reduce max sequence))
untested
looks great
constant memory, and if given a lazy sequence, it can process larger-than-memory data sets
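A minimal sketch of why this stays in constant memory, assuming nothing else holds on to the head of the lazy sequence. The `(map inc (range ...))` input is only a stand-in for a large data set:

```clojure
;; find-max as discussed: reduce only keeps the running maximum around.
(defn find-max [coll]
  (reduce max coll))

;; The lazy sequence is consumed incrementally and never fully held in
;; memory, because nothing retains a reference to its head.
(find-max (map inc (range 1000000)))
;; => 1000000
```

One caveat: `(reduce max [])` throws, because `max` needs at least one argument.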
now, can you improve the function to find the two largest elements?
if I may use a helper function
(defn top-two [[big1 big2 :as acc] x]
  (cond
    (> x big1) [x big1]
    (> x big2) [big1 x]
    :else acc))
(defn find-two-max [sequence]
  (reduce top-two [0 0] sequence))
looks like there's a bug
then I have to test it
wrote it off the top of my head
oke
this seems to work
(ns clojure.examples.hello
  (:gen-class))
(defn top-two [[big1 big2 :as acc] x]
  (cond
    (> x big1) [x big1]
    (> x big2) [big1 x]
    :else acc))
(defn find-two-max [sequence]
  (reduce top-two [0 0] sequence))
(print (find-two-max '(1 2 3)))
but it gives (3 2) as the answer and not (2 3)
try (top-two [1 2] 2)
gives [2 1] as answer
the answer should be [2 2]
oke
then I think I need to make another arm to the cond
yep
(defn top-two [[big1 big2 :as acc] x]
  (cond
    (= x big2) [big2 big2]
    (> x big1) [x big1]
    (> x big2) [big1 x]
    :else acc))
(defn find-two-max [sequence]
  (reduce top-two [0 0] sequence))
(print (top-two [1 2] 2))
gives now the right answer
but this is not good when I want to have a variable number I think
there's still an issue:
(top-two [1 3] 2) ;; [2 1]
oops more to investigate
moment
hmm, still in the dark why this is not working
(defn top-two [[big1 big2 :as acc] x]
  (cond
    (> x big1) [x big1]
    (> x big2) [big1 x]
    :else acc))
the structure is fine, but you should double check the returns
which cond branch is the test case taking?
this one (> x big1) [x big1]
and it should return x and big2
yep, then it's doing fine
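Much of the confusion here comes from the order of the pair inside acc. A minimal sketch of one alternative, which fixes acc as [largest second-largest] and seeds it with nil instead of 0 so that negative numbers also work. The names `top-two-desc` and `find-two-max-desc` are hypothetical, not from the code in the repo:

```clojure
(defn top-two-desc
  "Reducing step. acc is always [largest second-largest]; a slot is nil
  until enough values have been seen."
  [[big1 big2 :as acc] x]
  (cond
    ;; x is the new largest: the old largest moves to second place
    (or (nil? big1) (> x big1)) [x big1]
    ;; x only beats the second largest
    (or (nil? big2) (> x big2)) [big1 x]
    :else acc))

(defn find-two-max-desc [coll]
  (reduce top-two-desc [nil nil] coll))

(find-two-max-desc [1 3 2])
;; => [3 2]
(find-two-max-desc [-5 -1 -3])
;; => [-1 -3]
```

Because the order is part of the contract, a test input like `[1 2]` for acc would not be a valid state for this version; `[2 1]` would be.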
ok, now how would you extend find-two-max to find-n-max?
that I find difficult
the only way I see it, is to use a collection that holds the highest numbers
but then it is possible that too many or too few numbers are in it because it is not fixed
and I do not see how I can change the arms of the cond then
so right now I'm thinking I do not have enough xp to solve this one
that's ok. sometimes it takes a little bit of time
to come up with an answer
especially if it's not similar to other problems you've worked on
I think I need some sort of loop
I can't destructure it into parts because I do not know how many items the collection has
or some sort of recursion
I can destructure it into the first and the rest
sorry, I do not see it
I get the feeling I'm close but I'm missing a few pieces or do not know how the pieces fit together
it might make it easier if acc was sorted
oke, acc is the outcome we wanted. Right?
and I have to look at what acc is when the parameter is a collection
I was referring to acc in top-two
oke
now I'm totally confused
I guess top-two would need to be top-n
I was thinking you wanted this :
(defn top-n [collection x] ....)
yea
maybe this is better : (def topn [collection :as acc x] ......)
if so, then I still do not see how the arms of the collection should be
sorry
arms?
yep. the arms of the cond where I have to check so I can find the highest n numbers
so acc is the collection we're accumulating into that will contain the top n items. if acc was sorted, is there a way to check if a new value, x, should be added, replace a value, or not be added?
yep
I agree
lets say we have now (1 2 3)
and we have the number 5
then we should have (2 3 5)
right ?
and if we have (2 3 5)
and we have 1
nothing has to change
so the middle is never changed
thinking aloud to see if I can figure out how to do it in code
no, I see how things should be but not how to make it work in code
It looks like I'm overthinking things
hmm, I see a pattern
when I have (1 2)
and the number 3
I can compare it to the 2 and keep the 2 and add the 3
when I have (1 5 9)
and the number 8
I can compare it to the last one in the collection, and it is not larger, so nothing is changed
then I can compare it to the second-to-last one, and it is larger, so I replace that number with the given number
so it looks like I have to make a reversed loop for the comparisons
when I have (1 5 7)
and a 8
I can compare it again with the last one that is true so I replace it with the 8
there are often built-in data structures that will remain sorted as you add values. I can't think of one for clojure off the top of my head though, but there's probably something if I looked hard enough
oke
but then I still have to delete one item and add an item
I was thinking of something simple like: (->> (conj acc x) (sort) (take n))
it's not the most efficient implementation, but since acc is small, it's probably fine
oke, but it could be very very big when someone wanted to hold the highest 50 or 100 or 1000 items
right. it depends on the use case
if I wanted to prepare for that use case, I would probably look for a data structure on https://www.clojure-toolbox.com/ (under data structures) to see if there's already an efficient data structure for that purpose
I would also investigate sorted-map
and then run benchmarks
but I think we are going too far down the rabbit hole
right
I can live with a collection of 1 - 20 items now
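For the larger-n case mentioned above, one option is a bounded min-heap through Java interop. This is a sketch under assumptions: it swaps in `java.util.PriorityQueue` (a standard JDK class) rather than a Clojure data structure, `find-top-n-heap` is a hypothetical name, and nothing here has been benchmarked against the sort-based version:

```clojure
(import 'java.util.PriorityQueue)

(defn find-top-n-heap
  "Returns the n largest elements of coll in ascending order.
  Hypothetical helper: the heap never holds more than n + 1 elements."
  [n coll]
  (let [pq (PriorityQueue.)]
    (doseq [x coll]
      (.add pq x)
      ;; the smallest element sits at the head; drop it once we exceed n
      (when (> (.size pq) n)
        (.poll pq)))
    (sort (vec pq))))

(find-top-n-heap 3 [5 1 9 3 7 2])
;; => (5 7 9)
```

Each insert costs O(log n) instead of re-sorting the whole accumulator, which only starts to matter when n gets large.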
oke, so now instead of the arms of a cond, just the code you showed me
I have to try if that could work
hmm. not what I expected when trying this in an online repl
(ns clojure.examples.hello
  (:gen-class))
(defn top-n [[collection :as acc] x n]
  (->> (conj acc x) (sort) (take n)))
(print (top-n [1 2] 2 2))
gives [1 2]
where I expect [2 2]
and when I do this :
(ns clojure.examples.hello
  (:gen-class))
(defn top-n [[collection :as acc] x n]
  (->> (conj acc x) (sort) (take n)))
(print (top-n [1 2] 2 1))
I get 1
where I expect 2
so either I am doing something wrong or the code is not working
or do we have to add a reverse after the sort
yep, this is working
(ns clojure.examples.hello
  (:gen-class))
(defn top-n [[collection :as acc] x n]
  (->> (conj acc x)
       (sort)
       (reverse)
       (take n)
       (reverse)))
(print (top-n [1 5 9] 8 2))
gives me (8 9)
which is good and I like to see it this way
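A shorter way to get the same ascending result, as a sketch: `take-last` keeps the n largest items of an ascending sort directly, so the two reverse calls are not needed. `top-n-step` is a hypothetical name used to keep it apart from top-n:

```clojure
(defn top-n-step
  "Adds x to acc and keeps only the n largest values, in ascending order."
  [acc x n]
  (->> (conj acc x)
       (sort)
       (take-last n)))

(top-n-step [1 5 9] 8 2)
;; => (8 9)
(top-n-step [1 2] 2 2)
;; => (2 2)
```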
so I can use this method/function to do what you wanted to do with replacing code
I will try it tomorrow .
Thanks for the lessons
:thumbsup:
have a good night
maybe clojure is a nicer language than I thought
but a totally different one than ruby or c# or haskell
yep, it's definitely in a unique position compared to other languages
I hope I will one day be so proficient that I can do AOC or exercism challenges
and begin learning web development
I have two projects in mind
Can clojure work properly on Windows or is it better to use linux for it
if I get it working, I will begin with the brave book
I think this chapter : https://aphyr.com/posts/352-clojure-from-the-ground-up-polymorphism
is too much for a beginner like me
sorry, still one question about your feedback
you wrote this :
I would separate the data processing from the data loading. it's very common to want to load data from different sources and reuse the data processing functions
but I do that already, file is only the filename. the loading and parsing of the json is done in the load-json method
right, but most-prevalent is doing both loading and processing
and there's no way to do just the data processing
so if you had a different data source, you'd have to copy most of the code from most-prevalent
sorry, I lost you
you mean if I want to do the same with a file which does not have the same fields
so you were talking about this part
(map (fn [county]
       {:county (fips (fips-code county))
        :prevalance (calculate_prevalance county field-name)
        :report-count (field-name county)
        :population (:county_population county)}))
(sort-by :prevalance)
and I thought you were talking about this part
(->> file
     load-json
if so, I have to think how to solve that
maybe a separate method for that part
chips, not working
(defn top-n [[collection :as acc] x n]
  (->> (conj acc x)
       (sort)
       (reverse)
       (take n)
       (reverse)))
(defn most-prevalent
  "Given a JSON filename of UCR crime data for a particular year, finds the
  counties with the most DUIs."
  [file field-name]
  (->> file
       load-json
       (map (fn [county]
              {:county (fips (fips-code county))
               :prevalance (calculate_prevalance county field-name)
               :report-count (field-name county)
               :population (:county_population county)}))
       (map :prevalance)
       (top-n 10)
       (reverse)))
(clojure.pprint/print-table (most-prevalent "2008.json" :auto_thefts))
Error:
; Execution error (ArityException) at ground-up.chapter7/most-prevalent (form-init17204185006522729914.clj:70).
; Wrong number of args (2) passed to: ground-up.chapter7/top-n
Can you help me figure out what I did wrong ?
you need a reduce somewhere
you need to have a find-top-n that uses reduce and top-n
and you'll want find-top-n to have the collection as the last argument so that it works with ->> in most-prevalent
oke, I will think about that next year
in 2 min it is 2021 here
happy new year over there!
thanks you too
i've still got 9 more hours of 2020
oke
here it is now 00:04 so 2020 is over with a lot of corona lockdowns and so on
I hope 2021 will be better
I think I'm giving up on this. I thought about this:
(def find-top-n [n]
  (reduce ((top-n n))))
but then I need another argument, the collection, which I do not have
(defn most-prevalent
  "Given a JSON filename of UCR crime data for a particular year, finds the
  counties with the most DUIs."
  [file field-name]
  (->> file
       load-json
       (map (fn [county]
              {:county (fips (fips-code county))
               :prevalance (calculate_prevalance county field-name)
               :report-count (field-name county)
               :population (:county_population county)}))
       (map :prevalance)
       (find-top-n 10)
       (reverse)))
(defn find-top-n [n coll]
  (reduce (fn [acc x] (top-n acc x n)
            coll)))
I haven't tested, but I think something like that should work
nope, it does not work
; Execution error (ArityException) at ground-up.chapter7/find-top-n (form-init4053567356004708660.clj:58).
; Wrong number of args (1) passed to: clojure.core/reduce
oops
that's what I get for not typing into a repl
NP
(defn find-top-n [n coll]
  (reduce (fn [acc x] (top-n acc x n))
          coll))
do those parens match?
i'm blind without my repl
Im going to sleep,
NP
now a weird error message :
; Execution error (UnsupportedOperationException) at ground-up.chapter7/top-n (form-init704639222239290302.clj:49).
; nth not supported on this type: Float
we can look at it tomorrow or later
really time to sleep here . it's now 00:38
ok, have a good night
maybe you have your repl back
happy new year to you
if you have time and have a repl could you help me with this annoying one
ping
yo
can we work on the code which is still not working
or are you busy or do not have a repl
I can answer some questions
what's the code look like now?
(ns ground-up.chapter7 (:require [cheshire.core :as json] [clojure.pprint]))
(defn load-json
  "Given a filename, reads a JSON file and returns it, parsed, with keywords."
  [file]
  (json/parse-string (slurp file) true))
(def fips
  "A map of FIPS codes to their county names."
  (->> "fips.json"
       load-json
       :table
       :rows
       (into {})))
(defn fips-code
  "Given a county (a map with :fips_state_code and :fips_county_code keys),
  returns the five-digit FIPS code for the county, as a string."
  [county]
  (str (:fips_state_code county) (:fips_county_code county)))
(defn calculate_prevalance
  [county field-name]
  (if (zero? (:county_population county))
    0
    (float (/ (field-name county) (:county_population county)))))
(defn most-duis
  "Given a JSON filename of UCR crime data for a particular year, finds the
  counties with the most DUIs."
  [file]
  (->> file
       load-json
       (map (fn [county]
              {:county (fips (fips-code county))
               :prevalance (calculate_prevalance county :driving_under_influence)
               :report-count (:driving_under_influence county)
               :population (:county_population county)}))
       (sort-by :prevalance)
       (take-last 10)
       (reverse)))
(clojure.pprint/print-table (most-duis "2008.json"))
(defn top-n [[acc] x n]
  (->> (conj acc x)
       (sort)
       (reverse)
       (take n)
       (reverse)))
(defn find-top-n [n coll]
  (reduce (fn [acc x] (top-n acc x n))
          coll))
(defn most-prevalent
  "Given a JSON filename of UCR crime data for a particular year, finds the
  counties with the most DUIs."
  [file field-name]
  (->> file
       load-json
       (map (fn [county]
              {:county (fips (fips-code county))
               :prevalance (calculate_prevalance county field-name)
               :report-count (field-name county)
               :population (:county_population county)}))
       (map :prevalance)
       (find-top-n 10)
       (reverse)))
(clojure.pprint/print-table (most-prevalent "2008.json" :auto_thefts))
and it produces this error :
; Execution error (UnsupportedOperationException) at ground-up.chapter7/top-n (form-init704639222239290302.clj:49).
; nth not supported on this type: Float
but we are not using nth anywhere
if you type *e it should print the full stack trace
lots of sequence functions use nth under the hood
oh, I think I see it:
(defn top-n [[acc] x n]
  (->> (conj acc x)
       (sort)
       (reverse)
       (take n)
       (reverse)))
[acc] should just be acc
tricky
nope, then I get another error :
; Execution error (ClassCastException) at ground-up.chapter7/top-n (form-init3018563272993570646.clj:50).
; class java.lang.Float cannot be cast to class clojure.lang.IPersistentCollection (java.lang.Float is in module java.base of loader 'bootstrap'; clojure.lang.IPersistentCollection is in unnamed module of loader 'app')
it seems like the issue is in top-n. do you have a guess as to what might be causing the error?
not completely
it looks like we are using something as a float where the compiler wants a collection
I'm thinking now of printing acc, x and n
to see what exactly they contain
he
2.861667E-41.8142852E-4 nil10
so x seems to be nil
wonder why
this bug is also kinda tricky
it's because reduce isn't called with an initial state
(defn find-top-n [n coll]
  (reduce initial-val
          (fn [acc x] (top-n acc x n))
          coll))
oke, I saw that coll in the find-top-n was also not right
with initial-val being the starting value for the reduce state
do you know what the initial val should be?
I'm in doubt between 1 and 0
what type should acc be?
a vector
so I tried
(defn find-top-n [n coll]
  (reduce [0 0]
          (fn [acc x] (top-n acc x n))
          coll))
it should probably just be []
; Execution error (ArityException) at ground-up.chapter7/find-top-n (form-init3018563272993570646.clj:58).
; Wrong number of args (2) passed to: clojure.lang.PersistentVector
oh whoops
args in the wrong order
(defn find-top-n [n coll]
  (reduce (fn [acc x] (top-n acc x n))
          []
          coll))
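A small sketch of why the initial value matters, using a throwaway reducing function (`collect` is a hypothetical name). It reproduces the same kind of "cannot be cast to IPersistentCollection" failure seen above:

```clojure
(defn collect [acc x]
  (conj acc x))

;; With an initial value, acc starts out as the empty vector.
(reduce collect [] [1.5 2.5 3.5])
;; => [1.5 2.5 3.5]

;; Without one, reduce uses the first element (1.5) as acc, so conj is
;; handed a number where it expects a collection and throws.
(try
  (reduce collect [1.5 2.5 3.5])
  (catch ClassCastException _ :acc-was-a-number))
;; => :acc-was-a-number
```

The argument order is `(reduce f init coll)`: function first, then the initial value, then the collection.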
; Execution error (IllegalArgumentException) at ground-up.chapter7/eval22508 (form-init3018563272993570646.clj:80).
; Don't know how to create ISeq from: java.lang.Float
(ns ground-up.chapter7 (:require [cheshire.core :as json] [clojure.pprint]))
(defn load-json
  "Given a filename, reads a JSON file and returns it, parsed, with keywords."
  [file]
  (json/parse-string (slurp file) true))
(def fips
  "A map of FIPS codes to their county names."
  (->> "fips.json"
       load-json
       :table
       :rows
       (into {})))
(defn fips-code
  "Given a county (a map with :fips_state_code and :fips_county_code keys),
  returns the five-digit FIPS code for the county, as a string."
  [county]
  (str (:fips_state_code county) (:fips_county_code county)))
(defn calculate_prevalance
  [county field-name]
  (if (zero? (:county_population county))
    0
    (float (/ (field-name county) (:county_population county)))))
(defn most-duis
  "Given a JSON filename of UCR crime data for a particular year, finds the
  counties with the most DUIs."
  [file]
  (->> file
       load-json
       (map (fn [county]
              {:county (fips (fips-code county))
               :prevalance (calculate_prevalance county :driving_under_influence)
               :report-count (:driving_under_influence county)
               :population (:county_population county)}))
       (sort-by :prevalance)
       (take-last 10)
       (reverse)))
(clojure.pprint/print-table (most-duis "2008.json"))
(defn top-n [acc x n]
  (->> (conj acc x)
       (sort)
       (reverse)
       (take n)
       (reverse)))
(defn find-top-n [n coll]
  (reduce (fn [acc x] (top-n acc x n))
          []
          coll))
(defn most-prevalent
  "Given a JSON filename of UCR crime data for a particular year, finds the
  counties with the most DUIs."
  [file field-name]
  (->> file
       load-json
       (map (fn [county]
              {:county (fips (fips-code county))
               :prevalance (calculate_prevalance county field-name)
               :report-count (field-name county)
               :population (:county_population county)}))
       (map :prevalance)
       (find-top-n 10)
       (reverse)))
(clojure.pprint/print-table (most-prevalent "2008.json" :auto_thefts))
it's not showing a line number for the error
is there a way to do something like eval-buffer?
usually that will fix the exception not showing a proper file and line number
no idea if that is possible
is line 80 now the culprit ?
I use vs code with calva
I use emacs
line 80 might be it
what's line 80?
I see it
I think
the code gives now
(0.0051150895 0.004174161 0.0036036037 0.0030321407 0.00243309 0.002319513 0.0018750526 0.0016433854 0.0015932024 0.0015475558)
nil
success!?
print-table wants this :
Prints a collection of maps in a textual table. Prints table headings
so we lose the rest of the data somewhere
ah, I get it
the old code printed the name, the prevalence, the population and the number of DUIs
right
hmm, I have to think now how we can keep the data
it looks like our find-top-n loses it
the extra info is discarded before find-top-n is called
do you see where?
o, I deleted the map but that makes another error message
right, because find-top-n expects a list of comparables (like a list of numbers)
If I delete the (map :prevalance) then I see this :
; Execution error (ClassCastException) at java.util.TimSort/countRunAndMakeAscending (TimSort.java:355).
; class clojure.lang.PersistentArrayMap cannot be cast to class java.lang.Comparable (clojure.lang.PersistentArrayMap is in unnamed module of loader 'app'; java.lang.Comparable is in module java.base of loader 'bootstrap')
you'll need to update find-top-n and top-n to accept a key function to sort with
check out the docs for sort-by and see if you can think of a way to update find-top-n and top-n to also accept a keyfn
oke, so I have to change (sort) to sort-by?
that's a good start
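A sketch of what passing a key function through could look like. The names `top-n-by` and `find-top-n-by` and the sample maps are hypothetical; only `sort-by`, `take-last` and `reduce` come from clojure.core:

```clojure
(defn top-n-by
  "Adds x to acc and keeps the n items with the largest (keyfn item),
  in ascending order."
  [keyfn n acc x]
  (->> (conj acc x)
       (sort-by keyfn)
       (take-last n)))

(defn find-top-n-by
  "Collection goes last so the function can sit at the end of a ->> pipeline."
  [keyfn n coll]
  (reduce (fn [acc x] (top-n-by keyfn n acc x))
          []
          coll))

(find-top-n-by :prevalance 2
               [{:county "A" :prevalance 0.1}
                {:county "B" :prevalance 0.3}
                {:county "C" :prevalance 0.2}])
;; => ({:county "C", :prevalance 0.2} {:county "B", :prevalance 0.3})
```

Since the whole maps are kept in acc, the other fields (county name, population, report count) survive for print-table.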
then I see this :
({:county TX, Kenedy, :prevalance 0.0051150895, :report-count 2, :population 391} {:county NM, Grant, :prevalance 0.004174161, :report-count 123, :population 29467} {:county OR, Gilliam, :prevalance 0.0036036037, :report-count 6, :population 1665} {:county OR, Sherman, :prevalance 0.0030321407, :report-count 5, :population 1649} {:county TX, Hudspeth, :prevalance 0.00243309, :report-count 8, :population 3288} {:county TX, Hall, :prevalance 0.002319513, :report-count 8, :population 3449} {:county MO, Jackson, :prevalance 0.0018750526, :report-count 1514, :population 807444} {:county TX, Refugio, :prevalance 0.0016433854, :report-count 12, :population 7302} {:county AK, Prince of Wales-Outer Ketchikan, :prevalance 0.0015932024, :report-count 3, :population 1883} {:county MD, Baltimore city, :prevalance 0.0015475558, :report-count 982, :population 634549})
nil
hey, the names have changed, not good
it sorts on name, not on prevalence
| :county | :prevalance | :report-count | :population |
|--------------+--------------+---------------+-------------|
| AL, Autauga | 2.861667E-4 | 15 | 52417 |
| AL, Baldwin | 1.8142852E-4 | 32 | 176378 |
| AL, Barbour | 1.4354926E-4 | 4 | 27865 |
| AL, Bibb | 9.219989E-5 | 2 | 21692 |
| AL, Blount | 1.9146418E-4 | 11 | 57452 |
| AL, Bullock | 0.0 | 0 | 10705 |
| AL, Butler | 3.988036E-4 | 8 | 20060 |
| AL, Calhoun | 5.771285E-4 | 67 | 116092 |
| AL, Chambers | 4.6220067E-4 | 16 | 34617 |
| AL, Cherokee | 8.107998E-5 | 2 | 24667 |
when I do this :
(defn top-n [acc x n]
  (->> (conj acc x)
       (sort-by :prevelance)
       (reverse)
       (take n)
       (reverse)))
this is not funny anymore
this list is different:
({:county TX, Kenedy, :prevalance 0.0051150895, :report-count 2, :population 391} {:county NM, Grant, :prevalance 0.004174161, :report-count 123, :population 29467} {:county OR, Gilliam, :prevalance 0.0036036037, :report-count 6, :population 1665} {:county OR, Sherman, :prevalance 0.0030321407, :report-count 5, :population 1649} {:county TX, Hudspeth, :prevalance 0.00243309, :report-count 8, :population 3288} {:county TX, Hall, :prevalance 0.002319513, :report-count 8, :population 3449} {:county MO, Jackson, :prevalance 0.0018750526, :report-count 1514, :population 807444} {:county TX, Refugio, :prevalance 0.0016433854, :report-count 12, :population 7302} {:county AK, Prince of Wales-Outer Ketchikan, :prevalance 0.0015932024, :report-count 3, :population 1883} {:county MD, Baltimore city, :prevalance 0.0015475558, :report-count 982, :population 634549})
and is actually sorted on prevalance
where's the other list coming from?
which other list
?
the list sorted by names?
I found out I made a typo: I used :prevelance in place of :prevalance
so I think the code is working again
this looks better
| :county | :prevalance | :report-count | :population |
|-------------------------------------+--------------+---------------+-------------|
| TX, Kenedy | 0.0051150895 | 2 | 391 |
| NM, Grant | 0.004174161 | 123 | 29467 |
| OR, Gilliam | 0.0036036037 | 6 | 1665 |
| OR, Sherman | 0.0030321407 | 5 | 1649 |
| TX, Hudspeth | 0.00243309 | 8 | 3288 |
| TX, Hall | 0.002319513 | 8 | 3449 |
| MO, Jackson | 0.0018750526 | 1514 | 807444 |
| TX, Refugio | 0.0016433854 | 12 | 7302 |
| AK, Prince of Wales-Outer Ketchikan | 0.0015932024 | 3 | 1883 |
| MD, Baltimore city | 0.0015475558 | 982 | 634549 |
nil
:thumbsup:
I only do not know if the data is all right
but it looks good
thanks
I hope I did not cost you too much time with my stupid questions
and not many people here have the time for feedback
I still see something weird
for prevalence I say it must be a float
(float (/ (field-name county) (:county_population county)))))
but this is not a float 8.70322E-4
ping any idea why this happens ?
this one does it right :
(defn most-duis
  "Given a JSON filename of UCR crime data for a particular year, finds the
  counties with the most DUIs."
  [file]
  (->> file
       load-json
       (map (fn [county]
              {:county (fips (fips-code county))
               :prevalance (calculate_prevalance county :driving_under_influence)
               :report-count (:driving_under_influence county)
               :population (:county_population county)}))
       (sort-by :prevalance)
       (take-last 10)
       (reverse)))
and this one not :
(defn most-prevalent
  "Given a JSON filename of UCR crime data for a particular year, finds the
  counties with the most DUIs."
  [file field-name]
  (->> file
       load-json
       (map (fn [county]
              {:county (fips (fips-code county))
               :prevalance (calculate_prevalance county field-name)
               :report-count (field-name county)
               :population (:county_population county)}))
       (find-top-n 10)
       (reverse)))
what makes you think 8.70322E-4 is not a float?
it looks more like scientific notation to me
I think there are just fewer responses because it's around the holidays
I think it's just printing it out differently
one does display 0.008 and the other 8e-4
yep, and I wonder why
I want both to display the same
if possible
how it's formatted depends on how you're printing it and what formatter is being used
if you care how it's formatted, you should explicitly format it
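A minimal sketch of explicit formatting, assuming six decimal places is what is wanted. `format-prevalence` is a hypothetical helper; it goes through Java's `String/format` with a pinned locale so the decimal separator does not depend on the machine's settings:

```clojure
(import 'java.util.Locale)

(defn format-prevalence
  "Hypothetical helper: renders x as a string with six decimal places.
  Locale/US is pinned so the decimal separator is always a dot."
  [x]
  (String/format Locale/US "%.6f" (into-array Object [x])))

(format-prevalence (float 8.70322E-4))
;; => "0.000870"
(format-prevalence 0.0051150895)
;; => "0.005115"
```

The result is a string, so this belongs at the very end, just before printing, and not in the data that gets sorted.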
oke
so clojure.pprint/print-table can display two things differently
¯\_(ツ)_/¯
> (type (float 8.70322E-4))
java.lang.Float
> (float 8.70322E-4)
8.70322E-4
thanks, I will let it be
and tomorrow I will try to do the challenges of chapter 2 of the brave book
Thanks a lot for all the patience with me
and I hope I'm a good "student"
:smile:
why the smile ?
I think I'm older than you
GN
these are challenges from this page
looks pretty good. a few thoughts:
• it seems like most-duis could be written in terms of most-prevalent
• most-prevalent receives a file and immediately calls load-json. I would separate the data processing from the data loading. it's very common to want to load data from different sources and reuse the data processing functions
• calculate_prevalence should probably be calculate-prevalence
• sort-by requires loading the full data set into memory. since you only need the 10 most prevalent values, most-prevalent could be re-written to keep at most 10 values in memory so that a larger than memory data set could be processed
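The first two points could be sketched like this. It is only an illustration under assumptions: `prevalence`, `most-prevalent-in` and `most-duis-in` are hypothetical names, the sample maps are made up, and it still uses sort-by, so it does not address the memory point:

```clojure
(defn prevalence
  "Reports per resident for field-name; 0.0 when the population is zero."
  [county field-name]
  (let [population (:county_population county)]
    (if (zero? population)
      0.0
      (double (/ (field-name county) population)))))

(defn most-prevalent-in
  "Pure data processing: takes already-loaded county maps and returns the n
  counties with the highest prevalence of field-name, highest first."
  [field-name n counties]
  (->> counties
       (map (fn [county]
              (assoc county :prevalence (prevalence county field-name))))
       (sort-by :prevalence)
       (take-last n)
       (reverse)))

;; most-duis becomes a one-liner on top of the general function.
(defn most-duis-in [counties]
  (most-prevalent-in :driving_under_influence 10 counties))

;; Loading stays at the call site, e.g. (most-duis-in (load-json "2008.json")),
;; while any other source can pass its own sequence of maps:
(map :name
     (most-prevalent-in :auto_thefts 2
                        [{:name "A" :county_population 100 :auto_thefts 5}
                         {:name "B" :county_population 0 :auto_thefts 0}
                         {:name "C" :county_population 200 :auto_thefts 30}]))
;; => ("C" "A")
```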
oke, and how could I do the last point
the challenge said only to display the 10 most
the given dataset has some 18 thousand entries
I think rewriting it so that the memory consumption is O(1) rather than O(n) is a good exercise.
if no solution comes to mind, I would try first figuring out 1. how to find the maximum element of a lazy sequence with O(1) memory 2. how to find the 2 largest elements ... 3. finally, how to find the n largest elements ...
may I have some hints on how to do so then
I have been learning clojure for just a week
the same techniques would apply in just about any language
oke, give me then time to find the answer to the first question
right now no idea how I can find that out
are you familiar with reduce?
yep. I have learned that
like (reduce + [1 2 3 4 5])
to add up all items of a collection
but it's late here so if you do not mind I'm heading to bed
no problem. have a good night
thanks
if you still want to, we can talk about this tomorrow when it's not so late here