Was surprised to find that my ETL pipeline spends a considerable amount of time in SSL processing (JDK 8, for now). I don’t control the code that handles the connections (MongoDB and JDBC Postgres) - is there a way to speed things up other than throwing hardware at the problem?
This is a long shot, but does the flame graph show wait time (e.g. waiting for I/O)? Have you checked the entropy levels on the machine that's running this (especially if it's a remote instance)?
I think that I/O is built in the various function calls. Not sure what entropy levels is about…
Like this? https://forums.aws.amazon.com/thread.jspa?messageID=249079
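For anyone wondering what "entropy levels" means here: on Linux the kernel exposes an estimate of its randomness pool at /proc/sys/kernel/random/entropy_avail, and a process that reads from the blocking /dev/random can stall when that estimate is low. A minimal sketch for checking it, assuming a Linux host (Python only for brevity; `available_entropy` is a made-up helper name, and as far as I know recent kernels report a constant value, so a low reading is only meaningful on older ones):

```python
from pathlib import Path
from typing import Optional

# Standard Linux location of the kernel's entropy estimate (in bits).
ENTROPY_FILE = Path("/proc/sys/kernel/random/entropy_avail")


def available_entropy(path: Path = ENTROPY_FILE) -> Optional[int]:
    """Return the kernel's entropy estimate in bits, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        # File missing (non-Linux host), unreadable, or not a number.
        return None


if __name__ == "__main__":
    bits = available_entropy()
    if bits is None:
        print("entropy estimate not available on this system")
    else:
        print(f"entropy_avail: {bits} bits")
```

If the number is consistently tiny while the job runs, that would point at blocking on randomness rather than at the cipher work itself.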
It might be that I’m also saturating some I/O channel. Need to dig deeper into AWS to see the metrics.
@orestis That sounds like your code is creating and tearing down a lot of connections -- perhaps look at ways to have longer-lived connections and more connection pooling?
Sorry, I meant that the JVM is spending quite a lot of time doing SSL operations. I’ll paste the relevant part of the flame graph here
Sorry, I don't know how to read that.
This is the mongo-specific part: sun.security.ssl.InputRecord.decrypt is taking half the time.
Width is time spent in a specific function, bottom layers include the top.
(Super cool tool: http://clojure-goes-fast.com/blog/profiling-tool-async-profiler/)
Effectively, it looks like when I’m loading documents from mongo, 55% of the CPU is spent on the BSON decoding, and 45% on the SSL decoding.
(this feels like yet more support for our decision to stop using MongoDB 🙂 )
Heh, you’d think, but here’s the relevant part for JDBC 😄
again, sun.security.ssl.AppOutputStream.write is taking like 40% of the time
Security is expensive 🙂
(I’m loading documents from Mongo, doing light massage and dumping them into Postgres)
Good thing I decided to profile because I naively thought that the massaging time would be my bottleneck, but turns out it isn’t.
I guess I'd ask "Is the process fast enough?" rather than "Where is it spending its time?"
Nope, it’s not 😄
Normally, the big speed ups come from algorithmic changes -- and it doesn't seem like you've got much hope of those.
So... hardware? Maybe that is your only option?
There might be “free” performance gains by bumping to JDK 11, which I’ll try next. Googling about JVM SSL performance shows that it is a common issue (Netty says use OpenSSL)
We moved from 8 to 11 a while back for nearly all our processes but I can't say we noticed much speedup.
(I suspect we're bottlenecked on other stuff than SSL tho')
Yeah this is a special case in that it syncs Mongo to Postgres and has to pretty much pipe the entire database. It wouldn’t be noticeable with normal day-to-day operations.
Hm, this seems relevant: https://github.com/google/conscrypt/
I’ll try throwing hardware at it and see what changes
Hardware does make a difference, and this is a nicely parallel problem so I think I can live with this for now.
This is interesting. Given that the SSL stuff takes at most 50% of the time, it seems, I wouldn't hope to get a huge performance gain from optimizing it (Amdahl's law). Did you try to run it without SSL? Does it make a huge difference?
Running without SSL is not an option, so I didn’t even try :) I'm trying to run this on production configurations, so disabling SSL isn't possible.
The thing is that people claim that OpenSSL is 1 or 2 orders of magnitude faster so essentially I would get twice the speed if SSL becomes insignificant.
Assuming 45% of the time is spent on SSL and a 10x speedup of that part (it would be really interesting to verify whether that's possible), you get at most a 1.7x overall speedup: https://www.google.com/search?q=1+%2F+(+(1+-+0.45)+%2B+0.45%2F10)+)&rlz=1C5CHFA_enCZ836CZ837&oq=1+%2F+(+(1+-+0.45)+%2B+0.45%2F10)+)&aqs=chrome..69i57j6j69i64l2.11187j0j7&sourceid=chrome&ie=UTF-8 (https://en.wikipedia.org/wiki/Amdahl%27s_law). So it depends on whether that is enough. I assume that if you noticed it is slow, this speedup might not be enough 🙂
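To make the arithmetic above easy to replay with other numbers, here is a small sketch of the Amdahl's law formula, 1 / ((1 - p) + p / s). The function name and the sample inputs are just for illustration; the 45% share comes from the flame graph discussed above, and the 10x and 100x factors are the claimed (unverified) OpenSSL speedups:

```python
def amdahl_speedup(fraction: float, factor: float) -> float:
    """Overall speedup when `fraction` of the runtime gets `factor` times faster."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if factor <= 0.0:
        raise ValueError("factor must be positive")
    # The untouched part keeps its cost; the optimized part shrinks by `factor`.
    return 1.0 / ((1.0 - fraction) + fraction / factor)


if __name__ == "__main__":
    ssl_share = 0.45  # share of CPU time attributed to SSL in the profile
    for factor in (10, 100):
        print(f"{factor}x faster SSL -> {amdahl_speedup(ssl_share, factor):.2f}x overall")
    # Upper bound if SSL cost vanished entirely: 1 / (1 - 0.45)
    print(f"upper bound -> {1.0 / (1.0 - ssl_share):.2f}x overall")
```

Even a 100x faster SSL layer stays under the 1 / 0.55 ceiling of roughly 1.8x, which matches the "twice the speed" intuition only as a best case.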