@aengelberg @dm3 as a general rule, if there's an error somewhere in a stream transform or operation, the error will be logged and the stream will be closed
which should lead to things being cleaned up
if that's not true in some situations, I'd consider that a bug in my implementation
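(a minimal sketch of that behavior, assuming stock `manifold.stream` and a transform whose function throws:)
```clojure
(require '[manifold.stream :as s])

;; sketch: a `map` whose function throws; the error should get logged by manifold
;; and the connection torn down, rather than the exception being delivered to consumers
(def src (s/stream))
(def dst (s/map #(/ 1 %) src))

@(s/put! src 0)     ;; the ArithmeticException gets logged
(s/drained? dst)    ;; ...and the downstream stream should end up closed/drained, with no error value
```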
I get that it usually cleans up resources properly, but it doesn't propagate the fact that something went wrong. A stream closing could mean either success or failure.
ah, I see
core.async chans have a similar problem.
error propagation is a bit tricky in general, imagine if `src` is being split across two different sinks, `a` and `b`
if `a` fails, should that cause `src` to yield an error too?
or should it just continue to feed into `b`?
more to the point, if you're feeding into a network socket and it closes, is that correct behavior or an error?
I don't try to give anything richer than TCP offers, basically
in order to talk about something succeeding at the application level, you need to define your success criteria using logic layered atop the transport layer, not in the transport layer
in fairness, I absolutely could have made `put!` fail with an error rather than return false if, say, `map` fails
but I suspect that people aren't checking for that, and in practice it would just cause the same error to be logged all the way up the topology
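(for reference, a sketch of what the producer sees today; closure and a `false` from `put!` are the only signals, whatever the cause:)
```clojure
(require '[manifold.stream :as s])

;; a closed sink rejects the put with `false`; a deliberate close and a close
;; caused by a failed transform downstream look identical from here
(def sink (s/stream))
(s/close! sink)
@(s/put! sink :msg)   ;; => false
```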
Great points
this isn't to say that you couldn't create a richer error model, I'm sure other stream libraries do
but being the lowest common denominator across all stream abstractions makes that difficult
My original complaint stemmed from the fact that I wanted to convert an `InputStream` into a `(seq-of bytes)`, and `byte-streams` was (unnecessarily?) using a manifold stream in the conversion path, and that conversion silenced errors such as `amazon.s3.SocketTimeoutException` that I wanted the consumer of the lazy seq to see.
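(roughly the shape of the conversion in question; the S3 specifics are elided, and `in` stands for whatever InputStream the client returns:)
```clojure
(require '[byte-streams :as bs])

;; the conversion being described: an InputStream turned into a lazy seq of byte
;; arrays via byte-streams' conversion graph, which (at the time) routed through
;; a manifold stream internally
(defn object->chunks [^java.io.InputStream in]
  (bs/convert in (bs/seq-of bytes)))
```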
hmm
it wasn't even logging the error?
It was logging the error
which is great
but it wasn't propagating it upwards
yes
that is an unnecessary conversion to a stream, I think
which I need to happen for my job to die and mark itself as failed
I just need to define a new edge in the conversion graph
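(a hypothetical sketch of what a direct edge could do under the hood: read the InputStream straight into a lazy seq so read errors reach whoever forces it; `input-stream->byte-seq` is a made-up name, not byte-streams' actual conversion:)
```clojure
;; hypothetical: chunked reads directly off the InputStream; an IOException or
;; SocketTimeoutException is thrown to the consumer forcing the seq, instead of
;; being logged and swallowed inside a stream transform
(defn input-stream->byte-seq
  [^java.io.InputStream in chunk-size]
  (lazy-seq
    (let [buf (byte-array chunk-size)
          n   (.read in buf)]
      (if (pos? n)
        (cons (java.util.Arrays/copyOf buf n)
              (input-stream->byte-seq in chunk-size))
        (do (.close in) nil)))))
```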
but I think the same issue would happen if the connection just closed, right?
it depends on the implementation
is the S3 client validating the length of the value?
in this case I was reading from S3, not writing to it
right, so what happens if the connection closes halfway through downloading the value
is an exception thrown?
but I have a different, separate use case that also suffered from this problem, in which I was uploading each chunk to S3 as a batch, which could potentially fail
because if so, that strikes me as a little weird; you wouldn't expect the InputStream to be doing that for you
good question, I'd have to think back to the last time I saw this error happen in prod
and this comes back to my original point of "you get some data, it's up to you to decide if it's the correct, full data"
I know you don't control the S3 client api, so I'm not really pointing the finger at you here
but with anything over the network, lack of an explicit error is not a guarantee that everything's correct, which is kind of why I haven't worried about this part of Manifold very much
I could easily reproduce this by just manually opening an inputstream, waiting out the timeout, and seeing what happens
(which isn't a great defense, tbh, but that's my rationale)
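(one hedged illustration of "layer your success criteria above the transport": check what you actually received against what the server claimed; `expected-length` would come from e.g. the object's Content-Length metadata:)
```clojure
;; hypothetical application-level check: the transport only gives you bytes and a
;; close, so "did we get all of it?" has to be asserted separately
(defn assert-complete! [expected-length chunks]
  (let [actual (reduce + 0 (map alength chunks))]
    (when (not= actual expected-length)
      (throw (ex-info "incomplete download"
                      {:expected expected-length :actual actual})))
    actual))
```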
so basically my takeaway here is that I need to figure out what exact error I want to be able to see before I complain about not seeing it
possibly, or at least consider that failing to process everything isn't always signaled by an exception
so having exception propagation doesn't actually mean all issues are surfaced, and may lead to thinking that things are correct when they aren't
I'm fairly confident that my S3 read SocketTimeoutException was somehow surfacing through one of the InputStream methods. But I'll need to confirm, since you make an interesting point that it might just close the stream instead