UCInterpreter.merge is responsible for merging UCValues
together they form a join-semilattice https://en.wikipedia.org/wiki/Semilattice
currently merge is bugged and can create cycles
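For reference, a minimal sketch of what "join-semilattice" demands of a merge function: it must be idempotent, commutative and associative. The flat lattice below (BOTTOM, TOP, join, lawsHold) is a hypothetical toy over strings, not portkey's UCValue.

```java
import java.util.Arrays;
import java.util.List;

// Toy flat join-semilattice over strings: BOTTOM < every constant < TOP.
// All names here are hypothetical; nothing is taken from portkey or ASM.
final class FlatLattice {
    static final String BOTTOM = "<bottom>";
    static final String TOP = "<top>";

    // Least upper bound of two elements.
    static String join(String a, String b) {
        if (a.equals(b)) return a;
        if (a.equals(BOTTOM)) return b;
        if (b.equals(BOTTOM)) return a;
        return TOP; // two different constants, or a constant joined with TOP
    }

    // Checks idempotence, commutativity and associativity on a sample set.
    static boolean lawsHold(List<String> xs) {
        for (String a : xs) {
            if (!join(a, a).equals(a)) return false;
            for (String b : xs) {
                if (!join(a, b).equals(join(b, a))) return false;
                for (String c : xs) {
                    if (!join(join(a, b), c).equals(join(a, join(b, c)))) return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(lawsHold(Arrays.asList(BOTTOM, "a", "b", TOP)));
    }
}
```

The key point is that BOTTOM and TOP are two distinct elements: one is absorbed by everything, the other absorbs everything.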
asm Analyzer takes a method and returns an array of Frames, one frame per instruction.
SemiLattice for a class name would have been neat 🙂
a Frame represents the (abstract) state of the VM at a given instruction: it contains abstract values for all locals and the stack
frames[0] is initialized with an empty stack and abstract arguments (created only from their types). Uninitialized locals are abstracted to whatever Interpreter.newValue(null) returns. This value is also used for padding of double-word values (longs and doubles).
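A rough sketch of how the locals of that first frame get laid out, assuming a static method (no `this` in local 0). This is a toy model with strings, not the ASM Frame class; EMPTY stands in for whatever Interpreter.newValue(null) returns, and initialLocals is a hypothetical helper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of the initial frame's locals; not the ASM Frame class.
final class InitialFrameSketch {
    // Placeholder for the value returned by Interpreter.newValue(null).
    static final String EMPTY = "<newValue(null)>";

    // argTypes are JVM descriptors ("I", "J", "D", "Ljava/lang/String;", ...).
    static List<String> initialLocals(List<String> argTypes, int maxLocals) {
        List<String> locals = new ArrayList<>();
        for (String t : argTypes) {
            locals.add("arg:" + t);      // abstract value built from the type only
            if (t.equals("J") || t.equals("D")) {
                locals.add(EMPTY);       // padding slot of a double-word value
            }
        }
        while (locals.size() < maxLocals) {
            locals.add(EMPTY);           // locals the method has not written yet
        }
        return locals;
    }

    public static void main(String[] args) {
        System.out.println(initialLocals(Arrays.asList("I", "J", "Ljava/lang/String;"), 6));
    }
}
```

Note that the padding slot and the never-written slots end up indistinguishable in this model, which is exactly why merging involving that value needs care.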
starting from frames[0], the analyzer interprets the bytecode and produces the next frames
when a jump targets a previously visited instruction, the new frame and the old frame are merged, and subsequent frames are recomputed.
for this process to terminate it must converge
the semilattice properties ensure that progress is made; convergence also requires that the interpreter does not keep creating ever "greater" abstract values.
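To make the merge-and-recompute loop concrete, here is a minimal worklist sketch over a toy program with a single variable. It is only an illustration of the general scheme, not ASM's Analyzer: frames[i] is the abstract value before instruction i, and analyze, assigned and successors are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Toy fixpoint loop over a single abstract variable; not ASM's Analyzer.
final class FixpointSketch {
    static final String BOTTOM = "<bottom>";
    static final String TOP = "<top>";

    // Join of a flat lattice: BOTTOM < constants < TOP.
    static String join(String a, String b) {
        if (a.equals(b)) return a;
        if (a.equals(BOTTOM)) return b;
        if (b.equals(BOTTOM)) return a;
        return TOP;
    }

    // assigned[i] is the constant stored by instruction i, or null if it
    // leaves the variable untouched; successors[i] lists its jump targets.
    static String[] analyze(String[] assigned, int[][] successors) {
        String[] frames = new String[assigned.length];
        frames[0] = BOTTOM;
        Deque<Integer> work = new ArrayDeque<>();
        work.add(0);
        while (!work.isEmpty()) {
            int i = work.remove();
            String out = assigned[i] != null ? assigned[i] : frames[i];
            for (int s : successors[i]) {
                // First visit: copy. Later visits: merge old and new state.
                String merged = frames[s] == null ? out : join(frames[s], out);
                if (!merged.equals(frames[s])) {
                    frames[s] = merged;
                    work.add(s); // state changed, so s must be re-interpreted
                }
            }
        }
        return frames;
    }

    public static void main(String[] args) {
        // 0: x = "a"; 1: loop head; 2: x = "b", jumps back to 1 or on to 3; 3: return
        String[] assigned = {"a", null, "b", null};
        int[][] successors = {{1}, {2}, {1, 3}, {}};
        System.out.println(Arrays.toString(analyze(assigned, successors)));
    }
}
```

The loop stops because each frame can only move up the lattice, and the lattice has finite height; a merge that can move a value back down removes that guarantee.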
Here are the kinds of abstract values we try to represent:
• constants (strings, numbers, classes for a start) as their own value
• uninitialized
• instances with their types
hmm so what happens in merge with different-sized Values?
looking at
@Override
public Value merge(final Value v, final Value w) {
    // is this enough?
    if (v.equals(w)) return v;
    if (((UCValue) v).type == null && w.getSize() == 1) return w;
    if (((UCValue) w).type == null && v.getSize() == 1) return v;
    return UCValue.UNINITIALIZED_VALUE;
}
UCValue.UNINITIALIZED_VALUE is returned I guess
switched to Idea+Cursive for Java land browsing, seems that I have to renew Cursive license
these 4 lines of code are baaaaad (well #2 and #3)
thought so 🙂
or innocent at least, they need friends 🙂
good friends
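One possible reading of why lines #2 and #3 hurt, as a toy reproduction: the same value is absorbed by anything (like a bottom) and is also the result of merging two different values (like a top), so a slot can go back and forth. This sketch assumes a null type marks the uninitialized value and that the analyzer re-queues an instruction whenever a merge changes a slot; V and trace are hypothetical stand-ins, not portkey code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

final class MergeOscillation {
    // Toy stand-in for UCValue (assumption: a null type means uninitialized).
    static final class V {
        final String type;
        final int size;
        V(String type, int size) { this.type = type; this.size = size; }
        int getSize() { return size; }
        @Override public boolean equals(Object o) {
            return o instanceof V && Objects.equals(type, ((V) o).type) && size == ((V) o).size;
        }
        @Override public int hashCode() { return Objects.hash(type, size); }
        @Override public String toString() { return type == null ? "uninitialized" : type; }
    }

    static final V UNINITIALIZED = new V(null, 1);

    // Same four lines as the quoted merge, transcribed onto the toy type.
    static V merge(V v, V w) {
        if (v.equals(w)) return v;
        if (v.type == null && w.getSize() == 1) return w;
        if (w.type == null && v.getSize() == 1) return v;
        return UNINITIALIZED;
    }

    // Folds incoming values into one slot, recording the slot after each merge.
    static List<String> trace(V start, V... incoming) {
        List<String> seen = new ArrayList<>();
        V slot = start;
        for (V in : incoming) {
            slot = merge(slot, in);
            seen.add(slot.toString());
        }
        return seen;
    }

    public static void main(String[] args) {
        V a = new V("A", 1), b = new V("B", 1);
        // Two different values keep arriving at the same slot, as in a loop.
        System.out.println(trace(UNINITIALIZED, a, b, a, b));
    }
}
```

In this model the slot never settles: every merge reports a change, so the value is not monotonically increasing and the fixpoint argument no longer applies.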
I'm not even sure that in legal bytecode you could have that case (merging different sizes)
reuse of space happens, but between different points in the program
merges occur at the same place in the program
it makes no sense to say "at instruction #42, locals #1 and #2 are one double or two object references"
"one double or one long" (no diff in size) already seems wrong
the only case where it may happen is when an "uninitialized" slot is merged with a double-word one
and this kind of merge should not happen (from my current understanding of the analyzer) with "not initialized" values (cf access arrays)
so hmm, did the problematic behavior of merge surface only after these two commits? https://github.com/cgrand/portkey/commit/217acd4e6eae54ee995363861f362135b1e197b3 5b651877291c165e807a39e29133f1612473f30a
these two commits allowed the analysis to go further and reach the problematic code
yep
hmm, Kotlin also uses Interpreter https://github.com/jetbrains/kotlin/commit/c24e6b56985f86e857523f9bbce27395a3f33945