parinfer

cfleming 2017-07-20T01:34:20.443302Z

@shaunlebron Ok, thanks. My problem is that instead of returning a whole string, I just return a series of edits, since that’s much more efficient in IntelliJ (the edits are nearly always small).

cfleming 2017-07-20T01:34:41.447220Z

I’m finding that the edits are not consistently created in one space or the other (input or output)

cfleming 2017-07-20T01:35:56.462255Z

I think it’s probably a subtlety in the paren trail handling - I also process the whole document, I don’t break it into lines. So if result.x isn’t correctly updated when closing the paren trail out, that wouldn’t affect your processing but probably affects mine.

cfleming 2017-07-20T01:36:49.473086Z

Returning edits also meant that I got the cursor movement for free, since that wasn’t there previously.

cfleming 2017-07-20T01:47:59.610129Z

I think the issue is this: when replaceWithinLine is called, that then shifts the x values that you need to use for changes after the current one, unless the edits are consistently made in input space.

cfleming 2017-07-20T01:48:52.620775Z

That happens in commitChar because ch is set to "". However it doesn’t happen in correctParenTrail, but I think it doesn’t matter for you because you’re modifying line by line.

cfleming 2017-07-20T02:29:59.106971Z

I’m thinking I’m going to try to consistently produce the edits in input space, which is what I had previously and I already have the code for working out the correct deltas to apply the changes. The only complication is that I then need the paren trail x values in input space, hopefully that’s not too tricky.

cfleming 2017-07-20T02:41:53.242740Z

@shaunlebron Actually, do you only use the output coords for knowing where to make your edits? If I want my edits to be in input coords, could I simplify things by only working with input coords?

shaunlebron 2017-07-20T02:45:44.286595Z

i track my edits in input coords

shaunlebron 2017-07-20T02:48:05.311752Z

but I’m not able to follow much of what you said actually

shaunlebron 2017-07-20T02:49:49.330359Z

is the divergence in behavior due to an inconvenience in how parinfer.js is behaving, related to intellij?

shaunlebron 2017-07-20T02:52:16.357301Z

i know there may be particular reasons for creating separate implementation, perhaps mostly to explore ideas that may inform core—having said that, I think it will be difficult to help you keep yours in sync with parinfer.js as changes will be inevitable due to the pace of smartMode changes

cfleming 2017-07-20T03:05:23.497627Z

The edits I’m seeing are definitely not all in input coords. For example, edits are often created from the paren trail x values, and they’re created from result.x

cfleming 2017-07-20T03:05:40.500512Z

Should they be created from result.inputX?

cfleming 2017-07-20T03:07:48.522270Z

So my initial implementation diverged from how parinfer.js worked for a couple of reasons. Mostly the difference was that I process the whole input document rather than splitting it into lines, to reduce GC pressure.

cfleming 2017-07-20T03:08:48.532195Z

Similarly, I return a series of edits to apply rather than returning a whole new document.

cfleming 2017-07-20T03:09:35.540324Z

When I was perf testing, node was actually faster at a lot of this string manipulation than the JVM, presumably they have to optimise it based on usage patterns since JS doesn’t have a mutable StringBuffer-like thing.

cfleming 2017-07-20T03:10:00.544705Z

That made the performance much better, and wasn’t a significant change to the algorithm.

cfleming 2017-07-20T03:10:36.550969Z

However those changes seem to have been complicated by the new input/output coord stuff.

cfleming 2017-07-20T03:11:06.556088Z

It may also just be a simple bug in my port, but it’s hard to tell - there’s quite a lot of code now, and it’s subtle stuff in places.

shaunlebron 2017-07-20T03:16:19.609398Z

ah, I’m sorry I was thinking of changes when you said edits for some reason

cfleming 2017-07-20T03:16:46.614190Z

No, I mean the x values passed to replaceWithinLine

cfleming 2017-07-20T03:17:17.619829Z

But it actually seems like the paren trail x values should be in input coords anyway, since they’re compared to the cursorX values.

shaunlebron 2017-07-20T03:17:59.627211Z

yeah, edits are incremental, so they have to be in output coords given what i’m currently doing

shaunlebron 2017-07-20T03:18:09.628806Z

cursorX is updated when edits are made behind it

cfleming 2017-07-20T03:18:49.635935Z

So if I wanted all my edits to be in input coords, do you think I could get rid of the x/inputX distinction altogether?

cfleming 2017-07-20T03:19:07.639263Z

Is that likely to complicate my life?

cfleming 2017-07-20T03:19:41.645276Z

I accumulate a list of edits, and when I’m applying them maintain a running delta, so I take care of the incremental nature at that time.

shaunlebron 2017-07-20T03:20:43.656224Z

I think that running delta will probably be fine

cfleming 2017-07-20T03:21:21.662960Z

Ok. I’ll try those changes in my port and see if that helps.

shaunlebron 2017-07-20T03:22:12.671558Z

I guess you keep track the line number and column as a derivative value of your “global” index?

cfleming 2017-07-20T03:22:30.674879Z

I’ll see afterwards if I could apply the same changes to the JS version, and send a pull request if so. It’s a fairly minor change to not require splitting into lines, and that will help the GC on JS anyway even if it’s better at optimising it.

shaunlebron 2017-07-20T03:23:15.682714Z

cool! I think line splitting and array joining is super fast on JS, but probably not elsewhere?

cfleming 2017-07-20T03:23:34.685922Z

Right, I maintain an offset which is a char offset into the original doc. Then I count lines as I see newlines, and reset the x at that point.

shaunlebron 2017-07-20T03:24:06.691785Z

I just have no concept of how fast insertion and removal of characters inside a giant string can be

cfleming 2017-07-20T03:24:16.693509Z

Right, but on a big file you’re still (at least) doubling the memory requirements - in a long-running editor that can add up.

cfleming 2017-07-20T03:24:54.699633Z

It would be interesting to see if creating an array of edits was faster in JS too.

shaunlebron 2017-07-20T03:25:06.701826Z

this is a bit of a blind spot for me—isn’t a single character insertion inside a giant string reallocating the whole thing anyway?

cfleming 2017-07-20T03:25:25.705188Z

I suspect that in any editor that maintains non-trivial indices, applying small edits is likely much faster.

shaunlebron 2017-07-20T03:25:45.708423Z

oh, I see, you’re not reallocating, you’re modifying the editor buffer in-place with a list of edits

cfleming 2017-07-20T03:25:46.708630Z

Well, editors don’t generally use a giant string for their documents.

cfleming 2017-07-20T03:25:50.709204Z

Right.

shaunlebron 2017-07-20T03:26:15.713506Z

I like this idea

cfleming 2017-07-20T03:26:49.719223Z

It would be interesting to know if e.g. Atom works natively with a single offset into a document rather than cols/lines - I’m willing to bet that it requires some calculation to work out line/col coords.

cfleming 2017-07-20T03:27:14.723473Z

Although they probably do some trickery to make it efficient.

cfleming 2017-07-20T03:28:30.736208Z

What might be possible would be an API which returns a list of changes, and then one like the current one that calls the other, and then applies the edits to the original string and returns that. Building that new string is probably still much more efficient than line splitting, since you’re only splitting/joining strings where things have actually changed.

cfleming 2017-07-20T03:29:18.744857Z

Ok, I have to go now - I might actually look at getting the JS version working like this first, which would make my port easier, and you can take a look at it.

shaunlebron 2017-07-20T03:29:37.747936Z

that would be great, thanks colin!

cfleming 2017-07-20T03:29:48.749900Z

Then I can start from something working, and ensure the tests keep working.

shaunlebron 2017-07-20T03:30:02.752238Z

good idea

cfleming 2017-07-20T03:30:06.753336Z

Cool, seeya

shaunlebron 2017-07-20T03:30:09.753849Z

bye