Re: [dev] [vis][RFC][PATCH 2/2] Implement the execution of external commands and cmd_substitute

From: Silvan Jegen <s.jegen_AT_gmail.com>
Date: Thu, 14 May 2015 10:05:42 +0200

On Thu, May 14, 2015 at 12:30:54AM +0200, Marc André Tanner wrote:
> On Thu, May 07, 2015 at 10:15:34AM +0200, Silvan Jegen wrote:
> > On Wed, May 6, 2015 at 9:47 PM, Marc André Tanner <mat_AT_brain-dump.org> wrote:
> > * If the text is marked read-only anyway, is it actually useful to
> > have a "responsive" editor at all?
> > * Measuring progress is hard when you do not know the goal. In the sed
> > case you can only guess how much progress has been made by comparing
> > the count of input and output bytes, for example. Even then, since you
> > can do something like 's/a/AAAAAAAAAAAA/g', there is no direct
> > correspondence. That said, there is the 'pv'[0] program that does
> > something similar. I assume it just checks the amount of data sent
> > through the pipe (but depending on the buffering and processing speed
> > of the output end of the pipe, predictions of running time will
> > probably be quite inaccurate).
>
> Maybe progress indication is the wrong word. I wasn't thinking along the
> lines of percentage completed but rather an indication that there is a
> background task running and a way to cancel it.
>
> > * Is it worth adding the libuv dependency in that case?
>
> If I ever find the time to experiment with a client/server architecture
> then a proper mainloop will be needed. Using something like libuv doesn't
> seem like the worst idea ...

What makes the library somewhat suspicious to me is that it was made
for Node.js... :P I don't know enough about the library to give you
a real opinion. The Neovim people use it to make their I/O platform
independent, AFAIR.


> > >> This approach uses an unnecessary amount of memory too since it keeps a lot
> > >> of the input data for the external process
> > >
> > > This should not be necessary, it should be possible to repeatedly call
> > > text_range_write (ignoring for now that it always takes a snapshot) with
> > > a reasonable range which is adjusted each time.
> > >
> > >> as well as its output data in memory at the same time
> > >
> > > Similarly it should be possible to insert the data from the external process
> > > as it becomes available. One idea is to write the new data after the
> > > specified range which is being written out.
> >
> > This is of course true. At the same time, that would mean that the
> > external command and text_range_write have to be called several
> > times, which has some overhead (especially with big ranges, of course)
> > while requiring more elaborate logic. It's a tradeoff.
>
> I'm not sure why you think the external command would be called multiple
> times? Instead of your current approach of first copying the data into
> a contiguous memory block you would simply repeatedly call text_range_write
> in a loop with different ranges. Similarly to how you are currently doing
> it with write(2).

Hm, you are right. We would just have to write the data and then refill
the buffer.


> > > Assuming we want to process the range [10, 100] it would work as follows:
> > >
> > > 1) text_snapshot
> > >
> > > 2) while (data to write or data to read)
> > >        if (data to write)
> > >            write range [10, 20] (on next iteration [20, 30], etc.)
> > >        if (data to read)
> > >            read range and insert into [100, 110] ([110, 120], ...)
> > >
> > > 3) check the command's exit status; depending on success
> > >        - either delete the original range [10, 100] and snapshot
> > >        - or revert to the last snapshot (text_undo)
> >
> > I think that would work, but in case there is already some text after
> > position 100 you would have to either
> >
> > 1. handle the existing text by moving it on each insertion
>
> This is of course done implicitly. However, no data is moved around.
> Upon the first insertion a new piece is created and linked in at the
> respective position. Since the inserts are contiguous, the piece is
> then simply expanded by appending the new data to the existing buffer.
> That is, further inserts basically amount to a memcpy(3) (OK, that is
> a bit simplified, but true in principle).
>
> The main overhead multiple text_inserts have is the mapping of a file
> position, i.e. a byte offset, to the corresponding place in the piece
> chain. Currently this has linear complexity in the number of
> non-consecutive editing operations in the file. Since this mapping is
> used by almost everything, it is a natural candidate for optimization
> (if it ever turns out to be a bottleneck).

I don't expect that the whole operation of creating a new piece and
appending the data will be a big bottleneck.

If you think the piped approach is the way to go, I could try to
adjust the code so that it writes and reads without having to keep
almost the whole input and output in buffers. It could take me a
while to get there, though.