[dev] structural regular expression support for vis

From: Marc André Tanner <mat_AT_brain-dump.org>
Date: Fri, 11 Mar 2016 20:08:38 +0100

On Tue, Mar 01, 2016 at 05:23:11PM +0000, Connor Lane Smith wrote:
> On 1 March 2016 at 17:12, Marc André Tanner <mat_AT_brain-dump.org> wrote:
> > I think structural regexp will integrate nicely with multiple selections.
> >
> > Basically if you omit the command of a structural regexp the editor
> > would switch to visual mode and add a selection for every match. If you
> > are already in visual mode then the existing selections would be used
> > as ranges for an implicit leading loop construct (x/ in sam).
> > That is for an existing selection x/ and y/ could be used to split it.
> > Similarly the conditionals g/ and v/ would be used to keep / discard
> > selections.
>
> I agree (strongly!). My main complaint with sam is its inability to
> reflect the multiple selections implied by its command language. It
> would be fantastic if we could get that sorted, and vis may well be a
> good place to do it.

I finally had some time to experiment with these ideas. The results can
be found in the "sam" branch of the vis git repository:

 https://github.com/martanne/vis/tree/sam

For now sam commands can be entered from the vis prompt via :sam <cmd>

A command behaves differently depending on the mode in which it is issued:

 - in visual mode it behaves as if it were preceded by an implicit
   extract (x) command matching the current selection(s). That is,
   the command is executed once for each selection.

 - in normal mode:

    * if an address for the command was provided it is evaluated starting
      from the current cursor position(s) i.e. dot is set to the current
      cursor position.

    * if no address was supplied to the command then:

       + if multiple cursors exist, the command is executed once for every
         cursor with dot set to the current line of the cursor

       + otherwise if there is only 1 cursor then the command is executed
         with dot set to the whole file
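
The rules above can be summarized in a few lines. This is only a rough
Python sketch of the decision logic, not the actual C code in the sam
branch; initial_dots, line_range and the (start, end) tuples are names
and representations made up for illustration.

```python
def line_range(text, pos):
    """Return (start, end) of the line containing byte offset pos."""
    start = text.rfind("\n", 0, pos) + 1
    end = text.find("\n", pos)
    end = len(text) if end == -1 else end + 1
    return (start, end)


def initial_dots(text, mode, selections, cursors, has_address):
    """Return the list of ranges (dot values) the command is run on.

    mode        -- "visual" or "normal"
    selections  -- list of (start, end) tuples, used in visual mode
    cursors     -- list of cursor offsets, used in normal mode
    has_address -- True if the command came with an explicit address
    """
    if mode == "visual":
        # Implicit x: the command runs once per existing selection.
        return list(selections)
    if has_address:
        # The address is evaluated relative to each cursor, so dot
        # starts out as the (empty) range at the cursor position.
        return [(pos, pos) for pos in cursors]
    if len(cursors) > 1:
        # No address, several cursors: dot is the cursor's line.
        return [line_range(text, pos) for pos in cursors]
    # No address, single cursor: dot is the whole file.
    return [(0, len(text))]
```

For instance, with two cursors and no address the command would run once
per cursor line, while with a single cursor it would cover the whole file.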

The command syntax was slightly tweaked to accept more terse commands.

  - When specifying text or regular expressions the trailing delimiter can
    be elided if the meaning is unambiguous.

  - If only an address is provided the print command will be executed.

  - The print command creates a selection matching its range.

  - In text entry \t inserts a literal tab character (sam only recognizes \n).

Hence the sam command ,x/pattern/ can be abbreviated to x/pattern
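
The trailing delimiter and \t rules could be handled by a small scanner
along the following lines. Again a Python sketch rather than the real
parser; parse_text and its return convention are invented here, and
passing unknown escapes through untouched is an assumption on my part.

```python
def parse_text(cmd, start):
    """Parse delimited text beginning at cmd[start] (the delimiter).

    Returns (text, index just past the parsed part). The closing
    delimiter is optional: reaching the end of input also ends the text.
    """
    delim = cmd[start]
    out = []
    i = start + 1
    while i < len(cmd):
        c = cmd[i]
        if c == "\\" and i + 1 < len(cmd):
            nxt = cmd[i + 1]
            if nxt == "n":
                out.append("\n")        # recognized by sam as well
            elif nxt == "t":
                out.append("\t")        # the addition described above
            elif nxt == delim:
                out.append(delim)       # escaped delimiter
            else:
                out.append("\\" + nxt)  # assumption: keep other escapes
            i += 2
            continue
        if c == delim:
            return "".join(out), i + 1  # explicit closing delimiter
        out.append(c)
        i += 1
    return "".join(out), i              # trailing delimiter elided
```

With this, parsing "x/pattern" and "x/pattern/" from index 1 both yield
the text "pattern"; only the returned end index differs.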

If a command is successful vis switches to normal mode (and hence removes
any selections), otherwise the editor is kept in visual mode. The print
command "fails" by definition.

Some limitations of the command language as currently implemented include:

 - The following commands are deliberately not implemented:
  
    * move (m)
    * copy (t)
    * print line address (=)
    * print character address (=#)
    * set current file mark (k)
    * quit (q)
    * undo (u)

   I don't think they make sense / are important for interactive usage.

 - Multi file support is very basic. While the X and Y commands are
   in principle supported, they are essentially untested at this stage
   and probably even more buggy than the rest.

   Also a file might be displayed in multiple windows (possibly with
   different selections, although currently vis does not allow switching
   of windows in visual mode i.e. when a selection is active).

   Not yet sure what the right thing to do here is. What is sam doing
   in this case?

    * the "regexp" construct to evaluate an address in a file matching
      regexp is currently not supported.

    * the following commands related to multiple file editing are not
      supported: b, B, n, D, f.

 - The I/O related commands e, r, w, <, >, |, ! and cd are not yet implemented.

   These should not be hard to add later on. The equivalent vi commands
   :r !, :w !, :! are supported, hence the necessary infrastructure to
   implement them is already there.

 - Error handling is very bare-bones. For example, if you have multiple
   regular expressions in your command and one of them has a syntax error,
   there is no indication as to which one causes the problem.

   However this is not that bad since you can build your commands
   incrementally. As an example, if in doubt use:

     :sam x g/pattern
     :sam x/sub-pattern

   instead of

     :sam x g/pattern/x/sub-pattern

 - The special grouping semantics where all commands of a group operate
   on the same state are not implemented. In general grouping has not
   yet been really tested. Your help in doing so is appreciated.

 - The substitute (s) command is currently not implemented. In general
   this command with its additional g flag (for global substitution)
   feels quite alien to the rest of the command language. I wonder whether
   it is necessary?

   However the ability to refer to (sub) expression matches with \&
   and \d where d is a digit will probably be missed.

 - The file mark address ' (and corresponding k command) is not supported.

 - There will likely be bugs, memory leaks, crashes, infinite loops etc.
   YOU are supposed to fix them and submit patches ;)
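
To illustrate why the incremental style suggested in the error handling
item above gives the same result as the one-shot form, here is a rough
Python model of x, y, g and v operating on a list of ranges. It is not
how vis implements them (vis is C and uses regex(3)); the function names
and tuple representation are made up, empty matches are simply skipped,
and the default pattern for x is the line pattern sam assumes when no
regexp is given.

```python
import re


def x(text, ranges, pattern=r".*\n"):
    """Extract: one range per match of pattern inside each input range."""
    rx = re.compile(pattern)
    out = []
    for start, end in ranges:
        for m in rx.finditer(text, start, end):
            if m.end() > m.start():        # skip empty matches
                out.append(m.span())
    return out


def y(text, ranges, pattern):
    """Complement of x: the pieces between matches inside each range."""
    rx = re.compile(pattern)
    out = []
    for start, end in ranges:
        pos = start
        for m in rx.finditer(text, start, end):
            if m.end() == m.start():
                continue
            if m.start() > pos:
                out.append((pos, m.start()))
            pos = m.end()
        if pos < end:
            out.append((pos, end))
    return out


def g(text, ranges, pattern):
    """Keep only the ranges that contain a match."""
    rx = re.compile(pattern)
    return [(s, e) for s, e in ranges if rx.search(text, s, e)]


def v(text, ranges, pattern):
    """Keep only the ranges that do not contain a match."""
    rx = re.compile(pattern)
    return [(s, e) for s, e in ranges if not rx.search(text, s, e)]


text = "foo bar\nbaz\nfoo qux\n"
whole = [(0, len(text))]
step1 = g(text, x(text, whole), "foo")     # like  :sam x g/foo
step2 = x(text, step1, "[a-z]+")           # like  :sam x/[a-z]+
```

Since every operator just maps a list of ranges to a new list of ranges,
running the two steps separately (with the selections left by the first
step feeding the second) is equivalent to chaining them in one command.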

Some general things I noticed while doing some initial testing. These are
mostly vis limitations not directly related to sam's command language:

 - All regex matching is performed using the regex(3) interface, hence
   performance is suboptimal, especially for backward searches or within
   large files when there are many small matches. There are lots of needless
   memory copies involved.

 - The vis display code (view.c) is not really optimized, in particular
   it redraws its internal cell matrix way too often, for example
   every time a cursor/selection is created/destroyed, which can
   happen frequently with the looping commands. In general it
   remains to be seen how the current code scales with hundreds of
   cursors/selections. These non-consecutive editing operations will
   cause fragmentation in the data structure used for text management.
   That is, the number of pieces maintained in the doubly linked list
   will significantly increase.

 - Vis assumes that at least one cursor (the primary one) is always
   in the region currently displayed. Hence one cannot simply move
   around the file without affecting the selection.

   I added key bindings Ctrl-u and Ctrl-d to change the primary
   cursor to the "previous" / "next" cursor. This can be used to "scroll"
   around the file without affecting the selection.

   However, because vis currently does not keep track of the ordering of
   its cursors, these commands will visit the cursors in the order they
   were created which does not necessarily correspond to their current
   location in the file.

   If a selection covers more than a screen full then vi's o command
   which moves the cursor to the other end of the selection is useful.

 - Some indication of whether other selections exist, and ideally also
   where they are, would be useful. In a graphical editor one would
   probably display an indication on the scroll bar (as Chrome does,
   for example, when displaying search results).

   Maybe something like n/m meaning the primary cursor is currently in
   selection n of m could be added to the status bar?

 - A mechanism to restore past selections might be useful. For example
   when an x/ command resulted in an unexpected outcome, one could "undo"
   the selections, tweak the command and run it again.

 - Integration into the vi command line should be improved, having to
   prefix commands with :sam is slightly annoying.

 - It might be interesting to experiment with ways to extend the command
   language to include:

    * Text objects, maybe just as an abbreviation for common regular
      expressions?

    * Information from the lexers, it would then be possible to
      select/match based on the token type. Thus for example allowing
      commands iterating over all comments/strings etc.
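
Regarding the cursor ordering problem and the n/m status idea mentioned
above: if selections were visited in file order instead of creation
order, both could be derived from the same sorted list. A minimal Python
sketch of that idea follows; the function names and the (start, end)
tuples are hypothetical and do not correspond to anything in the vis
source.

```python
def next_primary(selections, primary):
    """Selection following primary in file order, wrapping around."""
    order = sorted(selections)
    i = order.index(primary)
    return order[(i + 1) % len(order)]


def prev_primary(selections, primary):
    """Selection preceding primary in file order, wrapping around."""
    order = sorted(selections)
    i = order.index(primary)
    return order[(i - 1) % len(order)]


def status(selections, primary):
    """Return "n/m": primary is the n-th of m selections in file order."""
    order = sorted(selections)
    return "%d/%d" % (order.index(primary) + 1, len(order))


# Selections listed in creation order, which differs from file order.
sels = [(40, 45), (3, 8), (20, 22)]
```

Sorting on every call is of course wasteful with hundreds of selections;
keeping the list ordered on insertion would be the obvious refinement.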

To conclude, the basics seem to work. Combining sam's structural regular
expression based command language with a modal editor supporting multiple
cursors/selections for immediate visual feedback results in a powerful
tool.

I hope the current state is good enough to encourage people to play with
it, report (or preferably fix) bugs, implement the missing features,
improve vi integration etc.

Marc

-- 
 Marc André Tanner >< http://www.brain-dump.org/ >< GPG key: 10C93617
Received on Fri Mar 11 2016 - 20:08:38 CET
