Subject: Re: [hackers] [dmenu][RFC][PATCH 0/4] Using sort and simple C program to get dmenu history functionality

From: Silvan Jegen <s.jegen_AT_gmail.com>
Date: Tue, 1 Dec 2015 20:04:04 +0100

Heyho!

On Tue, Dec 01, 2015 at 05:51:59AM -0800, Xarchus wrote:
> On Mon, Nov 30, 2015 at 03:28:42PM +0100, Silvan Jegen wrote:
> > Heyho!
> >
> > On Sat, Nov 28, 2015 at 11:25 PM, Hiltjo Posthuma
> > <hiltjo_AT_codemadness.org> wrote:
> > >
> > > This can be implemented in a few lines of shell (wc, sort) and maybe awk.
> >
> > I *have* implemented the history part with sort. If you think the
> > history updating functionality that I ended up writing in C can be
> > (easily?) implemented in some shell script and/or awk then I would
> > like to see it on the list :)
> >
> > I thought it should be possible to implement in awk but because you
> > have to both read input from stdin (the command) and from a file (the
> > history file) I couldn't figure it out in the admittedly short time I
> > kept trying to do it.
> >
> > I tried abusing the -v option of gawk to set the command name as a
> > variable value but that only seems to work in a BEGIN block which is
> > executed before the input file is read. It may be possible to use the
> > shell to read the command and then initialize a variable in the awk
> > code given on the command line but I think the argument-escaping hell
> > would be very annoying to deal with.
> >
>
> As both Silvan and Hiltjo mentioned, here is an awk script that does the
> job of updhist. (more than that, it fixes a problem: the updhist.c as sent
> does not work with multisel, it will corrupt the history file if multiple
> selections are generated from a single dmenu invocation; but more about
> multiselect later)

I was aware that updhist.c does not handle multiple lines of input. I
did not know that dmenu had such a feature, though, so I ignored it in
the first version...

> In order to make the script more portable, I tried to keep the awk features
> limited to the nawk set and not using any of the gawk stuff. With gawk,
> things can get even simpler: for example gawk has sort, so the external sort
> would not be needed in the END block.
> (BTW, which is the target as the *official* awk?)

I prefer using the external sort.

I don't know if there is an "official" suckless awk. Do you know if your
script works with the BSD awk implementation?

> The script assumes the format of the history file to be 'filename tab
> count', the same as the C program. The problem with this is what happens
> if a file name contains tabs ... Crazy, but not impossible. A more reliable
> approach would be to put the count first then the remaining line is all a
> file name.

While I would be willing to take that risk, doing it your way should
simplify the sort command slightly as well and is a good idea.
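
A minimal sketch of why the count-first layout simplifies things, assuming
a history file in "count<TAB>name" format. The temporary directory and the
sample entries are made up for illustration. With the count in front, sort
needs no -t or -k options, and cut keeps everything after the first tab, so
a name containing a tab survives intact:

```shell
#!/bin/sh
# Hypothetical sample history in "count<TAB>name" format.
tmp=$(mktemp -d)
hist="$tmp/history"
printf '3\tfirefox\n10\tst\n1\todd\tname\n' > "$hist"

# Plain numeric reverse sort on the leading count; no -t or -k required.
# cut -f2- drops the count and keeps the rest of the line, embedded tabs included.
menu=$(sort -rn "$hist" | cut -f2-)
printf '%s\n' "$menu"

rm -rf "$tmp"
```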

> Anyhow, here is the script (tested) :
>
> ---------------------%<--------------------- updhist.awk
> #!/usr/bin/awk -f
>
> BEGIN {
> if(histfile=="") {
> print "updhist.awk: no history file specified" > "/dev/stderr"
> print "usage: awk -v histfile=<path-to-history-file> -f updhist.awk" > "/dev/stderr"
> exit(1)
> }
> FS=OFS="\t" # explicit tabs for input to allow file names with spaces
> while ( (getline < histfile) > 0 )
> history[$1]=$2 # assumption: file name does not contain tabs
> close(histfile)
> }
>
> {
> history[$0]++
> print
> }
>
> END {
> for (f in history)
> print f,history[f] | "sort -t '\t' -k2rn >" histfile
> }
> ---------------------%<---------------------
>
> To use the above, replace in dmenu_run the call to updhist with 'awk -v
> histfile=$historyfile -f updhist.awk' (assuming updhist.awk is placed
> somewhere in AWKPATH, or /usr/share/awk, otherwise use the full path to the
> awk script).
>
> Because the script will keep the history file already sorted, in
> 'dmenu_path' there is no need to sort the history when fed to dmenu (so
> leave out the 'sort -r -n -t ' ' -k 2').
>
> Silvan talked about the fact that the commands in the history will show up
> twice: once from the history, once in the normal list. This too can be
> taken care of with a tiny awk one-liner to filter duplicates; that will
> replace the 'cat' at the end of the two lines in dmenu_path with this:
>
> awk '!x[$0]++' - "$cache"
>
> Going even further, the cut operation can be factored into awk (I assume it
> does cut on the tab that separates the name from the counter), so the whole
> line now becomes (this relies on a history format with the name first, tab,
> followed by the count):
>
> awk -F$'\t' '!x[$1]++' "$HISTORY" "$cache"

This does not work for me using gawk. I get both the cache and the
history results (deduplicated), but for some reason the history entries
are not split on the tab, so the count is still attached.

Using external cut works fine for me though.
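
One possible explanation, offered as a guess rather than a confirmed
diagnosis: an awk pattern with no action prints the whole record ($0), so
the count column comes along no matter how FS is set. In addition, $'\t' is
bash/ksh quoting that a plain /bin/sh may not understand. A minimal sketch,
with made-up sample files standing in for "$HISTORY" and "$cache" and the
name-first history format assumed:

```shell
#!/bin/sh
# Hypothetical sample files standing in for "$HISTORY" and "$cache".
tmp=$(mktemp -d)
printf 'firefox\t3\nst\t10\n' > "$tmp/history"
printf 'firefox\nls\nst\n' > "$tmp/cache"

# -F'\t'       : awk itself interprets the \t escape, so no $'...' is needed.
# { print $1 } : print only the name; without an action awk prints all of $0.
result=$(awk -F'\t' '!x[$1]++ { print $1 }' "$tmp/history" "$tmp/cache")
printf '%s\n' "$result"

rm -rf "$tmp"
```

History entries come out first, in file order, followed by the cache
entries that have not been seen yet.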

> P.S. Now back to multiselect:
>
> This updhist awk script replacement will work with multiselect (multiple
> inputs will simply have their count incremented or be added as new). This
> includes the case when dmenu outputs duplicate strings (with multiselect,
> the same entry can be generated multiple times).
>
> But speaking of multiselect, the dmenu_run as it is now does not handle
> multiple selections very gracefully ... Multiple programs selected in dmenu
> will all be started, but sequentially, with the next waiting for the
> previous to exit (an artifact of them all being fed to one shell instance).
> I am not sure if that is the intention: my preference would be to start
> each program as soon as it's selected with Ctrl-Return.

Personally I don't use multiselect so I will let others deal with that
use case...

> For completeness, a full diff with all these changes attached.
>

The patch did not apply for me, saying that updhist.awk does not exist.
I think something went wrong with the patch creation because the diff
mentions an existing (empty?) updhist.awk file. See below.

> diff --git a/dmenu_path b/dmenu_path
> old mode 100644
> new mode 100755
> index 338bac4..f16f562
> --- a/dmenu_path
> +++ b/dmenu_path
> @@ -1,4 +1,11 @@
> #!/bin/sh
> +if [ -z "$1" ]; then
> + echo "Need a history file as first argument."
> + exit
> +else
> + HISTORY=$1
> +fi
> +
> cachedir=${XDG_CACHE_HOME:-"$HOME/.cache"}
> if [ -d "$cachedir" ]; then
> cache=$cachedir/dmenu_run
> @@ -7,7 +14,8 @@ else
> fi
> IFS=:
> if stest -dqr -n "$cache" $PATH; then
> - stest -flx $PATH | sort -u | tee "$cache"
> + stest -flx $PATH | sort -u > "$cache"
> + awk -F$'\t' '!x[$1]++' "$HISTORY" "$cache"
> else
> - cat "$cache"
> + awk -F$'\t' '!x[$1]++' "$HISTORY" "$cache"
> fi
> diff --git a/dmenu_run b/dmenu_run
> index 834ede5..051a2c4 100755
> --- a/dmenu_run
> +++ b/dmenu_run
> @@ -1,2 +1,5 @@
> #!/bin/sh
> -dmenu_path | dmenu "$@" | ${SHELL:-"/bin/sh"} &
> +
> +historyfile=~/.cache/dmenu/dmenuhistory
> +
> +dmenu_path $historyfile | dmenu "$@" | awk -v histfile=$historyfile -f updhist.awk | ${SHELL:-"/bin/sh"} &
> diff --git a/updhist.awk b/updhist.awk
> index e69de29..76820bf 100755
> --- a/updhist.awk
> +++ b/updhist.awk

This seems odd because it implies that an updhist.awk already exists. I
assume that's why the patch did not apply for me.

> @@ -0,0 +1,23 @@
> +#!/usr/bin/awk -f
> +
> +BEGIN {
> + if(histfile=="") {
> + print "updhist.awk: no history file specified" > "/dev/stderr"
> + print "usage: awk -v histfile=<path-to-history-file> -f updhist.awk" > "/dev/stderr"
> + exit(1)
> + }
> + FS=OFS="\t" # explicit tabs on input to allow file names with spaces
> + while ( (getline < histfile) > 0 )
> + history[$1]=$2 # assumption: file name does not contain tabs
> + close(histfile)
> +}
> +
> +{
> + history[$0]++
> + print
> +}
> +
> +END {
> + for (f in history)
> + print f,history[f] | "sort -t '\t' -k2rn >" histfile
> +}

I like your solution using awk better and would vote for putting a
version of your patch on suckless.org after adjusting it to deal with
the history file in "$count\t$cmdname" format. Do you agree?
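
To make the proposed adjustment concrete, here is a rough sketch of the
same update logic for a "$count\t$cmdname" history file. It is an
illustration only, not the patch from this thread: the wrapper name
updhist_countfirst is hypothetical, and the history path is assumed to
need no shell quoting.

```shell
#!/bin/sh
# Sketch: updhist logic for a "count<TAB>name" history file.
# updhist_countfirst is a hypothetical helper name, not part of dmenu.
updhist_countfirst() {
    if [ -z "$1" ]; then
        echo "usage: updhist_countfirst <path-to-history-file>" >&2
        return 1
    fi
    awk -v histfile="$1" '
    BEGIN {
        # Load the existing history; everything after the first tab is the name.
        while ((getline line < histfile) > 0) {
            tab = index(line, "\t")
            if (tab > 0)
                history[substr(line, tab + 1)] = substr(line, 1, tab - 1) + 0
        }
        close(histfile)
    }
    {
        history[$0]++   # count every selected line (multiselect yields several)
        print           # pass the selection through to the next command
    }
    END {
        # Assumption: the history path contains no characters the shell would split on.
        cmd = "sort -rn > " histfile
        for (name in history)
            printf "%d\t%s\n", history[name], name | cmd
        close(cmd)
    }'
}
```

In dmenu_run this would sit between dmenu and the shell, for example
`dmenu "$@" | updhist_countfirst "$historyfile" | ${SHELL:-"/bin/sh"} &`.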

Cheers and thanks!

Silvan
Received on Tue Dec 01 2015 - 20:04:04 CET
