Re: [dev] ssam rocks! unwrapping paragraphs
On Tue, Mar 22, 2022, at 9:49 PM, 201009-suckless_AT_planhack.com wrote:
> sed is the canonical paragraph mangler. It's worth spending a bit to
> grok how that is true.
>
> tr -d '\r' | sed '/^$/!{H;d;};p;x;s/\n/ /g;'
>
> Gutenberg lines are CRLF-terminated so `tr` is needed.
Right, I forgot to mention that I had to run
tr -d '\r'
first. Thanks for pointing that out.
Close, but no cigar. That sed command is incorrect: it introduces extra blank lines, so its output comes out 115 lines longer than ssam's. ssam reigns supreme!
tr -d '\r' < 2488-0.txt | ssam -e 'x/\n+/ v/\n\n+/ c/ /' | wc -l
7667
tr -d '\r' < 2488-0.txt | sed '/^$/!{H;d;};p;x;s/\n/ /g;' | wc -l
7782
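To see where the extra lines come from without downloading the book, here is a minimal sketch that runs the same sed command on a tiny made-up sample. The sample text and the helper names `sample` and `unwrap_sed` are mine, purely for illustration; only the sed script itself is taken from the quoted message.

```shell
#!/bin/sh
# Hypothetical sample: two paragraphs separated by a run of two blank
# lines, plus one trailing blank line (7 input lines, 3 of them blank).
sample() {
    printf 'a\nb\n\n\nc\nd\n\n'
}

# The sed command from the quoted message, unchanged.
unwrap_sed() {
    sed '/^$/!{H;d;};p;x;s/\n/ /g;'
}

# tr strips the padding that some wc implementations add.
in_lines=$(sample | wc -l | tr -d ' ')
out_lines=$(sample | unwrap_sed | wc -l | tr -d ' ')
blank_out=$(sample | unwrap_sed | grep -c '^$')

echo "input lines:        $in_lines"
echo "sed output lines:   $out_lines"
echo "blank output lines: $blank_out"
```

A faithful unwrap of this sample would be 5 lines (2 paragraphs plus the 3 original blank lines). The sed command prints 6 lines, 4 of them blank: every blank input line fires both the explicit `p` and the end-of-cycle autoprint, so a run of k blank lines turns into 2k output lines. As far as I can tell, that accounts for the surplus wherever the text has more than one blank line in a row. Two side effects also show up: each paragraph is printed after its blank line with a leading space, and a final paragraph that is not followed by a blank line stays in the hold space and is never printed.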
Received on Wed Mar 23 2022 - 03:04:51 CET