Re: [dev] reading an epub book with less: adventures in text processing

From: Κρακ Άουτ <krackout_AT_gmx.com>
Date: Mon, 11 Mar 2024 20:12:40 +0200

On 2024-03-11 17:44 Greg Reagle <list_AT_speedpost.net> wrote:

> Now my next question is, what is the tool that does the *best* job of
> turning a PDF book into a readable text document? Via html or
> docbook or markdown or whatever--doesn't matter. My previous
> experience trying things out to achieve this goal is that it's just
> not worth it. The output always winds up un-readable.

I use pdftotext from poppler-utils. It does quite a good job.

This is my main pdf reader command:
```
pdftotext -layout -nopgbrk ${1@Q} - | less -MS --use-color
```
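For reuse, the one-liner could be wrapped in a small shell function along these lines. This is only a sketch: the name `pdfless` and the argument checks are my own additions for illustration, and a plain double-quoted `"$1"` stands in for the parameter expansion used in the one-liner above.

```shell
# Hypothetical wrapper; the name "pdfless" is invented for this example.
pdfless() {
    # Expect exactly one argument: the PDF to read.
    if [ "$#" -ne 1 ]; then
        echo "usage: pdfless FILE.pdf" >&2
        return 2
    fi
    # Fail early if the file is missing or unreadable.
    if [ ! -r "$1" ]; then
        echo "pdfless: cannot read '$1'" >&2
        return 1
    fi
    # -layout   keeps the physical layout of the page
    # -nopgbrk  omits the form feed characters between pages
    # "-"       writes the extracted text to stdout
    # less -M   uses the long prompt, -S chops long lines instead of wrapping
    pdftotext -layout -nopgbrk "$1" - | less -MS --use-color
}
```

After sourcing this from a shell startup file, `pdfless book.pdf` opens the extracted text in the pager. Note that `--use-color` needs a reasonably recent less.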
Received on Mon Mar 11 2024 - 19:12:40 CET