Re: [dev] sple - A simple PDF links/emails extractor.

From: Ivan Tham <ivanthamjunhoe_AT_gmail.com>
Date: Thu, 7 May 2015 23:13:57 +0800

On Wed, May 06, 2015 at 11:19:04PM -0400, Jason Woofenden wrote:
> Hi Hypsurus,
>
> I hope you're having fun coding. Don't let me detract from that.
> But if you just need to extract links from pdfs, you can do so with
> existing tools, eg:
>
> pdftohtml -stdout foo.pdf | sed -ne 's/\(^\|\n\)\n\([^\n]*\)\n[^\n]*/\1\2/gp; t; s/href="\([^"]\+\)"/\n\n\1\n/g; D'
>
> Sorry if that sed thing is more complex than it needs to be. I'm
> just learning the other sed commands besides s///.
>
> The extra complexity with the "\n"s is to handle multiple links on
> the same line.

Hi, is there a way to do the equivalent of the following using sed *only*?

    # Replace the variable's line if the variable is found; otherwise
    # append it to the end of the file.
    # A line $1=xyz becomes $1=$2; $3 is the filename.
    grep -q "^$1=" "$3" && sed -i "/^$1=/c $1=$2" "$3" || echo "$1=$2" >> "$3"

If you do find a way, could you explain it? Thanks.
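
For reference, here is a minimal sketch of the grep/sed/echo version above as
a function, run on a scratch file. The name set_var and the sample keys are
made up for illustration, and it assumes GNU sed (for -i and the one-line form
of the c command). It is not the sed-only answer, just the behaviour a
sed-only version would have to reproduce.

```shell
#!/bin/sh
# set_var KEY VALUE FILE  (hypothetical helper name, for illustration only)
# Replace the line starting with KEY= if one exists, else append KEY=VALUE.
set_var() {
    if grep -q "^$1=" "$3"; then
        # Assumes GNU sed: -i edits in place, 'c' replaces the matching line.
        sed -i "/^$1=/c $1=$2" "$3"
    else
        echo "$1=$2" >> "$3"
    fi
}

# Try it on a scratch file with made-up keys.
conf=$(mktemp)
printf 'foo=1\nbar=2\n' > "$conf"
set_var foo 42 "$conf"   # foo is present: its line is replaced
set_var baz 3 "$conf"    # baz is absent: a new line is appended
cat "$conf"
```

The if/else is used instead of "a && b || c" because the latter also runs the
echo when grep matches but sed itself fails.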

-- 
 _____________________________________
< Do what you like, like what you do. >
 -------------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

Received on Thu May 07 2015 - 17:13:57 CEST
