Re: [hackers] A better mailing list web archiver for ... ?

From: NRK <>
Date: Thu, 11 Aug 2022 15:10:50 +0600

On Wed, Aug 10, 2022 at 09:29:43PM +0200, Thomas Oltmann wrote:
> I think we can all agree that the current web archive over at
> isn't all that great;
> Author names get mangled, the navigation is terrible, some messages
> are duplicated, some missing.

I've noticed the missing mails too.

> Is there currently any interest in such a project here?

If it'd be an improvement over the current system, then I don't see why
not.

> So far, I've gone ahead and implemented a sort of proof-of-concept (at

Hmm, interesting source code. A couple observations:

0. `.POSIX` needs to be the first non-comment line in the Makefile.
1. L277: pointer arithmetic is only defined as long as the result
   points within the array or just 1 past its end.
2. L36: `mail` should be declared `static` as it's not used outside of
   the TU.
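
To illustrate point 1, here's a minimal sketch. The helper name and
signature are made up for the example, they're not from the code under
review. Comparing against a one-past-the-end pointer is fine, but
forming `p + n` before checking that `n` bytes remain can already be
undefined. Comparing the remaining length sidesteps that:

```c
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 if `n` more bytes can be consumed
 * from the buffer [p, end), 0 otherwise. `end` is the one-past-the-end
 * pointer, which is valid to form and compare but not to dereference.
 * Assumes p <= end and that both point into the same array.
 */
static int
has_room(const char *p, const char *end, size_t n)
{
	/*
	 * A check written as `p + n <= end` computes `p + n` first, which
	 * is undefined if it lands more than one past the end of the
	 * array. Subtracting two pointers into the same array is fine.
	 */
	return (size_t)(end - p) >= n;
}
```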

Usage of memcpy for string copying is good to see. I think more C
programmers should start thinking of strings as buffers and tracking
their length as necessary, which can both improve efficiency and reduce
the chances of buffer mishandling.
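
As a rough sketch of what I mean by that (the `strbuf` struct and
`strbuf_append()` are hypothetical names for illustration, not
something from the project):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical length-tracked string buffer. */
struct strbuf {
	char   *data; /* storage provided by the caller */
	size_t  len;  /* bytes used, not counting the terminating NUL */
	size_t  cap;  /* total size of `data`, with len < cap */
};

/*
 * Append `n` bytes from `s`. Returns 0 on success, or -1 if the
 * result plus its terminating NUL would not fit. Since the length is
 * tracked, there is no need to rescan the buffer with strlen().
 */
static int
strbuf_append(struct strbuf *b, const char *s, size_t n)
{
	if (n >= b->cap - b->len)
		return -1;
	memcpy(b->data + b->len, s, n);
	b->len += n;
	b->data[b->len] = '\0';
	return 0;
}
```

The bounds check happens once, up front, and the copy itself never has
to look for a terminator.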

But in the case of `encode_html()`, stpcpy is probably the proper
function to use.
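
For reference, this is roughly the shape I have in mind. It's a
stand-in written for this mail, not the actual `encode_html()` and not
the attached patch; the name `escape_html()` and the set of escaped
characters are assumptions. stpcpy returns a pointer to the
terminating NUL it wrote, so the write position advances without a
separate length computation:

```c
#define _POSIX_C_SOURCE 200809L /* for stpcpy() */
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for encode_html(): writes the escaped form of
 * `src` into `dst` and returns a pointer to the terminating NUL.
 * The caller must provide at least 6 * strlen(src) + 1 bytes, since
 * the longest replacement ("&quot;") is 6 bytes.
 */
static char *
escape_html(char *dst, const char *src)
{
	for (; *src != '\0'; src++) {
		switch (*src) {
		case '<': dst = stpcpy(dst, "&lt;"); break;
		case '>': dst = stpcpy(dst, "&gt;"); break;
		case '&': dst = stpcpy(dst, "&amp;"); break;
		case '"': dst = stpcpy(dst, "&quot;"); break;
		default:  *dst++ = *src; break;
		}
	}
	*dst = '\0';
	return dst;
}
```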

Anyways, I've attached patches for all the above. The stpcpy change is
opinionated, so feel free to reject that.

And one more thing:

        /* TODO we should probably handle EINTR and partial reads */

Best thing to do here is not to use `read()` to begin with. Instead,
use `mmap()` to map the file into a private buffer. Otherwise, I think
using `fread` is also an (inferior) option; I don't think you need to
worry about EINTR with fread.
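
A minimal sketch of the mmap approach, assuming a regular, non-empty
file. `map_file()` is a made-up helper name, and the error handling is
deliberately coarse:

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: map the whole file at `path` read-only with
 * MAP_PRIVATE. On success returns the mapping and stores its size in
 * *len; release it with munmap(). Returns NULL on failure, and also
 * for empty files, because mmap() rejects a zero length.
 */
static void *
map_file(const char *path, size_t *len)
{
	struct stat st;
	void *p;
	int fd;

	if ((fd = open(path, O_RDONLY)) < 0)
		return NULL;
	if (fstat(fd, &st) < 0 || st.st_size <= 0) {
		close(fd);
		return NULL;
	}
	p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
	close(fd); /* the mapping stays valid after the fd is closed */
	if (p == MAP_FAILED)
		return NULL;
	*len = (size_t)st.st_size;
	return p;
}
```

If the parser needs to modify the buffer in place, adding PROT_WRITE
should work with MAP_PRIVATE, since the changes are then kept private
to the process and not written back to the file.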


Received on Thu Aug 11 2022 - 11:10:50 CEST
