Re: [hackers] [quark][PATCH] Add support for precomputed compression

From: Laslo Hunhold <dev_AT_frign.de>
Date: Tue, 10 Mar 2020 09:54:04 +0100

On Tue, 10 Mar 2020 00:00:20 +0200
Guy Sviry <sviryguy_AT_gmail.com> wrote:

Dear Guy,

> I agree that an online approach to compression is ideal, it's just I
> couldn't find a simple gzip implementation to power that. Given that
> we prefer not to malloc in runtime, things are even trickier.

I understand. Yeah, it definitely is not trivial. I must admit that I
was (wrongly) under the impression that gzip was LZ77, but it is in
fact DEFLATE (LZ77 + Huffman coding), making the matter much more
complicated.
PuTTY has a standalone single-C-file gzip implementation[0]
(compression and decompression), which lets me estimate that a
standalone compressor based on a static Huffman tree would be around
300-400 LOC. This
definitely sounds like a cool exercise, so if anybody is up to the task
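
To give an idea of why a static tree keeps such a compressor small:
RFC 1951 (section 3.2.6) fixes the literal/length codes, so nothing has
to be built or allocated at runtime. A minimal sketch of just that
table; the function names are made up for illustration and come from
neither PuTTY nor quark, and the LZ77 matching and bit output are left
out entirely:

```c
#include <stdint.h>

/*
 * Fixed Huffman code for a literal/length symbol (0..287) as listed in
 * RFC 1951, section 3.2.6. Stores the code length in *len and returns
 * the code, most significant bit first. Out-of-range symbols yield
 * *len = 0.
 */
static uint16_t
fixed_litlen_code(unsigned sym, unsigned *len)
{
	if (sym <= 143) { *len = 8; return 0x30 + sym; }
	if (sym <= 255) { *len = 9; return 0x190 + (sym - 144); }
	if (sym <= 279) { *len = 7; return sym - 256; }
	if (sym <= 287) { *len = 8; return 0xC0 + (sym - 280); }
	*len = 0;
	return 0;
}

/*
 * DEFLATE fills bytes from the least significant bit, but Huffman codes
 * go out most significant bit first, so a bit writer that appends
 * LSB-first needs the code reversed.
 */
static uint16_t
reverse_bits(uint16_t code, unsigned len)
{
	uint16_t r = 0;

	while (len--) {
		r = (r << 1) | (code & 1);
		code >>= 1;
	}
	return r;
}
```

Symbol 256 is the end-of-block marker and gets the 7-bit code 0000000.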

> We could write it ourselves, but even if the implementation is
> trivial, it won't be minimalist. I think resorting to zlib is ok in
> this case. Even then, it really might just not be worth the effort
> (the coding effort, that is).

It is a question of what one considers minimalist. Currently, quark
does not have any external dependencies, and I don't see a reason to
pull in the relatively hefty zlib.

> Another, bit crazy approach: Spawning a `gzip -n -` subprocess, using
> its stdin for the http handlers, and setting the original fd as gzip's
> stdout.

Given that the fork accounts for most of quark's overhead, adding
another fork-exec would effectively halve its speed. Of course this is
not the main motivation, but you see what I'm getting at.
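
For reference, the subprocess idea would look roughly like the sketch
below. spawn_filter() is a made-up helper, not quark code: it forks once
more per request and execs the filter with its stdout pointed at the
given descriptor, which is exactly the extra cost mentioned above. Error
handling is kept to a minimum.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run argv[0] with its stdout connected to out_fd.
 * Returns a descriptor the caller writes the uncompressed data to, or
 * -1 on error. The child's pid is stored in *pid; the caller has to
 * close the returned descriptor and waitpid() for the child.
 */
static int
spawn_filter(char *const argv[], int out_fd, pid_t *pid)
{
	int in[2];

	if (pipe(in) < 0)
		return -1;

	switch ((*pid = fork())) {
	case -1:
		close(in[0]);
		close(in[1]);
		return -1;
	case 0:
		/* child: stdin <- pipe, stdout -> out_fd */
		if (dup2(in[0], STDIN_FILENO) < 0 ||
		    dup2(out_fd, STDOUT_FILENO) < 0)
			_exit(127);
		close(in[0]);
		close(in[1]);
		execvp(argv[0], argv);
		_exit(127); /* exec failed */
	}

	/* parent: keep only the write end */
	close(in[0]);
	return in[1];
}
```

With argv set to { "gzip", "-n", "-c", NULL } and out_fd being the
client socket, everything written to the returned descriptor would
arrive compressed at the client, provided the gzip binary can be found
from inside the chroot.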

> Then we get universal, outsourced compression for relatively little
> amount of code.
>
> The only question left now is how to get those exes into the chroot..

That's also a very important point. It shouldn't rely on that.

Thinking about it, your approach of checking if a .gz exists and then
serving that is a nice one, but you must understand that I am not too
big of a fan. Gzip is such an easy win to have on your website, as it
can speed things up and reduce transmitted weight. But if you end up
having to maintain a set of .gz's in your servedir, this is an
overhead. Unless you automate it, it can lead to problems.
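
One way to soften the maintenance problem is to serve the .gz only when
it is not older than the file it was made from. The following is a
minimal sketch under that assumption; use_precompressed() is a
hypothetical name and not taken from the patch, and the caller is
assumed to have checked already that the client sent
"Accept-Encoding: gzip":

```c
#include <stdio.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: write "<path>.gz" to gz and return 1 if that
 * file is a regular file at least as new as path itself. Return 0 if
 * the uncompressed file should be served instead.
 */
static int
use_precompressed(const char *path, char *gz, size_t gzlen)
{
	struct stat orig, comp;
	int n;

	n = snprintf(gz, gzlen, "%s.gz", path);
	if (n < 0 || (size_t)n >= gzlen)
		return 0; /* name did not fit into the buffer */

	if (stat(path, &orig) < 0 || stat(gz, &comp) < 0)
		return 0;
	if (!S_ISREG(comp.st_mode))
		return 0;

	/* a .gz older than its source is stale, so fall back */
	return comp.st_mtime >= orig.st_mtime;
}
```

A stale .gz then costs bandwidth rather than correctness, but somebody
still has to regenerate it.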

Feel free to add it to the patches section on the website. You can even
do that without discussion here, and thanks again for taking your time
to program this feature and bring it up here!

With best regards

Laslo

[0]: https://github.com/grumpydev/PortablePuTTY/blob/master/SSHZLIB.C
Received on Tue Mar 10 2020 - 09:54:04 CET
