Re: [wiki] [sites] Revert "[st][patch][lowlatency] Introducing new patch for st called lowlatency." || Hiltjo Posthuma

From: mschoth_AT_googlemail.com <mschoth_AT_gmail.com>
Date: Mon, 1 Jun 2020 14:26:16 +0200

Concerning your objections to the patch:

> Revert "[st][patch][lowlatency] Introducing new patch for st called lowlatency."
>
> This reverts commit 87fac66fea613a7e8d798da65dcd3552d4766817.
>
> Assuming the author probably had good intentions, but the page contains too
> much inaccurate and wrong information so it is reverted.
>
> st has had work on improving the latency and drawing performance (less flicker):
> https://git.suckless.org/st/commit/1d590910652519268152eae6b97cf30ace4e90c0.html

The linked commit does not improve I/O latency. It solves other problems.
With default settings it achieves, within the margin of statistical
error, exactly the same latency values as version 0.8.3.
I will provide a new lowlatency patch for the next stable version of
st with the new backend when the time comes.

>
> There you can lower the minlatency and maxlatency values. This is a better
> solution than forcing drawing after each keypress as done in this patch
> st.suckless.org/patches/lowlatency/st-lowlatency-0.8.3.diff
> Which causes flicker and potentially reduces throughput as well (as mentioned in
> the notes section). The auto-sync patch (in master) does not have this issue.
> Before the auto-sync patch it was already possible to increase the xfps and
> actionfps causing lower latencies as well (at some other costs potentially).

The point of this patch is that it reduces latency at no cost to
throughput, because it enforces additional drawing only for KeyPress
events.
Since the bandwidth of KeyPress events is generally very low, this
should not be an issue.
Also, flickering/tearing only occurs in the very rare event that
the scanline is right above the text that is being typed at that moment.
This patch trades ultra rare tearing events for latency values on par
with XTerm.

Also, if you simply set minlatency and maxlatency to 0 with the new
backend, you still get worse latency than this patch with the old
backend, at the additional cost of much worse throughput.
The same is true if you set xfps to a very high value in the old backend.
This will immediately render on ALL events instead of just KeyPress events.
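
To illustrate the difference between these approaches, here is a minimal
sketch in Python. It is a toy model and not st's actual code: the
function `simulate`, the three policy names and the 10 ms frame interval
are assumptions made only for this example. It counts how many redraws
each policy performs and reports the worst delay between a KeyPress and
the redraw that shows it.

```python
def simulate(events, policy, interval_ms=10.0):
    """Toy model of a terminal's redraw scheduling (not st's real code).

    events: iterable of (time_ms, kind), kind is "KeyPress" or "Output".
    policy: "limited"  - at most one redraw per interval_ms
            "keypress" - like "limited", but KeyPress redraws immediately
            "always"   - redraw immediately on every event
    Returns (number_of_redraws, worst_keypress_latency_ms).
    """
    draws = 0
    last_draw = float("-inf")
    pending = None  # time of the deferred redraw, if one is scheduled
    worst_key_latency = 0.0

    for t, kind in sorted(events):
        # Carry out a deferred redraw whose time has come.
        if pending is not None and pending <= t:
            draws += 1
            last_draw = pending
            pending = None

        immediate = policy == "always" or (
            policy == "keypress" and kind == "KeyPress"
        )
        if immediate:
            draws += 1
            last_draw = t
            pending = None  # this redraw also shows any deferred output
            draw_time = t
        else:
            if pending is None:
                pending = max(t, last_draw + interval_ms)
            draw_time = pending

        if kind == "KeyPress":
            worst_key_latency = max(worst_key_latency, draw_time - t)

    if pending is not None:  # flush the last deferred redraw
        draws += 1
    return draws, worst_key_latency


# Synthetic workload: one output event per millisecond for 100 ms,
# plus two key presses (all values are made up for illustration).
events = [(float(t), "Output") for t in range(100)]
events += [(10.5, "KeyPress"), (55.5, "KeyPress")]

for policy in ("limited", "keypress", "always"):
    print(policy, simulate(events, policy))
```

With this synthetic workload the "keypress" policy needs 13 redraws
instead of 11 and shows every key press without delay, whereas "always"
redraws 102 times. The numbers say nothing about real-world latency;
they only show why the extra redraws scale with the number of key
presses and not with the output bandwidth.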

>
> st was never as slow as the benchmark referenced on the page:
> https://danluu.com/term-latency/
> There the benchmarks are run with a Java application on MacOS:
> "Measurements are with macOS unless otherwise stated."
> This is not a reliable way to test. The proprietary MacOS is not supported
> either.

That page also contains benchmarks made on Linux ("st-linux"),
which are somewhat consistent with the benchmark results from the first
source.
I can add a note that the results on MacOS should be taken with a huge
grain of salt and should generally be ignored.

>
> For an accurate benchmark provide _at least_:
>
> * A listing of exactly all the versions used.
> * System information.

This can easily be done.

> * A FOSS OS (Linux, BSD).

Is already being used.

> * A reliable simple reproducible benchmarking tool (open-source of course).

Typometer is open source and produces reproducible results.
If you know of any simpler solution to measure input/output latency,
please let me know.

> * The exact dataset used for testing.

I use random data generated by /dev/urandom. It doesn't really make
sense to provide the exact file.
But I will provide the detailed command I ran in order to produce the
throughput results.

However, I found that I get vastly different results for different GPU
manufacturers.
Therefore the important metric to consider is not the absolute latency
value but the latency ratio relative to XTerm, which still beats st on
every system I tested with default settings.
That ratio seems to be consistent across different environments.

Here is a draft of the modified description text, with changes
according to some of your suggestions.
Please let me know if there is anything else you want to have changed:


lowlatency
==========

Summary
-------
Trivial patch that reduces input/output latency by selectively disabling the
built-in frame rate limiting for `KeyPress` events.

Description
-----------
According to the popular essay
[Typing with pleasure](https://pavelfatin.com/typing-with-pleasure/)
by Pavel Fatin, publicly available research data comes to the conclusion that
"Delay of visual feedback on a computer display have important effects on
typist behavior and satisfaction". Even when not consciously perceived, such
delay can have a significant impact on typing speed, error rate, eye and
muscle strain and the required amount of conscious attention.

Several publications \[[1](https://lwn.net/Articles/751763/)\],\[
[2](https://danluu.com/term-latency/)\] (the test in the latter source
on MacOS should be considered unreliable but the article still
contains useful information) that benchmarked latency metrics for
terminal emulators established that the latency performance of St is
consistently worse than that of XTerm.
The reason for this is that St employs frame rate limiting in order to keep St
from slowing down applications with high output bandwidth and to reduce
tearing.

This patch disables the frame rate limiting for events that are caused by
keyboard input but keeps it intact for all other events and consequently should
not harm throughput performance.

Benchmarks
----------
These benchmarks were done on an Intel Core i5-2400 with an RX580 using the
`amdgpu` driver on Linux kernel 5.4.42_1 and a 60Hz display. Latency was
measured with the utility
[Typometer](https://github.com/pavelfatin/typometer) (version 1.0.1).

[![Results Timeseries](xterm_vs_st0.8.3_vs_stlowlatency.png)](xterm_vs_st0.8.3_vs_stlowlatency.png)

[![Results Histogram](xterm_vs_stlowlatency_hist.png)](xterm_vs_stlowlatency_hist.png)

Throughput is measured as the time it takes to cat a 51MB file containing
random data generated with the command
`dd if=/dev/urandom of=/tmp/test count=100000`:

        Terminal        real      user      sys
        -----------------------------------------
        XTerm           7.853s    0.255s    0.819s
        St 0.8.3        4.347s    0.246s    0.655s
        St lowlatency   4.371s    0.254s    0.638s
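
As a sanity check of the numbers above, the following Python sketch
derives the file size from the `dd` command (assuming dd's default block
size of 512 bytes, since no `bs=` is given) and converts the `real` times
from the table into throughput figures. The helper names are made up for
this example.

```python
DD_BLOCK_SIZE = 512  # dd's default block size in bytes (no bs= given)
BLOCKS = 100_000     # count=100000 from the dd command

# 512 * 100000 = 51,200,000 bytes, i.e. the "51MB" file.
file_bytes = DD_BLOCK_SIZE * BLOCKS

# Wall-clock ("real") times in seconds, copied from the table.
real_seconds = {
    "XTerm": 7.853,
    "St 0.8.3": 4.347,
    "St lowlatency": 4.371,
}


def throughput_mb_per_s(seconds, size_bytes=file_bytes):
    """Throughput in MB/s (1 MB = 10**6 bytes)."""
    return size_bytes / 1e6 / seconds


def relative_slowdown(patched, baseline):
    """Fractional increase in run time of `patched` over `baseline`."""
    return (patched - baseline) / baseline


for name, seconds in real_seconds.items():
    print(f"{name:15s} {throughput_mb_per_s(seconds):6.2f} MB/s")

slowdown = relative_slowdown(
    real_seconds["St lowlatency"], real_seconds["St 0.8.3"]
)
print(f"lowlatency vs. 0.8.3: {slowdown:+.2%}")
```

The difference between the two St builds works out to roughly half a
percent of run time for this single run.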

**Conclusion**: The results show that the patch improves latency to the point
that it now beats the previous leader XTerm, without a measurable impact on
throughput.

Notes
-----
* This patch will most probably show no effect if used in conjunction with a
compositor or any driver configuration that enforces global vsync.

* In rare cases you might experience tearing, but only while typing.

* St after commit 1d590910652519268152eae6b97cf30ace4e90c0 has a new backend
to determine the right time to render. A corresponding lowlatency patch will
be released for the next stable version of St.

Download
--------
* [st-lowlatency-0.8.3.diff](st-lowlatency-0.8.3.diff)

Authors
-------
* Matthias Schoth - <mschoth_AT_gmail.com>

Received on Mon Jun 01 2020 - 14:26:16 CEST
