Re: [dev] [PATCH] slstatus: cpu_perc: make 100% equivalent of 1 cpu

From: Kurt Van Dijck <dev.kurt_AT_vandijck-laurijssen.be>
Date: Fri, 15 Sep 2017 10:57:00 +0200

> > This commit allows to specify (statically) the number of CPU's (ncpu).
> > This allows to show the cpu usage relative to 1 CPU.
> > So, when 1 cpu is busy, 100% is shown. 2 cpu's busy: 200%, and so on.
> > At this point, the configuration of ncpu is static.
> >
> > When no number is given (the backward compatible option), then
> > slstatus thinks it only has 1 cpu and no scaling is done, like it
> > used to be.
>
> Eeek,

First:
I kept the legacy behaviour as the default, since I do understand that
people may choose to display the number relative to all resources.

> Could you explain the rationale of this?.

yes, see below

>
> 100% means “all resources”, 200% means “twice all resources”, how is
> this supposed to be interpreted?

on my 4cpu machine, a busy job shows around 25%: That typically does
not catch my attention.

with my patch, showing usage relative to 1 cpu resource, a busy job
shows around 100%: That immediately tells me that a job is busy.

This is mostly useful to detect problem jobs consuming more than they should.

To summarize: 100% means "the resources of 1 cpu", 200% means twice the
resources of 1cpu. My 4cpu machine will never go beyond 400%.
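
As a rough sketch of the arithmetic only (this is not the code from the
patch; the name scale_cpu_perc and its parameters are made up for
illustration), scaling an all-CPU percentage to a 1-cpu percentage could
look like this:

```c
/*
 * Hypothetical helper, not taken from slstatus.
 * total_perc: usage relative to all CPUs together (0..100).
 * ncpu:       statically configured number of CPUs; anything below 1
 *             is treated as 1, which means no scaling (legacy behaviour).
 * Returns usage relative to a single CPU (0..100 * ncpu).
 */
int
scale_cpu_perc(int total_perc, int ncpu)
{
	if (ncpu < 1)
		ncpu = 1;
	return total_perc * ncpu;
}
```

With ncpu = 4, a reading of 25 becomes 100 and a reading of 100 becomes
400; with ncpu = 1 the value is passed through unchanged.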
>
> We are not collecting per-cpu statistics, so it's misleading at best.

I did show per-cpu usage in the past, but that wasn't that good after all:

A busy job may show as [50% 50% 0% 0%] or [0% 10% 70% 20%] and sometimes
even [100% 0% 0% 0%].
I don't want per-cpu statistics.
Having 1 number relative to 1 cpu has worked well all the time.
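
To illustrate why a single number is easier to read, here is a small
sketch (the function sum_cpu_perc is hypothetical and only demonstrates
the arithmetic; slstatus does not collect per-cpu values) that adds up
per-cpu percentages. All three per-cpu patterns above collapse to the
same total:

```c
#include <stddef.h>

/*
 * Hypothetical helper for illustration only.
 * Adds up n per-cpu usage percentages into one figure that is
 * relative to a single CPU.
 */
int
sum_cpu_perc(const int *perc, size_t n)
{
	int total = 0;
	size_t i;

	for (i = 0; i < n; i++)
		total += perc[i];
	return total;
}
```

Each of {50, 50, 0, 0}, {0, 10, 70, 20} and {100, 0, 0, 0} sums to 100.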
> If it's a question of “precision” (those are approximate stats), use ‰,
> not %.
That's not the problem. While the measurement is precise, the context
introduces enough variation that a problem job never shows as exactly
100%.
>
> Having 100% on this setup (of 200% max) only means actually 50%, not
> 100% of supposedly one core and having another core still available.

I focus on consumed resources, while you focus on free resources.
Both points of view are equally valid.

Kind regards,
Kurt
Received on Fri Sep 15 2017 - 10:57:00 CEST
