Discussion:
[gentoo-user] Emerge load again
Peter Humphreey
2023-11-27 15:50:01 UTC
Hello list,

I still can't see how portage limits the load. Today I'm emerging libreoffice,
and it's spending almost the whole time working with 4 CPU threads. But:

$ grep -e '\-j' -e distcc /etc/portage/make.conf
EMERGE_DEFAULT_OPTS="--jobs=18 --load-average=30 --backtrack=200 --autounmask=n --keep-going --nospinner"
FEATURES="distcc userfetch buildpkg network-sandbox parallel-install sandbox userpriv usersandbox"
MAKEOPTS="-j18"

I found a suggestion to use distcc in the installation handbook, which I
hadn't seen there before, so I went searching for it and found how to do it.
It usually works well, in this case starting 18 packages before starting LO
itself. grep -rw doesn't find '4' anywhere relevant under /etc/portage/. Other
times it just doesn't help at all.

What am I missing?
--
Regards,
Peter.
Michael
2023-11-29 10:30:01 UTC
Post by Peter Humphreey
Hello list,
I still can't see how portage limits the load. Today I'm emerging
libreoffice, and it's spending almost the whole time working with 4 CPU
$ grep -e '\-j' -e distcc /etc/portage/make.conf
EMERGE_DEFAULT_OPTS="--jobs=18 --load-average=30 --backtrack=200 --autounmask=n --keep-going --nospinner"
FEATURES="distcc userfetch buildpkg network-sandbox parallel-install sandbox userpriv usersandbox"
MAKEOPTS="-j18"
I found a suggestion to use distcc in the installation handbook, which I
hadn't seen there before, so I went searching for it and found how to do it.
It usually works well, in this case starting 18 packages before starting LO
itself. grep -rw doesn't find '4' anywhere relevant under /etc/portage/.
Other times it just doesn't help at all.
What am I missing?
In absence of other contributions I'll offer a theoretical explanation, based
on random observations on my systems.

You have specified as many as 18 packages to be emerged in parallel x up to 18
make jobs each. The result of [18 x 18 = 324] is to be limited by a total
load average of 30.
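
As a quick check of that arithmetic, here is a minimal shell sketch. The
function name worst_case_jobs is made up for illustration; it only multiplies
the emerge --jobs value by the make -j value and says nothing about what the
load limit would then allow.

```shell
# Hypothetical helper, not part of portage: worst-case count of
# simultaneous make jobs for N parallel emerges at make -jM each.
worst_case_jobs() {
    echo $(( $1 * $2 ))
}

# --jobs=18 from EMERGE_DEFAULT_OPTS, -j18 from MAKEOPTS
worst_case_jobs 18 18
```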

If there were more than 18 packages listed to be emerged and there were no
dependencies between them to restrict how many could start emerging in
parallel, you would observe at most 18 packages being emerged in parallel.
This alone will not breach the load limit of 30.

Let's assume all 18 packages had a codebase large enough to need at least 18
make jobs each. Sooner or later you'd have 18 parallel emerges all trying to
run 18 make jobs.

Were this to occur the load limit restriction would kick in and you would see
only up to 30 jobs listed in top, with individual package processes
alternating in the top list of make threads.

Here's my hypothesis explaining your own observation with libreoffice. As one
or more packages finish emerging, libreoffice's turn comes up. Soon
libreoffice starts to execute make jobs, but any of the following may apply:

1. There are only 4 out of 30 jobs available, because other packages are
already using 26, throughout your window of observation.
2. Libreoffice sequencing of make jobs is mostly linear with succeeding make
jobs waiting on output from their predecessors.
3. Libreoffice source code is not optimised for high parallelism - I recall
when it was hardcoded at -j1 just a few years ago. Before this restriction
was added, any bug reporters were advised to try again after limiting make to
-j1.

Next time I'm building libreoffice on a beefier system I'll keep an eye out
for the number of jobs to see what it gets up to.
Peter Humphreey
2023-11-29 12:10:02 UTC
Post by Michael
Here's my hypothesis explaining your own observation with libreoffice. As a
package or more finished emerging, libreoffice's turn comes up. Soon
1. There are only 4 out of 30 jobs available, because other packages are
already using 26, throughout your window of observation.
Nope. Nothing else in progress.
Post by Michael
2. Libreoffice sequencing of make jobs is mostly linear with succeeding make
jobs waiting on output from their predecessors.
That's possible, but it doesn't seem likely with such a huge code base. And
why four processes, specifically and consistently?
Post by Michael
3. Libreoffice source code is not optimised for high parallelism - I recall
when it was hardcoded at -j1 just a few years ago. Before this restriction
was added, any bug reporters were advised to try again after limiting make
to -j1.
Yes, that was common to many packages for a long time because of incomplete
optimisation.
Post by Michael
Next time I'm building libreoffice on a beefier system I'll keep an eye out
for the number of jobs to see what it gets up to.
That would help, yes.

The contribution of distcc isn't clear to me yet, as I said before. Sometimes
it's the bee's knees; other times it might just as well not be there. I don't
like mysteries... :)
--
Regards,
Peter.
Michael
2024-01-06 11:50:01 UTC
Post by Peter Humphreey
Post by Michael
Here's my hypothesis explaining your own observation with libreoffice. As
a package or more finished emerging, libreoffice's turn comes up. Soon
1. There are only 4 out of 30 jobs available, because other packages are
already using 26, throughout your window of observation.
Nope. Nothing else in progress.
Post by Michael
2. Libreoffice sequencing of make jobs is mostly linear with succeeding
make jobs waiting on output from their predecessors.
That's possible, but it doesn't seem likely with such a huge code base. And
why four processes, specifically and consistently?
Post by Michael
3. Libreoffice source code is not optimised for high parallelism - I recall
when it was hardcoded at -j1 just a few years ago. Before this restriction
was added, any bug reporters were advised to try again after limiting make
to -j1.
Yes, that was common to many packages for a long time because of incomplete
optimisation.
Post by Michael
Next time I'm building libreoffice on a beefier system I'll keep an eye out
for the number of jobs to see what it gets up to.
That would help, yes.
OK, I eventually got around to it. I am observing right now LO is building
with as many as 24 jobs:

top - 11:14:59 up 2:19, 2 users, load average: 24.46, 23.15, 9.51
Tasks: 474 total, 25 running, 449 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.2 us, 5.6 sy, 94.0 ni, 0.2 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 64217.1 total, 50028.6 free, 6233.7 used, 7954.9 buff/cache
MiB Swap: 0.0 total, 0.0 free, 0.0 used. 54333.4 avail Mem

I don't use distcc. The make -j25 -l24.8 I have specified is respected.
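
To illustrate what "respected" means here: as documented, GNU make's -l holds
back new jobs while other jobs are running and the load average is at least
the given value. Below is a rough sketch of that comparison, using the
1-minute load from the top output above. The name below_limit is hypothetical,
and awk is assumed to be available for the floating-point comparison.

```shell
# Hypothetical helper: succeed if load average $1 is below limit $2,
# i.e. the condition under which make -l would start another job.
below_limit() {
    awk -v load="$1" -v limit="$2" 'BEGIN { exit !(load + 0 < limit + 0) }'
}

# 1-minute load 24.46 against make -l24.8
if below_limit 24.46 24.8; then
    echo "below the limit: make may start another job"
else
    echo "at or above the limit: make holds back new jobs"
fi
```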
Post by Peter Humphreey
The contribution of distcc isn't clear to me yet, as I said before.
Sometimes it's the bee's knees; other times it might just as well not be
there. I don't like mysteries... :)
Peter Humphrey
2024-01-06 14:00:01 UTC
Post by Michael
Post by Peter Humphreey
Post by Michael
Here's my hypothesis explaining your own observation with libreoffice. As
a package or more finished emerging, libreoffice's turn comes up. Soon
1. There are only 4 out of 30 jobs available, because other packages are
already using 26, throughout your window of observation.
Nope. Nothing else in progress.
Post by Michael
2. Libreoffice sequencing of make jobs is mostly linear with succeeding
make jobs waiting on output from their predecessors.
That's possible, but it doesn't seem likely with such a huge code base. And
why four processes, specifically and consistently?
Post by Michael
3. Libreoffice source code is not optimised for high parallelism - I recall
when it was hardcoded at -j1 just a few years ago. Before this restriction
was added, any bug reporters were advised to try again after limiting make
to -j1.
Yes, that was common to many packages for a long time because of incomplete
optimisation.
Post by Michael
Next time I'm building libreoffice on a beefier system I'll keep an eye out
for the number of jobs to see what it gets up to.
That would help, yes.
OK, I eventually got around to it. I am observing right now LO is building
top - 11:14:59 up 2:19, 2 users, load average: 24.46, 23.15, 9.51
Tasks: 474 total, 25 running, 449 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.2 us, 5.6 sy, 94.0 ni, 0.2 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 64217.1 total, 50028.6 free, 6233.7 used, 7954.9 buff/cache
MiB Swap: 0.0 total, 0.0 free, 0.0 used. 54333.4 avail Mem
I don't use distcc. The make -j25 -l24.8 I have specified is respected.
Interesting. Thanks.
Post by Michael
Post by Peter Humphreey
The contribution of distcc isn't clear to me yet, as I said before.
Sometimes it's the bee's knees; other times it might just as well not be
there. I don't like mysteries... :)
I've decided to ditch distcc altogether. During the very first build, what it
grants with one hand it takes away double with the other - lots of tiny jobs
all started together, but then gcc is compiled with just two threads. That
just-two happens on at least two different machines (not just separate;
different).

The position is no better in regular maintenance: no matter how many /make/
tasks are needed, I get just two threads compiling at a time. (I'm referring
to the single-host arrangement I mentioned at the start.)

I'm baffled, and I don't like it; I much prefer understanding to mystery.
--
Regards,
Peter.
Wols Lists
2024-01-06 15:30:02 UTC
Post by Peter Humphreey
The contribution of distcc isn't clear to me yet, as I said before. Sometimes
it's the bee's knees; other times it might just as well not be there. I don't
like mysteries... 🙂
As far as I'm aware, there's no mystery. On a single machine you get the
exact same thing ... it's all down to parallelism.

Make asks itself "how many separate tasks can I do at the same time,
which won't interfere with each other". In gcc's case, the answer
appears to be two. It doesn't matter how much resource is available,
make can only make use of two cores.

In other cases, there may be a hundred separate tasks, make fires off a
hundred tasks shared amongst all the resource it can find, and sits back
and waits.

Think of a hundred compile jobs all running at the same time, but then
the linker is invoked, and you can only have the one linker running,
after all the compile jobs have finished.

And this is a HARD problem, I haven't seen it recently, but there used
to be plenty of threads about hard-to-debug compile failures that went
away with -j1. The obvious cause was two compile jobs being set off in
parallel, when in reality one depended on the other, and things messed up.

Cheers,
Wol
Peter Humphrey
2024-01-06 18:00:02 UTC
Post by Wols Lists
As far as I'm aware, there's no mystery. On a single machine you get the
exact same thing ... it's all down to parallelism.
Make asks itself "how many separate tasks can I do at the same time,
which won't interfere with each other". In gcc's case, the answer
appears to be two. It doesn't matter how much resource is available,
make can only make use of two cores.
Yet, if I set -distcc and -j12 -l12, I get 12 threads in parallel. That's the
mystery.
Post by Wols Lists
In other cases, there may be a hundred separate tasks, make fires off a
hundred tasks shared amongst all the resource it can find, and sits back
and waits.
And that's how the very first installation goes, with single-host distcc. Then,
when it gets to gcc, it collapses to 2 threads and everything gained so far is
lost many-fold. (I set USE=-fortran to avoid pointless recompilation, since
nothing needs it here.)
Post by Wols Lists
Think of a hundred compile jobs all running at the same time, but then
the linker is invoked, and you can only have the one linker running,
after all the compile jobs have finished.
I hadn't thought of that - another thing to consider.
Post by Wols Lists
And this is a HARD problem, I haven't seen it recently, but there used
to be plenty of threads about hard-to-debug compile failures that went
away with -j1. The obvious cause was two compile jobs being set off in
parallel, when in reality one depended on the other, and things messed up.
I haven't either - seen it recently.
--
Regards,
Peter.
Wols Lists
2024-01-06 19:40:01 UTC
Post by Peter Humphrey
Post by Wols Lists
In other cases, there may be a hundred separate tasks, make fires off a
hundred tasks shared amongst all the resource it can find, and sits back
and waits.
And that's how the very first installation goes, with single-host distcc. Then,
when it gets to gcc, it collapses to 2 threads and everything gained so far is
lost many-fold. (I set USE=-fortran to avoid pointless recompilation, since
nothing needs it here.)
So if it's consistently gcc that collapses to two threads, then
something (maybe explicit settings, maybe dependencies, maybe yadda
yadda) is telling make that only two jobs can run at the same time else
they'll trip over each other.

Could be a dev has hard-coded the "two jobs" rule to make those random
crashes go away :-) Or maybe they found the problem, and that's why only
two jobs can run in parallel.

Cheers,
Wol
Peter Humphrey
2024-01-07 00:50:01 UTC
Post by Wols Lists
Post by Peter Humphrey
Post by Wols Lists
In other cases, there may be a hundred separate tasks, make fires off a
hundred tasks shared amongst all the resource it can find, and sits back
and waits.
And that's how the very first installation goes, with single-host distcc.
Then, when it gets to gcc, it collapses to 2 threads and everything
gained so far is lost many-fold. (I set USE=-fortran to avoid pointless
recompilation, since nothing needs it here.)
So if it's consistently gcc that collapses to two threads, then
something (maybe explicit settings, maybe dependencies, maybe yadda
yadda) is telling make that only two jobs can run at the same time else
they'll trip over each other.
Could be a dev has hard-coded the "two jobs" rule to make those random
crashes go away :-) Or maybe they found the problem, and that's why only
two jobs can run in parallel.
Not so. As I said last time: 'if I set -distcc and -j12 -l12, I get 12 threads
in parallel'.
--
Regards,
Peter.
Adam Carter
2024-01-07 01:00:01 UTC
Post by Peter Humphrey
Post by Wols Lists
So if it's consistently gcc that collapses to two threads, then
something (maybe explicit settings, maybe dependencies, maybe yadda
yadda) is telling make that only two jobs can run at the same time else
they'll trip over each other.
Could be a dev has hard-coded the "two jobs" rule to make those random
crashes go away :-) Or maybe they found the problem, and that's why only
two jobs can run in parallel.
Not so. As I said last time: 'if I set -distcc and -j12 -l12, I get 12 threads
in parallel'.
Have you checked you're not limiting jobs in /etc/distcc/hosts? ie no '/2'
after the IP address?
Peter Humphrey
2024-01-07 01:50:01 UTC
Post by Adam Carter
Post by Peter Humphrey
Post by Wols Lists
So if it's consistently gcc that collapses to two threads, then
something (maybe explicit settings, maybe dependencies, maybe yadda
yadda) is telling make that only two jobs can run at the same time else
they'll trip over each other.
Could be a dev has hard-coded the "two jobs" rule to make those random
crashes go away :-) Or maybe they found the problem, and that's why only
two jobs can run in parallel.
Not so. As I said last time: 'if I set -distcc and -j12 -l12, I get 12 threads
in parallel'.
Have you checked you're not limiting jobs in /etc/distcc/hosts? ie no '/2'
after the IP address?
$ cat /etc/distcc/hosts
localhost/12
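
For anyone checking their own hosts file, here is a minimal sketch that pulls
the per-host job limit out of one entry. distcc_slots is a hypothetical helper,
not a distcc command. If I read distcc(1) correctly, an entry with no /LIMIT
falls back to a built-in default (four per host, two for localhost), so the
helper prints nothing in that case rather than guessing.

```shell
# Hypothetical helper: print the /LIMIT part of one distcc hosts entry,
# e.g. "localhost/12" -> 12, "sophie/5,lzo" -> 5.
# Prints nothing when the entry carries no explicit limit.
distcc_slots() {
    case "$1" in
        */*)
            rest=${1#*/}          # drop everything up to the first "/"
            echo "${rest%%,*}"    # drop trailing options such as ",lzo"
            ;;
    esac
}

distcc_slots localhost/12
```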
--
Regards,
Peter.
Nuno Silva
2023-11-30 10:20:01 UTC
Post by Michael
Post by Peter Humphreey
Hello list,
I still can't see how portage limits the load. Today I'm emerging
libreoffice, and it's spending almost the whole time working with 4 CPU
$ grep -e '\-j' -e distcc /etc/portage/make.conf
EMERGE_DEFAULT_OPTS="--jobs=18 --load-average=30 --backtrack=200 --autounmask=n --keep-going --nospinner"
FEATURES="distcc userfetch buildpkg network-sandbox parallel-install sandbox userpriv usersandbox"
MAKEOPTS="-j18"
I found a suggestion to use distcc in the installation handbook, which I
hadn't seen there before, so I went searching for it and found how to do it.
It usually works well, in this case starting 18 packages before starting LO
itself. grep -rw doesn't find '4' anywhere relevant under /etc/portage/.
Other times it just doesn't help at all.
What am I missing?
In absence of other contributions I'll offer a theoretical explanation, based
on random observations on my systems.
I can't explain the 4, but one thing about this configuration (although
it's possible this has been already discussed before, apologies if
Post by Michael
You have specified as many as 18 packages to be emerged in parallel x up to 18
make jobs each. The result of [18 x 18 = 324] is to be limited by a total
load average of 30.
[...]
Post by Michael
Were this to occur the load limit restriction would kick in and you would see
only up to 30 jobs listed in top, with individual package processes
alternating in the top list of make threads.
The load limit is being set only for emerge, not make, so it would only
affect the decision to start building more packages in parallel. The
already started ongoing builds could still take the load beyond 30, with
more than 30 processes - there is nothing set to prevent that, or is
there?
--
Nuno Silva
Michael
2023-11-30 11:20:02 UTC
Post by Nuno Silva
Post by Michael
Post by Peter Humphreey
Hello list,
I still can't see how portage limits the load. Today I'm emerging
libreoffice, and it's spending almost the whole time working with 4 CPU
$ grep -e '\-j' -e distcc /etc/portage/make.conf
EMERGE_DEFAULT_OPTS="--jobs=18 --load-average=30 --backtrack=200 --autounmask=n --keep-going --nospinner"
FEATURES="distcc userfetch buildpkg network-sandbox parallel-install sandbox userpriv usersandbox"
MAKEOPTS="-j18"
I found a suggestion to use distcc in the installation handbook, which I
hadn't seen there before, so I went searching for it and found how to do
it. It usually works well, in this case starting 18 packages before
starting LO itself. grep -rw doesn't find '4' anywhere relevant under
/etc/portage/ . Other times it just doesn't help at all.
What am I missing?
In absence of other contributions I'll offer a theoretical explanation,
based on random observations on my systems.
I can't explain the 4, but one thing about this configuration (although
it's possible this has been already discussed before, apologies if
Post by Michael
You have specified as many as 18 packages to be emerged in parallel x up
to 18 make jobs each. The result of [18 x 18 = 324] is to be limited by
a total load average of 30.
[...]
Post by Michael
Were this to occur the load limit restriction would kick in and you would
see only up to 30 jobs listed in top, with individual package processes
alternating in the top list of make threads.
The load limit is being set only for emerge, not make, so it would only
affect the decision to start building more packages in parallel. The
already started ongoing builds could still take the load beyond 30, with
more than 30 processes - there is nothing set to prevent that, or is
there?
As I understand it any tasks the emerge command is spawning, including make
jobs, will be respectful of the '--load-average 30.0'. When only MAKEOPTS is
specified, then a '-l 30.0' would be needed there to apply the same load limit
average.
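
For what it's worth, as far as I can tell from emerge(1), --load-average only
stops emerge from starting new builds while the load is at or above the limit,
which matches Nuno's reading. A limit on the make jobs themselves would then
have to come from make's own -l (also spelled --load-average or --max-load) in
MAKEOPTS. Here is a minimal sketch for checking a MAKEOPTS string;
has_make_load_limit is a hypothetical name, not a portage tool.

```shell
# Hypothetical helper: succeed if a MAKEOPTS string carries its own
# load limit for make (-l, --load-average or --max-load).
has_make_load_limit() {
    case " $1 " in
        *" -l"*|*" --load-average"*|*" --max-load"*) return 0 ;;
        *) return 1 ;;
    esac
}

# The MAKEOPTS from the original post
if has_make_load_limit "-j18"; then
    echo "make is load-limited"
else
    echo "no -l in MAKEOPTS: make itself is not load-limited"
fi
```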
Peter Humphreey
2023-11-30 16:50:01 UTC
Post by Nuno Silva
The load limit is being set only for emerge, not make, so it would only
affect the decision to start building more packages in parallel. The
already started ongoing builds could still take the load beyond 30, with
more than 30 processes - there is nothing set to prevent that, or is
there?
Yes, according to that web site I found, distcc will limit the number, and it
does seem to do so. What puzzles me is that I can't get LO to start any other
number of make jobs than 4.
--
Regards,
Peter.
John Blinka
2023-11-29 14:20:02 UTC
On Mon, Nov 27, 2023 at 10:39 AM Peter Humphreey <***@prh.myzen.co.uk> wrote:
Post by Peter Humphreey
What am I missing?
I have much less powerful hardware than you but libreoffice (as a
stand-alone build) generates many more threads than 4 on my “cluster”. I’m
also using distcc.

On the main box, I set
MAKEOPTS="-j17 -l6"
On the other two less powerful ones -l is 5 and 3, but -j is the same.

On the main box, /etc/distcc/hosts contains
localhost/11 sophie/5,lzo tobey/3,lzo --localslots=11 --localslots_cpp=11

On sophie and tobey (my less powerful boxes) the hosts file contains
something similar but specific to those boxes. The localslots and
localslots_cpp numbers are 3 on tobey and 5 on sophie, and the order in
which the machines are mentioned changes (local machine first, then remote
machines in order of power).

This configuration is the result of a lot of experimentation rather than
just a theoretical calculation. The various guides that discuss how to tune
these numbers for best performance were modestly helpful in explaining what
the tuning parameters mean, but experimenting and watching the resulting
performance was the best teacher.

Hope this helps.

John Blinka
Michael
2023-11-29 15:00:01 UTC
Post by John Blinka
Post by Peter Humphreey
What am I missing?
I have much less powerful hardware than you but libreoffice (as a
stand-alone build) generates many more threads than 4 on my “cluster”. I’m
also using distcc.
On the main box, I set
MAKEOPTS="-j17 -l6"
On the other two less powerful ones -l is 5 and 3, but -j is the same.
On the main box, /etc/distcc/hosts contains
localhost/11 sophie/5,lzo tobey/3,lzo --localslots=11 --localslots_cpp=11
On sophie and tobey (my less powerful boxes) the hosts file contains
something similar but specific to those boxes. The localslots and
localslots_cpp numbers are 3 on tobey and 5 on sophie, and the order in
which the machines are mentioned changes (local machine first, then remote
machines in order of power).
This configuration is the result of a lot of experimentation rather than
just a theoretical calculation. The various guides that discuss how to tune
these numbers for best performance were modestly helpful in explaining what
the tuning parameters mean, but experimenting and watching the resulting
performance was the best teacher.
Hope this helps.
John Blinka
I don't use distcc, so I can't add anything useful to its application on
Peter's requirements, but a quick test by Peter would be to start a single
emerge of libreoffice on its own and observe if it is still limited to 4
threads with and without distcc.