[gentoo-user] Controlling emerges

Post by Peter Humphrey
Hello list,
We've had a few discussions here on how to balance the parameters to emerge.
On the one hand, big jobs should be able to use the maximum CPU
performance and RAM capacity, but on the other we don't want to flood the
system.
Therefore, I think it would be useful to be able to specify in env and
package.env that a job should be run on its own - if any other emerge jobs are
scheduled, wait until they're finished. Combine that with a specific MAKEOPTS,
and we'd have a more flexible deployment of resources.
Is this feasible? What have I not thought of?

I've had exactly the same thought for some time now. My guess is that
it is theoretically possible to add some USE flag or ENV var for portage
to recognize, but I don't know the portage internals well enough to
guess how much effort it would be. Given that portage orders ebuilds in
a single emerge session based on some dependency graph, that seems like
a good place to put the necessary hooks.

As a starting point, one option might be to create a special/magic
ebuild and make it a dependency of those jobs that need to be run alone,
and have something about it that won't run if anything else is still
running. But I don't know if those pre-checks (such as checking for
enough RAM and/or disk space) can be run at build time and not just at
portage startup time. The other possible problem with that approach
would be making sure that ebuild gets run separately for each
ebuild that depends on it - not all of them depending on it being run
once. Also, those blocking ebuilds have to work so that if several of them
are queued (and running their "wait for everything else to finish"
scripts), exactly one of them starts. I don't know if those
pre-check scripts count as running before or within the ebuild itself.

Alan McKinnon

2023-09-18 13:50:01 UTC

It may be less complex than you think, Jack. I envisage a package being marked
as solitary, and when portage reaches that package, it waits until all current
jobs have finished, then it starts the solitary package with the environment
specified for it, and it doesn't start the next one until that one has finished.
The dependency calculation shouldn't need to be changed.
It seems simple the way I see it.

How does that improve emerge performance overall?

--
Alan McKinnon
alan dot mckinnon at gmail dot com

Peter Humphrey

2023-09-18 16:10:03 UTC

Post by Peter Humphrey
It may be less complex than you think, Jack. I envisage a package being marked
as solitary, and when portage reaches that package, it waits until all current
jobs have finished, then it starts the solitary package with the environment
specified for it, and it doesn't start the next one until that one has finished.
The dependency calculation shouldn't need to be changed.
It seems simple the way I see it.

How does that improve emerge performance overall?

--
Regards,
Peter.

Alan McKinnon

2023-09-18 16:20:01 UTC

How does that improve emerge performance overall?

I did read all those but no matter how you move things around you still
have only X resources available all the time.
Whether you just let emerge do its thing or try to get it to do big packages
on their own, everything is still going to use the same number of cpu
cycles overall and you will save nothing.

If webkit-gtk is the only big package, have you considered:

emerge -1v webkit-gtk && emerge -avuND @world?

What you have is not a portage problem. It is an orthodox parallelism
problem, and I think you are thinking your constraint is unique in the world
- it isn't.
With parallelism, trying to fiddle single nodes to improve things overall
never really works out.

Just my $0.02

Alan

--
Alan McKinnon
alan dot mckinnon at gmail dot com

Michael

2023-09-18 18:10:02 UTC

It may be less complex than you think, Jack. I envisage a package being marked
as solitary, and when portage reaches that package, it waits until all current
jobs have finished, then it starts the solitary package with the environment
specified for it, and it doesn't start the next one until that one has finished.
The dependency calculation shouldn't need to be changed.
It seems simple the way I see it.

How does that improve emerge performance overall?

I did read all those but no matter how you move things around you still
have only X resources available all the time.
Whether you just let emerge do its thing or try to get it to do big packages
on their own, everything is still going to use the same number of cpu
cycles overall and you will save nothing.
What you have is not a portage problem. It is an orthodox parallelism
problem, and I think you are thinking your constraint is unique in the world
- it isn't.
With parallelism, trying to fiddle single nodes to improve things overall
never really works out.
Just my $0.02
Alan

I think there is a level of complexity involved which will make (m)any
attempts at optimisation difficult, because EMERGE_DEFAULT_OPTS competes for
resources against MAKEOPTS, resulting in a trade-off between their optimal
settings. Parallelisation becomes difficult to maximise on the basis of some
presets when not all updates have the same combination of small vs large
packages, dependent packages queue up before dependencies are built, various
emerge stages are processed linearly, some versions of gcc may get hungrier
for RAM, and whatever else I haven't accounted for.
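
To make the trade-off concrete, here is a minimal sketch of the two knobs as they might appear in /etc/portage/make.conf. The -j12 and the load average of 32 are the figures Peter quotes in this thread; the --jobs=4 value is purely an assumption for illustration. The closing comment spells out the worst case the two settings can produce together.

```shell
# /etc/portage/make.conf (fragment; values are illustrative, not a recommendation)

# How many packages emerge may build at once, and the load average above
# which it stops starting new ones. --jobs=4 is an assumed value.
EMERGE_DEFAULT_OPTS="--jobs=4 --load-average=32"

# How many compile jobs each of those packages may run via make.
MAKEOPTS="-j12 -l32"

# Worst case if every package saturates its own -j at the same moment:
# 4 packages x 12 jobs = 48 compiler processes competing for RAM.
# The load-average limits look at CPU load only, not at memory.
```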

Someone with a PhD on multivariate stochastic analysis could probably come up
with some nifty code to include in portage? ;-)

Dale

2023-09-18 18:20:01 UTC

On Mon, Sep 18, 2023 at 3:44 PM Peter Humphrey

It may be less complex than you think, Jack. I envisage a package being marked
as solitary, and when portage reaches that package, it waits until all current
jobs have finished, then it starts the solitary package with the environment
specified for it, and it doesn't start the next one until that one has finished.
The dependency calculation shouldn't need to be changed.
It seems simple the way I see it.

How does that improve emerge performance overall?

By allocating all the system resources to huge packages while not flooding the
system with lesser ones. For example, I can set -j20 for webkit-gtk today
without overflowing the 64GB RAM, and still have 4 CPU threads available to
other tasks. The change I've proposed should make the whole operation more
efficient overall and take less time.
As things stand today, I have to make do with -j12 or so, wasting time and
resources. I have load-average set at 32, so if I were to set -j20 generally
I'd run out of RAM in no time. I've had many instances of packages failing to
compile in a large update, but going just fine on their own; and I've had
mysterious operational errors resulting, I suspect, from otherwise undetected
miscompilation.
Previous threads have more detail of what I've tried already.
I did read all those but no matter how you move things around you
still have only X resources available all the time.
Whether you just let emerge do its thing or try to get it to do big
packages on their own, everything is still going to use the same
number of cpu cycles overall and you will save nothing.
What you have is not a portage problem. It is an orthodox parallelism
problem, and I think you are thinking your constraint is unique in the
world - it isn't.
With parallelism, trying to fiddle single nodes to improve things
overall never really works out.
Just my $0.02
Alan
--
Alan McKinnon
alan dot mckinnon at gmail dot com

I have to admit, I wish I could tell emerge to compile certain packages
on their own as well. LOo, that qtweb package and a few others.
Sometimes they end up naturally compiling on their own but sometimes, I
end up with LOo, Seamonkey or Firefox, or that qtweb package trying to
compile at the same time in some combination. Sometimes, all four hit
at once. It's bad enough when it is just two of them but when they all
hit, it causes problems. It would be nice if we could set up a list
that tells emerge to emerge only one at a time just like we tell it not
to use tmpfs for certain builds.

While just emerging them first might work, it also limits emerge to just
doing that package instead of the whole update. It also could have
dependencies that also want a lot of resources. I don't know about most
people but I run my updates while I sleep. Having the option to set
that up would be nice. It's not like packages are getting any smaller
either. This is a growing problem.

I have no idea how to do this but I do like the idea.

Dale

:-) :-)

John Blinka

2023-09-18 18:20:01 UTC

Post by Alan McKinnon
What you have is not a portage problem. It is an orthodox parallelism
problem, and I think you are thinking your constraint is unique in the world
- it isn't.
With parallelism, trying to fiddle single nodes to improve things overall
never really works out.
Just my $0.02
Alan

I use this idea, but it requires (for me) a more sophisticated
implementation. As is, it pulls in webkit-gtk-x.y.z and
webkit-gtk-x.y.z-r410 simultaneously - for my portage setup. I don't have
the memory to handle both at the same time. It's guaranteed to crash on my
system.

Instead, I do a preliminary emerge -p<etc>, saving the specific package
builds to a file. I then inspect the file to see what portage wants to do.
Too often, the file contains webkit-gtk-x.y.z and webkit-gtk-x.y.z-r410 in
sequence, usually preceded and followed by other packages. Portage always
wants to build both versions simultaneously - guaranteed crash for me.

Instead of invoking emerge, I write a little bash script to emerge the
preceding packages in parallel, followed by a serial webkit-gtk-x.y.z,
followed by a serial webkit-gtk-x.y.z-r410, and then finally all the
remaining packages. Four emerge invocations in sequence. The script builds
specific versions, i.e. =net-libs/webkit-gtk-x.y.z, to ensure it builds only
1 package at a time. It's trivial to write.
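
The approach described above could be sketched roughly as follows. This is not John's actual script: it assumes the pretend output has already been reduced to one =category/name-version atom per line (that extraction step is not shown), and the function name plan_batches, the glob argument and the --jobs=4 figure are all made up for illustration. It only prints the emerge commands, so they can be inspected before anything is built.

```shell
#!/bin/sh
# Read atoms from stdin in portage's own order and print one emerge command
# per batch. Atoms matching the glob in $1 each get a serial command of their
# own; the atoms between them are grouped into parallel batches.
plan_batches() (
    heavy_glob=$1
    batch=""
    # Print the pending batch, if any, and start a new one.
    flush() {
        if [ -n "$batch" ]; then
            echo "emerge --oneshot --jobs=4$batch"
            batch=""
        fi
    }
    while IFS= read -r atom; do
        [ -n "$atom" ] || continue      # skip blank lines
        case $atom in
            $heavy_glob)
                flush
                echo "emerge --oneshot --jobs=1 $atom"
                ;;
            *)
                batch="$batch $atom"
                ;;
        esac
    done
    flush
)
```

With a list file (here hypothetically called plan.txt), `plan_batches '=net-libs/webkit-gtk-*' < plan.txt` prints the sequence of invocations. Because the atoms stay in the order portage chose, the result keeps portage's sequencing and only changes how many packages build at once.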

A problem arises when splitting up builds as you suggest. Emerge has its
own ideas about what it's going to do - and in what sequence. When you try
to impose a build order not of its making, emerge will often do something
unintuitive and frustrating to you. I've learned to respect its sequencing.
This technique keeps portage happy and predictable by using its sequencing.
It gives me reliable overnight unattended upgrades.

John Blinka

Rich Freeman

2023-09-18 18:50:01 UTC

Whether you just let emerge do its thing or try to get it to do big packages on their own, everything is still going to use the same number of cpu cycles overall and you will save nothing.

That is true of CPU, but not RAM. The problem with large parallel
builds is that for 95% of packages they're fine, and for a few
packages they'll eat up all the RAM in the system until the OOM killer
kicks in, or the system just goes into a swap storm (which can cause
panics with some less-than-perfect kernel drivers).

I'm not aware of any simple solutions. I do have some packages set to
just build with a small number of jobs, but that won't prevent other
packages from being built alongside them. Usually that is enough
though. It is just frustrating to watch a package take all day to
build because I can't use more than -j2 or so without running out of
RAM, usually just at one step of the build process.

I can't see anybody bothering with this, but in theory packages could
have a variable to hint at the max RAM consumed per job, and the max
number of jobs it will run. Then the package manager could take the
lesser of -j and the max jobs the package can run, multiply it by the
RAM requirement, and compare that to available memory (or have a
setting to limit max RAM). Basically treat RAM as a resource and let
the package manager reduce -j to manage it if necessary.
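
The arithmetic in that idea can be sketched as below. Nothing like this exists in portage: the function name effective_jobs and its four arguments (requested -j, the package's maximum useful jobs, the hinted MB of RAM per job, and the MB of memory available) are hypothetical stand-ins for the per-package hints described above.

```shell
# effective_jobs REQUESTED MAX_JOBS MB_PER_JOB AVAILABLE_MB
# Prints the -j value to use. MAX_JOBS and MB_PER_JOB stand in for the
# hypothetical per-package hints; portage has no such variables.
effective_jobs() (
    requested=$1
    max_jobs=$2
    mb_per_job=$3
    available_mb=$4
    # Take the lesser of the user's -j and what the package can use.
    n=$requested
    [ "$max_jobs" -lt "$n" ] && n=$max_jobs
    # If that many jobs would not fit in memory, scale down to what does.
    if [ $((n * mb_per_job)) -gt "$available_mb" ]; then
        n=$((available_mb / mb_per_job))
    fi
    # Never go below a single job, or nothing would build at all.
    [ "$n" -lt 1 ] && n=1
    echo "$n"
)
```

For example, `effective_jobs 20 32 4096 32768` asks for -j20 at an assumed 4 GB per job with 32 GB free, and the function scales that down to 8.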

Hmm, I guess a workaround would be to set ulimits on the portage user
so that emerge is killed before RAM use gets too out of hand. That
won't help complete builds, but it would at least keep it from killing
the system.

--
Rich

William Kenworthy

2023-09-18 22:50:01 UTC

per package env variables?

https://wiki.gentoo.org/wiki//etc/portage/package.env

BillK

Peter Humphrey

2023-09-19 09:10:01 UTC

Post by William Kenworthy
per package env variables?
https://wiki.gentoo.org/wiki//etc/portage/package.env

Apropos of what?

--
Regards,
Peter.

William Kenworthy

2023-09-19 09:20:01 UTC

That is where you set per package compiler parameters by overriding
make.conf settings.

BillK

Post by William Kenworthy
per package env variables?
https://wiki.gentoo.org/wiki//etc/portage/package.env

Apropos of what?

Andreas Fink

2023-09-19 09:40:01 UTC

On Tue, 19 Sep 2023 17:14:42 +0800

Post by William Kenworthy
That is where you set per package compiler parameters by overriding
make.conf settings.
BillK

I would argue that per-package compiler parameters are not what is
needed, because in the example of chromium 99% of the compile can
be done with -j16 on my machine, but for a very short time I would need
to run with -j1, because otherwise I run out of memory.
In short: I want to run with as many jobs as I have cores, as long as
I do not run out of memory, and when I run out of memory I want to run
with as few jobs as possible until the pressure on the memory is
gone. Then I want to continue with as many jobs as possible.

And this is not something that make / ninja provide. They have a
concept of a global number of jobs, which in this case must be set to
the maximum number that your RAM can take during the very short period
where you hit the high watermark on your RAM, but that number would
be way too low for 99% of the compilation time.

FWIW, I have a hacky solution that I use privately, but I never
published it anywhere, because it could break some builds, and at the
moment I'm not ready to support it.

Basically it tries to run with as many jobs as the number of CPU cores
at all times. It watches memory pressure in the background and
kills build jobs as soon as a high watermark is reached.
At this point, make would normally exit, because a build job failed.
However my hacky solution overrides the exec-family of system calls,
and if a job fails, it is being retried exclusively, i.e. no other
build job is allowed to run at the same time as the failed job.
It fails ultimately, when the second and exclusive run fails too.
This way, if the job failed only because of lack of memory, it will be
retried exclusively and succeeds. If it failed due to a programming
error, it will fail also the second time, and then the error is
forwarded to make.
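
The "retry exclusively" step on its own can be sketched with flock(1) from util-linux. This is not the unpublished tool described above: it does not intercept exec calls, watch memory pressure or kill anything, and the function name and lock file argument are hypothetical. It assumes every build job is started through the same wrapper with the same lock file.

```shell
# run_with_exclusive_retry LOCKFILE COMMAND [ARGS...]
# The first attempt runs under a shared lock, so any number of jobs can run
# side by side. If it fails, the job is retried once under an exclusive lock,
# which is only granted when no other job holds the lock.
run_with_exclusive_retry() {
    lockfile=$1
    shift
    if flock -s "$lockfile" "$@"; then
        return 0
    fi
    # Second and final attempt; its exit status is what the caller sees.
    flock -x "$lockfile" "$@"
}
```

A job that failed only for lack of memory gets a second run with the machine to itself, while a genuine error fails twice and is passed on. One caveat: if new jobs keep arriving and taking the shared lock, the exclusive retry may wait a long time before it gets its turn.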

Peter Humphrey

2023-09-19 09:50:02 UTC

(I assume this was addressed to me, though it was a reply to someone else.)

Post by William Kenworthy
That is where you set per package compiler parameters by overriding
make.conf settings.

And which make.conf setting might achieve what I want? Careful reading of the
make.conf man page hasn't revealed anything relevant.

--
Regards,
Peter.

Rich Freeman

2023-09-19 10:10:01 UTC

Post by William Kenworthy
That is where you set per package compiler parameters by overriding
make.conf settings.

And which make.conf setting might achieve what I want? Careful reading of the
make.conf man page hasn't revealed anything relevant.

There isn't one. At best there is -l which regulates jobs by system
load, but there is nothing that takes into account RAM use.

I just use package.env to limit jobs on packages that I know are RAM-hungry.

Right now my list includes:
calligra
qtwebengine
qtwebkit
ceph
nodejs
passwdqc
scipy
pandas
spidermonkey

(It has been ages since I've pruned the list, and of course what is
"too much RAM" will vary.)

The other thing I will tweak is avoiding building in a tmpfs.
Obviously anything that is RAM constrained is a good contender for not
using a tmpfs, but there are also packages that just have really large
build directories that otherwise don't need too much RAM when building.
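
For anyone who has not used package.env, a minimal sketch of both tweaks follows. The file name lowmem.conf, the -j2 value and the /var/tmp/notmpfs directory are assumptions chosen for illustration (the directory would have to exist on a real disk), and only three packages from the list above are shown. The helper writes under whatever directory it is given, so it can be tried in a scratch directory before pointing it at /etc/portage.

```shell
# install_lowmem_env DIR
# Writes a per-package override under DIR (/etc/portage on a real system).
# Assumes DIR/package.env is a plain file rather than a directory.
install_lowmem_env() (
    root=$1
    mkdir -p "$root/env" || exit 1
    # Settings applied only to the packages listed in package.env below.
    cat > "$root/env/lowmem.conf" <<'EOF'
MAKEOPTS="-j2"
PORTAGE_TMPDIR="/var/tmp/notmpfs"
EOF
    # Append, so existing entries are preserved.
    cat >> "$root/package.env" <<'EOF'
dev-qt/qtwebengine lowmem.conf
net-libs/nodejs lowmem.conf
sys-cluster/ceph lowmem.conf
EOF
)
```

This only limits the jobs inside each listed package. As noted earlier in the thread, it does not stop emerge from building other packages alongside them.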

--
Rich

William KENWORTHY

2023-09-19 10:20:01 UTC

MAKEOPTS - for example, I have a laptop that locks up (heat) on long compiles, so I reduce the number of jobs (rust and webgtk). The discussion asks how to control emerge - appropriate per-package -j and -l for the heavy packages should go a long way to doing what you want.

Post by Peter Humphrey
(I assume this was addressed to me, though it was a reply to someone else.)

Post by William Kenworthy
That is where you set per package compiler parameters by overriding
make.conf settings.

And which make.conf setting might achieve what I want? Careful reading of the
make.conf man page hasn't revealed anything relevant.
--
Regards,
Peter.

Peter Humphrey

2023-09-19 12:30:01 UTC

Post by Peter Humphrey
I did read all those but no matter how you move things around you still
have only X resources available all the time.
Whether you just let emerge do its thing or try to get it to do big packages
on their own, everything is still going to use the same number of cpu
cycles overall and you will save nothing.

That isn't the point. The point is that it takes twice as long, and it wastes
the machine's resources while I twiddle my thumbs waiting for it.
Of course.

Post by Peter Humphrey
What you have is not a portage problem. It is an orthodox parallelism
problem, and I think you are thinking your constraint is unique in the world
- it isn't.

No, I think my problem has not been tackled by the portage developers.

Post by Peter Humphrey
With parallelism, trying to fiddle single nodes to improve things overall
never really works out.

See above.

--
Regards,
Peter.

Wol

2023-09-20 22:10:01 UTC

On Mon, Sep 18, 2023 at 3:44 PM Peter Humphrey

It may be less complex than you think, Jack. I envisage a package being marked
as solitary, and when portage reaches that package, it waits until all current
jobs have finished, then it starts the solitary package with the environment
specified for it, and it doesn't start the next one until that one has finished.
The dependency calculation shouldn't need to be changed.
It seems simple the way I see it.

How does that improve emerge performance overall?

By allocating all the system resources to huge packages while not flooding the
system with lesser ones. For example, I can set -j20 for webkit-gtk today
without overflowing the 64GB RAM, and still have 4 CPU threads available to
other tasks. The change I've proposed should make the whole operation more
efficient overall and take less time.
As things stand today, I have to make do with -j12 or so, wasting time and
resources. I have load-average set at 32, so if I were to set -j20 generally
I'd run out of RAM in no time. I've had many instances of packages failing to
compile in a large update, but going just fine on their own; and I've had
mysterious operational errors resulting, I suspect, from otherwise undetected
miscompilation.
Previous threads have more detail of what I've tried already.
I did read all those but no matter how you move things around you still
have only X resources available all the time.
Whether you just let emerge do its thing or try to get it to do big
packages on their own, everything is still going to use the same number
of cpu cycles overall and you will save nothing.

Except a big chunk off your power bill ... a system under stress uses
more energy for the same amount of work.

What you have is not a portage problem. It is an orthodox parallelism
problem, and I think you are thinking your constraint is unique in the
world - it isn't.
With parallelism, trying to fiddle single nodes to improve things
overall never really works out.

A big problem you are missing is that portage does not have control of
the system. It can control its usage of the system, but if I want emerge
to use as much SPARE resource IN THE BACKGROUND as it can without
impacting on on-line responsiveness, that is HARD.

I would like to be able to tell portage "these programs are resource
hogs, don't parallelise them". If portage has loads of little jobs, it
can fire them off one after the other as resource becomes available. If
it fires a hog (or worse, two) off at the same time, the system can
rapidly collapse under load.

Even better, if portage knew roughly how much resource each job
required, it could (within constraints) start with the jobs that
required least resource and run loads of them, and by firing jobs off in
order of increasing demandingness, the number of jobs running in
parallel would naturally tail off.

At the end of the day, if the computer takes an extra 20% time, I'm not
bothered. If I'm sat at the computer 20% time extra because the system
isn't responding because emerge has bogged it down, then I do care. And
when I'm building things like webkit-gtk, llvm, LO, FF and TB, they do
hammer my system. If they're running in parallel, my system would be
near unusable.

Cheers,
Wol

Rich Freeman

2023-09-22 10:10:01 UTC

Sometimes I wish they would announce when they add features. Rich, you
frequent this list. If you hear of something new, could you post it?

Sure, if a relevant topic comes up and I'm aware of it. However, I
doubt this setting is going to do much that nice doesn't already do.

The original focus seemed to be on memory use, and niceness will not
have any impact on the memory use of a build. The only thing that
will is reducing the number of parallel jobs. There really isn't any
way to get portage to regulate memory use short of letting it be
killed (which isn't helpful), maybe letting it be stopped when
things get out of hand (which will help as the memory could at least
be swapped, but the build might not be salvageable without jumping
through a lot of hoops), or if the package maintainer provides some
kind of hinting to the package manager so that it can anticipate how
much memory it will use. Otherwise trying to figure out how much
memory a build system will use without just trying it is like solving
the halting problem.

--
Rich

Peter Humphrey

2023-09-18 13:50:01 UTC

Post by Peter Humphrey
Hello list,
We've had a few discussions here on how to balance the parameters to emerge.
On the one hand, big jobs should be able to use the maximum CPU
performance and RAM capacity, but on the other we don't want to flood the
system.
Therefore, I think it would be useful to be able to specify in env and
package.env that a job should be run on its own - if any other emerge jobs
are scheduled, wait until they're finished. Combine that with a specific
MAKEOPTS, and we'd have a more flexible deployment of resources.
Is this feasible? What have I not thought of?

I've had exactly the same thought for some time now. My guess is that
it is theoretically possible to add some USE flag or ENV var for portage
to recognize, but I don't know the portage internals well enough to
guess how much effort it would be. Given that portage orders ebuilds in
a single emerge session based on some dependency graph, that seems like
a good place to put the necessary hooks.
As a starting point, one option might be to create a special/magic
ebuild and make it a dependency of those jobs that need to be run alone,
and have something about it that won't run if anything else is still
running. But I don't know if those pre-checks (such as checking for
enough RAM and/or disk space) can be run at build time and not just at
portage startup time. The other possible problem with that approach
would be making sure that ebuild gets run separately for each
ebuild that depends on it - not all of them depending on it being run
once. Also, those blocking ebuilds have to work so that if several of them
are queued (and running their "wait for everything else to finish"
scripts), exactly one of them starts. I don't know if those
pre-check scripts count as running before or within the ebuild itself.

It may be less complex than you think, Jack. I envisage a package being marked
as solitary, and when portage reaches that package, it waits until all current
jobs have finished, then it starts the solitary package with the environment
specified for it, and it doesn't start the next one until that one has finished.
The dependency calculation shouldn't need to be changed.

It seems simple the way I see it.

--
Regards,
Peter.

Michael

2023-09-22 07:20:01 UTC