[gentoo-user] Issues with amdgpu driver: Compositor hangs, sysfs not working
Paul Sopka
2024-02-17 19:40:01 UTC
Hello everybody,

I installed an AMD Radeon RX 7900 XTX today, switching from Nvidia. But
once I enable FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=y to have a tty once
the driver is up, the following happens:

1) My Wayland compositor (Hyprland) takes very long to start.

2) reading from sysfs (e.g. running "cat
/sys/class/drm/card0/device/gpu_busy_percent") does not work and causes
a hang.
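
A read that blocks, like the gpu_busy_percent read above, also freezes whatever shell or script issued it. One way to guard against that is to run the read in a child process and give up after a deadline. The following is only a minimal sketch: the function name `read_with_deadline`, the 5-second default and the use of `cat` are assumptions for illustration, not part of any amdgpu tooling. If the child is stuck in uninterruptible sleep inside the kernel, the kill may not take effect until the driver returns, so the sketch deliberately does not wait for it.

```python
import subprocess
import time


def read_with_deadline(path, timeout_s=5.0, cmd=("cat",)):
    """Return the text read from `path`, or None on timeout or read failure.

    The read runs in a child process so a blocked read cannot freeze the caller.
    Intended for small values such as sysfs attributes (output is read only
    after the child has exited).
    """
    proc = subprocess.Popen(
        [*cmd, path],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        text=True,
    )
    deadline = time.monotonic() + timeout_s
    while proc.poll() is None:
        if time.monotonic() >= deadline:
            # Best effort: a task blocked inside the kernel may ignore the kill
            # for a while, so we do not wait for it here.
            proc.kill()
            proc.stdout.close()
            return None
        time.sleep(0.05)
    with proc.stdout:
        output = proc.stdout.read()
    return output.strip() if proc.returncode == 0 else None


# Example call (path taken from the post above):
# read_with_deadline("/sys/class/drm/card0/device/gpu_busy_percent")
```

A return value of None then means "timed out or unreadable", which is enough to tell the hanging configuration from the working one without losing the terminal.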

Once I set FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=n, I have no issues
with the starting speed of the compositors at all and the mentioned
command works. But this leads to a black tty.

The only two error messages from amdgpu I find in dmesg are:

[   66.757500] amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x00000029 SMN_C2PMSG_82:0x00000000
[   66.757502] amdgpu 0000:03:00.0: amdgpu: Failed to disable gfxoff!

and

[  870.087856] amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x00000029 SMN_C2PMSG_82:0x00000000
[  870.087858] amdgpu 0000:03:00.0: amdgpu: Failed to export SMU metrics table!

Did I forget anything or is this a bug?
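
The amdgpu messages quoted in this post can be picked out of a full dmesg dump mechanically. The following is a rough sketch, assuming each kernel message sits on a single line in the "[ timestamp] amdgpu <pci address>: amdgpu: <message>" form seen here (the mail wraps long lines). The function name `amdgpu_errors` and the choice to keep only "SMU:" and "Failed" messages are illustrative assumptions.

```python
import re

# Matches lines such as:
# [   66.757502] amdgpu 0000:03:00.0: amdgpu: Failed to disable gfxoff!
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+amdgpu\s+(\S+):\s+amdgpu:\s+(.*)$")


def amdgpu_errors(dmesg_text):
    """Return (seconds_since_boot, pci_address, message) for SMU and 'Failed' lines."""
    hits = []
    for line in dmesg_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        message = match.group(3)
        if message.startswith("SMU:") or message.startswith("Failed"):
            hits.append((float(match.group(1)), match.group(2), message))
    return hits
```

Fed the output of dmesg (which may need root, depending on the system's settings), this gives a list whose timestamps make it easy to see whether the SMU complaints line up with starting the compositor or with a sysfs read.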
Michael
2024-02-18 09:00:01 UTC
Post by Paul Sopka
Hello everybody,
I installed an AMD Radeon RX 7900 XTX today, switching from Nvidia. But
once I enable FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=y to have a tty once
1) My Wayland compositor (Hyprland) takes very long to start.
2) reading from sysfs (e.g. running "cat
/sys/class/drm/card0/device/gpu_busy_percent") does not work and causes
a hang.
Once I set FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=n, I have no issues
with the starting speed of the compositors at all and the mentioned
command works. But this leads to a black tty.
You'd normally need this enabled to get a fb display on the console, but I
don't know if this would be provided by proprietary drivers instead for your
card - see below.
Post by Paul Sopka
[ 66.757500] amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with your
previous command: SMN_C2PMSG_66:0x00000029 SMN_C2PMSG_82:0x00000000
[ 66.757502] amdgpu 0000:03:00.0: amdgpu: Failed to disable gfxoff!
and
[ 870.087856] amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with your
previous command: SMN_C2PMSG_66:0x00000029 SMN_C2PMSG_82:0x00000000
[ 870.087858] amdgpu 0000:03:00.0: amdgpu: Failed to export SMU metrics
table!
Did I forget anything or is this a bug?
It could be both. I don't think there's any Linux firmware released yet for
this card - but I don't follow the latest & greatest so I could be wrong.
You'd need the AMD amdgpu-pro on top of the amdgpu driver, to bring in the
proprietary OpenGL, OpenCL, Vulkan and AMF components:

https://wiki.gentoo.org/wiki/AMDGPU-PRO

This is what's in portage today:

~ $ eix -l amdgpu-pro
* dev-libs/amdgpu-pro-opencl
     Available versions:
     ~    20.40.1147286 ^fmsd [ABI_X86="32 64"] ["|| ( abi_x86_32 abi_x86_64 )"]
     Homepage:            https://www.amd.com/en/support/kb/release-notes/rn-amdgpu-unified-linux-20-40
     Description:         Proprietary OpenCL implementation for AMD GPUs

* media-libs/amdgpu-pro-vulkan
     Available versions:
     ~    21.50.2.1384496-r1 ^md [ABI_X86="32 64" VIDEO_CARDS="amdgpu"] ["video_cards_amdgpu"]
     ~    22.10.4.1452060-r1 ^md [ABI_X86="32 64" VIDEO_CARDS="amdgpu"] ["video_cards_amdgpu"]
     ~    22.20.5.1511376-r1 ^md [ABI_X86="32 64" VIDEO_CARDS="amdgpu"] ["video_cards_amdgpu"]
     ~    22.40.6.1580631-r1 ^md [ABI_X86="32 64" VIDEO_CARDS="amdgpu"] ["video_cards_amdgpu"]
     ~    23.10.3.1620044-r1 ^md [ABI_X86="32 64" VIDEO_CARDS="amdgpu"] ["video_cards_amdgpu"]
     ~    23.20.0.1654522-r1 ^md [ABI_X86="32 64" VIDEO_CARDS="amdgpu"] ["video_cards_amdgpu"]
     Homepage:            https://www.amd.com/en/support
     Description:         AMD's closed source vulkan driver, from Radeon Software for Linux

* media-video/amdgpu-pro-amf
     Available versions:
     ~    1.4.24.1452059 ^md
     ~    1.4.26.1511376 ^md
     ~    1.4.29.1580631 ^md
     ~    1.4.30.1620044 ^md
     ~    1.4.31.1654522 (0/31)^md
     Homepage:            https://www.amd.com/en/support
     Description:         AMD's closed source Advanced Media Framework (AMF) driver

Found 3 matches
Paul Sopka
2024-02-18 09:20:01 UTC
Thank you for your reply.
Post by Michael
Post by Paul Sopka
Hello everybody,
I installed an AMD Radeon RX 7900 XTX today, switching from Nvidia. But
once I enable FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=y to have a tty once
1) My Wayland compositor (Hyprland) takes very long to start.
2) reading from sysfs (e.g. running "cat
/sys/class/drm/card0/device/gpu_busy_percent") does not work and causes
a hang.
Once I set FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=n, I have no issues
with the starting speed of the compositors at all and the mentioned
command works. But this leads to a black tty.
You'd normally need this enabled to get a fb display on the console, but I
don't know if this would be provided by proprietary drivers instead for your
card - see below.
I made a mistake here, sorry. The setting causing the issue is
DRM_FBDEV_EMULATION=y, which by itself works with the open source
driver, but causes issues as soon as I start Hyprland.
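
Since two different kernel options have now come up in this thread, it can help to check what the running kernel was actually built with. The following is a small sketch under some assumptions: /proc/config.gz exists only when the kernel has CONFIG_IKCONFIG_PROC enabled (otherwise point it at the .config used for the build, e.g. under /usr/src/linux), and the function names are made up for this example.

```python
import gzip
import re
from pathlib import Path

# The two options discussed in this thread.
OPTIONS = ("DRM_FBDEV_EMULATION", "FRAMEBUFFER_CONSOLE_DETECT_PRIMARY")


def option_states(config_text, options=OPTIONS):
    """Map each option to its value ('y', 'm', ...) or 'n' if unset or absent."""
    states = {name: "n" for name in options}
    for line in config_text.splitlines():
        # Set options look like CONFIG_FOO=y; unset ones appear only as
        # "# CONFIG_FOO is not set" comments and therefore stay at 'n'.
        match = re.match(r"^CONFIG_(\w+)=(.*)$", line)
        if match and match.group(1) in states:
            states[match.group(1)] = match.group(2)
    return states


def load_config_text(path="/proc/config.gz"):
    """Read a kernel config file, transparently handling a .gz suffix."""
    config_path = Path(path)
    if config_path.suffix == ".gz":
        with gzip.open(config_path, "rt") as handle:
            return handle.read()
    return config_path.read_text()


# Example call:
# print(option_states(load_config_text()))
```

Comparing the output between the "black tty" kernel and the "slow Hyprland" kernel confirms which of the two options really differs.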
Post by Michael
It could be both. I don't think there's any Linux firmware released yet for
this card - but I don't follow the latest & greatest so I could be wrong.
You'd need the AMD amdgpu-pro on top of the amdgpu driver, to bring in the
https://wiki.gentoo.org/wiki/AMDGPU-PRO
[snip ...]
The firmware seems good, since it is loaded just fine; "dmesg | grep
amdgpu | grep firmware" returns:
[   16.905914] Loading firmware: amdgpu/psp_13_0_0_sos.bin
[   16.905916] Loading firmware: amdgpu/psp_13_0_0_ta.bin
[   16.905917] Loading firmware: amdgpu/smu_13_0_0.bin
[   16.905917] Loading firmware: amdgpu/dcn_3_2_0_dmcub.bin
[   16.905918] Loading firmware: amdgpu/gc_11_0_0_pfp.bin
[   16.905919] Loading firmware: amdgpu/gc_11_0_0_me.bin
[   16.905919] Loading firmware: amdgpu/gc_11_0_0_rlc.bin
[   16.905920] Loading firmware: amdgpu/gc_11_0_0_mec.bin
[   16.905921] Loading firmware: amdgpu/gc_11_0_0_imu.bin
[   16.905922] Loading firmware: amdgpu/sdma_6_0_0.bin
[   16.905923] Loading firmware: amdgpu/vcn_4_0_0.bin
[   16.906095] Loading firmware: amdgpu/gc_11_0_0_mes_2.bin
[   16.906096] Loading firmware: amdgpu/gc_11_0_0_mes1.bin
[   16.906496] amdgpu 0000:03:00.0: amdgpu: Will use PSP to load VCN firmware
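
As a cross-check on the list above, the requested firmware names can be compared against what is installed on disk. This is only a sketch under assumptions: it relies on the "Loading firmware: <name>" message format shown in this output, looks under /lib/firmware, and also accepts .xz and .zst variants because the kernel can load compressed firmware when built with that support. If the firmware is built into the kernel image instead, the files need not exist on disk at all, so a "missing" result is not conclusive. The function names are made up for this example.

```python
import re
from pathlib import Path

FW_RE = re.compile(r"Loading firmware:\s+(\S+)")
# Plain files plus the compressed variants a suitably configured kernel accepts.
SUFFIXES = ("", ".xz", ".zst")


def requested_firmware(dmesg_text):
    """Return firmware names from 'Loading firmware: ...' lines, in order."""
    return FW_RE.findall(dmesg_text)


def missing_firmware(names, root="/lib/firmware"):
    """Return the names that have no matching file (plain or compressed) under root."""
    base = Path(root)
    return [
        name
        for name in names
        if not any((base / (name + suffix)).exists() for suffix in SUFFIXES)
    ]
```

An empty list from missing_firmware() is consistent with the observation that the firmware itself is not the problem here.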

Also, the Mesa libraries work just fine: if I set
DRM_FBDEV_EMULATION=n, I just get a black tty, but Hyprland starts and I
can play games with the expected performance.
Michael
2024-02-18 10:50:02 UTC
Post by Paul Sopka
Thank you for your reply.
Post by Michael
Post by Paul Sopka
Hello everybody,
I installed an AMD Radeon RX 7900 XTX today, switching from Nvidia. But
once I enable FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=y to have a tty once
1) My Wayland compositor (Hyprland) takes very long to start.
2) reading from sysfs (e.g. running "cat
/sys/class/drm/card0/device/gpu_busy_percent") does not work and causes
a hang.
Once I set FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=n, I have no issues
with the starting speed of the compositors at all and the mentioned
command works. But this leads to a black tty.
You'd normally need this enabled to get a fb display on the console, but I
don't know if this would be provided by proprietary drivers instead for
your card - see below.
I made a mistake here, sorry. The setting causing the issue is
DRM_FBDEV_EMULATION=y, which by itself works with the open source
driver, but causes issues as soon as I start Hyprland.
I have an older AMD card here, using amdgpu only. If I disable
DRM_FBDEV_EMULATION I lose my framebuffer and end up with a black screen.
When the sddm display manager starts I have a GUI again to log in to Plasma
with. This is to be expected in my case, because I rely on the KMS driver
(KMS FB helpers) to provide a framebuffer device. Unless an AMD proprietary
driver is available via amdgpu-pro to substitute for the KMS FB emulation,
you won't get a framebuffer device to render your tty console.
Post by Paul Sopka
Post by Michael
It could be both. I don't think there's any Linux firmware released yet
for this card - but I don't follow the latest & greatest so I could be
wrong. You'd need the AMD amdgpu-pro on top of the amdgpu driver, to
https://wiki.gentoo.org/wiki/AMDGPU-PRO
[snip ...]
Post by Paul Sopka
The firmware seems good, since it is loaded just fine, "dmesg | grep
[ 16.905914] Loading firmware: amdgpu/psp_13_0_0_sos.bin
[ 16.905916] Loading firmware: amdgpu/psp_13_0_0_ta.bin
[ 16.905917] Loading firmware: amdgpu/smu_13_0_0.bin
[ 16.905917] Loading firmware: amdgpu/dcn_3_2_0_dmcub.bin
[ 16.905918] Loading firmware: amdgpu/gc_11_0_0_pfp.bin
[ 16.905919] Loading firmware: amdgpu/gc_11_0_0_me.bin
[ 16.905919] Loading firmware: amdgpu/gc_11_0_0_rlc.bin
[ 16.905920] Loading firmware: amdgpu/gc_11_0_0_mec.bin
[ 16.905921] Loading firmware: amdgpu/gc_11_0_0_imu.bin
[ 16.905922] Loading firmware: amdgpu/sdma_6_0_0.bin
[ 16.905923] Loading firmware: amdgpu/vcn_4_0_0.bin
[ 16.906095] Loading firmware: amdgpu/gc_11_0_0_mes_2.bin
[ 16.906096] Loading firmware: amdgpu/gc_11_0_0_mes1.bin
[ 16.906496] amdgpu 0000:03:00.0: amdgpu: Will use PSP to load VCN
firmware
These are for the amdgpu driver. I expect the amdgpu-pro proprietary driver
contains additional firmware.
Post by Paul Sopka
Also the mesa libraries work just fine,
Mesa is the open source implementation of OpenGL, Vulkan, et al. graphics API
specifications. If you are using proprietary AMD drivers then I understand
all the graphics API instructions will go through these proprietary drivers,
instead of being translated by Mesa.
Post by Paul Sopka
if I set
DRM_FBDEV_EMULATION=n, I just get a black tty, but Hyprland starts and I
can play games with the expected performance.
I am not sure how the fbdev emulation in the kernel works with amdgpu-pro
when combined with Hyprland. Have you tried a different compositor to see how
it compares? If your problem is caused by some Hyprland bug, you'd soon know.
Paul Sopka
2024-02-18 11:40:01 UTC
AMDGPU-PRO is not a driver, but a set of libraries providing OpenCL,
Vulkan and the Advanced Media Framework. It operates on top of amdgpu.
Post by Michael
Mesa is the open source implementation of OpenGL, Vulkan, et al. graphics API
specifications. If you are using proprietary AMD drivers then I understand
all the graphics API instructions will go through these proprietary drivers,
instead of being translated by Mesa.
As you said, it could be seen as an alternative to Mesa, but it will not
change anything about the firmware or the amdgpu driver. It therefore
will not change anything about the framebuffer, since this is handled by
the driver.

That said, I do not have the described problems when starting gamescope
from a tty, so my guess now is that it's a Hyprland issue. I filed a bug
there. I will also try to start Hyprland using amdgpu-pro instead of Mesa.

Thank you for your time
