Trennen Sie die GPU von den Radeon-Treibern, ohne lightdm neu zu starten

Trennen Sie die GPU von den Radeon-Treibern, ohne lightdm neu zu starten

Ich verwende ein Bash-Skript, um meine GPU für KVM PCI-Passthrough neu zu binden, aber ich muss lightdm stoppen, um es von den Radeon-Treibern zu trennen bzw. zu binden. Wenn ich lightdm nicht stoppe, bleibt das gesamte System nach ein paar Sekunden hängen und ich kann mich nicht einmal per SSH anmelden, um zu sehen, was los ist. Es muss eine Möglichkeit geben, die Treiber sicher zu trennen. Ich verwende Kernel 4.1.6, da 4.2 derzeit PCI-Passthrough unterbricht.

Ich habe versucht, den Radeon-Treiber vor dem Aufheben der Bindung zu entfernen, aber das hat nicht funktioniert.

modprobe --remove-dependencies radeon

Ich vermute, dies liegt daran, dass es von diesen verwendet wird, die aus irgendeinem Grund nicht entfernt wurden:

lsmod | grep radeon
radeon               1589248  0
ttm                    94208  1 radeon
i2c_algo_bit           16384  2 i915,radeon
drm_kms_helper        126976  2 i915,radeon
drm                   352256  7 ttm,i915,drm_kms_helper,radeon

Es gab viele Stacktraces wie diesen. Einige von sysfs/group.c und der Rest von drm. Es sieht so aus, als wäre das ein Problem mit der Speicherverwaltung. Ich bin nicht sicher, wie ich die Bindung richtig aufheben würde.

WARNING: CPU: 3 PID: 10935 at /home/kernel/COD/linux/drivers/gpu/drm/radeon/radeon_object.c:83 radeon_ttm_bo_destroy+0xea/0xf0 [radeon]()
Modules linked in: pci_stub joydev binfmt_misc arc4 nls_iso8859_1 eeepc_wmi asus_wmi sparse_keymap ath9k ath9k_common intel_rapl iosf_mbi amdkfd x86_pkg_temp_thermal intel_powerclamp snd_hda_codec_realtek amd_iommu_v2 snd_hda_co$
CPU: 3 PID: 10935 Comm: echo Tainted: G        W       4.1.6-040106-generic #201508170230
Hardware name: ASUS All Series/Z97-E/USB 3.1, BIOS 0403 04/07/2015
ffffffffc08d62a0 ffff88010656fa38 ffffffff817d1363 0000000000000000
0000000000000000 ffff88010656fa78 ffffffff81079c3a ffff88012d9d1ec0
ffff880220a6f868 ffff880220a6f800 0000000000002480 ffff880220a6f868
Call Trace:
[<ffffffff817d1363>] dump_stack+0x45/0x57
[<ffffffff81079c3a>] warn_slowpath_common+0x8a/0xc0
[<ffffffff81079d2a>] warn_slowpath_null+0x1a/0x20
[<ffffffffc07bf5ba>] radeon_ttm_bo_destroy+0xea/0xf0 [radeon]
[<ffffffffc042e4d9>] ttm_bo_release_list+0xa9/0x180 [ttm]
[<ffffffffc04351e0>] ? ttm_bo_man_put_node+0x40/0x50 [ttm]
[<ffffffffc042e6cd>] ttm_bo_release+0x11d/0x2b0 [ttm]
[<ffffffff81507816>] ? __dev_printk+0x46/0xa0
[<ffffffffc042e889>] ttm_bo_unref+0x29/0x30 [ttm]
[<ffffffffc07bfada>] radeon_bo_unref+0x2a/0x50 [radeon]
[<ffffffffc07d4cdb>] radeon_gem_object_free+0x4b/0x50 [radeon]
[<ffffffffc00254a7>] drm_gem_object_free+0x27/0x30 [drm]
[<ffffffffc07bff78>] radeon_bo_force_delete+0x128/0x130 [radeon]
[<ffffffffc07d4ebe>] radeon_gem_fini+0xe/0x10 [radeon]
[<ffffffffc083ebad>] si_fini+0xbd/0x110 [radeon]
[<ffffffffc07a1612>] radeon_device_fini+0x42/0x140 [radeon]
[<ffffffffc07a3d40>] radeon_driver_unload_kms+0x50/0x70 [radeon]
[<ffffffffc002a8cd>] drm_dev_unregister+0x2d/0xc0 [drm]
[<ffffffffc002af87>] drm_put_dev+0x27/0x80 [drm]
[<ffffffffc079f295>] radeon_pci_remove+0x15/0x20 [radeon]
[<ffffffff8140193f>] pci_device_remove+0x3f/0xc0
[<ffffffff8150b297>] __device_release_driver+0x87/0x120
[<ffffffff8150b353>] device_release_driver+0x23/0x30
[<ffffffff8150a04d>] unbind_store+0xbd/0xe0
[<ffffffff81509484>] drv_attr_store+0x24/0x40
[<ffffffff8127478d>] sysfs_kf_write+0x3d/0x50
[<ffffffff81273c3a>] kernfs_fop_write+0x12a/0x180
[<ffffffff811f8d98>] __vfs_write+0x28/0x100
[<ffffffff811fba19>] ? __sb_start_write+0x49/0xf0
[<ffffffff81320993>] ? security_file_permission+0x23/0xa0
[<ffffffff811f9499>] vfs_write+0xa9/0x1b0
[<ffffffff817d6f66>] ? mutex_lock+0x16/0x37
[<ffffffff811fa2a6>] SyS_write+0x46/0xb0
[<ffffffff81067240>] ? do_page_fault+0x30/0x80
[<ffffffff817d8f32>] system_call_fastpath+0x16/0x75

Hier ist mein aktuelles Skript für alle, die es interessiert. (Ausführen von einer TTY-Konsole außerhalb von Xsession)

#!/bin/bash

read -n3 -rsp "Restart lightdm to unbind the GPU? [yes] " res
test "$res" != 'yes' && exit 1
echo

sudo service lightdm stop
sudo echo "1002 683d" > /sys/bus/pci/drivers/vfio-pci/new_id
sudo echo "1002 aab0" > /sys/bus/pci/drivers/vfio-pci/new_id
sudo echo "0000:01:00.0" > /sys/bus/pci/devices/0000:01:00.0/driver/unbind
sudo echo "0000:01:00.1" > /sys/bus/pci/devices/0000:01:00.1/driver/unbind
sudo echo "0000:01:00.0" > /sys/bus/pci/drivers/vfio-pci/bind
sudo echo "0000:01:00.1" > /sys/bus/pci/drivers/vfio-pci/bind
sudo echo "1002 683d" > /sys/bus/pci/drivers/vfio-pci/remove_id
sudo echo "1002 aab0" > /sys/bus/pci/drivers/vfio-pci/remove_id
sudo service lightdm start

echo "Rebind Audio"
sudo modprobe pci_stub
sudo echo "8086 8ca0" > /sys/bus/pci/drivers/pci-stub/new_id
sudo echo "0000:00:1b.0" > /sys/bus/pci/drivers/snd_hda_intel/unbind
sudo echo "0000:00:1b.0" > /sys/bus/pci/drivers/pci-stub/bind
sudo echo "8086 8ca0" > /sys/bus/pci/drivers/pci-stub/remove_id

# Check if VM drive is mounted
if ! grep -qs '/media/ljosalfur/VM' /proc/mounts; then
echo "Attempting to mount VM drive. I don't know how though."
#sudo mkdir /media/ljosalfur/VM
#sudo mount /dev/disk/by-id/0BD253F0-EF7F-6F40-BDD8-FABF85161762 /media/ljosalfur/VM
fi

sudo kvm -monitor stdio -vnc :0 \
-m 6G -mem-path /dev/hugepages \
-drive if=pflash,format=raw,file=./OVMF.fd -rtc base=localtime \
-cpu host -smp 6,sockets=1,cores=6,threads=1 \
-device vfio-pci,host=01:00.0,multifunction=on,x-vga=on \
-device vfio-pci,host=01:00.1 \
-device pci-assign,host=00:1b.0 \
-drive file=/media/ljosalfur/VM/vm7.img,format=raw,cache=writethrough \
-smb /media/ljosalfur \
-usb -usbdevice host:046d:c24a -show-cursor \
-usb -usbdevice host:1b1c:1b08

echo
echo "Re-Rebind Audio"
sudo echo "0000:00:1b.0" > /sys/bus/pci/drivers/pci-stub/unbind
sudo echo "0000:00:1b.0" > /sys/bus/pci/drivers/snd_hda_intel/bind

echo "Unbind GPU from vfio-pci"
sudo echo "0000:01:00.0" > /sys/bus/pci/drivers/vfio-pci/unbind
sudo echo "0000:01:00.1" > /sys/bus/pci/drivers/vfio-pci/unbind

read -n3 -rsp "Restart lightdm to rebind the GPU? [yes] " ress
test "$ress" != 'yes' && (exit 1)
echo
sudo echo "0000:01:00.0" > /sys/bus/pci/drivers/radeon/bind

Antwort1

Mithilfe der Informationen in diesem Thread habe ich endlich eine Möglichkeit gefunden, dies zu tun:https://www.reddit.com/r/VFIO/comments/41pm1q/my_bash_script_for_rebinding_a_secondary_nvidia/

Das funktionierende Skript lautet:

#!/bin/bash
# Which device and which related HDMI audio device. They're usually in pairs.
export VGA_DEVICE=0000:01:00.0
export AUDIO_DEVICE=0000:01:00.1
 
export VGA_DRIVER=radeon
export AUDIO_DRIVER=snd_hda_intel
 
# Passing through USB devices. Querying bus address and feeding that to QEMU
# instead of the device ID, so you can yank and replug the keyboard to regain
# control.
export KEYBOARD="1b1c:1b08"
export MOUSE="046d:c24a"
 
# Unbinds a device and loads the driver specified.
flipdriver() {
    dev="$1"
    driver="$2"
 
    if [ -z $driver ] | [ -z $dev ];
    then
        return 1
    fi
 
    vendor=$(cat /sys/bus/pci/devices/${dev}/vendor)
    device=$(cat /sys/bus/pci/devices/${dev}/device)
 
    echo -n Unbinding $vendor:$device ...
 
    if [ -e /sys/bus/pci/devices/${dev}/driver ]; then
        echo ${dev} > /sys/bus/pci/devices/${dev}/driver/unbind
        while [ -e /sys/bus/pci/devices/${dev}/driver ]; do
            sleep 0.5
            echo -n .
        done
    fi
    echo " OK!"
 
    echo -n Binding \'$driver\' to $vendor:$device ...
    echo ${vendor} ${device} > /sys/bus/pci/drivers/${driver}/new_id
 
    echo " OK!"
 
    return 0
}
 
# Common error message
fliperror()
{
    echo "Couldn\'t perform required driver switch\'n\'bait!"
    exit 1
}
 
# Xorg shouldn't run.
if [ -n "$( ps -C xinit | grep xinit )" ];
then
    echo "Don\'t run this inside Xorg!"
    exit 1
fi
 
# Unbind specified graphics card and audio device.
echo "Pulling the plug on the specified passthrough devices..."
flipdriver $VGA_DEVICE vfio-pci
flipdriver $AUDIO_DEVICE vfio-pci
 
export QEMU_PA_SAMPLES=128
export QEMU_AUDIO_DRV=alsa
 
# Get the bus addresses for keyboard and mouse.
export QEMU_KEYB=$( lsusb | sed -n 's/Bus \([0-9]*\) Device \([0-9]*\): ID '$KEYBOARD'.*/-device usb-host,bus=xhci.0,hostbus=\1,hostaddr=\2/p' )
export QEMU_MOUS=$( lsusb | sed -n 's/Bus \([0-9]*\) Device \([0-9]*\): ID '$MOUSE'.*/-device usb-host,bus=xhci.0,hostbus=\1,hostaddr=\2 -show-cursor/p' )
 
# Check if VM drive is mounted
if ! grep -qs '/media/user/VM' /proc/mounts; then
echo "Attempting to mount VM drive."
sudo mount /dev/sdc1
fi
 
#network stuff
tunctl -t vmtap10
ip link set dev tap10 address 42:42:42:42:42:10
ifconfig vmtap10 192.168.42.1 up
echo 1 > /proc/sys/net/ipv4/ip_forward
iptables -t nat -A POSTROUTING -o eno1 -j MASQUERADE
iptables -A FORWARD -i vmtap10 -j ACCEPT
iptables -A FORWARD -o vmtap10 -m state --state RELATED,ESTABLISHED -j ACCEPT
 
#pactl set-sink-volume 2 50%
 
echo Starting virtual machine...
sleep 0.2
 
# QEMU stuff
sudo kvm -monitor stdio -vnc :1 -vga none \
-drive if=pflash,format=raw,file=./OVMF.fd -rtc base=localtime \
-m 4G -mem-path /dev/hugepages \
-cpu host -smp sockets=1,cores=6,threads=1 \
-soundhw ac97 \
-device vfio-pci,host=01:00.0 \
-usb -usbdevice host:046d:c215 \
-usb -device nec-usb-xhci,id=xhci \
$QEMU_KEYB \
$QEMU_MOUS \
-net nic,macaddr=42:42:42:42:42:42 -net tap,ifname=vmtap10,script=no,downscript=no,vhost=on \
-drive file=/media/user/VM/vm10.img,format=qcow2,cache=writeback \
-smb /media/ljosalfur \
-cdrom /home/user/Downloads/virtio-win-0.1.105.iso
 
# Rebind the devices for the host.
echo Adios vfio, reloading the host drivers for the passedthrough devices...
flipdriver $AUDIO_DEVICE $AUDIO_DRIVER
flipdriver $VGA_DEVICE $VGA_DRIVER
 
iptables -F
iptables -t nat -F POSTROUTING
ip link delete vmtap10

verwandte Informationen