I didn't see any errors when installing Docker with the get-docker.sh script from https://docs.docker.com/engine/install/ubuntu/#install-using-the-convenience-script.
But when running Docker containers, I saw this error:
Error response from daemon: AppArmor enabled on system but the docker-default profile could not be loaded: running '/usr/sbin/apparmor_parser -Kr /var/lib/docker/tmp/docker-default116736515' failed with output: apparmor_parser: Unable to replace "docker-default". apparmor_parser: Access denied. You need policy admin privileges to manage profiles.
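Any container start reproduces it, not just a specific image; for example, even the smallest test container fails with the same AppArmor message:
docker run --rm hello-world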
What I've tried that did not work:
- Adding the nesting flag through the Proxmox UI.
- Adding the lxc.apparmor.profile: unconfined line to /etc/pve/lxc/<id>.conf.
AppArmor Error Solution
For most Docker Compose stacks, we need to add:
security_opt:
  - apparmor:unconfined
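For a one-off docker run (instead of Compose), the equivalent is the --security-opt flag; a minimal sketch, with <image> as a placeholder:
docker run -d --security-opt apparmor=unconfined <image>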
For example, for dockge:
services:
  dockge:
    image: louislam/dockge:1
    restart: unless-stopped
    ports:
      # Host Port : Container Port
      - 5001:5001
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - ./data:/app/data
      # If you want to use private registries, you need to share the auth file with Dockge:
      # - /root/.docker/:/root/.docker
      # Stacks Directory
      # ⚠️ READ IT CAREFULLY. If you did it wrong, your data could end up writing into a WRONG PATH.
      # ⚠️ 1. FULL path only. No relative path (MUST)
      # ⚠️ 2. Left Stacks Path === Right Stacks Path (MUST)
      - /opt/stacks:/opt/stacks
    environment:
      # Tell Dockge where your stacks directory is
      - DOCKGE_STACKS_DIR=/opt/stacks
    security_opt:
      - apparmor:unconfined
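After adding security_opt, redeploy the stack so the change takes effect; assuming the compose file lives in /opt/stacks/dockge (adjust to your own path):
cd /opt/stacks/dockge
docker compose up -d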
What may also work (if you don't want to add security_opt to each Docker Compose stack):
- Add these two lines to the LXC config:
lxc.cgroup2.devices.allow: a
lxc.cap.drop:
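After editing /etc/pve/lxc/<id>.conf, give the container a full stop/start from the Proxmox host so the new lines take effect (replace <id> with the container ID):
pct stop <id>
pct start <id>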
Quickly create 1000:1000 user
When using an Ubuntu LXC, if the Docker container was originally hosted on an Ubuntu VM, we may be using 1000:1000 as the UID:GID when running the container. Use this script to quickly set up the 1000:1000 user and group in the LXC (an Ubuntu LXC does not create the 1000:1000 user and group by default) so we can keep using exactly the same Docker Compose YAML.
#!/bin/bash
set -e

# Prompt for the new user name and password
echo "Please provide the information for the new user."
read -p "Enter the name of the new user: " USERNAME
echo "Enter the password of the new user: "
stty -echo
read PASSWORD
stty echo
echo

echo "Adding a user '$USERNAME' with UID=1000, GID=1000..."

# Add a user with UID=1000, GID=1000 and set the password
groupadd -g 1000 "$USERNAME"
useradd -u 1000 -g 1000 -m -s /bin/bash "$USERNAME"
echo "$USERNAME:$PASSWORD" | chpasswd
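To double-check that the new user and group really got ID 1000, query them afterwards:
getent passwd 1000
getent group 1000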
Last Resort
From my previous experience with Proxmox, the following settings are the ultimate solution.
In the LXC config file on Proxmox:
lxc.cgroup2.devices.allow: c 226:0 rwm
lxc.cgroup2.devices.allow: c 226:128 rwm
lxc.cgroup2.devices.allow: c 29:0 rwm
lxc.mount.entry: /dev/dri dev/dri none bind,optional,create=dir
lxc.mount.entry: /dev/fb0 dev/fb0 none bind,optional,create=file
These magic numbers (226, 128, 29) should match the major/minor device numbers shown by ls -al /dev/dri/*.
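A sketch of what that output typically looks like (the exact nodes and numbers will differ per host; 226 is the usual major number for DRM devices, while the 29 comes from /dev/fb0):
ls -al /dev/dri/*
# crw-rw---- 1 root video  226,   0 ... /dev/dri/card0
# crw-rw---- 1 root render 226, 128 ... /dev/dri/renderD128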
- Related note: Set up LXC
WARNING
All these hacks on the 'device' bindings require a privileged container and can be a source of risk, so for those services it is probably better not to expose them to the Internet.
If, like me, you have multiple GPUs (Nvidia + Intel iGPU), use the script below to confirm which device is which:
NOTE
This is for the case where my Nvidia GPU has not been passed through to a VM. Once passthrough is configured (IOMMU enabled, drivers blacklisted, bound to VFIO), the device should no longer exist in PVE at all.
- Very early in the Proxmox host's boot, the vfio-pci driver grabs the NVIDIA GPU first.
- The cardX and renderDXXX nodes for that NVIDIA card will therefore not appear under /dev/dri/ on the host.
#!/bin/bash

# Loop through all entries in the /sys/class/drm directory
for dev_path in /sys/class/drm/*; do
# Extract the base name (e.g., card1, renderD128, card1-DP-1)
dev_name=$(basename "$dev_path")
#
# >> CORE IMPROVEMENT <<
# Only proceed if the device name matches the pattern for a primary
# device (cardX or renderDXXX). This filters out ports.
#
if [[ "$dev_name" =~ ^(card|renderD)[0-9]+$ ]]; then
# Double-check that the 'device' symlink exists to be safe
if [ -L "$dev_path/device" ]; then
# 1. Get the full path of the device node
full_dev_node="/dev/dri/$dev_name"
# 2. Get the PCI address by reading the symlink
pci_address=$(readlink "$dev_path/device" | awk -F'/' '{print $NF}')
# 3. Get the full description from lspci, including the PCI address
description=$(lspci -s "$pci_address")
# Print the structured output using printf for alignment
printf "Device Path: %s\n" "$full_dev_node"
printf "PCI Address: %s\n" "$pci_address"
printf "Description: %s\n" "$description"
echo "-----------------------------------------------------------------"
fi
fi
done
Output on my new server:
Device Path: /dev/dri/card1
PCI Address: 0000:01:00.0
Description: 01:00.0 VGA compatible controller: NVIDIA Corporation GA104 [GeForce RTX 3070] (rev a1)
-----------------------------------------------------------------
Device Path: /dev/dri/card2
PCI Address: 0000:00:02.0
Description: 00:02.0 VGA compatible controller: Intel Corporation CometLake-S GT2 [UHD Graphics 630] (rev 03)
-----------------------------------------------------------------
Device Path: /dev/dri/renderD128
PCI Address: 0000:01:00.0
Description: 01:00.0 VGA compatible controller: NVIDIA Corporation GA104 [GeForce RTX 3070] (rev a1)
-----------------------------------------------------------------
Device Path: /dev/dri/renderD129
PCI Address: 0000:00:02.0
Description: 00:02.0 VGA compatible controller: Intel Corporation CometLake-S GT2 [UHD Graphics 630] (rev 03)
-----------------------------------------------------------------
So, if I don't plan to pass through the NVIDIA card, I need to hand card2 and renderD129 into the container; to be safe, I just pass all of them in.
root@pve:~# cat /etc/pve/lxc/205.conf
arch: amd64
cores: 2
dev0: /dev/dri/card1,gid=44
dev1: /dev/dri/renderD128,gid=104
dev2: /dev/dri/card2,gid=44
dev3: /dev/dri/renderD129,gid=104
features: mount=cifs,nesting=1
hostname: jellyfin
memory: 2048
net0: name=eth0,bridge=vmbr0,firewall=1,gw=192.168.1.1,hwaddr=BC:24:11:6B:40:73,ip=192.168.1.225/24,type=veth
ostype: ubuntu
rootfs: local-zfs-nvme:subvol-205-disk-0,size=16G
swap: 512
tags: lan;media
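To confirm the device nodes actually show up inside the container, check from the Proxmox host (205 is my container ID, replace it with yours):
pct exec 205 -- ls -l /dev/dri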
In the Docker Compose file:
devices:
  # pass in all devices (including card1, card2, renderD128, renderD129)
  - /dev/dri/:/dev/dri/
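To verify the GPUs are also visible from inside the Docker container, a quick check (assuming the Compose service is named jellyfin, as in my stack) is:
docker compose exec jellyfin ls -l /dev/dri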
NOTE
Jellyfin is fairly smart here. If you have multiple GPUs and hand them all over, it will automatically pick the right hardware based on the decoding method you configure (e.g. selecting QuickSync). That is really a feature any native installation needs anyway; only Docker or LXC users think about passing in a precise, limited set of hardware, while a desktop deployment will of course often have multiple GPUs.
That said, if you ultimately want stability, passing the Nvidia card through to a VM is the safer route; wrestling with the Nvidia driver inside PVE is not a good choice in my opinion.