There is actually an earlier note on this [proxmox-community-script-and-nvidia-driver-support-baked-in], but that one is for LXCs running on an Nvidia GPU.

After weighing the options again, I decided to use the Intel iGPU for this job instead. The Nvidia GPU is going to be passed through to a VM for workloads that can actually use its VRAM.

On top of that, I have recently been rebuilding my home server [2025-all-in-one-home-server-project—prologue], so this note records the pitfalls I ran into along the way.

NOTE

The approach documented here has, frankly, some serious security holes, so keep it to the internal network.

Also, for the record: in August 2025 I tried the community script and found that ffmpeg would not run, so I could not get the video streams working.

When installing the ‘frigate’ LXC from the community script, I found that the install script requires a Debian 11 LXC template. But Proxmox 9 now only offers Debian 12 in ‘pveam available’, so I can’t use the ‘pveam’ CLI tool to download the template. And the LXC installer script only falls back to the ‘pveam’ CLI when the template is not found locally.

2025-08-18-Monday: BUT the installer script actually does not check local templates either, which leads to a dead end for Frigate. Luckily there is a PR to fix it: https://github.com/community-scripts/ProxmoxVE/pull/6844. It has not been merged yet, but I expect it will be soon.

To download the Debian 11 template manually:

# the folder where templates are stored
cd /var/lib/vz/template/cache
# download this.
wget http://download.proxmox.com/images/system/debian-11-standard_11.7-1_amd64.tar.zst
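
After downloading, you can confirm that Proxmox sees the template (here I assume local is your template storage; adjust if yours differs):

pveam list local
# should include: local:vztmpl/debian-11-standard_11.7-1_amd64.tar.zst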

LXC Preparation

First, use this command to see how many GPUs your PVE host can see:

for dev_path in /sys/class/drm/*; do
  dev_name=$(basename "$dev_path")
  if [[ "$dev_name" =~ ^(card|renderD)[0-9]+$ ]]; then
    # Double-check that the 'device' symlink exists to be safe
    if [ -L "$dev_path/device" ]; then
      full_dev_node="/dev/dri/$dev_name"
      pci_address=$(readlink "$dev_path/device" | awk -F'/' '{print $NF}')
      description=$(lspci -s "$pci_address")
      printf "Device Path: %s\n" "$full_dev_node"
      printf "PCI Address: %s\n" "$pci_address"
      printf "Description: %s\n" "$description"
      echo ""
    fi
  fi
done

If the NVIDIA GPU were already passed through, the passed-through card would not show up here.

I haven't done the passthrough yet, so I can see both cards: the 3070 and the iGPU:

Device Path: /dev/dri/card1
PCI Address: 0000:01:00.0
Description: 01:00.0 VGA compatible controller: NVIDIA Corporation GA104 [GeForce RTX 3070] (rev a1)
 
Device Path: /dev/dri/card2
PCI Address: 0000:00:02.0
Description: 00:02.0 VGA compatible controller: Intel Corporation CometLake-S GT2 [UHD Graphics 630] (rev 03)
 
Device Path: /dev/dri/renderD128
PCI Address: 0000:01:00.0
Description: 01:00.0 VGA compatible controller: NVIDIA Corporation GA104 [GeForce RTX 3070] (rev a1)
 
Device Path: /dev/dri/renderD129
PCI Address: 0000:00:02.0
Description: 00:02.0 VGA compatible controller: Intel Corporation CometLake-S GT2 [UHD Graphics 630] (rev 03)

Take note of which /dev/dri/card{?} and /dev/dri/renderD{?} belong to the iGPU; they are needed in the LXC config below.

vim /etc/pve/lxc/<CTID>.conf
# Content:
arch: amd64
cores: 2
dev0: /dev/dri/card2,gid=44 # <- card{?}, here my iGPU is card2, so use card2
dev1: /dev/dri/renderD129,gid=104 # <- renderD{?}, here my iGPU is renderD129, so use renderD129
features: mount=cifs,nesting=1
hostname: frigate
memory: 4096
net0: name=eth0,bridge=vmbr0,firewall=1,gw=192.168.1.1,hwaddr=<MAC>,ip=<CIDR>,type=veth
ostype: ubuntu
rootfs: local-zfs-nvme:subvol-<CTID>-disk-0,size=16G
swap: 512
tags: media

Usually these would be card0 and renderD128. In my case the NVIDIA GPU hasn't been passed through yet, so there are two cards; that said, I don't actually know why mine starts from 1 instead of 0.
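
The gid=44 and gid=104 on dev0/dev1 correspond to the video and render groups inside my container. These IDs can differ between templates, so verify yours before copying (run inside the LXC):

getent group video render
# e.g.:
# video:x:44:
# render:x:104: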

Then some fairly routine settings:

  • If you need to mount NFS or SAMBA, enable it under Features.
  • I still recommend a privileged container. Unprivileged might work too, but I haven't properly tested it.
    • If the container is already unprivileged and testing shows it doesn't work, you can convert it via backup, remove, restore, choosing privileged at restore time (see the sketch after this list).
  • Nesting definitely needs to be enabled.
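
A minimal sketch of that backup-remove-restore conversion, assuming CTID 105 and the default local backup storage (IDs and paths are illustrative; restore replaces the container, so double-check before running):

# 1. back up the container (stop mode for a consistent snapshot)
vzdump 105 --storage local --mode stop
# 2. stop and remove the unprivileged container
pct stop 105
pct destroy 105
# 3. restore as privileged; add --storage <target> if not restoring to the default
pct restore 105 /var/lib/vz/dump/vzdump-lxc-105-<timestamp>.tar.zst --unprivileged 0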

LXC Template: Ubuntu 24.04

Docker compose

services:
  frigate:
    container_name: frigate
    privileged: true # this may not be necessary for all setups
    restart: unless-stopped # <- we will revisit it later
    stop_grace_period: 30s # allow enough time to shut down the various services
    image: ghcr.io/blakeblackshear/frigate:0.16.0
    shm_size: 512mb # update for your cameras based on the calculation in Frigate's docs
    devices:
      #  - /dev/bus/usb:/dev/bus/usb # Passes the USB Coral, needs to be modified for other versions
      #  - /dev/apex_0:/dev/apex_0 # Passes a PCIe Coral, follow driver instructions here https://coral.ai/docs/m2/get-started/#2a-on-linux
      #  - /dev/video11:/dev/video11 # For Raspberry Pi 4B
      # - /dev/dri/renderD128:/dev/dri/renderD128 # Normally, this is the one to use.
      - /dev/dri/renderD129:/dev/dri/renderD129 # But for my case, I need to use this one.
      # - /dev/dri:/dev/dri # this is the last resort: map all devs
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /root/frigate/config:/config
      - /root/frigate/storage:/media/frigate
      - type: tmpfs
        target: /tmp/cache
        tmpfs:
          size: 1000000000
    ports:
      - 8971:8971
      - 5000:5000 # Internal unauthenticated access. Expose carefully.
      - 8554:8554 # RTSP feeds
      - 8555:8555/tcp # WebRTC over tcp
      - 8555:8555/udp # WebRTC over udp
networks: {}

What matters here is simply that the device handed to Docker matches the device passed through at the LXC layer.

  • In fact only renderD129 is actually used, so the earlier card2 may not need to be passed in via the LXC config at all. I never tested whether omitting it works, though.
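
Before blaming Docker, it's worth confirming the device actually arrived at the LXC layer (run inside the container):

ls -l /dev/dri
# expect something like:
# crw-rw---- 1 root video  226,   2 ... card2
# crw-rw---- 1 root render 226, 129 ... renderD129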

Frigate Config

mqtt: # <- do this if you use Frigate<>HAOS integration.
  host: <your_home_assistant_ip_or_fqdn>
  user: frigate
  password: <pass>
  port: 1883
  topic_prefix: frigate
  client_id: frigate
 
#############################
# Global Detection Settings #
#############################
detect:
  enabled: true
objects:
  track:
    - person
    - bear
    - bird
    - cat
    - dog
    - car
  filters:
    car:
      min_score: 0.55
      threshold: 0.76
audio:
  enabled: true
  min_volume: 200
  listen:
    - babbling
    - crying
    - scream
 
##########################
# Global Record Settings #
##########################
record:
  enabled: true
snapshots:
  enabled: true
review:
  alerts:
    labels:
      - crying
      - scream
      - yell
      - cough
      - mouse
      - bear
 
###########
# Streams #
###########
go2rtc:
  streams:
    driveway_sd:
      - rtsp://haha:heihei@<ip_cam>:554/stream2
      - ffmpeg:driveway_sd#audio=aac#hardware
 
###########
# Cameras #
###########
cameras:
  driveway:
    enabled: true
    ffmpeg:
      inputs:
        - path: rtsp://127.0.0.1:8554/driveway_sd?video&audio
          input_args: preset-rtsp-restream
          roles:
            - detect
            - audio
        - path: rtsp://127.0.0.1:8554/driveway_hd?video&audio
          input_args: preset-rtsp-restream
          roles:
            - record
      output_args:
        record: preset-record-generic-audio-aac
    review:
      alerts:
        labels:
          - bear
          - dog
          - cat
          - mouse
    motion:
      mask: # <- usually these are created from the UI, but they are persisted in the yaml and can be reused across Frigate versions.
        - 0.357,0,0,0,0,0.049,0.357,0.057
        - 0.525,0.822,0.564,0.819,0.568,0.896,0.526,0.901
 
# Example of a dummy camera feed which is actually a video clip.
#  test:
#    enabled: false
#    ffmpeg:
#      #hwaccel_args: preset-vaapi
#      inputs:
#        - path: /media/frigate/exports/driveway_npcy9r.mp4
#          input_args: -re -stream_loop -1 -fflags +genpts
#          roles:
#            - detect
#            - rtmp
#    detect:
#      height: 1080
#      width: 1920
#      fps: 5
#    audio:
#      enabled: false
#    objects:
#      track:
#        - bear
 
##################################
# Detector and Models and FFmpeg #
##################################
detectors:
  ov:
    type: openvino
    device: GPU # <- this and the 'openvino' type will tell Frigate to use iGPU for detection.
    model_path: /openvino-model/ssdlite_mobilenet_v2.xml # <- note that this changed from 0.14.0; if the model_path does not exist, Frigate can crash and your container will likely keep restarting.
model:
  width: 300
  height: 300
  input_tensor: nhwc
  input_pixel_format: bgr
  labelmap_path: /openvino-model/coco_91cl_bkgr.txt
ffmpeg:
  hwaccel_args: preset-intel-qsv-h264 # <- tell Frigate to use the iGPU (QSV) for H.264 decoding (if your camera resolution isn't too high, this isn't strictly needed)
version: 0.16-0 # <- easy to miss: if you copy a config from an older Frigate version, do check this line.
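
To sanity-check that detection and decoding really land on the iGPU, I watch the engine load on the PVE host with intel_gpu_top (from the intel-gpu-tools package; package name assumed for a Debian-based PVE):

apt install intel-gpu-tools
intel_gpu_top
# roughly: Render/3D busy -> OpenVINO detection; Video busy -> QSV decoding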
 

Mount external storage for video clips

An LXC is best kept lightweight, and unlike the config, the recordings themselves can often tolerate some data loss. So the recordings should most likely live on remote storage.

Let's assume we mount an NFS share. There are really two options here:

  • Define a volume with driver type nfs directly in the Docker compose YAML (see the compose sketch after this list).
    • The upside: the mount lives alongside the rest of the compose stack definition, which makes it fairly portable, and docker compose will perform the mount automatically before actually starting the container.
    • The downside: low transparency when troubleshooting. Docker's logs have to be dug out via journalctl -u docker.service, and if you need to tweak mount options, spinning the container(s) up over and over wastes too much time.
    • Also, this method still requires nfs-common on the host, so it is not as portable as it might seem.
      • The same approach works for SAMBA too, but then cifs-utils is required.
    • Plaintext passwords in the YAML are another downside. Docker secrets could fix that, but it adds even more hassle; not worth it.
  • Mount via fstab, then manually adjust the systemd config to guarantee the mount completes before the container starts.
    • Personally I think this approach, although it looks more involved, is actually more workable.
      • It's a homelab; you will be troubleshooting often, and then the reproducibility and transparency of debugging each step in isolation really matter.
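
For completeness, a minimal sketch of option 1, assuming the same NFS export used below (the volume name frigate_media is illustrative; I did not end up using this):

services:
  frigate:
    # ... rest of the service definition ...
    volumes:
      - frigate_media:/media/frigate
volumes:
  frigate_media:
    driver: local
    driver_opts:
      type: nfs
      o: addr=<YOUR_NFS_SERVER_IP>,nfsvers=4
      device: ":/mnt/tank/nvr/frigate"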

NFS mount

In your LXC shell (I’m using Ubuntu 24.04):

apt install nfs-common
 
mkdir /mnt/frigate
 
vim /etc/fstab
# Example Content (assume your NFS server exports storage at /mnt/tank/nvr/frigate):
# UNCONFIGURED FSTAB FOR BASE SYSTEM
<YOUR_NFS_SERVER_IP>:/mnt/tank/nvr/frigate /mnt/frigate  nfs  defaults,_netdev,auto 0 0
 
mount -a
 
# /mnt/frigate should be ready to use.
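
To confirm the mount before wiring it into systemd:

findmnt /mnt/frigate
# TARGET        SOURCE                                      FSTYPE OPTIONS
# /mnt/frigate  <YOUR_NFS_SERVER_IP>:/mnt/tank/nvr/frigate  nfs4   rw,...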

Use systemd to manage the docker compose dependency on the NFS mount

First, we must remove restart: always or restart: unless-stopped from the docker compose YAML and move all of the restart logic into systemd.

After editing the YAML, you can bring the frigate stack up again.

Then we create a systemd unit:

# vim /etc/systemd/system/frigate-docker.service
 
[Unit]
Description=Frigate Docker Compose Service
# v: this 'mnt-frigate' is auto generated from the mount at '/mnt/frigate'
Requires=mnt-frigate.mount 
After=mnt-frigate.mount network-online.target
 
[Service]
# Location of the docker compose file. I'm using dockge and by default the yaml is stored here:
WorkingDirectory=/opt/stacks/frigate
 
# v: do not use `-d` here.
ExecStart=/usr/bin/docker compose up
ExecStop=/usr/bin/docker compose down
 
Restart=always
RestartSec=5s
 
[Install]
WantedBy=multi-user.target
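
The mnt-frigate.mount unit name follows systemd's path-escaping rules; if your mount point sits deeper in the tree, derive the name instead of guessing:

systemd-escape -p --suffix=mount /mnt/frigate
# -> mnt-frigate.mount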

NOTE

There is a gotcha here. If Restart is set to on-failure, then "save and restart" after editing the config in the Frigate UI will not bring Frigate back up, presumably because the UI restart stops the container cleanly (exit code 0), which on-failure does not treat as a failure. So be sure to use always here.

Then:

systemctl daemon-reload
systemctl enable --now frigate-docker.service

There is one more optional setting: x-systemd.device-timeout=15s.

  • If the mount hangs for 15 seconds, it can basically be declared a failure.
  • After it fails, docker compose simply won't start.
  • This is really just a small fail-faster optimization; without it nothing breaks and the system won't lock up, you just wait longer.

<YOUR_NFS_SERVER_IP>:/mnt/tank/nvr/frigate /mnt/frigate  nfs  defaults,_netdev,x-systemd.device-timeout=15s,auto 0 0

Now reboot and test; it should just work. Afterwards, don't forget to update the storage paths in the docker compose file.
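
What I check after the reboot (the ordering worked if the mount is active and the container came up on its own):

systemctl status mnt-frigate.mount frigate-docker.service
docker ps --filter name=frigate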

Appendix

For CIFS/Samba mounts

Jellyfin's media mount is handled the same way, only with a Samba mount instead:

/etc/fstab:

//<SMB_SERVER_IP>/jellyfin /mnt/media cifs credentials=/root/.smb_credentials,uid=1000,gid=1000,vers=3.0,iocharset=utf8,_netdev,x-systemd.device-timeout=15s 0 0
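
The credentials file referenced above keeps the password out of fstab; cifs-utils expects this format (values are placeholders):

# contents of /root/.smb_credentials
username=<smb_user>
password=<smb_pass>
# optional: domain=<domain>

# and lock it down:
chmod 600 /root/.smb_credentials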

/etc/systemd/system/jellyfin-docker.service:

# cat /etc/systemd/system/jellyfin-docker.service
[Unit]
Description=Jellyfin Docker Compose Service
# 'mnt-media' is auto generated from the mount at '/mnt/media'
Requires=mnt-media.mount 
After=mnt-media.mount network-online.target
 
[Service]
# Location of the docker compose file. I'm using dockge and by default the yaml is stored here:
WorkingDirectory=/opt/stacks/jellyfin
 
# v: do not use `-d` here.
ExecStart=/usr/bin/docker compose up
ExecStop=/usr/bin/docker compose down
 
Restart=on-failure
RestartSec=5s
 
[Install]
WantedBy=multi-user.target

Also, for Jellyfin both uid and gid are 1000, so we need to make sure 1000:1000 actually exists in the container. (For Ubuntu, 1000:1000 is the most ordinary choice.)

groupadd -g 1000 jellyfin
useradd -u 1000 -g 1000 -m -s /bin/bash jellyfin