Troubleshooting and Workaround Notes: CT Fails to Start in PVE

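PVE 6 could not be installed directly on one PVE node, so PVE 5 was installed first and then upgraded to 6 by following the "PVE 5 to 6 upgrade" notes. After the upgrade everything on this node works normally, except that the CTs on it cannot be started. However, CTs that are already running on other nodes in the cluster start and run normally after being migrated to this node in Restart Mode. The following records the search for why CTs fail to start in this unusual situation.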

  • On this node, CT 112 is the container to be started
  • Start it with the following command, writing the log messages to /tmp/t.log
    lxc-start -n 112 --logfile /tmp/t.log
    root@TP-PVE-249:/var/log# lxc-start -n 112 --logfile /tmp/t.log
    lxc-start: 112: lxccontainer.c: wait_on_daemonized_start: 852 Received container state "ABORTING" instead of "RUNNING"
    lxc-start: 112: tools/lxc_start.c: main: 308 The container failed to start
    lxc-start: 112: tools/lxc_start.c: main: 311 To get more details, run the container in foreground mode
    lxc-start: 112: tools/lxc_start.c: main: 314 Additional information can be obtained by setting the --logfile and --logpriority options
    cat /tmp/t.log
    root@TP-PVE-249:/var/log# cat /tmp/t.log
    lxc-start 112 20200721031900.352 ERROR    conf - conf.c:lxc_create_tmp_proc_mount:2990 - Permission denied - Failed to mount proc in the container
    lxc-start 112 20200721031900.353 ERROR    conf - conf.c:lxc_setup:3391 - Failed to "/proc" LSMs
    lxc-start 112 20200721031900.353 ERROR    start - start.c:do_start:1231 - Failed to setup container "112"
    lxc-start 112 20200721031900.353 ERROR    sync - sync.c:__sync_wait:41 - An error occurred in another process (expected sequence number 5)
    lxc-start 112 20200721031900.353 ERROR    start - start.c:__lxc_start:1957 - Failed to spawn container "112"
    lxc-start 112 20200721031900.353 ERROR    lxccontainer - lxccontainer.c:wait_on_daemonized_start:852 - Received container state "ABORTING" instead of "RUNNING"
    lxc-start 112 20200721031900.360 ERROR    lxc_start - tools/lxc_start.c:main:308 - The container failed to start
    lxc-start 112 20200721031900.360 ERROR    lxc_start - tools/lxc_start.c:main:311 - To get more details, run the container in foreground mode
    lxc-start 112 20200721031900.360 ERROR    lxc_start - tools/lxc_start.c:main:314 - Additional information can be obtained by setting the --logfile and --logpriority options
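  • The root cause has still not been found, but a CT starts successfully after it is restored from a backup file
  • So the current practice is: before this node is rebooted, manually back up every CT; after boot, restore each CT from its backup so that it can start normally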
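
The backup-then-restore workaround above could be scripted roughly as follows. This is only a minimal sketch and not the procedure actually used on this node: the names DUMP_DIR, DRY_RUN, run, backup_ct, restore_ct, latest_archive and backup_all are made up for illustration, the dump directory is assumed to be /var/lib/vz/dump, and the vzdump / pct options should be checked against the installed PVE version. With DRY_RUN=1 (the default here) the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch of the "back up before reboot, restore after boot" workaround.
# DUMP_DIR and DRY_RUN are hypothetical settings; adjust them to your setup.
DUMP_DIR="${DUMP_DIR:-/var/lib/vz/dump}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, anything else = run them

# Print the command in dry-run mode, otherwise execute it
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Back up one CT; stop mode is assumed so the archive is consistent
backup_ct() {
    run vzdump "$1" --mode stop --compress lzo --dumpdir "$DUMP_DIR"
}

# Restore one CT over its existing definition from the given archive.
# Depending on the setup, an explicit --storage option may also be needed.
restore_ct() {
    run pct restore "$1" "$2" --force
}

# Newest backup archive for a CT (assumes the default vzdump-lxc-<id>-* naming)
latest_archive() {
    ls -t "$DUMP_DIR"/vzdump-lxc-"$1"-*.tar.* 2>/dev/null | head -n 1
}

# Back up every CT on this node; the first line of "pct list" is a header
backup_all() {
    for id in $(pct list | awk 'NR > 1 { print $1 }'); do
        backup_ct "$id"
    done
}
```

Before a reboot one would call backup_all (or backup_ct 112 for a single CT), and after boot something like restore_ct 112 "$(latest_archive 112)" for each CT. Running it once with DRY_RUN=1 first shows exactly which commands would be issued.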
  • tech/pve-ct-err.txt
  • Last modified: 2021/01/16 14:41
  • jonathan