I’ve noticed that sometimes when a particular VM/ service is having issues, they all seem to hang. For example, I have a VM hosting my DNS (pihole) and another hosting my media server (jellyfin). If Jellyfin crashes for some reason, my internet in the entire house also goes down because it seems DNS is unable to be reached for a minute or so while the Jellyfin VM recovers.

Is this expected, and is there a way to prevent it?

  • apigban@lemmy.dbzer0.com
    link
    fedilink
    English
    arrow-up
    4
    arrow-down
    1
    ·
    edit-2
    1 year ago

    I’d check high I/O wait, specially if your all of the vms are on HDDs.

    one of the solution I had for this issue was to have multiple DNS servers. solved it by buying a raspberry pi zero w and running a 2nd small instance of pihole there. I made sure that the piZeroW is plugged on a separate circuit in my home.

    • root@lemmy.worldOP
      link
      fedilink
      English
      arrow-up
      2
      ·
      1 year ago

      Good point. I just checked and streaming something to my TV causes IO delay to spike to like 70%. I’m also wondering if maybe me routing my Jellyfin (and some other things) through NGINX (also hosted on Proxmox) has something to do with it… Maybe I need to allocate more resources to NGINX(?)

      The system running Proxmox has a couple Samsung Evo 980s in it, so I don’t think they would be the issue.

      • apigban@lemmy.dbzer0.com
        link
        fedilink
        English
        arrow-up
        1
        arrow-down
        1
        ·
        1 year ago

        I had this issue when I used kubernetes, sata SSDs cant keep up, not sure what Evo 980 is and what it is rated for but I would suggest shutting down all container IO and do a benchmark using fio.

        my current setup is using proxmox, rusts configured in raid5 on a NAS, jellyfin container.

        all jf container transcoding and cache is dumped on a wd750 nvme, while all media are store on the NAS (max. BW is 150MBps)

        you can monitor the IO using IOstat once you’ve done a benchmark.