Xen
I run a few large Xen hosts. At the moment I am working with five hosts, each with 192 to 384 GB of RAM, 10 TB of storage in a 20-disk RAID 10 array, and four gigabit ethernet interfaces. I have gotten over 200 guests to run on a single machine, but I found that it is difficult to get more than 120 guests to run reliably.

Debugging Xen can be tricky, especially on huge hosts with lots of CPU cores and lots of RAM. Guests can run for days with no trouble and then suddenly have problems without warnings in dmesg or /var/log/. Stress testing can help, but with vast amounts of RAM and lots of CPU power it can take a long time before a problem will reveal itself. Also, artificial loads never seem to quite replicate the obscure, subtle bugs you get after running real loads for a few days. Better stress testing tools might help, but I have not found anything specific to testing virtual machines. My own test tools usually amount to spinning up hundreds of guests at once and then running cpuburn on the guests. It's harder to test networking, yet this is often where I see the most problems. The total network bandwidth through the host may actually be moderate, but the number of interrupts hitting a device driver in dom0 may be so great that it falls down. Again, it usually takes a few days for this to happen, which is not amenable to regression testing. A similar problem happens with the virtual disk block driver. The dom0 disk throughput may be moderate, but lots of IO operations multiplied by a hundred guests can cause the virtual block driver in dom0 to exceed timeouts. I found that tuning the host to handle guest disk IO is the most difficult problem to overcome.
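A rough sketch of that kind of stress test is below. It assumes the xm toolstack and one .cfg file per guest under /etc/xen/; running cpuburn inside each guest (for example over ssh) is left out since that depends on how your guests are reachable.

# Start every guest defined under /etc/xen/ (adjust the glob to your naming scheme).
for cfg in /etc/xen/*.cfg; do
    xm create "$cfg"
done
# Then run cpuburn (burnP6, burnK7, etc.) inside each guest to load the CPUs.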

Problem: xen guests slow to a crawl for no reason

You may have restricted your guest to using only a few cores. You may have also accidentally pinned it to specific cores. This eventually leads to situations where the guest is so tightly bound that it does not have enough "wiggle room" to get unstuck.

You may have a guest config file under /etc/xen/guest.cfg with lines something like this:

cpus = '0-3'
vcpus = '4'

The counterintuitive solution is to let the guest have access to all physical cores. This example assumes you are running a 24-core host:

cpus = '0-23'
vcpus = '24'

The important thing is the cpus setting. You may set the vcpus setting lower, but my guests seem to run better with access to all cpus. I found that it rarely makes sense to second-guess the kernel. It typically does a good job.
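To see how a guest's vcpus are currently pinned, or to re-pin them live, the xm toolstack has vcpu-list and vcpu-pin subcommands (the guest name below is a placeholder):

xm vcpu-list guest       # the "CPU Affinity" column shows the current pinning
xm vcpu-pin guest 0 all  # allow vcpu 0 of the guest to run on any physical core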

a note about GRUB_CMDLINE_XEN names in /etc/default/grub

I have seen documentation spell Xen boot options with and without underscores. I am not sure if the system will accept both, or if one style is a newer convention, or what. Beware.
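For example, both of these styles appear elsewhere on this page:

GRUB_CMDLINE_XEN="dom0maxvcpus=4 dom0vcpuspin dom0_mem=8192M,max:8192M"
GRUB_CMDLINE_XEN="dom0_max_vcpus=4 dom0_mem=4G,max:4G no-bootscrub"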

xen version headaches

Xen can be very finicky to get running. Generally later versions are preferred. This may seem obvious, but later versions are much easier to get working than earlier versions. The downside is that many tools for Xen are quite brittle and are strongly dependent on a specific version of Xen. Anything that depends on scripts in /etc/xen/scripts/ is bound to break between different versions of Xen. Unfortunately, lots of tools seem to have this weakness. One of the more popular tools for working with Xen, xen-tools, is particularly guilty of tight version coupling. It is also itself fairly buggy.

Beware.

Problem: dom0 can't handle too much memory

It may seem odd that your host can have too much RAM, but huge amounts of RAM seem to confuse the Xen dom0. In my case I was working with a server with 384 GB of RAM; the physical machine had more memory than dom0 could handle. The solution is to restrict the amount of memory the Xen dom0 can use. This is set in the GRUB boot configuration.

Here is what you are likely to see. You try to boot your Xen host and it locks up during boot with a message like this:

FATAL: Error inserting dm_mod (/lib/modules/2.6.32-5-xen-amd64/kernel/drivers/md/dm-mod.ko): Cannot allocate memory
done.
Begin: Waiting for root file system ... done
Gave up waiting for root device.

You can restrict the amount of RAM for dom0 by editing the grub.cfg or by editing /etc/default/grub on Debian/Ubuntu systems. I also like to pin a few cores for dom0. That is, I like to reserve CPU only for dom0 use.

The grub.cfg should have a line similar to this:

    multiboot   /xen-4.0-amd64.gz placeholder

It should be modified to something like this:

    multiboot   /xen-4.0-amd64.gz placeholder dom0_mem=8192M,max:8192M dom0maxvcpus=4 dom0vcpuspin

See also: http://wiki.xen.org/wiki/Xen_Best_Practices#Xen_dom0_dedicated_memory_and_preventing_dom0_memory_ballooning and http://wiki.debian.org/Xen#Other_configuration_tweaks

The exact operations needed to update grub.cfg will vary from platform to platform. On modern Ubuntu systems you edit /etc/default/grub and then run update-grub. On an ancient Debian 6 system I did this:

dpkg-divert --divert /etc/grub.d/08_linux_xen --rename /etc/grub.d/20_linux_xen
sed -i -e '$aGRUB_CMDLINE_XEN="dom0maxvcpus=4 dom0vcpuspin dom0_mem=8192M,max:8192M"' /etc/default/grub
update-grub
sed -i -e 's/(enable-dom0-ballooning .*)/(enable-dom0-ballooning no)/' -e 's/(dom0-min-mem .*)/(dom0-min-mem 8192)/' /etc/xen/xend-config.sxp
reboot

Problem: dom0 can't free RAM to run guests

You might see an error like this while starting a guest:

Error: Not enough free memory and enable-dom0-ballooning is False, so I cannot release any more.  I need 8421376 KiB but only have 130924.

Ballooning causes trouble in machines with lots of RAM, yet turning it off causes dom0 to take all the RAM for itself, leaving nothing for the guests. This is another instance where Xen behaves badly on large systems.

The fix is simple: disable dom0 ballooning in /etc/xen/xend-config.sxp and also set the Xen boot parameters in GRUB to limit the amount of RAM dom0 is allowed to use. See the section titled #Problem: dom0 can't handle too much memory.

The most annoying part about this is that part of the fix must be done in /etc/xen/xend-config.sxp and part of it must be done in the GRUB config at boot. It seems like these fundamental memory parameters should all be in one place.
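For reference, after the fix the relevant lines in /etc/xen/xend-config.sxp end up looking like this (the same settings the sed command in the previous section writes):

(enable-dom0-ballooning no)
(dom0-min-mem 8192)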

Problem: scrubbing free RAM takes forever

Scrubbing free RAM at boot is a security hardening step, but on a host with a huge amount of RAM it will cause the boot to take forever. If your host is for your sole use then this security step can probably be skipped. Add no-bootscrub to GRUB_CMDLINE_XEN to disable it; this will significantly increase the boot speed.

GRUB_CMDLINE_XEN="dom0_max_vcpus=4 dom0_mem=4G,max:4G no-bootscrub"

Error: Dom0 dmesg log shows 'page allocation failure' or 'Out of memory: kill process:' or 'invoked oom-killer:' messages

Yes, these are vague symptoms, but I found that setting vm.min_free_kbytes to a higher value seemed to help. This may be partly precipitated by turning off dom0 ballooning and setting a fixed amount of dedicated memory. Note that this can happen even if dom0 has free RAM and swap. If you have lots of guests, I think their I/O demands (disk and/or network) cause the dom0 kernel to run out of wiggle room. Edit /etc/sysctl.conf and set the following option to reserve 128 MB for the kernel.

vm.min_free_kbytes = 131072

You can update this live with the following command.

sysctl vm.min_free_kbytes=131072

Problem: dom0 takes forever to shutdown (XENDOMAINS_SAVE)

The problem is that by default Xen attempts to save the running state of each guest before the host is allowed to shut down. If you have hundreds of guests this will take a very, very long time. If you have more RAM than free disk on your dom0, then not only will host shutdown take a long time, it will also fill the disk. I almost never need this feature, it uses a lot of disk space, and when I shut down the host I usually don't care about the guests.

Edit /etc/default/xendomains and set XENDOMAINS_SAVE to be empty. This controls the feature that allows Xen to save the guest's running state when dom0 is shutdown.

#XENDOMAINS_SAVE=/var/lib/xen/save
XENDOMAINS_SAVE=""

Problem: xend won't start

The host and dom0 seem to boot fine, but guests cannot start because xend is not running. I found that this happened when my dom0 ran out of disk space. For me the solution was, "don't run out of disk space".

Problem: guests start to have erratic networking, part 1

If too many guests share the same bridged interface then their networking may become slow and erratic. This can happen even if the total traffic over the dom0 physical interface is low, although the problem shows itself more frequently when traffic load is high. I am not sure of the cause of this problem. It may be due to congestive collapse; or the dom0 virtual networking drivers may have undocumented limitations; or the Linux virtual bridge driver may have undocumented limitations. I have not found unusual syslog or dmesg messages, although I have done only limited testing, so I may have missed something.

This erratic networking problem only seems to happen when over 100 guests are on the same bridge. Splitting the guests across different bridges, each with its own physical interface, eliminates this problem. (Note that it is possible that creating many separate virtual bridges assigned to the same physical interface will also eliminate this problem, but I have not tested this.) Assigning guests to different bridges creates an extra task in provisioning and managing a host, but the task is not difficult. Each /etc/xen/guest.cfg file must be configured to associate it with one of the available bridges. I have not found a way to automatically spread the guests evenly over the available bridges. So far the only solution I have is a simple script that rewrites each .cfg file to reassign each guest to a bridge round-robin style, as sketched below. This requires the guests to be restarted.
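A minimal sketch of that, assuming bridges named xenbr0 through xenbr3 and a bridge= entry already present on the vif line of each guest config (adjust names and paths to match your setup):

# Each /etc/xen/guest.cfg names its bridge on the vif line, something like:
#   vif = [ 'mac=00:16:3e:xx:xx:xx, bridge=xenbr0' ]
# This loop reassigns the bridge round-robin across the guest configs.
bridges=(xenbr0 xenbr1 xenbr2 xenbr3)
i=0
for cfg in /etc/xen/*.cfg; do
    b=${bridges[$((i % ${#bridges[@]}))]}
    sed -i -e "s/bridge=[^',]*/bridge=$b/" "$cfg"
    i=$((i + 1))
done
# Restart the guests afterwards for the new bridge assignments to take effect.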

Problem: guests start to have erratic networking, part 2

I also found that erratic networking may occur when dom0 runs low on disk space. I have not tested this and I have only seen this problem a few times since it is unusual for dom0 to run low on disk. I noticed this after restarting dom0 with XENDOMAINS_SAVE turned on. The memory images of all the guests exceeded the dom0 disk capacity. After rebooting the dom0 I paid no attention to its boot messages, but soon I noticed that its guests seemed to have network troubles.

See also #Problem: dom0 takes forever to shutdown (XENDOMAINS_SAVE).
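A quick sanity check before shutting down a host with XENDOMAINS_SAVE enabled is to compare the free space in the save directory against the total memory of the running guests (the path below is the default mentioned above):

df -h /var/lib/xen/save
xm list    # the Mem column shows how much RAM each guest would need saved to disk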

Error: physdev match: using --physdev-out in the OUTPUT, FORWARD and POSTROUTING chains for non-bridged traffic is not supported anymore.

If you see the following message in dmesg or /var/log/kern.log

Error: physdev match: using --physdev-out in the OUTPUT, FORWARD and POSTROUTING chains for non-bridged traffic is not supported anymore.

then you probably need to patch /etc/xen/scripts/vif-common.sh and edit the function frob_iptable() so that it looks like the function below. You need to add the --physdev-is-bridged option to iptables in two places.

frob_iptable()
{
  if [ "$command" == "online" ]
  then
    local c="-I"
  else
    local c="-D"
  fi

  iptables "$c" FORWARD -m physdev --physdev-is-bridged --physdev-in "$vif" "$@" -j ACCEPT \
    2>/dev/null &&
  iptables "$c" FORWARD -m state --state RELATED,ESTABLISHED -m physdev \
    --physdev-is-bridged --physdev-out "$vif" -j ACCEPT 2>/dev/null

  if [ "$command" == "online" -a $? -ne 0 ]
  then
    log err "iptables setup failed. This may affect guest networking."
  fi
}

install Xen on Ubuntu

aptitude -q -y install xen-hypervisor-4.1-amd64 xen-tools xen-utils-4.1 xenstore-utils  

generic /etc/default/grub settings

This is a good starting place for values for grub in /etc/default/grub. I show only the values that I typically change. After modifying this file you need to run update-grub.

GRUB_DEFAULT=3
GRUB_HIDDEN_TIMEOUT_QUIET=false
GRUB_TIMEOUT=10
GRUB_CMDLINE_LINUX_DEFAULT=""
GRUB_CMDLINE_LINUX="apparmor=0"
GRUB_DISABLE_OS_PROBER=true
GRUB_CMDLINE_XEN="dom0_max_vcpus=4 dom0_mem=4G,max:4G no-bootscrub"
GRUB_CMDLINE_XEN_DEFAULT=""