A large Sun Fire server with hardware domains allows many isolated systems to be created. Zones achieve this in software and are far more flexible: it is easy to move individual CPUs between zones as needed, or to configure more sophisticated ways to share CPUs and memory.
There are two general zone types to pick from during zone creation:
Small zone - (also known as a “Sparse Root zone”): The default. This consumes the least disk space, has the best performance and the best security.
Big zone - (also known as a “Whole Root zone”): The zone has its own /usr files, which can be modified independently.
If you aren’t sure which to choose, pick the small zone. Below are examples of installing each zone type as a starting point for Zone Resource Controls.
This demonstrates creating a simple zone that uses the default settings, which share most of the operating system with the global zone.
Creating such a zone involves letting the system pick default settings, which include the read-only loopback filesystem (lofs) mounts that share most of the OS. The following commands were used:
zonecfg -z small-zone
small-zone: No such zone configured
Use 'create' to begin configuring a new zone.
zonecfg:small-zone> create
zonecfg:small-zone> set autoboot=true
zonecfg:small-zone> set zonepath=/export/small-zone
zonecfg:small-zone> add net
zonecfg:small-zone:net> set address=192.168.2.101
zonecfg:small-zone:net> set physical=hme0
zonecfg:small-zone:net> end
zonecfg:small-zone> info
zonepath: /export/small-zone
autoboot: true
pool:
inherit-pkg-dir:
dir: /lib
inherit-pkg-dir:
dir: /platform
inherit-pkg-dir:
dir: /sbin
inherit-pkg-dir:
dir: /usr
net:
address: 192.168.2.101
physical: hme0
zonecfg:small-zone> verify
zonecfg:small-zone> commit
zonecfg:small-zone> exit
zoneadm list -cv
ID NAME STATUS PATH
0 global running /
- small-zone configured /export/small-zone
The new zone is in the configured state. Those inherit-pkg-dir entries are filesystems that will be shared read-only from the global zone via lofs (the loopback filesystem); this saves copying the entire operating system over during the install, but can make adding packages to small-zone difficult because /usr is read-only. (See the big-zone example, which uses a different approach.)
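If there are other global-zone directories you want shared read-only in the same way, extra inherit-pkg-dir entries can be added while the zone is still only configured. A minimal sketch, using /opt purely as an illustration:

# Illustration only: also share the global zone's /opt read-only into small-zone.
# inherit-pkg-dir entries can only be changed before "zoneadm -z small-zone install" is run.
zonecfg -z small-zone "add inherit-pkg-dir; set dir=/opt; end"
zonecfg -z small-zone info inherit-pkg-dir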
We can see the zonecfg command has saved the info to an XML file in /etc/zones:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE zone PUBLIC "-//Sun Microsystems Inc//DTD Zones//EN" "file:///usr/share/lib/xml/dtd/zonecfg.dtd.1">
<!--
    DO NOT EDIT THIS FILE.  Use zonecfg(1M) instead.
-->
<zone name="small-zone" zonepath="/export/small-zone" autoboot="true">
  <inherited-pkg-dir directory="/lib"/>
  <inherited-pkg-dir directory="/platform"/>
  <inherited-pkg-dir directory="/sbin"/>
  <inherited-pkg-dir directory="/usr"/>
  <network address="192.168.2.101" physical="hme0"/>
</zone>
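Rather than reading the XML directly, the same configuration can be dumped as replayable zonecfg commands with the export subcommand, which is handy for cloning a configuration onto another zone or host. A quick sketch (the file name is just an example):

# Dump the configuration as zonecfg subcommands instead of raw XML
zonecfg -z small-zone export
# Save it to a file, to be replayed later with "zonecfg -z <newzone> -f <file>"
# (after editing the zonepath and IP address for the new zone)
zonecfg -z small-zone export -f /tmp/small-zone.cfg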
Next we begin the zone install; it takes around 10 minutes to initialise the packages needed for the new zone. A verify is run first to check that our zone config is ok, then we run the install, then boot the zone:
mkdir /export/small-zone
chmod 700 /export/small-zone
zoneadm -z small-zone verify
zoneadm -z small-zone install
Preparing to install zone <small-zone>.
Creating list of files to copy from the global zone.
Copying <2574> files to the zone.
Initializing zone product registry.
Determining zone package initialization order.
Preparing to initialize <987> packages on the zone.
Initialized <987> packages on zone.
Zone <small-zone> is initialized.
Installation of these packages generated warnings: <SUNWcsr SUNWdtdte>
The file </export/small-zone/root/var/sadm/system/logs/install_log> contains a log of the zone installation.
zoneadm list -cv
ID NAME STATUS PATH
0 global running /
- small-zone installed /export/small-zone
zoneadm -z small-zone boot
zoneadm list -cv
ID NAME STATUS PATH
0 global running /
1 small-zone running /export/small-zone
We can see small-zone is up and running. Now we log in to the console for the first time, so we can answer system identification questions such as the timezone:
zlogin -C small-zone
[Connected to zone 'small-zone' console]
What type of terminal are you using?
1) ANSI Standard CRT
2) DEC VT52
3) DEC VT100
4) Heathkit 19
5) Lear Siegler ADM31
6) PC Console
7) Sun Command Tool
8) Sun Workstation
9) Televideo 910
10) Televideo 925
11) Wyse Model 50
12) X Terminal Emulator (xterms)
13) CDE Terminal Emulator (dtterm)
14) Other
Type the number of your choice and press Return: 13
...standard questions...
The system then reboots. To get an idea of what this zone actually is, let's poke around its zonepath from the global zone:
/> cd /export/small-zone
/export/small-zone> ls
dev root
/export/small-zone> cd root
/export/small-zone/root> ls
bin etc home mnt opt proc system usr
dev export lib net platform sbin tmp var
/export/small-zone/root> grep lofs /etc/mnttab
/export/small-zone/dev /export/small-zone/root/dev lofs zonedevfs,dev=4e40002 1110446770
/lib /export/small-zone/root/lib lofs ro,nodevices,nosub,dev=2200008 1110446770
/platform /export/small-zone/root/platform lofs ro,nodevices,nosub,dev=2200008 1110446770
/sbin /export/small-zone/root/sbin lofs ro,nodevices,nosub,dev=2200008 1110446770
/usr /export/small-zone/root/usr lofs ro,nodevices,nosub,dev=2200008 1110446770
/export/small-zone/root> du -hs etc var
38M etc
30M var
/export/small-zone/root>
Of the directories that are not lofs-shared from the global zone, the main ones are /etc and /var. They add up to around 70 MB, which is roughly how much extra disk space was required to create this small-zone.
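As an aside, the interactive system identification questions on first boot can be avoided by seeding a sysidcfg file under the zone's root before booting it. The sketch below is illustrative only; the exact keywords required vary between Solaris 10 updates, and the values shown are placeholders:

# Placeholder values -- adjust locale, timezone, hostname and password hash for your site
cat > /export/small-zone/root/etc/sysidcfg <<EOF
system_locale=C
terminal=dtterm
timezone=US/Pacific
security_policy=NONE
name_service=NONE
network_interface=NONE {hostname=small-zone}
root_password=<encrypted-password-hash>
EOF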
This demonstrates creating a zone that resides on its own slice and has its own copy of the operating system.
$ zonecfg -z big-zone
big-zone: No such zone configured
Use 'create' to begin configuring a new zone.
zonecfg:big-zone> create -b
zonecfg:big-zone> set autoboot=true
zonecfg:big-zone> set zonepath=/export/big-zone
zonecfg:big-zone> add net
zonecfg:big-zone:net> set address=192.168.2.201
zonecfg:big-zone:net> set physical=hme0
zonecfg:big-zone:net> end
zonecfg:big-zone> info
zonepath: /export/big-zone
autoboot: true
pool:
net:
address: 192.168.2.201
physical: hme0
zonecfg:big-zone> verify
zonecfg:big-zone> commit
zonecfg:big-zone> exit
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE zone PUBLIC "-//Sun Microsystems Inc//DTD Zones//EN" "file:///usr/share/lib/xml/dtd/zonecfg.dtd.1">
<zone name="big-zone" zonepath="/export/big-zone" autoboot="true">
  <network address="192.168.2.201" physical="hme0"/>
</zone>
chmod 700 /export/big-zone
df -h /export/big-zone
Filesystem size used avail capacity Mounted on
/dev/dsk/c0t1d0s0 7.8G 7.9M 7.7G 1% /export/big-zone
zoneadm -z big-zone verify
zoneadm -z big-zone install
Preparing to install zone <big-zone>.
Creating list of files to copy from the global zone.
Copying <118457> files to the zone.
...
After the zone has been installed and booted, we check the size of the dedicated zone slice:
df -h /export/big-zone
Filesystem size used avail capacity Mounted on
/dev/dsk/c0t1d0s0 7.8G 2.9G 4.8G 39% /export/big-zone
Wow! 2.9 GB, pretty much all of Solaris 10. This zone resides on its own slice, and packages can be added to it as though it were a separate system. Using inherit-pkg-dir as we did with small-zone can be great, but it's good to know we can do this as well.
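Because /usr (and the rest of the zone) is writable, packages can be added from inside big-zone just as on a standalone system. A rough sketch, with SUNWgtar standing in for whatever package you actually need and the usual Solaris 10 media path assumed:

# From inside the zone (zlogin big-zone), add a package from the install media
pkgadd -d /cdrom/cdrom0/Solaris_10/Product SUNWgtar
# Confirm it is installed in this zone only
pkginfo -l SUNWgtar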
When you want to remove a non-global zone from your Solaris 10 installation, follow these steps.
If you want to completely remove a zone called 'testzone' from your system, log in to the global zone and become root. The first command is the opposite of zoneadm's 'install' option and deletes all of the files under the zonepath:
zoneadm -z testzone uninstall
At this point, the zone is in the configured state. To remove it completely from the system use:
zonecfg -z testzone delete
There is no undo, so make sure this is what you want to do before you do it.
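Note that a running zone must be halted before it can be uninstalled, and for unattended cleanups the confirmation prompts can be skipped with -F. A sketch of the full sequence:

# Stop the zone if it is still running
zoneadm -z testzone halt
# Delete everything under the zonepath without prompting
zoneadm -z testzone uninstall -F
# Remove the zone's configuration without prompting
zonecfg -z testzone delete -F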
Use cpu-shares to control zone computing resources
Although the Solaris 10 8/07 OS allows you to specify how many CPUs a zone may use, sometimes this does not work out well. For example, I use dedicated-cpu for three zones on an 8-core Sun Fire T2000 server. Each zone has an ncpus range of 4-20 with a different importance value. However, when the system is fully utilized, the importance value does not always have the intended effect: sometimes a zone with a lower importance value consumes a higher percentage of the computing resources than a zone with higher importance.
In the following, I demonstrate that cpu-shares works well.
root@bigfoot# dispadmin -d
FSS     (Fair Share)
root@bigfoot# zonecfg -z global info rctl
rctl:
    name: zone.cpu-shares
    value: (priv=privileged,limit=4,action=none)
root@bigfoot# zonecfg -z bighead info rctl name=zone.cpu-shares
rctl:
    name: zone.cpu-shares
    value: (priv=privileged,limit=3,action=none)
root@bigfoot# zonecfg -z bighand info rctl name=zone.cpu-shares
rctl:
    name: zone.cpu-shares
    value: (priv=privileged,limit=3,action=none)
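For reference, a zone.cpu-shares value like the ones shown above can be configured with zonecfg (it takes effect when the zone boots) or changed on a running zone with prctl. This is only a sketch, reusing the share counts from this example:

# Make the Fair Share Scheduler the system default (as dispadmin -d reported above)
dispadmin -d FSS
# Give bighead 3 CPU shares; applies the next time the zone boots
zonecfg -z bighead "add rctl; set name=zone.cpu-shares; add value (priv=privileged,limit=3,action=none); end"
# Or adjust the running zone immediately, without a reboot
prctl -n zone.cpu-shares -r -v 3 -i zone bighead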
Generate 20 CPU-bound processes in zone bighead and 12 in zone bighand:
<username>@bighead> perl -e 'while (--$ARGV[0] and fork) {}; while () {}' 20 &
<username>@bighand> perl -e 'while (--$ARGV[0] and fork) {}; while () {}' 12 &
root@bigfoot# vmstat 3
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr s1 s2 s3 s4   in   sy   cs us sy id
 ...
root@bigfoot# prstat -Z
ZONEID    NPROC  SWAP   RSS MEMORY      TIME  CPU ZONE
     1       55  202M  264M   1.6%   0:36:11  62% bighead
     2       47  199M  263M   1.6%   0:20:29  37% bighand
     0       49  219M  291M   1.8%   0:01:36 0.1% global
As we see, when the system is not fully utilized, each zone uses as many computing resources as it needs.
Now we generate 15 processes in the global zone (bigfoot) to see how the bighead and bighand zones consume the computing resources:
root@bigfoot# perl -e 'while (--$ARGV[0] and fork) {}; while () {}' 15 &
root@bigfoot# vmstat 3
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr s1 s2 s3 s4   in   sy   cs us sy id
 ...
root@bigfoot# prstat -Z
ZONEID    NPROC  SWAP   RSS MEMORY      TIME  CPU ZONE
     0       65  228M  298M   1.8%   1:32:55  40% global
     1       55  202M  264M   1.6%   1:59:48  30% bighead
     2       47  199M  263M   1.6%   1:37:32  29% bighand
root@bigfoot# prstat -Z
ZONEID    NPROC  SWAP   RSS MEMORY      TIME  CPU ZONE
     0       65  228M  298M   1.8%   1:19:15  38% global
     2       47  199M  263M   1.6%   1:27:31  31% bighand
     1       55  202M  264M   1.6%   1:48:24  31% bighead
As we see, each zone is consuming a portion of the computing resources according to its cpu-shares value when the system’s computing resources are fully utilized.
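The shares actually in force on a running zone can be checked from the global zone with prctl, for example:

# Display the zone.cpu-shares resource control currently applied to bighead
prctl -n zone.cpu-shares -i zone bighead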
The swap property of capped-memory is virtual swap space, not physical swap space
For zone bighead running Oracle Database 10g Enterprise Edition with a total memory requirement of 2 Gbytes (a 1.5 Gbyte System Global Area [SGA] and a 0.5 Gbyte Program Global Area [PGA]), we might simply give it a maximum of 3 Gbytes of memory and 1.5 Gbytes of swap space, as follows:
zonecfg:bighead> info capped-memory
capped-memory:
physical: 3G
[swap: 1.5G]
Start up the Oracle database in zone bighead:
oracle@bighead> sqlplus /nolog
SQL> conn / as sysdba
SQL> startup
ORA-27102: out of memory
SVR4 Error: 12: Not enough space
So the swap here is not physical swap space. According to the Sun documentation, swap here means the total amount of swap that can be consumed by user process address-space mappings and tmpfs mounts within the zone. When we set it, the capped-memory swap value should therefore be sized proportionately. For example:
<username>@bigfoot> vmstat -p 5
     memory           page          executable      anonymous     filesystem
   swap  free  re  mf  fr  de  sr  epi  epo  epf  api  apo  apf  fpi  fpo  fpf
 38671464 15156336 ...
In our case, it should be about 3 * (38 / 15), which comes to roughly 7 Gbytes.
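Following that estimate, the zone's swap cap can be raised with zonecfg; a sketch (the new value applies the next time the zone boots):

# Raise bighead's virtual swap cap to roughly 7 Gbytes
zonecfg -z bighead "select capped-memory; set swap=7g; end"
zonecfg -z bighead info capped-memory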
Sometimes, a zone consumes more physical memory than the maximum limit
zonecfg:bighead> info capped-memory
capped-memory:
physical: 1G
[swap: 7G]
[locked: 1G]
The Oracle database took a while to start up, and the Resident Set Size (RSS) consumed by the zone fluctuated around the 1 Gbyte physical cap, at times going above it.
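This is expected behaviour: the physical cap is enforced by the resource capping daemon in the global zone, which pages the zone down only after it has gone over the limit, so RSS can float above the cap for short periods. One way to watch the enforcement, sketched here with an assumed 5-second interval, is rcapstat:

# Report per-zone memory cap enforcement from the global zone every 5 seconds
rcapstat -z 5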
I needed to add a second file system to one of my Solaris 10 zones this morning, and needed to do so without rebooting the zone. Since the global zone uses loopback mounts to present file systems to zones, adding a new file system was as easy as loopback mounting the file system into the zone’s file system:
mount -F lofs /filesystems/zone1oracle03 /zones/zone1/root/ora03
Once the file system was mounted, I added it to the zone configuration and then verified it was mounted:
$ mount
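The persistent part of that change is an fs resource in the zone's configuration; it would look roughly like the following sketch, assuming the zone is named zone1 and the mount point inside the zone is /ora03:

# Record the lofs mount in the zone config so it comes back when the zone is rebooted
zonecfg -z zone1 "add fs; set dir=/ora03; set special=/filesystems/zone1oracle03; set type=lofs; end"
zonecfg -z zone1 info fs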
Now to update my ASM disk group to use the storage.