Guest Domain Freezes with 100% utilization

I set up LDoms on a Sun Fire T2000 with one control domain and one guest domain. The guest domain shares the disk with the control domain, using slice 6.

The slice is set as the boot disk for the guest domain. When the guest domain starts, utilization goes to 100%, and a telnet to the virtual console is refused.

When I try to stop and unbind the guest domain (with a reboot), the slice is no longer usable: neither format nor newfs can operate on it. What's wrong?

sc>showhost

Sun-Fire-T2000 System Firmware 6.4.4 2007/04/20 10:13

Host flash versions:

Hypervisor 1.4.1 2007/04/02 16:37

OBP 4.26.1 2007/04/02 16:26

POST 4.26.0 2007/03/26 16:45

# Control Domain

$ cd LDoms_Manager-1_0-RR

$ Install/install-ldm

$ ldm add-vdiskserver primary-vds0 primary

$ ldm add-vconscon port-range=5000-5100 primary-vcc0 primary

$ ldm add-vswitch net-dev=e1000g0 primary-vsw0 primary

$ ldm set-mau 1 primary

$ ldm set-vcpu 4 primary

$ ldm set-memory 4g primary

$ ldm add-config initial

$ shutdown -i6 -g0 -y

# Guest Domain

$ ldm add-domain myldom1

$ ldm add-vcpu 4 myldom1

$ ldm add-memory 2g myldom1

$ ldm add-vnet vnet1 primary-vsw0 myldom1

$ ldm add-vdiskserverdevice /dev/dsk/c0t1d0s6 vol1@primary-vds0

$ ldm add-vdisk vdisk1 vol1@primary-vds0 myldom1

$ ldm set-variable auto-boot\?=false myldom1

$ ldm set-variable boot-device=vdisk1 myldom1

$ ldm bind-domain myldom1

$ ldm start-domain myldom1

$ telnet localhost 5000

Truss output of format command:

AVAILABLE DISK SELECTIONS:

write(1,"\n\n A V A I L A B L E ".., 29)= 29

ioctl(0, TCGETA, 0xFFBFFB34)= 0

ioctl(1, TCGETA, 0xFFBFFB34)= 0

ioctl(0, TCGETA, 0xFFBFFACC)= 0

ioctl(1, TCGETA, 0xFFBFFACC)= 0

ioctl(1, TIOCGWINSZ, 0xFFBFFB40)= 0

open("/dev/tty", O_RDWR|O_NDELAY)= 3

ioctl(3, TCGETS, 0x000525BC)= 0

ioctl(3, TCSETS, 0x000525BC)= 0

ioctl(3, TCGETS, 0x000525BC)= 0

ioctl(3, TCSETS, 0x000525BC)= 0

0. c0t1d0 <SUN72G cyl 14087 alt 2 hd 24 sec 424>

write(1,"0 .c 0".., 56)= 56

/pci@780/pci@0/pci@9/scsi@0/sd@1,0

write(1," / p".., 45)= 45

ioctl(3, TCSETS, 0x000525BC)= 0

ioctl(3, TCSETS, 0x000525BC)= 0

close(3)= 0

ioctl(0, TCGETA, 0xFFBFF1EC)= 0

fstat64(0, 0xFFBFF108) = 0

Specify disk (enter its number): write(1," S p e c i f yd i s k".., 33)= 33

read(0, 0xFF2700F8, 1024)(sleeping...)

0

read(0," 0\n", 1024)= 2

open("/dev/rdsk/c0t1d0s2", O_RDWR|O_NDELAY)= 3

brk(0x00058810) = 0

brk(0x00068810) = 0

brk(0x00068810) = 0

brk(0x00078810) = 0

selecting c0t1d0

write(1," s e l e c t i n gc 0".., 17)= 17

ioctl(3, 0x04C9, 0xFFBFFA54)= 0

[disk formatted]

write(1," [ d i s kf o r m a t".., 17)= 17

open("/etc/mnttab", O_RDONLY)= 4

ioctl(4, (('m'<<8)|7), 0xFFBFFA64) = 0

open("/dev/rdsk/c0t1d0s0", O_RDWR|O_NDELAY)= 5

fstat(5, 0xFFBFF5E0)= 0

ioctl(5, 0x0403, 0xFFBFF59C)= 0

close(5)= 0

llseek(4, 0, SEEK_CUR) = 0

close(4)= 0

fstat64(2, 0xFFBFEBA0) = 0

Warning: Current Disk has mounted partitions.

write(2," W a r n i n g :C u r".., 46)= 46

resolvepath("/","/", 1024) = 1

sysconfig(_CONFIG_PAGESIZE) = 8192

open("/dev/.devlink_db", O_RDONLY) = 4

fstat(4, 0xFFBFF1F8)= 0

mmap(0x00000000, 40, PROT_READ, MAP_SHARED, 4, 0) = 0xFEF50000

mmap(0x00000000, 24576, PROT_READ, MAP_SHARED, 4, 32768) = 0xFEF38000

open("/devices/pseudo/devinfo@0:devinfo", O_RDONLY) = 5

ioctl(5, 0xDF82, 0x00000000)= 57311

After failing to configure LDoms, I booted from a Solaris DVD to reinstall the OS. It looks like e1000g0 has some kind of fault.

{0} ok boot cdrom

Boot device: /pci@7c0/pci@0/pci@1/pci@0/ide@8/cdrom@0,0:f File and args:

SunOS Release 5.10 Version Generic_118833-33 64-bit Copyright 1983-2006 Sun Microsystems, Inc. All rights reserved.

Use is subject to license terms.

WARNING: mac_open e1000g0 failed

WARNING: mac_open e1000g0 failed

WARNING: mac_open e1000g0 failed

WARNING: Unable to setup switching mode

Configuring devices.

WARNING: bypass cookie failure 71ece

NOTICE: tavor0: error during attach: hw_init_eqinitall_fail

NOTICE: tavor0: driver attached (for maintenance mode only)

NOTICE: pciex8086,105e - e1000g[0] : Adapter 1000Mbps full duplex copper link is up.

What is the meaning of this?

Message was edited by:

JoeChris@Sun

[5296 byte] By [JoeChris@Suna] at [2007-11-27 3:39:51]
# 1
You can't install an OS on a slice exported to a guest domain. You have to use the whole disk (s2) or, if you don't have a spare disk, you can use a file, e.g.

# mkfile 10g /export/home/disk.image

# ldm add-vdsdev /export/home/disk.image vol1@primary-vds0
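For the setup in the original post, swapping the slice for a file might look roughly like the sketch below. It is only an outline, not a tested procedure: the run and use_file_backed_disk helpers are made up for illustration, the names vdisk1, vol1 and primary-vds0 are taken from the original post, and it assumes the rm-vdisk and rm-vdsdev subcommands are available in your ldm. By default it only prints the commands; set DRY_RUN=0 to execute them.

```shell
# Hypothetical helper: print each command, run it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Hypothetical helper: replace the slice-backed vdisk of a guest with a
# file-backed one. Device and service names come from the original post.
# $1: guest domain, $2: backing file path, $3: size for mkfile (e.g. 10g)
use_file_backed_disk() {
    dom=$1
    image=$2
    size=$3
    run ldm stop-domain "$dom" || return 1
    run ldm unbind-domain "$dom" || return 1
    run ldm rm-vdisk vdisk1 "$dom" || return 1
    run ldm rm-vdsdev vol1@primary-vds0 || return 1
    run mkfile "$size" "$image" || return 1
    run ldm add-vdsdev "$image" vol1@primary-vds0 || return 1
    run ldm add-vdisk vdisk1 vol1@primary-vds0 "$dom" || return 1
    run ldm bind-domain "$dom" || return 1
    run ldm start-domain "$dom" || return 1
}
```

Calling use_file_backed_disk myldom1 /export/home/disk.image 10g prints the nine commands in order, so you can review them before running anything for real.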
merwicka at 2007-7-12 8:43:14 > top of Java-index,Administration Tools,Logical Domains for CoolThreads Servers...
# 2

It looks like there's a failure when trying to use slice 6. Do you have any vds messages in /var/adm/messages on the service domain?

After this failure, slice 6 cannot be used because of bug 6530040 (vds does not close underlying physical device or file properly); fuser on that slice should show that it is still in use by the vds driver.
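A rough way to run both checks from the service domain is sketched below. The function names vds_messages and slice_holders are made up for this example, and the grep pattern is just a guess at what a vds message would contain.

```shell
# Hypothetical helper: print log lines that mention vds.
# $1: log file to scan (defaults to /var/adm/messages)
vds_messages() {
    logfile="${1:-/var/adm/messages}"
    grep -i 'vds' "$logfile"
}

# Hypothetical helper: list what fuser reports as using a device.
# $1: device path, e.g. /dev/dsk/c0t1d0s6
slice_holders() {
    fuser "$1"
}
```

For the disk in the original post that would be vds_messages with no argument, followed by slice_holders /dev/dsk/c0t1d0s6.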

achartrea at 2007-7-12 8:43:14
# 3

Hi,

In the case of bug 6530040, the recovery method in the Release Notes is to reboot the system. In my case, even after the reboot, vds still does not close the device. I suspect I may be rebooting the system the wrong way. Can you give me an example for a system with a control domain (primary) and a single guest domain (myldom1)?

My steps would be as follows:

$ ldm stop-domain myldom1

$ ldm unbind-domain myldom1

$ reboot

Am I missing a step?

LDoms also seems to have a problem releasing the network interface e1000g0. How do I release it?

{0} ok boot cdrom

Boot device: /pci@7c0/pci@0/pci@1/pci@0/ide@8/cdrom@0,0:f File and args:

SunOS Release 5.10 Version Generic_118833-33 64-bit Copyright 1983-2006 Sun Microsystems, Inc. All rights reserved.

Use is subject to license terms.

WARNING: mac_open e1000g0 failed

WARNING: mac_open e1000g0 failed

WARNING: mac_open e1000g0 failed

WARNING: Unable to setup switching mode

Configuring devices.

WARNING: bypass cookie failure 71ece

NOTICE: tavor0: error during attach: hw_init_eqinitall_fail

NOTICE: tavor0: driver attached (for maintenance mode only)

NOTICE: pciex8086,105e - e1000g[0] : Adapter 1000Mbps full duplex copper link is up.

Thank you

Message was edited by:

JoeChris@Sun

JoeChris@Suna at 2007-7-12 8:43:14
# 4
All devices are released after a reboot. Don't pay attention to the message about e1000g0 when booting from cdrom. This is almost certainly because your cdrom image uses a kernel from 118833-33, and LDoms requires at least 118833-36.
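If you want to check the kernel revision from a running system, something like the sketch below could work. It assumes uname -v prints a string in the same form as the boot banner (Generic_118833-33); the function names are made up for this example, and it only knows about patch ID 118833.

```shell
# Hypothetical helper: extract the revision (the number after the dash)
# from a kernel version string such as "Generic_118833-33".
kernel_rev() {
    printf '%s\n' "$1" | sed -n 's/^Generic_[0-9]*-\([0-9][0-9]*\)$/\1/p'
}

# Hypothetical helper: compare a 118833 kernel against revision 36.
# Returns 0 if the revision is 36 or later, 1 if it is older,
# and 2 if the string is not a 118833 kernel (cannot tell).
kernel_ok_for_ldoms() {
    case "$1" in
        Generic_118833-*) ;;
        *) return 2 ;;
    esac
    rev=$(kernel_rev "$1")
    if [ -z "$rev" ]; then
        return 2
    fi
    if [ "$rev" -ge 36 ]; then
        return 0
    fi
    return 1
}
```

On the installed system you would call kernel_ok_for_ldoms "$(uname -v)" and look at the exit status.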
achartrea at 2007-7-12 8:43:14