JumpStart client unable to get an IP address for its MAC during installation

Hi Techies,

When I start the network install boot from the ok prompt of the client, I get the following message continuously...

{0} ok boot vnet1 -v install

Sun Fire T200, No Keyboard

Copyright 2007 Sun Microsystems, Inc. All rights reserved.

OpenBoot 4.26.1, 4096 MB memory available, Serial #66563857.

Ethernet address 0:14:4f:f7:af:11, Host ID: 83f7af11.

Boot device: /virtual-devices@100/channel-devices@200/network@0 File and args: -v install

Requesting Internet Address for 0:14:4f:f7:af:11

Requesting Internet Address for 0:14:4f:f7:af:11

Requesting Internet Address for 0:14:4f:f7:af:11

Requesting Internet Address for 0:14:4f:f7:af:11

Requesting Internet Address for 0:14:4f:f7:af:11

Requesting Internet Address for 0:14:4f:f7:af:11

Requesting Internet Address for 0:14:4f:f7:af:11

On the server, when I ran snoop for the MAC address, I could see the following:

# snoop -d e1000g0 | grep 0:14:4f:f7:af:11

Using device /dev/e1000g0 (promiscuous mode)

OLD-BROADCAST -> (broadcast) RARP C Who is 0:14:4f:f7:af:11 ?

OLD-BROADCAST -> (broadcast) RARP C Who is 0:14:4f:f7:af:11 ?

OLD-BROADCAST -> (broadcast) RARP C Who is 0:14:4f:f7:af:11 ?

OLD-BROADCAST -> (broadcast) RARP C Who is 0:14:4f:f7:af:11 ?

OLD-BROADCAST -> (broadcast) RARP C Who is 0:14:4f:f7:af:11 ?

Server Configuration:

You have entered Hostname = Jumpserv175

IP Address = 20.5.68.175

Terminal Type = vt100

Time Server = localhost

Encrypted Password = 2Scyi05RDsMsY

Name Service = NONE

System Locale = en_US

Network Interface = primary

Netmask = 255.255.255.0

Default Router = 20.5.68.1

IPv6 Used = no

Time Zone = GMT

Security Policy = NONE

Is this correct? (Y/N) [Y]

Modifying add_network_client.sh

Modifying S99jumpstart

Modifying std_svr.profile

Done

Configure the Jumpstart Server

Client Configuration

You have entered Hostname = testdom1

IP Address = 20.5.68.18

MAC Address = 0:14:4f:f7:af:11

Hostid = 83f7af11

Disk Setup = VM50

Veritas Root Mirroring = No

Solaris Root Mirroring = No

ASL Setup = HDS95xx HDS9980k

Kernel Parameters = Standard

Domain Name = xyz.com

SOE Version = 33

Monitor = None

Full Duplex = N

Software Installation = N

Local Customization = None

Begin Script = std_svr.begin

Start Script = std_svr.profile

Finish Script = std_svr.finish

Is this correct? (Y/N) [Y]

Setting up Data Files

Validating rules...

Validating profile Profiles/Solaris_10/std_svr.profile...

The custom JumpStart configuration is ok.

/jumpstart/Scripts/Solaris_10/add_network_client.sh

/jumpstart/Scripts/Solaris_10/add_network_client.sh

Client Addition

# more /etc/bootparams

* install_config=Jumpserv175:/flar

# more /etc/ethers

0:14:4f:f7:af:11 testdom1

I hope this information suffices... Did I miss anything? Can anyone help me?

[3160 byte] By [Vinod_Ka] at [2007-11-27 11:10:16]
# 1

Is that the entire contents of your /etc/ethers? You would get this error if you have multiple lines in the ethers file that contain the same Ethernet address, for example:

0:1:2:3:4:5 foo

0:1:2:3:4:5 bar

would make rarpd unable to resolve the address.
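
A quick way to look for such duplicates, as a minimal sketch: the helper name find_dup_macs is made up for illustration, and it assumes a POSIX shell with awk. It compares addresses as written (case-insensitively), so 0:1:2:3:4:5 and 00:01:02:03:04:05 would not be treated as the same address.

```shell
# Print every MAC address that appears more than once in an ethers-style file.
# Hypothetical helper; usage: find_dup_macs /etc/ethers
find_dup_macs() {
    awk '
        # Skip comment lines and blank lines.
        /^[ \t]*(#|$)/ { next }
        # Count each address, ignoring case.
        { mac = tolower($1); count[mac]++ }
        END { for (mac in count) if (count[mac] > 1) print mac }
    ' "$1" | sort
}
```

If it prints nothing, no address is listed twice in the file.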

Secondly, do you have rarpd running on the jumpstart server? (Check with svcs rarp or ps -ef | grep rarp.)

.7/M.

mAbrantea at 2007-7-29 13:40:38 > top of Java-index,Solaris Operating System,Solaris 10 Features...
# 2

Also check you have the following in OBP on client:

local-mac-address?=false

Stuart_Flishera at 2007-7-29 13:40:38
# 3

You can run in.rarpd in debug mode (with -d).

Also make sure that /etc/ethers is being read. If this is an NIS client, it could have nsswitch.conf set up so that only the NIS map is read, not the local file.
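
To see which sources are consulted for ethers, something like the following could be used. It is only a sketch: nss_sources is a made-up helper name, and it assumes a POSIX shell with awk and the usual "database: source ..." layout of nsswitch.conf.

```shell
# Print the lookup sources configured for one database in an
# nsswitch.conf-style file.
# Hypothetical helper; usage: nss_sources /etc/nsswitch.conf ethers
nss_sources() {
    awk -v db="$2" '
        # Skip comment lines.
        /^[ \t]*#/ { next }
        {
            line = $0
            sub(/^[ \t]+/, "", line)
            # Match only lines that start with "<db>:".
            if (index(line, db ":") == 1) {
                sub(/^[^:]*:[ \t]*/, "", line)
                print line
            }
        }
    ' "$1"
}
```

If the output does not include "files", the local /etc/ethers is not being consulted.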

--

Darren

Darren_Dunhama at 2007-7-29 13:40:38
# 4

Hi mAbrante,

I have only one entry in the /etc/ethers file, i.e.

0:14:4f:f7:af:11 testdom1

I have also checked the rarp service. It was disabled earlier, but when I enabled it, rarp went into maintenance state and I am still facing the same error...

# svcs rarp

STATE          STIME    FMRI

disabled       Feb_09   svc:/network/rarp:default

# svcadm enable /network/rarp:default

# svcs rarp

STATE          STIME    FMRI

maintenance    15:37:38 svc:/network/rarp:default

# svcs -x rarp

svc:/network/rarp:default (Reverse Address Resolution Protocol (RARP) server)

State: maintenance since Mon 19 Feb 2001 03:48:25 PM GMT

Reason: Restarting too quickly.

See: http://sun.com/msg/SMF-8000-L5

See: rarp(7P)

See: in.rarpd(1M)

See: /var/svc/log/network-rarp:default.log

Impact: This service is not running.

*************************************************************************

Hi Stuart_Flisher,

This is another good catch. I checked the local-mac-address? parameter, which was 'true', and set it to 'false'.

{0} ok printenv local-mac-address?

local-mac-address? =true

{0} ok setenv local-mac-address? false

local-mac-address? =false

{0} ok printenv local-mac-address?

local-mac-address? =false

We are closer to the fix... the RARP daemon is not running. How do I fix that?

Vinod

Vinod_Ka at 2007-7-29 13:40:38
# 5

Disable the service and use Darren's suggestion, i.e. start it as:

/usr/sbin/in.rarpd -ad

.7/M.

mAbrantea at 2007-7-29 13:40:38
# 6

Hi,

I have tried disabling rarp and starting it manually with the -ad options, as well as enabling it with the svcadm command...

# svcadm disable rarp

# /usr/sbin/in.rarpd -ad

/usr/sbin/in.rarpd: putmsg: Not a stream device

# svcs rarp

STATE          STIME    FMRI

disabled       17:21:20 svc:/network/rarp:default

# svcadm enable rarp

# svcs rarp

STATE          STIME    FMRI

maintenance    17:21:51 svc:/network/rarp:default

# /usr/sbin/in.rarpd -ad

/usr/sbin/in.rarpd: putmsg: Not a stream device

# /usr/sbin/in.rarpd -ad vsw0

/usr/sbin/in.rarpd: Usage: /usr/sbin/in.rarpd [ -ad ] device unit

# /usr/sbin/in.rarpd -ad e1000g0

/usr/sbin/in.rarpd: Usage: /usr/sbin/in.rarpd [ -ad ] device unit

An extract of the errors from "/var/svc/log/network-rarp:default.log":

[ Feb 19 17:21:51 Executing stop method (:kill) ]

[ Feb 19 17:21:51 Executing start method ("/usr/sbin/in.rarpd -a") ]

[ Feb 19 17:21:51 Method "start" exited with status 0 ]

[ Feb 19 17:21:51 Stopping because all processes in service exited. ]

[ Feb 19 17:21:51 Executing stop method (:kill) ]

[ Feb 19 17:21:51 Executing start method ("/usr/sbin/in.rarpd -a") ]

[ Feb 19 17:21:51 Method "start" exited with status 0 ]

[ Feb 19 17:21:51 Stopping because all processes in service exited. ]

[ Feb 19 17:21:51 Executing stop method (:kill) ]

[ Feb 19 17:21:51 Executing start method ("/usr/sbin/in.rarpd -a") ]

[ Feb 19 17:21:51 Method "start" exited with status 0 ]

[ Feb 19 17:21:51 Stopping because all processes in service exited. ]

[ Feb 19 17:21:51 Executing stop method (:kill) ]

[ Feb 19 17:21:51 Restarting too quickly, changing state to maintenance ]

Still the same error... I feel I am missing something...

Vinod

Vinod_Ka at 2007-7-29 13:40:38
# 7

The -a flag to in.rarpd tells it to bind to all available interfaces. Since that failed, you could tell it to bind only to e1000g0 by running:

/usr/sbin/in.rarpd -d e1000g 0
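
Note that the usage message shown earlier asks for "device unit" as two separate arguments, which is why e1000g0 is written as "e1000g 0" here. A small sketch of that split, assuming a POSIX-style shell such as ksh or bash (split_ifname is a made-up name, not part of Solaris):

```shell
# Split an interface name such as e1000g0 into the "device unit" pair
# that the in.rarpd usage message asks for.
# Hypothetical helper; usage: split_ifname e1000g0
split_ifname() {
    ifname=$1
    unit=${ifname##*[!0-9]}      # trailing digits, e.g. "0"
    device=${ifname%"$unit"}     # everything before them, e.g. "e1000g"
    printf '%s %s\n' "$device" "$unit"
}
```

For example, split_ifname vsw0 prints "vsw 0", so the same helper could be used when trying other interfaces.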

.7/M.

mAbrantea at 2007-7-29 13:40:39
# 8

Hi

# /usr/sbin/in.rarpd -d e1000g 0

/usr/sbin/in.rarpd:[1] device e1000g0 lladdress 0:14:4f:e:bf:ca

/usr/sbin/in.rarpd:[1] device e1000g0 address 0.0.0.0

/usr/sbin/in.rarpd:[1] device e1000g0 subnet mask 0.0.0.0

/usr/sbin/in.rarpd:[3] starting rarp service on device e1000g0 address 0:14:4f:e:bf:ca

# svcs rarp

STATE          STIME    FMRI

disabled       17:48:01 svc:/network/rarp:default

# ps -ef | grep rarp

root 26336 26318   0 17:48:12 pts/2    0:00 grep rarp

root 26314 25574   0 17:46:09 pts/1    0:00 /usr/sbin/in.rarpd -d e1000g 0

# svcadm enable rarp

# svcs rarp

STATE          STIME    FMRI

maintenance    17:48:24 svc:/network/rarp:default

But the rarp service is still in maintenance state, and the jumpstart client is still not getting an IP address for its MAC.

I apologize for troubling you so much, and thank you for shedding more and more light on the solution.

regards,

Vinod

Vinod_Ka at 2007-7-29 13:40:39
# 9

Well, svcs/svcadm will only affect processes started from the svc.startd process. Since you started in.rarpd manually, it will not be detected by svcs...

According to the output you posted, the e1000g0 interface doesn't seem to have an IP address. Is it your primary interface? Perhaps you could post the output of ifconfig -a?

.7/M.


mAbrantea at 2007-7-29 13:40:39
# 10

# ifconfig -a

lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1

inet 127.0.0.1 netmask ff000000

vsw0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3

inet 20.5.68.175 netmask ffffff00 broadcast 20.5.68.255

e1000g0: flags=1000842<BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 5

inet 0.0.0.0 netmask 0

ether 0:14:4f:e:bf:ca

For the purpose of LDom creation, I have created the virtual switch vsw0.

Vinod

Vinod_Ka at 2007-7-29 13:40:39
# 11

Why don't you configure interface e1000g0 with a valid IP address?

Sakolan.

sakolan2002a at 2007-7-29 13:40:39
# 12

That's a good question.

I am creating a guest LDom, and this e1000g0 interface is configured as the virtual switch on the primary domain and as the virtual network port for the guest domains.

I found in some documentation that...

the vswitch is a layer 2 switch, and by default the virtual network can't communicate with the external network via the physical interface. As per the LDom install guide, I have unplumbed e1000g0 and plumbed only vsw0. Still, the guest is unable to get an IP address,

and I am unable to bind in.rarpd to vsw0...

Vinod

Vinod_Ka at 2007-7-29 13:40:39
# 13

I can't see the information requested/supplied earlier, but just to cover the simple stuff...

you've got /etc/ethers containing:

<code>

0:14:4f:f7:af:11 testdom1

</code>

What is in /etc/inet/hosts for testdom1 and what does the "hosts" line say in /etc/nsswitch.conf?
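
The reason for asking: to answer a RARP request, the server has to get from the MAC address to a hostname (ethers) and from that hostname to an IP address (hosts). A rough sketch of that two-step lookup against local files only, assuming a POSIX shell with awk (mac_to_ip is a made-up helper name, and it does not consult NIS or any other name service):

```shell
# Resolve a MAC address in two steps: MAC -> hostname (ethers file),
# then hostname -> IP address (hosts file).
# Prints the IP address, or nothing if either step finds no match.
# Hypothetical helper; usage: mac_to_ip /etc/ethers /etc/inet/hosts 0:14:4f:f7:af:11
mac_to_ip() {
    ethers_file=$1
    hosts_file=$2
    mac=$3
    # Step 1: first hostname listed for this MAC (case-insensitive match).
    host=$(awk -v mac="$mac" '
        /^[ \t]*#/ { next }
        tolower($1) == tolower(mac) { print $2; exit }
    ' "$ethers_file")
    [ -n "$host" ] || return 1
    # Step 2: first address whose name or alias equals that hostname.
    awk -v host="$host" '
        /^[ \t]*#/ { next }
        { for (i = 2; i <= NF; i++) if ($i == host) { print $1; exit } }
    ' "$hosts_file"
}
```

If this prints nothing for the client's MAC, one of the two files is missing the entry or the names do not match.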

sally_ha at 2007-7-29 13:40:39
# 14

Hi

# grep hosts /etc/nsswitch.conf

hosts: files

# grep testdom1 /etc/hosts

20.5.68.18 testdom1

From what I have read, LDoms place constraints on communication between the control and guest domains. I need to find a solution for this...

http://blogs.sun.com/bradb/entry/ldoms_for_the_rusty

The blog above has an excerpt describing the same problem, but the solution is not clear. If any of you has a clue, it would be of great help to me.

"ok boot vnet1 - install

Oh No! It never got it's IP address to start the booting process. What's up with this - I spent alot of time messing around with trying to figure out why rarp wasn't working. Well, if you remember reading the docs the vswitch is a layer 2 switch and by default the vnet can't communicate with the external network via the physical interface. Okay that's cool. I'll just plumb up vsw0 in the control domain. No, it didn't work. The control domain's physical interface (e1000g0) still couldn't see the broadcast from vsw0. Long story short I had to unplumb e1000g0 and plumb up vsw0 per the install guide!!!!"

Vinod

Vinod_Ka at 2007-7-29 13:40:39
# 15

Hi all,

I got this resolved, and thanks for your help...

But it is my duty to say how I did it...

Point no 1: LDoms have a communication constraint between the control and guest domains (in both directions), hence jumpstarting a guest domain from the control domain is not allowed.

Point no 2: in.rarpd was not running on the server because it could not bind to the vsw device, as there was no device file for it in the /dev directory. I therefore created soft links for vsw and vsw0, which solved that problem.

# ln -s /devices/pseudo/clone@0:vsw /dev/vsw

# ln -s /devices/virtual-devices@100/channel-devices@200/virtual-network-switch@0:vsw0 /dev/vsw0
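
To confirm that links like these actually resolve to something, a check along these lines could be run afterwards. This is only a sketch, assuming a POSIX shell; check_dev_nodes is a made-up helper name.

```shell
# Report, for each path given, whether it exists. The -e test follows
# symbolic links, so a link whose target is missing is reported as missing.
# Hypothetical helper; usage: check_dev_nodes /dev/vsw /dev/vsw0
check_dev_nodes() {
    missing=0
    for path in "$@"; do
        if [ -e "$path" ]; then
            echo "ok      $path"
        else
            echo "missing $path"
            missing=1
        fi
    done
    # Non-zero exit status if any path was missing.
    return "$missing"
}
```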

I will proceed with jumpstarting from an external jumpstart server and let you know how it goes.

Let me assign points to all of you and sign off now... Thank you for all your valuable information.

Vinod

Vinod_Ka at 2007-7-29 13:40:43
# 16

I accidentally found this thread, it might be useful:

http://forum.java.sun.com/thread.jspa?threadID=5171040

.7/M.

mAbrantea at 2007-7-29 13:40:43