Linux and x4500?

Hi,

I'm wondering whether anyone has tried running Linux on Thumper. I know it works well with Solaris, and ZFS is outstanding. But all our infrastructure runs on Linux, including some applications developed in-house that I would like to run directly on the storage box but that are very Linux-specific (they use Linux-only syscalls and are tuned to Linux's characteristics for best performance).

The Linux Marvell driver (sata_mv) supports the following PCI IDs:

PCI_VENDOR_ID_MARVELL: 0x5040, 0x5041, 0x5080, 0x5081, 0x6040, 0x6041, 0x6042, 0x6081;

PCI_VENDOR_ID_ADAPTEC2: 0x0241

Which one is used in the x4500?

[652 byte] By [zdzichuBG] at [2007-11-26 9:51:17]
# 1
Answering myself: Thumper uses the Marvell 88SX6081, which is probably supported by Linux.
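
For anyone who wants to check this on their own box, here is a minimal sketch that scans `lspci -n` style output for controllers in the sata_mv list quoted above. It assumes that PCI_VENDOR_ID_MARVELL is 0x11ab and PCI_VENDOR_ID_ADAPTEC2 is 0x9005 (the values I believe the kernel headers define); the function name and the sample line in the usage note are made up for illustration.

```python
import re

# Device IDs quoted in this thread for sata_mv, keyed by PCI vendor ID.
# Assumption: 0x11ab is PCI_VENDOR_ID_MARVELL, 0x9005 is PCI_VENDOR_ID_ADAPTEC2.
SATA_MV_IDS = {
    0x11AB: {0x5040, 0x5041, 0x5080, 0x5081, 0x6040, 0x6041, 0x6042, 0x6081},
    0x9005: {0x0241},
}

# Matches the "vendor:device" pair in `lspci -n` output, e.g. "11ab:6081".
_ID_PAIR = re.compile(r"\b([0-9a-fA-F]{4}):([0-9a-fA-F]{4})\b")


def find_sata_mv_devices(lspci_n_output):
    """Return (slot, vendor, device) for each line whose ID pair is in SATA_MV_IDS."""
    found = []
    for line in lspci_n_output.splitlines():
        fields = line.split(None, 1)  # slot address, then the rest of the line
        if len(fields) < 2:
            continue
        slot, rest = fields
        match = _ID_PAIR.search(rest)
        if match is None:
            continue
        vendor = int(match.group(1), 16)
        device = int(match.group(2), 16)
        if device in SATA_MV_IDS.get(vendor, ()):
            found.append((slot, vendor, device))
    return found
```

Feeding it a line such as `0b:01.0 0100: 11ab:6081 (rev 09)` should report one controller with device ID 0x6081; on a real machine you would pass in the captured output of `lspci -n` instead.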
zdzichuBG at 2007-7-7 1:03:55 > top of Java-index,Sun Hardware,Servers - General Discussion...
# 2

Depending on the number of CPUs, you'll have to run an SMP kernel, and an advanced server kernel with more than 2 CPUs. There shouldn't be any problem with addressing the number of disks that Thumper supports.

However, be aware that Linux will fall over dead on massive I/O. Solaris does not. We had to retire our Linux filesystem servers and re-implement Solaris filesystem servers for our main computing cluster.

I'm not trying to discourage you, but you can't beat Solaris for pure I/O performance. I believe you can also run a Linux executable directly on Solaris. At least that is Sun's claim.

I run 500+ Linux servers and have found them very good for pure computational performance, but I rely on Solaris systems for filesystem access.

truly64 at 2007-7-7 1:03:55
# 3
Thanks for the tip. What you say about I/O is interesting. I believe the Linux kernel developers will be interested in my reports if this becomes a problem :)
zdzichuBG at 2007-7-7 1:03:55
# 4

The newer 2.6 distributions introduced a new device scheme (called "udev") where /dev entries are dynamically created when a disk is discovered. This makes the management of many disks much easier.

So I would encourage you to use a 2.6 distribution (we use RHEL V4). There are also many I/O performance and stability improvements.

truly64 at 2007-7-7 1:03:55
# 5

Hi,

I installed a Thumper using Scientific Linux 4.4, which is a rebuild of RHEL 4.4.

Apart from some bootstrap problems, I now have the box up and running, but with the following critical issue:

When building the arrays (8 RAID5), the box eventually crashes abruptly with one of the following messages:

PCI-DMA: Out of IOMMU space for 360448 bytes at device 0000:0b:01.0

PCI-DMA: Out of IOMMU space for 360448 bytes at device 0000:02:01.0

These addresses correspond to both of the Marvell controllers that 'lspci' sees.

I tried to change the IOMMU setting in the BIOS, with no success.

Changing the md rebuild rate in /proc only defers the crash.
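
For reference, the md rebuild rate is normally controlled through the speed_limit_min and speed_limit_max files under /proc/sys/dev/raid (values in KB/s per device). Below is a minimal sketch of throttling it from a script, run as root; the function name and the `raid_dir` parameter are made up for illustration, and as reported above, lowering the rate only postponed the crash on this hardware.

```python
import os


def set_md_speed_limits(min_kb_s, max_kb_s, raid_dir="/proc/sys/dev/raid"):
    """Write the md resync speed limits (KB/s per device) and return the values read back."""
    if min_kb_s > max_kb_s:
        raise ValueError("minimum must not exceed maximum")
    applied = {}
    limits = (("speed_limit_min", min_kb_s), ("speed_limit_max", max_kb_s))
    for name, value in limits:
        path = os.path.join(raid_dir, name)
        with open(path, "w") as handle:
            handle.write("%d\n" % value)
        # Read the value back so the caller sees what is actually in effect.
        with open(path) as handle:
            applied[name] = int(handle.read().strip())
    return applied
```

Something like `set_md_speed_limits(1000, 5000)` would cap the rebuild at roughly 5 MB/s per device; the numbers are arbitrary examples, not a recommendation.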

Also worth mentioning is the following message which appears during boot:

Loading sata_mv.ko module
ata7: SATA max UDMA/133 cmd 0x0 ctl 0xFFFFFF00000B6120 bmdma 0x0 irq 177
ata8: SATA max UDMA/133 cmd 0x0 ctl 0xFFFFFF00000B8120 bmdma 0x0 irq 177
Badness in __msleep at drivers/scsi/sata_mv.c:1907
Call Trace:<IRQ> <ffffffffa00421b7>{:sata_mv:__mv_phy_reset+237} <ffffffffa0041cc3>{:sata_mv:mv_channel_reset+138}
<ffffffffa0042805>{:sata_mv:mv_interrupt+594} <ffffffff80112f4a>{handle_IRQ_event+41}
<ffffffff801131c4>{do_IRQ+197} <ffffffff80110833>{ret_from_intr+0}
<EOI> <ffffffff8010e749>{default_idle+0} <ffffffff8010e769>{default_idle+32}
<ffffffff8010e7dc>{cpu_idle+26}
ata1: dev 0 ATA-7, max UDMA/133, 976773168 sectors: LBA48
ata1: dev 0 configured for UDMA/133

Has anyone had success with other kernels?

faxm0dem at 2007-7-7 1:03:55
# 6
OK, replying to myself here.

I just dropped sata_mv for the mv_sata driver, which I got from http://www.keffective.com/mvsata/FC3/

The RAID arrays are now building at over 1.6GBps (read) instead of 0.7 with the stock sata_mv.

No crash yet.
faxm0dem at 2007-7-7 1:03:55
# 7
Don't know if it will help, but the modules I load for x2100 and x4100 servers are: scsi_mod libata sata_nv usbcore usb-ohci input hid keybdev
truly64 at 2007-7-7 1:03:55
# 8

I'm running with CentOS 4.4 (RHEL 4 clone), which has the sata_mv driver included.

My system does a rolling panic any time I stress the I/O system at all (e.g., building a RAID-5 array with mdadm).

I've been looking at the materials on building/installing mv_sata (instead of sata_mv), but I can't get it to work.

It builds correctly (verified by the strings mv_sata.ko | grep magic trick), but a modprobe reports "FATAL: Module mv_sata not found". I *can* do an insmod successfully.

If I change modprobe.conf to try to force it to use mv_sata, it seems to have no effect.

If the machine is running (using sata_mv), can someone give me a step-by-step on changing over to mv_sata? Clearly I'm missing something.

TIA--

Cris

crhea at 2007-7-7 1:03:55
# 9
Cris, modprobe can only load modules that are in /lib/modules/<version>/... and its subdirectories. On the other hand, insmod can load a module from the current directory (that's why it's working for you), but it can't resolve dependencies.
zdzichuBG at 2007-7-7 1:03:55
# 10

The instructions I saw on the net show using a full path for modprobe (e.g., modprobe /home/mydir/mv_sata.ko). I also previously tried to place the module in the modules directory (/lib/modules/2.6.9-42.0.3.ELsmp/kernel/drivers/scsi/mv_sata.ko) as you suggest-- same result. I think I have an issue because of scsi_mod->libata->sata_mv dependencies.

Since the list was quiet this weekend, I tried another approach: the latest kernel from kernel.org (2.6.19.1) has a newer version of sata_mv which is MUCH more stable. I was able to set up RAID6 groups, add file systems on top of them, then let them resync. The machine has now been up for almost 2 days (it would previously croak after at most an hour of resyncing).

I'd still like to understand what I'm doing wrong with the modprobe and/or /etc/modprobe.conf-- I'd like to see what the performance difference is between mv_sata and sata_mv as discussed earlier in this thread.

crhea at 2007-7-7 1:03:55
# 11
You need to run 'depmod' before modprobe can find the module. Furthermore, add 'alias scsi_hostadapter mv_sata' to modprobe.conf and rerun mkinitrd if needed.
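
The reason depmod matters is that modprobe looks modules up in the modules.dep index that depmod generates, rather than scanning the directory tree itself. A minimal sketch of that lookup, assuming the usual "module path: dependency paths" line format of modules.dep; the function names are made up for illustration:

```python
import os


def _module_name(path):
    """Turn '/lib/modules/.../sata_mv.ko' into 'sata_mv'."""
    base = os.path.basename(path.strip())
    # Drop '.ko' (and anything after it) and treat '-' like '_', as modprobe does.
    return base.split(".ko")[0].replace("-", "_")


def parse_modules_dep(modules_dep_text):
    """Map each module name in modules.dep content to the names it depends on."""
    table = {}
    for line in modules_dep_text.splitlines():
        if ":" not in line:
            continue  # skip blank or malformed lines
        path, _, deps = line.partition(":")
        table[_module_name(path)] = [_module_name(dep) for dep in deps.split()]
    return table


def modprobe_can_find(module, modules_dep_text):
    """True if the module has an entry in the given modules.dep content."""
    return module.replace("-", "_") in parse_modules_dep(modules_dep_text)
```

If `modprobe_can_find("mv_sata", ...)` is false for the contents of /lib/modules/<version>/modules.dep, the module has been copied into place but not indexed yet, which matches the "Module mv_sata not found" message even though insmod works.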
faxm0dem at 2007-7-7 1:03:55
# 12

I finally got the mv_sata module to take. I think I was shortcutting something in the build process (the difference between building something from kernel.org and building from RedHat's RPM sources).

I now have a different problem....

On CentOS4 update 4 (RHEL4u4), there's a bug in the installation process: the installer wants to write the grub bootloader to /dev/sda (even though on an x4500 the boot disk will end up being /dev/sdy). I can click on "Advanced grub options" and select putting it on /dev/sdy1 (which is fine-- that's my /boot partition).

The result is that grub starts to boot, then generates an "Error 12: Invalid device requested". I can boot to a rescue disk, but running grub manually gives the same error when doing the "root (hd24,0)" step.

Grub's device.map file confirms that hd24 is /dev/sdy (as it should be).

Googling this error produces loads of hits, but none seem useful.

Do I need to put grub in /dev/sda's MBR, even though sda has nothing to do with the boot process?

crhea at 2007-7-7 1:03:55
# 13

I experimented with many workarounds for this.

Eventually I opted for the following:

1) Use devices sda and sdb in the anaconda partitioning step; this will install the system on the disks labeled 10 and 22.

2) On first reboot, physically swap these disks with disks 0 and 1.

If I remember correctly nothing else needs to be done after that.

faxm0dem at 2007-7-7 1:03:55
# 14

<Bang head against wall>

It would have been so much easier if the GRUB docs had pointed this type of thing out...

So, Grub uses BIOS disk order, while Linux finds the drives in a different order (based on driver load order, PCI bus order, etc.).

Since I had already loaded the drive as /dev/sdy (I didn't use the 2nd drive), all I had to do was change /boot/grub/grub.conf to use (hd0,0) rather than (hd24,0).

Once I had done this-- POOF! It works!
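
For anyone scripting the same fix, here is a minimal sketch of the grub.conf change described above: it rewrites every (hd24,...) reference to (hd0,...) and leaves other disks alone. The function name is made up, and it only transforms text; reading and writing /boot/grub/grub.conf (after making a backup) is left to the caller.

```python
import re


def remap_grub_disk(conf_text, old_disk, new_disk):
    """Replace references to (hd<old_disk>...) with (hd<new_disk>...) in grub.conf text."""
    # The lookahead ensures that remapping hd2 does not also touch hd24.
    pattern = re.compile(r"\(hd%d(?=[,)])" % old_disk)
    return pattern.sub("(hd%d" % new_disk, conf_text)
```

Calling `remap_grub_disk(text, 24, 0)` turns a line such as `root (hd24,0)` into `root (hd0,0)`. Which BIOS disk number is correct still has to be worked out per machine, since it depends on the BIOS boot order rather than on the Linux device name.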

Since Sun just released the Linux support materials for the X4500, I'll be very curious how they explain these sorts of "quirks".

Thanks faxm0dem-- you saved what was left of my sanity!

Cris

crhea at 2007-7-7 1:03:55