IT Blog

UCS

UCS: Warning: there are pending SEEPROM errors on one or more devices, failover may not complete

by on Dec.03, 2012, under Cisco, UCS

In the UCS CLI, running ‘show cluster state’ returns a warning for one of the chassis.

UCS-B # show cluster state
Cluster Id: 0xf122a7f83dba11e0-0x9a4c123573c4f1c4

B: UP, PRIMARY
A: UP, SUBORDINATE

HA READY
Detailed state of the device selected for HA storage:
Chassis 1, serial: FOX1234567A, state: active
Chassis 2, serial: FOX1234567B, state: active
Chassis 5, serial: FOX1234567C, state: active with errors

Fabric B, chassis-seeprom local IO failure:
FOX1234567C READ_FAILED, error: TIMEOUT, error code: 10, error count: 7
Warning: there are pending SEEPROM errors on one or more devices, failover may not complete
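If you have many chassis, it can be handy to check this output with a script instead of by eye. Below is a minimal Python sketch, assuming the output has already been captured as text (how you capture it is outside this sketch). The function names and the regular expression are my own and are based only on the sample output above, so treat them as assumptions and not as anything Cisco provides.

```python
import re

# Sample text copied from the output above.
SAMPLE = """\
B: UP, PRIMARY
A: UP, SUBORDINATE

HA READY
Detailed state of the device selected for HA storage:
Chassis 1, serial: FOX1234567A, state: active
Chassis 2, serial: FOX1234567B, state: active
Chassis 5, serial: FOX1234567C, state: active with errors

Fabric B, chassis-seeprom local IO failure:
FOX1234567C READ_FAILED, error: TIMEOUT, error code: 10, error count: 7
Warning: there are pending SEEPROM errors on one or more devices, failover may not complete
"""

# Assumed line layout, taken from the sample: "Chassis N, serial: S, state: ..."
CHASSIS_RE = re.compile(r"^Chassis (\d+), serial: ([^,\s]+), state: (.+)$")


def chassis_with_errors(output):
    """Return (chassis_id, serial, state) for each chassis whose state is not plain 'active'."""
    problems = []
    for line in output.splitlines():
        match = CHASSIS_RE.match(line.strip())
        if match and match.group(3).strip() != "active":
            problems.append((int(match.group(1)), match.group(2), match.group(3).strip()))
    return problems


def has_seeprom_warning(output):
    """True if the pending-SEEPROM-errors warning line is present."""
    return "pending SEEPROM errors" in output


if __name__ == "__main__":
    print(chassis_with_errors(SAMPLE))
    print(has_seeprom_warning(SAMPLE))
```

The sketch only flags the chassis; it does not try to interpret the error counters.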

In the sam_techsupportinfo log you’ll see the following message:
Creation Time: 2012-10-12T01:12:21.217
ID: 2712562
Description: device FOX1234567C, error accessing shared-storage
Affected Object: sys/mgmt-entity-B
Trigger: Oper
User: internal
Cause: Device Shared Storage Io Error
Code: E4196537
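If you want to search a large log for records like this one, a small parser helps. This is a rough Python sketch that assumes each record is a block of "Key: value" lines exactly as shown above; the helper names are hypothetical and the real log may contain other layouts.

```python
# Sample record copied from the log excerpt above.
SAMPLE_RECORD = """\
Creation Time: 2012-10-12T01:12:21.217
ID: 2712562
Description: device FOX1234567C, error accessing shared-storage
Affected Object: sys/mgmt-entity-B
Trigger: Oper
User: internal
Cause: Device Shared Storage Io Error
Code: E4196537
"""


def parse_record(text):
    """Split 'Key: value' lines into a dict; lines without ': ' are ignored."""
    fields = {}
    for line in text.splitlines():
        # Split on the first ': ' only, so the colons inside the timestamp survive.
        key, sep, value = line.partition(": ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def is_shared_storage_error(fields):
    """True if the record's Cause matches the one seen in this post."""
    return fields.get("Cause") == "Device Shared Storage Io Error"


if __name__ == "__main__":
    record = parse_record(SAMPLE_RECORD)
    print(record["Affected Object"], is_shared_storage_error(record))
```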

This is a known Cisco bug, CSCtu17144, and here is what needs to be done:

If the fault condition stays on or keeps being cleared and re-raised, try the following workarounds:
1. Reboot the IO module.
2. Remove and re-seat the IO module. Make sure the module is in contact with the backplane firmly.

I’ve had this problem a couple of times, and resetting the IO module was enough in both cases.


UCS: configuration-failed; Code: F0170; connection-placement; There are not enough resources overall

by on Dec.02, 2012, under Cisco, UCS

Here is an interesting issue that I ran into with a Cisco UCS blade.
I needed to move a service profile from one blade to another. This process should not cause any problems, but it did. Disassociation worked fine, but when I tried to associate the same profile with a different blade I ran into problems.

The first thing I noticed was a Config Failure error in the Status field:

The Configuration error was:
connection-placement
There are not enough resources overall

Not enough vHBAs available
Not enough cNICs available (continue reading…)


UCS: After installing or replacing DIMMs, they are shown as disabled in UCS Manager (invalid FRU)

by on Nov.29, 2012, under Cisco, UCS

Here is a problem you may see when replacing or installing new DIMMs in UCS blades.
The blade will boot, but the newly installed DIMMs might show as disabled with an invalid FRU error.
Error codes F0844 and F0502 are logged:

When you check the inventory of the blade and go into Memory, you’ll see that Capacity and Clock are Unspecified.

SSH into the UCSM IP.
Type:
scope server x/y  (where x is your chassis ID and y is the server ID of the server that is having problems)
show memory  (this lists the memory information of the blade)
Server 1/1:
Array 1:
DIMM Location Presence Overall Status Type Capacity (MB) Clock
---- -------- -------- -------------- ---- ------------- -----
(continue reading…)

UCS: Blade is stuck on discovery after UCS firmware upgrade (unidentified FRU)

by on Nov.12, 2012, under Cisco, UCS

Here is a pretty common problem in the UCS 2.0 release.
At any stage of a UCS upgrade, one or more blades go into discovery mode and never finish it. Depending on the version, they can get stuck at any percentage, but usually between 4% and 40%.
Most of the time the SEEPROM of the M81KR CNA card gets corrupted. Because of this corruption the checksum fails, UCS can no longer recognize the mezzanine card, and this prevents discovery from finishing.
You can see the following errors when this happens:
Configuration Error: adaptor-inoperable. Discovery State: Insufficiently Equipped.
Adapter 1 in server 1/1 has unidentified FRU 

There are multiple Cisco bugs for this issue (CSCub16754, CSCty34034, CSCub48862, CSCub99354), and I’ve seen it happen on the 2.0(1q), 2.0(2r) and 2.0(3a) releases.
Unfortunately the issue is not fixed and there is no workaround. The good news is that when it occurs the fix is simple and quick and needs no hardware replacement, but only Cisco TAC, or someone with access to their internal resources, can apply it.

To verify if corruption occurred you can do the following:

  1. SSH to UCSM IP
  2. Enter connect cimc x/y (Chassis/Blade)
  3. Enter mezz1fru (on versions starting from 2.0(3a), enter fru instead)
    If corruption has occurred the last line of the output will show something like
    ‘Checksum Failed For: Board Area!’
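If you save the output of that command to a file or a variable, checking for the failure can be scripted. Here is a minimal Python sketch. The only thing it relies on from the real output is the ‘Checksum Failed For: ...!’ line quoted above; the function name is hypothetical, and the rest of the sample text is a placeholder because the full FRU dump format is not shown here.

```python
import re

# Matches the line quoted in the post, e.g. "Checksum Failed For: Board Area!"
# and captures the area name without the trailing "!".
CHECKSUM_RE = re.compile(r"Checksum Failed For:\s*(.+?)!?\s*$")


def failed_fru_areas(output):
    """Return the names of all FRU areas reported with a failed checksum."""
    areas = []
    for line in output.splitlines():
        match = CHECKSUM_RE.search(line)
        if match:
            areas.append(match.group(1))
    return areas


if __name__ == "__main__":
    # Placeholder sample: only the last line comes from the post.
    sample = "(earlier FRU output omitted)\nChecksum Failed For: Board Area!"
    print(failed_fru_areas(sample))
```

An empty list means no checksum failure line was found in the text you passed in.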

The other method to check is to look at the logs. (continue reading…)


UCS: How to update Capability Catalog in UCS Manager

by on Nov.09, 2012, under Cisco, UCS

Here is a guide on how to update the Capability Catalog in UCS Manager. The Capability Catalog is updated every time you upgrade the UCS firmware, but you might need to update it separately when new hardware is added to the UCS infrastructure and upgrading the whole UCS is not possible.

1. Log in to UCS Manager
2. Select the Admin tab and change the Filter to Capability Catalog

3. Verify the version of Capability Catalog that is currently installed

(continue reading…)


UCS: IOM POST failure, Code: F0481

by on Feb.04, 2012, under Cisco, UCS

I got the following error on a Cisco UCS IO module; it came out of nowhere.

left IOM 4/1 (A) POST failure
Code: F0481

The fix can be very simple: try reseating or resetting the IO module that is giving the problem.
Before you attempt that, make sure the other IO module in the chassis is working fine and that all the blades in the chassis have another path available while the IO module is reseated. If you have only one IO module, you need to power down the blades first.

Here I’ll show how to reset an IO module from UCSM.

First select the IO module that is giving problems and click Reset IO Module.

You’ll be prompted to confirm.

You should now see that the IO module is not reachable and that one path for the chassis is down.
Critical errors and warnings will also be logged.

Wait for the IO module to come back and the errors to clear.

If all is clear, monitor the module for a couple of weeks or so. I’ve seen the error reappear after 2-7 days. If the same error appears on the IO module again, call Cisco support to get the module replaced.

 


UCS: Cisco UCS Emulator 2.0 was released

by on Nov.23, 2011, under Cisco, UCS

Cisco released UCS Emulator 2.0 last week.
You can grab it here.



UCS: Using Cisco UCS Emulator

by on Aug.28, 2011, under Cisco, UCS

Cisco Unified Computing System (UCS) is quite new compared to well-known server vendors like HP, Dell and IBM, but it has achieved quite a lot in a short period of time.
For anyone who wants to see what UCS is or simply try it out, Cisco has released the UCS Emulator.
The UCS Emulator is a pretty powerful tool that lets users emulate a UCS environment, so you can add your own chassis, servers, CPUs, RAM, PSUs, etc. Another interesting feature is that it can import the hardware configuration from a live environment. It cannot import everything, but it gets pretty close.
This post will show you how to set up and use the UCS Emulator.
Before you start, you’ll need to download the UCS Emulator and install VMware Player or Workstation on your computer.
http://developer.cisco.com/web/unifiedcomputing/ucsemulatordownload download the 7z file (you’ll need a Cisco login to download it; just register). The emulator requires 1GB of RAM dedicated to the VM, so make sure you have enough memory.
http://www.vmware.com/go/downloadplayer (again you need to login to download)

Extract the files from the 7z archive to a new folder.
Install VMware Player/Workstation; this will require a reboot.
(continue reading…)
