Posts Tagged ‘discovery’

UCS: After firmware upgrade B230 M1 blade is failing discovery at 4%


15 Dec

After a firmware upgrade, a B230 M1 blade gets stuck in discovery at 4% and shows the following errors:

no connection to MC endpoint

Error retrieving Server Params-MC Error(-6): Connection is closing

No DIMMs present

The errors may rotate, but discovery stays stuck at 4%.

1. Check the firmware versions and confirm that all are as expected.

2. Reset the CIMC and, once that is done, reacknowledge the server.
3. Check the firmware versions of the blade again. The Running Board Controller version is now set to 00000000.

4. Activate the Board Controller with the correct version and wait until activation is finished.

5. Reacknowledge the server.
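
If you collect firmware versions in a script, the check in step 3 can be automated. Here is a minimal Python sketch, assuming the Running Board Controller version has already been read into a string. The function name is hypothetical and is not part of UCS Manager.

```python
def board_controller_needs_activation(running_version: str) -> bool:
    """Return True when the running Board Controller version is blank or all zeros."""
    value = running_version.strip()
    # An empty string, or one made only of "0" characters (for example
    # "00000000"), is treated as "no valid running version".
    return value == "" or set(value) == {"0"}
```

A blade for which this returns True is a candidate for step 4 (activating the Board Controller with the correct version).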

UCS: Blade is stuck on discovery after UCS firmware upgrade (unidentified FRU)


12 Nov

Here is a pretty common problem in the UCS 2.0 release.
At any stage of a UCS upgrade, one or more blades can go into discovery and never finish it. Depending on the version they can get stuck at any percentage, but usually between 4% and 40%.
Most of the time the cause is corruption in the SEEPROM of the M81KR CNA card. Because of this corruption the checksum fails, UCS can no longer recognize the mezzanine card, and this prevents discovery from finishing.
You can see the following errors when this happens:
Configuration Error: adaptor-inoperable. Discovery State: Insufficiently Equipped.
Adapter 1 in server 1/1 has unidentified FRU 
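
If you export faults to text and want to find the affected blades, the second message can be matched with a regular expression. This is a rough Python sketch. The pattern is modelled only on the fault text quoted above, so other releases may word it differently, and the function name is hypothetical.

```python
import re

# Pattern based on the fault text quoted in this post (assumption: the
# wording is the same in your release).
FRU_FAULT = re.compile(r"Adapter (\d+) in server (\d+)/(\d+) has unidentified FRU")


def parse_unidentified_fru(line: str):
    """Return (adapter, chassis, blade) as ints, or None if the line is a different message."""
    match = FRU_FAULT.search(line)
    if match is None:
        return None
    adapter, chassis, blade = (int(group) for group in match.groups())
    return adapter, chassis, blade
```

The chassis and blade numbers it returns are the x/y values used with connect cimc when checking for corruption.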

There are multiple Cisco bugs for this issue: CSCub16754, CSCty34034, CSCub48862 and CSCub99354. I’ve seen it happen on the 2.0(1q), 2.0(2r) and 2.0(3a) releases.
Unfortunately the issue is not fixed and there is no workaround. The good news is that when it occurs the fix is simple and quick, and no hardware replacement is needed. However, only Cisco TAC, or someone with access to their internal resources, can apply it.

To verify whether corruption has occurred, do the following:

  1. SSH to the UCSM IP.
  2. Enter connect cimc x/y (chassis/blade).
  3. Enter mezz1fru. On versions starting from 2.0(3a), enter fru instead.
    If corruption has occurred, the last line of the output will show something like
    ‘Checksum Failed For: Board Area!’
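
If you have saved the command output to a file or a string, the two decisions in step 3 can be sketched in Python. This is only an illustration under assumptions: release strings are expected to look like the ones quoted in this post, for example 2.0(3a), and all function names are hypothetical. It does not connect to UCS and only works on text you have already captured.

```python
import re


def parse_release(release: str):
    """Split a release string such as "2.0(3a)" into a comparable tuple (2, 0, 3, "a")."""
    match = re.fullmatch(r"(\d+)\.(\d+)\((\d+)([a-z]*)\)", release.strip())
    if match is None:
        raise ValueError(f"unrecognised release string: {release!r}")
    major, minor, patch, letter = match.groups()
    return int(major), int(minor), int(patch), letter


def fru_command(release: str) -> str:
    """Pick the command from step 3: fru from 2.0(3a) onward, mezz1fru before that."""
    return "fru" if parse_release(release) >= (2, 0, 3, "a") else "mezz1fru"


def checksum_failed(output: str) -> bool:
    """Return True if the last non-empty line of the output reports a checksum failure."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    return bool(lines) and "Checksum Failed For:" in lines[-1]
```

For example, fru_command("2.0(2r)") selects mezz1fru, and checksum_failed() returns True for captured output whose last line is ‘Checksum Failed For: Board Area!’.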

The other way to check is to look at the logs.

IT Blog

Just another blog on Kozeniauskas.com Network