Troubleshooting “%FABRIC-INGRESSQ-6-LINK_DOWN” on CRS

Introduction

The customer saw many ingressq ASIC errors on 0/4/CPU0. After checking, the symptoms appear to match a known DDTS: CSCuu86430. The issue may be triggered when CRS-3 MSCs (140G) interoperate with a CRS-X (400G) fabric. Once the issue is triggered, the s1rx fabric link on the CRS-X fabric card starts flapping. A reload SMU is available for release 5.1.4.

This article shows how to troubleshoot the fabric link flapping.

Troubleshooting

1. The customer saw the following alarms:

LC/0/4/CPU0:Nov  8 00:30:26.752 : ingressq[235]: %FABRIC-INGRESSQ-6-LINK_DOWN : Ingressq: Link 26 of Asic Instance 0 has been administratively shut down. 
LC/0/1/CPU0:Nov  8 00:37:59.734 : fabricq_mgr[178]: %FABRIC-FABRICQ-3-PCL_PKT : Minor error in PCL of fabricq asic 0. PCL UC Lost Packet: CAOPCI: 0x18 (0/4, UC, LO):Lost Packet count= 1 
LC/0/4/CPU0:Nov  8 00:37:59.734 : ingressq[235]: %FABRIC-INGRESSQ-6-LINK_DOWN : Ingressq: Link 26 of Asic Instance 0 has been administratively shut down. 
LC/0/4/CPU0:Nov  8 10:27:27.265 : ingressq[235]: %FABRIC-INGRESSQ-6-LINK_DOWN : Ingressq: Link 26 of Asic Instance 0 has been administratively shut down. 
LC/0/4/CPU0:Nov  8 11:06:08.181 : ingressq[235]: %FABRIC-INGRESSQ-6-LINK_DOWN : Ingressq: Link 26 of Asic Instance 0 has been administratively shut down. 
LC/0/1/CPU0:Nov  8 11:08:46.132 : fabricq_mgr[178]: %FABRIC-FABRICQ-3-PCL_PKT : Minor error in PCL of fabricq asic 0. PCL UC Partial Packet: CAOPCI: 0x18 (0/4, UC, LO) 
LC/0/4/CPU0:Nov  8 11:18:34.733 : ingressq[235]: %FABRIC-INGRESSQ-6-LINK_DOWN : Ingressq: Link 26 of Asic Instance 0 has been administratively shut down. 
LC/0/4/CPU0:Nov  8 11:28:44.350 : ingressq[235]: %FABRIC-INGRESSQ-6-LINK_DOWN : Ingressq: Link 26 of Asic Instance 0 has been administratively shut down. 
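
When the log is long, it can help to count the LINK_DOWN messages per node, ASIC instance and link so the affected link stands out. The following is a minimal sketch, not a Cisco tool: the function name `count_link_down` is hypothetical, and the regular expression is written only against the message format shown above.

```python
import re
from collections import Counter

# Regex written against the LINK_DOWN lines shown above; other releases
# may format the message differently (assumption).
LINK_DOWN = re.compile(
    r"^LC/(?P<node>[^:]+):.*%FABRIC-INGRESSQ-6-LINK_DOWN : "
    r"Ingressq: Link (?P<link>\d+) of Asic Instance (?P<asic>\d+)"
)


def count_link_down(lines):
    """Return a Counter keyed by (node, asic instance, link id)."""
    counts = Counter()
    for line in lines:
        match = LINK_DOWN.search(line.strip())
        if match:
            key = (match["node"], int(match["asic"]), int(match["link"]))
            counts[key] += 1
    return counts
```

Feeding the saved syslog lines to `count_link_down(...)` and printing `most_common()` lists the noisiest links first. Lines that are not LINK_DOWN messages, such as the PCL_PKT errors, are ignored.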


2. FPD and Platform info:

--------------------------------------------------------------------------------
0/4/CPU0     140G-MSC                   0.7   lc   rommonA 0       2.07     Yes
                                              lc   rommon  0       2.07     Yes
                                              lc   fpga1   0       0.08     No 
                                              lc   fpga2   0       0.36     No 
--------------------------------------------------------------------------------
0/4/CPU0     14-10GBE                   0.81  lc   fpga3   1      42.00     No 
--------------------------------------------------------------------------------

0/SM1/SP      FC-400G/S(SP)     N/A                IOS XR RUN      PWR,NSHUT,MON
0/SM2/SP      FC-400G/S(SP)     N/A                IOS XR RUN      PWR,NSHUT,MON
0/SM3/SP      FC-400G/S(SP)     N/A                IOS XR RUN      PWR,NSHUT,MON
0/SM4/SP      FC-400G/S(SP)     N/A                IOS XR RUN      PWR,NSHUT,MON
0/SM5/SP      FC-400G/S(SP)     N/A                IOS XR RUN      PWR,NSHUT,MON
0/SM6/SP      FC-400G/S(SP)     N/A                IOS XR RUN      PWR,NSHUT,MON
0/SM7/SP      FC-400G/S(SP)     N/A                IOS XR RUN      PWR,NSHUT,MON

3. Checking ingressq link status:

#admin show controllers ingressq fabric links location 0/4/CPU0
Sat Nov 11 15:50:18.867 Beijing
Ingressq ASIC instance 0
----------------------------------------------------.
Ingressq link state
plane-id    link-id     ADMIN-STATE OPER-STATE  AVAIL-STATE UP-COUNT    
----------------------------------------------------.
0           0           UP          UP          UP          1           
0           8           UP          UP          UP          1           
0           16          UP          UP          UP          1           
0           24          UP          UP          UP          1           
0           32          UP          UP          UP          1           
0           40          UP          UP          UP          1           
1           1           UP          UP          UP          1           
1           9           UP          UP          UP          1           
1           17          UP          UP          UP          1           
1           25          UP          UP          UP          1           
1           33          UP          UP          UP          1           
1           41          UP          UP          UP          1           
2           2           UP          UP          UP          2           
2           10          UP          UP          UP          2           
2           18          UP          UP          UP          2           
2           26          UP          UP          UP          435  <<<       
2           34          UP          UP          UP          2           
2           42          UP          UP          UP          2           
3           3           UP          UP          UP          1           
3           11          UP          UP          UP          1           
3           19          UP          UP          UP          1           
3           27          UP          UP          UP          1           
3           35          UP          UP          UP          1           
3           43          UP          UP          UP          1           
4           4           UP          UP          UP          1           
4           12          UP          UP          UP          1           
4           20          UP          UP          UP          1           
4           28          UP          UP          UP          1           
4           36          UP          UP          UP          1           
4           44          UP          UP          UP          1           
5           5           UP          UP          UP          1           
5           13          UP          UP          UP          1           
5           21          UP          UP          UP          1           
5           29          UP          UP          UP          1           
5           37          UP          UP          UP          1           
5           45          UP          UP          UP          1           
6           6           UP          UP          UP          1           
6           14          UP          UP          UP          1           
6           22          UP          UP          UP          1           
6           30          UP          UP          UP          1           
6           38          UP          UP          UP          1           
6           46          UP          UP          UP          1           
7           7           UP          UP          UP          1           
7           15          UP          UP          UP          1           
7           23          UP          UP          UP          1           
7           31          UP          UP          UP          1           
7           39          UP          UP          UP          1           
7           47          UP          UP          UP          1           
----------------------------------------------------.
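
In the table above, link 26 on plane 2 has an UP-COUNT of 435 while every other link shows 1 or 2. With many line cards, the same check can be scripted. The sketch below assumes the six-column layout shown above; the function name `flapping_links` and the default threshold of 10 are hypothetical choices for illustration only.

```python
def flapping_links(output, threshold=10):
    """Return (plane_id, link_id, up_count) for rows whose UP-COUNT
    is above the threshold. The threshold is an arbitrary example value."""
    suspects = []
    for line in output.splitlines():
        fields = line.split()
        # Data rows: plane-id link-id ADMIN OPER AVAIL UP-COUNT [marker]
        if len(fields) < 6:
            continue
        if not (fields[0].isdigit() and fields[1].isdigit()
                and fields[5].isdigit()):
            continue  # header, separator or timestamp line
        up_count = int(fields[5])
        if up_count > threshold:
            suspects.append((int(fields[0]), int(fields[1]), up_count))
    return suspects
```

A link that has come up hundreds of times while its neighbours came up once or twice is the one to follow to the fabric card in the next steps.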

4. Checking s1rx link

#admin show controllers fabric link port s1rx brief            
--------------------------------------------------------------------------------
0/SM2/SP/1/56/V3     UP/UP          0/4/CPU0/0/26/V1
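
The row above ties ingressq link 26 of ASIC 0 on 0/4/CPU0 to s1rx port 56 of SFE ASIC 1 on 0/SM2/SP. A small parser for such rows could look like the sketch below. It assumes the R/S/M/A/P port notation used in the later outputs and treats the trailing `V3`/`V1` tag as an opaque suffix; `FabricPort`, `parse_port` and `parse_brief_row` are hypothetical names.

```python
from typing import NamedTuple


class FabricPort(NamedTuple):
    node: str   # rack/slot/module, e.g. "0/SM2/SP"
    asic: int   # ASIC instance on that node
    port: int   # port (link) number on that ASIC


def parse_port(text):
    """Split 'R/S/M/A/P[/tag]' into node, ASIC and port; the tag is ignored."""
    parts = text.split("/")
    return FabricPort("/".join(parts[:3]), int(parts[3]), int(parts[4]))


def parse_brief_row(row):
    """Parse one '<sfe port>  <admin>/<oper>  <other end>' row."""
    near, state, far = row.split()
    admin, oper = state.split("/")
    return parse_port(near), admin, oper, parse_port(far)
```

The far-end port number returned here (26) is the same link ID reported in the LINK_DOWN alarms, which confirms that the right s1rx port is being examined.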

5. Checking s1rx link stats

#admin show controllers fabric link port s1rx statistics brief
Sat Nov 11 15:50:55.460 Beijing
Total racks: 1

Rack 0:
  Flags: E-D - Exceeded display width.
               Check detail option.
 
      SFE  Port                In                In         CE       UCE    PE 
      R/S/M/A/P            Data Cells        Idle Cells    Cells    Cells Cells
--------------------------------------------------------------------------------
  0/SM2/SP/1/56        316659395355       3994054892193        0        0    0

6. Checking s1rx link detail status

#admin show controllers fabric link port s1rx 0/SM2/SP/1/56 detail
Sat Nov 11 16:18:53.829 Beijing

 Sfe Port       Admin  Oper   Avail           Down     Sfe BP Port BP  Other
 R/S/M/A/P      State  State  State           Flags    Role   Role     End
 -------------------------------------------------------------------------
 0/SM2/SP/1/56/V3 UP    UP     UP                                   0/4/CPU0/0/26
 ---------------------------------------------------
 Link Type  Pin1 Name           Pin2 Name
 ---------------------------------------------------
 CHASSIS    C14                 G24                 

+-----------------------------------------------------------------------+
| Timestamp               Flags  Event                    Direction     |
+-----------------------------------------------------------------------+
2017 Nov 11 14:19:51.742  l      ADMIN_UP                  INTERNAL    
2017 Nov 11 14:19:51.749  l      ADMIN_UP                  FSDB->DRIVER
2017 Nov 11 14:19:51.752  l      DOWN                      DRIVER->FSDB
2017 Nov 11 14:19:51.809  l      UP                        DRIVER->FSDB
2017 Nov 11 14:19:51.809         ADMIN_UP                  INTERNAL    
2017 Nov 11 14:19:51.814         ADMIN_UP                  FSDB->DRIVER
2017 Nov 11 14:43:15.363         DOWN                      DRIVER->FSDB
2017 Nov 11 14:43:15.363  l      ADMIN_UP                  INTERNAL    
2017 Nov 11 14:43:15.367  l      ADMIN_UP                  FSDB->DRIVER
2017 Nov 11 14:43:15.417  l      DOWN                      DRIVER->FSDB
2017 Nov 11 14:43:15.494  l      UP                        DRIVER->FSDB
2017 Nov 11 14:43:15.494         ADMIN_UP                  INTERNAL    
2017 Nov 11 14:43:15.499         ADMIN_UP                  FSDB->DRIVER
2017 Nov 11 15:52:23.291         DOWN                      DRIVER->FSDB
2017 Nov 11 15:52:23.291  l      ADMIN_UP                  INTERNAL    
2017 Nov 11 15:52:23.296  l      ADMIN_UP                  FSDB->DRIVER
2017 Nov 11 15:52:23.345  l      DOWN                      DRIVER->FSDB
2017 Nov 11 15:52:23.420  l      UP                        DRIVER->FSDB
2017 Nov 11 15:52:23.420         ADMIN_UP                  INTERNAL    
2017 Nov 11 15:52:23.421         ADMIN_UP                  FSDB->DRIVER
 -------------------------------------------------------------------------
      Neighbors
 -------------------------------------------------------------------------
s1rx/0/SM2/SP/1/58               ingressqtx/0/4/CPU0/0/10
s1rx/0/SM2/SP/1/50               ingressqtx/0/6/CPU0/0/10
 -------------------------------------------------------------------------
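
The event history above contains three DOWN/UP cycles. To see how often the link flaps, the time between successive UP events reported by the driver can be computed. This is a rough sketch that assumes the timestamp and column layout shown above and an English locale for the month abbreviation; `up_intervals` is a hypothetical helper name.

```python
import re
from datetime import datetime

# Timestamp, optional flag column, event name, direction.
EVENT = re.compile(
    r"^(?P<ts>\d{4} \w{3} +\d+ \d{2}:\d{2}:\d{2}\.\d{3})\s+(?:l\s+)?"
    r"(?P<event>\S+)\s+(?P<direction>\S+)\s*$"
)


def up_intervals(output):
    """Return seconds between successive 'UP DRIVER->FSDB' events."""
    ups = []
    for line in output.splitlines():
        match = EVENT.match(line.strip())
        if match and match["event"] == "UP" \
                and match["direction"] == "DRIVER->FSDB":
            ups.append(datetime.strptime(match["ts"],
                                         "%Y %b %d %H:%M:%S.%f"))
    return [(b - a).total_seconds() for a, b in zip(ups, ups[1:])]
```

For the history shown above this gives gaps of about 23 minutes and about 69 minutes, so the flaps recur but not at a fixed interval.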

7. Checking s1rx flapping status

#admin show controllers sfe link-info rx 0 127 flap instance 1 location 0/sm2/sp
Sat Nov 11 16:19:01.596 Beijing
-------------------------------------------------------------------------
Node ID:0/SM2/SP
Link ID              Oper       Link       Admin     
                     Status     Errors     Shuts      Bringdowns
-------------------------------------------------------------------------
0/SM2/SP/1/56        UP         327        0         0

8. Checking asic error:

#admin show asic-errors all summary location 0/sm2/sp
Sat Nov 11 16:19:19.718 Beijing
************************************************************
*               Superstar ASIC Error Summary               *
************************************************************
Instance              : 0
Number of nodes       : 0
SBE error count       : 0
MBE error count       : 0
Parity error count    : 0
Generic error count   : 0
Reset error count     : 0
Barrier error count   : 0
Unexpected error count: 0
Link error count      : 0
OOR Threshold count   : 0
BP error count        : 0
IO error count        : 0
Ucode error count     : 0
Config error count    : 0
Indirect error count  : 0
--------------------
Instance              : 1
Number of nodes       : 2
SBE error count       : 0
MBE error count       : 0
Parity error count    : 0
Generic error count   : 0
Reset error count     : 0
Barrier error count   : 0
Unexpected error count: 0
Link error count      : 327 <<<
OOR Threshold count   : 0
BP error count        : 0
IO error count        : 0
Ucode error count     : 0
Config error count    : 0
Indirect error count  : 0
--------------------

9. Checking ASIC error detail

************************************************************
*                   Instance       : 1                     *
************************************************************
************************************************************
*                    Single Bit Errors                     *
************************************************************
************************************************************
*                   Multiple Bit Errors                    *
************************************************************
************************************************************
*                      Parity Errors                       *
************************************************************
************************************************************
*                      Barrier Errors                      *
************************************************************
************************************************************
*                    Unexpected Errors                     *
************************************************************
************************************************************
*                       Link Errors                        *
************************************************************
FULLQ_B, FC-400G/S, 0/SM2/SP, sfe[1]
Name            : DORM3.orl_b_csrs.orl_err_hier_int.orl_HW_LINK_SHUTDOWN_leaf_int.int_LOL_EVENT_LINK0
Leaf ID         : 0x160600ca
Thresh/period(s): 20/day
Error count     : 327
Last clearing   : Sun Oct 29 05:50:15 2017
Last N errors   : 50
--------------------------------------------------------------
......
Last N errors.
@Time, Error-Data
------------------------------------------
Nov 11 00:14:22.213982: 
                       Error description: DORM: 3 ORL: 1 Stage: 1 Data link: s1rx/0/SM2/SP/1/56
Nov 11 00:47:37.787079: 
                       Error description: DORM: 3 ORL: 1 Stage: 1 Data link: s1rx/0/SM2/SP/1/56
Nov 11 01:01:51.484478: 
                       Error description: DORM: 3 ORL: 1 Stage: 1 Data link: s1rx/0/SM2/SP/1/56
Nov 11 01:21:48.864781: 
                       Error description: DORM: 3 ORL: 1 Stage: 1 Data link: s1rx/0/SM2/SP/1/56
Nov 11 02:09:06.313837: 
                       Error description: DORM: 3 ORL: 1 Stage: 1 Data link: s1rx/0/SM2/SP/1/56
Nov 11 02:42:42.938339: 
                       Error description: DORM: 3 ORL: 1 Stage: 1 Data link: s1rx/0/SM2/SP/1/56
......
--------------------------------------------------------------
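
The detail output reports a threshold of 20/day and 327 errors since the last clearing. A quick way to put the count in perspective is to compute the average error rate over that period. The sketch below assumes the detail output was collected at about the same time as the summary in step 8 (Nov 11 16:19:19 2017); `errors_per_day` is a hypothetical helper, and an average rate is only a rough indicator, since how the platform actually evaluates the threshold is not shown in the output.

```python
from datetime import datetime


def errors_per_day(error_count, last_clearing, observed_at):
    """Average number of errors per day since the counters were cleared."""
    elapsed_days = (observed_at - last_clearing).total_seconds() / 86400
    if elapsed_days <= 0:
        raise ValueError("observed_at must be later than last_clearing")
    return error_count / elapsed_days


# Values taken from the outputs above; the observation time is an assumption.
rate = errors_per_day(
    327,
    last_clearing=datetime(2017, 10, 29, 5, 50, 15),
    observed_at=datetime(2017, 11, 11, 16, 19, 19),
)
print(f"average link errors per day: {rate:.1f}")
```

With these inputs the average is roughly 24 errors per day, above the 20/day figure shown in the Thresh/period field.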