Archive

Category archive: ‘Classical CASE’

NCS6K: XRVM loses console

Problem

After installing an ISSU SMU, the customer reached the HostOS when connecting to the XRVM console. The XRVM itself was confirmed to be working normally.

Background

  • SAVM and XRVM run on all RPs and LCs; FCs have only SAVM. Check with “show vm” in the admin VM. SAVM and XRVM map to console 0 and 1, as follows:
  • Besides SAVM and XRVM, each RP or LC has another key component, the host system. You can check the host with the following steps: run “chvrf 0 bash”, then log in with “ssh ”.


Troubleshooting “%FABRIC-INGRESSQ-6-LINK_DOWN” on CRS

Introduction

The customer saw many ingressq ASIC errors on 0/4/CPU0. After checking, this matched a known DDTS: CSCuu86430. The issue may be triggered when CRS-3 MSCs (140G) interoperate with a CRS-X (400G) fabric. Once it is triggered, the s1rx fabric links on the CRS-X start flapping. A reload SMU is available for the 514 release.

This article shows how to troubleshoot the fabric link flapping.

Troubleshooting

1. The customer saw the following alarms:

LC/0/4/CPU0:Nov  8 00:30:26.752 : ingressq[235]: %FABRIC-INGRESSQ-6-LINK_DOWN : Ingressq: Link 26 of Asic Instance 0 has been administratively shut down. 
LC/0/1/CPU0:Nov  8 00:37:59.734 : fabricq_mgr[178]: %FABRIC-FABRICQ-3-PCL_PKT : Minor error in PCL of fabricq asic 0. PCL UC Lost Packet: CAOPCI: 0x18 (0/4, UC, LO):Lost Packet count= 1 
LC/0/4/CPU0:Nov  8 00:37:59.734 : ingressq[235]: %FABRIC-INGRESSQ-6-LINK_DOWN : Ingressq: Link 26 of Asic Instance 0 has been administratively shut down. 
LC/0/4/CPU0:Nov  8 10:27:27.265 : ingressq[235]: %FABRIC-INGRESSQ-6-LINK_DOWN : Ingressq: Link 26 of Asic Instance 0 has been administratively shut down. 
LC/0/4/CPU0:Nov  8 11:06:08.181 : ingressq[235]: %FABRIC-INGRESSQ-6-LINK_DOWN : Ingressq: Link 26 of Asic Instance 0 has been administratively shut down. 
LC/0/1/CPU0:Nov  8 11:08:46.132 : fabricq_mgr[178]: %FABRIC-FABRICQ-3-PCL_PKT : Minor error in PCL of fabricq asic 0. PCL UC Partial Packet: CAOPCI: 0x18 (0/4, UC, LO) 
LC/0/4/CPU0:Nov  8 11:18:34.733 : ingressq[235]: %FABRIC-INGRESSQ-6-LINK_DOWN : Ingressq: Link 26 of Asic Instance 0 has been administratively shut down. 
LC/0/4/CPU0:Nov  8 11:28:44.350 : ingressq[235]: %FABRIC-INGRESSQ-6-LINK_DOWN : Ingressq: Link 26 of Asic Instance 0 has been administratively shut down. 
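
When the log is long, it can help to count how often each link is reported down before digging further. The following is a minimal Python sketch for doing that offline, assuming the syslog lines keep the same layout as the alarms above. The names `LINK_DOWN_RE`, `count_link_down` and `SAMPLE_LOG` are hypothetical helpers introduced here for illustration; they are not part of IOS XR or any Cisco tool.

```python
import re
from collections import Counter

# Three lines copied from the alarm output above, used as sample input.
SAMPLE_LOG = """\
LC/0/4/CPU0:Nov  8 00:30:26.752 : ingressq[235]: %FABRIC-INGRESSQ-6-LINK_DOWN : Ingressq: Link 26 of Asic Instance 0 has been administratively shut down.
LC/0/1/CPU0:Nov  8 00:37:59.734 : fabricq_mgr[178]: %FABRIC-FABRICQ-3-PCL_PKT : Minor error in PCL of fabricq asic 0. PCL UC Lost Packet: CAOPCI: 0x18 (0/4, UC, LO):Lost Packet count= 1
LC/0/4/CPU0:Nov  8 00:37:59.734 : ingressq[235]: %FABRIC-INGRESSQ-6-LINK_DOWN : Ingressq: Link 26 of Asic Instance 0 has been administratively shut down.
"""

# Assumed line layout: <node>:<Mon> <day> <time> : <process> : %...LINK_DOWN : ... Link N of Asic Instance M ...
LINK_DOWN_RE = re.compile(
    r"^(?P<node>\S+?):(?P<time>\w{3}\s+\d+\s+[\d:.]+)\s+:.*"
    r"%FABRIC-INGRESSQ-6-LINK_DOWN\s*:.*"
    r"Link (?P<link>\d+) of Asic Instance (?P<asic>\d+)"
)


def count_link_down(log_text):
    """Count LINK_DOWN messages per (node, asic instance, link)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINK_DOWN_RE.match(line.strip())
        if match:
            key = (match.group("node"), int(match.group("asic")), int(match.group("link")))
            counts[key] += 1
    return counts


if __name__ == "__main__":
    for (node, asic, link), hits in count_link_down(SAMPLE_LOG).most_common():
        print(f"{node} asic {asic} link {link}: {hits} LINK_DOWN messages")
```

A single link that dominates the count, as Link 26 of Asic Instance 0 on LC/0/4/CPU0 does in the alarms above, points at one flapping fabric link rather than a card-wide problem.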


Troubleshooting IPoE Sessions Pending at “ACK-DPM-WAIT”

Introduction

This article explains what “ACK-DPM-WAIT” is and how to troubleshoot similar scenarios. In my case the available information was too limited to narrow the problem down, so I will update the article if the issue happens again and an RCA is found.

Problem Description

Version: 5.1.3 + individual SMUs
Platform: 9010 + Mod80 + A9K-MPA-4X10GE
BNG: IPOE, DHCP Proxy, 28k session

My customer found that some of the BNG sessions were failing. The trigger was a power supply problem at the customer site that caused the ASR9K to power-cycle. After the 9K reloaded, dhcpd and arp raised many alarms. dhcpd recovered after the process was restarted several times, but arp kept raising SPIO alarms even after process restarts. The customer had enabled arp local disable on the BNG port.

The affected sessions obtained correct addresses from DHCP, but were deleted after 15 minutes. Checking on the ASR9K, we found the affected sessions stuck in the ACK_DPM_WAIT state. The issue recovered by itself at approximately 19:00-19:30, and the arp alarms disappeared in the same time window.

How to decode TCP, UDP and RAW for IOS-XR

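As an engineer, you often run into protocol interaction problems where you need to confirm the details of the packets. The usual methods are:
1. SPAN capture
The results are the easiest to analyze, but the procedure is the most cumbersome.
2. debug
This is the most direct method, but a large volume of debug output can affect normal operation of the device.

The method below is another way to meet this need. The example uses UDP, but it applies equally to TCP and RAW: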

RP/0/RP1/CPU0:CRS2(config)#udp directory /tmp/udp
RP/0/RP1/CPU0:CRS2(config)#commit
RP/0/RP1/CPU0:CRS2(config)#ipv4 access-list hsrp-packet
RP/0/RP1/CPU0:CRS2(config-ipv4-acl)#20 permit udp any eq 1985 any eq 1985
RP/0/RP1/CPU0:CRS2(config-ipv4-acl)#30 deny ipv4 any any
RP/0/RP1/CPU0:CRS2(config-ipv4-acl)#exit
RP/0/RP1/CPU0:CRS2(config)#ipv6 access-list v6-filter
RP/0/RP1/CPU0:CRS2(config-ipv6-acl)#10 deny ipv6 any any
RP/0/RP1/CPU0:CRS2(config-ipv6-acl)#exit
RP/0/RP1/CPU0:CRS2(config)#commit
RP/0/RP1/CPU0:CRS2(config)#exit
RP/0/RP1/CPU0:CRS2#debug udp packet v4-access-list hsrp-packet v6-access-list v6-filter hex control-block location x/x/cpu0

You can check the capture at the following path:
RP/0/RP1/CPU0:CRS2#run
# cd /tmp/udp
# ls
# more xxxx


Multi Hierarchical CEF / Load Share

Environment

 --------+--------------------+---------
         |   22.22.22.22/32   |
         |                    |
    +----+----+          +----+----+
    | 2.2.2.2 |          | 3.3.3.3 |
    | RouterA |          | RouterB |
    +-\----\--+          +-/---/---+
       \    \             /   /
        \\   \           /   /
          \   \         /  //
           \   \F2/0   /  /
            \\  \     /  /
         F1/0 \  \ F3/0 / F4/0
               *--\-/--*
               |1.1.1.1|
               | CoreA |
               +-------+

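This applies to any device running IOS, including GSR, 7609, and so on.
Early releases did not support multi-hierarchical CEF and could only forward after a single level of recursion. This imposed many limitations, for example in the dual-PE topology discussed here. Starting from certain releases (both IOS and IOX), CEF behavior changed and multi-level CEF is supported. The behavior still depends on the platform, though: no release on the GSR supports this multi-level CEF.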
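
To make the idea of recursive resolution concrete, here is a toy Python sketch that walks a small table until only outgoing interfaces remain. It is an illustration only, not how CEF is implemented. The addresses come from the diagram above; the mapping of F1/0 and F2/0 to RouterA and of F3/0 and F4/0 to RouterB is an assumption read off the drawing, and `ROUTES` and `resolve` are hypothetical names.

```python
from typing import Dict, List

# Toy table: destination -> list of next hops.
# A next hop that is itself a key in the table needs another level of recursion;
# anything else is treated as an outgoing interface.
ROUTES: Dict[str, List[str]] = {
    "22.22.22.22": ["2.2.2.2", "3.3.3.3"],  # prefix reachable via both routers
    "2.2.2.2": ["F1/0", "F2/0"],            # assumed links towards RouterA
    "3.3.3.3": ["F3/0", "F4/0"],            # assumed links towards RouterB
}


def resolve(dest: str, routes: Dict[str, List[str]], depth: int = 0, max_depth: int = 8) -> List[str]:
    """Return the flattened list of outgoing interfaces for dest."""
    if dest not in routes:
        return [dest]  # not a table entry, so treat it as an interface
    if depth >= max_depth:
        raise RecursionError(f"recursion limit reached while resolving {dest}")
    paths: List[str] = []
    for next_hop in routes[dest]:
        paths.extend(resolve(next_hop, routes, depth + 1, max_depth))
    return paths


if __name__ == "__main__":
    print(resolve("22.22.22.22", ROUTES))
```

In this toy model the prefix resolves to four interfaces, two behind each next hop, which is the kind of layered load sharing the title refers to.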