Archive

Posts tagged ‘ASR9k’

Automatically checking shared memory utilization for IOX with Python

Introduction

In some scenarios we need to monitor data on a router or switch automatically. This article shows an example of how to check shared memory utilization, and you can easily adapt the script to your own requirements.

Prepare

Because “telnetlib” could not reliably match expected messages with its read_until() function (there is no precise control over when the output arrives in the buffer), I switched to “expect”. I followed Bo’s example, Python Expect Demo; IBM also has a good document with expect examples: 探索 Pexpect,第 2 部分:Pexpect 的实例分析 (Exploring Pexpect, Part 2: Pexpect example analysis)
Read more
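As an illustration of the expect-style approach, here is a minimal sketch using the third-party pexpect library. The prompt pattern, the "Username:" and "Password:" login strings, and the helper names run_command and open_telnet are assumptions made for this sketch, not taken from the original script; adjust them to what your device actually prints.

```python
# Assumed prompt pattern, e.g. "RP/0/RSP0/CPU0:router#"; adjust for your device.
PROMPT = r"[\w/:.-]+#"


def run_command(session, command, prompt=PROMPT, timeout=10):
    """Send one command and return everything printed before the next prompt.

    `session` is any object with pexpect's sendline()/expect()/before
    interface. expect() blocks until the prompt pattern matches or the
    timeout expires, so we never read a half-filled buffer.
    """
    session.sendline(command)
    session.expect(prompt, timeout=timeout)
    # `before` holds the text seen before the match (including the echoed command).
    return session.before


def open_telnet(host, username, password, prompt=PROMPT, timeout=10):
    """Log in over telnet and return a pexpect session (hypothetical helper)."""
    import pexpect  # third-party: pip install pexpect

    child = pexpect.spawn(f"telnet {host}", encoding="utf-8", timeout=timeout)
    child.expect("Username:")  # assumed login prompt text
    child.sendline(username)
    child.expect("Password:")  # assumed password prompt text
    child.sendline(password)
    child.expect(prompt)
    # Disable paging so long outputs are not interrupted by "--More--".
    run_command(child, "terminal length 0", prompt, timeout)
    return child
```

With a session from open_telnet(), calling run_command(session, "<your show command>") returns the raw output of whichever show command you use for shared memory, ready for parsing.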

How To Check BGP Memory on ASR9k

Introduction

In some cases, customers ask why BGP takes so much memory, how to optimize BGP, why memory is not released after optimization, and so on. These questions are hard to answer without more information. This article shows how to check BGP memory and how to analyze it; that information will help you and the customer plan targeted optimizations.

Solution

1. Default scenario, no BGP routes

Read more

Troubleshooting fabric FIA tail drop on ASR9k

Introduction

The TZ database has several good guides for troubleshooting the fabric on ASR9k, but none walks through the analysis of a fabric issue in a real scenario or case. “Very lucky” ? I was assigned a hot case in which a fabric issue caused a cutover to fail, so I summarized the whole analysis process to help CSEs narrow down similar issues.

Problem Description

  • Platform: 9922 + 4 36x10G + 2 8x100G
  • Version: 5.3.2 + SMU 

My customer brought a new 9922 online to replace old devices. After the cutover, they found that business traffic was being dropped. Based on the information collected during the cutover, I found that the NPs showed few drops and that business traffic was very low (at most 5G on a 100G port; all ports were bundle members during the cutover troubleshooting).

Read more

Taking action with EEM+TCL after a log occurs X times on Y LC/RSP on ASR9k

Problem:

We can automate many actions with EEM + TCL on Cisco routers, and there are many ways to trigger them: syslog pattern, OID, CPU threshold, and so on. This works on the IOS platform without any issue. On the XR platform, however, each LC/RSP raises its own alarms, so we may have special requirements, e.g.:

Some alarms occur frequently. I want to restart the process (based on its pid) if the alarm occurs 3 times within 5 minutes on a given LC. How can this be done?

0/3/cpu0: alarm report "C", Pid = zzz
0/1/cpu0: alarm report "A", Pid = xxx
0/2/cpu0: alarm report "B", pid = yyy
0/3/cpu0: alarm report "C", pid = zzz
0/1/cpu0: alarm report "A", pid = xxx
0/1/cpu0: alarm report "A", pid = xxx

Solution:

We can write an interactive script using TCL file I/O: create a file on the harddisk/disk that holds the syslog history/count for each LC. The script reads this file whenever the syslog is observed and, based on the number of syslogs, takes the required action.
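The counting logic can be sketched as follows. This is an illustration in Python rather than the TCL script from the attachment; the function name record_and_check, the JSON file layout, and the choice to reset a location's history after it triggers are all assumptions made for this sketch.

```python
import json
import time
from pathlib import Path


def record_and_check(state_file, location, now=None, threshold=3, window=300):
    """Record one alarm for `location` and report whether to take action.

    The per-location history of alarm timestamps is kept in a JSON file so
    that it survives between separate script runs. Returns True when the
    alarm has been seen `threshold` times within `window` seconds.
    """
    now = time.time() if now is None else now
    path = Path(state_file)
    history = json.loads(path.read_text()) if path.exists() else {}

    # Keep only the timestamps that still fall inside the time window.
    recent = [t for t in history.get(location, []) if now - t < window]
    recent.append(now)

    triggered = len(recent) >= threshold
    # Assumption: start counting from zero again once the action has fired.
    history[location] = [] if triggered else recent
    path.write_text(json.dumps(history))
    return triggered
```

Called once per syslog with the reporting location (for example "0/1/cpu0"), the function returns True only for the location that reaches 3 alarms within 5 minutes, which is the point at which the process restart would be issued.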

The steps are as follows; please check the attachment and the script flow chart for the detailed script. In my example I only dump the arp process for testing, so please change the script to suit your requirements. To test the script, you can add a marker message, e.g. “action_syslog priority info msg “a””: Read more

How to troubleshoot HW FIB on ASR9k | PBTS/TE

Problem Description

The topology is as follows:

 

 10.200.0.24               10.200.0.26    10.200.0.32
   +------+      +------+     +------+     +------+
   |XiAn  |      | SZ   |     |HKBR  |     |Canada|
   |Huawei+------+      +-----+ASR9k +-----+ASR9k |
   +------+      +------+     +------+     +------+

In this issue, Huawei is the head end, and the ASR9ks should be the mid and tail nodes. Huawei found that they were sending the traffic to Canada with an LDP label, not a TE label. So the traffic arrives at the HK ASR9k with an LDP label and is then forwarded by LDP from the HK ASR9k to Canada as well. That does not match the normal scenario.

In the normal scenario, a TE label should be used from head to tail. After Huawei moved the traffic onto the TE LSP, the issue recovered. Why Huawei sent the traffic by LDP is their issue, and they will fix it in the future; even so, the 9k should not drop packets. We first need to determine whether packets are dropped at the 9k and whether that is due to a label issue. A test environment is now being set up at the customer site; waiting for an update.

Read more
