Archive

Posts tagged ‘ASR9k’

Troubleshooting IPoE Sessions Pending at “ACK-DPM-WAIT”

Introduction

This article explains what “ACK-DPM-WAIT” is and how to troubleshoot similar scenarios. In my case there was too little information to narrow down the cause, so I will update the article if the issue happens again and the RCA is found.

Problem Description

Version: 5.1.3 + individual SMUs
Platform: 9010 + Mod80 + A9K-MPA-4X10GE
BNG: IPoE, DHCP Proxy, 28k sessions

My customer found that some BNG sessions were failing. The trigger was a fault in the customer's power supply that caused the ASR9k to power-cycle. After the 9k reloaded, dhcpd and arp raised many alarms. dhcpd recovered after its process was restarted several times, but arp kept raising SPIO alarms even after its process was restarted. The customer had enabled arp local disable on the BNG port.

The affected sessions obtained correct addresses from DHCP, but each session was deleted after 15 minutes. Checking on the ASR9k, we found the affected sessions pending in the ACK_DPM_WAIT state. The issue recovered on its own at approximately 19:00-19:30, and the arp alarms disappeared in the same time slot.

Automatically Checking Shared Memory Utilization on IOX with Python

Introduction

In some scenarios we need to monitor data on a router/switch automatically. This article shows an example of how to check shared memory utilization. You can easily change the script based on your own requirement/scenario.

Preparation

“telnetlib” could not reliably match the expected messages with the read_until() function (there is no exact control over when the output arrives in the buffer), so I changed to “expect”. I followed Bo’s example, Python Expect Demo; there is also a good document with expect demos from IBM: 探索 Pexpect,第 2 部分:Pexpect 的实例分析 (Exploring Pexpect, Part 2: Pexpect example analysis).
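
As a rough illustration of the expect approach, here is a minimal sketch using the third-party pexpect package. It is not the script from the article: the prompt pattern, the login strings ("Username:", "Password:") and the helper names connect and run_command are all assumptions for illustration, and may need adjusting for a real device.

```python
import re

# Assumed IOS XR style prompt, e.g. "RP/0/RSP0/CPU0:ASR9k#" (adjust as needed).
PROMPT = re.compile(r"RP/\d+/\w+/CPU\d+:[\w.-]+#")


def run_command(session, command, timeout=30):
    """Send one command and wait for the next prompt.

    Returns everything printed before that prompt, which normally
    includes the echoed command itself.
    """
    session.sendline(command)
    session.expect(PROMPT, timeout=timeout)
    return session.before


def connect(host, username, password):
    """Hypothetical helper: log in over telnet and disable paging."""
    import pexpect  # third-party package, imported only when actually connecting

    session = pexpect.spawn(f"telnet {host}", encoding="utf-8", timeout=30)
    session.expect("Username:")   # assumed login prompt text
    session.sendline(username)
    session.expect("Password:")   # assumed password prompt text
    session.sendline(password)
    session.expect(PROMPT)
    run_command(session, "terminal length 0")
    return session
```

Because expect() blocks until the prompt pattern appears or the timeout expires, the script does not have to guess when the output has fully arrived in the buffer.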

Taking Action with EEM+TCL After a Log Occurs X Times on Y LC/RSP on ASR9k

Problem:

EEM + TCL lets us automate more actions on Cisco routers, with several trigger types: syslog pattern, OID, CPU threshold and so on. This works on the IOS platform without any issue. On the XR platform, however, each LC/RSP raises its own alarms, and we may have special requirements, e.g.:

Some alarms occur frequently. I want to restart the process (based on pid) if the alarm occurs 3 times within 5 minutes on any one LC. How can this be done?

0/3/cpu0: alarm report "C", Pid = zzz
0/1/cpu0: alarm report "A", Pid = xxx
0/2/cpu0: alarm report "B", pid = yyy
0/3/cpu0: alarm report "C", pid = zzz
0/1/cpu0: alarm report "A", pid = xxx
0/1/cpu0: alarm report "A", pid = xxx

Solution:

We can write an interactive script using TCL I/O: create a file on harddisk/disk that holds the syslog history/count for each LC. The script reads this file whenever the syslog is observed and, based on the number of syslogs, takes the required action.
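
The counting logic can be sketched off-box in Python to make the idea concrete. This is only an illustration of the file-backed counter, not the TCL script from the attachment: the JSON state file, the function name record_alarm and the choice to reset the counter after a trigger are assumptions.

```python
import json
import os
import tempfile
import time


def record_alarm(state_file, location, now=None, threshold=3, window=300):
    """Record one alarm for `location` (e.g. "0/1/cpu0").

    Returns True when `threshold` alarms have been seen for that location
    within the last `window` seconds, i.e. when the action should run.
    """
    now = time.time() if now is None else now
    state = {}
    if os.path.exists(state_file):
        with open(state_file) as f:
            state = json.load(f)
    # Keep only timestamps still inside the window, then add the new alarm.
    recent = [t for t in state.get(location, []) if now - t < window]
    recent.append(now)
    triggered = len(recent) >= threshold
    # Assumption: start counting from zero again once the action has fired.
    state[location] = [] if triggered else recent
    with open(state_file, "w") as f:
        json.dump(state, f)
    return triggered


def demo():
    """Replay the sample alarm sequence above, one alarm every 10 seconds."""
    sequence = ["0/3/cpu0", "0/1/cpu0", "0/2/cpu0",
                "0/3/cpu0", "0/1/cpu0", "0/1/cpu0"]
    with tempfile.TemporaryDirectory() as tmp:
        state_file = os.path.join(tmp, "lc_alarm_state.json")
        return [loc for i, loc in enumerate(sequence)
                if record_alarm(state_file, loc, now=i * 10.0)]
```

In the replay, only the last alarm (the third one from 0/1/cpu0) reaches the threshold, so that is the only point where a restart would be triggered.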

Please check the attachment and the script flow chart for the detailed steps and script. In my example I only dump the arp process for testing, so please change the script based on your requirement. To test the script, you can add a flag, e.g. “action_syslog priority info msg “a””.

ASR9k EEM + TCL Interactive Scripting

Requirement:
1. Capture the tunnel interface every 5 minutes; if the traffic is greater than X, capture additional information.
2. Store this information on disk0/harddisk.

In fact, this requirement is very easy to meet with Python + CRT, but the customer could not find a PC to keep a Python script running, so the only option was EEM + TCL on the ASR9k. In the TCL script I use two functions: foreach and scan.
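
For comparison, the threshold check from requirement 1 might look like this in Python. This is a sketch only: it assumes the interface output contains rate lines of the form "5 minute input rate N bits/sec", and the names rates_bps and exceeds_threshold are hypothetical. In the TCL version, foreach and scan play the role that the regular expression plays here.

```python
import re

# Assumed rate line format: "5 minute input rate 1200000 bits/sec, 150 packets/sec"
RATE_RE = re.compile(r"5 minute (input|output) rate (\d+) bits/sec")


def rates_bps(show_interface_output):
    """Return {"input": bps, "output": bps} for the rate lines that are present."""
    return {direction: int(bps)
            for direction, bps in RATE_RE.findall(show_interface_output)}


def exceeds_threshold(show_interface_output, threshold_bps):
    """True if the input or output rate is above threshold_bps (the X in requirement 1)."""
    return any(bps > threshold_bps
               for bps in rates_bps(show_interface_output).values())


SAMPLE = (
    "  5 minute input rate 1200000 bits/sec, 150 packets/sec\n"
    "  5 minute output rate 300000 bits/sec, 40 packets/sec\n"
)
```

With the sample text, exceeds_threshold(SAMPLE, 1000000) is true because the input rate is above the threshold, which is the point where the additional captures would be taken.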

The following CLI must be configured before running the script. If you change any variable or the script itself, you need to re-configure “event manager policy tac_te.tcl username cisco”:

aaa authorization eventmanager default local
event manager environment _cron_entry1 */5 * * * *
event manager directory user policy disk0:
event manager policy tac_te.tcl username cisco persist-time 3600 type user


ASR9k EEM+TCL: Generating a Custom SNMP Trap

If a customer wants to focus on an alarm on their NMS via SNMP trap, they can configure “snmp-server traps syslog”. But if the NMS has no filter feature, they cannot pick out a specific alarm among all the syslog messages. In that case we can use EEM + TCL to meet the customer's requirement.

The TCL script follows:

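::cisco::eem::event_register_syslog pattern $_error_log occurs $_number period $_times maxrun 300
# Trigger: the syslog pattern $_error_log seen $_number times within $_times seconds.
# These three values come from "event manager environment" variables.
namespace import ::cisco::eem::*
namespace import ::cisco::lib::*

# Text carried in the trap
set alarm "***OOB_ERROR Happened!***"

# Build a trap variable named "temp" that holds the alarm string (the OIDs here are placeholders)
sys_reqinfo_snmp_trapvar var temp oid 1.1.1.1.1.1.1.1 string $alarm
# Send an enterprise-specific trap (generic trap number 6) carrying that variable
sys_reqinfo_snmp_trap enterprise_oid 1.3.6.1 generic_trapnum 6 specific_trapnum 2 trap_oid 1.1.1.1.1.1.1.1.1.1.1.1.1 trap_var temp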
