Archive

Posts tagged ‘EEM’

Taking action with EEM+TCL after a log occurs X times on Y LC/RSP on the ASR9k

Problem:

EEM + TCL lets us automate many actions on a Cisco router, with several kinds of trigger: syslog pattern, OID, CPU threshold, and so on. On the IOS platform this works without any issue. On the XR platform, however, each LC/RSP raises its own alarms, so we may have a special requirement, e.g.:

Some alarms occur frequently. I want to restart the process (based on its PID) if the alarm occurs 3 times within 5 minutes on a given LC. How can this be done?

0/3/cpu0: alarm report "C", pid = zzz
0/1/cpu0: alarm report "A", Pid = xxx
0/2/cpu0: alarm report "B", pid = yyy
0/3/cpu0: alarm report "C", pid = zzz
0/1/cpu0: alarm report "A", pid = xxx
0/1/cpu0: alarm report "A", pid = xxx

Solution:

We can write an interactive script using TCL file I/O: create a file on harddisk/disk that holds the history/count of syslog messages for each LC. The script reads this file whenever the syslog message is observed and, based on the number of occurrences, takes the required action.
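The counting idea can be sketched in plain TCL as follows. This is only a minimal illustration, not the script from the attachment: the proc name record_alarm, the "<location> <epoch-seconds>" file format, and the file path are all assumptions, and the location and timestamp are assumed to have been extracted from the syslog message already.

```tcl
# Hypothetical helper (not from the original script): record one alarm for
# $location at time $now (in seconds) and return how many alarms that
# location has raised within the last $window seconds.
proc record_alarm {path location now window} {
    set kept {}
    if {[file exists $path]} {
        set fd [open $path r]
        foreach line [split [read -nonewline $fd] "\n"] {
            # Each history line is assumed to be "<location> <epoch-seconds>".
            if {[scan $line "%s %d" loc ts] == 2 && $now - $ts < $window} {
                lappend kept [list $loc $ts]
            }
        }
        close $fd
    }
    lappend kept [list $location $now]

    # Rewrite the file with only the entries still inside the window.
    set fd [open $path w]
    foreach entry $kept {
        puts $fd [join $entry " "]
    }
    close $fd

    # Count the surviving entries that belong to this location.
    set count 0
    foreach entry $kept {
        if {[string equal [lindex $entry 0] $location]} {
            incr count
        }
    }
    return $count
}
```

With the requirement above (3 alarms in 5 minutes), a policy could call record_alarm with a window of 300 and take its action only when the returned count reaches 3 for that LC. string equal is used instead of the eq operator on the assumption that the router's TCL interpreter may be an older release.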

The steps are as follows; please check the attachment and the script flow chart for the detailed script. In my example I only dump the arp process for testing, so please change the script based on your own requirement. To test the script, you can add a flag, e.g. “action_syslog priority info msg "a"”: Read more

ASR9k EEM + TCL Interactive Scripting

Requirement:
1. Capture the tunnel interface every 5 minutes; if traffic > X, capture additional information.
2. Store this information on disk0/harddisk.

In fact, this requirement is easy to meet with Python + CRT, but the customer could not provide a PC to keep a Python script running, so the only option was EEM + TCL on the ASR9k. In the TCL script I use two commands: foreach and scan.

The following CLI must be configured before running the script. If you change any variable or the script itself, you need to re-apply “event manager policy tac_te.tcl username cisco”:
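The full script is not shown in this excerpt, so here is only a minimal sketch of how foreach and scan might work together to pull a traffic rate out of captured command output. The proc name output_rate, the sample text, and the threshold value are assumptions made for illustration.

```tcl
# Hypothetical helper: return the 5-minute output rate (bits/sec) found in
# $text, or -1 if no matching line is present.
proc output_rate {text} {
    foreach line [split $text "\n"] {
        # scan returns the number of fields converted; 1 means the line matched.
        if {[scan [string trim $line] "5 minute output rate %d bits/sec" rate] == 1} {
            return $rate
        }
    }
    return -1
}

# Illustrative sample text only, not captured from a real router.
set sample "tunnel-te1 is up, line protocol is up
  5 minute input rate 1000 bits/sec, 2 packets/sec
  5 minute output rate 2500000 bits/sec, 300 packets/sec"

# Assumed threshold standing in for "X" in the requirement.
set threshold 1000000
if {[output_rate $sample] > $threshold} {
    puts "rate above threshold: capture additional information"
}
```

foreach walks the output line by line, and scan both tests whether a line has the expected shape and extracts the number in one step, which avoids a separate regular expression.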

aaa authorization eventmanager default local
event manager environment _cron_entry1 */5 * * * *
event manager directory user policy disk0:
event manager policy tac_te.tcl username cisco persist-time 3600 type user

Read more

ASR9k EEM+TCL: Generating a Custom SNMP Trap

If a customer wants to track an alarm on their NMS via SNMP traps, they can configure “snmp-server traps syslog”. But if the NMS has no filtering feature, they cannot pick out a specific alarm among all the syslog messages. In that case we can use EEM + TCL to meet the requirement.

The TCL script is as follows:

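::cisco::eem::event_register_syslog pattern $_error_log occurs $_number period $_times maxrun 300
# Trigger when the syslog pattern $_error_log occurs $_number times within $_times seconds.
namespace import ::cisco::eem::*
namespace import ::cisco::lib::*

# Text carried in the trap's variable binding.
set alarm "***OOB_ERROR Happened!***"

# Build the trap variable "temp", then send an enterprise-specific trap (generic_trapnum 6) that carries it.
sys_reqinfo_snmp_trapvar var temp oid 1.1.1.1.1.1.1.1 string $alarm
sys_reqinfo_snmp_trap enterprise_oid 1.3.6.1 generic_trapnum 6 specific_trapnum 2 trap_oid 1.1.1.1.1.1.1.1.1.1.1.1.1 trap_var temp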

Read more

A Discussion of IP SLA and Its Interworking with EEM <2>

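Does the first method analysed in the previous post actually work?
Testing shows that it does avoid the original problem of EEM being triggered by a single lost packet.
There is one catch, however: the method adds a new SLA, e.g. 777, whose state is pending, meaning that 777 is started only after 17 has lost three packets.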

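There are two cases:

1. The line is already in service. When the commands above are configured, 17 cannot lose 3 consecutive packets, so 777 never starts and the state of track17 stays down. As a result, EEM is never triggered no matter how many packets are lost. (Think about why.)

Workaround: after configuring the commands above, shut the upstream or downstream port for 30 s (because a probe is sent every 10 s) so that 777 can start. Then no shut the port so that the state of track17 changes to up; only then can EEM be triggered normally when the leased line goes down. So whenever these commands are configured on a line that is already in service, the primary line must be interrupted for at least 30 s.

2. The line is not yet in service. In this case, wait at least 30 s after configuring the commands above before bringing up this MSTP line; otherwise the same problem occurs.
Read more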

A Discussion of IP SLA and Its Interworking with EEM <1>

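Introduction to SLA

Put simply, SLA (Service-Level Agreement) measures certain network performance parameters; when they exceed given thresholds, it can trigger actions in combination with track or EEM. For example:
1. Monitor the reachability of the next hop; if it becomes unreachable, invalidate a static route.
2. Monitor a neighbor's interface address; if it is unreachable three times in a row, shut down the port.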
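The "three times in a row" condition in the second example can be illustrated with a small plain-TCL sketch. This is not IP SLA or track configuration, only the counting logic; the proc name first_trigger and the sample probe results are assumptions.

```tcl
# Hypothetical helper: take probe results in order (1 = reachable,
# 0 = unreachable) and return the 0-based index of the probe at which
# $limit consecutive failures is first reached, or -1 if it never is.
proc first_trigger {results limit} {
    set streak 0
    set index 0
    foreach ok $results {
        if {$ok} {
            # Any successful probe resets the run of failures.
            set streak 0
        } else {
            incr streak
        }
        if {$streak >= $limit} {
            return $index
        }
        incr index
    }
    return -1
}

# Two isolated losses do not trigger; the third consecutive loss does.
puts [first_trigger {1 0 0 1 0 0 0 1} 3]
```

Run as written, this prints 6: the failures at positions 1 and 2 are cancelled by the success at position 3, and the action fires only on the third failure of the run that starts at position 4.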

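SLA Application Example

If the quality of a customer's line is poor and cannot be improved, we need a way to reset the port directly once line quality reaches a certain threshold, using the link reset to improve matters.
So how do we meet this requirement? This is where SLA comes in. And how do we deploy SLA?

Analysis of the first method

Read more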