Now Cisco had integrated BNG feature to XRv9000 platform in new version (from 631), that can let customer integrade the feature in their NFVI Infrastructure, that will flexible deploy the BNG in same Server Box. And the article will set up a simple vBNG environment that build by VIRL + XRv9000, and simple test IPoE/PPPoE. That environment will help you to easy TS vBNG PI issue, and packets paths.
My customer match a issue that business traffics take IP6,7 flag, then the traffics auto mapping to EXP6,7 that cause control police congestion, and ISIS flapping due to BFD flap. So they want to check which traffics have incorrect flag by netflow, so need to check ording for netflow and QOS at input direction. I check some documents, nobody notice that, so the article will show test info, you can check if you need. Finaly test result: At ingress direction, packets will be cached first by netflow, then do other action in QOS.
Btw, due to auto mapping from TOS to EXP by range, e.g: TOS 192-223 will map to EXP6; TOS 223-255 will map to EXP7. So if we want to check the issue by netflow, suggest filter EXP data, as in my follow test, check by follow command:
There are some internal tools that can decode SPP packets at former, but they are not work now. In some scenario, customer coudln’t do span on our asr9k, so we only need SPP, then will face to how to decode SPP result.
The article disscuss how to covert SPP original data to text2pcap readable format, then decode by text2pcap. You only do the script that can auto work. Btw, before do that, you need have python2.7 and text2pcap (integrate in wireshark). If you have python3.0 or newer, that maybe have some issue, because some function have a bit different, you need adjust them by yourself.
The article will talk about what is “ACK-DPM-WAIT”, and how to troubleshooting the similar scenario. Due to limitation info that couldn’t narrow down, in my CASE, so will update the article if the issue happen again and find RCA.
My customer found part of BNG session was failure. Trigger is due to customer power supply have issue that cause the asr9k re-power. After 9k reload, found dhcpd and arp have so many alarms, dhcpd was recovery after tried restart process multi times, but arp continue have SPIO alarm even if tried restart process, customer had enabled arp local disable on the BNG port.
The issue sessions got address correct from DHCP, but session would be deleted after 15min. After checked on asr9k, we found issue session pending on ACK_DPM_WAIT status. And the issue was auto recovery at approx.19:00-19:30. And at that timeslot, arp alarm disappear too. 完整阅读