Wireless Catalyst 9800 WLC KPIs
This blog focuses on Key Performance Indicators (KPIs) for Access Points (APs) and RF. I will discuss methods and commands that can be used to assess the health of the APs as well as the RF environment.
The KPIs are grouped into different buckets or areas:
- WLC checks
- Connection with other devices
- AP checks
- RF checks
- Client checks
- Packet drops
AP Checks
Let’s now focus on AP health. First, we can verify that the number of APs joined to our WLC is the expected one. Use the command “show ap summary | i Number of APs”. If the AP count is incorrect, we will need to identify the missing APs and the reason they were disconnected. It is helpful to keep a complete list of APs, with Ethernet MAC and IP addresses, taken from a working scenario (“show ap summary”).
Gladius1#show ap sum
Load for five secs: 0%/0%; one minute: 0%; five minutes: 0%
Time source is NTP, 19:18:03.363 CEST Wed May 25 2022
Number of APs: 8
AP Name                 Slots  AP Model          Ethernet MAC    Radio MAC       Location          Country  IP Address       State
---------------------------------------------------------------------------------------------------------------------------------------
AP3800-r2sw1-te1-0-8    2      AIR-AP3802I-E-K9  0042.68a0.fc4a  0062.ecf3.8310  default location  DE       192.168.127.108  Registered
9130i-r2sw1-te2016      3      C9130AXI-E        04eb.409e.14c0  04eb.409f.0c60  default location  DE       192.168.25.133   Registered
9130i-r2sw1-te2015      3      C9130AXI-E        04eb.409e.1724  04eb.409f.1f80  default location  DE       192.168.25.122   Registered
9130i-r3-sw2-g1-0-10    3      C9130AXI-B        04eb.409e.1d28  04eb.409f.4fa0  default location  US       192.168.127.113  Registered
AP1562-r3-sw-3-gi1-0-3  2      AIR-AP1562E-E-K9  0062.ec80.8c8c  2c33.1192.3e40  default location  DE       192.168.127.106  Registered
SS-I-1                  2      C9115AXI-B        7069.5a74.7a50  7069.5a78.7780  default location  US       192.168.127.97   Registered
ap3800i-r2-sw1-te1-0-5  2      AIR-AP3802I-E-K9  0042.68c5.bdf0  cc16.7e5f.f000  default location  CH       192.168.127.109  Registered
9120i-r4-sw2-te1-0-39   2      C9120AXI-E        d4e8.8019.60e8  d4e8.801a.3340  default location  DE       192.168.127.114  Registered
Check the AP count, and keep a list of the Ethernet MAC and IP addresses of all the APs.
To quickly locate and identify missing devices, we can compare the outputs of non-working and working scenarios.
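The comparison between a working and a non-working capture can be scripted. The following is a minimal sketch, assuming both “show ap summary” outputs were saved as text with one AP per line; the function names and the two sample captures are hypothetical and only illustrate the idea.

```python
import re

# Cisco dotted MAC notation, e.g. 0042.68a0.fc4a
MAC_RE = re.compile(r"\b[0-9a-f]{4}\.[0-9a-f]{4}\.[0-9a-f]{4}\b", re.IGNORECASE)
IPV4_RE = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")


def parse_ap_summary(text):
    """Map Ethernet MAC -> (AP name, IP address) for each AP row in the text."""
    aps = {}
    for line in text.splitlines():
        macs = MAC_RE.findall(line)
        ips = IPV4_RE.findall(line)
        # An AP row carries two MACs (Ethernet first, then radio) and one IP.
        if len(macs) >= 2 and ips:
            aps[macs[0].lower()] = (line.split()[0], ips[-1])
    return aps


def missing_aps(baseline_text, current_text):
    """Return APs seen in the baseline capture but absent from the current one."""
    baseline = parse_ap_summary(baseline_text)
    current = parse_ap_summary(current_text)
    return {mac: baseline[mac] for mac in sorted(baseline.keys() - current.keys())}


# Hypothetical captures: the baseline has two APs, the current one only one.
BASELINE = """\
AP3800-r2sw1-te1-0-8  2  AIR-AP3802I-E-K9  0042.68a0.fc4a  0062.ecf3.8310  default location  DE  192.168.127.108  Registered
SS-I-1                2  C9115AXI-B        7069.5a74.7a50  7069.5a78.7780  default location  US  192.168.127.97   Registered
"""
CURRENT = """\
AP3800-r2sw1-te1-0-8  2  AIR-AP3802I-E-K9  0042.68a0.fc4a  0062.ecf3.8310  default location  DE  192.168.127.108  Registered
"""

missing = missing_aps(BASELINE, CURRENT)
print(missing)
```

Keying on the Ethernet MAC rather than the AP name keeps the comparison valid even if an AP was renamed between the two captures.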
Even if the WLC has the expected number of APs, it is still important to verify that those APs remain stable. The WLC provides a command that allows us to verify AP uptime (reloads) and CAPWAP tunnel stability. Use the command “show ap uptime | ex ____([0-9])+ day”. The “exclude” keyword will help us focus on APs reloaded or disconnected within the last day.
Gladius2#sh ap uptime
Number of APs: 8
AP Name                 Ethernet MAC    Radio MAC       AP Up Time                            Association Up Time
---------------------------------------------------------------------------------------------------------------------------------
AP3800-r2sw1-te1-0-8    0042.68a0.fc4a  0062.ecf3.8310  26 days 0 hour 57 minutes 41 seconds  15 days 1 hour 50 minutes 4 seconds
9130i-r2sw1-te2015      04eb.409e.1724  04eb.409f.1f80  9 days 3 hours 26 minutes 48 seconds  9 days 3 hours 24 minutes 24 seconds
9130i-r2sw1-te2016      04eb.409e.14c0  04eb.409f.0c60  9 days 1 hour 39 minutes 29 seconds   9 days 1 hour 26 minutes 47 seconds
9120i-r4-sw2-te1-0-39   d4e8.8019.60e8  d4e8.801a.3340  8 days 1 hour 36 minutes 57 seconds   8 days 1 hour 33 minutes 49 seconds
SS-I-1                  7069.5a74.7a50  7069.5a78.7780  26 days 0 hour 54 minutes 57 seconds  22 minutes 15 seconds
ap3800i-r2-sw1-te1-0-5  0042.68c5.bdf0  cc16.7e5f.f000  26 days 0 hour 46 minutes 12 seconds  22 minutes 13 seconds
9130i-r3-sw2-g1-0-10    04eb.409e.1d28  04eb.409f.4fa0  22 minutes 21 seconds                 19 minutes 39 seconds
Check AP uptime and association uptime. In this case SS-I-1 and ap3800i-r2-sw1-te1-0-5 faced a disconnection, while 9130i-r3-sw2-g1-0-10 faced a reload.
The above command will tell us if there have been any unexpected AP reloads. It also shows whether multiple APs reloaded at the same time. If the reloaded APs are in the same place or connected to the same switch, this could indicate a network or power problem in that area or switch. Similarly, for AP disconnections, we can compare the “Association Up Time” between APs to identify patterns and to determine whether tunnel teardowns occurred, and when. Keep in mind that APs restart the CAPWAP tunnel after certain configuration changes (e.g. when a new tag is applied).
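The distinction between a reload and a tunnel teardown can be automated. The sketch below assumes “show ap uptime” rows captured one per line in the format shown above; the helper names and the one-day window are illustrative choices, not part of any Cisco tooling.

```python
import re

# Matches durations such as "26 days 0 hour 57 minutes 41 seconds"
# or the shorter "22 minutes 21 seconds".
DURATION_RE = re.compile(
    r"(?:(\d+) days? )?(?:(\d+) hours? )?(?:(\d+) minutes? )?(\d+) seconds?"
)
ONE_DAY = 24 * 3600


def to_seconds(match):
    """Convert one DURATION_RE match to a number of seconds."""
    days, hours, minutes, seconds = (int(g or 0) for g in match.groups())
    return days * ONE_DAY + hours * 3600 + minutes * 60 + seconds


def classify(line, window=ONE_DAY):
    """Return (ap_name, status) for an AP row, or None for any other line."""
    fields = line.split()
    if len(fields) < 4:
        return None
    # Everything after name, Ethernet MAC and radio MAC holds the two uptimes.
    uptimes = [to_seconds(m) for m in DURATION_RE.finditer(" ".join(fields[3:]))]
    if len(uptimes) != 2:
        return None
    ap_up, assoc_up = uptimes
    if ap_up < window:
        status = "reloaded"        # the AP itself restarted recently
    elif assoc_up < window:
        status = "disconnected"    # the AP stayed up but the CAPWAP tunnel restarted
    else:
        status = "stable"
    return fields[0], status


SAMPLE = """\
AP3800-r2sw1-te1-0-8  0042.68a0.fc4a  0062.ecf3.8310  26 days 0 hour 57 minutes 41 seconds  15 days 1 hour 50 minutes 4 seconds
SS-I-1                7069.5a74.7a50  7069.5a78.7780  26 days 0 hour 54 minutes 57 seconds  22 minutes 15 seconds
9130i-r3-sw2-g1-0-10  04eb.409e.1d28  04eb.409f.4fa0  22 minutes 21 seconds                 19 minutes 39 seconds
"""

statuses = [result for result in map(classify, SAMPLE.splitlines()) if result]
print(statuses)
```

A "disconnected" result is not necessarily a fault, since a tag change also restarts the tunnel, so the result is best read together with the change history.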
If “AP Up Time” is lower than expected and this is not due to a planned reload, we can check the WLC for AP crash files and look in the bootflash for any crash report. Use the commands “show ap crash-file” and “dir bootflash: | i crash”.
Gladius1#show ap crash-file
File Location: BOOTFLASH
AP Name               Crash File                               Radio Slot 0  Radio Slot 1
-------------------------------------------------------------------------------------------
ap3800i-r2-sw1-te0-1  ap3800i-r2-sw1-te0-1_0062ecaade80.crash
Gladius1#dir bootflash: | i crash
54  -rw-   50476  May 9 2022 13:07:34 +02:00   ap3800i-r2-sw1-te0-1_0062ecaade80.crash
66  -rw-  120276  Jan 26 2022 11:46:55 +01:00  AP9120-2-r3-sw2-Gi1-0-39_d4e88019f140.crash
28  -rw-   93952  Nov 2 2021 13:02:21 +01:00   SS-E-2_00eeab18c160.crash
12  -rw-   42975  Oct 27 2021 15:01:44 +02:00  9115i-r4-sw2-te1-0-38_f80f6f154ce0.crash
42  -rw-   42235  May 15 2021 14:24:59 +02:00  9115i-r3-sw2-te1-0-38_f80f6f154960.crash
41  -rw-   26063  Mar 30 2021 13:06:45 +02:00  9115i-r3-sw2-te1-0-38_f80f6f154c80.crash
Check whether AP crashes are occurring, whether the same AP has crashed multiple times, and whether crashes are periodic.
To find new crashes, it is recommended to periodically review the bootflash content. Download any new crash files and share them with TAC for root cause analysis. To keep your file system clean, delete any old files.
If we have AP disconnections, the session termination statistics allow us to determine the most frequent termination event and the AP state at that time. This gives us a global picture. Use the command “show wireless stats ap session termination”.
Gladius1#show wireless stats ap session termination
Event                    Previous State   Occurance Count
------------------------------------------------------------------------------------
DTLS session closed      JOINED           6
Heartbeat timer expiry   JOINED           2
Reset by API             IMAGE_DOWNLOAD   1
Image download status    IMAGE_DOWNLOAD   6
Reset by API             RUN              3
DTLS session closed      RUN              17
Heartbeat timer expiry   RUN              6
Check the events with the highest count. If the AP was in RUN state, the disconnections could be due to consistent packet drops.
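Ranking the termination events by count is easy to script. This is a small sketch, assuming the command output is available as text with one event per line; `parse_terminations` and `top_events` are hypothetical helper names.

```python
def parse_terminations(text):
    """Return (event, previous_state, count) tuples from the command output."""
    rows = []
    for line in text.splitlines():
        # The event text contains spaces, so split from the right:
        # the last token is the count and the one before it is the state.
        parts = line.rsplit(None, 2)
        if len(parts) == 3 and parts[2].isdigit():
            event, state, count = parts
            rows.append((event.strip(), state, int(count)))
    return rows


def top_events(rows, state=None):
    """Sort events by count, highest first, optionally for a single AP state."""
    selected = [row for row in rows if state is None or row[1] == state]
    return sorted(selected, key=lambda row: row[2], reverse=True)


SAMPLE = """\
Event                    Previous State   Occurance Count
DTLS session closed      JOINED           6
Heartbeat timer expiry   JOINED           2
Reset by API             IMAGE_DOWNLOAD   1
Image download status    IMAGE_DOWNLOAD   6
Reset by API             RUN              3
DTLS session closed      RUN              17
Heartbeat timer expiry   RUN              6
"""

rows = parse_terminations(SAMPLE)
worst_overall = top_events(rows)[0]
worst_in_run = top_events(rows, state="RUN")
print(worst_overall)
print(worst_in_run)
```

Filtering on the RUN state isolates terminations of APs that were already fully joined, which is the case the text associates with packet drops.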
The AP history command allows us to drill down further on specific APs. Filtering the AP history on disconnections reveals whether multiple APs disconnected at the same moment, and the reason. Analyzing the command output also shows whether the same AP is disconnecting multiple times and how frequently. Use the command “show wireless stats ap history | i Disjoined”.
Gladius1#show wireless stats ap history | i Disjoined
ap3800i-r2-sw1-te0-1     0042.68a0.ee78  Disjoined  05/24/22 12:27:39  NA  DTLS close alert from peer
ap3800i-r2-sw1-te0-1     0042.68a0.ee78  Disjoined  05/24/22 12:24:26  NA  DTLS close alert from peer
ap3800i-r2-sw1-te0-1     0042.68a0.ee78  Disjoined  05/24/22 12:17:47  NA  DTLS close alert from peer
ap3800i-r2-sw1-te0-1     0042.68a0.ee78  Disjoined  05/24/22 11:41:17  NA  DTLS close alert from peer
ap3800i-r2-sw1-te0-1     0042.68a0.ee78  Disjoined  05/24/22 11:38:04  NA  DTLS close alert from peer
ap3800i-r2-sw1-te0-1     0042.68a0.ee78  Disjoined  05/24/22 10:18:04  NA  DTLS close alert from peer
ap3800i-r2-sw1-te0-1     0042.68a0.ee78  Disjoined  05/09/22 13:02:28  NA  Heart beat timer expiry
ap3800i-r2-sw1-te0-1     0042.68a0.ee78  Disjoined  05/09/22 10:49:34  NA  Heart beat timer expiry
ap3800i-r2-sw1-te0-1     0042.68a0.ee78  Disjoined  05/05/22 19:53:31  NA  Failure decoding wtp descriptor
ap3800i-r3-sw2-Gi1-0-37  0042.68a1.03d2  Disjoined  05/12/22 12:02:38  NA  DTLS close alert from peer
ap3800i-r3-sw2-Gi1-0-37  0042.68a1.03d2  Disjoined  05/12/22 11:57:43  NA  Wtp reset config cmd sent
ap3800i-r3-sw2-Gi1-0-37  0042.68a1.03d2  Disjoined  05/10/22 10:54:49  NA  DTLS close alert from peer
Check timestamps and disjoin reasons. Look for multiple disconnections per AP, and for disconnections occurring at the same time or periodically.
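Counting disjoin events per AP and per reason can be sketched as follows. The code assumes one history row per line in the layout shown above, with the date in MM/DD/YY form; the helper names are hypothetical, and the meaning of the “NA” column is not interpreted.

```python
from collections import Counter, defaultdict
from datetime import datetime


def parse_disjoins(text):
    """Return (ap_name, timestamp, reason) for every 'Disjoined' row."""
    events = []
    for line in text.splitlines():
        fields = line.split(None, 6)
        if len(fields) == 7 and fields[2] == "Disjoined":
            name, _mac, _event, date, clock, _extra, reason = fields
            when = datetime.strptime(f"{date} {clock}", "%m/%d/%y %H:%M:%S")
            events.append((name, when, reason.strip()))
    return events


def summarize(events):
    """Count disjoins per AP and, for each AP, per disjoin reason."""
    per_ap = Counter(name for name, _when, _reason in events)
    reasons = defaultdict(Counter)
    for name, _when, reason in events:
        reasons[name][reason] += 1
    return per_ap, reasons


SAMPLE = """\
ap3800i-r2-sw1-te0-1     0042.68a0.ee78  Disjoined  05/24/22 12:27:39  NA  DTLS close alert from peer
ap3800i-r2-sw1-te0-1     0042.68a0.ee78  Disjoined  05/24/22 12:24:26  NA  DTLS close alert from peer
ap3800i-r2-sw1-te0-1     0042.68a0.ee78  Disjoined  05/09/22 13:02:28  NA  Heart beat timer expiry
ap3800i-r3-sw2-Gi1-0-37  0042.68a1.03d2  Disjoined  05/12/22 12:02:38  NA  DTLS close alert from peer
"""

events = parse_disjoins(SAMPLE)
per_ap, reasons = summarize(events)
print(per_ap.most_common())
print(dict(reasons["ap3800i-r2-sw1-te0-1"]))
```

Because each event keeps a parsed timestamp, the same list can be sorted by time to look for several APs disjoining within the same minute.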
A second important thing to check is the AP tag assignment. Tags determine the SSIDs, AP mode, RF profiles, and policies for each AP. We can verify that APs have the correct tags and that the expected method (tag source) was used to assign them. Comparing the tags of APs located in the same area, or of non-working APs, could help spot an incorrect tag assignment. Use the command “sh ap tag summary”.
We also need to determine whether any AP has misconfigured tags. A non-existent or removed parameter (such as a policy profile, RF profile, …) or an incorrect configuration combination could cause misconfigured tags. APs that are marked as “misconfigured” will not broadcast any BSSID. Use the command “sh ap tag summary | i Yes”.
Gladius1#sh ap tag summary
Number of APs: 4
AP Name  AP Mac          Site Tag Name     Policy Tag Name     RF Tag Name     Misconfigured  Tag Source
----------------------------------------------------------------------------------------------------------
HG-2     0cd0.f894.0f40  default-site-tag  default-policy-tag  default-rf-tag  No             Default
AP1832I  80e8.6fd8.6330  site2             flex-vlan4          rf-hig          No             Location
ap1700i  f44e.0578.a560  site2             default-policy-tag  default-rf-tag  Yes            Static
AP9120   d4e8.8019.6100  default-site-tag  LOCAL_VLAN169       default-rf-tag  No             Filter
Check for misconfigured tags, the correct tag source, and the same tag assignment for APs in the same branch.
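The tag checks can be sketched in a few lines. This example assumes “sh ap tag summary” rows captured one per line and tag names without spaces; the helper names are hypothetical, and using the site tag as a stand-in for "same branch" is an assumption made for illustration.

```python
from collections import defaultdict

COLUMNS = ("name", "mac", "site", "policy", "rf", "misconfigured", "source")


def parse_tags(text):
    """Return one dict per AP row of the tag summary."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # AP rows have exactly seven columns and a Yes/No misconfigured flag.
        if len(fields) == 7 and fields[5] in ("Yes", "No"):
            rows.append(dict(zip(COLUMNS, fields)))
    return rows


def misconfigured(rows):
    """Names of the APs flagged as misconfigured."""
    return [row["name"] for row in rows if row["misconfigured"] == "Yes"]


def mixed_policy_sites(rows):
    """Site tags whose APs do not all share the same policy tag."""
    policies = defaultdict(set)
    for row in rows:
        policies[row["site"]].add(row["policy"])
    return {site: tags for site, tags in policies.items() if len(tags) > 1}


SAMPLE = """\
HG-2     0cd0.f894.0f40  default-site-tag  default-policy-tag  default-rf-tag  No   Default
AP1832I  80e8.6fd8.6330  site2             flex-vlan4          rf-hig          No   Location
ap1700i  f44e.0578.a560  site2             default-policy-tag  default-rf-tag  Yes  Static
AP9120   d4e8.8019.6100  default-site-tag  LOCAL_VLAN169       default-rf-tag  No   Filter
"""

rows = parse_tags(SAMPLE)
bad = misconfigured(rows)
mixed = mixed_policy_sites(rows)
print(bad)
print(mixed)
```

A site with mixed policy tags is only a hint to review, since different policy tags within one site can be intentional.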
Even when the APs are joined and configured correctly, we can check whether they are actually serving clients in order to spot any misbehavior. A perfectly functioning AP may show no clients at a given moment, so we identify APs that may be having issues based on what we know about the network and on the number of clients seen on other APs in the same area. We can verify that the radios are working and that the AP broadcasts the correct BSSIDs, and then monitor the AP for a while. If the AP still shows no clients after the monitoring period, we can try resetting the AP radio or having the AP rejoin the WLC. Use the command “show ap sum sort descending client-count | i __0__”.
Gladius1#show ap sum sort descending client-count | i __0__
----------------------------------------------------------------------------------------------------------
AP-name  AP-mac          Client count  Data Usage  Through-Put  Admin-State
----------------------------------------------------------------------------------------------------------
9120i    d4e8.801a.3340  0             1407172     515          Enabled
AP1562   2c33.1192.3e40  0             4189901     69           Disabled
AP3800   0062.ecf3.8310  0             48548613    473          Disabled
Check for APs with zero clients that are in Enabled state.
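As a rough sketch, the filter for enabled APs with no clients could look like this, assuming the six-column rows shown above are captured one per line; `idle_enabled_aps` is a hypothetical helper name.

```python
def idle_enabled_aps(text):
    """Return names of APs that are administratively enabled but have no clients."""
    idle = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows: name, MAC, client count, data usage, throughput, admin state.
        if len(fields) == 6 and fields[2].isdigit():
            name, _mac, clients, _usage, _throughput, admin = fields
            if int(clients) == 0 and admin == "Enabled":
                idle.append(name)
    return idle


SAMPLE = """\
AP-name  AP-mac          Client count  Data Usage  Through-Put  Admin-State
9120i    d4e8.801a.3340  0             1407172     515          Enabled
AP1562   2c33.1192.3e40  0             4189901     69           Disabled
AP3800   0062.ecf3.8310  0             48548613    473          Disabled
"""

idle = idle_enabled_aps(SAMPLE)
print(idle)
```

Running the same filter on several captures taken over the monitoring period shows which APs stay at zero clients rather than being idle only momentarily.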
One example of how those AP KPIs helped to identify an issue was a customer facing random AP disconnections. We identified the impacted APs by looking at the AP uptime output. Using the “show ap cdp neighbor” output, we determined that all the affected APs were located in the same place and connected to the same switch. These APs had been disconnected because the connection was closed by the AP. In the AP logs we could see multiple CAPWAP packet retransmissions. We then pinged from the AP to the WLC and saw packet loss. The same packet loss was observed when pinging from the AP to its gateway. The ping tests showed that there was a connectivity problem between the APs and their gateway.
RF Checks
We can monitor, per band, the AP channel assignments, channel widths, transmit power, and radio state. We can check whether channels are evenly distributed in order to avoid co-channel interference. Additionally, we can determine whether multiple APs are using maximum Tx power, which could indicate coverage problems. It is also possible to identify radios that are marked as down and are not operational. This verification is required for all APs, including the new 9136 models, at 24ghz and 5ghz. The BSS Color assigned to each AP can be verified as well. Use the command “show ap dot11 24ghz/5ghz/6ghz summary”.
Gladius1#sh ap dot11 5ghz summary
AP Name  Mac Address     Slot  Admin State  Oper State  Width  Txpwr          Channel  Mode
---------------------------------------------------------------------------------------------
9130E    0c75.bdb5.71e0  1     Enabled      Up          20     *2/8 (21 dBm)  (100)*   Local
9130E    0c75.bdb5.71e0  2     Disabled     Down        20     *1/8 (15 dBm)  (36)*    Local
AP9120A  d4e8.8019.f140  1     Enabled      Up          20     *2/8 (19 dBm)  (40)*    Local
AP9120B  d4e8.801a.3400  1     Enabled      Up          20     7/8 (4 dBm)    (40)     Local
Check for Txpwr 1, uneven channel distribution, radios down, and unexpected static assignment.
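These four checks can be sketched against a saved copy of the summary output. The example assumes one radio per line in the layout shown above, a single channel number in parentheses, and that the asterisk next to the Txpwr and Channel values marks a dynamically assigned value (so a missing asterisk is treated as a static assignment). The regular expression and the `rf_report` helper are hypothetical.

```python
import re
from collections import Counter

ROW_RE = re.compile(
    r"^(?P<name>\S+)\s+(?P<mac>\S+)\s+(?P<slot>\d+)\s+(?P<admin>\S+)\s+(?P<oper>\S+)\s+"
    r"(?P<width>\d+)\s+(?P<pdyn>\*?)(?P<level>\d+)/(?P<levels>\d+)\s+"
    r"\((?P<dbm>-?\d+) dBm\)\s+\((?P<channel>\d+)\)(?P<cdyn>\*?)\s+(?P<mode>\S+)$"
)


def rf_report(text):
    """Summarize channel reuse, power level 1, down radios and static assignments."""
    matches = (ROW_RE.match(line.strip()) for line in text.splitlines())
    rows = [m.groupdict() for m in matches if m]
    up = [row for row in rows if row["oper"] == "Up"]
    return {
        # How many operational radios share each channel.
        "channel_use": Counter(int(row["channel"]) for row in up),
        # Power level 1 is the highest transmit power.
        "max_power": [row["name"] for row in rows if row["level"] == "1"],
        "down": [row["name"] for row in rows if row["oper"] != "Up"],
        # Assumption: no asterisk means the value was set statically.
        "static": [row["name"] for row in rows if not (row["pdyn"] and row["cdyn"])],
    }


SAMPLE = """\
AP Name  Mac Address     Slot  Admin State  Oper State  Width  Txpwr          Channel  Mode
9130E    0c75.bdb5.71e0  1     Enabled      Up          20     *2/8 (21 dBm)  (100)*   Local
9130E    0c75.bdb5.71e0  2     Disabled     Down        20     *1/8 (15 dBm)  (36)*    Local
AP9120A  d4e8.8019.f140  1     Enabled      Up          20     *2/8 (19 dBm)  (40)*    Local
AP9120B  d4e8.801a.3400  1     Enabled      Up          20     7/8 (4 dBm)    (40)     Local
"""

report = rf_report(SAMPLE)
print(report)
```

Rows with wider channels or other layouts will not match the pattern and are silently skipped, so the expression would need adjusting for those deployments.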
The next statistics allow us to determine how often radios change channel. We can also check whether an AP changed channels because radar was detected on its channel (5ghz). Client connectivity could be affected if we see many channel changes and the counters keep rising: each channel change causes the AP radio to reset, and all its clients are disconnected. On 5ghz, the AP radio must additionally monitor the channel for 60 seconds before beaconing. Excessive channel changes could indicate RRM or RF issues and should be investigated. Use the command “sh ap auto-rf dot11 5ghz | i Channel changes due to radar|AP Name|Channel Change Count”.
Gladius1#sh ap auto-rf dot11 5ghz | i Channel changes due to radar|AP Name|Channel Change Count
AP Name                       : 9130E-r3-sw2-g1014
Channel changes due to radar  : 0
Channel Change Count          : 2
AP Name                       : 9130E-r3-sw2-g1014
Channel changes due to radar  : 0
AP Name                       : AP9120-2-r3-sw2-Gi1-0-39
Channel changes due to radar  : 3
Channel Change Count          : 10
AP Name                       : AP9120-r3-sw3-Gi1-0-47
Channel changes due to radar  : 0
Channel Change Count          : 62
Check for a high number of channel changes and for changes due to DFS events.
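The filtered auto-RF output is a list of key/value lines, which makes it straightforward to process. The sketch below assumes one “key : value” pair per line; the helper names and the threshold of 20 channel changes are arbitrary illustrations, not recommended values.

```python
def parse_auto_rf(text):
    """Return one record per 'AP Name' block with its radar and change counters."""
    records = []
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "AP Name":
            # A new block starts; an AP can appear once per radio slot.
            records.append({"name": value, "radar": 0, "changes": 0})
        elif records and key == "Channel changes due to radar":
            records[-1]["radar"] = int(value)
        elif records and key == "Channel Change Count":
            records[-1]["changes"] = int(value)
    return records


def unstable_radios(records, max_changes=20):
    """Names of APs with radar-triggered changes or more than max_changes changes."""
    return [r["name"] for r in records if r["radar"] > 0 or r["changes"] > max_changes]


SAMPLE = """\
AP Name                       : 9130E-r3-sw2-g1014
Channel changes due to radar  : 0
Channel Change Count          : 2
AP Name                       : 9130E-r3-sw2-g1014
Channel changes due to radar  : 0
AP Name                       : AP9120-2-r3-sw2-Gi1-0-39
Channel changes due to radar  : 3
Channel Change Count          : 10
AP Name                       : AP9120-r3-sw3-Gi1-0-47
Channel changes due to radar  : 0
Channel Change Count          : 62
"""

records = parse_auto_rf(SAMPLE)
flagged = unstable_radios(records)
print(flagged)
```

Comparing the counters from two captures taken some time apart shows whether the numbers are still rising, which matters more than the absolute value.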
Another thing we can check is the radio load, or channel utilization. The Catalyst 9800 WLC displays the channel utilization and client count per radio to help us identify heavily loaded APs. We can spot APs that have few clients but a high load, focus our attention on them, and determine whether the load is due to traffic sent or received by the AP or to co-channel interference. Load information also helps identify the most loaded areas, which may require higher AP density. Use the command “show ap dot11 24ghz/5ghz/6ghz load-info”.
Gladius1#sh ap dot11 5ghz load-info
AP Name  Radio MAC       Slot  Channel Utilization (%)  Clients
----------------------------------------------------------------------------------------
9130E    0c75.bdb5.71e0  1     2                        0
9130E    0c75.bdb5.71e0  2     0                        0
AP9120A  d4e8.8019.f140  1     11                       5
AP9120B  d4e8.801a.3400  1     11                       0
Check for high channel utilization, or for channel utilization with no clients (co-channel interference). Here we can see co-channel interference: AP9120B shows 11% utilization with no clients, and AP9120A and AP9120B are both on channel 40.
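Radios that are busy without serving clients can be picked out with a short filter. This sketch assumes the load-info rows are captured one per line; the helper names and the 10% utilization threshold are assumptions chosen only to illustrate the check.

```python
def parse_load(text):
    """Return (ap_name, slot, utilization_percent, clients) for each radio row."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows: name, radio MAC, slot, channel utilization, client count.
        if len(fields) == 5 and all(f.isdigit() for f in fields[2:]):
            name, _mac, slot, util, clients = fields
            rows.append((name, int(slot), int(util), int(clients)))
    return rows


def busy_without_clients(rows, min_util=10):
    """Radios at or above min_util percent utilization that serve no clients."""
    return [(name, slot, util) for name, slot, util, clients in rows
            if clients == 0 and util >= min_util]


SAMPLE = """\
AP Name  Radio MAC       Slot  Channel Utilization (%)  Clients
9130E    0c75.bdb5.71e0  1     2                        0
9130E    0c75.bdb5.71e0  2     0                        0
AP9120A  d4e8.8019.f140  1     11                       5
AP9120B  d4e8.801a.3400  1     11                       0
"""

load_rows = parse_load(SAMPLE)
suspects = busy_without_clients(load_rows)
print(suspects)
```

A radio flagged this way is a candidate for co-channel interference, which can then be confirmed by checking which neighboring APs share its channel.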
One example of an issue identified by looking at RF KPIs was a customer reporting poor client performance. The radio load at 5ghz was high even though there were very few clients. The load was not caused by transmitting or receiving data but by co-channel interference. We found that an RF profile configuration issue caused only 4 channels to be assigned to the APs with high load. Utilization decreased after adding more channels to the RF profile, and no further performance issues were reported.
For more detailed RF analysis you can use the Wireless Config Analyzer Express (WCAE) tool: https://developer.cisco.com/docs/wireless-troubleshooting-tools/#wireless-config-analyzer-express
WCAE will show you the distribution of channels, TXpower, RF metrics per AP, and more details.
With the methodology and commands provided, you can proactively identify issues with the APs and RF of your WLC. In the next blog, we will share 9800 WLC KPIs to check client connectivity and WLC drops/punted packets.
List of commands to use for KPIs and automation scripts
The document below also contains a link to a script that automatically gathers all commands. It collects commands based on platform and release. The outputs are saved in a file and exported. This script uses the “Guest Shell” feature, which is currently only available in physical WLCs 9800-40/80 or 9800-L.
The document also includes an example of an EEM script that collects logs regularly. EEM, along with the “Guestshell” script, will allow you to collect 9800 WLC KPIs periodically. This will provide a baseline for your Catalyst 9800 WLC.
For the list of commands used to monitor those KPIs, visit the Monitor Wireless Catalyst 9800 KPIs document.
About JNS
As a Managed Service Provider delivering IT Services in Miami and throughout South Florida, we provide Cisco Wireless deployments for any venue. Call us today to learn more.