A follow-up to this post about new APs leading to a much worse experience for users. Perhaps it’s no surprise that things were more complicated: there are more bugs.
There’s still more work to do on the power output question, but with the settings changed some users were now happy, others less so. Some APs now had a healthy mix of 2.4GHz and 5GHz clients, but at least one AP had never had a 5GHz client in its short deployment life of a week.
Something didn’t smell right.
We’d previously spotted something on the Aruba mobility controller that suggested there was lots of activity on the 5GHz channels, even though we knew there weren’t many clients. This was put aside with the discovery the power output wasn’t quite what we expected. Now, with one AP that had seen no 5GHz clients at all there was something else to go at.
Here’s a view of the 5GHz radio on our Aruba mobility controller. That channel utilization is fairly high. Normally high channel utilization is a pretty good thing because it means people are using the wi-fi. Yay. However, we knew they weren’t, and in fact they complained they couldn’t. The neighbour report from the AP also showed there was nothing else using that channel.
ArubaOS provides shed loads of information about APs, their radios and their clients. Often there’s far more information than you know what to do with, but that’s another story. In-depth radio stats are available, and most of them are nicely understandable. They give a few clues as to what might be going on.
There’s a lot of information here, but one line that caught my eye was the Tx Failed Beacons count.
An AP sends out beacon frames regularly at predefined intervals, usually every 100(ish) ms. To transmit the beacon an AP has to contend for access to the wireless medium, just like any other station wanting to transmit, so if the channel is busy the AP won’t be able to transmit its beacon. No biggie, this probably won’t happen very often. This AP had recently been rebooted, yet the count of failed beacons was so high that a quick calculation confirmed it was pretty much all of them.
So although everything looks fine superficially the AP just isn’t working on 5GHz, it isn’t transmitting any (many, it turned out) beacons.
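The quick calculation is simple enough to sketch. This is a rough illustration rather than anything the controller does itself: it assumes a beacon interval of 100 time units (one 802.11 time unit is 1024 microseconds) and that the failed counter covers every BSSID on the radio. The function names and the numbers fed in are made up for the example.

```python
from datetime import timedelta

BEACON_INTERVAL_TU = 100   # assumed beacon interval, in 802.11 time units
TU_SECONDS = 1024e-6       # one time unit is 1024 microseconds


def expected_beacons(uptime: timedelta, bssids: int = 1) -> int:
    """Beacons the radio should have attempted since boot, across all BSSIDs."""
    interval_s = BEACON_INTERVAL_TU * TU_SECONDS
    return int(uptime.total_seconds() / interval_s) * bssids


def failed_fraction(failed: int, uptime: timedelta, bssids: int = 1) -> float:
    """Share of expected beacons counted as failed, capped at 1.0."""
    expected = expected_beacons(uptime, bssids)
    if expected == 0:
        return 0.0
    return min(failed / expected, 1.0)


# Illustrative figures only: ten minutes of uptime, a single BSSID.
uptime = timedelta(minutes=10)
print(expected_beacons(uptime))
print(f"{failed_fraction(5800, uptime):.0%}")
```

If the failed count is close to the expected count for the time since reboot, the radio has been failing nearly every beacon, not just the odd one.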
I set up another AP of the same model on my desk and was quickly able to confirm the behaviour. All the basic parameters of the AP reported that it was just fine, but the channel utilization was high and the failed beacons count was climbing.
Using Ekahau Site Survey with a Sidekick spectrum analyser I could see there wasn’t another AP on the same channel, and in fact there was nothing else using the channel. I could see the AP’s BSSID disappearing periodically as it failed to transmit beacons and the Ekahau software lost sight of it and removed it from the list.
So we know the channel is clear, yet the AP reports it’s busy. Also, the AP is reporting this as channel utilization rather than noise. The Ekahau Sidekick lets us see both, and it shows neither, so we can be confident this AP just isn’t working properly.
Software upgrades of a network in use 24/7 are a tricky issue. We try not to fall too far behind, but we can’t just upgrade every time there’s a new release. We don’t like to be on the ragged edge of the latest release with our production controllers, so we schedule software upgrades for times when the majority of the campus is quiet.
Unsurprisingly we’re a few minor releases behind at this point. Perhaps even more unsurprising is that we’ve found a defect description that sounds rather familiar.
Sure enough a test of the circumstances on both our production and development mobility controllers proves this is a bug in the production code, and so an update is the order of the day.
One final frustration is that my colleague had already searched through most of the release notes looking for the AP-314 or the 310 series. The release notes specify the AP-315… Gah!
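One way to avoid that trap is to search for the whole model family rather than a single model number. The sketch below is only an illustration: the pattern assumes the 310 series models are all numbered AP-31x, the function name is made up, and the sample text is invented rather than taken from real release notes.

```python
import re

# Match any AP-31x model (AP-314, AP-315, ...) or the series name itself.
SERIES_310 = re.compile(r"\bAP-31\d\b|\b310\s+series\b", re.IGNORECASE)


def matching_lines(notes: str) -> list[str]:
    """Return the lines of the notes that mention any 310 series model."""
    return [line.strip() for line in notes.splitlines() if SERIES_310.search(line)]


# Invented text standing in for release notes; not real ArubaOS entries.
sample = """
Hypothetical entry: AP-315 stops sending beacons on the 5GHz radio.
Hypothetical entry: AP-205 reboots unexpectedly.
Hypothetical entry: 310 Series access points report high channel utilization.
"""

for line in matching_lines(sample):
    print(line)
```

Searching this way would have turned up the AP-315 entry even though we were running AP-314s.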
Perhaps one moral of this story is it does pay to keep an eye on the release notes as they come out. It’s worth scanning through the fixed and known issues so that if you suffer from a problem, something deep in your brain might just spark that you’ve seen something about this somewhere before. That said, the ArubaOS 18.104.22.168 release has pages of fixed bugs, far too many to remember what they are.