Aruba Control Plane Security and the AP-203H

Here’s a useful little tidbit for anyone with an Aruba OS 6.5 environment wanting to enable control plane security with some AP-203H on the network.

(EDIT: This also applies to AOS8 where CPsec is now enabled by default. Whilst the AP-203H is quite a dated now, there may be some still around and you will experience these issues when migrating from AOS6/6.5 to AOS8)

With Aruba campus APs all traffic between users and the controller is encrypted (probably) by virtue of the Wi-Fi encryption in use being between the client and the mobility controller, rather than being decrypted by the AP. So unless you’re using an open network all your client traffic is encrypted until it pops out of the controller. Lovely.

Control plane traffic, the AP talking to the controller, is not. This isn’t usually a problem as we mostly trust the wires (whether we should or not is another matter).

In many Aruba controller environments all the user traffic is tunneled back to the controller in our data centre, which works very well especially when users are highly mobile around a campus.

However it’s also possible to bridge an SSID so the AP drops the traffic on a local vlan. This is desirable in some circumstances, one real world example being a robotics research lab needing to connect devices to a local subnet. In order to enable this you first have to switch on control plane security.

Switching CPsec on causes all APs to reboot at least twice so on an existing deployment it leads to 8-15 minutes of downtime. I switched CPsec on this morning for our network with approximately 2500 APs, it was a tense time but went well…. mostly.

I’d read stories of APs with a bad trusted platform module being unable to share their certificate with the controller. We have a lot of APs from many years old to brand new and even a low percentage failure rate would present a problem.

In the end two APs failed, with lots of TPM initialization errors. However all our AP-203H units failed to come back up and the controller process logs started showing things like this:

Sep 4 07:57:37 nanny[1399]: <303022> <WARN> |AP <apname>@<ip_addr> nanny| Reboot Reason: AP rebooted Wed Dec 31 16:01:41 PST 1969; Unable to set up IPSec tunnel to saved lms, Error:RC_ERROR_CPSEC_CERT_REJECTED

It took a little while to become clear just one model of AP was affected and it was all of them.

A bit of time with Aruba TAC later and it transpires it’s necessary to execute the command: crypto-local pki allow-low-assurance-devices 

This needs to be run on each controller separately, and saved of course.

The command is detailed as allowing non-tpm devices to connect to the controller. I’m not entirely clear what’s special about the AP-203H, presumably it doesn’t have a TPM, yet it does have a factory installed certificate. We also have some AP-103H units, which don’t have a TPM but they work just fine with a switch-cert. I suspect this is a bug in the firmware and Aruba OS is treating the 203H as if it has a TPM but as it doesn’t everything falls apart.

Clearly if you want the maximum possible security, allowing low assurance devices is presumably going to raise eyebrows. In our case we’re happy with this limitation.

May this post find someone else who one day searches for RC_ERROR_CPSEC_CERT_REJECTED ap-203h 

Taming Aruba ARM

About 18 months ago a new building was completed on our campus. During the post install Wi-Fi survey I noticed that pretty much all the APs were using the same 5GHz channel, despite being configured for our standard radio management profile. In fact the only thing Aruba ARM appeared to have done was reduce the power of most APs to the minimum specified. So what was going wrong?

Depending on who you listen to using the automatic radio management tool in your enterprise Wi-Fi setup is either eminently sensible or complete madness. Feelings run high when it comes to radio config it seems. Whether you ultimately decide to use it or not, it’s worth understanding what it does, how it does it, and how it really behaves outside of the ideal deployment.

Your Wi-Fi vendor sales guy will promise the world; they’ll tell you their solution is the best (of course) and that it will solve any RF issues, even removing the need for costly install surveys and design because the magic radio management will just sort it all out.

This is, of course, not true.

The first thing to you should be very clear about is that no RF management will fix a fundamentally bad design. So what does it do for you?

I have the most experience with Aruba’s Adaptive Radio Management, or ARM for short. ARM has grown gradually more sophisticated over the years as ArubaOS has developed. Essentially what ARM does is attempt to stop your access points from interfering with each other, reducing co-channel interference (APs on the same channel within range of each other). ARM will change the channels used by your APs and adjust power levels. It can also disable 2.4GHz radios where it deems necessary. The latest version announced can even change channel widths, allowing you to maximise performance by switching to 40MHz channels automatically where there is sufficient channel space.

In an environment with a controller and ‘lightweight’ APs you can be forgiven for thinking the controller gathers all the information about which APs can see each other and then creates a perfect channel plan. In reality Aruba campus APs are far less lightweight than you might imagine. Most of the management traffic and RF decisions, such as radar detection, channel selection and how ARM behaves all takes place on the AP with the controller notified what channel the AP has done.

This means automatic channel selection is an iterative process. In a large building with a lot of APs it can take ARM a long time to settle down as APs move channel to avoid a neighbour, but then interfere with a different neighbour. You have all your APs moving around the available channels, trying to keep out of each other’s way. In a large deployment I reckon on this process taking about 24 hours.

So what was going on in that new building?

The settings we were running for ARM were mostly default and these are fairly conservative. APs moving channel can cause a poor user experience, so you don’t want your APs to be jumping around all the time. By default ARM won’t move channel if there’s a client associated with the radio. You can control this behaviour by disabling Client Aware and control how aggressive ARM is by adjusting the Coverage Index and Backoff Time.

By creating an “aggressive” ARM profile and applying this to the new building AP group I found ARM then worked as expected, changing channels in order to minimise channel overlap. Good job ARM. Once the iterative process had been left running for 24 hours, we reverted to our regular ARM profile and all has looked pretty good since.

Because the 2.4GHz band has few non-overlapping channels, if you install dual band APs you’ll probably want to disable the 2.4GHz radio on quite a few APs. ARM will do this for you with a function called Mode-Aware. I can’t recommend using this. The problem is that ARM doesn’t have a high level view of… anything. Remember it’s just each AP knowing what it can ‘see’ around it. So ARM doesn’t understand the layout of the building and what areas each AP are providing coverage for.

On every occasion mode-aware has been turned on, we’ve had problems. ARM has disabled 2.4GHz radios on the wrong APs, usually those closest to the majority of users. Once after a system upgrade we had complaints that previously functioning Wi-Fi was now very poor. ARM had moved which APs had the 2.4GHz radio switched off. This is a function that might improve in the future but right now when it comes to disabling radios I reckon I can do a better job myself.

There can be similar issues with changing power levels. I’ve limited the range ARM can use to adjust power. It can reduce power by about 6dB, which is quite enough.

I mentioned that ARM won’t fix a bad design. Take an example (now fixed) of an accommodation block with APs in corridors, mounted at the same point on each floor, so vertically above each other. Pretty much every AP could see every other AP. 2.4GHz was a disaster and all ARM could do was turn down the power, which it did, to the absolute minimum. That didn’t help the channel overlap, but it did mean there was no longer coverage into many of the bedrooms turning a bad Wi-Fi experience into no Wi-Fi experience at all. It turned out 5GHz wasn’t much better because there were always clients around, so several APs were stuck on the same 5GHz channel.

None of this is ARM’s fault, but as we worked to fix the problems experienced by users, the first thing we did was disable most of the ARM features because their attempts to fix the poor RF design just made things worse.

So why bother with ARM at all? Firstly there has been no appetite to manually build static channel plans for more than 2000 APs. Aruba’s config leans towards an assumption you’ll be using ARM. The administrative overhead of the necessary profiles for all the various channel options would be an unpleasant thing to deal with. Perhaps most importantly, since making some fairly small changes to ARM, and better understanding what it can do for us, it’s now a useful tool that makes running the network easier and more predictable.