SD-Branch meshing slowness

Aruba SD-Branch supports branch meshing which, as the name suggests, allows branches to build IPsec tunnels directly between one another and share routes. This is useful if you have server resources within a branch that need to be accessed from other sites. The idea is that it's more efficient for traffic to flow directly between sites rather than via the VPNC in the company data centre or cloud service.

Whilst this all makes complete sense, it’s worth considering that not all ISPs are equal – of course we know this – and not all ISP peering is quite what we might expect.

I have recently worked on a project where branch mesh is occasionally used and the customer experienced significant performance problems with site B accessing servers on site A when the mesh was enabled.

The issue was down to ISP peering. Site A is in country1, Site B is in country2 and the VPNC is in country3. Traffic from the ISPs at both sites to the VPNC was as fast as it could be. Both ISPs generally performed extremely well, but as soon as traffic was routed directly between them it took a strange path with very high latency.

Because the ISPs on both sites were performing well in all other respects, reachability and performance tests all looked good. The gateways therefore happily used the branch mesh for the traffic between the two sites and the user experience was horrible.

Short term fix was to disable mesh between these branches. Long term fix was to change ISP at one of the sites. The customer did try raising a case with both ISPs. One engaged and at least tried to do something, the other didn’t… guess which was replaced.

Copper POTS is dead, long live fibre!

The Plain Old Telephone Service that's been around in the UK since the late 19th century is about to be discontinued. This isn't happening everywhere all at once – it's being phased in on an exchange-by-exchange basis – but ultimately, if you currently have a basic phone line, it is going to stop working.

There are concerns around this, mostly centred on elderly people who still use a landline, and what happens when the power goes out. The argument goes that in an emergency – during a storm, say, or a power cut – we would not want to leave people without the ability to call for help.

There are other issues around the loss of analogue lines to do with monitoring and alarm systems, but these can pretty much all be mitigated with VOIP adapters. The real issue is reliable service in an emergency.

It’s hard to know how much of a problem this actually is. Whilst I don’t doubt there are plenty of elderly people for whom the landline is important, my first question is how many have a cordless phone? These have a base station that needs mains power. I have never seen a domestic unit with battery backup. In a power cut these do not work, but a secondary wired phone can be present.

Then there’s the question about line damage. The circumstances that lead to power outages can often also result in damage to telephone lines. It doesn’t matter that the exchange is still working if you’re no longer connected to it.

If your analogue landline is discontinued it will be replaced with either FTTP or a DSL circuit over copper pair. To maintain a phone service means using VOIP – either dedicated phone hardware or some sort of VOIP adaptor, possibly built into the router.

The most obvious solution is a low-cost UPS. To maintain service in an emergency, one to three devices would need to be powered. These are not going to present a very high current draw and a small UPS would probably keep things working for several hours – albeit with annoying beeping, which is itself likely to be an issue.

Debate I've seen is around who is responsible for this. I understand why this is raised, because currently a basic analogue telephone is powered by the exchange. The thing is, it's possible to consider an analogue telephone as part of the exchange: when in use it completes the circuit through the line. As this service is withdrawn it becomes necessary to power devices locally – as it already is with a cordless phone connected to an analogue line.

UK domestic comms has moved from circuit-switched analogue voice telephony to broadband packet-switched IP. The way analogue telephones work, at least between the exchange and the home, is essentially the same now as it was in the late 1800s, and it makes complete sense to end this service now that it is no longer used as the primary means of communication.

Something that doesn't get a lot of mention in the press coverage around this is that parts of the phone network are getting old. Nothing lasts forever, and periodically the cabling infrastructure in the ground and the equipment in exchanges needs to be replaced. We have gone from the human operator through several generations of mechanical switching (this link provides a wonderful explainer on how an old mechanical telephone exchange works) to electronic analogue switching, then to digital links, and now it's all IP between exchanges anyway. Having to maintain the old, just because it's been there a long time, does not make financial sense.

Given what the communications network is now used for it makes no sense to put new wiring in. If streets and driveways are to be dug up to replace old cabling it’s much better to replace them with fibre – and that’s what’s happening.

The problem here is one of change and how that change is managed. BT have gone from offering to provide battery-backed equipment, to not doing that, to postponing the copper-pair migration in some areas because it turns out they hadn't worked this out.

I have seen a lot of people claiming that the good old analogue phone is simple, reliable and should be maintained for that reason. I can see the logic of that though I would argue it’s only the user end that’s simple.

Perhaps there's a market for a simple VOIP phone terminal that has built-in DSL and ethernet, doesn't need a separate router and can supply power (a 12V outlet) for the PON terminal – with a nice big Li-Po battery inside. Get that right and BT can probably have a standard hardware item to issue that will just work, and the rollout can proceed.

Wi-Fi7 – rainbows, unicorns and high performance

Like every technological advancement that leads directly to sales of new hardware, Wi-Fi7 promises to solve all your problems. This, of course, will not happen; nevertheless it's now available ***in draft form*** so should you buy it?

No.

Ok let me qualify that. Not at the moment unless you are buying new Wi-Fi hardware anyway, in which case maybe.

The IEEE specification behind Wi-Fi7 is 802.11be and it isn't finalised yet. That means any Wi-Fi7 kit you can buy right now is an implementation of the draft specification. Chances are that specification isn't going to change much between now and when it's finalised (expected end of 2024), but it could. There's nothing new here; vendors have released hardware based on draft specs for the last few major revisions of the 802.11 Wi-Fi standards.

Perhaps more important is that, in the rush to get new hardware on the shelves, what you can buy now is Wi-Fi7 wave 1, which doesn't include some capabilities within the specification. As we saw with 802.11ac (Wi-Fi5), the wave 2 hardware can be expected to be quite a lot better – it will support more of the protocol's options, and chances are the hardware will be more power efficient too – personally I'd wait.

Something that’s important to remember about every iteration of Wi-Fi is that it almost certainly won’t magically solve whatever problem you have that you believe is caused by the Wi-Fi. Client support is also very sparse right now, so swapping out your Wi-Fi6 access point/Wi-Fi router for Wi-Fi7 hardware probably won’t make any difference at all.

As with all previous versions many of the benefits Wi-Fi7 brings are iterative improvements that aim to improve airtime usage. These are definitely worth having, but they’re not going to make the huge difference marketing might have you believe.

The possibility of 4096-QAM (subject to really high SNR) allows for higher data rates – all other things being equal. 512 MPDU compressed block-ack is a complex-sounding thing that ultimately means sending a bigger chunk of data at a time and being able to move more data before acknowledging – which is more efficient. Channel bonding is enhanced, with 320MHz channels now possible and improvements in how a channel within the bonded range that's in use by something else is handled. All very welcome (apart from maybe 320MHz channels) and all iterations on Wi-Fi6.

The biggest headline boost to performance in Wi-Fi7 is Multi-Link Operation – MLO. For anyone familiar with link aggregation (what Cisco calls a Port-Channel) – the idea of taping together a number of links to aggregate bandwidth across them as a single logical connection – MLO is basically this for Wi-Fi radios.

That 2.4GHz band that's been referred to as dead for the last however many years can now be duct-taped to 5GHz channels and you get extra bandwidth for your connection. You might expect you could also simultaneously use 5GHz and 6GHz, and you can… in theory, but none of the vendors offering Wi-Fi7 hardware support that right now. Chances are this is something that will come in the wave 2 hardware, maybe a software update… who knows.

There are benefits to MLO other than raw throughput – a device with two radios (2×2:2) could listen on both 5GHz and 6GHz (for example) and then use whichever channel is free to send its transmission. This can improve airtime usage on busy networks and reduce latency for the client. Devices switching bands within the same AP can also do so without needing to roam (currently moving from 2.4GHz to 5GHz is a roaming event that requires authentication) and this improves reliability.

Key to MLO is having Multi-Link Devices. Your client needs to support this for any of the above to work.

Wi-Fi7 has a lot to offer; it builds on Wi-Fi6 while introducing technology that paves the way for further significant improvements when Wi-Fi8 arrives. There's a lot of potential for Wi-Fi networks to get a lot faster with a Wi-Fi7 deployment.

Returning to my initial question… Personally I wouldn't buy Wi-Fi7 hardware today unless I already needed to replace my equipment. Even then, domestically I'd probably get something to fill the gap until the wave 2 hardware arrives. If everything is working just fine but you'd like to get that bit more wireless speed, chances are Wi-Fi7 isn't going to deliver quite as you might hope. Those super speeds need the client to be very close to the AP.

ClearPass extension restart behaviour

To address occasional issues with a misbehaving extension entering a restart loop some changes were made in ClearPass 6.11. This can result in an extension stopping when it isn’t expected and, crucially, not restarting again.

A restartPolicy option has been added which lets you ensure extensions will restart when the server or the extensions service restarts. A good practice is to add "restartPolicy": "unless-stopped" to your extension configuration – note that I have only used this with the Intune extension. Below are the options available.

  • “restartPolicy”: “no” – The Extension will not be automatically restarted after the server is restarted.
  • “restartPolicy”: “always” – The Extension will always be restarted after the server is restarted.
  • “restartPolicy”: “unless-stopped” – The Extension will be restarted unless it was stopped prior to the server restart, in which case it will maintain that state.
  • “restartPolicy”: “on-failure:N” – If the Extension fails to restart, the value for “N” specifies the number of times the Extension should try to restart. If you do not provide a value for “N”, the default value will be “0”.

Whilst the default behaviour ought to effectively match the "unless-stopped" policy, in my experience there can be issues with extensions stopping unexpectedly. Prior to release 6.11.5 a bug meant an unrelated service could restart the extensions service, and this resulted in stopped extensions. Whilst this should now be resolved, I have still run into the problem. Adding "restartPolicy": "unless-stopped" resolved it for me.
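As a minimal sketch, the change is just one extra key in the extension's JSON configuration, sitting alongside whatever settings your extension already has (omitted here):

{
  "restartPolicy": "unless-stopped"
}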

ClearPass Guest pages in a specific language

If you've made use of language packs in ClearPass Guest, you'll know that it's possible to support multiple languages across Guest in both customer-facing pages and the back end. Everything will use whichever language you have set as default, and you can then give users the option of choosing an alternative. There's also the option of enabling language detection, where ClearPass will hopefully match the language used to the user's system settings – this can be found in the Language Assistant within ClearPass Guest.

This works very well and is going to meet most requirements but there are some edge cases where it may be desirable to have some guest pages that open in a language different to the back end default.

Take the example of regional languages that are especially important to a subset of users but might not have wide operating system support. ClearPass Guest offers language customisation allowing use of a language that isn’t built in to the product. In such an example it might be a requirement to use the regional language as default for a captive portal but administrators of the system may not speak the language – it’s also worth noting that if translations don’t exist for a selected language ClearPass can exhibit some buggy behaviour with elements of the back end UI no longer working (as of release 6.11.5).

Whilst there may be alternative methods, one option is using the language selector to redirect users to the appropriate page and language.

Under the hood, what the language selector drop-down actually does is request translation_lang.php with parameters for the destination page (usually the page you're already on) and the language. You can use this as your captive portal redirect to land directly on the language of choice.

As an example, if you want the self-registration login page in Klingon it’s something like this:
https://CPPM/guest/translation_lang.php?target=guest_register_login.php&lang=tlh

Adjust to match your server address, the desired page and language pack.

Trouble wi’ broadband

A decent domestic broadband circuit is pretty important for most of us, especially if, like me, you work from home some of the time. I've been fortunate that over the last few years things have been pretty good; however, that may be changing, and the frustrating relationship between ISPs and transit providers operating in the UK makes for annoyingly difficult decisions.


ClearPass auth failure diagnostic

Here’s a “learn from my recent experience” type post

The problem: Clients are unable to authenticate from a new Wi-Fi network that has been added

Observations:

  • ClearPass appears to be working fine
  • Clients are successfully authenticating from the existing network using EAP-TLS
  • A Policy Manager service has been configured for the new network and incoming requests are correctly categorised
  • Authentication attempts from the new network are rejected, seen in Access Tracker
  • Failing auths are showing as an outer type of EAP, not EAP-TLS
  • No certificate content is shown in the computed attributes of the failed auths
  • Apple Mac clients are able to authenticate to the new network successfully; managed Windows clients are not. The same clients work fine on the existing network.

The obvious conclusion is that the new network is incorrectly configured, and this turned out to be the case, but what's wrong? The last point in the observations was particularly interesting and threw a spanner at the "network config error" idea, because if the network config is wrong, why can a Mac authenticate? Is it the client? What's the difference?

Connections from the new network were proxied via another RADIUS server. This is because the solution uses RADSEC and the new network's Wi-Fi controllers don't support RADSEC.

Access Tracker appeared to show insufficient information for the auth to be successful. Crucially, there was no client certificate information, and the outer method showing as just EAP was… odd.

Looking at the logs for an auth showed an error early in the process:

rlm_eap: Identity does not match User-Name, setting from EAP Identity.

Ultimately here’s what the problem was… The proxy forwarding these authentications had a default to strip information from the username. Windows clients which presented the username as host\<hostname>.<domain> had this stripped back to just the hostname. So the TLS tunnel outer username presented to ClearPass became hostname$. The Mac clients didn’t present the FQDN as the username so nothing was stripped.
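To make that concrete, with hypothetical names:

  • Username sent by the Windows client: host\laptop01.corp.example.com
  • After the proxy strips the domain, the outer identity becomes: laptop01$
  • The identity inside the TLS tunnel is still: host\laptop01.corp.example.com

The outer and inner identities no longer match, which is exactly what the rlm_eap error above is complaining about.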

ClearPass performs a check comparing the Outer Identity of the TLS tunnel with the Inner Identity. If the outer identity is valid but the inner identity differs, the auth will fail; in the case of EAP-TLS the error above is displayed in the logs.

The Outer and Inner Identity must either match, or the Outer Identity must be set to Anonymous.

Note this applies to EAP-PEAP and EAP-TLS. EAP-TTLS may also have issues with mismatches… basically make sure it all matches, don’t have anything strip data from the outer identity unless it’s being set to Anonymous.

The solution to this was to disable the domain stripping on the proxy.

For anyone who's found this post after running into this issue, take heart: the information presented by Access Tracker is not at all helpful in understanding why the auth has failed. It looks as though the certificate hasn't been presented at all, when in fact that data just isn't shown to you.

The error message tells you exactly what’s wrong, once you understand how it works.

ClearPass, Intune and MAC randomisation

As more organisations have moved to Microsoft Azure AD and Intune to manage their devices a common request is how to integrate this with Aruba ClearPass, which handles the RADIUS requests for Network Access Control. The most common deployment pushes certificates to the clients which use these for EAP-TLS authentication with ClearPass. Note this post doesn’t cover the basic Intune integration setup which is documented in an Aruba guide.

But you probably want to know a bit more about the client than whether it has a valid certificate, so ClearPass has an Intune extension which will download information from an Azure tenant's Intune to the Endpoints Repository database.

This allows you to make policy decisions based on Intune attributes, such as compliance state, letting you place clients in different roles/VLANs depending on what the client is, whether it's compliant with policy, its department, and so on.

However… the Endpoints Repository uses MAC addresses so problems start if clients are using MAC randomisation. The MAC address presented by the client won’t match what’s recorded by Intune so it won’t be possible to match against Intune attributes in the Endpoint.

The first thing to do is stop using the MAC address as the UID. Much better to use the Intune ID and have this written into the client certificate, either as the CN or a SAN, though you can use any variable that ends up being computed from the authentication.

You can then either query Intune directly using the Intune Extension HTTP method or use the Endpoints Repository in a slightly different way.

I prefer the latter option most of the time because it provides a level of resilience as well as improving performance. Querying the Intune API via the extension works very well but if Intune is down (which has been known to happen) that won’t work. It will also take longer than querying a local database.

ClearPass assumes you’re using the presented MAC address as the UID of the Endpoint and it isn’t possible to change this. Instead you can query Endpoints as an SQL database with a filter that pulls out the attributes you require based on the presented certificate CN.

ClearPass databases can be accessed externally using the username ‘appexternal’ and the password which is set in cluster-wide parameters under the Database tab.

Next create a new Generic SQL DB Authentication source pointing to the local tipsdb and set a filter that pulls out the attributes you want for the auth session variable presented.

Server name: <server IP>
Database Name: tipsdb
Username: appexternal
Password: <password you set>
ODBC Driver: PostgreSQL
Password Type: Cleartext

A few things to note here. You can’t use localhost as the server name. If you have a VRRP address you can use this, otherwise you must use the actual IP of the server. This can cause complications in an environment with multiple ClearPass servers and no VRRP. There are ways around this but that’s for another blog.

The filter is the SQL query that pulls out the attributes you want based on an attribute presented. In this case we're selecting 'Intune User Principal Name', 'Intune Compliance State' and 'Intune Device Registration State' for a record where the 'Intune ID' matches the Subject-CN of the client certificate:

select attributes->>'Intune User Principal Name' as "Intune User Principal Name",
       attributes->>'Intune Compliance State' as "Intune Compliance State",
       attributes->>'Intune Device Registration State' as "Intune Device Registration State"
FROM tips_endpoints
WHERE attributes->>'Intune ID' = LOWER('%{Certificate:Subject-CN}');
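If you want to sanity-check the filter before wiring it into a service, you can run it directly against the database. This is a sketch, assuming a psql client on a host that's permitted database access, the default PostgreSQL port of 5432, and a literal Intune ID substituted for the session variable:

psql -h <server IP> -p 5432 -U appexternal tipsdb

Then, at the tipsdb=> prompt (replacing the placeholder with the Intune ID of a test device):

select attributes->>'Intune Compliance State' as "Intune Compliance State"
FROM tips_endpoints
WHERE attributes->>'Intune ID' = LOWER('<intune-id-of-a-test-device>');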

You then specify for each of these how you want the system to use the data it gets back, essentially either as an attribute or directly set as a role.

Add your custom authentication source into the service, and you’re good to go. You will now be able to make policy decisions based on the Intune ID lookup rather than the MAC address.

Three more things to note about this.

If you’re limited as to what can go into the certificate CN or SAN you can use the same method to pull out other details, for example with a query like this:

select attributes->>'Intune User Principal Name' as "Intune User Principal Name",
       attributes->>'Intune Compliance State' as "Intune Compliance State",
       attributes->>'Intune Device Registration State' as "Intune Device Registration State"
FROM tips_endpoints
WHERE attributes->>'Intune Device Name' = '%{Authentication:Full-Username}';

This method requires the device to have been downloaded to Endpoints by the Intune extension. If that doesn't happen it won't be there to match on. In some circumstances Intune no longer records MAC addresses for devices – notably self-registered personal Android devices – and because the Endpoints Repository is built around MAC addresses these devices will be missing.

This is one of the reasons it’s worth using the Intune ID as your device’s UID in the certificate – if you need to query the Intune extension via HTTP you’ll need to present it with the Intune ID, nothing else will work. It’s worth noting devices will also have an Azure AD ID which looks similar, and will likely be the same for devices not managed by Azure AD, but the Intune API only understands the Intune ID when querying for device attributes.

Yet another Ser2Net tutorial

I often spend time away from home and want to be able to reach my home lab, both hardware and virtual. I use a Wireguard VPN, running within Home Assistant on an RPi 3, for the remote access, which means the network side is sorted. However I often find I'm juggling a few projects and might need to rebuild a hardware controller or an AP… that essentially needs console access. Whilst I could use a dedicated DC-style console server, they're expensive, awkward, and overkill, so I use Ser2Net on another RPi. There are plenty of tutorials on how to set up Ser2Net which are probably better than this one, but everything I found is based on an older version. Since then the config has changed to YAML and I found the defaults didn't behave as I expected… so here we are.

Ser2Net “provides a way for a user to connect from a network connection to a serial port” – so says the project author Corey Minyard. You define a TTY interface and how you would like to connect to it. By default Ser2Net forwards raw data over TCP via a specified port to and from the TTY interface. Again, by default, you can only access a TTY over the network from localhost.

I threw my console server together using the latest version of Ubuntu for Raspberry Pi, Ubuntu 23.04 at the time of writing, running on a Pi 2b. Any old RPi is a good choice for the low power consumption and very light requirements.

Ubuntu 23.04 repositories contain ser2net version 4.3.11, which differs from previous versions in that it uses YAML for the config. This is found in /etc/ser2net.yaml.

I've used three different types of USB console – a couple of cheap FTDI cables from Amazon, the USB console interface on an Aruba 7005 controller, and the Aruba TTL-to-serial cable for an AP. All were recognised by the OS.

How to set it up:

Build your Pi (or whatever machine you’re using) with Ubuntu 23.04 (or later)
Run sudo apt update && sudo apt upgrade (just because we always should)
Install ser2net with sudo apt install ser2net
Connect the USB serial interfaces and issue the command: sudo dmesg | grep ttyUSB
This will show you the USB to serial interfaces that have been recognised by the OS. It will look something like this:
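The exact lines depend on the adapters you have connected; with a typical FTDI cable the output is along these lines (illustrative only, not from a real system):

[   12.345678] usb 1-1.2: FTDI USB Serial Device converter now attached to ttyUSB0
[   14.567890] usb 1-1.3: FTDI USB Serial Device converter now attached to ttyUSB1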

You can now add these connections to the YAML file.

A quick note on security. Ser2Net doesn’t have any authentication. You can restrict the listener to localhost, as distinct from the host IP, and that means everything is protected by the strength of your ser2net host logon. I just want a telnet port forwarded to the TTY so it’s easy. This is not a good idea for any production environment without having other layers of security. In this case it’s a lab, and it’s only accessible either in person or via my VPN… so it’s good enough for me, but might not be for you.

The YAML for my console interfaces looks like this:

connection: &con0096
    accepter: telnet,192.168.26.3,2000
    enable: on
    options:
      banner: *banner
      kickolduser: true
      telnet-brk-on-sync: true
    connector: serialdev,
              /dev/ttyUSB0,
              9600n81,local
 

You need to increment the connection name, the port (2000 in this example) and the connector device (/dev/ttyUSB0 here) for each additional interface. If you have duplicates it won't work properly, though I believe you can use the same device in multiple connections to allow different port settings. The IP address of the accepter is where it listens for a connection. The default config has this set to tcp,localhost,xxxx, which passes raw data over TCP (use something like nc) and is only available from the local machine. It probably goes without saying, but I'll say it anyway: under the connector be sure to check the serial port settings are correct.
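As a sketch, a second interface would look like this – only the anchor name, the listening port and the device change (the values here are just for illustration):

connection: &con0097
    accepter: telnet,192.168.26.3,2001
    enable: on
    options:
      banner: *banner
      kickolduser: true
      telnet-brk-on-sync: true
    connector: serialdev,
              /dev/ttyUSB1,
              9600n81,local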

After changing the config file restart the service with sudo systemctl restart ser2net

I can now telnet to my ser2net host of 192.168.26.3 and depending on the port I get connected to a different machine. As shown in the screenshot above I have four interfaces connected and could use a USB hub or multi-port serial interface to access more machines.
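For example, to reach whatever is attached to /dev/ttyUSB0 in the config above:

telnet 192.168.26.3 2000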