This is the second in a short series on the wider network services we need to get right in order to offer a good user experience to our Wi-Fi clients. First I mused about DNS; this time it’s the Dynamic Host Configuration Protocol (DHCP) for IPv4.
Put simply, DHCP is what assigns your device an IP address when it joins a network. I’m not going into detail on how to configure it; the focus here is what I’ve seen go wrong in the real world.
Ensure your IP address space and DHCP scope are large enough for the intended number of clients. For example, a coffee shop with a peak of 20 clients would be just fine using a /24 subnet, which allows for a total of 253 clients (after accounting for the router address), whereas a 17,000-seat stadium would need a substantially larger subnet. Don’t short-change yourself here; make sure there’s plenty of room for growth.
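To put numbers on that, here is a minimal sketch using Python’s standard `ipaddress` module. The function names, the `reserved` and `headroom` parameters, and the 192.168.1.0/24 example network are my own illustrative assumptions, and it treats the stadium as one flat subnet purely to show the arithmetic.

```python
import ipaddress


def usable_clients(cidr: str, reserved: int = 1) -> int:
    """Addresses left for clients after removing the network and broadcast
    addresses plus `reserved` statics (e.g. the router)."""
    net = ipaddress.ip_network(cidr)
    return net.num_addresses - 2 - reserved


def smallest_prefix(clients: int, reserved: int = 1, headroom: float = 1.0) -> int:
    """Longest IPv4 prefix (i.e. smallest subnet) that fits
    clients * headroom, plus reserved, network and broadcast addresses."""
    needed = int(clients * headroom) + reserved + 2
    host_bits = (needed - 1).bit_length()  # smallest n with 2**n >= needed
    return 32 - host_bits


print(usable_clients("192.168.1.0/24"))        # 253
print(smallest_prefix(17_000))                 # 17
print(smallest_prefix(17_000, headroom=2.0))   # 16
```

So a /17 just about covers 17,000 clients, and doubling the figure for growth pushes you to a /16.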
Pool exhaustion due to long lease duration. When the DHCP server runs out of IP addresses to hand out, that’s known as pool exhaustion. Consider the coffee shop with an ISP-provided router which offers addresses from a /24 subnet. That’s fine for the first 20 customers of the day, and the next 20, and so on, but each address stays reserved until its lease expires, even after the customer has left. A busy shop could soon have a lot of people come through, and if enough of them hop onto the network the DHCP pool could run out. This is especially likely if the pool isn’t the full 253 addresses but maybe only 100. The simple fix is to set a lower lease time for DHCP. One hour would likely be sufficient, but beware that short lease times can have an impact on server load in some circumstances, because clients renew more often.
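A rough back-of-the-envelope sketch shows why the lease time matters so much. It assumes a steady stream of arrivals and that an address is tied up for the length of the visit plus, at worst, one full lease after the customer leaves. The arrival rate, visit length and pool size below are made-up numbers for illustration, and the model ignores opening hours.

```python
def peak_leases(arrivals_per_hour: float, stay_hours: float, lease_hours: float) -> float:
    """Rough upper bound on leases in use at steady state:
    arrival rate multiplied by how long each address is held."""
    holding_time = stay_hours + lease_hours  # worst case: renewed just before leaving
    return arrivals_per_hour * holding_time


pool_size = 100  # hypothetical pool, smaller than the full /24
for lease in (24, 8, 1):
    in_use = peak_leases(arrivals_per_hour=30, stay_hours=0.5, lease_hours=lease)
    status = "exhausted" if in_use > pool_size else "ok"
    print(f"lease {lease:>2} h: ~{in_use:.0f} leases in use, pool of {pool_size} {status}")
```

With these assumed figures only the one-hour lease keeps the pool comfortably within its 100 addresses.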
The client needs to be able to reach the DHCP server. A common deployment of captive portals moves the user into a different role after authentication. I have encountered networks where the authenticated role blocked all RFC 1918 (private) addresses as a catch-all to prevent access to internal services. However, the DHCP server was itself on one of those addresses, so this also prevented the client from renewing its IP address. Much unpredictability ensued. The solution was simply to allow DHCP traffic to reach the DHCP servers.
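The fix is easier to see as a toy first-match rule list. This is only a sketch of the idea, not any vendor’s syntax: the `Rule` class, the `evaluate` function, and the DHCP server address 10.0.0.5 are all hypothetical, and it only models unicast traffic to the server (the kind a renewal uses).

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional

RFC1918 = [ipaddress.ip_network(n)
           for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


@dataclass
class Rule:
    action: str                      # "permit" or "deny"
    dst: ipaddress.IPv4Network       # destination network to match
    proto: Optional[str] = None      # None matches any protocol
    dst_port: Optional[int] = None   # None matches any port


def evaluate(rules, dst_ip, proto, dst_port, default="permit"):
    """Return the action of the first matching rule, else the default."""
    addr = ipaddress.ip_address(dst_ip)
    for r in rules:
        if addr in r.dst and r.proto in (None, proto) and r.dst_port in (None, dst_port):
            return r.action
    return default


DHCP_SERVER = "10.0.0.5"  # hypothetical server address
block_private = [Rule("deny", n) for n in RFC1918]

broken = block_private
# Permit DHCP (UDP port 67) to the server *before* the catch-all deny.
fixed = [Rule("permit", ipaddress.ip_network(DHCP_SERVER + "/32"), "udp", 67)] + block_private

print(evaluate(broken, DHCP_SERVER, "udp", 67))  # deny
print(evaluate(fixed, DHCP_SERVER, "udp", 67))   # permit
print(evaluate(fixed, DHCP_SERVER, "tcp", 443))  # deny
```

The narrow permit keeps the catch-all intact for everything else on the private ranges.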
DHCP server hardware capacity. DHCP can get really complicated and tied into expensive IPAM products such as Infoblox. For most deployments this isn’t necessary, and the hardware requirements are usually not significant enough to be a concern. However, this can be context dependent. A busy network with lots of people constantly coming and going likely has a very low peak DHCP request rate, because the requests are spread out over time. A stadium network where lots of people arrive very quickly may see peak demand that requires a little more horsepower. As with DNS, keep an eye on the server load to understand if hardware limits are being reached. In practice, very modest hardware can meet the DHCP demand of many thousands of users.
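For a feel of the stadium case, here is a small arithmetic sketch. The 30-minute arrival window and the assumption that every seat brings one device onto the network are mine, chosen only for illustration. It counts two inbound messages per new client, since a standard DHCP exchange has the client send a DISCOVER and then a REQUEST.

```python
def peak_new_clients_per_second(clients: int, arrival_window_minutes: float) -> float:
    """Average rate of new clients if everyone joins evenly across the window."""
    return clients / (arrival_window_minutes * 60)


rate = peak_new_clients_per_second(17_000, 30)
# Each new client sends two messages to the server: DISCOVER and REQUEST.
print(f"{rate:.1f} new clients/s, ~{rate * 2:.1f} inbound DHCP messages/s")
```

Real arrivals are burstier than an even spread, so treat the result as a floor for the peak rather than a prediction.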
Multiple server synchronization is where more than one server shares the same pool. It’s best practice in larger deployments for redundancy, but it’s something I have seen go wrong, with the result that the same IP address is offered to more than one client. Fixing this is getting too far into the weeds and will be implementation specific. It’s enough to know that it absolutely shouldn’t happen, and if the logs suggest it is, that’s a serious problem that needs someone to fix it.
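As a sketch of what “the logs suggest it is” might look like, the snippet below flags any IP address held by two different MAC addresses at overlapping times. The `(ip, mac, start, end)` record format and the sample data are hypothetical; real lease logs differ between implementations and would need parsing into this shape first.

```python
from collections import defaultdict
from itertools import combinations


def find_conflicts(leases):
    """leases: iterable of (ip, mac, start, end) with comparable times.
    Returns (ip, mac_a, mac_b) for every overlapping pair of leases
    on the same IP that belong to different MACs."""
    by_ip = defaultdict(list)
    for ip, mac, start, end in leases:
        by_ip[ip].append((start, end, mac))

    conflicts = []
    for ip, entries in by_ip.items():
        entries.sort()
        for (s1, e1, m1), (s2, e2, m2) in combinations(entries, 2):
            if m1 != m2 and s1 < e2 and s2 < e1:  # time ranges overlap
                conflicts.append((ip, m1, m2))
    return conflicts


# Hypothetical lease records, times in seconds.
leases = [
    ("10.0.0.50", "aa:aa:aa:aa:aa:01", 0, 3600),
    ("10.0.0.50", "aa:aa:aa:aa:aa:02", 1800, 5400),  # overlaps the lease above
    ("10.0.0.51", "aa:aa:aa:aa:aa:03", 0, 3600),
    ("10.0.0.51", "aa:aa:aa:aa:aa:04", 3600, 7200),  # reuse after expiry, fine
]
print(find_conflicts(leases))
```

Only the first address is reported; the second is an ordinary reuse after the earlier lease expired.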
The DHCP server simply stops working. Yep, this can and does happen. It’s especially a problem in some of the more affordable hardware solutions such as ISP-provided routers. I encountered a MikroTik router being used for DHCP on a large public network, and from time to time it would just stop issuing IP addresses to random clients before eventually issuing no leases at all. A reboot always resolved this, and I’m sure newer firmware has fixed it. There was often a battle with the owner to get them to restart it because “it was routing traffic just fine” and, yes, it was. It just wasn’t issuing IP address leases any more.
