I recently saw an issue in a hospital where the client I was testing had a difficult time roaming to the next access point down the hallway, and I was lucky enough to capture a trace of it happening in real time.
Figure 1. I/O Graph of a Device Roaming Behavior
This graph is using percentage for the Y axis and time for the X axis. The black line across the top of the graph represents the upstream frames from the device to the access point. The blue dots represent the downstream frames from the access point to the client. The red lines indicate upstream retries, and the green lines are probes.
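To reproduce a graph like this one, each series needs its own display filter. The sketch below builds four filter strings for a single client. The field names (wlan.ta, wlan.ra, wlan.fc.retry, wlan.fc.type_subtype) are standard Wireshark 802.11 fields, but the exact filters behind Figure 1 are not given in this post, so treat these as assumptions. In particular, "probes" is assumed to mean probe requests (subtype 4) sent by the client, and the MAC address is a placeholder.

```python
def build_io_graph_filters(client_mac: str) -> dict[str, str]:
    """Return one Wireshark display filter per I/O graph series.

    Assumption: the series are keyed on the client's transmitter and
    receiver addresses. The post does not state the filters it used.
    """
    mac = client_mac.lower()
    return {
        # Frames sent by the client (black line in Figure 1)
        "upstream": f"wlan.ta == {mac}",
        # Frames addressed to the client (blue dots)
        "downstream": f"wlan.ra == {mac}",
        # Client frames with the Retry bit set (red lines)
        "upstream_retries": f"wlan.ta == {mac} && wlan.fc.retry == 1",
        # Probe requests from the client (green lines); subtype 4 is assumed
        "probes": f"wlan.ta == {mac} && wlan.fc.type_subtype == 4",
    }


if __name__ == "__main__":
    # Placeholder MAC address, not the device from the capture
    for name, expr in build_io_graph_filters("00:11:22:33:44:55").items():
        print(f"{name}: {expr}")
```

Each string can be pasted into its own row in the Wireshark I/O Graph dialog, with a different color and style for each row.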
The device starts out with a very strong signal on channel 36: the blue dots sit at the top of the percentage chart. As the device moves down the hall and away from that access point, the signal starts to drop. The device begins probing for a new access point and associates to channel 44, but the signal there is almost as bad as the one it roamed away from. This is why you don't install your access points down the middle of the hall. What looks like low coverage is really what happens when there is nothing to attenuate the signal from one access point to another, and the client has already passed the target access point before associating. By the time it does associate, it is already looking for a more suitable access point.
Figure 2. Client Device Loses Four Seconds of Audio Trying to Roam
At 34 seconds, the client, suffering at -78 dBm, tries to associate to the 2.4 GHz radio in the access point it is already associated to, losing almost three seconds. The client gives up on that attempt and spends a few more milliseconds before it disassociates from channel 44 altogether and goes through a new probe sequence, losing almost another two seconds of audio before associating to channel 149.
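Gaps like these can also be pulled out of a capture programmatically once the arrival times of the downstream audio frames have been exported. The following is a minimal sketch, assuming the timestamps are already available as a list of seconds. The function name, the 0.5 second threshold, and the sample values are all illustrative and are not taken from this trace.

```python
def find_gaps(timestamps, threshold=0.5):
    """Return (start, end, duration) for each gap longer than threshold seconds.

    timestamps: arrival times, in seconds, of downstream audio frames.
    threshold:  assumed cutoff for what counts as an audible dropout.
    """
    ordered = sorted(timestamps)
    gaps = []
    # Compare each frame time with the one that follows it
    for earlier, later in zip(ordered, ordered[1:]):
        duration = later - earlier
        if duration > threshold:
            gaps.append((earlier, later, duration))
    return gaps


if __name__ == "__main__":
    # Made-up sample times, not values from the capture in this post
    sample = [33.5, 33.75, 34.0, 36.5, 36.75, 38.5, 38.75]
    for start, end, duration in find_gaps(sample):
        print(f"gap from {start} s to {end} s ({duration} s)")
```

Summing the reported durations gives the total audio lost during the roam, which is the figure the customer cares about.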
There are many issues with the configuration of this wireless environment, such as inter-band roaming, poor access point placement, and no 802.11k configured. Beyond those, this device tried to associate to anything other than the channel 149 access point. It turns out that the channel 44 access point is an Aruba AP-635 and the channel 149 access point is an older model, an AP-505. This in itself should not have been an issue, since the two access points can exist on the same controller with the same firmware. However, something obviously occurred to lose four seconds of audio. This information was presented to the customer, who will now work with their wireless vendor to determine why this happened. With more information and more access to the wireless controller, I'm sure I could find a reason, but my job currently is to collect and analyze data and present results. This data goes to the customer and our support team so they have a baseline to work from in the event a support ticket is generated.
The point is that the ability to capture and analyze layer two traffic over the air is a vital component of performing wireless assessments on site at customer locations. I heard the audio loss and saw the device showing a 'Searching for AP' message, but the capture gives the customer useful data to begin troubleshooting their environment and provides proof that it happened.
This trace was captured using OmniPeek with eight 5 GHz Wi-Fi adapters set to different channels to capture a single device roaming from access point to access point. I then take that file and view it in Wireshark, because it's more familiar to me. Every day is a school day, and I hope to eventually be proficient enough not only to be better at my own analysis but also to teach junior engineers how to capture and interpret data.
