I’m Going To Disneyland!

If you have ever met me in person, or read any of my blog posts, you might have come to realize that I’m a little quirky and sometimes not ready to be let loose in the wild without a chaperone.  You might have also realized that I am a big dork who likes all things wireless.

This past February I went to WLPC in Phoenix and near the end my co-worker told me he was done and his brain had turned to mush and I just looked at him like he was crazy.  I was ready to go for a second full week and then maybe, just maybe, I would be done and ready to head back to the real world.

Unlike my co-worker, I love this stuff!  I have a commercial radio license from the FCC (aka a GROL, General Radiotelephone Operator License) and an amateur radio license.  When I’m not doing wireless for work, I’m doing it as a hobby.  Back to the quirkiness, and the title of this post: I’m not really going to Disneyland.  Or Disney World, or any Disney-themed resort.  I’m going somewhere better!

I’m going to Mobility Field Day as a delegate! MFD3 here I come!

For me, that’s better than Disney or Universal or any of that nonsense.  I’m going to talk wireless for three, count them THREE, straight days!  I get to focus on wireless and mobility for three days and geek out with some of the brightest minds in the business.  When I got the invitation to go, it was just like that guy on TV going up to the quarterback who just won the Super Bowl.

If you didn’t think I was quirky and weird before, I think this might have just cemented it for you.

Honestly, it is a privilege and an honor for me to be invited to be an MFD3 delegate.  It’s hard to believe what has happened in my life since I got plugged into the Wi-Fi community and started the CWNP certification process, just a short 18 months ago!  I’m not sure that I could have scripted what has happened; there is no way I would have believed it.

Now, just like with all of the Disney properties, none of which I have ever been to, I have never been to Mobility Field Day, or Tech Field Day, as a delegate.  I’ve followed those who have, and have read the posts and discussions, and it does leave me nervous.  What if I geek out too much?

In that spirit, I have decided to set some ground rules for myself.  If anyone thinks I am missing something that will keep me in line, feel free to let me know.

  1. I will not ask for autographs, unless someone else does first.
  2. I will not be the first one to jump up and down on my chair.
  3. Or the table.
  4. I will take good notes so I can blog about the whole experience.
  5. I promise to keep all my blog posts less than 5,000 words.
  6. I will take pictures, and occasionally, allow pictures to be taken of me.
  7. I will not come home and talk incessantly to my wife about everything that happened.  (Actually, she doesn’t read my blog and as I will probably break this rule, oh well!)
  8. I promise to have a great time!

Now, number 8 is pretty simple and guaranteed to happen, but I needed to throw that in there to finish up the list.

My plan is to blog about the run up, and then the days as best I can, and then the presentations will come separately as I get the technical details straight.  That’s my plan at least.

One thing you might have noticed is missing from my rules, and that I’m not adding, is any promise to keep quiet.  I will more than likely talk the ears off of the other delegates to the point that one will begin to question my parentage.

That, my friends, is a promise!

Wi-Fi Security At A Public Venue

Earlier this week, July 18th 2018 to be exact, a Tel Aviv based cloud security company named Coronet released a report detailing the cyber security risks of connecting to airport Wi-Fi (I’m fine with that concept) and then went ahead and ranked the top 45 busiest airports in the United States based on their own proprietary security score (that is what I have a big problem with.)

Disclaimer: I work at, and run the guest Wi-Fi for an airport that is in the middle of this list.  I will still attempt to be objective, but I’m not making any promises!

The actual report, which you can find here, is simply a report that they published.  The problem is it was picked up by multiple publications, as is usually the case, and people without any qualifications are starting to put their spin and understanding on said report.  The report, and the subsequent news articles, miss the one big point that is inherent in guest Wi-Fi; that it is unsecure and it is that way by design.

By sheer happenstance, I just finished taking a CWNP CWSP class last week and during the class we spent some time talking about wireless security at airports.  As I read through this report, and applied my knowledge from all of my CWNP studies (which, if you aren’t going down that road, I highly suggest you start immediately) there were a lot of red flags that started to go up, and in the end made me pretty upset.  I am going to break down their published study in an attempt to point out where I take issue with their report and methodology in an attempt to educate people about guest Wi-Fi.

The Report

Their report was created by crowdsourcing data from devices that had the Coronet application on them as they traversed America’s 45 busiest airports.  This data was collected over 5 months and comprised “more than 250,000 consumer and corporate endpoints.”  While that may seem like a lot, let’s take a look at the number of passengers that would have traveled through Houston’s William P. Hobby Airport (not the airport I work at, but the #3 worst on their list) during the first 5 months of 2018.  The report doesn’t call out which 5 months were part of their study, only that it ran over the course of 5 months.  During the first 5 months of 2018, Hobby had a total of 5,743,661 passengers.  That number is good enough for 34th in the country and places them in the “Medium Hub” category for airports.  In comparison, Hartsfield-Jackson Atlanta International Airport, which is the busiest airport in the world, had a total of 41,960,415 passengers during the same 5 months.  I mention this as a reference: while 250,000 might sound like a lot, in the world of airports 250,000 travelers can represent a single busy day at a “Major Hub” airport like Atlanta.
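
To put those numbers side by side, here is a rough back-of-the-envelope sketch in Python.  It assumes the 5 months were January through May 2018 (151 days); the report does not say which months it covered, so treat the daily averages as illustrative only.  The function names are mine.

```python
# Passenger totals quoted above for the first 5 months of 2018.
PASSENGERS = {
    "Houston Hobby": 5_743_661,
    "Atlanta Hartsfield-Jackson": 41_960_415,
}

# Assumption: January through May 2018 (31 + 28 + 31 + 30 + 31 days).
DAYS = 31 + 28 + 31 + 30 + 31

# Endpoint count quoted from the report, across all 45 airports combined.
SAMPLED_ENDPOINTS = 250_000


def average_daily_passengers(total: int, days: int = DAYS) -> float:
    """Return the mean number of passengers per day over the period."""
    return total / days


def sample_in_days_of_traffic(total: int, sample: int = SAMPLED_ENDPOINTS) -> float:
    """Return how many average days of traffic the whole sample equals."""
    return sample / average_daily_passengers(total)


if __name__ == "__main__":
    for airport, total in PASSENGERS.items():
        daily = average_daily_passengers(total)
        days_equiv = sample_in_days_of_traffic(total)
        print(f"{airport}: ~{daily:,.0f} passengers/day; "
              f"250,000 endpoints is about {days_equiv:.1f} average days")
```

Under that assumption Atlanta averages more than 250,000 passengers a day, so the entire 45-airport sample is smaller than one average day at that single airport.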

This ties into the next part of the report I want to examine, and is a fact well known by Wi-Fi professionals but not thought about by others, like security professionals, and it has to do with range.  On Page 5 of the report, in the Network Risk Score, Coronet stated that their SecureCloud scanned connected and neighboring networks and used proprietary algorithms to assess network risks.  The one point they missed is they are only gathering the data that the device can see, and if you have ever tried to collect Wi-Fi frames on purpose you know that it is super easy to miss frames and beacons when you are trying to capture them, let alone some background process.  Coronet is assuming that if they didn’t see a malicious network then it wasn’t there.  That’s like saying I didn’t see the person down the street get mugged so this must be a safe street.  Take into account the limited amount of time a device stays in an airport, you might as well say I didn’t see that person get mugged in the past 30 minutes down the street so this must be a safe street.  Even more, since Wi-Fi in a high noise environment doesn’t have the same range to decode packets, it’s like saying I didn’t see anyone get mugged in the past 30 minutes within 75 feet of me on this street so it must be a safe street.  I wish I was exaggerating, but that’s just the truth as it pertains to environments like mine.

Within that same paragraph I would like to point out that according to their scale, the highest risk they account for is a “5” with “1” being the lowest.  When we jump to the end of the report we find that only 2 airports out of 45 make it into the 4’s (Raleigh Durham at 4.9 and Chicago Midway at 4.5.)  While that’s just what it is, the number 3 “safest” airport is Nashville with a score of 5.1.  While good enough for 3rd on the safe list, it actually breaks the top of their own risk range.  Keep in mind, this is their own scale based on their own reasons.
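
The scale problem is easy to check mechanically.  A minimal Python sketch, using only the three scores quoted above and the 1-to-5 range as described above; the names are mine, not Coronet’s, and this is not the full ranking.

```python
# Declared bounds of the risk scale, as described above.
SCALE_MIN, SCALE_MAX = 1.0, 5.0

# Only the three scores mentioned in this post, not the full ranking.
SCORES = {
    "Raleigh-Durham": 4.9,
    "Chicago Midway": 4.5,
    "Nashville": 5.1,
}


def out_of_range(scores: dict[str, float]) -> dict[str, float]:
    """Return the entries whose score falls outside the declared scale."""
    return {name: score for name, score in scores.items()
            if not SCALE_MIN <= score <= SCALE_MAX}


if __name__ == "__main__":
    print(out_of_range(SCORES))
```

Of the three, only Nashville lands outside the range the report itself defines.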

The rest of the report, just to wrap this up, is the ranking of airports using their own random numbers.  Once you factor in that they didn’t publish how many scans were done per airport over these 5 months, how often, or how much of the public space was scanned, it loses its luster really quickly.  Refer back to my analogy about someone being mugged.  The thing to take away is they only reported on what they were able to see, from what has to be a VERY limited perspective.  They don’t call that out in their report, which is why all of these “news” organizations are loading up my Twitter search with this report and stating it as a very important read.  Don’t believe me?  Go on Twitter and search “airport wifi” and see what comes up.  I keep a constant feed running in my TweetDeck for airport wifi (it’s my job, honest) and it’s driving me nuts.

What Can You Do

This is also known as what they didn’t discuss in their report.  Since I believe the report is meant to promote their product and company, I will tell you this would be counter-productive to the purpose of their work, so I get why they avoided this part.  The one thing they did say that makes sense is on page 4, where they talk about captive portals.  I’m not a fan of captive portals, and even less so after reading this.  It makes sense that if you were to connect to a malicious AP (aka Bad Guy Hacker) he could serve up a captive portal modeled after the real portal and, when you click a link, have it install malware on your device as well as granting you access to whatever they want to give you access to.  Just one more reason to turn them off.  That’s someone else’s rant for a different day.

This leads into my recommendations.

  1. Educate yourself.  Every airport has a website, and that website will indicate what their SSID is, if they even offer guest Wi-Fi.  If you can’t find that SSID while you are there, don’t connect to something just because “it’s close enough.”  Look at it from an operator’s perspective – we can’t shut down “rogue” APs because we don’t know what someone named their own Mi-Fi hotspot or some other random device they travel with.  Even an extra space or an underscore is enough to make it “different” and that is hard to track down with over 100,000 people a day through your facility.
  2. Configure correctly.  Unless it is your home network inside your house, and we can argue about 802.1X versus WPA2-PSK at your house another day, always indicate you are on a public network.  That will, hopefully, keep others from being able to access your device fully, no matter what you connect to.  Also, keep your firewalls and security up to date and set to high.  While it is a pain and can stop functions from working, it’s better than having your identity stolen or company secrets leaked.
  3. Take responsibility.  Keep in mind, all public Wi-Fi services, be it at an airport, sporting venue or coffee shop, are meant to be a nice service to you, but we need your help.  The report talks about the KRACK “vulnerability” from last year as something to worry about, and I will tell you they are just jumping on the latest security “scare” in the Wi-Fi world.  KRACK is only for WPA2 encryption, and guess what isn’t on a guest network?  We discussed a WPA2 service internally for about 30 minutes, came up with why that was a bad idea, and then killed it.  Advertising a WPA2 password on a banner will only add about 3 seconds of additional workload to a malicious player in the space and in the end, only drive up complaints about how hard it is to connect.  The fact they talked about WPA2 and guest Wi-Fi in the same report is infuriating.
  4. Protect yourself.  This is really a subset of #3, but we will make it its own number because it is that important.  Understand that as the operator of the system, I can only do so much.  Client-to-client isolation is about it, which we do at our location, and any other parameters would only complicate the system and make it difficult to use.  Take responsibility and get yourself a trusted VPN service.  If that isn’t an option, then understand as an operator I have to refer back to #3.  Don’t send emails you don’t want others to see and definitely don’t do anything related to your financial stuff on any network without proper security in place.  At an airport, that means a VPN is the only option.  While https or SSL is some protection, it doesn’t offer complete protection.  Always try to use it, but think of it as an additional security measure, not your only security measure.  Another tip I picked up in class is around usernames and passwords.  If you don’t use a unique username and password for every account, then at least come up with ones for your critical accounts that are different from those for your non-critical accounts.  Then only access the non-critical accounts on unsecured networks.  This isn’t recommended, but refer back to #3.
  5. Be proactive.  This ties back into #1 and educating yourself.  There are free (or cheap) apps for phones and laptops that allow you to see some information about your connection.  I don’t want to get into an exhaustive list or debate, but they exist.  If you encounter a captive portal, if you can get it to load, don’t just click on any random hyperlink without examining what it connects to.  Hover over the link with your pointer and leave it there; your OS will pop up and tell you what it really points at.  Spoiler alert – it doesn’t have to be what it says it is.  Check out the DNS information your device received.  Most major systems will hand out 2 DNS servers and they are usually pretty reputable.  If you’re only offered one, question it.  Check out the subnet mask you are assigned.  Atlanta breaks this rule so it’s not a hard and fast one, but larger venues won’t have the traditional subnet mask you have at your house.  Don’t know what a subnet is?  Read about it here.  This is an education step you can take to protect yourself.  Remember, you and I are in this together and we need to help each other.
  6. Be skeptical.  Unless the website you researched before getting to the public venue says different, expect most locations to have a captive portal and limit your speed, time, and what you can view.  Shameless plug here, our location doesn’t do any of that, but it also says so on our website.  Getting blocked from a site, while frustrating, is actually indicative of a legitimate service.  Hackers won’t block you because they want you to stay connected.  If you do have any doubt about your connection, ask someone or move to a different location.  Move far enough away to be out of range of where you were.  Malicious players will try to be closer to you than you are to the AP, and they won’t move with you.  If they do move with you, you have other problems.
  7. Understand where you are.  Public venues are not your home or office.  We face different challenges than those areas, but we also have some other advantages that make it pretty cool.  Unfortunately, the cool factor doesn’t extend to the end user so all you get out of this is the challenge part.  For the most part operators of these Large Public Venues (LPV) really do want to provide the guest a great experience that they don’t complain about.  Our interaction with customers is limited so as long as you aren’t complaining, we call it a win.  Part of that is the understanding on the guest’s part.  There are some bad configurations out in the LPV Wi-Fi world, but security is normally something we take seriously.  What we need the guest to understand is that the security we focus on is to limit the guest’s ability to access OUR network.  We know it’s a security grudge match going on inside guest Wi-Fi, we just want to contain it and keep it away from us.  Now that you understand that, you can go back and re-read #3 and understand it better.
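
A few of the checks from the list above can be sketched in Python.  This is only an illustration: the SSID, DNS addresses, and netmasks below are made-up example values, and the function names are hypothetical.  How you read those values off your own device depends on your OS and is not shown here.

```python
import ipaddress

# Hypothetical value: the SSID published on the airport's website.
PUBLISHED_SSID = "Airport Free WiFi"


def ssid_matches(seen: str, published: str = PUBLISHED_SSID) -> bool:
    """Exact, case-sensitive comparison; 'close enough' does not count."""
    return seen == published


def dns_looks_thin(dns_servers: list[str]) -> bool:
    """Flag a lease that handed out fewer than two distinct DNS servers."""
    return len(set(dns_servers)) < 2


def is_home_sized_subnet(netmask: str) -> bool:
    """True for a /24 or smaller network, typical of a home router."""
    prefix = ipaddress.ip_network(f"0.0.0.0/{netmask}").prefixlen
    return prefix >= 24


if __name__ == "__main__":
    # An extra space or an underscore is enough to make it a different network.
    for seen in ("Airport Free WiFi", "Airport Free WiFi ", "Airport_Free_WiFi"):
        verdict = "matches" if ssid_matches(seen) else "does NOT match"
        print(repr(seen), verdict)

    # Example addresses come from the reserved documentation range.
    print("only one DNS server:", dns_looks_thin(["192.0.2.53"]))
    print("home-sized subnet:", is_home_sized_subnet("255.255.255.0"))
```

None of these is proof either way (Atlanta breaks the subnet rule, as noted in #5); they are just prompts to look closer before you trust a connection.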

In Conclusion

Security is a very, VERY real thing and all of us, network operators and users alike, need to constantly be vigilant in this effort.  As any security professional will tell you, we are only as strong as our weakest link.  Guest Wi-Fi is a very weak link.  It doesn’t mean you have to avoid it; what it means is you need to understand what you are getting yourself into.  I will take one final exception to the report from Coronet, and that comes on the top of page 9 – Taking Action.  They have the audacity to actually think they can “downgrade” your security stance based on their color code and random numbering scheme.  That is the worst possible thing you can EVER do.  Go back and read #6.  If you aren’t assuming that EVERY public location is a very serious threat against all of your connected devices, don’t bother doing anything, because you will get hacked, pwned, compromised, or whatever you want to call it that day.  This isn’t some random threat, it’s just a fact.  In Wi-Fi, if your connection isn’t rated as a Robust Security Network (RSN) or in other words, using an 802.1X form of EAP (Extensible Authentication Protocol, you will probably have to ask someone about that one) then just assume that it is not as secure as it should be.  WPA Pre-Shared Keys (PSK) are only as good as the people you share that key with and that key works for everyone, hackers included.

Long story short – trust no one and always be on the lookout.  If not, it’s only a question of when your device will be compromised, not if it will ever happen to you.

All The “Nerd Knob” Day!

In my previous post (found here) I admitted that I am the customer.  I don’t work for a VAR, integrator, or 3rd party reseller.  This is a follow up to that post, and to understand the mental lint I’m about to sweep out of my brain, you might want to go read that first post and then come back here.

At WLPC 2018, I took the CWAP boot-camp with Peter Mackenzie.  It was an awesome class, and well worth your time.  Peter is the type of guy you should take the CWAP class from, not because you need the certification, but because you WILL learn something from him.  Peter does not factor into my story today.  The last 57 words were just a shameless plug for someone I respect.  It also pads my word count for my essay, much like a paper in school that you didn’t write when you were supposed to and are now furiously trying to fill space while hung over and get your paper in on time.

See that, some more words to my word count!  Anyways, back on track.

During a break in the aforementioned class, we got on the subject of Cisco RRM and whether you can actually make it work.  I mentioned that I listened to a podcast from Wireless LAN Professionals where Keith Parsons was speaking with Sam Clements and Blake Krone about Cisco RRM and how you make it work.  No, turning RRM off was not the “valid” response, although it is the oft-used response.  What they talked about on the podcast was that you have to really know the nerd knobs that go into making RRM work.  Based on that, and my experience last year (you can read about that here), I mentioned that RRM could be tuned to work, and I had done it.  After some back and forth, they made a comment roughly saying I said it was worth the time and effort to make RRM work.  I corrected them and said it wasn’t worth my time, just that I had done it.  RRM has little factor in my story today, but I am up to 343 words by tacking it in!

Finally, to the actual follow-up to my previous blog post!  Since I am the customer, I have a vested interest, and plenty of time, in making RRM work in my environment.  The fact that it took me almost 9 months to get to the point I could start to tune RRM is immaterial; my organization was going to have me here whether I was making this work or doing something else.  Sure, during the down times when I was waiting on TAC to respond I did other things so I wasn’t a total waste, but as someone who is onsite at least 40 hours a week, I have that luxury.  I can go back to the same place multiple times a week, multiple weeks a month, and multiple months a year, just to check the same thing.  Would a VAR or 3rd party have the ability to do that and stay in business?  Not at all.  Me? Not a problem.  I can tweak all the nerd knobs I want and go back until it works exactly as I want it to.

The downside of all of this?  While I sit here and tweak my nerd knobs to make RRM work, everyone else has moved on to a different project, different problems to solve.  Granted I know a whole lot about one thing in particular, but I know less than I would like about every other vendor and solution in the market.

An example of this is an email that I received recently.  It was all about how physical, on-site WLCs were a great solution a decade ago, but nowadays you really need to split the 3 planes (Management, Control, Data), dump the physical controllers that are in your data center and move management to the cloud.  I know full well that it’s a marketing ploy since the company that sent me the email doesn’t have a physical controller to sell me, but it makes me wonder.

In certain scenarios I see great value in splitting the planes up and taking advantage of a cloud controller or controller-less scenario (reference Aerohive) but in my current situation, I can’t see an appreciable gain for the amount of work it would take to make a change like that.  I know I could go to a virtual controller, but given the state of our current VM environment, and the importance of Wi-Fi to an organization, I will be sticking to an appliance for the foreseeable future.

Stability?  Yes.  Opportunity to try something new?  No.

And this is where my problem lies.

I want to try all the different options so I can learn for myself.  I don’t believe any marketing information I read because I have been burned too many times by it.  I want to try before I recommend but I need a use case in mind before testing.  I am a much better problem solver than problem creator.  I’ve had so many problem creators in my life I’ve lost the ability to envision use cases without someone else coming up with them!

As I continue my path through the CWNP world, and the wireless world in general, I feel like right now I am sitting at an exit and watching everything else go by.  Sure I can borrow hardware from the WLP Lending Library and look at the GUI and CLI, but how does it perform in the real world?  How does it perform when it’s installed in a location surrounded by other hardware (like a mall) day in and day out?  These are questions I want to explore, but feel very limited because I don’t have the ability to take my testing to the bitter end – many users in a crowded environment.

Where does this leave me?  It’s hard to say right now.  What I do know is that every time, and I mean EVERY TIME I spend time around people from the Wi-Fi community, it renews my desire to keep going.  While I might not ever deploy a true “controller-less” system, at least I can reach out to people that have.

For now, I will go back to my nerd knobs and see what else I can turn in that elusive goal to configure the “perfect” system.  Who knows, I might actually get it there!

I Have a Dirty Secret

I have a dirty little secret I have been hiding for some time, and I think it’s time to come clean.

I work for the government.

So not “THE” government, but “A” government that likes to think it’s as large and as powerful as “THE” government simply because we can be called by at least three different three letter acronyms.  I know, it’s confusing for us too.  That’s actually not the secret I am hiding, I just wanted to soften the blow before the big reveal.

I am the customer.

Collective gasp!

There it is, I have come clean.  I am the guy who contacts you really early in the morning or late at night complaining that you haven’t done something.  I’m the guy who gets on conference calls and admits that during your wireless project we decided to swap out the entire wired back end and change the logical configuration of networks.  I’m also the guy that admits halfway through the project that we are actually trying to piggyback a new service onto your project and now we are delayed because the pseudo network guy that used to be a radio guy might have messed up the logical configuration and, in the middle of trying to fix it, his CCIE (times 3) decided he wanted to change a bunch of other stuff and now the pseudo radio guy is screwed and just told you that we are going to delay your project another couple of weeks while radio guy figures out what he’s going to do next with the logical, and physical, new network.

That was just this morning by the way.  In government work, we get to do that ALL THE TIME!  Yes, I am that guy.

I don’t mean to be.  Really, I don’t.  I try really hard to understand the technical details of Wi-Fi from the people I consider to be the experts; the ones in the trenches every day who, after dealing with me for a couple of hours, have to switch gears and then deal with an actual real company that operates under rules that make sense to the rest of the world.  When dealing with me, and my rules, please try and remember that we don’t mean to be this way, it’s just who we are.

Oh, and about those rules.

They make no sense to us either.  Our job is to try and learn the rules the best we can and then figure out how we can circumvent them to accomplish what we are really trying to do.  When we ask for weird things, there’s a reason.  Sometimes we can explain, other times the explanation is so long and complex, and wouldn’t make sense to you anyways, we don’t even try.  It’s not that we think you aren’t intelligent enough to understand what we are going to say, it’s because we know you ARE intelligent and what we are about to say will break your mind and cause you to steer your car into a ditch.

No one wants that!

Our funding cycles are weird, and even weirder when you are government but don’t have to adhere to normal government funding cycles, but still have to adhere to purchase requirements.  We are the ones that will ask for a quote and need it 20 minutes ago, only to put it on the shelf to gather dust for three months and then yank it out and need it updated in 30 seconds so maybe, just maybe we might cut you a PO for half of what we originally told you.  We don’t like it but as a government customer, it’s the nature of the beast.

One last thing I need to come clean about along the vein of being the customer – I am also the integrator for a group of people that understand the technology less than I do.  If you think you have it bad dealing with me, have some compassion because I just got off the phone with someone who told me that after you connect to the network it just works because the rest of it is “just magic.”  That’s a true story that happened in the middle of radio guy trying to fix his logical configuration mistake when he tried to have TWO quad zero routes for the global routing table on a 6509.  Again, that’s a true story from last month.  Fun, huh?

Why do I stay?  One, I feel like I still have a lot to learn about Wi-Fi.  I’m only one year into what I will call my “formal education” on 802.11 (after 4 years of doing almost everything wrong that can be done wrong) and I’m pretty sure there are still some settings on my Cisco WLC I haven’t adjusted from factory default, figured out what that broke, and then set back to factory default.  Still so many nerd knobs to tune!

The other reason I stay?  I will borrow a line from a very obscure movie.

“But if you can summon it all up… at one time… in one place… you can accomplish something… glorious.”

**Coming up next, the downside of staying to turn all of those nerd knobs.**

 

Why Not Cisco Live

Speaking from my grand experience of going to 4 actual “conferences”, I wanted to throw my opinion in the ring about the upcoming Cisco Live 2018 in Orlando, Florida, sometime in June.

I’m not going.

I could, maybe, but after going back and forth, and some conversations with my boss, I pulled my request and decided to forgo going to Cisco Live.

For the second straight year. Why, you ask? Good question.

I went to Cisco Live 2015 in San Diego and it blew my mind. It was an amazing experience, unlike anything I had ever been to or seen in my entire life. There were 30,000+ networking/wireless/security/developers/executives all in one place. The exhibition hall, otherwise known as “The World of Solutions” had more vendors than I could take in over the course of the three days or so that it was open. The number of sessions that were available to sign up for was crazy. The customer appreciation event was awesome! Who wouldn’t like Aerosmith in Petco Park? I had such a great time that when Cisco Live 2016 registration opened up, I jumped at the chance to go again.

I went to Cisco Live 2016 in Las Vegas and it was an amazing experience. There were 30,000+ networking/wireless/security/developers/executives all in one place. The exhibition hall, otherwise known as “The World of Solutions” had more vendors than I could take in over the course of the three days or so that it was open. The number of sessions that were available to sign up for was crazy. The customer appreciation event was awesome! Different act in a different venue, but still really cool. At the end of Live 2016, I realized one thing.

With the exception of the customer appreciation event, everything was the same! The classes I attended? The same presenters with almost the exact same slide deck. The wireless specific classes were almost exact. Great guys, don’t get me wrong, but almost the same thing. Technical classes that I went into with great expectations? Sales pitches.

Advanced RTLS with CMX 10.2? Sales pitch.

How to use Prime Infrastructure 2.2 (then) to manage your network? Sales pitch.

Wi-Fi radio basics with Fred Niehaus (@OhioFred but he never tweets)? Love me some Fred but it was almost the same with the new AP model thrown in for good measure.

The 30,000+ attendees? Not a great place to meet new people at lunch. Hell, we had problems finding the people that we wanted to meet up with, let alone anyone new. Sat down for lunch with a bunch of security focused people who looked at a couple of wireless guys like the enemy. Granted, we looked at them the same, so it was a pretty even staring match, but no free flow of information.

Now this may come as a surprise to some people, but my current working condition doesn’t allow for much travel. Working for the ‘port, not the ‘line, offers zero flight benefits. 20% of our current team, if not more, work here simply to reduce their travel time. Seems counter-intuitive but it’s true. Now, if you have read some of my previous blogs, you have an idea of how we (my buddy Mike and I) found WLPC and CWNP, and it was after this discovery that we decided that, with our limited travel budgets and opportunities, there was something better to go to for a couple of wireless guys who have already had enough fun in Las Vegas at 2 AM.

During the last WLPC in Phoenix, I was able to answer my 2 pressing questions, and get much more insight, during meal times and after-hours events than I ever got at Cisco Live. I’m still waiting on a call back from the TAC person I met with in 2016 about my PoE issue on a Catalyst 3850 switch. (I’m actually not waiting any more, but you get the idea.) At both Wi-Fi Trek (CWNP conference) and WLPC I was able to take an actual class for three days on my wireless topic of choice for that trip. 4 hour deep dive on the ASR series routers? I’ll pass, thank you.

For those of you who have unlimited travel / training opportunities, then I say go for it. On my list of events to attend, Cisco Live ranks third behind WLPC and CWNP. It’s not bad, until you realize that for wireless folks the ROI just isn’t there.

Want to take a Cisco WIDEPLOY class at a conference? Don’t bother going to Cisco Live, go to the conferences where your ROI is measured by something other than your BAC.

FortiNet FAP-S313C Product Review

All attendees of WLPC 2018 in Phoenix last month were privileged enough to get a form you could fill out and send in to FortiNet and in return they would send you a FAP (FortiAP) S313C to demo. I had forgotten about the form until Lee Badman blogged about the fact that the AP came with a lifetime license on the FortiCloud service. Like Lee, I wasn’t going to send in the form because I didn’t want to waste my time with an AP that wasn’t going to replace my home equipment and that I was going to lose the ability to play with in a year or two anyway. That’s why my Meraki is still sitting in the box from 2 years ago from Cisco Live. Sorry Meraki! Anyway, I just spent some time poking around and getting it set up, and here is my review.

I’ll break down my thoughts and opinions as I go but I will reveal my final grade now, for those foolish enough to blatantly trust my opinion without anything to back it up. I would NEVER suggest you do that, but here it is:

Overall Grade: B-

Packaging: A

I pay attention to the packaging. Believe it or not, it’s hard to package something that will survive shipping but not waste material, and money, for the vendor. Everything was intact, and packaged in a smart way. This model is a single radio model (bummer) with external antennas that you can select between 2.4 and 5 GHz through the GUI (more to follow). The accessories are in a tray that rides above the AP, with the AP nicely embedded in some foam packing below to keep it safe. The antennas ride in the foam containers on either side.


Accessories: B-

As a disclaimer, I cut my teeth on the Meru product line (AP310, AP320, AP332) before moving to a bigger name vendor so I do have some experience with the accessories that were provided with the ancestors of this model. I get that it’s not a “FortiRu”, the closer relatives of the old Meru days, but I can see where the lineage comes through. The accessories provided are plentiful, but not always needed, and sometimes confusing.


It comes with the standard stuff: antennas, Ethernet jumper, quick start guides and some mounting solutions. The mounting solutions are pretty flexible, but not always self-evident. It comes with a bracket, drywall anchors and a T-Rail anchor system that was so “self-evident” I had to look in the guide just to figure out what that package was. This is where I get to the point about confusing. Trying to explain the mounting solutions, and how they are assembled, to an average installer is tedious at best, and there is some work to be done before the AP can be installed. As far as the T-Rail mount, after seeing the assembled AP with external antennas, I’m pretty sure I would have a hard time selling anyone on mounting it from the ceiling. If the AP had internal antennas then it would make a great deal of sense. My lower grade comes from the complexity of trying to assemble the T-Rail bracket and then get it on the AP.

Physical AP: B+

The AP itself is about 8″ in diameter and about an inch tall. The three antenna ports are equally spaced around the edge, and they are RP-SMA (Reverse Polarity-SMA) connectors. On the bottom there are 2 RJ-45 jacks, LAN and console, and a 12V, 3A DC power input port. On the side is a standard size USB port that I assume will be for “future use” to possibly do a code upgrade from a flash drive or for some other purpose. On the base of the AP around the edge are little “feet” to allow for the AP to sit on a shelf or table and still allow for some airflow underneath the AP. I like airflow.

This is where I start marking down the physical aspects of the device, and it really hurts me to do it. I love antennas but those antennas just don’t work for me! They are long (from the table they stick up a full 7″) and the connector part that screws into the AP has a good feel, but that means the vertical part of the antenna sticks out away from the radio part itself, making the device look larger than it really is. I think a little more thought into what environment this particular AP would be deployed in, and adjusting the hardware a little more to match, would have resulted in a better product. It’s light, has 5 LED indicators (that’s left over from my AP310 days) and would mount quite easily to a hard cap ceiling (that’s sheetrock on the ceiling instead of a drop ceiling) but the antennas and how it would look hanging from a hard cap or T-Rail really detract from what is a decent product.

Initial Start-Up: B

I started by following the directions and going to the website, created an account, and got logged in first. Then I “registered” my AP to my login using a code on a sticker on the top of the AP. The instructions weren’t accurate (disappointing) and not always intuitive. After registering the AP it told me it wasn’t online, so I plugged it in. (I did this on purpose because I wanted to see the boot time of the AP if it was already registered on the FortiCloud.)

For the connection I used a mid-span, 802.3at power injector to power up my AP and connect it to my network. From the time I powered it up for the first time until I confirmed it was online and working on the FortiCloud was a respectable 1 minute, 45 seconds. Considering it was the first time and had to negotiate on my network before heading outbound, I consider that a good time. My reason for a lower grade centers around the instructions and the GUI not being the same. It was asking me to click on things and go places that weren’t clearly indicated on the webpage.


I guess after working with the application for a while you start to learn where things are, but they aren’t intuitive, and I had to click around a lot to get things running. For first timers it would have been nice to have a wizard or a quick start guide, but with some time and effort I made it.

Configuration: B-

Based on my initial struggle navigating around the GUI as I mentioned above, I can’t grade this part of the set up higher than a B minus. Harkening back to my Meru days, I really liked the Meru WLC GUI. It was pretty self explanatory and easy to navigate. I didn’t have that same experience when I set up my FAP-S313C.

Main Dashboard


The main dashboard is decently laid out, but one of my hangups is I couldn’t click on anything in this page and navigate to that device specifically. It doesn’t seem like a big deal with only one or two AP’s, but I like to consider how I would run a system like this at any type of scale. The bandwidth graph is ok, but it’s missing a legend showing which color is which. Hovering over the graph will give you the legend and the exact data with date and time, but there is space above the graph on the right hand side to add a small legend. My other hangup is the trend only shows in 30 minute intervals. Personally, I prefer no longer than 15 minute increments and really like to see 5 minute intervals.

AP Detail Page


From the Main Menu you are able to navigate to see details and stats about individual AP’s, but you can’t get from here to the configuration screen directly. My biggest pet peeve is when the GUI developers have no clue how the administrators and engineers actually use the product they are developing, and leave out simple things like hyperlinks from informational screens to configuration screens.

As a guy who is really interested in the radio side of things, finds the network side of things so-so, and sees security as nothing more than a necessary evil, it is at this point that I could tell what the driving force behind this UI is. There are bunches of security related windows to configure, and a simple one for the radio (or radios if you have them).

Radio Configuration Tab


My biggest challenge is that at no point do you click on anything that mentions radios to get to the screen where you manage the radios! After setting up a test SSID (not bad actually) I eventually found the “Platform Profile” screen and was finally able to configure some radios. Initially, it powers up with a 20 MHz wide channel (YAY!) and since I selected 5 GHz, it defaulted to channel 165 (OK). I connected my new tablet that I got during my deep dive at WLPC (Thanks Keith, Jerry, and Scott!) and started some basic tests. The tablet is an 802.11n, single stream device. I know this from testing it at work; the FortiCloud GUI only told me that it was 802.11n connected at 5 GHz. With a 20 MHz channel, it performed about as expected in the fairly RF noisy neighborhood that I live in.

At this point I changed to a 40 MHz wide channel, and my device disconnected and the channel changed from 165 to 36. Weird, but ok. Device performance improved as expected, nothing strange here. Next I changed to an 80 MHz wide channel and again, my device disconnected and when I looked again, I was back on channel 165 with a “negative bonded” 80 MHz wide channel. Without further in-depth testing, it appears that while you can select which channels are available, you can’t actually set which channel it will beacon on; the “system” takes care of that for you. While running an 80 MHz channel, negatively bonded on UNII-3, I did some other testing I have been wondering about. Sorry for the tangent, but I find this interesting.

My production home Wi-Fi runs on UNII-3 with an 80 MHz wide channel, set at channel 149. With both radios running at 80 MHz on UNII-3, but beaconing on different channels, I ran some tests. Using the Ookla speedtest.net app and website, I ran tests using different devices connected to the different SSID’s. I won’t get into the actual speed test results (because I actually think they are pointless and I accidentally left my wlanpi at work), but what I did notice is that, even with only one test running at a time, the speed fluctuated much more during the test in this configuration. This happened on both sets of hardware (the new AP and my existing one) and seemed to disrupt data slightly. It’s a theory we have been discussing at work so I will have to come back at a later date and do some more testing. Either way, since I couldn’t “allow” UNII-3 channels but force the radios to UNII-1, I set the only available channels to UNII-1 and with an 80 MHz bandwidth, it selected channel 36. It’s interesting that it appears to select the lowest channel and then the highest channel, even at 20 and 40 MHz wide channels. That might be some more testing for a later date.

To wrap up the radio configuration setting, let’s talk power. Much like some other vendors, it allows you to select a transmit power, it just doesn’t actually use dBm numbers to tell you what that power is. I get that in different UNII bands the max transmit power changes, but instead of an arbitrary number or percentage, just once give me an actual number I can reference to something.
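To show why a real dBm number is more useful than a percentage, here is a minimal Python sketch of the dBm and milliwatt conversions. The conversions themselves are standard. Everything else is my assumption for illustration: the 20 dBm maximum is hypothetical (not a FortiAP specification), and I am assuming a "percentage" setting is linear in milliwatts, which the GUI never actually tells you.

```python
import math


def dbm_to_mw(dbm: float) -> float:
    """Convert a power level in dBm to milliwatts."""
    return 10 ** (dbm / 10)


def mw_to_dbm(mw: float) -> float:
    """Convert a power level in milliwatts to dBm."""
    return 10 * math.log10(mw)


def percent_of_max_to_dbm(percent: float, max_dbm: float) -> float:
    """Translate a percentage power setting into dBm.

    ASSUMPTION: the percentage is linear in milliwatts relative to a
    known maximum. A vendor may define it differently.
    """
    return mw_to_dbm(dbm_to_mw(max_dbm) * percent / 100)


if __name__ == "__main__":
    max_dbm = 20.0  # hypothetical maximum transmit power
    for pct in (100, 50, 25, 10):
        dbm = percent_of_max_to_dbm(pct, max_dbm)
        print(f"{pct:>3}% of {max_dbm} dBm max = {dbm:.1f} dBm")
```

Under that assumption, every halving of the percentage is only about a 3 dB drop, which is exactly the kind of thing a percentage slider hides.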

Client Experience: A-

For this section, keep in mind I am testing at home and the rest of my family is scared of technology after listening to me for years, so all they do is stream video every once in a while. My point is I am testing without a load on the AP. To be fair, the documentation does point out that it is for low density, remote locations so maybe my testing is valid. On a single stream, 802.11n tablet I didn’t notice any difference in performance from my normal home network. The same can be said for my 802.11ac two stream phone as well as my new MacBook Pro, which is an 802.11ac, three spatial stream laptop.

Other than the time when the AP automatically set up an OBSS scenario with my home network (2 AP’s running the same “wide channel” but beaconing on different 20 MHz channels) I never noticed a difference on my devices. I’m pretty sure I am always dissatisfied with any Wi-Fi system I am on, including my own at home and at work, so an A Minus here is a good grade from me.

Overall User Experience: C+

FortiCloud has a nice feel and look to it while browsing around, but I could never tell who the intended user was. Is it the end user, an ISP, 3rd party provider, or a VAR? Is it wireless focused or security focused? There was one entry about VLAN’s, but not a lot about networking. It took a while to find where to make changes, and at times they can be dangerous. It has a very security-heavy feel, and a simple incorrect click can really mess things up. (I noticed that for a while Wi-Fi Calling on my T-Mobile phone went offline due to an erroneous click. It took me 10 minutes to find that one and unselect it.) With one AP, it was fairly easy to manage. If that count were to exceed 20 or 30, I could see where it could become unwieldy.

If I could offer suggestions to improve my experience, it would be these:

  1. Add a wizard when creating new SSID’s or onboarding new AP’s.
  2. Add hyperlinks to the monitor pages to allow for quick navigation to the configuring page.
  3. Use a lot more hyperlinks. Seeing a summary of something and then not being able to dig in to it is frustrating.
  4. Use Wi-Fi specific terms. I understand that FortiNet is a security company first, but if you are going to play in the wireless field, use terms that wireless guys understand.
  5. Add more Wi-Fi related items to the monitor page and remove some of the security focused items. We all get it, security is important, but if the radio stuff isn’t working on the system, that in and of itself handles the security portion by breaking the Layer 1 connection. No Layer 1, no security risk. Add a second dashboard page for just security, that’s cool too.
  6. Allow more management of the radio side of things, or don’t allow any access to it. Don’t give me some access and then make it almost useless.

Final Thoughts

I know that throughout my review, I was tough on the FAP-S313C. That being said, it is a decent AP that does have a place in the world. I am always leery of a single radio device, but if doing a greenfield deployment in a small remote office where you will be able to control 100% of the devices connecting, and there won’t be any IoT, 2.4 GHz devices that need to talk, I can see where using this model would work. I would never ceiling mount the external antenna model, but there is an internal antenna model available. After some further thought I am thinking that FortiNet had an excess of external antenna models that they found a decent way to get rid of (ship to us) and write it off as marketing. It’s all good, I set it on a shelf 3 feet high and tested like that.

When I did a little research, I found that the 2 radio, FAP-321C has a list price of about $200 more per AP. I can see in certain, very specific conditions where a savings of $200 per AP could be a deal breaker, but I feel that as an IT professional, it is part of my job to convey the risks in purchasing a single radio AP that doesn’t give you the ability to run in 2 different bands at the same time. In the grand scheme of things, anything less than $1,000 one time expense for a business remodel, startup, or IT refresh should be a consideration. If the end cost is just too much for the FAP-321C, then I can see where this model fits in.

Being in the professional position that I am in I don’t mess around much with cloud managed AP’s. I have been a physical WLC guy my whole time, so it was fun to play with a little cloud managed for a change. I don’t have much experience with Meraki (due to the limited time licensing we discussed earlier) but this experience might prompt me to break that thing out and see if the code still works. If so, I might do a compare from someone who doesn’t play in that space much. We’ll see.

I am a little disappointed I didn’t get to see the latest and greatest from the “FortiRu” lineup and compare it to the Meru AP310 and AP320 that I started with, but that just means I might have to plan a field trip to Virginia and go visit Mitch!

Over-Subscription for “The Less Informed”

Every once in a while I hear someone throw out the term “over-subscription” or “over-subscribed.”  It dawned on me at some point that there could be a big segment of the community that has heard it, has an idea of what it means, but maybe doesn’t have a firm grasp on what it means, or how and when they need to be worried about it.

Over-subscription is something that I have dealt with almost my entire professional career, and as an old radio guy, it makes a lot of sense.  Trunked radio systems, like the ones used in public safety radio, are designed and built on the idea of over-subscription.  FCC licenses for trunked radio frequencies are expensive, and the frequencies are difficult to even find in some areas.  Just like in Wi-Fi, spectrum (or channels) is limited so making the best use of what you have is increasingly important.  Understanding why we got to where we are is just as important so as things progress, you don’t get lost.

SIDEBAR: I was going to call this “Over-subscription for Dummies,” like the books, but thought that was a little disingenuous so I changed it.

In The Beginning

Way back in the day, the phone companies started to install telephones in businesses.  It was easy to predict the type of load because only one person could use the phone at any given time, and since only a couple of businesses in a given area had phones, the formula for calculating demand, and the resources to accommodate said demand, was pretty simple.

Count the phones.

Over time, the phone companies realized that, for a price, they could extend the amazing phone service into more businesses and homes and then charge for the service.  That was pretty cool until people realized they could also call someone from a different town, not just someone in their town.  Phone companies then needed to install trunks that would connect their main phone switch to the next town’s phone switch to handle the load.  All of a sudden, the price to provide a service that allowed every house in Town A to call every house in Town B became an issue.  Eventually someone at the phone company called a meeting and invited an MBA.

SIDEBAR: I’m not sure if this is really how this happened, but after spending time around marketing folks, I’m pretty sure this is how it went down.  For clarification, what you are about to read I made up sitting on my couch.  It’s a hunch.  A pretty good hunch, but a hunch nonetheless.

Newly minted MBA walks into a sales meeting with the president of the phone company and says “Hey, this phone thing is pretty cool.  What would be even cooler is if you could sell it to EVERY house but not invest any additional money into research or infrastructure.  The ROI on that type of model would be awesome and I should get a huge bonus for thinking of it!”  It’s at this point I believe the president of the phone company went “Cool!  I love the model of selling a ton without actually having to do anything.  Maybe we bring in some math smarty pants to figure out how much we can oversell and under deliver before someone notices.  Either way, MBA guy, here’s your wheelbarrow of cash, thanks for coming!”

What the math smarty pants guy (his name was Agner Krarup Erlang; seriously, check the link in a few words) came up with is what is known as the Erlang-B and Erlang-C formulas.  They are a couple of nasty formulas, as pictured below, that I one hundred percent don’t understand.  Basically, what they predict is the optimal number of resources actually needed under normal conditions to provide service in a random access medium.  (I think.)

Screen Shot 2018-02-26 at 3.15.48 PM

Erlang B Formula aka Erlang Loss Formula

Screen Shot 2018-02-26 at 3.26.22 PM

Erlang C Formula aka Erlang Probability Formula

Look, any mathematical calculation that uses “birth rate” as part of the calculation I am not even going to attempt.  You think I’m joking?  Click on the link for the formulas and look it up.  It’s there.  Some crazy icon in the pictures above is a reference to birth rate.  I’m not even going to try.
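For anyone braver than me, here is a minimal Python sketch of both formulas using the standard recursive form of Erlang B and the usual way of deriving Erlang C from it. The offered load is measured in Erlangs (arrival rate times average holding time; the arrival rate is, I believe, the "birth rate" symbol). The function names and the example numbers are mine and purely illustrative.

```python
def erlang_b(offered_load: float, channels: int) -> float:
    """Erlang B: probability a new call is blocked (blocked calls are dropped).

    offered_load is in Erlangs; channels is the number of servers or trunks.
    """
    blocking = 1.0  # with zero channels, every call is blocked
    for m in range(1, channels + 1):
        # Standard recursion: B(E, m) = E*B(E, m-1) / (m + E*B(E, m-1))
        blocking = (offered_load * blocking) / (m + offered_load * blocking)
    return blocking


def erlang_c(offered_load: float, channels: int) -> float:
    """Erlang C: probability a new call has to wait in a queue."""
    if offered_load >= channels:
        return 1.0  # more traffic than capacity, so every call ends up waiting
    b = erlang_b(offered_load, channels)
    return (channels * b) / (channels - offered_load * (1.0 - b))


if __name__ == "__main__":
    load = 3.0  # hypothetical offered load in Erlangs
    for channels in (4, 6, 8):
        print(
            f"{channels} channels: "
            f"blocked={erlang_b(load, channels):.4f}, "
            f"queued={erlang_c(load, channels):.4f}"
        )
```

The interesting part is how quickly both probabilities fall as you add channels beyond the offered load, which is the whole business case for over-subscription.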

OK, back on track.

At the heart of this is the idea that not everyone will require access to the service at the exact same time, but if they do, someone will have to queue, or wait, until the resource becomes available. What they also figured out was that this concept and formula were not only useful in telephony systems, but in any system that needs to understand load when the demand from users is random.

Sounds pretty familiar, doesn’t it?

In More Recent Times

So where this formula breaks down is in the event of what is termed “re-entrant traffic”.  When this happens it’s called a “high-loss system” where congestion leads to more congestion.  For example, a TV show provides a number for users to call for a short period of time at a specific time.  If the phone provider doesn’t anticipate this demand, and viewers start to call the same number over and over again, it will crash the system until the demand goes away and service can be restored.  In a Wi-Fi environment, I equate this to an instructor in a classroom standing up and saying “go watch this video on the count of three.  Ready? 1-2-3 Go!”

In one heated debate I had with a really smart RF guy, he pointed out that it specifically states that Erlang’s models don’t apply to “circuits carrying data traffic.”  He is correct, and I agree with him.  CSMA/CD is a mechanism that anticipates that the demand for the medium (a switched, wired network) will always be there and can deal with peaks in demand.  The basis of my argument is that Erlang’s model does apply to Wi-Fi because even though the payload is a data packet, Wi-Fi uses a radio transmitter and receiver.  Two way radio uses Erlang’s formula to calculate needed resources in a radio system; the only difference is the payload is voice, and in newer systems, that voice has been digitized.

In my mind, a radio is a radio.

Just like everything with 802.11, “it depends” is the response to pretty much any question.  In my example above of an instructor in a classroom telling everyone to watch a video at the same time, that demand can be anticipated ahead of time, and the folks that work in K-12 or EDU have a good idea of what the demand for a given space is going to be by counting chairs, assuming that at some point the instructor is going to tell everyone to go watch a video, and making their calculations from there.  For environments like that, over-subscription isn’t something you even consider.  It’s still good to understand the concept because there are spaces on campus where there isn’t an instructor giving the “go” signal, so access to the medium in that space is now randomized.

Wi-Fi, by design, is a re-entrant traffic system.  Re-entrant traffic is when the person, or device, that is trying to access the medium continues to attempt to access the resource until they are successful or kicked off for good.  When a system goes “high-loss” it means a spike in demand due to re-entrant traffic that is above the designed capacity.  In the phone world, a spike in re-entrant traffic used to occur during the holidays.  I remember being a kid and wanting to call relatives on Christmas and all the lines would be busy.  We would hang up, wait some random time (usually governed by food readiness or the current TV program) and then try again.  Keep doing that and the phone system goes into a high-loss state.

Sounding familiar, isn’t it?

In the trunked radio world, this “high-loss system” occurs right before lunch (when all the users are coordinating where they are going to meet for lunch,) and then rapidly drops off during lunch, only to resume normal system loads after lunch, tapering off until the end of the normal working day.  In the trunked radio world, they look at these numbers and determine how much more capacity is needed to support the current users (or how many users need to go away) to keep this queueing and wait times down to a minimum.  For the trunked radio systems, they use a formula similar to the one below.

SYSTEM LOAD = ACTIVE USERS x CALL RATE x CALL DURATION

Divide that by 3600 and you get the estimated system load for an hour.
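Here is that formula as a minimal Python sketch. The units are my assumption, since the formula above doesn't spell them out: call rate in calls per user per hour and call duration in seconds, so dividing by 3600 gives the average number of channels kept busy over the hour (traffic in Erlangs). The user counts and rates in the example are made up.

```python
SECONDS_PER_HOUR = 3600


def system_load(active_users: int,
                calls_per_user_per_hour: float,
                call_duration_s: float) -> float:
    """SYSTEM LOAD = ACTIVE USERS x CALL RATE x CALL DURATION, per hour.

    ASSUMPTION: rate is calls per user per hour, duration is in seconds.
    The result is the average number of channels in use during the hour.
    """
    busy_seconds = active_users * calls_per_user_per_hour * call_duration_s
    return busy_seconds / SECONDS_PER_HOUR


if __name__ == "__main__":
    # Hypothetical numbers, only to show how the three variables interact.
    users, rate, duration = 500, 6, 2.5
    load = system_load(users, rate, duration)
    print(f"{users} users x {rate} calls/hour x {duration} s = {load:.2f} Erlangs")
```

A result above the number of channels you actually have means somebody is waiting, no matter how the calls are spread out.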

In each example (phone, radio, and Wi-Fi) the end result is the same.  An unusual load on the system for a short period of time causes high re-entrant traffic with none of the users wanting, or able, to back off like I did as a kid during the holidays.  With phone systems, after a couple times of getting a busy signal or a recording saying all circuits are busy, most people will back off and try a different approach.  In radio, if users get a deny tone long enough they simply give up.  In Wi-Fi, there isn’t much human response to this scenario from the end user, other than simply whipping out their personal hot spot and going “rogue.”

Sidebar: The term rogue is a debate for another day, I just used it as an example here.  Don’t worry, much more heated material to debate on this topic coming up!

The Simple Formula

SYSTEM LOAD = ACTIVE USERS x CALL RATE x CALL DURATION

In order to prevent a high-loss system due to re-entrant traffic, system designers need to understand Erlang’s formula, or at least what it boils down to.  For the rest of this discussion, I will be referring to the simple formula used by trunked radio designers: system load is equal to the number of active users times call rate times call duration.  At first glance, what becomes obvious in this equation is that, for the most part, all of these are variables that are outside our control.  Or are they?  Let’s break it down and see what these three variables mean and whether they really are unpredictable and outside of our control.

Active Users

Active users are just that; the number of users that are active at any given point.  This figure is indeed random, and hard to predict.  For radio and phone engineers, they know about how many new devices will be added in any given area, and know when to raise the red flag.  As Wi-Fi designers / engineers, we can look at a given space and estimate how many people can physically cram into the space before a fight breaks out, but that’s about it.  We can guess based on the current number of 2.5 devices per person and then assume that all of them are going to try to work at the same time, but I think that number is a little skewed.  I will admit that the 2.5 number is a good number, but on my last flight where I had 2 laptops, 2 tablets, and 1 phone (5 Wi-Fi enabled devices) I was only trying to use one at any given time.  I might have used 2 IP addresses from the DHCP lease pool, and used some ports in the NAT pool, but I was really only trying to use one device at a time.
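To make the difference between "devices in the room" and "devices actually trying to talk" concrete, here is a small Python sketch. The 2.5 devices per person comes from the paragraph above; the `active_fraction` parameter is my own hypothetical knob, not an industry figure, and the room size is made up.

```python
def estimate_active_devices(people: int,
                            devices_per_person: float = 2.5,
                            active_fraction: float = 1.0) -> float:
    """Estimate how many devices are competing for airtime at once.

    active_fraction is a hypothetical tuning value: 1.0 means every
    device is assumed busy, while 0.2 matches my flight (1 of 5 devices).
    """
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must be between 0 and 1")
    return people * devices_per_person * active_fraction


if __name__ == "__main__":
    room = 30  # hypothetical room capacity
    print("Worst case:", estimate_active_devices(room))
    print("Flight-style usage:", estimate_active_devices(room, active_fraction=0.2))
```

The worst case is still the safe number for sizing DHCP and NAT pools; the smaller number is closer to what is actually asking for airtime.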

The other part of this variable I urge you to consider is what the user is actually doing.  In LPV (Large Public Venue, I had to ask as well), the users are mostly uploading content to social media, or watching instant replays of what they just saw in real life.  When comparing the download data amounts to the upload data amounts for events in these areas, you can see that the active user population has a different agenda than they would at say an EDU or a coffee shop.  EDU and K-12 users will have a different “profile” of user types even between areas on campus.  Classrooms will get one type of demand where cafeterias or libraries will get a different type of user profile.

Knowing why your active users are accessing the system (mostly at random times I would point out) is just as important as knowing how many active users there are.  Again, “it depends” creeps up here as well.  In classroom or lecture halls, this isn’t a wide variable any longer, so paying attention is crucial.  Know your user population!

Call Rate

Call rate is the rate at which the active user population tries to access the system.  Is it once every second, once every 10 seconds or once every minute?  Of the three variables, this is the one that we have the least control over, and the most difficult to predict or even guess at.  For my money, I estimate on the high side, and fall back on the active user variable to help with my estimations.  Once I figure out what the user population will be doing in the room, estimating the call rate falls in line with that, but it’s still hard to predict.

Of the variables, this one is the most prone to outside influences.  Any unpredictable event that can influence the active users has to be taken into account, and it’s at this point a decision has to be made about how much, and for how long, the management will accept a high-loss system with lots of re-entrant traffic that can’t be handled.  I’ve never seen a Wi-Fi system “crash” under a heavy load, but I have seen it come to a screeching halt.  Risk aversion and risk acceptance also come into play with this variable; it’s a decision that has more political issues than technical ones.

Call Duration

Call duration refers to how long an active user, after placing a “call”, tries to retain that resource for exclusive use.  Again, this is a variable that as a designer or engineer we don’t have much control over.  This variable has the biggest impact on the system load.  Harken back to the days when you only had one phone in the house and an older sibling wouldn’t get off the phone long enough for you to call your friends and talk forever.  In the trunked radio business, this variable is the key to their success.

During my heated debate with the really smart RF guy, he informed me that as of Q4 of 2017 the average call duration for a group call (think multicast) on a public safety trunked radio system across the nation was 2.5 seconds.  2.5 seconds to gain access to the channel, transmit your “message”, and then get off the air, thereby freeing up the resource for the next individual to use.  When you think about that, it’s actually really quick.  Maybe not quick when compared to a trace file of Wi-Fi traffic that’s measured in microseconds, but for a human being that is quick!  As a follow up to that stat, the average individual call (think unicast) is around 14.1 seconds.

Let that sink in for a second.

Fourteen point one seconds for an individual to key up, gain access to the resource, send his “private” message to a second individual, and then release the resource for the next call.  This is such an inefficient use of the resource that most trunked radio system administrators limit that type of call to only one at a time.  The reason is that it is so hard, and costly, to add another frequency, or channel, to the system that they don’t allow users to waste the resource.  During that 14.1 seconds, 4.6 additional group calls could have been made (the initial call plus 4.6 additional users), but 4 users had to wait, or be pushed to another resource, because someone was hogging the airtime for a message that is very inefficient compared to the group calls.
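That arithmetic is easy to check. Here is a tiny Python sketch using the two averages quoted above (2.5 seconds for a group call, 14.1 seconds for an individual call); the function name is mine and it assumes calls run back to back with no gap between them.

```python
GROUP_CALL_S = 2.5        # average group call duration quoted above
INDIVIDUAL_CALL_S = 14.1  # average individual call duration quoted above


def calls_that_fit(window_s: float, call_s: float) -> float:
    """How many back-to-back calls of length call_s fit in window_s.

    ASSUMPTION: no idle gap or channel setup time between calls.
    """
    return window_s / call_s


if __name__ == "__main__":
    total = calls_that_fit(INDIVIDUAL_CALL_S, GROUP_CALL_S)
    print(f"Group calls that fit in one individual call: {total:.2f}")
    print(f"Additional calls beyond the first: {total - 1:.2f}")
```

So one individual call occupies the channel for the time of the first group call plus roughly 4.6 more, which is where the 4 waiting users come from.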

Now you’re thinking “Jim, I’m glad I’ve wasted 2,500 words and some crazy formulas to learn about trunked radio call loading.  What in the name of all that is holy and good does this have to do with Wi-Fi?”  Remember when I said I wasn’t going to debate if your mobile hotspot is a rogue or not, that there was going to be much more to debate later on?

Well here it is, get out your wrasslin’ pants because it’s about to get heated in here!

ALL OF THIS APPLIES TO Wi-Fi!

Erlang’s formulas are all about random access timing to a resource that is limited and required to transmit information intended for a distant end.  Phones, radios, and Wi-Fi.  Of the three, phones are the only one that has a wire involved, the other two are wireless!  Remember when I said that call duration was a variable that we don’t have much control over?

What if that isn’t true?

We might not have much control over what the user wants to send, or receive, but we have an inordinate amount of say on how FAST they can send or receive!

40 MHz channels?  BRING ‘EM ON!

80 MHz channels?  If you can, DO IT!

160 MHz channels?  Now we’re just being silly.  I’m crazy, not insane.

Rate limiting?  GET RID OF IT!

50 clients per AP?  WHY SO LITTLE?!?!?!

120 clients per AP?  NOW WE ARE TALKING!

Jimmy has not lost his mind

Anything that slows the user down, and in return raises their “call duration,” messes with the entire system in general.  If we look back at the formula for system loading, it is:

SYSTEM LOAD = ACTIVE USERS x CALL RATE x CALL DURATION

What if we can shrink the call duration field of the equation?  Since call rate is the one variable we really have no control over, simple math tells me that if we REDUCE the value in the call duration field of the formula, the number in the active users field can INCREASE while the actual product of the equation, System Load, can stay the same!
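To make that concrete, here is a small Python sketch of the formula and its rearrangement.  Every number in it (50 users, 2 calls per hour, 10 second calls) is hypothetical and chosen only to keep the math easy; none of them come from a real system.

```python
def system_load(active_users, call_rate, call_duration):
    """SYSTEM LOAD = ACTIVE USERS x CALL RATE x CALL DURATION."""
    return active_users * call_rate * call_duration


def users_supported(load_budget, call_rate, call_duration):
    """Same formula rearranged: active users that fit in a fixed load."""
    return load_budget / (call_rate * call_duration)


# Hypothetical baseline, for illustration only.
call_rate = 2.0           # calls per user per hour
baseline_duration = 10.0  # seconds per call
budget = system_load(50, call_rate, baseline_duration)

# Hold the load and the call rate fixed, then cut the duration in half.
faster = users_supported(budget, call_rate, baseline_duration / 2)

print(f"Load budget: {budget:.0f} call-seconds per hour")
print(f"Users at {baseline_duration:.0f} s per call: 50")
print(f"Users at {baseline_duration / 2:.0f} s per call: {faster:.0f}")
```

With the load and call rate held constant, halving the call duration doubles the number of active users the same load can carry, which is the whole point of the paragraph above.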

I know, Airtime Fairness makes sure that the slow talker in the back of the room doesn’t hog the resource, and that’s not where I’m going here.  Frame size and packet size are standardized, and it’s going to take a predetermined amount of frames to get my data across.  My contention is that if we speed up the ability of the devices to transmit or receive those frames, the client devices will demand LESS time on the system, reducing re-entrant traffic that we discussed earlier, and free up the resource for someone else to talk.
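As a rough illustration of “less time on the system,” the Python sketch below divides a payload by the nominal 802.11ac data rate for a single spatial stream at MCS 7 with an 800 ns guard interval.  It is a simplification under loud assumptions: it ignores preambles, contention, acknowledgements, retries, and aggregation, and it assumes the client can actually hold the same MCS on the wider channel.  The 1 MB payload is arbitrary.

```python
# Nominal 802.11ac PHY data rates in Mbps for 1 spatial stream, MCS 7,
# 800 ns guard interval, keyed by channel width in MHz.
NOMINAL_RATE_MBPS = {20: 65.0, 40: 135.0, 80: 292.5}


def payload_airtime_ms(payload_bytes, rate_mbps):
    """Milliseconds needed to move the payload bits at the given PHY rate.

    Deliberately ignores all protocol overhead (an assumption of this sketch).
    """
    bits = payload_bytes * 8
    return bits / (rate_mbps * 1_000_000) * 1000


payload = 1_000_000  # arbitrary 1 MB transfer, for illustration only
for width_mhz, rate in NOMINAL_RATE_MBPS.items():
    airtime = payload_airtime_ms(payload, rate)
    print(f"{width_mhz} MHz at {rate} Mbps: {airtime:.1f} ms of airtime")
```

Real numbers will be worse across the board once overhead is counted, but the relationship holds: the same data at a higher rate occupies the medium for less time.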

If you recall, re-entrant traffic is the key part about driving a system into a “high-loss” state.  I have been to two Wi-Fi specific conferences, listened in on webcasts, and read papers, and the one thing that I do know is that loss of any kind is bad.  Why is it not OK to have loss on the RF spectrum but perfectly fine to induce a high-loss system because we are trying to strangle the system?

Radio is radio, loss is loss.

During WLPC 2018, Joel Crane stood up and did a fantastic presentation called “Look Into My Eye P.A.”  Joel is a fantastic presenter, and very knowledgeable about his topics.  If you haven’t seen it, I highly recommend watching it; I included the link for your convenience.  He is one of the few presenters that I will stop and pay attention to because I know it’s going to be good.  I am also buttering him up because I am going to refer to a couple of his slides below and I hope he doesn’t mind!

[Image: slide from Joel Crane’s presentation]

Wi-Fi is Half-Duplex

[Image: second slide from Joel Crane’s presentation]

Half-Duplex I Say!

Now Joel doesn’t know what I’m about to say, and he in no way endorses my viewpoint, so don’t go after him.  I am using this because I was watching his presentation yesterday and was reminded about what he said, and it’s a great visual that I don’t have to reproduce.

Part of his presentation, as seen above, is that Wi-Fi is half-duplex, which means that only one device can talk at a time.  Pretty sure we are all in agreement on this one.  In fact, the second picture is from when Joel was talking about the fact that when one device is transmitting, all the other Wi-Fi devices in the BSS are in a listen mode.

Using this theory, why not give the one device that is allowed to talk at any given time the best chance to get as much across as it can while it has the medium?  I know the growing consensus is to use 20 MHz channels and nothing else, but my argument is: why start off constrained?  Go as big as you can until you simply can’t.  I have seen arguments that 2 devices can transmit more data using 2 separate 20 MHz wide channels coming off 2 APs than those same 2 devices can transmit on a single AP with a 40 MHz wide channel.  I have seen the math and I can’t argue with the math, but I wonder what would happen if you expanded that test out to a user base that is really random?

Disclaimer: Again, my theory starts to break down when you get into K-12 and EDU classrooms, but I think there is some benefit here, so hang with me.

Given that Wi-Fi is half duplex, and only one station can talk at any given time, do you really only get 3 Mbps per 1 spatial stream device when you reach 10 stations on a BSS?  If only one station can transmit, or receive at a time, wouldn’t they get all the bandwidth for that given slot?
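One way to frame that question is to separate what a station gets while it holds the medium from what it averages over time.  The Python sketch below does that under simplifying assumptions that are mine, not measured: every station with data gets an equal share of airtime, all stations run at the same usable rate, and overhead is ignored.  The 30 Mbps figure is hypothetical, picked only so that 10 stations works out to the 3 Mbps mentioned above.

```python
def per_station_view(usable_mbps, stations_with_data):
    """Return (rate while holding the medium, long-run average share).

    Assumes equal airtime per station and identical rates; both are
    simplifications made for this sketch.
    """
    if stations_with_data < 1:
        raise ValueError("need at least one station with data to send")
    return usable_mbps, usable_mbps / stations_with_data


# Hypothetical usable rate of 30 Mbps on the BSS.
for active in (10, 2):
    during_turn, average = per_station_view(30.0, active)
    print(f"{active} stations with data: {during_turn} Mbps during a turn, "
          f"{average} Mbps averaged over time")
```

The 3 Mbps number only shows up when all 10 stations have data queued at the same moment.  If the user base is random and only a couple of them are active at any given instant, each active station averages far more.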

If we all agree that Wi-Fi is half duplex, doesn’t that simple factor alone mean that the airtime slots for the stations actually align all the data in a neat row?  In LPV, where users are generally uploading data, the data leaving the AP on the wire HAS to be in a neat row because that’s how the AP received it.  Since the AP isn’t examining the payload (it isn’t the destination MAC address), it will automatically send it up the wire.  When the traffic destined for the client device from elsewhere in the world hits the AP at random times, there is really no predictability as to whether one client station will receive its packets first or not.  It’s actually at the discretion of the source device at the distant end and the rest of the goo between the source and destination devices, over which we generally have no control at all.

In a unicast scenario, every packet or frame is going to have a destination address and a source address.  Whether that is coming from multiple client devices destined for one server or from one server headed to multiple clients, they will all still be in a row.  I don’t have any packet captures or trace files showing the opposite, but I have seen a lot of wired and wireless captures and they are all in a row, with a neat little time stamp.

The best I can do is show that client load on an AP has no bearing on the actual throughput of the AP in a scenario where Erlang C is in full effect; a guest Wi-Fi system running full open with 40 MHz channels:

[Chart: client count and throughput per AP, captured 4 March 2018]

I have been watching this chart for the past 14 months, and haven’t been able to find any correlation between number of clients and throughput.  On this graph specifically, which I grabbed on 4 March 2018 at roughly 5:30 pm, look at the last three APs on the right.  Pretty similar client counts; in fact the second from the right has the highest quantity of 2.4 GHz devices of the three, yet its throughput for this particular instant is well above the far right, and almost double that of the one third from the right.  I have even watched the AP with the highest number of clients have the most throughput and then immediately drop to one of the lowest.  The one thing I have picked up is that the throughput, per AP and globally, is much more “spikey” and random.  For our money, that means we are working more efficiently, and the customer complaints have all but vanished.

More of Jim’s Arguments

Another part of my argument comes down to the question of “what is noise”?  There are a lot of opinions on this, and a lot of people can agree on what is noise, but I take it one step further and agree with Keith Parsons on a presentation he gave at Wi-Fi Trek 2017 in Orlando, Florida.  During his presentation (just before the 11 minute mark) he reminded us that noise is noise, and that my signal is your noise, and in some cases, my signal is my noise.  The more signal you jam into a space, the more noise there is.  I don’t care if it’s on frequency or not; there is a reason the hardware tries to figure out what the noise floor is and doesn’t use a calculation based on what it thinks is in the space.

Radio is radio, loss is loss, noise is noise.

By going with 20 MHz wide channels in order to “increase capacity” in a space, all that ends up happening is you add noise into a space that is already noisy.  My signal is my noise, and your noise.  My signal becomes everyone’s noise.  Instead of adding more APs, if the designer spends a little bit of time to understand what is going on in the space (active users and what they are doing), why not use a 40 MHz wide channel off a single AP?  Fewer transmitters, less noise, better performance.  If you think your cheap phone has a hard time working in a given space, try sitting next to a laptop with high transmit power and a decent antenna on channel 36 while you are on channel 40 trying to listen.  He’s on a different channel you say, no problem!  Think so?

Radio is radio, loss is loss, noise is noise, power is power.

Energy generated from a device, be it an 802.11 modulated signal or a dreaded microwave, still raises the noise floor in an environment.  If it wasn’t an issue we wouldn’t spend so much time and money trying to eliminate those sources of noise.  If there are three different tools developed to tell me if it’s a video camera or microwave, then knowing that I am generating my own noise should be a thing as well.  Fewer transmitters, especially high-power transmitters with great antennas, should be a goal we all strive for.  Since one single AP transmitting at 10,000 watts isn’t going to happen, we need to keep working on a better solution.

A wider channel gives a station more capacity to transmit his data.  I recently saw a flowchart from Peter MacKenzie during his CWAP class on what it takes to actually transmit a frame on an 802.11 channel.  My first reaction was “how in the hell does it ever work?”  I think I even tweeted that remark out.  He then followed it up with a video showing a traffic intersection in Ethiopia somewhere.  The fact that this works at all still astounds me.  Why not, when the opportunity presents itself, allow the station to speak as much of his piece as he can, because he sure earned that right after all the junk he had to do to get the mic?

Are We Done Yet?

Admittedly, I went off the rails from talking about “Over-Subscription” because I have heard the term used without any real consideration of what it means.  Sure, we can over-subscribe the equipment, and accept it, but why not use it to our advantage if we are going to “accept some over-subscription” on the hardware?

My contention, in a nutshell, is this.  What some people call “Over-Subscription” I call using the hardware that I purchased to its full potential.  Don’t buy a supercar if you are only going to drive through an active school zone on the way to the grocery store.  Don’t discount 40 MHz wide channels until you really prove that you CAN’T support 40 MHz wide channels.  Get the station the information it’s looking for as fast as possible, drive down re-entrant traffic, and improve system performance.  Over-Subscription?  Only if you can balance the equation and drive down those other two numbers.

Radio is radio, loss is loss, noise is noise, load is load.

Understanding how subscription and over-subscription work, and how it all ties together, will allow you to “subscribe” your APs to a station number they can actually support, not what’s on the sales sheet.

Conclusion

Where does this leave us?  I will tell you; in the exact same place as we started.  I am not going to stand up and say that everyone who preaches 20 MHz wide channels is wrong or that everyone who preaches 40 MHz channels is right.  I am here to tell you that:

It Depends

At its heart, 802.11 is a medium where we constantly have to play the balance game.  We are CONSTANTLY robbing Peter to pay Paul.  (Not Peter MacKenzie, a hypothetical Peter.)  Every time we turn the power up in one place, we have to turn it down in a different part of the system.  Changing bandwidth means either more or fewer available channels.  Directional antennas mean signal isn’t going someplace that it used to but is going farther in a direction we might not have intended it to.  More APs means “more capacity” but it also means “more noise.”  Fewer APs means wider channels and less noise but “less capacity.”  Wi-Fi is half duplex but we still run capacity equations based on all the clients that are in the space transmitting at one time, even though all but one should be listening.  Someone asked if an antenna is tuned to handle 1,000 clients even though Joel tells us in his presentation that it only matters about the one client.

It Depends

We have jobs because we have a deep understanding of technology and how to deploy it in the wild, and then go back behind and “correct” someone else’s design.  One of the things I marvel at is how much I have learned about human behavior while designing Wi-Fi, and not just technology.  There is a reason why things were done the way they were.  It is up to us as a community to ask the question of “why?”  Just because someone designed a system to be over-subscribed, I will ask the question of why.  Not to second-guess them, but to understand the thought process that led to the design they deployed.  Due to all of this ambiguity, almost every day on Twitter I see a point where I want to jump in and say:

It Depends

 

**I welcome any and all responses to my theory and ideas, and would love to hear any ideas to the contrary.  My only ask is please don’t try and debate me on Twitter.  This wasn’t over 5,000 words long simply to resort to a couple of hundred word arguments.  Leave a response, ask a question, I will get to them as fast as I can.  Thanks for taking the time to get this far!**

**Update 8 March 2018 – Changed “A wider channel gives a station more time to transmit his data” to “A wider channel gives a station more capacity to transmit his data.”  The time slot doesn’t increase with the wider channel but the bandwidth does increase.**