March 19, 2019

The Networking Nerd

QoS Is Dead. Long Live QoS!

Ah, good old Quality of Service. How often have we spent our time as networking professionals trying to discern the archaic texts of Szigeti to learn how to make it work? QoS is something that seemed so necessary to our networks years ago that we would spend hours upon hours trying to learn the best way to implement it for voice, bulk data, or some other traffic class. That was, until a funny thing happened. Until QoS was useless to us.

Rest In Peace and Queues

QoS didn’t die overnight. It didn’t wake up one morning without a home to go to. Instead, we slowly devalued and destroyed it over a period of years. We did it by focusing on the things that QoS was made for and then marginalizing them. Remember voice traffic?

We spent years installing voice over IP (VoIP) systems in our networks. And each of those systems needed QoS to function. We took our expertise in the arcane arts of queuing and applied it to the most finicky protocols we could find. And it worked. Our mystic knowledge made voice better! Our calls wouldn’t drop. Our packets arrived when they should. And the world was a happy place.

That is, until voice became pointless. When people started using mobile devices more and more instead of their desk phones, QoS wasn’t as important. When the steady generation of delay-sensitive packets moved back to LTE instead of IP, it wasn’t as critical to ensure that FTP and other protocols in the LAN didn’t interfere with it. Even when people started using QoS on their mobile devices, the marking was totally inconsistent. George Stefanick (@WirelesssGuru) found that Wi-Fi calling was doing some weird packet marking anyway.

So, without a huge packet generation issue, QoS was relegated to some weird traffic shaping roles. Maybe it was video prioritization in places where people cared about video? Or perhaps it was creating a scavenger class for traffic in order to get rid of unwanted applications like BitTorrent. But overall QoS languished as an oddity as more and more enterprises saw their collaboration traffic moving to be dominated by mobile devices that didn’t need the old dark magic of QoS.

QoupS de Gras

The real end of QoS came about thanks to the cloud. While we spent all of our time trying to find ways to optimize applications running on our local enterprise networks, developers were busy optimizing applications to run somewhere else. The ideas were sound enough in principle. By moving applications to the cloud we could continually improve them and push features faster. By moving all the bits off the local network we could scale massively. We could even collaborate together in real time from anywhere in the world!

But applications that live in the cloud live outside our control. QoS was always bounded by the borders of our own networks. Once a packet was launched into the great beyond of the Internet we couldn’t control what happened to it. ISPs weren’t bound to honor our packet markings without an SLA. In fact, in most cases the ISP would remark all our packets anyway just to ensure they didn’t mess with the ISP’s ideas of traffic shaping. And even those were rudimentary at best given how well QoS plays with MPLS in the real world.
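Requesting a marking from the application side has always been the trivial part; getting anyone past your own edge to honor it is the hard part. A minimal Python sketch (assuming a Linux host) that asks for the DSCP EF value on a socket, knowing full well an ISP is free to remark it in flight:

```python
import socket

# DSCP EF (Expedited Forwarding) is decimal 46; the TOS byte carries
# the DSCP value in its upper six bits, so shift left by two.
DSCP_EF = 46
tos = DSCP_EF << 2  # 0xB8

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos)

# Confirm the marking the kernel will apply to outgoing packets.
print(sock.getsockopt(socket.IPPROTO_IP, socket.IP_TOS))  # 184
sock.close()
```

Everything after that marking leaves the host is policy you don’t control, which is exactly the point above.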

But cloud-based applications don’t worry about quality of service. They scale as large as you want. And nothing short of a massive cloud outage will make them unavailable. Sure, there may be some slowness here and there, but that’s no worse than what you’d expect running a heavy application over your local LAN. The real genius of the cloud shift is that it forced developers to slim down applications and make them more responsive in places where they could be made more interactive. Now, applications felt snappier even when they ran in remote locations. And if you’ve ever tried to use old versions of Outlook across slow links, you know how critical that responsiveness can be.

The End is The Beginning

So, with cloud-based applications here to stay and collaboration all about mobile apps now, we can finally carve the tombstone for QoS right? Well, not quite.

As it turns out, we are still using lots and lots of QoS today in SD-WAN networks. We’re just not calling it that. Instead, we’ve upgraded the term to something more snappy, like “Application Visibility”. Under the hood, it’s not much different than the QoS that we’ve done for years. We’re still picking out the applications and figuring out how to optimize their traffic patterns to make them more responsive.

The key with the new wave of SD-WAN is that we’re marrying QoS to conditional routing. Now, instead of being at the mercy of the ISP link to the Internet we can do something else. We can push bulk traffic across slow cheap links and ensure that our critical business applications have all the space they want on the fast expensive ones instead. We can push our out-of-band traffic out of an attached 4G/LTE modem. We can even push our traffic across the Internet to a gateway closer to the SaaS provider with better performance. That last bit is an especially delicious piece of irony, since it basically serves the same purpose as Tail-end Hop Off did back in the voice days.
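At its core, that conditional routing reduces to a policy lookup: classify the application, pick the link. A toy Python sketch with hypothetical link names and traffic classes (no real SD-WAN product API implied):

```python
# Hypothetical WAN links and a per-application-class policy: critical
# apps pin to the fast expensive path, bulk to the slow cheap one,
# out-of-band management to the attached LTE modem.
LINKS = {
    "mpls": {"latency_ms": 20, "cost": "high"},
    "broadband": {"latency_ms": 45, "cost": "low"},
    "lte": {"latency_ms": 60, "cost": "metered"},
}

POLICY = {
    "voice": "mpls",
    "saas": "mpls",
    "bulk": "broadband",
    "oob-mgmt": "lte",
}

def pick_link(app_class: str) -> str:
    """Conditional routing: return the WAN link for a traffic class,
    defaulting anything unclassified onto the cheap link."""
    return POLICY.get(app_class, "broadband")

print(pick_link("voice"))       # mpls
print(pick_link("bittorrent"))  # broadband
```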

And how does all this magical new QoS work on the Internet outside our control? That’s the real magic. It’s all tunnels! Yes, in order to make sure that we get our traffic where it needs to be in SD-WAN we simply prioritize it going out of the router and wrap it all in a tunnel to the next device. Everything moves along the Internet and the hop-by-hop treatment really doesn’t care in the long run. We’re instead optimizing transit through our network based on other factors besides DSCP markings. Sure, when the traffic arrives on the other side it can be optimized based on those values. However, in the real world the only thing that most users really care about is how fast they can get their application to perform on their local machine. And if SD-WAN can point them to the fastest SaaS gateway, they’ll be happy people.

Tom’s Take

QoS suffered the same fate as Ska music and NCIS. It never really went away, even when people stopped caring about it as much as they did when it was the hot new thing on the block. Instead, the need for QoS disappeared when our traffic usage moved away from the usage it was designed to augment. Sure, SD-WAN has brought it back in a new form, QoS 2.0 if you will, but the need for what we used to spend hours of time doing with ancient tomes of knowledge is long gone. We should have a quiet service for QoS and acknowledge all that it has done for us. And then get ready to invite it back to the party in the form that it will take in the cloud future of tomorrow.

by networkingnerd at March 19, 2019 07:24 PM

My Etherealmind

Aspects of Grey Market Products

“Grey Marketing” is a business arbitrage in which a low-cost source of products is used to extract profits. In short, buy the same product cheaper and sell it for the same price. It works best for branded products that are sold at different prices in different markets. There are four aspects of grey marketing of […]

The post Aspects of Grey Market Products appeared first on EtherealMind.

by Greg Ferro at March 19, 2019 02:09 PM

ipSpace.net Blog (Ivan Pepelnjak)

Automating Cisco ACI Environment with Python and Ansible

This is a guest blog post by Dave Crown, Lead Data Center Engineer at the State of Delaware. He can be found automating things when he's not in meetings or fighting technical debt.

Over the course of the last year or so, I’ve been working on building a solution to deploy and manage Cisco’s ACI using Ansible and Git, with Python to spackle in cracks. The goal I started with was to take the plain-text description of our network from a Git server, pull in any requirements, and use the solution to configure the fabric, and lastly, update our IPAM, Netbox. All this without using the GUI or CLI to make changes. Most importantly, I want to run it with a simple invocation so that others can run it and it could be moved into Ansible Tower when ready.

Read more ...

by Ivan Pepelnjak ( at March 19, 2019 07:22 AM

March 18, 2019

ipSpace.net Blog (Ivan Pepelnjak)

Last Week on (2019W11)

TL&DR: We ran two workshops in Zurich last week – a quick peek into using Ansible for network automation and updated Building Private Cloud Infrastructure. You can access workshop materials with any paid subscription.

Now for the fun part…

Read more ...

by Ivan Pepelnjak ( at March 18, 2019 11:05 AM

The Data Center Overlords

Certification Exam Questions That I Hate

In my 11 year career as an IT instructor, I’ve had to pass a lot of certification exams. In many cases not on the first try. Sometimes for fair reasons, and sometimes, it feels, for unfair reasons. Recently I had to take the venerable Cisco CCNA R&S exam again. For various reasons I’d allowed it to expire, and hadn’t taken many exams for a while. But recently I needed to re-certify with it which reminded me of the whole process.

Having taken so many exams (50+ in the past 11 years) I’ve developed some opinions on the style and content of exams.


In particular, I’ve identified some types of questions I utterly loathe for their lack of aptitude measurement, uselessness, and overall jackassery. Plus, a couple of styles that I like.

This criticism applies to certification exams from all vendors, and is not even limited to IT.

To Certify, Or Not To Certify

The question of the usefulness of certification is not new.

On one hand, you have a need to weed out the know-its from the know-it-nots, a way to effectively measure a person’s aptitude in a given subject. A certification exam, in its purest form, is meant to probe the knowledge of the applicant.

On the other hand, you have an army of test-dumping dullards, passing exams and unable to explain even basic concepts. That results in a cat-and-mouse game between the exam creators and the dump sites.

And mixed in, you have a barrage of badly formed questions that are more appropriate to your local pub’s trivia night than to a professional aptitude measurement.

So in this article I’m going to discuss the type of questions I despise. Not just because they’re hard, but because I can’t see how they accurately or fairly judge a person’s aptitude.

Note: I made all of these questions up. As far as I know, they do not appear on any certification exam from any vendor. This is not a test-dump. 

Pedantic Trivia

The story goes that Albert Einstein was once asked how many feet are in a mile. His response was this: “I don’t know, why should I fill my brain with facts I can find in two minutes in any standard reference book?”


I really relate to Einstein here (we’re practically twinsies). So many exam questions I’ve sat through were pure pedantic trivia. The knowledge of the answer had no bearing on the aptitude of the applicant.

Here’s an example, similar to ones I recall on various exams:

What is the order of ink cartridges in your printer? Choose one.

A: Black, Magenta, Cyan, Yellow

B: Yellow, Cyan, Magenta, Black

C: Magenta, Cyan, Black, Yellow

Assuming you have a printer with color cartridges, can you remember the order they go in? Do you care? Does it matter? Chances are there’s a diagram to tell you where to put them.

Some facts are so obscure they’re not worth knowing. That’s why reference sources are there.

I can even make the argument about certain details about regularly used aspects of your job. Take VRRP for example. For network administrators, VRRP and similar are a way to have two or more routers available to answer to a single IP address, increasing availability. This is a fundamental networking concept, one that any network administrator should know.

VRRP uses a concept known as a vMAC. This is a MAC address that sits with the floating IP address, together making a virtual router that can move between physical routers.

So far, everything about what I’ve described about VRRP (and much more that I haven’t) would be fair game for test questions. But a question that I think is useless is the following:

The vMAC for VRRP is (where XX is the virtual router ID): 

A: 00:01:5A:01:00:XX

B: 00:00:5A:01:00:XX

C: 00:01:5E:00:FF:XX

D: 00:00:5E:00:01:XX

I’m willing to bet that if you asked 10 good CCIEs what the vMAC address of a VRRP group is, none would be able to recite it. Knowledge of this address has no bearing on your ability to administer a network. How VRRP works is important to understand, but this minutia is useless.
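That format, for the record, is the one RFC 3768 mandates (00:00:5e:00:01 followed by the one-byte virtual router ID), which is exactly why memorizing it is pointless: a machine can generate it mechanically. A quick Python illustration:

```python
def vrrp_vmac(vrid: int) -> str:
    """Build the IPv4 VRRP virtual MAC per RFC 3768:
    00:00:5e:00:01:XX, where XX is the virtual router ID."""
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be 1-255 (one byte, non-zero)")
    return "00:00:5e:00:01:{:02x}".format(vrid)

print(vrrp_vmac(10))  # 00:00:5e:00:01:0a
```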


I have two theories where these questions come from.

Theory 1: I’ve written test questions (for chapter review, I don’t think I’ve written actual certification questions) and I know it’s difficult to come up with good questions. Test banks are often in the hundreds, and it can be a slog to make enough. Trivia questions are easy to come up with and easy to verify.

Theory 2: Test dumpers. In the cat and mouse game between test writers and test dumpers, vendors might feel the need to up the difficulty level because pass rates get too high (which I think only hurts the honest people).


Exact Commands

Another one I really despise is when a question asks you for the exact command to do something. For example:

Which command will send the contents of one directory to a remote server using SSH?

A: tar -cvf - directory | ssh root@ "cd /home/user/; tar -xvf -"

B: tar -xvf - directory | ssh root@ "cd /home/user/; tar -xvf -"

C: tar -cvf - directory > ssh root@ "cd /home/user/; tar -cvf -"

D: ssh root@ "cd /home/user/ tar -xvf -" > tar -xvf directory

For common tasks, such as deleting files, that’s probably fair game (though not terribly useful). Most CLIs (IOS, Bash, PowerShell) have tab completion, help, etc., so any command syntax can be looked up. Complex pipes like the ones above are the kind I use with some regularity, but I often have to look them up.


The Unclear Questions

I see these in certification tests all the time. It’ll be a question like the following:

What are some of the benefits of a pleasant, warm, sunny day? (Choose Three)

  • A: Vitamin D from sunlight
  • B: Ability to have a picnic in a park
  • C: No need for adverse weather clothing
  • D: Generally improves most people’s disposition

Look at those answers. You could make an argument for any of the four, though the question is looking for three. They’re all pretty correct. Reasonable people, even intelligent, experienced people, can disagree on what the correct answer is.

Questions I Do Like

I try not to complain about something if I don’t have something positive to contribute. So here’s my contribution: These are test questions that I think are more than fair. If I don’t know the answers to these types of questions, I deserve, in every sense of fairness, to get the question wrong.

Scenario Questions

A scenario question is something like this: “Given X, what would happen”.

For example, if a BPDU were received on a PortFast-enabled interface, what would happen?


If a host with an IP netmask combo of was to try to communicate with a host configured on the same Layer 2 segment with an IP address of, would they be able to communicate? 

I like those types of questions because they test your understanding of how things work. That’s far more important for determining competency I think.
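The second scenario above is exactly the kind of thing you can reason through in a few lines. A Python sketch using the standard ipaddress module, with hypothetical addresses and mismatched masks standing in for the ones in the question:

```python
import ipaddress

def peer_looks_on_link(local_iface: str, peer_ip: str) -> bool:
    """True if the peer's address falls inside the subnet implied by
    the local host's own address/mask configuration."""
    network = ipaddress.ip_interface(local_iface).network
    return ipaddress.ip_address(peer_ip) in network

# Hypothetical mismatched masks on one Layer 2 segment:
host_a = "192.168.1.10/24"   # A's view of the wire
host_b = "192.168.1.200/28"  # B's view of the same wire

print(peer_looks_on_link(host_a, "192.168.1.200"))  # True: A ARPs for B directly
print(peer_looks_on_link(host_b, "192.168.1.10"))   # False: B sends A's traffic to its gateway
```

That asymmetry (one side replies directly, the other goes via its gateway) is the understanding such a question actually tests.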

There are some network basics that might seem like trivia but are important to know. For example:

What is the order of a TCP handshake?





This question is fundamental to the operation of networks, and I would hope any respectable network engineer would know the answer. It’s important for tcpdump analysis and other fundamental troubleshooting.
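For completeness, the sequence the question is after is easy to state. A trivial Python model of the exchange:

```python
# The canonical TCP three-way handshake, in order:
HANDSHAKE = [
    ("client -> server", "SYN"),      # client sends its initial sequence number
    ("server -> client", "SYN-ACK"),  # server acknowledges and sends its own ISN
    ("client -> server", "ACK"),      # client acknowledges; connection established
]

for direction, flags in HANDSHAKE:
    print(f"{direction}: {flags}")
```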


If you write test questions, ask yourself: Would the best people doing what this question tests get this answer right? Is it overly pedantic? Is there a clear answer? 

This was mostly written as a frustration piece. But I think I’m not alone in this frustration.

by tonybourke at March 18, 2019 02:38 AM

XKCD Comics

March 16, 2019

Honest Networker

Junos 17.3 would be great, they said.


by ohseuch4aeji4xar at March 16, 2019 09:18 PM

ipSpace.net Blog (Ivan Pepelnjak)

Networking Events in Europe

A European networking engineer sent me this question:

I'd like to know where other fellow engineers meet up, especially in Europe, to discuss enterprise datacenter and regular networking. There are the Cisco Live type of things to go to, but are there any vendor-neutral meetups?

Gabi Gerber is organizing networking-focused workshops in Switzerland every quarter (search under SIGS Workshops), and you’re most welcome to join us ;) It’s always a boutique event, but that gives us the ability to chat long into the evening.

Read more ...

by Ivan Pepelnjak ( at March 16, 2019 08:37 AM

March 15, 2019

My Etherealmind
Honest Networker

Networker noticing that has a Routing-security presentation on a NOG.




by ohseuch4aeji4xar at March 15, 2019 10:36 AM

ipSpace.net Blog (Ivan Pepelnjak)

Feedback: Data Center Interconnects Webinar

I got great feedback about the first part of the Data Center Interconnects webinar from one of the subscribers:

I had no specific expectation when I started watching the material and I must have watched it 6 times by now.

Your webinar covered just the right level of detail to educate myself or refresh my knowledge on the technologies and relevant options for today’s market choices

The information provided is powerful and avoids the useless discussions with vendors and PowerPoint pitches. Once you ask the right question it’s easy to get an idea of the vendor readiness

In the first live session we covered the easy cases: design considerations, and layer-3 interconnect with path separation (multiple routing domains). The real fun will start in the second live session on March 19th when we’ll dive into stretched VLANs and long-distance vMotion ideas.

You can attend the live session with any paid subscription; details here.

by Ivan Pepelnjak ( at March 15, 2019 08:52 AM

XKCD Comics

March 14, 2019

My Etherealmind

ipSpace.net Blog (Ivan Pepelnjak)

Using Screen Scraping in Network Automation

The first time I encountered screen scraping was in mid-1990. All business applications were running on IBM mainframes in those days, and IBM used a proprietary terminal system (remember the 3270?) that was almost impossible to interact with, so some people got the “bright” idea of emulating that screen, scraping information off the emulated screen and copying it into HTML pages… thus webifying their ancient apps.

Guess what – we’re still doing the very same thing in network automation as Andrea Dainese succinctly explained in the latest addition to his Automation for Cisco NetDevOps article.

by Ivan Pepelnjak ( at March 14, 2019 08:22 AM

March 13, 2019

Potaroo blog

The State of DNSSEC Validation

Many aspects of technology adoption in the Internet over time show simple "up and to the right" curves. What lies behind these curves is the assumption that once a decision is made to deploy a technology the decision is not subsequently "unmade." When we observe an adoption curve fall rather than rise, then it’s reasonable to ask what is going on.

March 13, 2019 11:00 PM

My Etherealmind

Analysis: F5 Networks Buys NGINX. Six Good Reasons and the rest.

F5 get service mesh, buys a competitor, get open source chops, but still behind on cloud'iness.

The post Analysis: F5 Networks Buys NGINX. Six Good Reasons and the rest. appeared first on EtherealMind.

by Greg Ferro at March 13, 2019 04:00 PM

Router Jockey

Mandatory Cisco DNA Licensing – is this the Future??


With the release of the new 9200 series switches, many enterprise organizations are starting to look towards the future. Cisco has also been looking towards the future… of their profit margin. With the 2960X platform nearing its EOS/EOL announcement, Cisco has been working to promote the new hardware. And by now most Cisco enterprise customers have realized that DNA Center licensing is mandatory on your initial hardware purchase. This is certainly a deviation from Cisco’s normal à la carte licensing, but what do you think is the driving force behind all of this?

The era of SaaS and Subscription based licensing has been upon us for some time. Last year Gartner predicted, “By 2020, all new entrants and 80% of historical vendors will offer subscription-based business models”. These shifts to recurring-revenue models are the latest adaptation for companies like Cisco to continue to pad their bottom line with dollars their customers may not be ready to spend. After all, why would Cisco miss out on dollars left on the table?

When I started laying out the network hardware roadmap for the next 24-36 months, I quickly realized that the Catalyst 9200 series will raise per-port cost by 15-20% across a 5-year investment. That is enough to make many customers question why… or, more importantly, who they should be partnering with. This is in stark contrast to the pricing offered with the 4500X replacement, the Catalyst 9500 series, where the TCO was rather comparable. Part of this additional uptick is the need to budget for additional licensing for Cisco ISE. It seems that in order to take advantage of ISE with the Catalyst 9200 series, end users have to buy DNA Advantage licensing, not DNA Essentials, which is the base license.
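The per-port math behind that comparison is simple enough to sketch. A Python example with purely illustrative made-up numbers (not Cisco list pricing):

```python
def per_port_tco(hardware: float, licensing_per_year: float,
                 ports: int, years: int = 5) -> float:
    """Per-port total cost of ownership: up-front hardware cost plus
    recurring licensing over the investment window, divided by ports."""
    return (hardware + licensing_per_year * years) / ports

# Hypothetical 48-port switches: one with no recurring license,
# one with a mandatory annual subscription.
legacy = per_port_tco(hardware=4000, licensing_per_year=0, ports=48)
cat9200 = per_port_tco(hardware=4000, licensing_per_year=140, ports=48)

print(f"per-port uplift over 5 years: {cat9200 / legacy - 1:.1%}")
```

With those made-up figures the recurring license alone drives the per-port uplift into the 15-20% range described above, even with identical hardware cost.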

Note: I will admit that I have heard competing opinions on this, but I am using this Cisco PDF as a source, see page 43.

My thoughts…

This isn’t the first time Cisco has tried to push customers into a software subscription model for their hardware purchases, although their previous attempt, Cisco ONE licensing, was generally received about as well as a brick to the face. This time Cisco requires DNA licensing on purchase and forces customers who use popular features to keep renewing the licensing. And since the new Cat 9k series is required to phone home with its “Smart Licensing” feature, Cisco has also put a large damper on the grey market gear that it has been trying to squash for years. In my opinion, subscription licensing has to make sense. Trying to license anything from Cisco, Microsoft, VMware, or any other large vendor has been an absolute nightmare for years now. These companies could learn a great deal from the new guys on the block like Slack, whose licensing model, combined with their “fair billing policy“, makes a great deal of sense to end users.

Finally, I want to know what you think. Is the Catalyst 9200 something you’re looking forward to lifecycling into? Or are you suffering from heartburn after looking at the numbers? Am I crazy? Let me know in the comments.

The post Mandatory Cisco DNA Licensing – is this the Future?? appeared first on Router Jockey.

by Tony Mattke at March 13, 2019 01:47 PM

XKCD Comics

March 12, 2019

My Etherealmind
Honest Networker

Your Facebook-feed after friending all the IXP-people.


by ohseuch4aeji4xar at March 12, 2019 11:50 AM

My Etherealmind

ipSpace.net Blog (Ivan Pepelnjak)

Use Network Automation to Detect Software Bugs

This blog post was initially sent to subscribers of my SDN and Network Automation mailing list. Subscribe here.

Here’s a question I got from one of the attendees of my network automation online course:

We had a situation where HSRP was configured on two devices and then a second change was made to use a different group ID. The HRSP mac address got "corrupted" into one of devices and according to the vendor FIB was in an inconsistent state. I know this may be vendor specific but was wondering if there is any toolkit available with validation procedures to check if FIB is consistent after implementing L3 changes.

The problem is so specific (after all, he’s fighting a specific bug) that I wouldn’t expect to find a generic tool out there that would solve it.
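There may not be a generic tool, but the shape of such a check is easy to sketch: snapshot the FIB before and after the change window (however your platform exposes it) and diff the entries. A hypothetical Python sketch with made-up data formats:

```python
def fib_inconsistencies(before: dict, after: dict) -> list:
    """Compare two FIB snapshots (prefix -> rewrite/MAC info) and
    return (prefix, old, new) tuples for entries that changed or
    vanished after a change window."""
    issues = []
    for prefix, entry in sorted(before.items()):
        if prefix not in after:
            issues.append((prefix, entry, None))
        elif after[prefix] != entry:
            issues.append((prefix, entry, after[prefix]))
    return issues

# Hypothetical snapshots around an HSRP group-ID change; the MAC
# format mimics the HSRP well-known virtual MAC style.
before = {"10.1.2.0/24": "0000.0c07.ac01", "10.1.3.0/24": "0000.0c07.ac01"}
after  = {"10.1.2.0/24": "0000.0c07.ac05"}

print(fib_inconsistencies(before, after))
```

A real validation would pull the snapshots with whatever show commands or APIs the platform offers; the diff logic stays the same.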

Read more ...

by Ivan Pepelnjak ( at March 12, 2019 08:19 AM

March 11, 2019

My Etherealmind

Virtual Design Clinic 4

Sign up for Virtual Design Clinic 4.

The post Virtual Design Clinic 4 appeared first on EtherealMind.

by Greg Ferro at March 11, 2019 04:17 PM

ipSpace.net Blog (Ivan Pepelnjak)

Last Week on (2019W10)

The Spring 2019 Building Network Automation Solutions course continued with an awesome presentation by David Gee. He started with what you should do before writing a single line of code (identify processes and document them in workflows and sequence diagrams) and covered tons of boring stuff nobody ever wants to talk about.

On Thursday Rachel Traylor continued exploring graphs and their relevance in networking, this time focusing on trees and spanning trees.

The Network Connectivity, Graph Theory, and Reliable Network Design webinar is part of the standard subscription. You can access David’s presentation and all other materials of the Building Network Automation Solutions online course with an Expert Subscription (assuming you choose this course as part of your subscription).

by Ivan Pepelnjak ( at March 11, 2019 08:17 AM

XKCD Comics

March 09, 2019

About Networks

100Gb Ethernet transceivers, modules and form factors on Cisco products

With the introduction of 400Gb Ethernet interfaces, here is a summary of the different form factors, transceivers and modules available for 100Gb Ethernet on Cisco devices. As you will see below, there are many different types of 100Gb Ethernet transceivers, and each type has its own functional mode. In a next article, I will try to explain the differences between them and will give more details about the most common: the QSFP-28. Form Factors: Here is a view of the different form factors – except the Cisco-proprietary form…

The post 100Gb Ethernet transceivers, modules and form factors on Cisco products appeared first on

by Jerome Tissieres at March 09, 2019 12:34 AM

March 08, 2019

ipSpace.net Blog (Ivan Pepelnjak)

Sample Solution: Automated Auditing Toolbox

Wherever you look you find three kinds of people: those that build tools they need, those that find the tools they need, and those that yammer about the lack of tools without ever doing anything to solve the problem.

Daniel Teycheney is clearly in the first category. When faced with “collect some data and create a simple report” hands-on assignment during the Building Network Automation Solutions course he started creating a toolbox of playbooks that can be used in initial network auditing. I’m positive you’ll find tons of useful tidbits in his code ;)

Want to be able to do something similar? You missed the Spring 2019 online course, but you can get the mentored self-paced version with Expert Subscription.

by Ivan Pepelnjak ( at March 08, 2019 07:36 AM

XKCD Comics

March 07, 2019

The Networking Nerd

Silo 2: On-Premise with DevOps

I had a great time stirring up the hornet’s nest with the last post about DevOps, so I figured that I’d write another one with some updated ideas and clarifications. And maybe kick the nest a little harder this time.

Grounding the Rules

First, we need to start out with a couple of clarifications. I stated that the mantra of DevOps was “Move Fast, Break Things.” As has been rightly pointed out, this was a quote from Mark Zuckerberg about Facebook. However, as has been pointed out by quite a few people, “The use of basic principles to enable business requirements to get to production deployments with appropriate coordination among all business players, including line of business, developers, classic operations, security, networking, storage and other functional groups involved in service delivery” is a bit more of a definition than a motto.

What exactly is DevOps then? Well, as I have been educated, it’s a principle. It’s an idea. A premise, if you will. An ideal to strive for. So, to say that someone is on a DevOps team is wrong. There is no such thing as a classic DevOps team. DevOps is instead something that many other teams do in addition to their other jobs.

That being said, go ask someone what their job is in an organization. I’m willing to bet that a lot of people will tell you they’re on the “DevOps Team”. I know this because someone did a report, which I wrote about here, and it includes responses from the “DevOps” team. Which, according to the classic definition, is wrong. Right?

Well, almost. See, this is where a tweet of mine comes into play.

“Pure” DevOps is hard to manage. It involves organizational shifts. It pisses people off because it’s hard to track metrics. You can’t track a person that does some traditional stuff and some of that new DevOps stuff. Where does that part of their job end up on a report? Putting someone in a team or a silo is almost as much for the purposes of managing that person as it is for them to do their job. If I put you in a silo, I know what you do. Or, at the very least, I can assign you tasks and responsibilities that you should be doing and grade you on those. If your “silo” is a principle and not a team, it’s much harder to grade how effectively you integrated with the developers to deliver services. It can be tracked, but not as easily as a checkbox.

Likewise, people fear change. So, instead of putting their people into roles that cross functional barriers and reorganizing the workflows, they just take the young people that are talking about the “new way” of doing things and put them in a team together. They slap DevOps on the door and it’s done. We do DevOps now. Or, worse yet, they take the old infrastructure teams, move a few people off of them into a new team, and tell them to figure out what to do while they’re repainting the team name on the door. This has rightly been called “DevOps washing” by a lot of people.

But what happens when that team starts Devving the Ops? Do they look at the enshrined principles of The Holy Book of DevOps and start trying to change organizational culture a little bit at a time to get the happy ending from The Phoenix Project? Do they eliminate the Brents of the world and give the security teams peace of mind?

Or, do they carve out their own little fiefdoms and start behaving like an integrated team with responsibilities and politics? Do they do things like deploy new projects to the cloud with little support from other teams, with the idea that they now “own” that workflow and can control how it’s used and how their team is viewed? If you read the article above with the report from Veriflow, you’ll find that a lot of organizations are seeing this second behavior.

Just as much as people fear proper change, they also get greedy in their new roles and want to be important to the business. And taking ownership of all the new initiatives, like cloud development, is a great way to be important. And, as much as The Phoenix Project preaches that security should be integrated into the DevOps workflow, you still have half of the 330 respondents to the above survey saying there is an increase in security threats to their new initiatives in public cloud.

Redefining DevOps

In a way, this “definition” of DevOps is like the title of this post. I’m sure more than a few of you bristled at the use of on-premise. Because, in today’s IT landscape, we’re fighting a losing battle against a premise. When you refer to something happening in a location, you say “on-premises”. If you say “on-premise”, you should be referring to an idea or concept. And yet, so many people in Silicon Valley say “on-premise” when referring to “on site” or “on location”. It’s grammatically wrong. But it sounds hip. It’s not the classical definition of the word, and yet that word is slowly being redefined to mean what people are using it to mean. It literally happened with “literally”.

For those railing against the DevOps Washing that’s going on, ask yourself this question: Why? If the pure principles of DevOps are so much better and easier, why is everyone just slapping DevOps on existing teams or reforming other people into teams and running with the DevOps idea instead of following the rules as laid down by the sacred DevOps texts?

It could be that all organizations that are doing it this way are wrong. But are there more organizations doing it the proper way? Or is the lazy way more prevalent? I don’t know the answer, but given the number of products I see aimed at “the DevOps team” or the number of people that have given me feedback about how their organization’s DevOps teams display the same behaviors I talked about in my other blog post, I’d say there are more bad apples than purists out there.

So, what does this all mean for DevOps? Are we going to go on pointing and laughing at the DevOps-In-Name-Only crowd? Are we going to silently moan about how Real DevOps doesn’t happen and that we need to stay pure to the ideals? Or are we going to step back and realize that, just like every other technology or organizational shift that has ever occurred, nothing really gets implemented in its purest form? Instead of complaining that those not doing it the “proper” way are wrong, let’s examine why things get done the way they do and figure out how to fix it.

If businesses are implementing DevOps teams to execute the things they need done, find out why it has to be a dedicated team. Maybe they’re doing it wrong, or maybe they’ve stumbled across something that wasn’t included in the strictest definitions of DevOps. If people are giving work to those teams to accomplish and excluding other functional teams at the same time, don’t just wag your finger at them and tell them that’s not the “right way”. Find out what enabled that team to violate the ideas in the first place. Maybe the DevOps Team is responsible for all cloud deployments. Maybe they want some control over things instead of just a nebulous connection to an ideal.

Tom’s Take

DevOps in theory is a great thing. DevOps as presented in The Phoenix Project is a marvelous idea. But we all know that when theory meets reality, what we get is something different than we expected. It’s not unlike von Moltke’s famous quote, “No plan survives first contact with the enemy.” In theory, DevOps is pure and works like it should. But we’re seeing practice differ greatly from theory. The results are usually the same but the paths are radically different. And for the purists out there, if you don’t want DevOps to suffer the same fate as on-premise, you need to start asking yourself the same hard questions we are supposed to ask organizations as they start to deploy these ideas.

by networkingnerd at March 07, 2019 03:22 PM


Interview with Juniper Networks Ambassador Rob Jeffery

Had a chance to sit down with fellow Ambassador Rob Jeffery at the Juniper NXTWORK 2018 conference in Las Vegas. Rob is the Technical Director and CTO at Next Gen Security based out of the UK, with a heavy focus on bringing emerging network and security products into the marketplace. We discussed his involvement in …

by Stefan Fouant at March 07, 2019 01:58 PM Blog (Ivan Pepelnjak)

Building Network Automation Source-of-Truth (Part 2)

In the first blog post of this series I described how you could start building the prerequisite for any network automation solution: the device inventory.

Having done that, you should know what is in your network, but you still don’t know how your network is supposed to work and what services it is supposed to provide. Welcome to the morass known as building your source-of-truth.

Read more ...

by Ivan Pepelnjak at March 07, 2019 08:28 AM

March 06, 2019

Honest Networker Blog (Ivan Pepelnjak)

Anyone Using Intel Omni-Path?

One of my subscribers sent me this question after watching the latest batch of Data Center Fabrics videos:

You haven’t mentioned Intel's Omni-Path at all. Should I be surprised?

While Omni-Path looks like a cool technology (at least at the whitepaper level), nobody ever mentioned it (or Intel) in any data center switching discussion I was involved in.

Read more ...

by Ivan Pepelnjak at March 06, 2019 07:34 AM


March 05, 2019

Dyn Research (Was Renesys Blog)

Last Month in Internet Intelligence: February 2019

This post is presented in conjunction with The Internet Society.

February was a surprisingly quiet month for major Internet disruptions. In contrast to previous months, we observed few full outages or multi-day disruptions in the Oracle Internet Intelligence Map during the month. As always, there were a number of brief and unattributed disruptions observed over the course of the month, but the issues highlighted below were related to fiber cuts (and repairs) and likely problems with satellite connectivity. And while not yet a visible disruption, reports surfaced in February that Russian authorities and major Internet providers are planning to disconnect the country from the global Internet as part of a planned experiment.


Kicking off the month, Burkina Faso experienced brief partial disruptions to its Internet connectivity on February 1 & 2, as shown in the Country Statistics graphs below. The disruptions are also evident in the Traffic Shifts graphs below for AS25543 (Onatel), which is the country’s National Office of Telecommunications, holding a monopoly on fixed-line telecommunications there. Facebook posts from Onatel (February 1 & 2) indicated that road work between the towns of Sabou and Boromo had resulted in a fiber cut, and subsequent posts made several hours later on both days reported that the resulting disruptions had been addressed.

In January’s post, we reported that a January 20th failure in the Tonga Cable had disrupted Internet connectivity to the island nation. On February 2, it was reported that full connectivity had been restored to Tonga. Since the country is GMT+13, the reports of restored connectivity align with the improvements seen in the Traffic Shifts graphs below for AS132579 (Tonga Cable Limited). During the period before the cable repairs were complete, Internet connectivity for Tonga had shifted to higher latency satellite links. As the graphs below illustrate, returning the Tonga Cable to service resulted in a significant decrease in latency as well as an increase in completed traceroutes to endpoints in this network.

On February 10 & 11, Internet connectivity in Zambia experienced several multi-hour disruptions, as shown in the Country Statistics graphs below. (Interestingly, we observed a marked increase in the DNS Query Rate metric as we saw drops in the Traceroute Completion Rate metric. While we would generally expect to see a correlated decrease in DNS request traffic during an Internet disruption, this increase may be due to applications and local resolvers making repeated resolution attempts. Due to the connectivity disruption, resolutions may fail to complete successfully, resulting in another attempt.)
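That retry-amplification explanation can be sketched with a toy model. The function and numbers below are purely illustrative assumptions, not how the Internet Intelligence Map actually computes its DNS Query Rate metric:

```python
# Toy model: why DNS query volume can *rise* during an outage.
# A lookup that succeeds costs one query; a lookup that fails is
# retried several times by applications and local resolvers, so each
# failure generates the initial query plus every retry.

def observed_queries(lookups, failure_rate, retries_on_failure):
    """Total queries an authoritative server would see."""
    failed = lookups * failure_rate
    succeeded = lookups - failed
    return succeeded + failed * (1 + retries_on_failure)

normal = observed_queries(1000, failure_rate=0.01, retries_on_failure=3)
outage = observed_queries(1000, failure_rate=0.90, retries_on_failure=3)

# Fewer lookups complete during the outage, yet the query count climbs.
print(normal, outage)
```

With these made-up parameters, the same thousand lookups generate several times more queries during the disruption, matching the inverse movement of the two metrics described above.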

Drilling down to a network level, we observed that networks downstream of AS30844 (Liquid Telecom), including AS7420 (Zamnet), AS36959 (AfriConnect Zambia), and AS37214 (Microlink Technologies), showed similar concurrent disruptions, as can be seen in the Traffic Shifts graphs below for those networks. Hai Zambia (a retail brand of Liquid Telecom) posted notices on both Facebook and Twitter, noting:

“Please be advised that we are currently experiencing a fault on our network which has resulted in Service degradation on all our services. Our engineers are working to rectify the problem and we would like to sincerely apologise for the inconvenience caused.”

Closing out February, my colleague Doug Madory suggested that observed changes in Traffic Shifts graphs for networks in Vietnam may be related to a cable cut. As seen in the graphs below for AS45899 (VNPT) and AS7552 (Viettel), during the latter half of the day (GMT) on February 27, latencies to both networks spiked alongside shifts in the relative percentages of upstream providers for traceroutes to endpoints within those networks. Further research by Doug found that HiNet (a service of Taiwanese provider Chunghwa Telecom) had posted a status update confirming damage to the Asia-Pacific Gateway (APG) cable at approximately 13:30 GMT, which corresponds with the events seen in the graphs.


During the latter half of the day (GMT) on February 18, the Central Pacific island republic of Kiribati experienced an Internet disruption, shown in the Country Statistics graphs below. Lasting from just before noon until midnight, it was likely caused by an issue with satellite connectivity provided by O3b, as the Traffic Shifts graphs below show a total loss of completed traceroutes through that path, along with increased latency, for the duration of the disruption. Currently reliant on satellite connectivity, Kiribati plans to connect to the Southern Cross NEXT submarine cable, which will connect the United States to Australia and New Zealand, when it goes live in 2021.

Just a few days later, Tuvalu experienced a disruption between 17:00 and 19:30 GMT on February 22. As the Traffic Statistics graphs below illustrate, traceroutes failed to reach targets on the island nation during the period of disruption, but the DNS Query Rate metric spiked during that same time. (See above for thoughts on why this may happen.) The Traffic Shifts graphs for AS23917 (Government of Tuvalu / Tuvalu Telecom) show that nearly all traceroutes reach that network through Asia Satellite Internet eXchange, and that a significant decrease in completed traceroutes was observed during the period of disruption. Tuvalu is not currently connected to any international submarine cables, although it is reportedly interested in connecting to the Tui Samoa cable.

On February 27, the British Indian Ocean Territory experienced several brief but significant disruptions to its Internet connectivity, shown in the Country Statistics graphs below. It appears that these disruptions are related to shifts in satellite connectivity providers for Sure Telecom, the sole commercial Internet connectivity provider in the territory. As the Traffic Shifts graphs below illustrate, traceroutes historically reached Sure Telecom through Cobbett Hill Earth Station Limited, but after a brief outage, started going through Intelsat instead. It isn’t known if this transition was due to a change in business relationships, a failure of Cobbett Hill’s service, or some other reason.


A proposed law introduced in the Russian Parliament in December 2018 mandates that Russian Internet providers should ensure that the “Russian Internet” will continue to function independently should access be cut off by other countries, and that all Russian Internet traffic can be routed through exchange points approved or managed by Roskomnadzor (the “Federal Service for Supervision of Communications”).

According to a ZDNet article, Russian authorities and Internet service providers are planning an experiment to test whether these mandates are technically possible, and to gather insight and provide feedback and modifications to this proposed law. While no specific date for the experiment was cited, it is believed that it will take place before April 1, the deadline for submitting amendments to the proposed law.

If and when this isolation experiment takes place, we expect that it will be visible in the Country Statistics graphs for Russia in the Internet Intelligence Map. It is likely that the Traceroute Completion Ratio metric will fall to (near) zero for the duration of the test, and if the Russian networks withdraw from the global routing tables, then this will also be reflected in the BGP Routes metric. We will likely see a sharp drop, if not complete outage, in the DNS Query Rate metric, as DNS resolvers in Russia will be unable to contact our authoritative DNS servers. It is not clear whether routing traffic through approved/managed exchange points will be part of this experiment, or a separate one, and its impact within the Internet Intelligence Map will depend on whether these exchange points add any additional latency to traffic flows and/or interfere with traceroute or DNS traffic in any way.

In a November 2017 blog post (“The Migration of Political Internet Shutdowns”), we looked at the risk of Internet disconnection for countries around the world, based on the number of domestic network providers (autonomous systems) that have direct connections, visible in routing, to international (foreign) providers. (The 2017 post was itself an update to earlier 2014 and 2012 posts on the same topic.) Within this post, we classified Russia as “resistant” to Internet shutdown, because it had over 400 autonomous systems with connections to international networks. A review of Russia’s standing at the end of February indicates that they remain in the “resistant” category.
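The counting exercise behind that classification can be sketched as follows. The adjacency list, AS numbers, and country mapping here are invented placeholders, not the routing data used in the report:

```python
# Sketch of the shutdown-risk metric described above: count domestic
# ASes that have a direct, routing-visible connection to a foreign AS.
# More such "frontier" ASes means disconnection is harder to order.

def frontier_as_count(adjacencies, country_of, country):
    """Number of ASes in `country` with at least one foreign neighbor."""
    frontier = set()
    for a, b in adjacencies:
        if country_of.get(a) == country and country_of.get(b) != country:
            frontier.add(a)
        if country_of.get(b) == country and country_of.get(a) != country:
            frontier.add(b)
    return len(frontier)

# Placeholder data: AS1 and AS2 are domestic, AS3 is foreign.
country_of = {"AS1": "RU", "AS2": "RU", "AS3": "DE"}
adjacencies = [("AS1", "AS3"), ("AS1", "AS2")]
print(frontier_as_count(adjacencies, country_of, "RU"))  # 1
```

In the post's terms, a country lands in a risk category based on where this count falls; Russia's 400-plus internationally connected ASes put it well inside "resistant".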


While it was encouraging to not see significant Internet disruptions in February caused by power outages, severe weather, DDoS attacks, or government direction, this is not likely to be a long-term trend. Power outages and severe weather are certain to impact Internet connectivity again in the future, and as DDoS attacks grow more powerful, it is likely that they will again disrupt connectivity at a national level. And while it is likely that we will see more nationwide government-directed Internet disruptions occur (like the exam-related outages seen in Iraq in the past), such activity is also becoming more surgical in nature, targeting selected sites and applications rather than taking the entire country offline, such as the blocking of Twitter, Soundcloud, Bing, YouTube, and Google experienced in Venezuela on February 27.

by David Belson at March 05, 2019 02:19 PM Blog (Ivan Pepelnjak)

March 04, 2019

Networking Now (Juniper Blog)

Security Monoculture Leads to Failure – Diversify and Scale with Juniper Connected Security

Information security is inextricable from all aspects of IT and it must include everything from cloud-based advanced threat prevention to physical switches that automatically quarantine infected devices.  Automating IT in order to simplify it, make it repeatable and allow multiple products to form a whole greater than any individual component, is the basis of Juniper Connected Security.


Juniper Connected Security combines built-in detection of threats and the enforcement of policy with the capabilities of our partners to safeguard users, applications and infrastructure against advanced threats. Combining automation with a layered approach to defense provides our customers with the capabilities to answer both extant and emerging, internal and external threats. 


In the world of information security, some things are as much a certainty as death and taxes. These truisms have been discussed ad nauseam by security experts, vendors and even governments – the information security arms race, too much data, too many workloads and not enough skilled staff to manage it all. Add in a few regulations and organizations are at security overload.


Traditionally, these perpetual challenges have been used to push information security products by playing on our fears. The world is full of bogeymen, it’s unpredictable and you probably don't have enough skilled people to defend your organization. Be afraid!

by SamanthaMadrid at March 04, 2019 04:40 PM

Ethan Banks on Technology

Improve Productivity. Shut Off Notifications. (YouTube)

Here’s a short car video where I recommend shutting off notifications as a way to increase productivity. Spoiler alert. That’s pretty much the summary of the entire video, so you can save yourself the four minutes. Or…watch it to get the nuance. I’ll be okay either way. I’m not making money on YouTube ads.

<iframe allowfullscreen="true" class="youtube-player" height="405" src=";rel=1&amp;fs=1&amp;autohide=2&amp;showsearch=0&amp;showinfo=1&amp;iv_load_policy=1&amp;wmode=transparent" style="border:0;" type="text/html" width="720"></iframe>

by Ethan Banks at March 04, 2019 03:30 PM Blog (Ivan Pepelnjak)

Upcoming Events and Webinars (March 2019)

We’re starting the Spring 2019 workshop season in March with open-enrollment workshops in Zurich (Switzerland). It was always hard to decide which workshop to do (there are so many interesting topics), so we’ll do two of them in the same week:

Rachel Traylor will continue her Graph Theory webinar on March 7th with a topic most relevant to networking engineers: trees, spanning trees and shortest-path trees, and I’ll continue with two topics I started earlier this year:

Read more ...

by Ivan Pepelnjak at March 04, 2019 07:28 AM

Potaroo blog

A quick look at QUIC

Quick UDP Internet Connection (QUIC) is a network protocol initially developed and deployed by Google, and now being standardized in the Internet Engineering Task Force. In this article we’ll take a quick tour of QUIC, looking at what goals influenced its design, and what implications QUIC might have on the overall architecture of the Internet Protocol.

March 04, 2019 03:00 AM


March 01, 2019

My Etherealmind Blog (Ivan Pepelnjak)

Smart NICs and Related Linux Kernel Infrastructure

A while ago we did a podcast with Luke Gorrie in which he explained why he’d love to have simple, dumb, and easy-to-work-with Ethernet NICs. What about the other side of the coin – smart NICs with their own CPU, RAM and operating system? Do they make sense, when and why would you use them, and how would you integrate them with Linux kernel?

We discussed these challenges with Or Gerlitz (Mellanox), Andy Gospodarek (Broadcom) and Jiri Pirko (Mellanox) in Episode 99 of Software Gone Wild.

Read more ...

by Ivan Pepelnjak at March 01, 2019 08:15 AM


February 28, 2019 Blog (Ivan Pepelnjak)

Sample Solution: Automating L3VPN Deployments

A long while ago I published my solution for automated L3VPN provisioning… and I’m really glad I can point you to a much better one ;)

Håkon Rørvik Aune decided to tackle the same challenge as his hands-on assignment in the Building Network Automation Solutions course and created a nicely-structured and well-documented solution (after creating a playbook that creates network diagrams from OSPF neighbor information).

Want to be able to do something similar? You missed the Spring 2019 online course, but you can get the mentored self-paced version with Expert Subscription.

by Ivan Pepelnjak at February 28, 2019 07:35 AM

The Networking Nerd

DevOps is a Silo

Silos are bad. We keep hearing how IT is too tribal and broken up into teams that only care about their swim lanes. The storage team doesn’t care about the network. The server teams don’t care about the storage team. The network team is a bunch of jerks that don’t like anyone. It’s a vicious cycle of mistrust and playground cliques.

Except for DevOps. The savior has finally arrived! DevOps is the silo-busting mentality that will allow us all to get with the program and get everything done right this time. The DevOps mentality doesn’t reinforce teams or silos. It focuses on the only pure thing left in the world – committing code. The way of the CI/CD warrior. But what if I told you that DevOps was just another silo?

Team Players

Before the pitchforks and torches come out, let’s examine why IT has been so tribal for so long. The silo mentality came about when we started getting more specialized with regards to infrastructure. Think about the original compute resources – mainframes. There weren’t any silos with mainframes because everyone pretty much had to know what they were doing with every part of the system. Everything was connected to the mainframe. The mainframe itself was the silo.

When we busted the mainframe apart and started down the road of client/server computing the hardware started becoming more specialized. Instead of one giant machine we had lots of little special machines everywhere. The more we deconstructed the mainframe, the more we needed to focus. The direct-attached storage became NAS and eventually SAN. The computer got bigger and bigger and eventually morphed into a virtualized hypervisor. The network exists to connect everything to the rest of the world, and as technology wore on the network became the transport for the infrastructure to talk to everything else.

Silos exist because you had to have specialized knowledge to operate your specialized infrastructure. Sure, there could be some cross training at the lower levels of administration. But once you got into really complex topics like disk geometry optimization or route redistribution, the ability for a layperson to understand what was going on was shot. Each silo exists to reinforce its own infrastructure. Each silo has its norms and its schedules. The storage team will never lose data. The network always has to be available.

Even as these silos got crammed together and subsumed into new job roles, the ideas behind them stayed consistent. Some of the storage admin’s job roles combined with the virtualization team to be some kind of a hybrid. The networking team has been pushed to adopt more agile development methodologies like automation and orchestration. Through it all, the silos were coming down as people pushed the teams to embrace more software focused on the infrastructure. That is, until DevOps burst onto the scene.


The DevOps tribe has a mantra: “Move Fast. Break Things. Ship. Ship. SHIP!” Maybe not those exact words but something very similar. DevOps didn’t come from mainframes. It didn’t even come from the early days of client/server. DevOps grew out of a time when everything was blown apart and on the verge of being moved into the cloud. These new DevOperators didn’t think about infrastructure as a team or a tribe. Instead, it was an impediment to shipping code.

When you work in software, moving fast and breaking things works. Because you’re pushing the limits of what you can do. You’re focused on features. You want new shiny things. Stability can wait as long as the next code commit is right around the corner. Who cares about what you’ve been doing.

In order to have the best experience with Software X, please turn on Automatic Updates so we can push the code as fast as our commits will allow.

Sound familiar? Who cares about disk geometry or route reflectors. Make my stuff work! Your infrastructure supports all my awesome code. I write the stuff that pays your salary. This place would be out of business if it wasn’t for me!

Granted that’s a little extreme, but the mentality is the same. Infrastructure exists to be consumed. IT is there to support the mission of Moving Fast, Breaking Things, and Shipping. It’s almost like a tribal behavior. Everyone has the same objective – ALL THE COMMITS!

Move fast and break things is the exact opposite of the storage and networking teams. You really don’t want to be screaming along at 800Mph when deploying a new SAN or trying to get iBGP stood up. You want careful. Calm. Collected. You’re working with a whole system that’s built on a house of cards. Unlike DevOps, breaking a thing in a SAN or on the edge of a network could impact the entire system, not just one chat module.

That’s why networking and storage admins are so methodical. I harken back to some of my days in network engineering. When the network was running slow or the storage array was taxed, it took time to get data back. People were irritated but they got used to the idea of slowness. But if those systems ever went down, it was all-hands-on-deck panic! Contrast that with the mentality of the DevOps tribe. Who cares if it’s kind of broken right now? We need to ship the next feature or patch.

DevOps isn’t a silo buster. It’s just a different kind of tribal silo. The DevOps folks all have similar mentalities and view infrastructure in the same way. Cloud appeals to them because it minimizes infrastructure and gives them the tools they need to focus on developing. Cloud sprawl can easily happen when planning doesn’t occur. When specialized groups get together and talk about what they need, there is a reduction in consumed resources. Storage admins know how to get the most out of what they have. They don’t just spin up another bucket and keep deploying.

Tom’s Take

If you treat DevOps like a siloed tribe you’ll find their behavior is much easier to predict and work with. Don’t look at them as a cross-functional solution to all your problems. Even if you deploy all your assets to the cloud you’re going to need specialized teams to manage them once the infrastructure grows too big to manage by movement. Specialization isn’t the result of bad planning or tribalism. Instead, those specialized teams developed because of the need for deeper understanding. Just like DevOps developed out of a need to understand rapid deployment and fast-moving consumption of infrastructure. In time, the next “solution” to the DevOps problem will come along and we’ll find as well that it’s just another siloed team.

by networkingnerd at February 28, 2019 04:50 AM

February 27, 2019 Blog (Ivan Pepelnjak)

More Thoughts on Vendor Lock-In and Subscriptions

Albert Siersema sent me his thoughts on lock-in and the recent tendency to sell network device (or software) subscriptions instead of boxes. A few of my comments are inline.

Another trend in the industry is to convert support contracts into subscriptions. That is, the entrenched players seem to be focusing more on that business model (too). In the end, I feel the customer won't reap that many benefits, and you probably will end up paying more. But that's my old grumpy cynicism talking :)

While I agree with that, buying a subscription instead of owning a box (and deprecating it) also makes it easier to persuade the bean counters to switch the gear because there’s little residual value in existing boxes (and it’s easy to demonstrate total-cost-of-ownership). Like every decent sword this one has two blades ;)

Read more ...

by Ivan Pepelnjak at February 27, 2019 07:30 AM
