December 07, 2019

The Networking Nerd

SD-WAN Squares and Perplexing Planes

The latest arcane polygon is out in the SD-WAN space. Normally, my fortune telling skills don’t involve geometry. I like to talk to real people about their concerns and their successes. Yes, I know that the gardening people do that too. It’s just that no one really bothers to read their reports and instead makes all their decisions based on boring wall art.

Speaking of which, I’m going to summarize that particular piece of art here. Note this isn’t the same for copyright reasons but close enough for you to get the point:


So, if you can’t tell by the colors here, the big news is that Cisco has slipped out of the top Good part of the polygon and is now in the bottom Bad part (denoted by the red) and is in danger of going out of business and being the laughing stock of the networking community. Well, no, not so much that last part. But their implementation has slipped into the lower part of the quadrant where first-stage startups and cash-strapped companies live and wish they could build something.

Cisco released a report rebutting those claims and it talks about how Viptela is a huge part of their deployments and that they have mindshare and marketshare. But after talking with some colleagues in the networking industry and looking a bit deeper into the issue, I think Cisco has the same problem that Boeing had this year. Their assurances are inconsistent with reality.

A WANderful World

When you deploy Cisco SD-WAN, what exactly are you deploying? In the past, the answer was IWAN. IWAN was very much an initial attempt to create SD-WAN. It wasn’t perfect and it was soon surpassed by other technologies, many of which were built by IWAN team members who left to found their own startups. But Cisco quickly realized they needed to jump into the fray and get some help. That’s when they bought Viptela.

Here’s where things start getting murky. The integration of Viptela SD-WAN has been problematic. The initial sales model after the acquisition was to continue to sell Viptela vEdge to the customer. The plan was then to integrate the software on a Cisco ISR, then to integrate the Viptela software into IOS-XE. They’ve been selling vEdge boxes for a long while and are supporting them still. But the transition plan to IOS-XE is in full effect. The handwriting on the wall is that soon Cisco will only offer IOS-XE SD-WAN for sale, not vEdge.

Flash forward to 2019. A report is released about the impact and forward outlook for SD-WAN. It has a big analyst name attached to it. In that report, they leave out the references to vEdge and instead grade Cisco on their IOS-XE offering which isn’t as mature as vEdge. Or as deployed as vEdge. Or, as stated by even Cisco, as stable as vEdge, at least at first. That means that Cisco is getting graded on their newest offering. Which, for Cisco, means they can’t talk about how widely deployed and stable vEdge is. Why would this particular analyst firm do that?

Same is Same

The common wisdom is that Gartner graded Cisco on the curve because the sales message is that IOS-XE is the way and the future of SD-WAN at Cisco. Why talk about what came before when the new hot thing is going to be IOS-XE? Don’t buy this old vEdge when this new ISR is the best thing ever.

Don’t believe me? Go call your Cisco account team and try to buy a vEdge. I bet you get “upsold” to an ISR. The path forward is to let vEdge fade away. So it only makes sense that you should be grading Cisco on what their plans are, not on what they’ve already sold. I’ll admit that I don’t put together analyst reports with graphics very often, so maybe my thinking is flawed. But if I call your company and they won’t sell me the product that you want me to grade you on, I’m going to guess I shouldn’t grade you on it.

So Cisco is mad that Gartner isn’t grading them on their old solution and is instead looking at their new solution in a vacuum. And since they can’t draw on the sales numbers or the stability of an existing solution that you aren’t going to be able to buy without a fight, they slipped down in the square to a place that doesn’t show them as the 800-lb gorilla. Where have I heard something like this before?

Plane to See

The situation that immediately sprung to mind was the issue with Boeing’s 737-MAX airliner. In a nutshell, Boeing introduced a new airliner with a different engine and configuration that changed its flight profile. Rather than try to get the airframe recertified, which could take months or years, they claimed it was the same as the old one and just updated training manuals. They also didn’t disclose there was a new software program that tried to override the new flight characteristics and that particular “flaw” caused the tragic crashes of two flights full of people.

I’m not trying to compare analyst reports to tragedies in any way. But I do find it curious that companies want their new stuff to be treated just like their old stuff with regards to approvals and regulation and analysis. They know that the new product is going to have issues and concerns but they don’t want you to count those because remember how awesome our old thing was?

Likewise, they don’t want you to count the old problems against them. Like the Pinto or the Edsel from Ford, they don’t want you to think their old issues should matter. Just look at how far we’ve come! But that’s not how these things work. We can’t take the old product into account when trying to figure out how the new one works. We can’t assume the new solution is the same as the old one without testing, no matter how much people would like us to do that. It’s like claiming the Space Shuttle was as good as the Saturn V rocket because it went into space and came from NASA.

If your platform has bugs in the first version, those count against you too. You have to work it all out and realize people are going to remember that instability when they grade your product going forward. Remember that common wisdom says not to install an operating system update until the first service patch. You have to consider that reputation every time you release a patch or an update.

Tom’s Take

The SD-WAN MQ tells me that Cisco isn’t getting any more favors from the analysts. The marketing message not lining up with the install base is the heart of a transition away from hardware and software that Cisco doesn’t own. The hope that they could just smile, shrug their shoulders, and count on no one catching on has been dashed. Instead, Cisco now has to realize they’re going to have to earn that spot back through good code releases and consistent support and licensing. No one is going to give them any slack with SD-WAN like they have with switches and unified communications. If Cisco thinks that they’re just going to be able to bluff their way through this product transition, that idea just won’t fly.

by networkingnerd at December 07, 2019 05:16 PM

December 06, 2019

Packet Pushers

Understanding Scale Computing HC3 Edge Fabric #TFD20

With HC3 Edge Fabric, Scale Computing has created a networking architecture that reduces the hardware requirement for the edge computing cluster. There's one less switch to worry about. There are fewer Ethernet NIC ports required on the hosts. At the same time, Scale Computing isn't wimping out on resiliency.

The post Understanding Scale Computing HC3 Edge Fabric #TFD20 appeared first on Packet Pushers.

by Ethan Banks at December 06, 2019 08:49 PM

What I Want In A Root Cause Analysis Tool #TFD20

Your non-technical boss doesn't pop by your desk and ask you why there are excessive OutDiscards accumulating on Et4/0/36. You get asked why the network is slow, or why the CRM application is down. Those questions are context. Does your root cause analysis software have any sense of that context?

The post What I Want In A Root Cause Analysis Tool #TFD20 appeared first on Packet Pushers.

by Ethan Banks at December 06, 2019 08:12 PM

 Blog (Ivan Pepelnjak)

Video: Cloud Models, Layers and Responsibilities

In late spring 2019, Matthias Luft and Florian Barth presented a short webinar on cloud concepts, starting with the obvious topic: cloud models, layers, and responsibilities.

You need Free Subscription to watch the video, and the Standard Subscription to register for a deeper dive into cloud security with Matthias Luft (next live session on December 10th: Identity and Access Management).

by Ivan Pepelnjak at December 06, 2019 08:05 AM

XKCD Comics

December 05, 2019

Packet Pushers

What’s The Latest Service Provider Tech Worth Learning? – Video

On October 27, 2019 the Packet Pushers live streamed a “Weekend Edition” discussion. If you’ve watched that live stream, you’ve already seen this. If not, read on. In this excerpt, the Packet Pushers talk with Nick Buraglio about what technologies emerging in the service provider networking space are worth knowing. Of course, we discuss segment […]

The post What’s The Latest Service Provider Tech Worth Learning? – Video appeared first on Packet Pushers.

by The Video Delivery at December 05, 2019 04:33 PM

 Blog (Ivan Pepelnjak)

Disaster Recovery and Failure Domains

One of the responses to my Disaster Recovery Faking blog post focused on failure domains:

What is the difference between supporting L2 stretched between two pods in your DC (which everyone does for seamless vMotion), and having a 30ms link between these two pods because they happen to be in different buildings?

I hope you agree that a single broadcast domain is a single failure domain. If not, let’s agree to disagree and move on - my life is too short to argue about obvious stuff.

Read more ...

by Ivan Pepelnjak at December 05, 2019 07:42 AM

December 04, 2019

 Blog (Ivan Pepelnjak)
XKCD Comics

December 03, 2019

 Blog (Ivan Pepelnjak)

You Still Need a Networking Engineer for a Successful Cloud Deployment

You’ve probably heard cloudy evangelists telling CIOs how they won’t need the infrastructure engineers once they move their workloads into a public cloud. As always, whatever sounds too good to be true usually is. Compute resources in public clouds still need to be managed, someone still needs to measure application performance, and backups won’t happen by themselves.

Even more important (for networking engineers), network requirements don’t change just because you decided to use someone else’s computers:

Read more ...

by Ivan Pepelnjak at December 03, 2019 08:11 AM

December 02, 2019

 Blog (Ivan Pepelnjak)

Questions to Ask About Product Using Overhyped Technology

I stumbled upon a great MIT Technology Review article (warning: regwall ahead) with a checklist you SHOULD use whenever considering a machine-learning-based product.

While the article focuses on machine learning, at least some of the steps in that list apply to any new product that claims to use a brand-new technology in a particular problem domain, like overlay virtual networking with blockchain:

Read more ...

by Ivan Pepelnjak at December 02, 2019 09:13 AM

XKCD Comics

November 30, 2019

 Blog (Ivan Pepelnjak)

Worth Reading: Resilience Engineering

Starting from my faking disaster recovery tests blog post, Terry Slattery wrote a great article delving into the intricacies of DR testing, types of expected disasters, and resilience engineering. As always, a fantastic read from Terry.

by Ivan Pepelnjak at November 30, 2019 07:30 AM

November 29, 2019

The Networking Nerd

Fast Friday - Perry Mason Moments

It’s the Thanksgiving holiday weekend in the US which means lots of people discussing things with their relatives. And, as is often the case, lots of arguments. It’s the nature of people to have a point of view and then to want to defend it. And it’s not just politics or other divisive topics. We see it all the time in networking too.

EIGRP vs OSPF. Cisco vs Juniper. ACI vs NSX. You name it and we’ve argued about it. Every viewpoint has a corresponding counterpart. Yes, there are good points for using one versus the other. But there are also times when every piece of factual information doesn’t matter because we “know” the right answer.

It’s those times when we run into what I call the “Perry Mason Problem”. It’s a reminder of the old Perry Mason TV show when the lawyer in the title would win a case with a carefully crafted statement that just ends any arguments. It’s often called a Wham Line or an Armor-Piercing Question. Basically, Mr. Mason would ask a question or make a statement that let all the air out of the argument. And often it would result in him winning the case without any further discussion.

Most people that argue keep searching for that magic Perry Mason moment. They want to win and they want to do it decisively. They want to leave their opponent speechless. They want to drop a microphone and walk away victorious.

And yet, that almost never happens in real life. No one is swayed by a single statement. No one changes their mind because of a question no matter how carefully crafted. Sure, it can add to them deciding to change their mind in the long run. But that process happens in the person. It doesn’t happen because of someone else.

So, if you find yourself in the middle of a heated discussion, don’t start looking for Perry Mason moments. Instead, think about why you’re trying to change someone’s mind. Rather than trying to win the argument, it’s better to consider your position and understand why you’re trying to win. I bet a little introspection will do a lot more good than looking for that wham line. You may not necessarily agree with their perspective, but I bet you’ll have an easier time living with that than trying to change the mind of someone who’s set against it.

by networkingnerd at November 29, 2019 09:54 PM

[4/4] Composition & Service Function Chaining in Network Service Meshes

Following the Path

Welcome to part four of this series. In this final part, we will explore our options for networking a composed application built from a de-composed monolith or a set of microservices.

Here is a logical set of options:

  1. Proxy: Have a network kernel, ADC or proxy for every component handle implementation of the service chain. Sidecars quickly solve the issue, but they double the component count within a mesh. Proxies work well in public and private clouds, but commercial products may incur licence costs as well as higher resource utilisation to cover the sidecar container.

  2. Language-specific libraries: these wrap your application packets in an NSH-handling outer encapsulation. No sidecar required, no modification of a host. This adds complexity to software development in terms of modified socket libraries, but a well-designed and implemented library does not expose that complexity. All your code has to do is accept connections through a modified socket library. This works in the cloud, providing security policies and routing domains allow it.

  3. Overlay: Add flow data to forwarding entities. Let’s face it, this isn’t going to happen in a cloud environment unless you’ve implemented a full overlay. An Open vSwitch (OVS) overlay network would work, along with technology like Tungsten Fabric and its commercial counterpart.

    For this to work, we need to enumerate function components (by IP address, socket, protocol, call) and map them to service function chains as per request intent. At the point of the request entering the network domain (i.e. the edge), classification is made on the request and a path is chosen based on some arbitrary metrics like load, response time, locality, whatever.

    Flows: flows can be pre-programmed at the point of network domain entry.

    Header: a header can be generated at the point of network domain entry.

    Once the first hop processes its task, the payload follows the service chain and then can return directly (asymmetric) or return through the same path (symmetric) to the source.

    At each touch point, packet headers are re-written regardless. With NSH, more manipulation is done, for example the Service Index within the SFC being traversed, but the network still doesn’t hold state, and I see this as the way forward.

Mapping Functions to Applications

I’ve been thinking about a topology mapping service for this kind of thing for a while, one that would ultimately allow a service chain to form, rendering an application from its constituent components. Each component would need to follow some basic rules: single in/out, and agreed data serialisation and structure. Once that’s done, it’s possible to drag and drop components to form an application service function chain. Each component could be collected from the servicing nodes, and by indexing each node in the graph (the network of service nodes), it’s possible to automatically build out the NSH header on ingress to the proxy, expanding capabilities from a one-to-one mapping of, say, verbs on a REST application to a function or microservice, to one that contains both one-to-one and chained functionality.
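As a rough illustration of that indexing idea, here is a minimal sketch (all names hypothetical, not a real API) of a registry that assigns each registered chain an NSH-style Service Path Identifier (SPI) and an initial Service Index (SI):

```python
# Hypothetical sketch: a drag-and-drop chain of single-in/single-out
# components is indexed and assigned an NSH-style SPI plus a starting SI,
# which the ingress proxy would stamp onto requests.

class ChainRegistry:
    def __init__(self):
        self._next_spi = 1
        self.paths = {}          # spi -> ordered list of component names

    def register_chain(self, components):
        """Index a chain and hand back (SPI, initial SI)."""
        spi = self._next_spi
        self._next_spi += 1
        self.paths[spi] = list(components)
        # RFC 8300 convention: SI starts at the chain length and is
        # decremented by each service function the packet traverses.
        return spi, len(components)

    def next_hop(self, spi, si):
        """Resolve which component should process a packet at (SPI, SI)."""
        chain = self.paths[spi]
        return chain[len(chain) - si]

registry = ChainRegistry()
spi, si = registry.register_chain(["auth", "enrich", "render"])
print(spi, si)                      # 1 3
print(registry.next_hop(spi, 3))    # auth
print(registry.next_hop(spi, 1))    # render
```

The proxy only needs the (SPI, SI) pair on the wire; the chain topology itself lives in the registry, not in the network.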


I’ve not proved any of this out yet and this is primarily a bunch of notes, thoughts and observations from watching the network service mesh space evolve.

NSH could be a nice winner here for governing data-plane path traversal for application composition. It works today for carrier networks and 5G, so why not application composition for decomposed, F2-type deployments?!

Would love to hear your thoughts on this. Again, this isn’t an academic paper, but a set of blabbering I wanted to share.

The post [4/4] Composition & Service Function Chaining in Network Service Meshes appeared first on

by David Gee at November 29, 2019 03:57 PM

[3/4] Composition & Service Function Chaining in Network Service Meshes

Application De-Composition

Applications are ever evolving and so are the architecture patterns:


Monoliths were easy. Route to them and send the returned packets back to their source.

Microservices (MS) see a monolith or new application reduced to smaller self-contained parts, which may talk east-west or north-south. It’s quite common to see a proxy deal with inbound connections, with internal communication between components hidden from external interactions. Internal communication is typically either point-to-point (possibly through a load balancer/proxy) or via a message bus of some description.

Functions & Flows makes life even more interesting. We further break down the components of microservices into individual functions that deliver pages, computation, web application components, etc. More flow information exists on the whole, and the number of points involved in an interaction with an application increases with every de-aggregated component deployed.

For brevity, I’m going to call Functions & Flows, F2. I’ve never seen it shortened to this, so if you see it elsewhere, let me know!

To add to this, MS and F2 components may reside on different infrastructure, separated by the internet and differing policies - and thus, by deduction, differing underlying IP capabilities.

Let’s use a web service that’s been ‘functionised’ as an example. A common model is to have a proxy which accepts the initial connection and directs requests to individual components based on arbitrary rules like response time and load. But what about composition of an application? In networking, we call this service function chaining (SFC), and the concept is as old as networking itself. A loose description: a request results in a blob of data being passed through a chain of processing components that perform extract, transform and load (ETL) functionality. This could be performing calculations on a value, data enrichment, or adorning data with HTML elements.
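The ETL chain described above can be sketched in a few lines of Python (component names and payloads are purely illustrative):

```python
# Illustrative only: a request blob passed through a chain of
# ETL-style components, each enriching, computing on, or adorning it.

def enrich(blob):
    blob["customer"] = "acme"           # data enrichment
    return blob

def compute(blob):
    blob["total"] = sum(blob["items"])  # performing a calculation on a value
    return blob

def render(blob):
    # adorning data with HTML elements
    return "<p>%s owes %d</p>" % (blob["customer"], blob["total"])

def run_chain(request, chain):
    """Pass the request blob through each component in order."""
    for component in chain:
        request = component(request)
    return request

result = run_chain({"items": [3, 4]}, [enrich, compute, render])
print(result)   # <p>acme owes 7</p>
```

The interesting question for the rest of this post is not the chain itself but who holds the ordering: the network, the proxy, or the packet.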

With JavaScript applications not showing any signs of going away, it’s also possible that each component on a web application calls different backends, which in turn looks like a Cartesian product of complexity.

Building Applications

Is it a possibility that each application presented to a user group is actually a list of intent presented by a proxy that hands off fulfilment to a backend? Imagine designing an application with a library of components: a drag-and-drop, flow-chart-style builder which renders function components for a scale unit. Components are enumerated and indexed as they’re deployed, i.e. their topology is known at birth, which makes it easy for the service mesh proxy to build out a service function chain (SFC). Specifically, I’m talking about program creation, not control structure as is common with Helm Charts or Kubernetes YAML files.

From a data-plane aspect, a distributed application that does header manipulation to fulfil service function chaining could live on each host or operate with the application code in a sidecar container. This pattern is prominent today and with some minor modifications to the sidecar application, functionality can be extended to support composition.

To combat complexity, each node that does this manipulation can report activity for tracing, logging and observability operations.

It’s not important for a user to define an explicit path in terms of fulfilment of this composition, but a system could fulfil a contract request and self-re-organise upon failure or breach of contract.

I firmly believe half the challenge is decoupling operator paranoia from declarative need. I “want” an application of type X, and the infrastructure should be able to grab that intent and render my application without my worrying about primitives like VXLAN, NSH, VLANs or VRFs. Why should we care about these things? When it goes wrong is the rhetorical answer, and that’s why data visibility, collection and processing is key. When the situation goes south, engineers should be able to either modify and re-deploy cattle style, or temporarily override to correct until a re-deploy is possible. That doesn’t mean having powerful CLIs everywhere, either. Operators should be able to manipulate state with a usable API and have skilled engineers write a temporary fix application until a production fix is available. OpenFlow overrides are handy here! Some network element operating systems offer ephemeral configuration that can override boot-time config to aid this scenario.

Application development patterns will follow a take on Extract, Transform and Load (ETL). For example: de-serialise the payload, unmarshal, modify, serialise and push. A common data encoding standard would need to be agreed by any development team for this to work effectively.

I mentioned in part one that we could just use a message queue to act as the data-plane pathway between each constituent component within an F2 system. It would be fairly trivial at deployment time to set the queues up and pass each function its respective message queue information as arguments. I believe doing this adds more complexity to the F2 pattern; without it, developers pop the payload, perform ETL and push it back on to the path.
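A minimal sketch of that queue-per-hop idea, with illustrative stage functions and Python’s standard queue module standing in for a real message broker:

```python
# Sketch: each F2 component receives its inbound and outbound queues as
# arguments at deployment time, pops the payload, performs its ETL step,
# and pushes the result onto the next queue. All names are illustrative.
import queue

def make_stage(work, inbound, outbound):
    def stage():
        payload = inbound.get()        # pop the payload
        outbound.put(work(payload))    # ETL, then push back onto the path
    return stage

q_in, q_mid, q_out = queue.Queue(), queue.Queue(), queue.Queue()
double = make_stage(lambda p: p * 2, q_in, q_mid)
label  = make_stage(lambda p: "result=%d" % p, q_mid, q_out)

q_in.put(21)
double()
label()
print(q_out.get())   # result=42
```

Note the trade-off the text describes: the chain ordering is now baked into the queue wiring at deployment, rather than carried by the request itself.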


Part Four, the final part of this series of posts, looks at ways to follow the path of a composed application.

The post [3/4] Composition & Service Function Chaining in Network Service Meshes appeared first on

by David Gee at November 29, 2019 03:57 PM

[2/4] Composition & Service Function Chaining in Network Service Meshes

Not Storing State in the Network

OpenFlow (OF) adoption failed due to the poor scalability of forwarding tables on ASICs, not-so-great controllers, a lack of applications, and a non-existent community. OpenFlow is still useful today, however, for overriding forwarding decisions on a hop-by-hop basis and handling exceptions from what would otherwise be a normal steady-state forwarding decision. Exceptions like bypassing limited-throughput devices such as DPI nodes for large known file transfers are a classic use case. We don’t care beyond simple authentication (maybe) who the client is, so take our file and don’t consume resources doing it.

OpenFlow presents flow state to an ASIC, state that can be granular. If we use it for forwarding equivalence classes (FECs), then it’s no different to normal routing and frame forwarding. That wasn’t the goal, and thus it added to the list of failure reasons. A controller programs flows via an OpenFlow interface on a network element; flows can time out automatically or be long-lived, requiring the controller to remove them. Flows can also be programmed proactively from a network design, or reactively when the controller receives a header packet and decides what to do with it. Vendors naturally added to the complexity by throwing in ‘go to normal lookup table’ type functionality, known as ‘hybrid’ OpenFlow.

Open vSwitch is the most successful artefact that came out of the OpenFlow movement and can be used to create great open network solutions. As a software artefact, it isn’t necessarily bound by hardware limitations. With regards to hardware, vendors that tried to work with granular flow information have come and gone.
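As a hedged illustration (bridge name and port numbers are assumptions, not from the text), the proactive baseline and the ageing-out exception flow described above might look like this with Open vSwitch’s ovs-ofctl:

```shell
# Assumes an existing OVS bridge "br0" with the DPI node on port 2 and
# the bypass path on port 3; names and ports are illustrative.

# Proactive, long-lived flow: send everything through the DPI node.
ovs-ofctl add-flow br0 "priority=10,actions=output:2"

# Exception flow with an idle timeout so it ages out on its own: a known
# bulk file transfer (rsync on TCP/873 here) bypasses the DPI node.
ovs-ofctl add-flow br0 "priority=100,idle_timeout=300,tcp,tp_dst=873,actions=output:3"

# Inspect what the controller (or we) programmed.
ovs-ofctl dump-flows br0
```

The idle_timeout is the detail that matters: the exception removes itself, so the controller never has to remember it.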

Under failure of a controller, all state has to be re-discovered and handled. It doesn’t feel very natural and it’s a headache. So let’s not do it.

I truly believe, however, that OpenFlow acted as a catalyst for something like P4 to come along, and wasn’t an entire waste of time. If you’re interested in programmatic-friendly networks, it’s well worth watching P4 evolve.

Taking some inspiration from this: with a controller observing the topology and forwarding capabilities of both physical functions (PFs) and virtual functions (VFs), it becomes trivial to enumerate nodes, program some basic index or vector identity onto them, and have them behave as forwarding agents within the domain of a service mesh, irrelevant of how the data-plane functions at this point.

Storing state in the network at scale doesn’t work, as we’ve lightly touched on. The internet consists of approximately 800k routes (at the time of writing), which is a positively large lookup table given the performance expected of it. Put onto commodity (i.e. enterprise) hardware, it really doesn’t scale or work. We want the network to be a commodity dumb system with as few changes as remotely possible, to increase stability and reliability.

Some ~40 years of major IPv4 usage have passed, and it would be foolish to assume we could just rip and replace networks globally to make life easier. Within the confines of data centre and service delivery, however, our upgrade cycle may let us embrace open, standards-based technology that vendors support. Let’s assume that’s 80% of the standard functionality of something like SPRING.


Part three carries on where we left off, with application de-composition and some of the challenges associated with that.

The post [2/4] Composition & Service Function Chaining in Network Service Meshes appeared first on

by David Gee at November 29, 2019 03:56 PM

[1/4] Composition & Service Function Chaining in Network Service Meshes

This is part one of a series of posts on Application Composition within Network Service Meshes, otherwise known as Service Function Chaining, but at L7 and not L3/L4.

In Network Service Meshes (NSM), steering L7 requests and responses through the correct network of components is a complex affair. The current approach at the time of writing (November 27th 2019) is to accept requests on a proxy entity and couple that proxy to an application component through a data-plane. Ideally the model works in both private on-premises and cloud deployment models.

For the sake of building a mental image, this is a graph network that has both control-plane and data-plane attributes on its nodes and edges.

In IP networking, IP packets are routed to their destination and return to their source based on the destination IP header field and, when policy requires it, other fields like source IP, protocol and port numbers. In large networks (like the internet), it’s the destination field in the IP header that matters. In both IPv4 and IPv6 there exists a means to steer packets through a network based on additional fields being present at the point of ingress to a network edge, and thanks to Segment Routing and Service Function Chaining, there are multiple choices which may assist us in our problem space exploration.

This post specifically focuses on application composition using exploded application deployment, like microservices or individual functions. See below for a terrible diagram.

For application composition through distributed functions in a service mesh, the challenge is how to process requests hitting a proxy and carry out service chaining without the network holding service flow state. Flow state grows with each iteration of application de-aggregation or de-composition. Flow state exists within the domain and bounds of a system, irrelevant of where it’s stored: it could be in the control-plane, the data-plane, a user-land application or in a packet.

Given application composition through microservices and/or individual functions, the granularity must be handled without affecting performance of the network on the whole and for portability, it should require zero or minimum changes to the underlying infrastructure. This means not letting the requirement of state leak on to the underlying network.

This series of posts explores methods of achieving a stateless service mesh for the purposes of application composition through service chaining of individual components, where all state is inferred onto the packets at the service mesh edge. The service mesh can therefore be an overlay or underlay type of network, behind an edge proxy such as Envoy or other application delivery controller (ADC) with suitable functionality or extensibility.

Something I find interesting, is the domain knowledge difference between network engineering and software development. For example, imagine a conversation between a network engineer and software developer on the merits of an F5 Big IP load balancer vs NGINX. It would be enjoyable to listen!

Some constraints:

C1. Minimal modification of existing IP network.

C2. Ideally cloud friendly for the concept of multi-cloud, i.e. making the best of highly available competing cloud offerings for reliable services and also the all-important money saving.

C3. Developer friendly. If it isn’t, it won’t get used. Developers will just find a way not to use it, irrelevant of it being mandated by managers. This isn’t because developers are bad; they are natural shortest-path-to-success hunters.

Here are some logical choices:

  1. IPv4 native: Loose Source Routing (LSR) and Strict Source Routing (SSR). It’s also possible to use the Record Route (RR) option to track the path through the mesh.

  2. IPv6 / MPLS: Segment Routing groups nodes together into segments, and packets can then traverse those segments. It makes traffic engineering simpler on the whole. There was also an IPv6 source routing header, which was deprecated in favour of SR. More on SPRING.

  3. Network Service Header (used in service function chaining – RFC 8300).

  4. Classic SDN using supported means (OpenFlow)

  5. Don’t bother with the network at all and use a message queue 1:1 between constituent components as a forwarding path.
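For option 3, here is a small sketch of the NSH service path fields from RFC 8300: a 24-bit Service Path Identifier (SPI) plus an 8-bit Service Index (SI) that each service function decrements. All values below are illustrative.

```python
# Sketch of the NSH service path header (RFC 8300): the path state rides
# in the packet itself, so the mesh stays stateless. SPI identifies the
# chain; SI says how far along it the packet is.
import struct

def pack_service_path(spi, si):
    """Pack the 4-byte service path header: SPI (24 bits) + SI (8 bits)."""
    return struct.pack("!I", (spi << 8) | si)

def unpack_service_path(raw):
    word = struct.unpack("!I", raw)[0]
    return word >> 8, word & 0xFF      # (spi, si)

def service_hop(raw):
    """What each service function does: decrement SI, re-emit the header."""
    spi, si = unpack_service_path(raw)
    return pack_service_path(spi, si - 1)

hdr = pack_service_path(spi=42, si=3)
hdr = service_hop(service_hop(hdr))    # two service functions processed
print(unpack_service_path(hdr))        # (42, 1)
```

Because the path position lives in the header, no forwarding element has to remember which flow is on which chain.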

A word on security. There be dragons. I acknowledge this readily and openly. I will not cover that aspect in this post. With my limited security knowledge, I see challenges in creating and managing such a thing.


Part two, which touches on handling state, can be found here.

This is an interesting topic and it’s an area under constant development!

The post [1/4] Composition & Service Function Chaining in Network Service Meshes appeared first on

by David Gee at November 29, 2019 03:56 PM

My Etherealmind

Software I Use – Black Friday 2019

These are productivity and work apps that I use personally and recommend to people. 

The post Software I Use – Black Friday 2019 appeared first on EtherealMind.

by Greg Ferro at November 29, 2019 01:40 PM

Blog (Ivan Pepelnjak)

IP Fabric with Gian-Paolo Boarina on Software Gone Wild

No, we were not talking about IP fabrics in general - IP Fabric is a network management software (oops, network assurance platform) Gian Paolo discovered a while ago and thoroughly tested in the meantime.

He was kind enough to share what he found in Episode 107 of Software Gone Wild, and as Chris Young succinctly summarized: “it’s really sad that we still get excited about something 30 years after it was first promised”… but maybe this time it really works ;)

by Ivan Pepelnjak at November 29, 2019 10:06 AM

XKCD Comics

November 28, 2019

Blog (Ivan Pepelnjak)

Upcoming Events and Webinars (December 2019)

The registration is still open for the Using VXLAN to Build Active-Active Data Centers workshop on December 3rd, but if you can’t make it to Zurich you might enjoy these live sessions we’ll run in December 2019:

All webinars I mentioned above are accessible with Standard Subscription, and you’ll need Expert Subscription to enjoy the automation course contents.

by Ivan Pepelnjak at November 28, 2019 08:57 AM

November 27, 2019

Potaroo blog

My IETF 106

The 106th meeting of the IETF was in Singapore in November 2019. As usual for the IETF, there were many Working Group meetings, and this report is definitely not an attempt to cover all of these meetings or even anything close to that. Here I’ve been highly selective and picked out just the items that I found interesting from the sessions I attended.

November 27, 2019 11:00 PM

My Etherealmind

OS X: TimeMachineEditor App (Free)

Hugely useful tool because Time Machine runs way too often on my machine.

The post OS X: TimeMachineEditor App (Free) appeared first on EtherealMind.

by Greg Ferro at November 27, 2019 06:46 PM

Blog (Ivan Pepelnjak)

The EVPN Dilemma

Got an interesting set of questions from a networking engineer who got stuck with the infamous “let’s push the **** down the stack” challenge:

So I am a rather green network engineer trying to solve the typical layer two stretch problem.

I could start the usual “friends don’t let friends stretch layer-2” or “your business doesn’t really need that” windmill fight, but let’s focus on how the vendors are trying to sell him the “perfect” solution:

Read more ...

by Ivan Pepelnjak at November 27, 2019 04:40 PM

Network Design and Architecture

Ask these questions before you replace any technology in your network!

If you are replacing one technology with another, these are the questions you should be asking.

This may not be a complete list, and some questions will matter more than others for your network, but definitely keep them in mind, or come back to this post and check, before you replace one technology with another!


Is this change really needed? Is there a valid business case?


This is the first and most important question, because it decides whether the change is absolutely necessary. If the technology you would migrate to won’t bring any business benefit (OPEX or CAPEX savings, a new revenue stream, etc.), then the existing technology should stay.

This is true for new software releases on routers as well. If there is no new feature you need in the new release and no known bug that affects the stability of the network, a longer software lifecycle is better than upgrading frequently.


What is the potential impact on the overall network?


A new technology might require extra resource usage on the network. Can your network devices accommodate this growth in resource usage? The opposite is true as well: a new technology might reduce resource usage, but at what cost?

In general, reducing state in the network (routing table, MAC address table, MPLS label table, etc.) creates suboptimal routing and black-holing, depending on the network topology.

For example, replacing a full-mesh IBGP design with an IBGP route reflector design reduces the overall resource usage on the network (it reduces state on the routers) but can create suboptimal routing, depending on the topology.


What will the migration steps be?


In the network design lifecycle, deployment steps are prepared by the designers. These steps are covered in a Network Migration Plan document; if the customer does not ask for a separate migration document, they are highlighted clearly in the Low Level Design (LLD) document.

If the migration steps are not executed in order, you get longer network downtime, which costs the organization money, or the migration operation may fail completely.

Any migration document should also include a rollback plan, so that if the migration cannot be completed in the planned time, the escape (rollback) plan can be started as planned earlier.


Is there a budget constraint?


Budget is always a real concern, in almost any business. Why is budget important in a technology migration?

Because the new technology may not be known to the network engineers, and a learning process might be necessary.

Free learning resources are good, but how much can you trust them? So I always recommend that people take structured training from known network designers who follow the most recent developments in the industry, have designed networks of every scale (not just a couple of large-scale ones), and are recommended by people you trust in the industry. (I spent time writing this, so let me do a little marketing 🙂)

Budget is also a concern when you design a large-scale network, add a new technology to an existing (brownfield) network, design for a merger or acquisition, secure the network, and so on. When you migrate, ask yourself: do the network engineers in the company know the new technology and can they handle it? Do you need to buy additional hardware to accommodate it?

I could expand this list. Let me know your thoughts in the comment box below. Did you recently migrate any technology in your network? Was this post helpful?

The post Ask these questions before you replace any technology in your network! appeared first on

by Orhan Ergun at November 27, 2019 11:47 AM

Please don’t register for the South Africa/Johannesburg CCDE class, it is full!

Hi Everyone,

I would like to inform you that the instructor-led CCDE class in Johannesburg, South Africa, is full, so please don’t register for it.

Having more people would reduce the time available for discussions. Those who attended any of my earlier classes know that we already have a very packed agenda: approximately 2,000 pages of material and many real-life discussions in five days. Hopefully we will schedule another training session in South Africa next year; when I announce it, please hurry up and register.

I will be in Johannesburg between May 13 and 18. If you would like to meet me, please send me an email to

Even if you are not considering any network design training, that is still okay; I would like to meet and get to know as many network engineers as I can while I am still able to 🙂

The post Please don’t register for the South Africa/Johannesburg CCDE class, it is full! appeared first on

by Orhan Ergun at November 27, 2019 11:41 AM

Telecom Operator Network Design Training

I missed writing, and I especially missed posting on the website! I know you are reading right now and wondering where I have been.

I just checked and saw that my last post was on October 26. For more than two months I didn’t share anything on the website. I wanted to come here and share something, technical or social, but believe me, November and December 2017 were very busy for me.

One of the activities that took my time during this period was the Telecom Operator design training I delivered in Nairobi, Kenya, in November for Safaricom Kenya, the incumbent telecom operator and Internet service provider.

It was a five-day training, and many people attended: IP/MPLS backbone planners, transport network engineers, mobile access and core engineers, and fixed and mobile wireless service engineers. (They have very good fiber penetration in the country.)

Most of the topics were from my CCDE training blueprint, but after a couple of discussions with their lead engineers we removed the CCDE practical scenarios and a couple of other topics, as they won’t take the CCDE exam, and added other technologies they are considering implementing, some of which are already in trials.

For confidentiality reasons, I cannot tell you what those newly added topics were, but I adjusted the training agenda based on their needs.

Overall, we spent more than 40 hours together, and more than 20 hours of the training were related to their end-to-end network design. They have many different services: FTTx, 2G, 3G, LTE, Metro Ethernet, VPN services, satellite, and so on. They are the biggest operator in Kenya in terms of number of customers.

I have already planned a couple of other telecom design trainings in 2018 and will post updates about them and the attendees’ feedback.

Now it is time to share some feedback from the Safaricom Kenya attendees. Note: if you would like to bring me onsite to your company for training and to talk about your company’s design, please contact , who can arrange technical discussions with me for customized training.


Andrew Masila – Architecture and Service at Safaricom Limited

Your class was very useful and served as a valuable addition to my experience in telecommunications, architectures, quality of experience and content delivery networks.

I will definitely use the knowledge acquired to make more sound business decisions and investments, as well as optimize operations in my organizational unit with a fit-for-purpose network.

Most important is the approach in understanding exactly why a certain option has been chosen.

Jackson Mutie – Packet Core Engineer at Safaricom

It was nice to be your student and enjoyed the MPLS TE technology and network design principles sections.
The network is about to evolve in design so I hope to be able to reach out to you and discuss various design and architecture options.
Kind Regards,

Stephen Njoroge Njuguna – Subject Matter Expert – Transport Network at Safaricom

Your training was quite relevant and practical, since we were able to relate the theory covered to actual scenarios on our network and identify areas of improvement. Thanks also for your encouragement to start the CCDE journey and making it look achievable.

Silas Kimathi Borona – Senior Network Planning Engineer

I attended Orhan Ergun’s CCDE course, and I must say it was remarkable: well illustrated and keen on current and advanced technologies.

He surpassed my expectations in terms of quality, depth of knowledge, well-structured online resources, and quizzes. An intriguing eye-opener!

The post Telecom Operator Network Design Training appeared first on

by Orhan Ergun at November 27, 2019 11:32 AM

CCDE Practical/Lab Exam Result Policy

Many of my students have been asking whether the CCDE practical/lab exam result policy is still the same.

As you might know, since the cancellation of the May 2017 CCDE exam, practical exam results arrive after 8 to 12 weeks. This means that if you take the CCDE practical exam, you cannot learn the result (pass or fail) on the same day. Same-day results were available only up to the May 2017 exam.

You used to finish the eight-hour exam, click the end-exam button, and the exam result was just there!

This is not the case anymore. The last CCDE practical exam was in November 2017, and the attendees are still waiting for their results as of today. There has been no new announcement from Cisco, and I would expect the same for the February 27, 2018 CCDE practical exam.

Withholding exam results is probably good for exam security, but I hear complaints from students, and I think some of my readers feel the same.

They say that if they knew they had failed, they could schedule the next exam. Results from the previous exam arrive right before the next one, so candidates cannot find time to reschedule if they fail.

I think Cisco should change the policy again so that students can learn the result the same day, or at least within a week, and can then schedule the next exam and book flights, hotels, and so on.

As many of my CCDE students have passed the practical exam, let me help you pass it! Click here to understand how I deliver my courses and why we are successful in CCDE.

The post CCDE Practical/Lab Exam Result Policy appeared first on

by Orhan Ergun at November 27, 2019 11:27 AM

What is GLBP, and where should it be used?

GLBP stands for Gateway Load Balancing Protocol. In this article I will explain, with topologies, where GLBP should and shouldn’t be used.

GLBP is a Cisco proprietary protocol. In many networks, the design requirements mandate standards-based protocols only; because GLBP is not standards-based, such businesses cannot use it.

Unlike HSRP and VRRP, GLBP supports flow-based load balancing.

HSRP and VRRP only support active/standby redundancy or per-VLAN load balancing.

GLBP was invented to provide active-active forwarding of network traffic, but there is almost no use case for it in today’s networks.

In some cases, GLBP has created more problems than it has solved.
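For reference, here is a minimal GLBP configuration sketch for one of the Internet gateway routers, in Cisco IOS syntax; the interface, group number, and addresses are illustrative, not taken from a real deployment:

```
interface GigabitEthernet0/0
 ip address 10.1.1.2 255.255.255.0
 ! 10.1.1.1 is the shared virtual gateway address
 glbp 1 ip 10.1.1.1
 ! hand out a different virtual MAC per ARP reply
 glbp 1 load-balancing round-robin
 glbp 1 priority 110
 glbp 1 preempt
```

Note that with a single firewall ARPing once for 10.1.1.1, only one virtual MAC (and thus one gateway) is ever used, which is exactly the polarization problem described below.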


Figure – GLBP at the Enterprise Internet Edge


In the picture above, I depict a classical enterprise Internet edge network: a firewall, a Layer 2 switch, and Internet gateways.

Service providers don’t use stateful devices such as firewalls at the Internet edge.

If GLBP is used in this network, the firewall sends an ARP request for the default gateway, and only one of the Internet gateway routers is used as the default gateway. If there were two firewalls, the other firewall could use the second Internet gateway router as its default gateway, and traffic from the firewalls to the Internet could be load balanced.

Let’s say R2 is chosen as the default gateway. In that case, the firewall always sends Internet traffic to R2 only. This is called polarization.

The situation can be worse.

Imagine we run BGP on the Internet gateways and implement a local-preference policy that prefers R1 as the exit point.

In this case, all traffic from the firewall would first be sent to R2, then R2 would forward the traffic over the R1–R2 link to R1, and finally R1 would send the traffic to the upstream provider.

GLBP creates a polarization issue at the Internet edge when there is a firewall.

If active-active redundancy is required in Layer 2 networks, most enterprises rely on Multi-Chassis Link Aggregation Groups (Multi-Chassis EtherChannel in Cisco terms). Cisco VSS and vPC technologies use the multi-chassis link aggregation concept.

If scalability is a concern, enterprises and service providers rely on a Layer 2 multipathing solution for their Layer 2 networks (FabricPath is common).

The post What is GLBP, and where should it be used? appeared first on

by Orhan Ergun at November 27, 2019 11:14 AM

What are urban and rural areas in networking?

What are urban and rural areas? What is an underserved area in networking?

These definitions are heavily used in networking, and all broadband network designers take them into account when they do their designs. I think knowing these definitions is valuable for you as a network engineer.

In general, a rural area, or countryside, is a geographic area located outside towns and cities.

Whatever is not urban is considered a rural area, though some people define a suburban area as one less populated than an urban area but more populated than a rural one.

Typical urban areas have a high population density and large settlements

Typical rural areas have a low population density and small settlements

Underserved areas are areas where there is no good network coverage (broadband, voice, or any other data type)

Unserved areas are areas where there is no network coverage at all

For example, if a mobile operator places cell sites in an urban area, the population density is very high, so they consider placing more cell sites than they would in a rural area.

FTTx planners change their ODN (Optical Distribution Network) design entirely depending on whether they are doing an FTTx deployment in urban or rural areas.

In general, fiber access to rural areas is considered economically unattractive, so rural areas are usually served by mobile broadband or by WISPs (Wireless Internet Service Providers) using unlicensed spectrum.

This is one of the topics in the ‘Service Providers Physical Connections and Locations’ module of the Service Provider Design Workshop. Join this workshop to learn concepts that are not taught in any other training.

The post What are urban and rural areas in networking? appeared first on

by Orhan Ergun at November 27, 2019 11:06 AM

Blog (Ivan Pepelnjak)

Is There a Future for Networking Engineers?

Someone sent me this observation after reading my You Cannot Have Public Cloud without Networking blog post:

As much as I sympathize with your view, scales matter. And if you make ATMs that deal with all the massive client population, the number of bank tellers needed will go down. A lot.

Based on what I read a while ago, a really interesting thing happened in the financial industry: while the number of tellers went down, the number of front-end bank employees did not drop nearly as dramatically; they just turned into “consultants”.

Read more ...

by Ivan Pepelnjak at November 27, 2019 09:19 AM

XKCD Comics

November 26, 2019

Packet Pushers

IP Fragmentation in Detail

When a host sends an IP packet onto the network it cannot be larger than the maximum size supported by that local network. This size is determined by the network’s data link and IP Maximum Transmission Units (MTUs) which are usually the same. A typical contemporary office, campus or data centre network provided over Ethernet […]
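As a back-of-the-envelope illustration of what the post describes (this sketch is mine, not from the Packet Pushers article): IPv4 splits an oversized payload into fragments whose payloads are multiples of 8 bytes (except the last), with fragment offsets counted in 8-byte units. Assuming a plain 20-byte header with no options:

```go
package main

import "fmt"

// fragmentOffsets computes the payload sizes and fragment offsets
// (in 8-byte units, as carried in the IPv4 header) for a payload of
// totalLen bytes crossing a link with the given MTU. It assumes a
// plain 20-byte IPv4 header with no options.
func fragmentOffsets(totalLen, mtu int) (sizes, offsets []int) {
	maxPayload := (mtu - 20) &^ 7 // round down to a multiple of 8 bytes
	for off := 0; off < totalLen; off += maxPayload {
		size := totalLen - off
		if size > maxPayload {
			size = maxPayload
		}
		sizes = append(sizes, size)
		offsets = append(offsets, off/8)
	}
	return sizes, offsets
}

func main() {
	// A 4000-byte payload over a 1500-byte MTU Ethernet link.
	sizes, offsets := fragmentOffsets(4000, 1500)
	fmt.Println(sizes, offsets) // prints [1480 1480 1040] [0 185 370]
}
```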

The post IP Fragmentation in Detail appeared first on Packet Pushers.

by Steven Iveson at November 26, 2019 05:54 PM

Network Design and Architecture

Congratulations to Roy Lexmond on Passing CCDE Practical Exam!

I am very glad to announce that Roy Lexmond, from my April CCDE training class, passed the CCDE practical exam yesterday in France.

Below is his success story, and here is his earlier feedback on the class. I should say that he really likes design, is open to learning new things, and is very clever.

Please join me in congratulating Roy on his great achievement!

On 19th May in Paris, France, I passed the CCDE practical exam. My preparation was done with the Cisco Learning Network Excel sheet, Cisco Live videos, the internetworkexpert SP & CCDE courses, and the Orhan Ergun CCDE bootcamp; I attended the bootcamp in April–May with lots of great people, which helped me prepare well. I really think the bootcamp helped me focus on key technologies, discuss them with other people (very important for me), and understand how to approach the exam.

It was a challenge and took me two years; my satisfaction is extreme! I learned a lot during those two years and am still learning. My next goal will be CCIE SP, which covers some great content in line with the topics the CCDE already covered.

Roy Lexmond

Senior Network Engineer at Routz


I promise to announce many more success stories from my students after the August CCDE practical exam. Come and join my class!

The post Congratulations to Roy Lexmond on Passing CCDE Practical Exam! appeared first on

by Orhan Ergun at November 26, 2019 08:40 AM

Flat/Single Area OSPF network is not a problem!

Flat OSPF networks, or single-area OSPF networks, are real. In fact, most OSPF networks deployed today are flat. But how many routers can safely be placed in one OSPF area? Want a number from a real-world OSPF deployment? I will share one in this post.

Let me explain the concept first, and then I will share some numbers from a real network I was engaged with recently.

As you might know, OSPF has two levels of hierarchy: the backbone area and non-backbone areas.


Why are non-backbone areas used in OSPF?


The reasons are scalability and manageability, at least in theory. I don’t see many multi-area OSPF designs, though I teach them in great detail in my CCDE classes; that is for exam purposes.

Some very large-scale networks do use multiple OSPF areas for scalability: they place their satellite IP POPs (sometimes called access POPs) in non-backbone areas.

There is also a manageability aspect to a multi-area OSPF design: operators group their slow-speed access and metro or aggregation networks into different OSPF areas and place the high-speed backbone/core routers in the backbone area (Area 0).

But we generally forget to talk about the complexity of multi-area OSPF networks. When people have multiple areas, they usually summarize IP prefixes at the area boundary (ABR). I am not saying this is bad, but managing the summaries and filters increases the overall configuration of the network.


How many routers can a real-world flat OSPF network accommodate?


You might see in textbooks: don’t place more than 40–50 routers in a backbone area; when the number of routers grows, place them in separate areas.

This is not wrong, but it is old advice. Twenty years ago, when you heard this, the main reason was router hardware: routers couldn’t handle topology changes in the network, the CPU load of SPF runs, or keeping all the prefixes in memory, and so on.

But of course, hardware is much more advanced as of 2017. Routers can handle, and in fact do handle, many more routers in a flat OSPF design.

Recently I was engaged on a project for an access ISP with 700 POPs, of which only 100 are IP POPs, no MPLS (thus BGP everywhere), and around 200 routers in the OSPF domain.

You might expect to see the effect of topology changes in a 200-node OSPF network. But these operators had ‘prefix suppression’ enabled everywhere on the network, so only the routers’ loopback addresses appear in the routing table. They have no problem with OSPF: memory, CPU, routing table size, manageability, complexity, everything is fine.
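For reference, the ‘prefix suppression’ feature mentioned above (standardized in RFC 6860) is a small knob on Cisco IOS; a minimal sketch, with the process ID and interface name being illustrative:

```
router ospf 1
 ! stop advertising transit-link prefixes process-wide
 prefix-suppression
!
interface GigabitEthernet0/1
 ! or enable it per interface instead
 ip ospf prefix-suppression
```

With this enabled, transit-link prefixes are withheld from LSAs, so the routing table carries little more than the loopbacks while the full topology is still flooded for SPF (and later CSPF).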

What is next for this particular network?

We will enable MPLS on this network and, in the second phase, Traffic Engineering on top of it. They won’t have any problem with Traffic Engineering because, thanks to the flat OSPF design, they have full topology visibility. They will do Traffic Engineering in a distributed fashion using CSPF on the routers, which is fine unless we run into specific problems such as deadlocks or global path optimization.

So, 200 routers in a flat OSPF network design, with prefix suppression in place, works totally fine. Also read my ‘Study Tip’ post to understand why these numbers may no longer matter in 5 to 10 years.

I cover many practical network design topics, specifically for service provider networks, in my Service Provider Design Workshop and my Service Provider Design and Architecture book.

The post Flat/Single Area OSPF network is not a problem! appeared first on

by Orhan Ergun at November 26, 2019 08:33 AM

November 25, 2019

Packet Pushers

– valid and useful site

Reserved domain useful for documentation and authenticating to wireless networks

The post – valid and useful site appeared first on Packet Pushers.

by Greg Ferro at November 25, 2019 04:04 PM

Blog (Ivan Pepelnjak)

Upcoming Workshops: NSX, ACI, VXLAN, EVPN, DCI and More

I’m running two workshops in Zurich in the next 10 days:

I published the slide deck for the NSX versus ACI workshop a few days ago (you can already download it if you have a paid subscription), and it’s full of new goodness like ACI vPod, multi-pod ACI, multi-site ACI, ACI-on-AWS, and multi-site NSX-V and NSX-T.

by Ivan Pepelnjak at November 25, 2019 08:01 AM

XKCD Comics

November 24, 2019

Blog (Ivan Pepelnjak)

Worth Reading: Early History of Usenet

Steve Bellovin wrote a great series of articles describing the early history of Usenet. The most interesting part in the “security and authentication” part was probably this gem:

That left us with no good choices. The infrastructure for a cryptographic solution was lacking. The uux command rendered illusory any attempts at security via the Usenet programs themselves. We chose to do nothing. That is, we did not implement fake security that would give people the illusion of protection but not the reality.

A lot of other early implementers chose the same route, resulting in SMTP, BGP… which wouldn’t have been a problem if someone had kept track of that and implemented security a few years later. Unfortunately, we considered those problems solved and moved on to chase other squirrels. We’re still paying interest on that technical debt.

by Ivan Pepelnjak at November 24, 2019 04:39 PM

Network Design and Architecture

CCDE Study Guide

CCDE Study Guide – Are you looking for a book that will teach you all the topics of advanced network design? If so, I am very pleased to recommend the CCDE Study Guide, written by Marwan Al-Shawi.

As one of the professionals who contributed to this book, I must say that Marwan wrote it in collaboration with a number of savvy designers. The IT experts who contributed to this wonderful book include Russ White, Andre Laurent, Denise Fishburne, Ivan Pepelnjak, and Orhan Ergun. All the concepts in this book are enlightening! The book has many drawings, which will help learners understand network design.

Today I spoke with an old friend of mine, a CCDE expert who read Marwan’s book, and his comment was: “The book contains pictures that explain a thousand words.”

The most important network design topics, especially for the CCDE exam, are Layer 3 technologies such as IGPs, BGP, MPLS, Inter-AS MPLS, and VPNs. These topics are extensively covered in this book.

These topics are very important because the CCDE exam is a Layer 3 infrastructure exam and because these technologies provide the infrastructure of the underlay network.

IPv6, QoS, multicast, security, and other network design topics can be considered less important because they can be treated as an overlay. Another boon of this book is that it has many comparison charts, which help readers get a solid grasp of the concepts. In the CCDE exam, you should expect questions that ask you to compare technologies, since one of the main ideas behind any network design is knowing when and where to use the optimal technology.

You should read this book, as well as the other recommended resources, before moving on to other important CCDE books.

You should have a basic knowledge of the technologies before reading this book. In addition, I am very pleased to recommend Cisco Press CCDP ARCH by John Tiso to my students and followers. I expect anybody who wants to pass the CCDE to read the CCDP ARCH book before perusing other relevant books on network design.

The Cisco Press CCDE Study Guide is a complementary book to Orhan Ergun’s CCDE Practical Workbook, which consists of CCDE practical case studies and design scenarios.

I want to use this opportunity to express my gratitude to Cisco Press and Marwan for offering my readers an excellent promotion.

The promotion is valid until the end of this year: a 35% discount when you use the code below at this link.

The discount code is ‘SAVEONCCDE’.

You can click here to purchase the book.

Looking forward to your positive feedback in the comments section 🙂

The post CCDE Study Guide appeared first on

by Orhan Ergun at November 24, 2019 04:37 PM

November 23, 2019

Signals, Go & Immutable Infrastructure

From the days of old, setting fire to a large torch would signal to a neighbouring town that something was going on; on the Great Wall of China, reports of signals reaching some 470 miles can be read about on Wikipedia! Back to the future and modern-day times: signals are transmitted and received as part of every application we touch. Signals underpin a system’s communications, irrespective of what that system is. Software gives off many signals of a wide variety in normal operation, and through signal correlation we can yield useful events. Signals can also be used to achieve an outcome in a remote system, as can direct application API calls.

Being a fan of systems that have a natural synergy to them, I also look for ways to tie application functionality into natural system interactions.

For this post, I want to talk about the separation of concerns between an application’s functionality, exposed via its primary operational interface (likely an API of some sort), and the application’s operational configuration, which allows it to start on the correct TCP/IP port and consume the correct credential information.

Why not just have the application refresh its configuration through the operational interface? The best way I can explain this is that it would be akin to trying to chop off your axe-wielding arm while attempting to chop off the branch of the tree you’re sitting on with that same arm.

Usable Signals

Whether an application is hosted in a Docker container or directly on an operating system, we can consume signals emitted by the kernel towards our application to achieve some operational tasks. The information below is relevant for Unix, Linux, and OS X.


SIGTERM

We can intercept this signal and have our application code exit gracefully and safely. It doesn’t have to be a panic situation if we need to exit. If the application doesn’t exit on a SIGTERM, then it can be forcibly exited with kill -9 (SIGKILL) being initiated from the operating system.

SIGINT

This signal interrupts a process, typically triggered by pressing Ctrl+C in the controlling terminal.

SIGHUP

This signal is from the days of modem connectivity and represents a “signal hang-up” from the network. It can also be used to tell an application to reload its configuration, or to signal that the controlling terminal has been closed.

SIGUSR1 / SIGUSR2

These are custom signals and there are no recommended uses for them. If you use these signals, ensure that usage is documented in your code and user documentation.


It’s a little disconcerting, perhaps, that in order to transmit your desired signal, the ‘kill’ utility is used.

  1. Get the process ID (PID) of the target application. If your application doesn’t store its PID in a file, then you can use the traditional methods to find the PID.

  2. Once you have the PID, you can use

    kill --signal SIGHUP $PID

    to send the signal; in this case, it sends SIGHUP.

The example code I am using here is readily available from the website.

package main

import (
	"fmt"
	"os"
	"os/signal"
	"sync"
	"syscall"
)

func main() {
	sigs := make(chan os.Signal, 1)
	// Relay the signals we're interested in to the channel.
	signal.Notify(sigs, syscall.SIGINT, syscall.SIGHUP, syscall.SIGUSR1, syscall.SIGUSR2)

	wg := sync.WaitGroup{}
	wg.Add(1)

	go func(wg *sync.WaitGroup) {
		defer wg.Done()
		for {
			sig := <-sigs

			if sig == syscall.SIGINT {
				fmt.Println("Received SIGINT, exiting")
				return
			}
			fmt.Println("Received: " + sig.String())
		}
	}(&wg)

	fmt.Printf("PID is: %d\n", os.Getpid())
	fmt.Println("Awaiting signal...")
	wg.Wait()
}
I edited this code shortly after publishing to make it idiomatic with regard to sync and WaitGroups; without a significant hall-pass, it was just a bad display of an example.

Next, compile the code and invoke it. The output below is from the sample application.

$ go build
 $ ./signals
PID is: 4440
Awaiting signal...
Received: user defined signal 1
Received: hangup
^CReceived SIGINT, exiting

Output below is from a bash terminal where I sent signals to the application.

# Check for the available signals
$ kill -l
 $ kill -s USR1 4440
 $ kill -s HUP 4440


I hope this post was useful. Despite seeming kind of obvious, this is a really useful pattern to embrace for your run-time configuration. With regard to immutable infrastructure it could seem an anti-pattern, but being able to update the configuration of one container or service without restarting it, potentially avoiding much wider disruption, feels like a nice capability rather than an anti one.

The post Signals, Go & Immutable Infrastructure appeared first on

by David Gee at November 23, 2019 10:45 PM

ipSpace.net Blog (Ivan Pepelnjak)

Worth Reading: Going Back to Being an Engineer

Another piece of great advice from Charity Majors: does it make sense to go back to being an engineer after being a manager for a few years?

Personal note: finding a great replacement for my CTO role was probably the best professional decision I ever made ;)

by Ivan Pepelnjak at November 23, 2019 04:33 PM

November 22, 2019

SNOsoft Research Team

Industry standard penetration testing and the false sense of security.

Our clients often hire us as part of their process for acquiring other businesses. We’ve played a quiet role in the background of some of the largest acquisitions to hit the news and some of the smallest that you’ve never heard of. In general, we’re tasked with determining how well secured the networks of the organization to be acquired are prior to the acquisition. This is important because acquisitions are often focused on sensitive intellectual property like patents, drug formulas, and technology. It’s also important because in many cases networks are merged after an acquisition, and merging into a vulnerable situation isn’t exactly ideal.

Recently we performed one of these tests for a client but post rather than pre-acquisition.  While we can’t (and never would) disclose information that could be used to identify one of our clients, we do share the stories in a redacted and revised format.  In this case our client acquired an organization (we’ll call it ACME) because they needed a physical presence to help grow business in that region of the world.   ACME alleged that their network had been designed with security best practices in mind and provided our client with several penetration testing reports from three well known vendors to substantiate their claims.

After the acquisition of ACME our client was faced with the daunting task of merging components of ACME’s network into their own. This is when they decided to bring our team in to deliver a Realistic Threat Penetration Test™ against ACME’s network. For perspective, Realistic Threat Penetration Testing™ uses a methodology called Real Time Dynamic Testing™, which is derived from our now-infamous zero-day vulnerability research and exploit development practices. In simple terms, it allows our team to take a deep, research-based approach to penetration testing and provides greater depth than traditional penetration testing methodologies.

When we deliver a Realistic Threat Penetration Test we operate just like the bad guys but in a slightly elevated threat context. Unlike standard penetration testing methodologies, Real Time Dynamic Testing™ can operate entirely devoid of automated vulnerability scanning.  This is beneficial from a quality perspective because automated vulnerability scanners produce generally low-quality results. Additionally, automated vulnerability scanners are noisy, increase overall risk of outages and damage, and generally can’t be used in a covert way.  When testing in a realistic capacity being discovered is most certainly disadvantageous.  As master Sun Tzu said, “All warfare is based on deception”.

When preparing to breach an organization, accurate and actionable intelligence is paramount. Good intelligence can often be collected without sending any packets to the target network (passive reconnaissance). Hosts and IP addresses can be discovered using passive lookup services or via Google dorks. Services, operating systems, and software versions can be discovered using tools like Shodan. Internal information can often be extracted by searching forums and historical breaches, or by pulling metadata out of materials available on the Internet. An example of how effective passive reconnaissance can be is visible in the work we did for Gizmodo related to their story about Crosscheck.

Using passive reconnaissance against ACME we discovered a total of three externally connectable services. One was a VPN endpoint; the other two were the same web service listening on ports 80 and 443. According to passive recon, the services on 80 and 443 were provided by a web-based database management software package. This was obviously an interesting target and something that shouldn’t be internet exposed. We used a common web browser to connect to the service and were presented with a basic username and password login form. When we tried the default login credentials for this application (admin/admin), they worked.

At this point you might be asking yourself why we were able to identify this vulnerability when the three previous penetration testing reports made no mention of it.  As it turns out, this vulnerability would have been undetectable using traditional methodologies that depend on automated vulnerability scanning.  This is because the firewall used by ACME was configured to detect and block the IP addresses (for 24 hours) associated with any sort of automated scan.  It was not configured to block normal connection attempts.  Since we did passive reconnaissance, the first packet we sent to the target was the one that established the connection with the database software.   The next series of packets were related to successful authentication.

After using the default credentials to authenticate to the management application, we began exploring the product.  We realized that we had full control over a variety of databases that varied from non-sensitive to highly sensitive.  These included customer databases, password management, internal chat information, an email archive, and much more.  We couldn’t find any known vulnerabilities for the management software, but it didn’t seem particularly well written from a security perspective.   In short time we found a vulnerability in an upload function and used that to upload a back door to the system.  When we connected to the backdoor, we found that it was running with SYSTEM privileges.  What’s even more shocking is that we quickly realized we were on a Domain Controller.  Just to be clear, the Internet connectable database management software that was accessible using default credentials was running on a domain controller.

The next step was for us to determine what the impact of our breach was.  Before we did that though we exfiltrated the password database from the domain controller for cracking.  Then we created a domain admin account called “Netragard” in an attempt to get caught.  While we were waiting to get caught by the networking team we proceeded with internal reconnaissance.   We quickly realized that we were dealing with a flat network and that not everything on the network was domain connected.  So, while our compromise of the domain controller was serious it would not provide us with total control.  To accomplish that we needed to compromise other assets.

Unfortunately for ACME this proved to be far too easy of a task.  While exploring file shares we found a folder aptly named “Network Passwords”.   Sure enough, contained within that folder was an excel spreadsheet containing the credentials for all the other important assets on the network.  Using these credentials we were able to rapidly escalate our access and take full control of ACME’s infrastructure including but not limited to its firewall, switches, financial systems, and more.

Here are a few important takeaways from this engagement:

  • The penetration testing methodology matters. Methodologies that depend on automated scanners, even if the scanners are whitelisted, will miss vulnerabilities that attackers using a hands-on, research-based approach will find.
  • Default configurations should always be changed as a matter of policy to avoid easy compromise.
  • Use two factor authentication for all internet connectable services.
  • Do not expose sensitive administrative applications to the internet. Instead, configure a VPN with two factor authentication and use that to access sensitive applications.
  • Domain controllers should be treated like domain controllers, not like web application servers.
  • Domain controllers should not be Internet connectable or offer internet connectable services.
  • Do not store passwords in documents even if they are encrypted (we can crack them).
  • Always doubt your security posture and never allow yourself to feel safe. The moment you feel safe is the moment that you’ve adopted a false sense of security.





The post Industry standard penetration testing and the false sense of security. appeared first on Netragard.

by Adriel Desautels at November 22, 2019 09:01 PM

The Networking Nerd

Five Minutes To Magic Time

Have you ever worked with someone that has the most valuable time in the world? Someone that counts each precious minute in their presence as if you’re keeping them from something very, very important that they could use to solve world hunger or cure cancer? If you haven’t then you’re a very lucky person indeed. Sadly, almost everyone, especially those in IT, has had the misfortune to be involved with someone whose time is more precious than platinum-plated saffron.

That’s not to say that we should be wasting the time of those we work with. Simple things like being late to meetings or not having your materials prepared are easy ways to help reduce the time of meetings or to make things run smoothly. Those items are common courtesies that should be extended to all the people you meet, from the cashier that takes your order at a fast food establishment to the most powerful people on the planet. No, this is about something deeper and more insidious.

No Time For Hugs

I’ve seen the kind of behavior I’ve described very often in the higher echelons of companies. People that live at the CxO level often have very little time to devote to anything that resembles thought. They’re busy strategizing and figuring out ways to keep the company profitable. They don’t have time to listen to people talk. Talking interrupts their brain functions. They need time to think.

If you think I’m being hyperbolic, ask yourself how many times you’ve been told to “simplify” something for a CEO when you present to them (if you’re even given the opportunity). I think this strip from Dilbert explains it succinctly:

The higher up the food chain you go, the simpler it needs to be. But if the CEO is the most important person in the company, how is it that you need to make things easy for them to understand? They aren’t morons, right? They got this job somehow?

The insinuation is that the reason you need to make it simple for them is that their time is too valuable. Needless talking and discussion take them away from things that are more important. Like thinking and strategizing. Or something. So, if their time is the most valuable in the room, what does that say about your time? How does it feel to know that your efforts and research and theorizing are essentially wasted work because your time isn’t as important as that of the person you’re talking to?

This is even more egregious when you realize that your efforts to summarize something down to the most basic level are often met with a lot of questions about how you reached that conclusion. In essence, all the hard work you did to simplify your statements is undone because someone wants you to justify how you got to that conclusion. You know, the kinds of details you would have given in a presentation if you’d been given the time to explain!

Solution: Five Minute Meetings

Okay, so I know I’m going to get flak for this one. Everyone has the solution to the meeting overload problem. Standup meetings, team catch-ups, some other kind of crazy treadmill conference calls. But the real way to reduce your meeting stress is to show people how valuable time is for everyone, not just for them.

My solution: All meetings with CxO level people are now five minutes long. Period. End of story. You get to walk in, give your statement, and you walk out. No questions. No long Q&A. Just your conclusions. You say what you have to say and move on.

Sounds stupid, doesn’t it? And that’s kind of the point. When you are forced to boil your premise down to something like the Dilbert smiley face above, you’re doing yourself a disservice. All the detail and nuance goes right out the window. The only way you get to bring it back out is if someone in the room starts asking questions. And if you don’t give enough detail they almost always will. Which defeats the purpose of boiling it down in the first place!

Instead, push it back on the CxOs with the most valuable time. Make them see how hard you work. By refusing to answer any of their follow up questions. You see, if their time is so valuable, you need to show them how much you respect it. If they have follow up questions or require more details, they need to write all those interrogatories down in an email or an action item list and send it to you so you can get it done on your time. Make them wait for the answers. Because then they’ll see that this idea that their time is valuable is just an illusion.

It sounds awfully presumptuous of me to say that we need to waste the time of the C-suite. But a little bit of pushback goes a long way. Imagine how furious they’ll be when you walk out of the meeting after five minutes and don’t answer a single question. How dare this knowledge worker not bend their calendar to my desire to learn more?!? It’s ridiculous!

How about wondering how ridiculous it is for this person to limit your time? Or to not know anything ahead of time about the topic of discussion? Imagine telling someone to wait until you’re ready to talk to them after a meeting starts because you are more important than they are! The nerve!

However, once you stick to your plan a few times the people in the room will understand that meetings about topics should be as long as they need to be. And you should be given enough time to explain up front instead of talking for five minutes and getting interrupted with a thousand questions that you were prepared to answer anyway if they’d just given you the chance to present!

Watch how your meetings transform from interrogation scenes to actual presentations with discussions. Instead of only getting five minutes to talk you’ll be accorded all the time you need to fill in the details. Maybe you only needed ten minutes in the first place. But the idea is that now your time and expertise is just as valuable as everyone else on the team, from the bottom all the way to the top.

Tom’s Take

There needs to be an obligatory “not everyone is like this” disclaimer. I’ve met some very accommodating executives. And I’ve also met some knowledge workers that can’t present their way out of a paper bag. But the way to fix those issues is to make them get better at giving info and at listening to presentations. The way is not to artificially limit time to make yourself seem more important. When you give your people the time they need to get you the info you need, you’ll find that they answer your questions a lot quicker than if they wait with dread as the CEO takes the time to think about what they were going to be told anyway.

by networkingnerd at November 22, 2019 05:57 PM

Network Design and Architecture

Recommended Networking Resources for September 2019 First Week

I would like to share some networking resources with you every week: videos, articles, books, diagrams, other websites, and so on.

Whatever I believe can be useful for computer network engineers, mobile network operators, satellite engineers, transmission experts, and datacenter engineers (basically, whatever I am interested in and like), I will share in a blog post.

There will not be any order of importance among the resources. You can open and go through anyone you want.

I will try to limit the list to 5 resources, as I want you to read the posts that I publish on the website. Sometimes it can be more than 5, though!

Let’s get started!

1. QUIC is a new transport protocol I think everyone should have a look at, starting with the high-level differences between QUIC and TCP.

TCP vs QUIC: A New Transport Protocol


2. The post below explains how BGP AS-Path prepending, when done more than a couple of times, can be dangerous with regard to attacks on BGP information security.


Excessive BGP AS-PATH prepending is a self-inflicted vulnerability


3. This presentation is one of the best on BGP Add-Path, or maybe on why it is not a good idea to do Add-Path.


4. In the video below, Randy Bush talks about IPv6 transition mechanisms and the operational reality of the different transition mechanisms.


(Embedded video: IPv6 Transition & Operational Reality)



5. In the white paper below, you can have a look at recent enhancements in MPLS Traffic Engineering. Concepts like RSVP Multipath (TE++) are explained in the paper.


If you would like to see more resources weekly, let me know in the comment box below. Whatever your feedback is, let us communicate in the comment section of the blog.

Hope this will be useful for you! I love networking, I love helping people who like it!

The post Recommended Networking Resources for September 2019 First Week appeared first on

by Orhan Ergun at November 22, 2019 01:39 PM

About Networks

Basic Linux Networking tips and tricks part-2: the mtr command

Here is the second post in the series on basic network troubleshooting tests and tools under RHEL / CentOS. In this post, I will talk about the Linux mtr command. Ping and traceroute for basic network troubleshooting: we all know the ping and traceroute commands for basic network debugging tasks from a host. For example, if we use a down-up method to troubleshoot connectivity to a remote host: Can we ping our own IP address? – To check if our own IP is correct and if the interface…

The post Basic Linux Networking tips and tricks part-2: the mtr command appeared first on

by Jerome Tissieres at November 22, 2019 01:26 PM