Speed, Scale, Reliability and the Pursuit of Programmability

As networking moves from monolithic to disaggregated infrastructure, programmability has become an existential necessity for physical and software-defined networks (SDN) alike. A common thread runs through these movements: the need for control.

As the notion of control continues to evolve, it will drive significant changes to the systems that form the foundations of the network.

Network Control

When most people think of programmability, they probably first think about network control. After all, using off-box software to manage groups of network devices is one of the hallmarks of SDN, and this does require programmability.

The rise of network control, and specifically SDN, has been driven largely by IT teams' need for speed, scale and reliability. Yet while speed is typically assumed to be the main objective, scale and reliability have been the real catalysts for change.

When the CLI is the primary management tool, operations teams are limited to controlling groups of devices that can be manually manipulated within existing time constraints, such as change windows, mean time to repair and so on. When the management domain becomes too large, however, the CLI no longer scales. SDN was the logical response to management domains outgrowing manual control, and for such centralized control to be viable, the devices being controlled needed programmable interfaces.
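To make the contrast with device-by-device CLI work concrete, here is a minimal sketch of pushing one change to a whole group of devices from off-box software. The `Device` class and its `apply` method are hypothetical stand-ins for whatever programmable interface a real device exposes; they are not taken from any vendor's API.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class Device:
    """Hypothetical device with a programmable interface (illustration only)."""
    name: str
    config: dict = field(default_factory=dict)

    def apply(self, change: dict) -> bool:
        # A real implementation would call the device's API here.
        # This stub just records the change locally and reports success.
        self.config.update(change)
        return True


def push_change(devices, change, max_workers=32):
    """Apply one change to every device concurrently.

    Returns the names of devices that reported failure.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(lambda d: d.apply(change), devices))
    return [d.name for d, ok in zip(devices, results) if not ok]


# Example: 200 simulated switches updated in a single call.
fleet = [Device(f"sw{i}") for i in range(200)]
failed = push_change(fleet, {"mtu": 9100})
```

The point of the sketch is that the time to complete the change depends on the worker pool, not on how many sessions an operator can type into within a change window.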

Reliability is a bit more nuanced. As networks scale, avoiding failures altogether is no longer a practical strategy. Even with reasonable per-device failure rates, large networks will naturally experience failures at a rate beyond what is tolerable using conventional management methods. At scale, the only viable strategy is to make networks more resilient, but this requires software control since humans simply cannot make changes fast enough to mitigate failures while still meeting the network reliability targets. And software control, in turn, requires programmable interfaces.
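The scaling argument can be made concrete with a little arithmetic. The sketch below is illustrative only: the 2% annual failure rate and the fleet sizes are assumed figures, not measurements from any real network, and failures are assumed to be independent and spread evenly over the year.

```python
def expected_failures_per_day(devices: int, annual_failure_rate: float) -> float:
    """Expected device failures per day under the stated assumptions."""
    return devices * annual_failure_rate / 365.0


def mean_hours_between_failures(devices: int, annual_failure_rate: float) -> float:
    """Average gap between failures somewhere in the fleet, in hours."""
    per_day = expected_failures_per_day(devices, annual_failure_rate)
    if per_day == 0:
        return float("inf")
    return 24.0 / per_day


# Assumed figures for illustration: the same per-device rate at three fleet sizes.
ASSUMED_ANNUAL_FAILURE_RATE = 0.02
for fleet_size in (100, 10_000, 100_000):
    gap = mean_hours_between_failures(fleet_size, ASSUMED_ANNUAL_FAILURE_RATE)
    print(f"{fleet_size:>7} devices: one failure roughly every {gap:.1f} hours")
```

With the per-device rate held constant, the gap between failures shrinks in direct proportion to fleet size, which is why a response time that is acceptable for a small network stops being acceptable for a large one.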

Economic Control

While network control might have been the early driver for programmability, it is certainly not the only reason that programmability is critical to the future of networking.

Disaggregation, for example, only brings value if every layer of the networking stack has programmable interfaces (APIs). In fact, programmability is effectively a prerequisite for the decoupling of integrated components. Without a programmable boundary, subsystems cannot be separated.

In this case, the definition of “programmable” needs to expand enough to include the notion of “open.” If efforts like white box switching are to be economically impactful, it’s not enough to simply separate the hardware from the software. If disaggregation is merely a packaging exercise, it is unlikely to offer any benefit. Economic leverage only comes when the separated components can be mixed and matched, stoking competition and granting customers choice and flexibility.

Of course, mixing and matching means that the programmable interfaces need to be both well defined and open. Microsoft’s SONiC is a good example of such an open, well-defined software interface for data center switches.
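The mix-and-match idea can be sketched as a single agreed interface with interchangeable implementations behind it. Everything named below (`ForwardingHardware`, `program_route`, the two vendor classes, `NetworkOS`) is hypothetical and chosen for illustration; it is not SONiC's actual API or any vendor's.

```python
from typing import Dict, List, Protocol, Tuple


class ForwardingHardware(Protocol):
    """Hypothetical open boundary between a network OS and switch hardware."""

    def program_route(self, prefix: str, next_hop: str) -> None: ...


class VendorAHardware:
    """One made-up implementation that stores routes in a dictionary."""

    def __init__(self) -> None:
        self.table: Dict[str, str] = {}

    def program_route(self, prefix: str, next_hop: str) -> None:
        self.table[prefix] = next_hop


class VendorBHardware:
    """A second made-up implementation with a different internal layout."""

    def __init__(self) -> None:
        self.entries: List[Tuple[str, str]] = []

    def program_route(self, prefix: str, next_hop: str) -> None:
        self.entries.append((prefix, next_hop))


class NetworkOS:
    """Software that depends only on the interface, never on a vendor."""

    def __init__(self, hardware: ForwardingHardware) -> None:
        self.hardware = hardware

    def install(self, routes: Dict[str, str]) -> None:
        for prefix, next_hop in routes.items():
            self.hardware.program_route(prefix, next_hop)


# The same software runs unchanged on either hardware implementation.
routes = {"10.0.0.0/24": "192.0.2.1"}
for hw in (VendorAHardware(), VendorBHardware()):
    NetworkOS(hw).install(routes)
```

Because `NetworkOS` touches the hardware only through the shared interface, swapping one implementation for another requires no change to the software, which is the property that gives disaggregation its economic leverage.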

Behavior Control

For some deployments, the notion of control goes beyond SDN and vendor management. There are situations where having control over network behavior—say, to optimize traffic for specific application outcomes—is desirable.

When network services and forwarding behavior need to be controlled based on conditions within or above the network, programmable hooks into the network stack are required. These hooks can range from telemetry interfaces for real-time streaming of device state (OpenConfig over gRPC, for instance) to APIs for things like the routing and forwarding tables. Facebook, for example, has been public about its use of Open/R to manipulate forwarding within the network.

As applications become more dynamic, the need for programmatic interfaces will increase. It’s not hard to imagine use cases where applications will adjust their behavior based on conditions either on the network or within the devices themselves. After all, throttling application throughput in the presence of intermittent packet drops is preferable to failures that occur at predefined thresholds. Or perhaps heavy data backups might be triggered during times of low utilization. Driving behavior outside the network requires a solid set of interfaces inside the systems and subsystems to provide sufficient information to make these types of decisions.
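As a rough illustration of the two use cases above, the sketch below shows an application reacting to network conditions. The thresholds, step sizes and function names are assumptions made for this example, and the back-off rule (halve on loss, otherwise add a fixed increment) is a familiar congestion-control pattern swapped in here for illustration rather than anything prescribed above. In practice the drop rate and utilization would come from the network's telemetry interfaces.

```python
def next_send_rate(
    current_mbps: float,
    drop_rate: float,
    tolerated_drop_rate: float = 0.001,  # assumed tolerance
    floor_mbps: float = 1.0,             # assumed minimum rate
    ceiling_mbps: float = 1000.0,        # assumed maximum rate
) -> float:
    """Throttle gradually instead of failing at a hard threshold.

    Halve the rate when observed drops exceed the tolerance;
    otherwise probe upward by a fixed 10 Mbps step.
    """
    if drop_rate > tolerated_drop_rate:
        return max(floor_mbps, current_mbps * 0.5)
    return min(ceiling_mbps, current_mbps + 10.0)


def should_start_backup(link_utilization: float, threshold: float = 0.3) -> bool:
    """Trigger heavy backups only when the link is lightly used.

    link_utilization is a fraction between 0.0 and 1.0; the 30% threshold
    is an assumed value.
    """
    return link_utilization < threshold
```

Both functions are only as good as the data fed to them, which is the argument for solid interfaces inside the systems and subsystems.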

Intersecting Machine Learning and Artificial Intelligence

Ultimately, all of this is going to need to intersect with the developments currently underway in machine learning and artificial intelligence. In those spaces, there need to be programmatic ways of accessing the data from which useful models are built. If the promise of self-driving infrastructure is to be realized, there need to be automated actions initiated and executed by the infrastructure itself, without manual intervention.

In an environment where administration resembles autonomic operations, programmability is an absolute necessity. It’s difficult to imagine a future where every layer in the network stack is not accessible via programmatic interfaces.


 By Bikash Koley

Published with permission from forums.juniper.net/t5/Blogs/ct-p/blogs
