Societies Running on Quicksand: A Critical Look at Today's Networks
Wednesday May 27, 2020
The last few weeks have reinforced the importance of modern communication networks to societies. Health care providers, schools, governments, and businesses all rely on networks that enable us to connect and collaborate remotely. Had we encountered a similar pandemic ten years ago, we would not have been able to continue our activities on the level that is possible today.
Although companies like Google and Microsoft have developed fantastic shared services for remote work, these would be of very little use if the networks that connect users to them were not as resilient as they are today. While connectivity and related solutions have not been viewed as particularly exciting industries since the millennium, the organizations in these verticals now play a key role in keeping our societies up and running.
Once the pandemic is over, there will likely be a new normal. The magnitude of change in our daily routines and working habits has been exceptional. As the COVID-19 pandemic has become a catalyst for new collaboration models that will likely improve our lives, I am certain that there will be no return to February 2020. From that point on, our societies will rely on networks even more than they do today.
What's wrong with networks today?
Although mobile, cloud, and streaming services have evolved tremendously during the last ten years, the additional network capacity they require has been built by simply replacing each old network device with two new ones. While this has provided us with just enough bandwidth to cope with the increased load, there are already reports of increased network congestion caused by people staying home and going about their daily lives online.
At the same time, the real development of the network infrastructure has largely been neglected. A high percentage of the most critical networks in the world continue to be run manually by a handful of network engineers. Documentation of these environments is often mediocre at best, making changes error-prone and slow to implement. The continuity risks that many telecom companies, government agencies and enterprises are assuming here are generally not understood at all.
On a high level, the reason for the malaise of network infrastructure is a lack of funding. For years, telecom companies have had to multiply network capacity on flat revenue. The situation with large enterprises has not been much better, because their IT departments have been allocated only just enough budget to maintain and to operate the existing network services.
The underlying reason for this sorry state is that connectivity and networks are largely intangible. Just as with information security, the lack of funding is not really felt until things go wrong. But once they do, the consequences can be dire. This is especially true during times of crisis when the stakes are high.
What should be done to address these shortfalls?
With the ongoing COVID-19 pandemic and all the activities that are taking place online, we are well beyond debating the opportunity cost of network downtime. The real question right now is what would happen if an organization lost its ability to connect. Whatever the answer, this should be a board-level discussion.
Once the risk assessment and the policy decisions have been made, there is a simple IDA model (Investigate, Document, Automate) that should be followed when taking steps to mitigate these risks:
1. Investigate how the networks your organization is responsible for are being operated. Chances are you will discover a large number of manual ad hoc processes, along with documentation that is inaccurate, incomplete, or not systematically maintained.
2. Document the existing networks and related operational processes in a systematic way. This exercise should cover processes; Layer 2 and Layer 3 structures and properties; as well as policies and other qualitative information. Besides the traditional on-premises setups, this exercise should also cover externally hosted network segments such as VNets and VPCs in the public clouds.
3. Automate all repetitive network management tasks by using your documentation as the data source. Full network automation should be the goal here because it will minimize the continuity risks associated with manual work.
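As a rough illustration of step 3, the Python sketch below treats documented network segments as structured data, checks that data for overlapping address ranges, and generates configuration text from it. Everything in it is an assumption made for the example: the Segment fields, the sample inventory, and the output syntax, which is a generic placeholder rather than any vendor's actual configuration language.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One documented network segment (hypothetical schema)."""
    name: str      # human-readable label
    vlan_id: int   # Layer 2 identifier
    cidr: str      # Layer 3 prefix, e.g. "10.0.10.0/24"
    location: str  # e.g. "on-prem" or "cloud-vpc"


def find_overlaps(segments):
    """Return pairs of segment names whose documented prefixes overlap."""
    nets = [(s.name, ipaddress.ip_network(s.cidr)) for s in segments]
    overlaps = []
    for i, (name_a, net_a) in enumerate(nets):
        for name_b, net_b in nets[i + 1:]:
            if net_a.overlaps(net_b):
                overlaps.append((name_a, name_b))
    return overlaps


def render_config(segment):
    """Render illustrative, vendor-neutral config text for one segment."""
    net = ipaddress.ip_network(segment.cidr)
    # Assumption for the example: the gateway is the first usable address.
    gateway = next(iter(net.hosts()))
    return "\n".join([
        f"vlan {segment.vlan_id}",
        f" name {segment.name}",
        f"interface vlan{segment.vlan_id}",
        f" ip address {gateway}/{net.prefixlen}",
    ])


# Made-up inventory standing in for the documentation produced in step 2.
INVENTORY = [
    Segment("office", 10, "10.0.10.0/24", "on-prem"),
    Segment("lab", 20, "10.0.10.128/25", "on-prem"),
    Segment("cloud", 30, "10.1.0.0/16", "cloud-vpc"),
]

if __name__ == "__main__":
    for pair in find_overlaps(INVENTORY):
        print("Overlapping segments in documentation:", pair)
    for seg in INVENTORY:
        print(render_config(seg))
```

The point of the sketch is the direction of the data flow: the documentation is checked first, and configuration is derived from it, so an error such as the overlapping "office" and "lab" ranges is caught in the data rather than on a live device.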
Compared to other critical domains such as information security, developing network infrastructure to the next level is not that expensive or even particularly complex. But without due awareness, priority and budgets, the chances are that nothing will change, and our societies will continue to be built on quicksand.
By Juha Holkkola, Co-Founder and Chief Technologist at FusionLayer Inc. An inventor with several patents in the US and Europe, he is an advocate of technology concepts with tangible operational impact. Juha is an active proponent of emerging technology trends such as cloud computing, hybrid IT and network functions virtualization, and a regular speaker at various industry events.