– Executive Summary –
Governments, like other organizations dealing with budgetary pressures, typically search for ways to increase efficiency and effectiveness. One common way of cutting costs and increasing efficiency is to consolidate information technology (IT) systems. IT is a critical component of the nation’s infrastructure, economy, and government operations; the U.S. Department of Homeland Security has designated the IT sector as one of the seventeen critical infrastructure sectors.[1] The state of Illinois is currently consolidating all state agency IT systems, with the goal of a highly centralized and optimized monoculture environment.[2] However, policymakers typically do not consider whether concentration and homogenization of critical assets, elimination of redundancy and surge capacity, and tight coupling may introduce other system vulnerabilities.
This thesis considers theories and research that indicate highly connected, hyper-efficient systems contain the seeds of their own failures.[3] The focus is specifically on the potential vulnerabilities that exist in consolidated IT systems due to the effects of complexity, self-organized criticality (SOC), and monoculture, as well as the impact of those effects on system resilience. The thesis includes a high-level analysis of current and potential weaknesses that result from complexity and hardware and software monocultures, as well as the potential impact of heavy usage of cloud computing. The thesis also includes a representative analysis of the consolidation in Illinois that began in 2015.
By quantifying self-organization, we determine system vulnerability and resilience in order to understand how to mitigate the identified vulnerabilities. The two main drivers of self-organizing networks, percolation and preferential attachment, are combined into a single quantity called spectral radius (the largest eigenvalue of the network's adjacency matrix). As SOC increases, so does spectral radius. Vulnerability is defined as the probability of collapse when an asset such as a hardware or software component is stressed. System vulnerability relates to stresses such as monoculture, lack of surge capacity, and weakness in nodes or links. Higher vulnerability means lower resilience. When we combine vulnerability (v) and spectral radius (r) into a product (vr), we combine the effects of SOC with the effects of stresses into a single measure of fragility. The system’s structure, not simply each component’s weakness, determines how resilient the system is.
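To make the vr measure concrete, the following is a minimal Python sketch, not the thesis's own tooling. It assumes that spectral radius means the largest eigenvalue of an undirected network's adjacency matrix, estimated here by power iteration. The function names, the two seven-node toy topologies, and the value v = 0.3 are illustrative assumptions and are not drawn from the Illinois analysis.

```python
import math


def spectral_radius(n, edges, iterations=1000, tol=1e-12):
    """Estimate the largest adjacency eigenvalue of an undirected graph.

    Power iteration is applied to (A + I) rather than A; the shift
    prevents oscillation on bipartite graphs such as hub-and-spoke stars.
    """
    neighbors = [[] for _ in range(n)]
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)

    x = [1.0] * n
    estimate = 0.0
    for _ in range(iterations):
        # y = (A + I) x
        y = [x[i] + sum(x[j] for j in neighbors[i]) for i in range(n)]
        norm = math.sqrt(sum(val * val for val in y))
        y = [val / norm for val in y]
        # Rayleigh quotient y^T A y (y has unit length).
        new_estimate = sum(
            y[i] * sum(y[j] for j in neighbors[i]) for i in range(n)
        )
        if abs(new_estimate - estimate) < tol:
            return new_estimate
        x, estimate = y, new_estimate
    return estimate


def fragility(v, r):
    """Product vr of vulnerability v and spectral radius r."""
    return v * r


# Hypothetical toy topologies with seven nodes each.
ring_edges = [(i, (i + 1) % 7) for i in range(7)]   # decentralized ring
star_edges = [(0, i) for i in range(1, 7)]          # one central hub

v = 0.3  # assumed per-component vulnerability, for illustration only
ring_vr = fragility(v, spectral_radius(7, ring_edges))
star_vr = fragility(v, spectral_radius(7, star_edges))
```

Under these assumptions the hub-and-spoke layout has the larger spectral radius, so at equal v its vr is higher than that of the ring.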
The thesis concludes that vulnerability increases with consolidation and optimization, thereby reducing system resilience. While high-consequence, extreme events happen less frequently than smaller-consequence events, with increased vr the big events will be even more damaging.[4] The objective of resilient system design is to reduce v, r, or both. Examples of both pre-consolidation and anticipated post-consolidation portions of the Illinois network were analyzed for resilience by simulating cascades initiated by failure of a randomly chosen node. This analysis demonstrated an increasingly self-organized and fragile system.
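A cascade simulation of this general kind can be sketched in a few lines of Python. This is an illustrative model only and not the simulation used in the thesis: it assumes that a failed node passes its failure across each of its links independently with probability v, and the two twelve-node topologies (three isolated agency clusters versus the same clusters with their hubs interconnected) are hypothetical stand-ins rather than the Illinois network.

```python
import random


def build_neighbors(n, edges):
    """Adjacency lists for an undirected graph with nodes 0..n-1."""
    neighbors = [[] for _ in range(n)]
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)
    return neighbors


def simulate_cascade(neighbors, v, rng):
    """Fail one randomly chosen node and return the final number of failures.

    Assumption: each link from a failed node transmits the failure to a
    still-working neighbor with probability v.
    """
    start = rng.randrange(len(neighbors))
    failed = {start}
    frontier = [start]
    while frontier:
        node = frontier.pop()
        for nb in neighbors[node]:
            if nb not in failed and rng.random() < v:
                failed.add(nb)
                frontier.append(nb)
    return len(failed)


def mean_cascade_size(neighbors, v, trials=10000, seed=0):
    """Average cascade size over repeated random initial failures."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    total = sum(simulate_cascade(neighbors, v, rng) for _ in range(trials))
    return total / trials


# Hypothetical topologies: hubs 0, 4, and 8 each serve three leaf nodes.
cluster_edges = [(hub, hub + k) for hub in (0, 4, 8) for k in (1, 2, 3)]
hub_links = [(0, 4), (4, 8), (0, 8)]

separate = build_neighbors(12, cluster_edges)                  # before
consolidated = build_neighbors(12, cluster_edges + hub_links)  # after

before = mean_cascade_size(separate, v=0.3)
after = mean_cascade_size(consolidated, v=0.3)
```

In this toy model, linking the hubs gives a failure paths between clusters that did not exist before, so the average cascade is larger in the interconnected network at the same v.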
The research concludes that when consolidating and centralizing to save money, system designers and administrators need to offset the downside of centralization and standardization with extra vulnerability-reducing precautions. This typically means that hardware and software systems and communication networks must be hardened even more against accidental failures and deliberate attacks. That is, vulnerability must be reduced to compensate for the increase in self-organization. Mathematically, this means vulnerability (v) must be reduced to offset the increase in spectral radius (r).
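One way to read this offset numerically is to ask how low v must go to keep the product vr from rising after consolidation. The Python sketch below assumes that holding vr at its pre-consolidation level is the target; that target, the function name, and the sample numbers are illustrative assumptions rather than figures from the thesis.

```python
def required_vulnerability(v_before, r_before, r_after):
    """Vulnerability that keeps v * r unchanged when r changes.

    Solves v_after * r_after = v_before * r_before for v_after.
    """
    if r_after <= 0:
        raise ValueError("spectral radius must be positive")
    return v_before * r_before / r_after


# Hypothetical values: spectral radius rises from 2.0 to 3.0.
v_target = required_vulnerability(v_before=0.3, r_before=2.0, r_after=3.0)
```

With these sample values, a 50 percent increase in r requires v to fall by one third to leave vr where it started.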
Because vulnerability is inversely related to resilience, the thesis recommends reducing overall vulnerability in order to compensate for increased self-organization and loss of resilience. Vulnerabilities in an enterprise system can result from a variety of issues, including design flaws in software or hardware and weaknesses in organizational policy. Examples of typical vulnerabilities and countermeasures are provided. Recommendations regarding identified areas of vulnerability in cloud computing, as well as suggestions for mitigating SOC and monoculture, are also provided. Although we cannot mitigate every possible threat, hazard, or vulnerability, organizations should focus on actions that will increase resilience, several of which are proposed. Many of the proposed solutions carry a cost, but they are flexible enough to allow organizations to determine the tradeoffs they are willing to make between efficiency and security.
[1] “Information Technology Sector,” Department of Homeland Security, accessed March 26, 2017, https://www.dhs.gov/information-technology-sector.
[2] Illinois Office of the Secretary of State, Executive Order Consolidating Multiple Information Technology Functions into a Single Department of Innovation and Technology, Illinois Executive Order 2016-01 (Springfield, IL: Senate of Illinois Executive Department, 2016), https://www2.illinois.gov/Documents/ExecOrders/2016/ExecutiveOrder2016-01.pdf.
[3] Ted G. Lewis, Bak’s Sandpile: Strategies for a Catastrophic World, Kindle ed. (Williams, CA: Agile Press, 2011), loc 32.
[4] Lewis, loc 872.