Abstract
In this document, we will detail a novel methodology for determining critical infrastructure within an infrastructure network based on a multi-layer version of the PageRank algorithm. We include all pertinent background and technical definitions to describe the mathematical framework employed. We describe an experimental design towards assessing the value of resource allocations in critical infrastructure networks towards the prevention of cascading network failures. We further include a description of use-case data (Idaho National Laboratory’s (INL) All-Hazards Analysis (AHA)) suitable to test our techniques. We include some summary results of our proposed experiments on the AHA data.
Suggested Citation
Miller, Jacob, Bill Kay, Patrick Mackey, and Samrat Chatterjee. “Multi-Layer Network PageRank for Critical Infrastructure Analysis.” Homeland Security Affairs 20, no. 4, (December 2024). www.hsaj.org/articles23189.
Introduction
This article describes a network science-based method that may be useful for homeland security practitioners to identify key assets within a critical infrastructure system. Specifically, we present a multi-layer PageRank centrality method in a use-case agnostic manner, where each parameter is tunable to practitioner needs. We also discuss how a practitioner can tailor our method for use with their specific data. As a baseline proof-of-concept, we do not incorporate any subjective assumptions beyond the infrastructure asset connectivity data used in our analysis.
From U.S. Code § 5195c: “The term critical infrastructure means systems and assets, whether physical or virtual, so vital to the United States that the incapacity or destruction of such systems and assets would have a debilitating impact on security, national economic security, national public health or safety, or any combination of those matters.” This definition gives a comprehensive conceptual description of critical infrastructure assets and systems in order to understand cascading failures in interconnected infrastructure systems (i.e., networks). Below are working definitions for three classes of infrastructure assets based on their role in a system- vulnerable, key, and critical.
- Vulnerable: Infrastructure assets that are likely to fail because of their attractiveness from an adversarial targeting standpoint along with reachability.
- Key: Infrastructure assets that strategically support several other assets that in turn support much of the network.
- Critical: Infrastructure assets that are both vulnerable and key.
Inspired by prior literature, these definitions conceptually capture the description in U.S. Code § 5195c while adopting language amenable to network analysis.[1] Here we take the view that cascading failures in infrastructure networks are of great detriment.For example, if a power substation that services a water treatment plant were to fail, a direct result may be that the water treatment plant also fails. However, the water treatment plant may in turn service a hospital that cannot operate without clean water, subsequently causing the hospital to fail. Such an indirect impact is an example of a cascading failure. Our definition of criticality is precisely designed to describe which infrastructure assets play significant roles in cascading failure throughout an infrastructure network to allow practitioners to better slow or stop the spread of failure from an inciting event.
Intuitively, vulnerable infrastructure assets depend on other assets, and so an adversarial event in the network is likely to cascade through vulnerable infrastructure assets. On the other hand, key infrastructure assets support many other assets, and so an adversarial event is more likely to cascade forward through other key infrastructure assets. Hence, given an adversarial event, critical infrastructure assets are those that are likely to fail (are vulnerable) and which will propagate failures the furthest (are key).
This article is organized as follows. In Section 2, we provide the necessary mathematical framework to rigorously describe infrastructure network analysis and multi-layer PageRank, our two main methodological approaches. In Section 3, we expound on the connections of multi-layer PageRank and critical infrastructure analysis. In Section 4, we introduce the All Hazards Analysis (AHA) data on which we establish a baseline proof-of-concept. In Section 5, we describe our experimental design for simulation experiments, coupled with ways practitioners can tune our methods to fit their use case. In Section 6, we compare our multi-layer PageRank algorithm results with other network centrality methods and the homogeneous PageRank algorithm. Concluding remarks are included in Section 7.
Multi-Layer Networks
The main mathematical construction we use to analyze critical infrastructure networks is a multi-layer directed graph. First, we formally define a directed graph. Thereafter, we generalize the definition to multi-layer directed graphs. Finally, we describe how to build a multi-layer directed graph to model infrastructure networks for criticality analysis.
A directed graph is a pair where is a (finite) set of nodes (or vertices) and is a set of (directed) edges that consist of ordered pairs of nodes. The first coordinate of an edge is called its tail,and the second coordinate of an edge is called its head. Directed graphs are typically represented visually with nodes as dots and edges with arrows pointing from tail to head.
Directed graphs are useful tools for analyzing data that has pairwise directional relationships. For example, in an infrastructure network, a water treatment plant may depend on a power substation, while the power substation does not depend on the water treatment plant. Hence, the pairwise relationship between these two infrastructure assets is directional. We model this dependency by having a node for each infrastructure asset, and an edge with the power substation as the tail and the water treatment plant as the head. This model is often used to analyze real-world networks. However, directed graphs are largely homogeneous, that is, there is no distinction between various types of nodes that model entities with diverse functions in practice. For example, a directed graph modeling an infrastructure network and its dependencies would make no distinction between a water treatment plant and a power substation. To this end, we introduce the definition of multi-layer directed graphs, which discern edges between different types of entities.
Definition: Multi-layer Directed Graph
A Multi-Layer Directed Graph is a directed graph where:
- is a set of nodes. are specified subsets of (which may or may not overlap). In practice, each is the set of nodes in a given category of interest.
- where is a set of directed edges that consist of ordered pairs of elements of . Informally, is a collection of ordered pairs with the first coordinate in and the second endpoint in . In practice, the edges reflect pairwise-directed relationships in a network.
Here is the set of integers from to We call the directed graph the –layer of This is the directed graph induced by all nodes of a given type. For example, one may want to consider the transportation subnetwork of a larger infrastructure network and focus only on that layer. Multi-layer directed graphs allow practitioners to assign preliminary criticality scores to each layer based on domain-informed knowledge or heuristics and edges based on inter- and intra-layer connections. These initial criticality scores will be used as a baseline input for our analytical framework, which refines the scores based on network structure. In Section 3, we describe an algorithmic method (i.e., Multi-Layer PageRank Algorithm) for analysis of multi-layer directed graphs, and how to apply this mathematical framework to critical infrastructure use cases.
We remark that in our use case in Section 4 the layers are disjoint. These are sometimes referred to as layer-disjoint multi-layer graphs. For a thorough investigation and classification of multi-layer graphs, see Kivelä et al. (2014).[2]
Multi-Layer PageRank and Critical Infrastructure
Google’s PageRank algorithm is the backbone of their successful search engine.[3] Given a directed Internet Graph whose nodes are web pages and there is a directed edge between web pages and if and only if has a hyperlink to, the PageRank score of a web page captures the probability that a user starts at a random web page, clicks links randomly, and arrives at web page . PageRank is one measure of network centrality of a node. The centrality concept helps identify important or influential nodes in a connected network. For a more thorough treatment of network centrality in critical infrastructure networks, see Barabási (2016).[4] While PageRank was developed for application to the directed Internet Graph, PageRank can be applied to random walks (i.e., traversals starting at a random node and moving along a random edge with that node as the tail) through any directed graph. It has been found to be applicable in many data-driven use cases in science and industry.[5] A directed network can be used to study infrastructure systems by taking the nodes to represent infrastructure assets, and asset points to asset if a failure in is likely to cause a failure in . In this context, a random walk illustrates one path a cascading failure may take in an infrastructure network. In the internet-directed graph, the web pages with high PageRank were those that are easily found by randomly traversing a network. Analogously, in an infrastructure network the vertices with high PageRank are the vertices in which random failures are likely to cascade to, which is precisely our definition of vulnerable. In fact, it was observed in Kay et al. (2021) that the definitions of vulnerable, key, and critical infrastructure could be re-framed in the context of PageRank score as follows:
- Vulnerable infrastructure assets correspond to nodes in an infrastructure network with high PageRank score.
- Key infrastructure assets correspond to nodes in an infrastructure network with high reverse PageRank score. That is, the PageRank score when all the directed edges are reversed.
- Critical infrastructure assets correspond to nodes whose PageRank and reverse PageRank scores are simultaneously high.
We refer to a measure of having high PageRank and reverse PageRank a criticality score. It was demonstrated in Kay et al. (2021) that a criticality score was indeed a suitable measure of how critical certain infrastructure assets were in the Homeland Infrastructure Foundation-Level Data (HIFLD) dataset, the criticality scores only marginally out-performed identification of critical infrastructure by ranking nodes based on their in- and out-degrees (that is, the number of edges for which they are the head or tail, respectively). [6] This makes intuitive sense, as in- and out- degrees are first-order approximations of PageRank and reverse PageRank respectively. One major drawback to the PageRank analysis of criticality was the homogeneity of the directed graph model. While a performance increase over in- and out-degree criticality rankings demonstrated that PageRank was identifying some of the correct properties of critical infrastructure, in a standard directed graph model each type of infrastructure asset is treated as the same. In practice, the failure of a power plant may affect the operation of a water treatment plant (which could have legally mandated backup generators) differently than it affects the operation of a train station. Here, we do not describe these differences but remark that practitioners with domain knowledge can determine how strong the dependencies between network assets are. An idealized model would consider the dependencies between each pair of infrastructure assets independently, with a bespoke analysis informing failure probabilities for each edge. In practice, data this granular for each set of infrastructure assets would be cumbersome (if not impossible) to gather. Further, a tool that required this level of precision at the data level would have limited portability across networks. A coarser analysis of the general level of dependencies between layers is more likely to be readily available for a variety of use cases, although we remark that if a practitioner has accurate asset level dependency data, they can take each asset to be its own layer. Thus, we propose a heterogeneous PageRank criticality method that employs a Multi-Layer PageRank Algorithm.
In the cases where we know levels of dependency between collections of types of infrastructure assets, we can model an infrastructure network as a multi-layer directed graph where each infrastructure type is collected as a single layer. Various versions of multi-layer PageRank algorithms exist in the literature. We propose an infrastructure-specific version modeled after Cheriyan and Sajeev (2020), where one can use domain knowledge to rank layers by importance and seed inter- and intra-layer edges with dependency scores.[7] For example, while we do not seek to specify a precise domain, a practitioner of critical infrastructure analysis can determine dependencies across cyber-physical infrastructure systems. Pseudocode for the multi-layer PageRank algorithm can be found in Appendix B, Algorithm 1.
We remark that we make no assertions about the importance of any layer or the level of dependencies between any two layers. Rather, we provide a tool that can be adapted between use cases where layer importance and layer-to-layer dependencies are tunable parameters for domain experts and practitioners to assign for their data. Thus, a multi-layer PageRank algorithm provides a portable and flexible tool that can be modified by practitioners to specific infrastructure networks of interest. We further remark that while our focus is on critical infrastructure identification in service of preventing cascading failures, our multi-layer PageRank algorithm is domain-agnostic and can be deployed in a variety of contexts of interest across the homeland security enterprise. For example, network flow data can be readily modeled as a multi-layer directed graph where the layers are specific types of computer network assets, and critical nodes reveal which network assets are vulnerable and, when compromised, have far-reaching consequences.
Data
For our experiments we are leveraging a data set produced by Idaho National Laboratory as a part of the AHA approach for critical infrastructure dependency analysis.[8] This data set represents multiple layers of critical infrastructure including power grids, water systems, transportation, and others. Dependencies connect elements within a particular type of critical infrastructure (i.e., intra-network), as well as across infrastructure domains (i.e., inter-network). For example, within the energy layer, electric substations may have a dependency on each other or on a power generator. There may also be a dependency across domains to a water layer (for powering water purification) or from the water layer (for cooling a power plant). Multiple data ontologies may also connect the elements of critical infrastructure together based on their functionality or the type of facility represented.
The total size of the infrastructure dependency network is 679,794 nodes and 851,747 edges. There are eight broad sectors of infrastructure represented at varying levels of coverage. Summary statistics of the data can be found in Appendix A, Tables 1 & 2, and a graphical representation of layers by size with annotated sub-layers can be found in Figure 2. Each of these sector categories is further broken into sub-sectors based on the specific kind of infrastructure they represent. However, for our experiments, we will be focusing on the sectors as the fundamental layers in our multi-layer network, while we remark that sub-sector analysis can be used by practitioners for more granular analysis. Some of these layers are incomplete in their full representation of infrastructure in the United States but are still useful as a representative real-world example of how these analytic techniques can be applied.
In addition to the edges within a layer, there are also dependencies across layers. Appendix A, Table 2 shows the pairs of layers in which cross-layer dependencies exist in the AHA data. These are ordered from largest to smallest. Most of the cross-layer dependencies relate to Energy Systems and Water Systems. While cross-sector dependencies may also be presented via matrix-based representations, in Figure 3 we present a graphical representation of the interlayer dependencies between layers. The thickness of each edge represents the number of dependencies between two layers, with the arrow showing the direction of the dependency. For example, both water systems and energy systems have dependencies on each other, but the water systems have more dependencies on the energy systems than the other way around. The color of these arrows represents the layer the originating layer is dependent on. Some layers, such as General Facilities, have many dependencies on other layers of critical infrastructure, but do not have many entities dependent on them. The size of each layer in this diagram represents the frequency of cross-layer dependencies associated with it, not the total size of the layer. Due to this, the largest of the layers (Transportation Systems) appears relatively small in Figure 3.
Experimental Design
We will evaluate our model empirically. Here, we design an experiment to identify the criticality score of each node in an infrastructure network, and we assess their role in cascading failures, along the lines of the experiment posed in Kay et al. (2021). We first obtain criticality rankings (ordered list from most to least critical) of vertices from different sources: Multi-layer PageRank, traditional PageRank, ordered by degree, and a random ranking.
To test, we apply the widely used linear threshold simulation model.[9]. In this model, a vertex can have one of two statuses: either infected or susceptible. For critical infrastructure, one can think of infected vertices as infrastructure assets that have failed and can no longer support infrastructure assets that are dependent on it. Vertices become infected by being adjacent to other infected vertices; if the proportion of a vertex’s infected neighbors is larger than some threshold (model parameter ), that vertex also becomes infected. In this way, infection propagates through the network over time until it can spread no further.
For the purposes of our experiment, we assume that an agent has only finitely many resources to protect vertices from becoming infected and must choose some set of vertices to allocate this protection. Protected nodes cannot become infected as they are adequately equipped to handle threats. The proportion of the network that can be protected is a second parameter in our model, .
We then execute this simulation, protecting the top proportion of nodes according to either multi-layer PageRank, PageRank, out-degree, or randomly and analyze how many vertices become infected. The smaller the set of nodes in a network that are eventually infected, the more resilient a network is to cascading failures. Tests were done using Python 3[10] with the packages – NetworkX[11] and Network Diffusion Library[12]. We hypothesize that our multi-layer PageRank protection strategy results in fewer nodes failing in a cascading failure scenario – experimental results are discussed next.
Results
We ran the experiments described in Section 5 on an array of choices for parameters and The purpose of the experiment was to propose the multi-layer PageRank approach to critical infrastructure analysis. While we do not provide performance graphs for an entire grid of model parameter choices, we noted that in almost every instance, the nodes identified as critical by the multi-layer PageRank algorithm indeed stopped the spread of cascading network failures better than the nodes identified by PageRank. PageRank typically does the second best (but is often equally performant with out-degree, consistent with the findings in Kay et al. (2021), out-degree is often comparable to PageRank but often performs worse, and in every instance the randomly selected resource allocation induced the largest fraction of infected nodes. Further, in the instances where multi-layer PageRank was not largely the most performant measure of resource allocation the cases were somewhat degenerate, in that multi-layer PageRank, PageRank, and out-degree were equally performant. A selection of typical performances is shown in Figure 4.
Concluding Remarks
Our experimental design is intended to broadly apply to diverse use cases relevant to practitioners from different infrastructure domains. In this regard, we left parameter selections general in order to provide a baseline proof of concept without imposing domain constraints. Experimental design choices for further investigation may include:
- The initial set of infected nodes. In our experiment we infect a proportion of the nodes initially. A practitioner may want to see the effects of the failure of a single node (resulting from a targeted attack), a region of nodes (resulting from a natural event), or a sector of nodes (simulating an entire system failure such as power grid collapse). The initial set of nodes can be chosen to suit practitioners’ needs. The initial failing nodes can be selected according to any probability distribution the practitioner deems necessary.
- Levels of protection. In our example, we choose to “fully” protect individual nodes. A practitioner can vary susceptibility on an edge by edge basis.
- The failure threshold. Practitioners can use domain knowledge to provide a different failure threshold at the asset or layer level of precision.
About the Authors
Jacob Miller is a PhD intern in the National Security Directorate at Pacific Northwest National Laboratory. His research interests include network visualization and network algorithms, with an emphasis on applying these techniques in realistic applications. He received his BS in Computer Science from Southwestern Oklahoma State University in 2020 and is currently studying at the University of Arizona.
Dr. Bill Kay is a mathematician with the Algorithms, Combinatorics, and Optimization Group at Pacific Northwest National Laboratory (PNNL). His main research area is in graph theory and network analysis, with a focus on algorithmic design, theory, and application spaces which include critical infrastructure analysis and cybersecurity. Bill received his Ph.D. in Mathematics and M.Sc. in Computer Science from Emory University in 2017. Before joining PNNL, Bill was a Data Scientist at Oak Ridge National Laboratory (ORNL).
Patrick Mackey is a senior data scientist in the National Security Directorate at Pacific Northwest National Laboratory, where he has worked in computer science research and software development since 2004. He specializes in applying network science and visual analytics to national security problem domains, including critical infrastructure security, cyber security, and social network analysis. He received his MS in Computer Science from Washington State University in 2015, and his BS in Computer Science from the same university in 2004.
Dr. Samrat (Sam) Chatterjee is a Chief Data Scientist and Team Lead with the Data Sciences and Machine Intelligence Group at U.S. Department of Energy’s Pacific Northwest National Laboratory (PNNL) in Richland, WA. He serves as PI/PM and key technical staff on national security and computational science research efforts in support of multiple sponsors (DHS, DOE, and DoD), and has over 13 years of experience in infrastructure network resilience modeling, homeland security risk and decision analytics, and multi-agent learning and optimization. Dr. Chatterjee is an Affiliate Professor at Northeastern University-Boston and serves as an Associate Editor for Frontiers in Water journal. He has authored 2 books, 6 book chapters, and over 90 peer-reviewed journal articles, conference papers, and technical reports, and received multiple best paper and poster awards. He is a Senior Member of IEEE and holds a doctorate in Civil Systems Engineering from Vanderbilt University.
Acknowledgements
This research was supported by the National Infrastructure Simulation and Analysis Center (NISAC), a program of the U.S. Cybersecurity and Infrastructure Security Agency’s (CISA’s) National Risk Management Center. Pacific Northwest National Laboratory is operated by Battelle Memorial Institute for the U.S. Department of Energy under Contract No. DE–AC05–76RL01830. The authors would also like to thank the reviewers for their helpful feedback.
Appendix A.
Appendix B.
Biblography
Barabási, Albert-László. Network Science. Cambridge University Press, 2016.
Geospatial Management Office. “Homeland Infrastructure Foundation-Level Data (HIFLD).” U.S. Department of Homeland Security. 2020. https://hifld-geoplatform.opendata.arcgis.com/.
———. “Homeland Infrastructure Foundation-Level Data (HIFLD).” U.S. Department of Homeland Security. 2021. https://hifld-geoplatform.opendata.arcgis.com/.
Gleich, David F. “PageRank Beyond the Web.” SIAM REVIEW 57, no. 3 (2015): 321–363. https://doi.org/10.1137/140976649.
Granovetter, Mark. “Threshold Models of Collective Behavior.” American journal of Sociology 83, no. 6 (1978): 1420-1443. https://doi.org/10.1086/226707.
Hagberg, Aric A., Daniel A. Schult, and Pieter J. Swart. “Exploring Network Structure, Dynamics, and Function Using NetworkX.” In Proceedings of the 7th Python in Science Conference (SciPy2008), edited by Gäel Varoquaux, Travis Vaught, and Jarrod Millman, (Pasadena, CA, 2008): 11–15.
Hruska, Ryan C., Kent E. McGillivary, and Robert M. Edsall. “A Functional All-Hazard Approach to Critical Infrastructure Dependency Analysis.” Journal of Critical Infrastructure Policy 2, no. 2 (2021): 103-123. https://doi.org/10.18278/jcip.2.2.6.
Kay, Bill, Hao Lu, Pravallika Devineni, Anika Tabassum, Supriya Chintavali, and Sangkeun Matt Lee. “Identification of critical infrastructure via pagerank.” In 2021 IEEE International Conference on Big Data (Big Data), (Boston: IEEE, 2021), 3685–3690. https://ieeexplore.ieee.org/document/9671620.
Kivelä, Mikko, Alex Arenas, Marc Barthelemy, James P. Gleeson, Yamir Moreno, and Mason A. Porter. “Multilayer networks.” Journal of Complex Networks 2, no. 3 (2014): 203-271. https://academic.oup.com/comnet/article/2/3/203/2841130.
Page, Lawrence, Sergey Brin, Rajeev Motwani, and Terry Winograd. The Pagerank Citation Ranking: Bring Order to the Web. Technical Report. Stanford University, 1998, 1–17. http://ilpubs.stanford.edu:8090/422/.
Rossetti, Giulio, L. Milli, S. Rinzivillo, A. Sirbu, D. Pedreschi, and F. Giannotti. “NDlib: a Python Library to Model and Analyze Diffusion Processes Over Complex Networks.” Journal of Data Science and Analytics, (2017). https://doi.org/0.1007/s41060-017-0086-6.
Van Rossum, G., and F. L. Drake. Python 3 Reference Manual. Scotts Valley, CA: CreateSpace, 2009. https://www.python.org/downloads/.
Notes
[1] Bill Kay et al., “Identification of Critical Infrastructure via Pagerank,” in 2021 IEEE International Conference on Big Data (Big Data), (Boston: IEEE, 2021), 3685–3690, https://ieeexplore.ieee.org/document/9671620.
[2] Mikko Kivelä et al., “Multilayer Networks,” Journal of Complex Networks 2, no. 3 (2014): 203–271, https://academic.oup.com/comnet/article/2/3/203/2841130.
[3] Lawrence Page et al., The Pagerank Citation Ranking: Bring Order to the Web, Technical Report (Stanford University, 1998), 1–17, http://ilpubs.stanford.edu:8090/422/.
[4] Albert-László Barabási, Network Science (Cambridge University Press, 2016), 475. http://networksciencebook.com/.
[5] David F. Gleich, “PageRank Beyond the Web,” SIAM Review 57, no. 3 (2015): 321–363, https://doi.org/10.1137/140976649.
[6] “Homeland Infrastructure Foundation-Level Data (HIFLD),” Geospatial Management Office, U.S. Department of Homeland Security, 2021, https://hifld-geoplatform.opendata.arcgis.com/.
[7] Jo Cheriyan and G. P. Sajeev, “An Improved Pagerank Algorithm for Multilayer Networks,” in 2020 IEEE International Conference on Electronics, Computing and Communication Technologies (CONECCT), (Boston: IEEE, 2020), 1–6, https://doi.org/10.1109/CONECCT50063.2020.9198566.
[8] Ryan Hruska, Kent McGillivary, and Robert Edsall, “A Functional All-Hazard Approach to Critical Infrastructure Dependency Analysis,” Journal of Critical Infrastructure Policy 2, no. 2 (2021): 103–123, https://doi.org/10.18278/jcip.2.2.6.
[9] Mark Granovetter, “Threshold Models of Collective Behavior,” American Journal of Sociology 83, no. 6 (1978): 1420–1443, https://doi.org/10.1086/226707.
[10] G. Van Rossum and F. L. Drake, Python 3 Reference Manual, (Scotts Valley, CA: CreateSpace, 2009), www.python.org/downloads/.
[11] Aric A. Hagberg, Daniel A. Schult, and Pieter J. Swart, “Exploring network structure, dynamics, and function using NetworkX,” in Proceedings of the 7th Python in Science Conference (SciPy2008), ed. Gäel Varoquaux, Travis Vaught, and Jarrod Millman, (Pasadena, CA, 2008): 11–15.
[12] Giulio Rossetti et al., “NDlib: a Python Library to Model and Analyze Diffusion Processes Over Complex Networks,” Journal of Data Science and Analytics (2017), https://doi.org/10.1007/s41060-017-0086-6.
Copyright
Copyright © 2024 by the author(s). Homeland Security Affairs is an academic journal available free of charge to individuals and institutions. Because the purpose of this publication is the widest possible dissemination of knowledge, copies of this journal and the articles contained herein may be printed or downloaded and redistributed for personal, research or educational purposes free of charge and without permission. Any commercial use of Homeland Security Affairs or the articles published herein is expressly prohibited without the written consent of the copyright holder. The copyright of all articles published in Homeland Security Affairs rests with the author(s) of the article. Homeland Security Affairs is the online journal of the Naval Postgraduate School Center for Homeland Defense and Security (CHDS).