Abstract
Previous failures in effective, large-scale disaster response (e.g., Hurricane Katrina) are often traced to failures in effective coordination. As evidenced in after-action reports, however, assessments of coordination performance are still largely anecdotal in nature. Network analysis is a possible means to develop quantitative metrics for coordination assessment. In this paper, two techniques are proposed for characterizing coordination performance. First, Borgatti’s technique for quantifying network fragmentation was used to measure the extent to which various response agencies play a role in establishing efficient communications. Second, Girvan and Newman’s technique for community sub-group identification was used to identify potential breakdowns in information transfer. Both techniques were successfully implemented in a case-study analysis of the Top Officials 4 exercise. The techniques can provide additional insights into coordination performance, identifying exercise artificialities and allowing meta-analysis of coordination performance (e.g., over time, across regions, for different event scales).
Suggested Citation
Su, Yee San. “Application of Social Network Analysis Methods to Quantitatively Assess Exercise Coordination.” Homeland Security Affairs 7, Article 17 (December 2011). https://www.hsaj.org/articles/51
Introduction
Coordination, or lack thereof, has been identified as a key bottleneck in effective management of disasters such as Hurricane Katrina.1 Large-scale events frequently demand more complex forms of organization, larger quantities of resources, and access to specialized equipment and personnel under conditions of decreased situational awareness.2 As a result, the challenge of establishing effective coordination may increase nonlinearly with respect to increases in various event scales (e.g., size, severity). Unfortunately, quantitative metrics for measuring coordination performance do not yet exist. Assessments of coordination performance — such as those found in after-action reports — are still predominantly anecdotal. Development of quantitative metrics to characterize coordination would allow for a more robust method of measuring response coordination progress and facilitate our understanding of how coordination is negatively affected by event scales.
Social network analysis is a possible means to obtain meaningful, quantitative metrics. Recent years have seen an increase in the application of social network analysis concepts to homeland security. For example, Naim Kapucu cast the evolution of national response frameworks as a series of network graphs.3 Moreover, researchers have studied coordination and emergent player roles for events such as Hurricane Katrina and 9/11.4 Meanwhile, social network analysis has also seen the development of a variety of new quantitative methods to characterize networks.
To this end, two possible techniques are proposed to generate quantitative measures of coordination: (1) a centrality measure introduced by Stephen Borgatti to address what he called the negative variation of the Key Player Problem (KPP-Negative);5 and (2) a technique for detecting community structure developed in a series of papers by Girvan and Newman.6 Borgatti’s technique is used to identify key coordinating agencies and could potentially be used to chart the development of emergency operations centers (EOCs), as well as provide an outcome value for statistical analysis.7 Meanwhile, community sub-group examination via Girvan and Newman’s technique could be used to identify potential information stove-piping. The underlying basis and reasons for selecting these techniques are discussed in the Data and Approach section.
These approaches were applied in an analysis of evaluator logs from the Portland, Oregon site of the 2007 Top Officials 4 (TOPOFF 4) exercise. TOPOFF 4 provides a rare opportunity to examine coordination in the context of a large-scale, catastrophic event. Thus far, evaluator records from TOPOFF 4 have been used to support construction of various after-action reports.8 The aim here was to explore whether additional insights on coordination could be obtained from the extensive database of communication-related information collected, using the frequency of communication as a proxy for coordination effectiveness. While highlighting the potential of social network analysis, this article also points out the need for additional research and validation. Thus, it includes recommendations for improving data collection in future exercises.
Data And Approach
TOPOFF 4 Dataset
TOPOFF 4 is one of what are now designated Tier I National Level Exercises (NLEs). Given the infrequent occurrence of catastrophes, few events involve the full spectrum of the response community (spanning vertically across all levels of government and geographically across regions, and involving both non-governmental organizations and the private sector). Tier I NLEs provide a rare glimpse of how coordination fares in a catastrophic context, including participation at local, county, state, and federal levels, as well as by private sector and non-governmental organizations (NGOs). Since the original TOPOFF in 2000, a primary goal has been “to improve the capability of government officials and agencies, both within the United States and abroad to provide an effective, coordinated, and strategic response to a terrorist attack.”9 TOPOFF 4 took place from October 15-19, 2007, involving more than 15,000 federal, state, local, and private sector participants. The scenario for TOPOFF 4 involved detonations of multiple radiological dispersal devices, with a coordinated series of attacks taking place in Guam; Portland, Oregon; and Phoenix, Arizona.
The specific data source used for this analysis was the TOPOFF 4 Full Scale Exercise Reconstruction Database, which is a Microsoft Access database primarily composed of evaluator “log-book” style entries recorded during the course of the exercise. In total, this database contains 14,100 records. To simplify content analysis, only records specific to the Portland, Oregon location were analyzed. Since evaluators were not collecting information for the specific purpose of constructing a communication network, limitations to this dataset exist, which include but are not limited to:
- Entries contingent on what evaluators considered to be important;
- Entry definition issues: namely, evaluators may have delineated entries based on content instead of communication instance (i.e., multiple, separate entries may have been taken from a single conversation);
- Accounting for passive means of communication (e.g., website postings, WebEOC, emails);
- Collection bias due to the availability and placement of evaluators;
- Evaluator versus player awareness (i.e., information recorded by the evaluator is not necessarily information that has been effectively conveyed to all players at the location);
- Inconsistent use of terminology;
- Referencing by name instead of position;
- Failure to identify injects and simulated players; and
- Failure to identify all participants involved in conferences/meetings/teleconferences.
Even so, the database contains some of the highest quality data to date on exercise communication recorded as it was taking place. As such, it is not subject to some of the shortcomings associated with post-event attempts at reconstructing communication networks (e.g., recollection bias).
Coding the Data
Among the fields recorded by evaluators were time, description, and location. Descriptions were reviewed three times to identify communications taking place between players. The first review was broken down by site (the Portland, OR location included multiple sites of exercise play). The second review was performed with all entries listed in chronological order; this was done to enforce consistency in coding and remove duplicate entries recorded by multiple evaluators.10 The third and final review was done for those players with small numbers of communication counts. In this case, targeted keyword queries of the Access database were used to ensure counts were as accurate as possible.
A positive communication count was tallied for each instance in which “from” and “to” parties could be identified. In some cases, multiple instances of communication were detailed in the same description; these were split into separate entries. Directed (or one-way) communications were noted when possible; however, the vast majority of communications were undirected in nature. Failures to communicate (e.g., unanswered phone calls) were also noted; each was coded as a negative count of communication (i.e., equal to “-1”).
Identification of unique players was done iteratively, beginning with a set of agencies identified in the TOPOFF 4 after-action reports. In some cases, larger agencies such as the Department of Energy fielded multiple teams. These were sometimes treated as independent nodes given a review of their location and communication entries. With respect to EOCs, communications received by liaisons were assumed to come from their parent organizations. In all cases, information was assumed to be communicated by the liaison to the emergency operation center leadership (but not necessarily to other liaisons or players located at the EOC unless indicated). Command staff (e.g., Director) were treated as one node, representing the overall emergency operation center.
Ideally, coding would have been performed shortly after the exercise to allow for follow-up with evaluators in cases where descriptions lacked context/details (e.g., a communication partner was not identified). Lacking this information, available documents and after action reports were reviewed to assist in: (1) deciphering varying nomenclature; (2) identifying actual versus simulated players; and (3) determining periods of participation.
Analysis Metrics
As noted by Weigand and others, “the rationale behind coordination is the existence of dependencies between the activities of entities, and…the goal of coordination is to manage these dependencies in such a way that the activities become parts of a purposeful whole.”13 Intrinsic to this is the argument that actors seek to reach a “shared understanding,” which then allows them to coordinate their actions. While communication is not the only mechanism by which to achieve this shared understanding (others include precedents and standard operating procedures), it is certainly a foundational building block. In their review of Hurricane Katrina, the Select Bipartisan Committee to Investigate the Preparation for and Response to Hurricane Katrina noted that many of the problems they identified “could be categorized as ‘information gaps’ – or at least problems with information-related implications, or failures to act decisively because information was sketchy at best.”14 Hence, it should not come as a surprise that communication can be linked to coordination.
Prior to conducting any social network analysis, the definition of what constitutes an edge, and of what edge values mean in weighted networks, must be decided upon. Raw communication counts were not used to describe the edge strength between nodes. TOPOFF 4 took place over a number of days, and not all players participated for the full duration of the exercise. To avoid biasing edge connectivity in favor of players who participated for the entire exercise, the period of participation for each player was estimated either by registering their first and last entries for each day or, for those players infrequently mentioned, by referencing the participation time frame of the primary agency they associated with (for each day) during the exercise. Overlapping time intervals of participation were determined for all pair-wise combinations of players.
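The overlap step described above, and the normalization it feeds, can be sketched in a few lines. This is a minimal illustration rather than the author's procedure: the function names, the player labels, the interval values, and the choice to drop pairs with a zero or negative net count are all assumptions made for the example.

```python
from itertools import combinations

def overlap_hours(intervals_a, intervals_b):
    """Total hours during which both players were participating.

    Each argument is a list of (start, end) pairs in hours since exercise
    start, assumed non-overlapping for a single player (e.g., one per day).
    """
    total = 0.0
    for start_a, end_a in intervals_a:
        for start_b, end_b in intervals_b:
            total += max(0.0, min(end_a, end_b) - max(start_a, start_b))
    return total

def communication_frequencies(participation, counts):
    """Convert net pairwise counts into communications per shared hour."""
    freqs = {}
    for a, b in combinations(sorted(participation), 2):
        hours = overlap_hours(participation[a], participation[b])
        # Treat communication as undirected: pool counts in both directions.
        n = counts.get((a, b), 0) + counts.get((b, a), 0)
        # Assumption: pairs with no shared time or a non-positive net count
        # get no edge.
        if hours > 0 and n > 0:
            freqs[(a, b)] = n / hours
    return freqs

# Hypothetical participation windows and net communication counts.
participation = {
    "State ECC": [(0.0, 10.0), (24.0, 34.0)],
    "County EOC": [(2.0, 10.0), (24.0, 30.0)],
    "Port Office": [(26.0, 32.0)],
}
counts = {("County EOC", "State ECC"): 7, ("Port Office", "State ECC"): 3}
freqs = communication_frequencies(participation, counts)
```

In this toy case the State ECC and County EOC share 14 hours of play, so 7 communications become 0.5 per hour; the Port Office pair reaches the same frequency from only 3 communications because its shared window is shorter.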
These overlapping time intervals were used to normalize the raw communication counts, converting them into frequencies of communication (communications per hour of shared participation). These values were then used as the basis for network construction. Social network analysis was predominantly conducted using iGraph library subroutines in R. Specifics as to the two social network analysis techniques used are discussed below.
KPP-Negative for Valued Edges
Centrality metrics measure the importance of a particular node or edge within the overall network. One class of centrality measures is based on the concept of betweenness. For each pair of nodes in a network, the geodesic (or shortest path between the two nodes) is determined. The fraction of all shortest paths that pass through a given node determines its betweenness value. Higher betweenness scores indicate greater control over communication, since more communication optimally passes through this node.15
There are, however, a couple of shortcomings in the use of traditional betweenness to evaluate the importance of a network node in facilitating coordination. Basic betweenness centrality does not account for the ability of a system to compensate for the absence of a node through alternative communication pathways. In this sense, it does not provide a stiff enough penalty to fragmentation of a network resulting from the absence of a node. One example is shown in Figure 1. Here, the function of Node 8 as a key communication bridge in this network is obvious: loss of this node results in fragmentation of the network into two isolated components. However, as calculated, Node 1 is shown to have the larger betweenness score, although its loss simply shifts communication to alternative (albeit longer) communication pathways.
Figure 1. Betweenness scores for two nodes in a hypothetical network. As shown, while loss of Node 8 will result in fragmentation of the network into two components, the betweenness score for Node 1 is higher.
Figure recreated based on an example provided by Borgatti.16
The second issue with some centrality measures is that they account for fragmentation only; there is no measure of the quality of communication taking place within components. This is a problem with measures that identify whether a path exists between two nodes, but fail to consider the corresponding path length (which can be quite large).17 For example, both networks in Figure 2 have two components. However, the path length of communication between Node 1 and Node 5 within each network is very different. In other words, the shape of each of the components must be considered.
Figure 2. Example of two networks exhibiting the same number of components (two), but vastly different connectedness within each fragment. Figure recreated based on an example provided by Borgatti.18
Borgatti’s technique addresses both of the aforementioned issues. To determine the relative importance of each node within a network, that node is first removed from the network, and the corresponding change in the fragmentation, F, is determined. F is calculated according to the equation (Equation 1):
$$F = 1 - \frac{2\sum_{i>j}\frac{1}{d_{ij}}}{n(n-1)} \tag{1}$$
where: $d_{ij}$ = minimum path length between nodes i and j, and n = number of nodes.
The use of the minimum path length concept allows for consideration of shape effects, whereas the inverse functional form for path length ensures that the function is well behaved for node pairs located in different fragments (an infinite path length simply contributes zero). Borgatti developed the F-value to range from 0 to 1, with higher values corresponding to greater network fragmentation. Implicit in the functional form of Equation 1 is that it is applicable to non-weighted networks (i.e., all edges have values of 1). For networks with valued edges, the expression becomes (Equation 2):
$$F = 1 - \frac{2\sum_{i>j}\frac{D_{min}}{d_{ij}}}{n(n-1)} \tag{2}$$
where the only difference is the inclusion of the normalization term $D_{min}$. Note that because more frequent communication corresponds to a shorter path, the contribution of each individual edge to the path length is the inverse of the frequency of communication between its two nodes.
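To make the fragmentation calculation concrete, the following is a small pure-Python sketch; it is not the author's R code. It assumes the valued-edge form F = 1 − 2 Σ_{i>j} (D_min / d_ij) / (n(n − 1)), with each edge length taken as the inverse of its communication frequency and D_min held fixed at the inverse of the highest frequency while nodes are removed. The function names and the three-player network are hypothetical.

```python
import heapq

def shortest_lengths(adj, source):
    """Dijkstra: minimum path length from source to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, length in adj[u].items():
            nd = d + length
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def fragmentation(freq, nodes, d_min=None):
    """Distance-weighted fragmentation F for a network with valued edges."""
    nodes = list(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0  # no node pairs, so nothing to fragment
    adj = {u: {} for u in nodes}
    for (u, v), f in freq.items():
        if u in adj and v in adj and f > 0:
            adj[u][v] = adj[v][u] = 1.0 / f  # edge length = inverse frequency
    lengths = [l for nbrs in adj.values() for l in nbrs.values()]
    if not lengths:
        return 1.0  # no edges: every pair is disconnected
    if d_min is None:
        d_min = min(lengths)
    total = 0.0
    for u in nodes:
        for v, d in shortest_lengths(adj, u).items():
            if v != u:
                total += d_min / d  # unreachable pairs contribute zero
    # `total` counts every pair twice, i.e. it equals 2 * sum over i > j.
    return 1.0 - total / (n * (n - 1))

def removal_impact(freq, nodes):
    """Percent change in F when each node is removed in turn."""
    nodes = list(nodes)
    d_min = 1.0 / max(freq.values())  # assumption: fixed across removals
    base = fragmentation(freq, nodes, d_min)  # must be non-zero below
    change = {}
    for k in nodes:
        rest = [u for u in nodes if u != k]
        change[k] = 100.0 * (fragmentation(freq, rest, d_min) - base) / base
    return base, change

# Hypothetical chain A - B - C with frequencies in communications per hour.
freq = {("A", "B"): 2.0, ("B", "C"): 0.5}
base, change = removal_impact(freq, ["A", "B", "C"])
```

Removing the bridging player B leaves A and C disconnected, so B shows the largest increase in F, which is the sense in which the technique ranks key coordinating players.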
Two possibilities for $D_{min}$ naturally spring to mind. In the case where the highest frequency value is reasonable, this value can be used to fix $D_{min}$. In cases where this value falls short of expected levels of communication, a reasonable maximum value can be set (e.g., 1 communication/hour).
Sub-group Identification
Girvan and Newman proposed an algorithm for community structure identification, which Newman generalized for use with weighted networks.20 The method relies on the use of edge betweenness, which is the edge variant (instead of nodes) of betweenness discussed earlier. Edges with high betweenness values can be thought of as bottlenecks to information flow. Girvan and Newman argue that the reason these edges are bottlenecks is that they are really intercommunity edges: those few edges that connect otherwise-unconnected portions of the network. Hence, their removal will result in isolation of sub-groups.
Successive identification and removal of the highest betweenness valued edges can be mapped to a dendrogram (see Figure 5 for example), from which hierarchical patterns of community structure can be seen. However, to address the question, “How many communities should a network be split into?” a threshold criterion must be applied. Thus, Newman and Girvan introduced the concept of modularity, Q, which they define as the difference between the fraction of edges falling within communities and the fraction expected if edges were assigned at random. More formally, Q is calculated using the equation (Equation 3):21
$$Q = \frac{1}{2m}\sum_{ij}\left[A_{ij} - \frac{k_i k_j}{2m}\right]\delta(c_i, c_j) \tag{3}$$
where i and j are player indices, $A_{ij}$ is the weight of the connection between players i and j, m is the number of edges in the network, and $k_i$ and $k_j$ are the degree values for players i and j, respectively. Similarly, $c_i$ and $c_j$ are the sub-groups to which players i and j are assigned, and $\delta(c_i, c_j)$ is defined as 1 if $c_i = c_j$, and 0 otherwise. Nonzero values indicate deviations from randomness, with a maximum possible Q-value of 1.
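As a worked illustration of the modularity definition above, the sketch below evaluates Q for a given assignment of players to sub-groups. It is a pure-Python example, not the iGraph routine used in the study; the function name and the six-node network are hypothetical, and for weighted networks it assumes m is the total edge weight and each degree is the sum of a player's edge weights.

```python
def modularity(weights, membership):
    """Q = (1/2m) * sum_ij [A_ij - k_i*k_j/(2m)] * delta(c_i, c_j).

    `weights` maps undirected edges (i, j) to their weight, each edge listed
    once; `membership` maps every node to its sub-group label.
    """
    strength = {node: 0.0 for node in membership}
    two_m = 0.0
    for (i, j), w in weights.items():
        strength[i] += w
        strength[j] += w
        two_m += 2.0 * w
    # Observed term: each within-group edge appears twice in the sum over ij.
    observed = sum(2.0 * w for (i, j), w in weights.items()
                   if membership[i] == membership[j])
    # Expected term: sum over groups of (total group strength)^2 / 2m.
    group_strength = {}
    for node, group in membership.items():
        group_strength[group] = group_strength.get(group, 0.0) + strength[node]
    expected = sum(s * s for s in group_strength.values()) / two_m
    return (observed - expected) / two_m

# Hypothetical network: two triangles joined by a single bridging edge.
edges = {(1, 2): 1.0, (1, 3): 1.0, (2, 3): 1.0,
         (4, 5): 1.0, (4, 6): 1.0, (5, 6): 1.0,
         (3, 4): 1.0}
two_groups = {1: "a", 2: "a", 3: "a", 4: "b", 5: "b", 6: "b"}
one_group = {node: "all" for node in two_groups}

q_split = modularity(edges, two_groups)   # 5/14, about 0.357
q_single = modularity(edges, one_group)   # 0.0
```

Splitting at the bridge scores 5/14, while lumping all six nodes together scores zero; comparing such values across candidate divisions is how a level of the dendrogram is selected.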
Based on evaluations of a variety of case networks with known sub-groups, Newman and Girvan found that modularity values of ~0.3 or more usually indicate good divisions. Upon generating the hierarchical dendrogram, the modularity of each level within the dendrogram is calculated, and the sub-grouping with the highest modularity value is chosen.
Results and Discussion
Coding Results and Classification
A total of 3681 evaluator entries associated with the Portland, Oregon portion of TOPOFF 4 were coded. Since some entries contained information on multiple communications or meetings, 4241 entries were obtained after coding. Of these, 2128 (50 percent) contained information that could be cast as a “from-to” communication; 354 (8.3 percent) were duplicate entries resulting from more than one evaluator recording the same event; 318 (7.5 percent) were entries in which one or both of the communicating parties were not identified; and 64 (1.5 percent) were instances where evaluators noted a failure to achieve communication (e.g., an unanswered call). Remaining entries were unrelated to communication between players.
A visual representation of the resulting network is shown in Figure 3. One hundred sixty-five distinct players are represented as nodes; each is assigned a numerical label. In the figure, degree values are used to size node radii.22 The adjacency matrix of communication between players is highly sparse: of the 13,530 possible edge combinations, only 741 (5.5 percent) player pairs exhibit a non-zero frequency of communication. The distribution of these frequencies of communication exhibits an exponential decay (see Figure 4), with the majority (76 percent) of edges valued at less than 0.2 instances of communication per hour. The maximum frequency of communication observed was approximately three instances of communication per hour.
Figure 3. Network representation of the Portland, Oregon site of TOPOFF 4. Each numerically labeled node represents a unique player participating in the exercise.
Node radii were scaled based on degree value. The degree of a node is simply a count of the number of other nodes with which it is directly connected.
Figure 4. Distribution of the TOPOFF 4 (Portland, Oregon site) network edge values; edge values assigned based on the frequency of communication taking place between player pairs.
Since part of the benefit of an exercise such as TOPOFF is to examine the effectiveness of interactions taking place between levels of government (e.g., local, county, state, federal), player communications were also sorted into the categories shown in Table 1. As listed, communication volumes were quite low for private, volunteer, and media players, perhaps reflecting low participation and incomplete integration of these player types into the exercise.23 Communication between agencies operating at the same level of government was the highest. Meanwhile, communication with players at adjoining levels of government tended to be higher than communication across multiple levels of government. This is expected from a hierarchical communication structure in which a player communicates predominantly with players at the same level or immediately above or below them in the organizational hierarchy.
Table 1. TOPOFF 4 Portland, Oregon site: Matrix of communication across player types by percentage
Unsuccessful Communication
Early in the aftermath of an event, the chaos that results, the desire to rapidly attain situational awareness, and the eagerness to bring resources to bear create a more frenetic pace to the response. This presumably settles down as the response matures. Table 2 lists the daily percentage of unsuccessful communications occurring for the exercise. Data show that the higher number of unsuccessful communications taking place on the first two days is a result of the higher volume of communication taking place. No trends in behavior across days were readily observed.
This may be due to exercise artificiality (e.g., information injects), transitions in mission command (from life safety to crime scene investigation to site assessment), and/or incorporation of smaller component exercises (e.g., a Medical Care Point exercise) within TOPOFF 4. In other words, what we are really seeing is something akin to an exercise of exercises.
Table 2. Unsuccessful communications as represented by daily counts and percent of total daily communications
KPP-Negative for Valued Edges
Network fragmentation was observed to be very high in the base case, with an F-value equal to 0.972. This is due to the sparseness of the matrix of communication edges. The top five players based on percent change in F are shown in Table 3. These are the players whose absence resulted in the greatest increase in network fragmentation. Of these, the top four are response-coordinating entities representing each level of government. Results confirm their critical role in facilitating communication exchange. In contrast, while playing a significant role in initial response activities, dispatch (ranked 17 out of 165 players), police (ranked 45) and fire (ranked 54) had less significant impacts on overall communication fragmentation in the exercise. Thus, relatively speaking, players expected to be central to coordination functioned as such in this exercise.24
Table 3. Percent change in fragmentation value associated with the removal of players from the network of participants. Top five players for the TOPOFF 4 Portland, Oregon site listed.
Sub-group Identification
The resulting dendrogram of player relationships is shown in Figure 5. To simplify representation of the dendrogram, going from the top of the figure to the bottom, the state of community sub-groupings is shown at regular intervals of twenty-five edges removed from the network.25 The bottom row of the dendrogram corresponds to all players participating in the exercise. Players connected at lower levels of the dendrogram have stronger ties to one another.
Figure 5 is meant to highlight the rich layering of community structure found within the exercise, as captured by this technique. The various sub-groups identified are discussed below.
Figure 5. Dendrogram showing the community structure of players involved in the TOPOFF 4 Portland, Oregon site. The bottom row lists all 165 players identified. Horizontal lines are indicative of sub-group relations. The presence of horizontal lines closer to the bottom of the dendrogram is indicative of greater closeness among the players connected.
To determine the correct number of sub-groups, modularity values were calculated at all levels of the dendrogram. Results are shown in Figure 6. A maximum modularity of 0.371 was obtained, which exceeds the threshold of 0.3 that Girvan and Newman indicated gave good divisions.26 At this value, the community is composed of numerous individual player nodes, four sub-groups of two players, and twelve sub-groups of three or more players. With respect to the larger sub-groups of three or more players, two arise from incomplete integration of smaller, one-day exercises within the main exercise. Three other sub-groups stem from peripheral players that had limited, sector-specific participation (e.g., transportation, utility). Overall, no cases were found in reviewing each sub-group’s player list in which players appeared to be incorrectly placed (i.e., players with which the sub-group had little/no communication). Somewhat surprisingly, however, fire and EMS response elements were both more strongly associated with a medical care point exercise taking place the second day than with on-site incident response activities the first day. Based on a review of the evaluator logs, this may be due to an emphasis on recording operational versus communication activities at the incident site during the first day.
Figure 6. Modularity values versus the number of edges removed from the network.
Modularity values were calculated at intervals of 25 edges removed, with additional points analyzed near the maximum. The maximum modularity obtained was 0.371. A total of 741 edges exist in the original network. Edges were removed based on the method developed by Girvan and Newman.27
The TOPOFF 4 After Action Report noted that six key decision-making nodes were present during the exercise and that these nodes operated largely independently of one another.28 The six nodes were: (1) the state ECC and state public health agency operations center (AOC); (2) the JFO; (3) the local EOCs/ECCs; (4) the incident site unified command; (5) the public health unified command; and (6) the Federal Radiological Monitoring and Assessment Center (FRMAC). The breakdown in sub-groups deviated somewhat from this picture in that:
There are several possible explanations for these discrepancies. As mentioned earlier, since the conclusions of the TOPOFF 4 after-action report were largely drawn from post-exercise interviews, its breakdown of sub-groups may be subject to recollection bias. For example, interviewees may have formed their opinions based on a few instances of failed communication rather than taking into consideration the larger volume of successful communication taking place throughout the exercise.29 Furthermore, this technique does not consider the quality of communications taking place (i.e., the importance of the information conveyed in the communication), which undoubtedly factors into interviewees’ perceptions. Finally, perceived failures in information sharing may in fact be due to “internal” communication failures (e.g., from incident commander to command staff and liaisons). This was noted in some of the evaluator descriptions, but not captured in this analysis. Regardless, this technique provides a robust breakdown of sub-groups and points out interesting aspects to consider for player integration and exercise design.
Concluding Remarks
This paper describes two social network analysis techniques that provide quantitative proxies for coordination assessment. Borgatti’s KPP-Negative technique for quantifying network fragmentation was selected, and expanded to deal with valued edges, to identify whether coordinating entities (e.g., EOCs) were playing a significant role in establishing effective communications. Meanwhile, Girvan and Newman’s technique for community sub-group examination was selected to identify possible instances of information stove-piping. These techniques were successfully implemented in a case study analysis of TOPOFF 4. Both techniques show promise for providing additional insights into coordination performance, identifying exercise artificialities, and opening the door to possible meta-analysis of coordination performance (e.g., over time, across regions, for different types and scales of events).
Future Directions
Despite providing valuable insights into the implementation of social network analysis on a somewhat large scale, the examination of TOPOFF 4 discussed in this article remains a single test case. Analysis of additional datasets, both from real-world events and from different exercises, is necessary to understand the benefits and limitations of the proposed techniques. As noted earlier, since evaluators were not collecting information for the specific purpose of constructing a communication network, the quality of the TOPOFF 4 dataset was impaired. A simple list of lessons learned from this coding effort is provided in Appendix A in the hopes of improving data collection for future exercises.
While application of network techniques to new areas such as homeland security appears straightforward, these techniques must be tuned to reflect the reality and uniqueness of what we are trying to model for results to be meaningful. One area for continued exploration is the valuation of edges. In particular, two aspects come to mind.
First, in the current analysis, frequencies of communication for all players are implicitly measured against one standard (i.e., $1/D_{min}$). However, circumstances can easily be envisioned where a player would require a much lower (but equally effective) frequency of communication with one player versus another. For example, some teams may operate in a self-sufficient manner, requiring only initial information as to where to mobilize. Thus, the current approach to edge weighting will evolve toward something more akin to a utility-based approach. Based on player-pair combination types, ideal frequencies of communication will be pre-defined via subject matter expertise. In turn, these will be used to normalize the actual frequencies of communication observed.
Second, as mentioned earlier, instances of failed communication were treated as simply cancelling out instances of successful communication on a one-to-one basis. This may not be a sufficient penalty to associate with such failures. Re-evaluation of the network given different assumptions for failed communication may be performed to identify the sensitivity of the network to such assumptions and to see whether sub-groupings occur that align better with after-action report findings.
About the Author
Since 2008, Dr. Yee San Su has been a researcher in the CNA Safety & Security Center. His current focus is applying novel analytical methods to structure and improve assessments of homeland security national preparedness. Prior to CNA, he spent two years as an AAAS Science & Technology policy fellow in residence at the Environmental Protection Agency’s National Homeland Security Research Center. Dr. Su holds a PhD in chemical engineering from MIT and is a licensed professional engineer in the State of Illinois. He may be contacted at suy@cna.org.
APPENDIX A
Enhancing Exercise Data Collection for Social Network Analysis
As mentioned previously, the TOPOFF 4 dataset used in this analysis was not collected with the construction of a communication network in mind. In order to enhance the value of the techniques discussed, improvements to the underlying data are necessary. With respect to future exercises, the following are recommended:
This article was originally published at the URLs https://www.hsaj.org/?article=7.1.17 and https://www.hsaj.org/?fullarticle=7.1.17.
Copyright © 2011 by the author(s). Homeland Security Affairs is an academic journal available free of charge to individuals and institutions. Because the purpose of this publication is the widest possible dissemination of knowledge, copies of this journal and the articles contained herein may be printed or downloaded and redistributed for personal, research or educational purposes free of charge and without permission. Any commercial use of Homeland Security Affairs or the articles published herein is expressly prohibited without the written consent of the copyright holder. The copyright of all articles published in Homeland Security Affairs rests with the author(s) of the article. Homeland Security Affairs is the online journal of the Naval Postgraduate School Center for Homeland Defense and Security (CHDS). https://www.hsaj.org
KPP-Negative for Valued Edges
n = number of nodes
Sub-group Identification
Results and Discussion
Coding Results and Classification
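Table 1. TOPOFF 4 Portland, Oregon site: Matrix of communication across player types by percentage

| | Local | County | State | Federal | Private | Volunteer | Media |
|---|---|---|---|---|---|---|---|
| Local | 9.9% | 9.1% | 4.4% | 7.0% | 0.7% | 1.8% | 1.0% |
| County | | 10.5% | 5.5% | 4.3% | 1.1% | 2.3% | 2.3% |
| State | | | 14.4% | 9.4% | 1.6% | 1.3% | 0.7% |
| Federal | | | | 9.8% | 1.0% | 0.1% | 1.7% |
| Private | | | | | 0.1% | 0.0% | 0.0% |
| Volunteer | | | | | | 0.0% | 0.1% |
| Media | | | | | | | 0.0% |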
Unsuccessful Communication
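Table 2. Unsuccessful communications as represented by daily counts and percent of total daily communications

| Day | Instances of Communication Failure | % of Total Communications Failing |
|---|---|---|
| Day 1 | 22 | 2.6% |
| Day 2 | 31 | 3.8% |
| Day 3 | 3 | 1.1% |
| Day 4 | 8 | 3.5% |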
KPP-Negative for Valued Edges
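Table 3. Percent change in fragmentation value associated with the removal of players from the network of participants. Top five players for the TOPOFF 4 Portland, Oregon site listed.

| Player | % Change in F |
|---|---|
| State Emergency Coordination Center (ECC) | 0.49 |
| County EOC | 0.29 |
| City ECC | 0.16 |
| Joint Field Office (JFO) | 0.15 |
| Port Office | 0.14 |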
Sub-group Identification
Concluding Remarks
Future Directions
About the Author
APPENDIX A