On the Robustness of Cascade Diffusion under Node Attacks

How can we assess a network’s ability to maintain its functionality under attacks? Network robustness has been studied extensively in the case of deterministic networks. However, applications such as online information diffusion and the behavior of networked public raise a question of robustness in probabilistic networks. We propose three novel robustness measures for networks hosting a diffusion under the Independent Cascade (IC) model, susceptible to node attacks. The outcome of such a process depends on the selection of its initiators, or seeds, by the seeder, as well as on two factors outside the seeder’s discretion: the attack strategy and the probabilistic diffusion outcome. We consider three levels of seeder awareness regarding these two uncontrolled factors, and evaluate the network’s viability aggregated over all possible extents of node attacks. We introduce novel algorithms from building blocks found in previous works to evaluate the proposed measures. A thorough experimental study with synthetic and real, scale-free and homogeneous networks establishes that these algorithms are effective and efficient, while the proposed measures highlight differences among networks in terms of robustness and the surprise they furnish when attacked. Last, we devise a new measure of diffusion entropy that can inform the design of probabilistically robust networks.

Stochastic robustness. In applications such as information diffusion and epidemiology there is uncertainty regarding the connections in the network, i.e., the network is stochastic. We study the operation of such a stochastic network under attacks on nodes, expressed as the expected number of activated nodes under a diffusion process. We refer to this type of robustness as probabilistic network robustness. Despite the extensive study of deterministic network robustness [29], its probabilistic counterpart has been scantily studied. There are studies on how to engineer a robust diffusion in an adversarial environment [12,21], but an investigation on how to measure robustness in such environments is missing.
In this paper, we study the robustness of probabilistic networks expressed by means of the capacity to carry out a successful independent cascade diffusion under node attacks. We introduce three robustness measures built around two sources of uncertainty: attacks on nodes and probabilistic diffusion outcomes on edges.

Deterministic Robustness of Integrity
Network robustness reflects a network's ability to maintain its connectivity under attacks [40]. The connectivity of an undirected network is measured by the expected size of its largest connected component (LCC) after an attack [40]. This expected LCC size is also defined on probabilistic undirected networks [24].
Scale-free networks are highly robust to random node failures but vulnerable to targeted node attacks [3]; increasing their robustness against attacks is in conflict with maintaining their natural robustness against random failures [5]. Some robustness measures take into consideration both random and targeted failures [47]. Such an inclusive measure of robustness, targeted by a local-search heuristic in [53], is the sum of worst-case LCC sizes over all cardinalities of sets of blocked nodes:
$$R = \frac{1}{n^2} \sum_{Q=1}^{n} s(Q), \tag{1}$$
where $n$ is the number of nodes in the network and $s(Q)$ is the size of the LCC after removing $Q$ nodes; the normalization by $n^2$ ensures values are comparable across networks, being in the range $[\frac{1}{n}, \frac{n-1}{2n}]$. The heuristic in [53] leads to an onion-like graph structure, with nodes of similar degree tending to be connected to each other [19]. The closely related network reliability problem [18] secures connectivity between two predefined node sets under edge failures. We are interested in the robustness of stochastic diffusion processes under node attacks, which resembles the robustness of deterministic networks under node attacks and random edge failures, yet has received limited attention [10].
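To make the LCC-based measure concrete, the following minimal sketch sums LCC sizes while nodes are removed one at a time and normalizes by the squared network size. It assumes a static Degree ranking as a stand-in for the worst-case attack; the function names `lcc_size` and `robustness_R` are hypothetical and not taken from [53].

```python
def lcc_size(adj, removed):
    """Size of the largest connected component, ignoring nodes in `removed`."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        size = 0
        while stack:
            u = stack.pop()
            size += 1
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        best = max(best, size)
    return best


def robustness_R(edges, n):
    """Sum of LCC sizes after removing 1..n nodes, divided by n^2.

    Assumption: nodes are removed by decreasing initial degree (ties by id),
    which only approximates a worst-case attack.
    """
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    order = sorted(adj, key=lambda v: (-len(adj[v]), v))
    removed = set()
    total = 0
    for v in order:
        removed.add(v)
        total += lcc_size(adj, removed)
    return total / n ** 2


star = [(0, i) for i in range(1, 5)]                           # hub 0, four leaves
complete = [(u, v) for u in range(4) for v in range(u + 1, 4)]
print(robustness_R(star, 5), robustness_R(complete, 4))
```

On the complete graph with 4 nodes the sketch attains the upper end of the range, (n−1)/(2n) = 3/8, while the star graph collapses as soon as its hub is removed.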

Stochastic Robustness of Diffusion
Network robustness also refers to a network's capacity to host a diffusion process despite the exclusion of some network elements [5,7,11,21]. The mathematical modelling of diffusion is independent of semantics: it may be a diffusion of information, of cascading failures, or a viral infection epidemic [15]. Similarly, a node attack is mathematically equivalent to a node immunization or failure. As the effect of node attacks is evaluated by a stochastic process, we reach the concept of stochastic robustness.
A diffusion may be epidemic, threshold, or cascading [61]. There are two popular epidemic models [49]. Under the SIS model, nodes are either susceptible or infected; a node may get infected from its neighbors and become susceptible again after some time. Under the SIR model, an infected node may recover and become immune. The expected size of an SIR epidemic starting at u is equal to the expected size of the connected component that contains u [16]. Epidemic models typically consider a homogeneous infection rate, yet two models study information diffusion with heterogeneous rates [35]: the Independent Cascade (IC) model (a special case of SIR [62]) and the Linear Threshold (LT) model [28,34]. Under these models, the Influence Maximization (IM) problem [28] seeks a set of initially active nodes, or seeds, that maximizes the expected number of activated nodes.

Robustness under the IC model
We focus on the IC model, widely used to study word-of-mouth effects in social networks [34], by which a diffusion proceeds in discrete time steps. At time t = 0, a set of seed nodes S ⊆ V is activated. Any node u activated at time t tries to activate its out-neighbors at time t + 1, and succeeds with an independent probability $p_e = p_{uv}$ for each out-neighbor v, where e = (u, v). In case of success, the edge e is active. This cascading process terminates when there are no more trials for activation. The set of active nodes and edges forms a deterministic live-edge graph g [28]. The spread, or expected number of activated nodes, is the expected number of nodes reachable from S in G, while each edge may fail independently with probability $1 - p_e$. Hence, diffusion robustness under the IC model corresponds to the deterministic robustness under targeted node attacks and random edge failures with respect to seeds.
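The live-edge view of the IC model can be sketched as follows: sample each edge once with its probability, then count the nodes reachable from the unblocked seeds. This is a plain Monte Carlo estimator under assumed names (`sample_live_edge_graph`, `reachable`, `estimate_spread`), not the sampling scheme used in the experiments.

```python
import random


def sample_live_edge_graph(edges, rng):
    """Keep each directed edge (u, v, p) independently with probability p."""
    live = {}
    for u, v, p in edges:
        if rng.random() < p:
            live.setdefault(u, []).append(v)
    return live


def reachable(live, seeds, blocked=frozenset()):
    """Nodes reachable from the unblocked seeds, never entering blocked nodes."""
    active = {s for s in seeds if s not in blocked}
    stack = list(active)
    while stack:
        u = stack.pop()
        for v in live.get(u, ()):
            if v not in active and v not in blocked:
                active.add(v)
                stack.append(v)
    return active


def estimate_spread(edges, seeds, blocked=frozenset(), samples=2000, seed=0):
    """Average number of activated nodes over sampled live-edge graphs."""
    rng = random.Random(seed)
    total = 0
    for _ in range(samples):
        live = sample_live_edge_graph(edges, rng)
        total += len(reachable(live, seeds, blocked))
    return total / samples


chain = [(0, 1, 1.0), (1, 2, 1.0)]
print(estimate_spread(chain, [0]))               # every edge is live
print(estimate_spread(chain, [0], blocked={1}))  # the attack cuts the chain
```

Blocking a node here removes it from the sampled instance, which mirrors the equivalence between node attacks and node immunization noted above.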
Related problems are sensitivity to edge perturbations [2,20,57] and robust influence maximization (RIM) under edge perturbation [12] or any adversarial source of uncertainty [21]. Given a finite set of adversarial strategies Θ, the objective in [21] is:
$$S^* = \arg\max_{S \subseteq V,\, |S| \le k} \; \min_{\theta \in \Theta} \frac{\sigma_\theta(S)}{\sigma_\theta(S^*_\theta)}, \tag{2}$$
where $\sigma_\theta(S)$ is the spread achieved by seed set S under strategy θ, $S^*_\theta$ is the optimal seed set for θ, and k is a budget constraint; the normalization by $\sigma_\theta(S^*_\theta)$ measures the fraction over optimal influence; an absolute measure is used with continuous θ in [26].
The Saturate Greedy (SatGreedy) algorithm [21] solves the RIM problem by targeting the cumulative effect of all strategies, which is a submodular objective. This algorithm, applicable on any monotonic and submodular parameterization of the spread function, provides a bi-criteria approximation guarantee: violating the budget constraint k by an O(k ln |Θ|) factor leads to a $(1 - 1/e)$ approximation of the optimal solution. We adopt the RIM objective as a component in one of the measures we introduce.

DIFFUSION ROBUSTNESS MEASURES
We propose three robustness measures, anchored on the awareness of a seeder, who selects seed nodes, regarding node attacks and probabilistic diffusion outcomes. Table 1 lists our notations.

Attack Strategies
We measure robustness against an attacker who disables nodes. A consideration of all possible attack strategies amounts to the NP-hard problem of node immunization [11,22,38,62]; instead, we demarcate a strategic set Θ of structure-aware attack strategies on a directed stochastic network G. For a strategy θ ∈ Θ, θ(ℓ) is a set of ℓ nodes in G chosen by θ; $g_\theta$ denotes the graph obtained by removing nodes from a deterministic instance g of G according to θ(ℓ). We opt for strategies that are also node ranking functions. A recent study assigns attack strategies of four types to three or four clusters by applying several distance measures on their outputs [4]. We select six strategies that represent each type and cluster in [4], plus a spectral-based baseline, NetShield [11,37,50,62]:
(1) Degree picks nodes with the largest degree;
(2) Random picks nodes uniformly at random;
(3) Acquaintance [14] picks a random neighbor of a randomly chosen node;
(4) PageRank ranks nodes by PageRank values [45];
(5) [...] and A the network's adjacency matrix;
(6) Betweenness centrality is the sum of the fraction of all-pairs shortest paths that pass through a node;
(7) NetShield [11] greedily selects a set of nodes S, aiming to maximize a spectrally defined Shield value.
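Because each strategy is a node ranking function, the attack set θ(ℓ) can be read off as the first ℓ nodes of the ranking. The sketch below illustrates this for Degree, Random, and Acquaintance only; it assumes an undirected adjacency dictionary, and all function names are hypothetical.

```python
import random


def degree_ranking(adj):
    """Degree: rank nodes by decreasing number of neighbors (ties by id)."""
    return sorted(adj, key=lambda v: (-len(adj[v]), v))


def random_ranking(adj, rng):
    """Random: a uniformly random permutation of the nodes."""
    order = sorted(adj)
    rng.shuffle(order)
    return order


def acquaintance_ranking(adj, rng):
    """Acquaintance: repeatedly pick a random node, then one of its neighbors.

    Assumption: nodes never drawn within the attempt cap are appended in id
    order, so that the result is a full ranking.
    """
    nodes = sorted(adj)
    order, chosen = [], set()
    attempts = 0
    while len(order) < len(nodes) and attempts < 100 * len(nodes):
        attempts += 1
        u = rng.choice(nodes)
        neighbors = sorted(adj[u])
        if not neighbors:
            continue
        v = rng.choice(neighbors)
        if v not in chosen:
            chosen.add(v)
            order.append(v)
    order.extend(v for v in nodes if v not in chosen)
    return order


def attack_set(ranking, ell):
    """theta(ell): the first ell nodes of a strategy's ranking."""
    return set(ranking[:ell])


star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
print(degree_ranking(star))
print(attack_set(degree_ranking(star), 1))
```

Taking prefixes of a fixed ranking makes attack sets nested across ℓ, which is the property the incremental SEMR computation relies on.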

Awareness-based Robustness Measures
We define three robustness notions based on the abstraction of seeder awareness of attacks and diffusion events, aggregating outcomes over all possible attack sizes, and a notion of diffusion entropy that shows how much difference seeder awareness can make.

EMR.
Assume an omniscient seeder with access to an oracle that predicts the outcome g of a diffusion on G and of an attack on g that produces $g_\theta$. As discussed in Section 2.1, the robustness of a deterministic undirected network G can be expressed by its largest connected component (LCC) [40]. When G is a directed network, the LCC substructure is generalized to either the largest strongly connected or the largest weakly connected component [54]. Here, we define the expected maximum number of nodes such an omniscient seeder can reach by diffusion from a seed set S of size k in G under a worst-case attack strategy θ ∈ Θ(ℓ) as the Expected Maximum Reach (EMR):
$$\mathrm{EMR}_G(\ell) = \min_{\theta \in \Theta(\ell)} \mathbb{E}_{g \sim G}\Big[\max_{|S| \le k} \sum_{v \in g_\theta} I(v, S)\Big],$$
where $I(v, S)$ indicates whether there exists a path from S to node v in a live-edge instance of a directed network, $g_\theta$; $\max_{|S| \le k} \sum_{v \in g_\theta} I(v, S)$ is the size of a maximum forest with at most k roots. Our first measure aggregates $\mathrm{EMR}_G(\ell)$ over all values of ℓ, normalized by network size. We call this measure sum of EMR, or SEMR:
$$\mathrm{SEMR}_G = \frac{1}{|V|} \sum_{\ell} \mathrm{EMR}_G(\ell).$$
We introduce an algorithm for SEMR computation in Section 3.3.

RNI.
Let us now consider a seeder lacking knowledge of diffusion outcomes, but having access to an oracle that predicts node attacks. We define the maximum number of nodes such a seeder can expect to reach in G under a worst-case attack strategy θ ∈ Θ(ℓ) as the Robust Network Immunization (RNI):
$$\mathrm{RNI}_G(\ell) = \min_{\theta \in \Theta(\ell)} \max_{|S| \le k} \sigma_\theta(S),$$
where $\sigma_\theta(S) = \mathbb{E}_{g \sim G}\big[\sum_{v \in g_\theta} I(v, S)\big]$ is the expected number of nodes $v \in g_\theta$ to which a path exists from S. Our second robustness measure aggregates $\mathrm{RNI}_G(\ell)$ over all values of ℓ, normalized by network size. We call this measure SRNI:
$$\mathrm{SRNI}_G = \frac{1}{|V|} \sum_{\ell} \mathrm{RNI}_G(\ell).$$
The computation of RNI requires solving an influence maximization (IM) problem on a graph with θ(ℓ) nodes removed for each attack strategy θ ∈ Θ and each value of ℓ. We do so while building sampled networks $g_\theta$ incrementally, using the dynamic IM algorithm (DIM) [43], which extends IMM [56].

RIM.
Last, we consider a seeder who has information neither about diffusion outcomes, nor about node attacks. We define the maximum number of nodes such a seeder can expect to reach under a worst-case θ ∈ Θ(ℓ) as Robust Influence Maximization (RIM) [21]:
$$\mathrm{RIM}_G(\ell) = \max_{|S| \le k} \min_{\theta \in \Theta(\ell)} \sigma_\theta(S),$$
where $\sigma_\theta(S)$ is the expected number of nodes $v \in g_\theta$ to which a path exists from S. Our third measure aggregates $\mathrm{RIM}_G(\ell)$ over ℓ, normalized by network size; we call it SRIM:
$$\mathrm{SRIM}_G = \frac{1}{|V|} \sum_{\ell} \mathrm{RIM}_G(\ell).$$
To calculate SRIM we apply SatGreedy [21] with the objective in Equation 2 modified to account for node removals rather than edge perturbation and normalizing spread by network size |V| rather than by the optimal spread under strategy θ, since we are interested in robustness in the absolute sense:
$$S^* = \arg\max_{|S| \le k} \; \min_{\theta \in \Theta(\ell)} \frac{\sigma_\theta(S)}{|V|}.$$
Further, we improve the runtime of SatGreedy using the same dynamic approach as for SRNI [43] to estimate spread. We also consider the baselines proposed in [21]: SingleGreedy selects k seeds sequentially, choosing a seed that maximizes the objective in each step. AllGreedy finds the best seed set for each adversary, and selects the one of these that maximizes the objective.
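The three levels of seeder awareness differ only in where the choice of seeds sits relative to the attack and to the edge outcomes. The brute-force sketch below makes this explicit on a toy network by enumerating every live-edge instance exactly. It is an illustration under stated assumptions, not the algorithms of this paper: the function names are hypothetical, a fixed ℓ is represented by one blocked-node set per strategy, the order of minimum and expectation in EMR is our reading of the definition, and blocked seeds contribute nothing.

```python
from itertools import combinations, product


def reach(live_edges, seeds, blocked):
    """Number of unblocked nodes reachable from the unblocked seeds."""
    active = set(seeds) - blocked
    stack = list(active)
    while stack:
        u = stack.pop()
        for a, b in live_edges:
            if a == u and b not in active and b not in blocked:
                active.add(b)
                stack.append(b)
    return len(active)


def live_instances(edges):
    """Yield (probability, live edge list) for every outcome of the edge trials."""
    for flips in product([True, False], repeat=len(edges)):
        prob, live = 1.0, []
        for (u, v, p), keep in zip(edges, flips):
            prob *= p if keep else 1.0 - p
            if keep:
                live.append((u, v))
        if prob > 0:
            yield prob, live


def measures(nodes, edges, attacks, k):
    """Return (EMR, RNI, RIM) for one attack size; `attacks` lists theta(ell)."""
    instances = list(live_instances(edges))
    seed_sets = list(combinations(nodes, k))

    def sigma(S, blocked):  # expected spread of S once `blocked` is removed
        return sum(p * reach(g, S, blocked) for p, g in instances)

    # EMR: seeds chosen after both the attack and the edge outcomes are known.
    emr = min(
        sum(p * max(reach(g, S, b) for S in seed_sets) for p, g in instances)
        for b in attacks
    )
    # RNI: seeds chosen knowing the attack, but not the edge outcomes.
    rni = min(max(sigma(S, b) for S in seed_sets) for b in attacks)
    # RIM: seeds chosen knowing neither.
    rim = max(min(sigma(S, b) for b in attacks) for S in seed_sets)
    return emr, rni, rim


nodes = [0, 1, 2, 3, 4, 5]
edges = [(0, 1, 0.5), (2, 3, 0.5), (4, 5, 0.5)]   # three independent pairs
attacks = [{0}, {2}, {4}]                          # each strategy blocks one source
print(measures(nodes, edges, attacks, k=1))
```

On this toy input the sketch yields EMR = 1.75, RNI = 1.5 and RIM = 1.0. The ordering EMR ≥ RNI ≥ RIM holds in general for these definitions, since an expected maximum is at least the maximum of expectations, and a min-max is at least the corresponding max-min.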

SEMR Computation
To compute the SEMR measure for a single seed, we need to calculate expected maximum tree sizes over randomly sampled attacked networks g, under each attack strategy. We consider attack strategies θ ∈ Θ under which the set of blocked nodes for ℓ + 1 is a superset of that for ℓ: θ(ℓ) ⊂ θ(ℓ + 1). To obtain a sequence of attack sets $\theta_g(\ell)$ for different ℓ on g, it suffices to sequentially remove nodes from g, or, equivalently, to sequentially add nodes to g in the reverse order. We compute maximum tree sizes over several random samples g from G, with edges pre-sampled and nodes incrementally added according to each strategy θ, and average values per ℓ to get EMR(ℓ). For the sake of efficiency, we employ a dynamic reachability index that returns nodes reachable from any node and also supports node insertions, building upon DAGGER [60]. Given g, the index maintains a directed acyclic graph (DAG), called the graph condensation, in which each node represents a strongly connected component (SCC) of g. A node's insertion incurs the insertion of its incident edges. Assume a new edge e = (u, v) is inserted, and let s and t be the SCCs that u and v belong to, respectively. DAGGER checks whether there is a path from t to s, using its reachability index. If there is, then DAGGER merges all SCCs on all paths from t to s.
Algorithm 1 (excerpt): ▷ w′ corresponds to an SCC in g and has label r; 4: R ← set of nodes removed from g′; 5: for all u′ ∈ Q do
We extend DAGGER with a query that computes SEMR for a single seed (Algorithm 1). Let g′ = (V′, E′) be the DAG that corresponds to g. For each node v′ ∈ V′, we maintain a label v′.r, the set of nodes reachable from v′, and a heap H organizing tree root nodes (i.e., nodes with zero in-degree) by the sum of reachable SCC sizes. Upon the insertion of a new node w to g, we collect the ids of w's SCC (Line 3) and invalidated SCCs R (Line 4), calculate the reach w′.r of the SCC w belongs to, w′ ∈ g′, based on its out-neighbors (Lines 5-6), and update accordingly the labels of all ascendant nodes of w′, i.e., the nodes u′ reachable from w′ in the reverse DAG $(g')^T$ (Lines 8-11). Upon reaching an ascendant root node u′, we update H (Lines 10-11). To compute SEMR for a single seed, we obtain maximum tree sizes from H (Line 19). For k seeds, we pick k nodes from H, prioritized by marginal gain in terms of reachable nodes in lazy greedy fashion [42], as the objective function is submodular. The performance of SEMR computation depends on set union and subtraction operations (Lines 6 and 9).
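The reach labels on the condensation can be illustrated with a static sketch that recomputes everything from scratch on one live-edge instance. This is not DAGGER and not Algorithm 1: it assumes Kosaraju's two-pass SCC algorithm in place of the dynamic index and a plain greedy in place of the heap-based lazy greedy; `condensation_reach` and `greedy_max_reach` are hypothetical names.

```python
def condensation_reach(nodes, edges):
    """Return (comp_of, reach): node -> SCC id, and SCC id -> reachable node set."""
    adj = {v: [] for v in nodes}
    radj = {v: [] for v in nodes}
    for u, v in edges:
        adj[u].append(v)
        radj[v].append(u)

    # Pass 1: iterative DFS on the graph, recording nodes by finishing time.
    order, seen = [], set()
    for s in nodes:
        if s in seen:
            continue
        seen.add(s)
        stack = [(s, iter(adj[s]))]
        while stack:
            u, it = stack[-1]
            for w in it:
                if w not in seen:
                    seen.add(w)
                    stack.append((w, iter(adj[w])))
                    break
            else:  # all out-neighbors explored: u is finished
                order.append(u)
                stack.pop()

    # Pass 2: sweep the reverse graph by decreasing finishing time. SCC ids
    # come out in topological order of the condensation (sources first).
    comp_of, members = {}, []
    for s in reversed(order):
        if s in comp_of:
            continue
        cid = len(members)
        comp_of[s] = cid
        members.append([s])
        stack = [s]
        while stack:
            u = stack.pop()
            for w in radj[u]:
                if w not in comp_of:
                    comp_of[w] = cid
                    members[cid].append(w)
                    stack.append(w)

    # Label each SCC with its reach, visiting successors before predecessors.
    reach = {}
    for cid in range(len(members) - 1, -1, -1):
        r = set(members[cid])
        for u in members[cid]:
            for w in adj[u]:
                if comp_of[w] != cid:
                    r |= reach[comp_of[w]]
        reach[cid] = r
    return comp_of, reach


def greedy_max_reach(nodes, edges, k):
    """Greedily pick k SCCs by marginal gain in newly reached nodes."""
    _, reach = condensation_reach(nodes, edges)
    covered = set()
    for _ in range(k):
        best = max(reach.values(), key=lambda r: len(r - covered), default=set())
        covered |= best
    return len(covered)


nodes = [0, 1, 2, 3, 4, 5]
edges = [(0, 1), (1, 0), (1, 2), (3, 4)]   # SCC {0, 1} reaches 2; 3 reaches 4
print([greedy_max_reach(nodes, edges, k) for k in (1, 2, 3)])
```

The set unions in the labeling loop and the set subtractions in the greedy step are the operations that the text identifies as dominating the cost of SEMR computation.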

EXPERIMENTAL STUDY
We investigate the nature of all three measures and study their interrelationships. Experiments ran on a machine with 378 GB of RAM and an Intel Xeon CPU @ 3.10GHz running Ubuntu 18.04. All algorithms are implemented in C++ and compiled with gcc 7.4 with -O3 optimization. We set a timeout of 10 hours per measure computation. Runtime and timeout do not include the time for computing the strategy set Θ, which is the same for all measures. We assign edge probabilities either randomly or uniformly. For random assignment, we pick a value for each edge uniformly from 0 to W, where W is a parameter. For uniform assignment, we assign the same value W to each edge. We refer to these two types of assignment as Random and Uniform.
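The two assignment schemes can be sketched in a few lines. The function name `assign_probabilities` and the fixed default random seed are assumptions made for illustration.

```python
import random


def assign_probabilities(edges, W, mode, rng=None):
    """Attach a probability to each directed edge (u, v).

    mode "Uniform": every edge gets the same value W.
    mode "Random":  every edge gets a value drawn uniformly from [0, W].
    """
    rng = rng or random.Random(0)
    if mode == "Uniform":
        return [(u, v, W) for u, v in edges]
    if mode == "Random":
        return [(u, v, rng.uniform(0.0, W)) for u, v in edges]
    raise ValueError("mode must be 'Uniform' or 'Random'")


edges = [(0, 1), (1, 2), (2, 0)]
print(assign_probabilities(edges, 0.1, "Uniform"))
print(assign_probabilities(edges, 0.1, "Random"))
```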
Synthetic Networks. We study power-law networks, represented by the Barabási-Albert (BA) model, and homogeneous networks, represented by the Gaussian Random Partition (GRP) [8] and Watts-Strogatz (WS) models. For BA, we use the algorithm of Holme and Kim [23], which extends the original Barabási-Albert model; we keep the BA label, as the BA model is its basis. The algorithm randomly creates µ edges for each node in a graph and, for each created edge, with probability p adds an edge to one of its neighbors, thus creating a triangle. GRP groups nodes so that group sizes follow a Gaussian distribution with expected size s and variance s/v, where v is a shape parameter. It uses a probability value $p_{in}$ for edges between nodes in the same group, and $p_{out}$ otherwise. WS models self-organizing small-world systems [59], with two parameters: l indicates how many neighbors each node is joined with in a ring; p is the probability of edge rewiring, inducing disorder.
Table 2: Real-world datasets. d_max and d are the maximum and average degree; cl is the average clustering coefficient [51].
Real-world networks. We use real-world datasets of various sizes and degree distributions: Blogs contains front-page hyperlinks between blogs during the 2004 US election [1,30]. DBLP is a citation network of scientific papers [30,32]. Advogato is a network of trust relationships in an online community platform for free-software developers [30,41]. Minnesota is a road network [48]. VK is a social network with influence probabilities derived from the content of posts published by users [37]. Brightkite is a location-based social network [13]. Gnutella consists of snapshots of the Gnutella peer-to-peer file sharing network [31]. Table 2 lists our real-world datasets.

Choice of Algorithm for RIM Computation
As a preliminary experimental choice, we study the performance of methods for RIM calculation, including algorithms and baselines proposed in [21]. We use the IMM algorithm for influence maximization [56] as a non-robust baseline. We include SingleGreedy with the CELF (i.e., lazy greedy) optimization, proposed in [21], and also its variant without it, given that, on this non-submodular problem objective, the CELF optimization affects quality. We compare the performance of algorithms in the computation of the unaggregated RIM objective, with BA, GRP, and WS networks. Figure 1 illustrates the results vs. graph size n, seed set size k, and number of attacked nodes ℓ. We observe that SingleGreedy with and without CELF matches or outperforms SatGreedy, while IMM has a disadvantage that grows with ℓ, underscoring the significance of using robust algorithms. We now drop the non-robust IMM algorithm from the comparison, and study the performance of robust algorithms, with the DIM algorithm embedded, in terms of the runtime for computing, and the value of, the aggregate SRIM robustness measure on the BA network. Figure 2 shows our results for k = 50 seeds. As in Figure 1, SingleGreedy stands out in terms of objective, at the cost of higher runtime. The difference in objective is more prominent now, as we aggregate the measure over all values from 1 to ℓ. The runtime for computing Θ is negligible, reaching 4s for the largest network.
These results indicate that SingleGreedy (without CELF) offers the best effectiveness, but significantly worse efficiency. SingleGreedy with CELF matches the performance of SingleGreedy, matches or outperforms that of SatGreedy, is more efficient, and does not require any accuracy parameter γ, as SatGreedy does. Ergo, we opt for SingleGreedy with CELF in the following.
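For illustration, SingleGreedy with CELF can be sketched as a lazy greedy loop over a caller-supplied robust objective. The sketch assumes integer node ids and a hypothetical `objective` callable standing in for the worst-case spread over strategies; on a non-submodular objective the lazy bounds are only heuristic, as noted above.

```python
import heapq


def single_greedy_celf(candidates, k, objective):
    """Pick up to k seeds, re-evaluating a candidate's gain only when stale."""
    chosen = []
    current = objective(chosen)
    # Heap entries: (-gain, node, number of seeds chosen when gain was computed).
    heap = [(-(objective([v]) - current), v, 0) for v in candidates]
    heapq.heapify(heap)
    while len(chosen) < k and heap:
        neg_gain, v, stamp = heapq.heappop(heap)
        if stamp == len(chosen):        # gain is up to date: accept the seed
            chosen.append(v)
            current -= neg_gain
        else:                           # stale gain: recompute and push back
            gain = objective(chosen + [v]) - current
            heapq.heappush(heap, (-gain, v, len(chosen)))
    return chosen


# Toy robust objective: worst case, over two attacks, of the covered nodes.
reach_sets = {0: {0, 1, 2}, 3: {3, 4}, 5: {5}}
attacks = [{0}, {3}]


def worst_case_cover(seeds):
    def cover(blocked):
        covered = set()
        for s in seeds:
            if s not in blocked:
                covered |= reach_sets[s] - blocked
        return len(covered)
    return min(cover(b) for b in attacks)


print(single_greedy_celf([0, 3, 5], 1, worst_case_cover))
```

On this toy input the first seed picked is node 5, which is never blocked, rather than node 0, which has the largest reach but is worthless under the first attack.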

Measure relationships
We now study the relation between measures and their sensitivity to the set of attack strategies, using two homogeneous networks (Minnesota and GRP) and two power-law networks (Blogs and VK).
Figure 4 presents a decomposition: instead of a minimum over all strategies, we plot the expected influence per strategy, with the seed set selected by each algorithm. We observe that EMR and RNI follow the same trend also for each strategy separately. This is especially conspicuous with NetShield, which shows poor performance in its immunization objective for small values of ℓ, but swiftly improves in the middle range; it then becomes the most effective strategy for a short ℓ range, but loses that position to PageRank. Remarkably, results for RNI present the same outline, but scaled to smaller values of active nodes. On the other hand, RIM exhibits a different behaviour, as all strategies mostly produce the same response to the selected seeds. This result illustrates the difference of RIM from the other two measures: RIM is by nature based on the worst case among the complete set of strategies, hence can afford to let the selected seeds perform almost equally well on any attack. Figure 5a plots the differences EMR-RNI and RNI-RIM vs. ℓ on the VK network. RNI-RIM has a convex shape with a maximum in the middle-range ℓ, while EMR-RNI is almost zero in the whole range. This behavior differs from the one we observed with the BA and DBLP networks, where there is a peak on EMR-RNI. Figure 5b plots non-aggregate measure values for k = 40. RNI is very close to EMR along the whole range of ℓ; on the other hand, RNI-RIM also peaks close to the maximum curvature of lines.
Figure 5c shows that the effect becomes stronger with larger k, aggregating over all ℓ values: SRNI remains close to SEMR, while SRIM diverges from the others; this divergence implies that, on power-law networks, knowledge about the attack, gained when moving from RIM to RNI, is more valuable than knowledge about the stochastic edge outcome, gained when moving from RNI to EMR.
Figures 6 and 7 show the proximity among the three aggregate measures on the Blogs and GRP networks. On the power-law Blogs network, the trend is similar to VK, with RNI close to EMR. However, on the homogeneous GRP network, RNI is close to RIM for the whole spectrum of network shape parameters. We conclude that network topology determines what gain of knowledge matters most; on a homogeneous network, knowledge about the stochastic edge outcome is more valuable than knowledge about the attack.
Another interesting feature is the shape of the tail of distributions (Figures 4, 5b and 6a). There exists a value ℓ = ℓ′ such that all three measures converge to the value of k as ℓ grows towards ℓ′, but for ℓ > ℓ′ RIM drops to 0, while the others remain at k. The drop of RIM is concave, with a discontinuity in its first derivative. The region ℓ > ℓ′ corresponds to the case where, for any seed set, at least one strategy blocks all of its nodes. That strategy determines RIM. However, for EMR and RNI, seeds are selected after the attack, therefore there are at least k non-blocked nodes.

EMR vs RNI: the diffusion entropy
The EMR and RNI measures both represent cases in which the attacker has to prepare for the worst case, i.e., the case in which the seeder is aware of the attacker's actions. In other words, both these measures correspond to robust immunization problems. Their difference lies in the fact that, under EMR, the seeder is also aware of the probabilistic network outcome. Thus, the difference between these two probabilistic network robustness measures expresses the surprise effect or, so to speak, negative entropy that a probabilistic diffusion outcome can present to the attacker; it shows how much worse, from the attacker's viewpoint, the spread can be in the case of a seeder aware of probabilistic outcomes in comparison to the best guess of a seeder unaware of such outcomes. We study the impact of this difference in more detail, using uniform probability assignment so as to focus on structural effects. We consider the absolute difference D between the two measures, and also the relative difference with respect to RNI, $D_r$. Figure 8a shows the surface of $D_r$ for different values of ℓ and W on the Minnesota network. $D_r$ is larger for a smaller number of removed nodes ℓ, and drops with larger edge probabilities. Still, it is not monotonic vs. W; it obtains a maximum value around W = 1.5, and the peak is more explicit with smaller ℓ. Figure 8b shows that this non-monotonic behavior of $D_r$ also appears with respect to ℓ on a BA network, and indicates exactly where the peak is located. Compared to Figure 8c, where peaks are presented only for a single seed, we see that in Figure 8b the peak has a larger width.
D also relates to the relative marginal gain of seed additions by the seeder. We define $\delta_{\theta_i}(\ell)$ as the relative marginal gain of the second seed for any strategy $\theta_i \in \Theta$ under ℓ attacked nodes. We then calculate a new quantity ∆(ℓ) as the maximum differential quotient of δ over all strategies for each ℓ. Figure 8c juxtaposes D and ∆, plotted with moving average smoothing. Their two peaks align, with a slight shift to the right for ∆. This finding implies that, on BA networks, the values of ℓ for which the network ceases to be strongly centralized, hence ∆ flattens out, would also cause the highest surprise to an attacker.
We exploit this observation to generate networks of enhanced robustness: we fix the size to 1000 nodes, yet first generate a network of larger size and then remove superfluous nodes by the Degree strategy. We call the number of nodes first added and then removed the shift. Figure 8d plots D vs. shift. Shifting improves network robustness in terms of D; we create networks in which a seeder has the potential to perform surprisingly well against an attacker. The lower subfigure plots the number of edges in the obtained network; as there is no correlation between the peak of D and the number of edges, the peak must be attributed to the network's structure.

Case Studies
We provide examples of robust networks using the local search heuristic of [53], which randomly samples pairs of edges, e.g., pair {(v 1 , u 1 ), (v 2 , u 2 )}, and rewires them to {(v 1 , u 2 ), (v 2 , u 1 )} if that leads to a higher robustness measure. We experiment with SRIM and SEMR, since SRNI exhibits similar behavior to SEMR (see Section 4.2). The sampling proceeds until |E| iterations bring no change.
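The rewiring procedure can be sketched as a hill-climbing loop over edge swaps. The sketch makes several assumptions that the text does not state: it stops after |E| consecutive iterations without improvement, it skips swaps that would create self-loops or duplicate edges, and it takes the robustness measure as a hypothetical `measure` callable (SRIM or SEMR in our experiments, a toy score below).

```python
import random


def rewire_local_search(edges, measure, rng=None, max_steps=100000):
    """Swap endpoints of random edge pairs whenever the measure increases."""
    rng = rng or random.Random(0)
    edges = list(edges)
    best = measure(edges)
    idle = 0            # consecutive iterations without improvement
    steps = 0
    while idle < len(edges) and steps < max_steps:
        steps += 1
        i, j = rng.sample(range(len(edges)), 2)
        (v1, u1), (v2, u2) = edges[i], edges[j]
        cand1, cand2 = (v1, u2), (v2, u1)
        existing = set(edges)
        if v1 == u2 or v2 == u1 or cand1 in existing or cand2 in existing:
            idle += 1   # invalid swap: self-loop or duplicate edge
            continue
        trial = list(edges)
        trial[i], trial[j] = cand1, cand2
        score = measure(trial)
        if score > best:
            edges, best, idle = trial, score, 0
        else:
            idle += 1
    return edges, best


def toy_measure(es):
    """Stand-in score: number of edges pointing from a lower to a higher id."""
    return sum(1 for u, v in es if u < v)


start = [(1, 0), (3, 2), (0, 3), (2, 1)]
rewired, score = rewire_local_search(start, toy_measure)
print(rewired, score)
```

Each swap preserves every node's in-degree and out-degree, so the search changes only how nodes of given degrees are wired together.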
We experiment with a random BA network of 100 nodes, uniform edge probability of 0.5, and 2 seeds. Figure 9 shows the original (non-robust) network, and two networks obtained by the aforementioned procedure for SRIM and SEMR, respectively. Colors indicate similar node degrees: blue for larger, green for medium, and red for smaller. We plot the networks using the Fruchterman-Reingold algorithm [6]. We note that the network targeting SEMR has a layered onion-like structure, similar to robust static networks [19], while the other two networks do not show evident patterns.

CONCLUSIONS
We introduced three aggregate measures that evaluate the diffusion robustness of probabilistic networks, anchored on a seeder who orchestrates an Independent Cascade diffusion under node attacks. Each measure is based on a notion of worst-case maximum expected spread. We introduced efficient algorithms to calculate these measures and sample-based versions thereof that enable their computation on realistic networks of up to $10^5$ nodes. Our experimental study determined that, on scale-free networks, measures sharing the same notion of seeder awareness regarding the adversarial attack are closer, while those sharing the same notion of awareness regarding the network instance are closer on homogeneous networks. Our results provide tools for assessing the robustness of real-world probabilistic networks, and offer guidelines on how to achieve and enhance network robustness.