Biblio
Large-scale failures in communication networks due to natural disasters or malicious attacks can severely affect critical communications and threaten the lives of people in the affected area. In the absence of a proper communication infrastructure, rescue operations become extremely difficult. Progressive and timely network recovery is, therefore, key to minimizing losses and facilitating rescue missions. To this end, we focus on network recovery assuming partial and uncertain knowledge of the failure locations. We propose a progressive multi-stage recovery approach that uses the incomplete knowledge of the failure to find a feasible recovery schedule. Next, we address failure recovery in multiple interconnected networks, in particular the interaction between a power grid and a communication network. Then, we turn to network monitoring techniques that diagnose the performance of individual links in order to localize soft failures (e.g., highly congested links) in a communication network, and we study the optimal selection of monitoring paths to balance identifiability and probing cost. Finally, we address a minimally disruptive routing framework in software-defined networks. Extensive experimental and simulation results show that our proposed recovery approaches incur a lower disruption cost than the state of the art, while allowing the trade-off among identifiability, execution time, repair/probing cost, congestion, and demand loss to be configured.
Integrated cyber-physical systems (CPSs), such as the smart grid, are becoming the underpinning technology for major industries. A major concern regarding such systems is the seemingly unexpected large-scale failures, which are often attributed to a small initial shock escalating through intricate dependencies within and across the individual components of the system. In this paper, we develop a novel interdependent system model to capture this phenomenon, also known as cascading failures. Our framework consists of two networks that have inherently different characteristics governing their intra-dependency: i) a cyber-network where a node is deemed functional as long as it belongs to the largest connected (i.e., giant) component; and ii) a physical network where nodes are given an initial flow and a capacity, and the failure of a node results in the redistribution of its flow to the remaining nodes, upon which further failures might take place due to overloading. Furthermore, these two networks are assumed to be interdependent. For simplicity, we consider a one-to-one interdependency model where every node in the cyber-network depends upon and supports a single node in the physical network, and vice versa. We provide a thorough analysis of the dynamics of cascading failures in this interdependent system when initiated by a random attack. The system robustness is quantified as the fraction of nodes surviving at the end of the cascade, and is derived in terms of all network parameters involved. Analytic results are supported by an extensive numerical study. Among other things, these results demonstrate the ability of our model to capture the unexpected nature of large-scale failures, and provide insights into improving system robustness.
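The cascade dynamics described in this abstract can be illustrated with a small simulation. The following is only a minimal sketch under stated assumptions, not the paper's analysis: the function names (`giant_component`, `cascade`) are hypothetical, node `i` of the cyber-network is paired one-to-one with node `i` of the physical network, and the flow of failed nodes is assumed to be split equally among all surviving nodes (the abstract does not state the redistribution rule).

```python
from collections import deque


def giant_component(adj, alive):
    """Return the largest connected component of the cyber graph restricted to `alive`."""
    seen, best = set(), set()
    for start in alive:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v in alive and v not in seen:
                    seen.add(v)
                    comp.add(v)
                    queue.append(v)
        if len(comp) > len(best):
            best = comp
    return best


def cascade(adj, load, capacity, attacked):
    """Run the cascade to steady state and return the surviving fraction of nodes.

    adj:      adjacency lists of the cyber-network
    load:     initial flow of each physical node
    capacity: maximum flow each physical node tolerates
    attacked: nodes removed by the initial attack
    """
    n = len(load)
    alive = set(range(n)) - set(attacked)
    flow = list(load)
    pending = sum(load[i] for i in attacked)  # flow waiting to be redistributed
    while True:
        changed = False
        # Physical layer: ASSUMED equal redistribution among survivors.
        if pending > 0 and alive:
            share = pending / len(alive)
            for i in alive:
                flow[i] += share
            pending = 0.0
        overloaded = {i for i in alive if flow[i] > capacity[i]}
        if overloaded:
            alive -= overloaded
            pending += sum(flow[i] for i in overloaded)
            changed = True
        # Cyber layer: only nodes in the giant component stay functional;
        # their one-to-one physical partners fail with them.
        giant = giant_component(adj, alive)
        lost = alive - giant
        if lost:
            pending += sum(flow[i] for i in lost)
            alive = giant
            changed = True
        if not changed:
            return len(alive) / n


# Toy example: a 5-node path 0-1-2-3-4 as the cyber-network, unit loads.
path = [[1], [0, 2], [1, 3], [2, 4], [3]]
robust = cascade(path, [1.0] * 5, [2.0] * 5, attacked=[1])   # 0.6
fragile = cascade(path, [1.0] * 5, [1.5] * 5, attacked=[1])  # 0.0
```

In the toy example, removing node 1 disconnects node 0 from the giant component in both runs. With capacity 2.0 the remaining three nodes absorb the extra flow, whereas with capacity 1.5 the flow shed by node 0 pushes them over capacity and the whole system collapses, which mirrors the abrupt, seemingly unexpected failures the abstract refers to.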